Adds ParMETIS repartitioning to calculate the new cell graph across all the MPI nodes.
ParMETIS also has methods that refine an existing solution, not just create a new one, which can be used to reduce the amount of particle movement.
As part of this work we are reducing the number of repartitioning techniques offered to "costs/costs", "costs/none", "none/costs" and "costs/time", that is: balanced, vertex only, edge only, and edges weighted by the expected time of the next updates. The techniques being removed tend to give marginally worse solutions in tests.
Initially this work intended to remove support for METIS, but testing the actual runtimes shows that METIS produces the best balance, so we are keeping it as a continuing option, now complemented by the ParMETIS options. ParMETIS may prove to be the better choice at scales larger than our current simulations.
Other significant updates in this request:

initial partitioning schemes given more obvious names: grid, region, memory or vectorized. The "region" scheme balances by volume and "memory" by particle distribution; the others are only interesting when (Par)METIS is not available.

weights are now calculated using floats and the sum is scaled into the range of idx_t. That should avoid integer overflow issues.
Weights from MPI tasks are no longer used. The position of these tasks is strictly a free parameter of the solution.
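
The float-then-scale approach to weights described above can be sketched roughly as follows. This is an illustration only, not the code in this request: `scale_weights`, the local `idx_t` typedef (a stand-in for the METIS type, assumed 32-bit here) and the small headroom factor are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-in for the METIS idx_t, taken to be 32-bit here. */
typedef int32_t idx_t;
#define IDX_T_MAX INT32_MAX

/* Hypothetical helper: convert n non-negative floating-point weights to
 * idx_t so that their integer sum stays within the range of idx_t. */
void scale_weights(const double *w, idx_t *out, size_t n) {
  assert(n > 0);

  /* Accumulate in floating point, where the total cannot overflow. */
  double sum = 0.0;
  for (size_t i = 0; i < n; i++) sum += w[i];

  /* Leave a little headroom below the maximum to absorb rounding error. */
  const double limit = 0.999 * (double)IDX_T_MAX;

  /* Only scale down; totals that already fit are left unchanged. */
  double scale = 1.0;
  if (sum > limit) scale = limit / sum;

  /* The cast truncates towards zero, so the integer sum cannot grow. */
  for (size_t i = 0; i < n; i++) out[i] = (idx_t)(w[i] * scale);
}
```

Scaling every weight by the same factor keeps their relative sizes, which is what the partitioner balances, while the total is guaranteed to be representable.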

The balance of weights between vertices and timebins is defined as an equipartition. Previously the limits were matched.

new clocks function clocks_random_seed to return "random" seeds based on the remainder of the current number of nanoseconds.
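
A minimal sketch of the idea behind such a seed function, assuming a POSIX system with `clock_gettime`. The name `random_seed_sketch` and its `modulus` parameter are hypothetical and chosen for illustration; the actual clocks_random_seed in this request may differ in signature and fallback behaviour.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <assert.h>
#include <limits.h>
#include <time.h>

/* Hypothetical sketch: derive a "random" seed from the nanosecond part of
 * the current wall-clock time, reduced by a caller-supplied modulus. */
int random_seed_sketch(int modulus) {
  assert(modulus > 0);

  struct timespec ts;
  clock_gettime(CLOCK_REALTIME, &ts);

  /* tv_nsec lies in [0, 999999999], so the remainder is in [0, modulus). */
  return (int)(ts.tv_nsec % modulus);
}
```

Such a value is only suitable for decorrelating runs, for example `srand(random_seed_sketch(INT_MAX))`; it is not a source of cryptographic randomness.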