This merge request makes several changes to the i/o, mainly to the parallel version.
- Wrap particles back into the box before writing them (fixes #374 (closed)).
- Use the threadpool to construct the internal buffers that are sent to HDF5.
- Make the buffer construction generic across all three i/o routines.
- Allow for more general functions to transform particles into i/o buffer quantities.
- Select better-optimised ROMIO algorithms for the parallel i/o and delay writing the metadata until the file is closed.
This last change allows a write speed of 6 GB/s on the cosma-6 Lustre storage (solves #115 (closed)).
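
For reviewers, here is a rough sketch of how the wrapping and the generic conversion functions fit together. It is not the code in this branch: `struct part`, `wrap_into_box`, `convert_fn`, `convert_wrapped_pos` and `fill_buffer` are hypothetical names made up for the illustration, and the loop is serial, whereas the real buffer construction is split over the threadpool. The wrap also assumes a particle has drifted by less than one box length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down particle: only a position. */
struct part {
  double x[3];
};

/* Map a coordinate into [0, dim). Assumes the particle is at most one
 * box length outside the box, so a single shift is enough. */
static double wrap_into_box(double x, double dim) {
  assert(dim > 0.0);
  if (x < 0.0) return x + dim;
  if (x >= dim) return x - dim;
  return x;
}

/* Generic conversion: turns one particle into the values written out.
 * Any function with this signature can be plugged into fill_buffer(). */
typedef void (*convert_fn)(const struct part *p, const double dim[3],
                           double *out);

/* Example conversion: positions wrapped into the box. The particle
 * itself is left untouched; only the output values are wrapped. */
static void convert_wrapped_pos(const struct part *p, const double dim[3],
                                double *out) {
  for (int k = 0; k < 3; k++) out[k] = wrap_into_box(p->x[k], dim[k]);
}

/* Fill the i/o buffer, ncomp values per particle. Each iteration is
 * independent, which is what makes the loop easy to hand to a
 * threadpool in chunks (done serially here). */
static void fill_buffer(const struct part *parts, size_t n,
                        const double dim[3], convert_fn convert, int ncomp,
                        double *buf) {
  for (size_t i = 0; i < n; i++) convert(&parts[i], dim, &buf[i * ncomp]);
}
```

Because the conversion is a function pointer, the same `fill_buffer` loop can serve all i/o routines, and a field that needs no transformation can use a conversion that simply copies the value.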