Distributed snapshots

Matthieu Schaller requested to merge distributed_io into master

This implements a new way of writing snapshots by distributing them over multiple files. Each MPI rank writes its own file containing the particles it holds. This makes it possible to exploit compression and will help with post-processing tools that can't handle very large files (such as VELOCIraptor).

Features are:

  • The particles are not shuffled between ranks, which means that the files can have a rather uneven distribution of particles.
  • There is also no option to choose the number of files written.
  • The meta-data is replicated exactly in each file. The only difference is the NumPart_ThisFile array in the header.
  • The files are named base_name_N.M.hdf5 where N is the number of the snapshot (as before) and M is the index of the file within that snapshot. The header also has a ThisFile entry holding the value of M.
  • A new sub-directory for each snapshot is created to contain all the files of that snapshot.
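
As an illustration of the naming scheme above, here is a minimal Python sketch that lists the file paths a post-processing script might expect for one snapshot. The function name, the `snapshot_dir` argument and the example values are hypothetical. The description does not say how the sub-directory is named, how N is formatted, or whether M starts at 0, so the directory and the label are passed in by the caller and M is assumed to start at 0.

```python
from pathlib import PurePosixPath


def distributed_snapshot_files(snapshot_dir, base_name, snapshot_label, num_files):
    """Return the expected file paths for one distributed snapshot.

    snapshot_dir and snapshot_label are hypothetical inputs: the sub-directory
    name and the formatting of N are not specified in the description.
    M is assumed (not confirmed) to run from 0 to num_files - 1.
    """
    return [
        PurePosixPath(snapshot_dir) / f"{base_name}_{snapshot_label}.{m}.hdf5"
        for m in range(num_files)
    ]


# Example: a snapshot labelled "0003" written by 4 MPI ranks.
paths = distributed_snapshot_files("snap_dir", "snapshot", "0003", 4)
for p in paths:
    print(p)
```

Since the number of files is fixed by the number of MPI ranks, `num_files` here would be the rank count of the run that wrote the snapshot.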

This mimics Gadget-2's behaviour apart from the lack of particle redistribution and the fixed number of files. Gadgetviewer is happy with the files.

The distributed dump is written when the YAML parameter Snapshots:distributed is set to 1.

Additional changes include:

  • Small fixes to the parallel and serial i/o.
  • Better naming convention for the functions in the other i/o modes.
  • The Cell meta-data has changed slightly. The Offsets array has been renamed OffsetsInFile. A new array Files indicates in which file the particles of each cell are found. This array is filled with zeros in the non-distributed case.
  • Simplification of the code in io_write_cell_offsets().
  • Moved the writing of the simulation meta-data to a single function called by all four i/o methods.
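
To show how the Files and OffsetsInFile arrays could be used together, here is a small Python sketch on toy in-memory data. It assumes that both arrays hold one entry per cell and that a per-cell particle count is also available; the `counts` array, the function name and the `particles_by_file` mapping are hypothetical stand-ins and not part of the actual file layout.

```python
def particles_in_cell(cell, files, offsets_in_file, counts, particles_by_file):
    """Return the particles of one cell.

    files[cell]           -> index of the file holding the cell's particles
    offsets_in_file[cell] -> position of the cell's first particle in that file
    counts[cell]          -> number of particles in the cell (assumed to exist)
    """
    file_index = files[cell]
    start = offsets_in_file[cell]
    return particles_by_file[file_index][start:start + counts[cell]]


# Toy data: 3 cells spread over 2 files.
files = [0, 0, 1]
offsets_in_file = [0, 2, 0]
counts = [2, 1, 3]
particles_by_file = {0: ["a", "b", "c"], 1: ["d", "e", "f"]}

cell_1 = particles_in_cell(1, files, offsets_in_file, counts, particles_by_file)
cell_2 = particles_in_cell(2, files, offsets_in_file, counts, particles_by_file)
print(cell_1, cell_2)
```

In the non-distributed case Files is all zeros, so the same lookup always lands in the single snapshot file.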
Edited by Matthieu Schaller
