Commit ddc81b1e authored by Matthieu Schaller's avatar Matthieu Schaller

Compression tests

parent a26725c1
......@@ -15,3 +15,4 @@ parameter files.
parameter_description
output_selection
lossy_filters
.. Lossy compression filters
.. _Compression_filters:
Compression Filters
~~~~~~~~~~~~~~~~~~~
Filters to compress the data in snapshots can be applied to reduce the
disk footprint of the datasets. The filters exposed by SWIFT are
natively provided by HDF5, meaning that the library will
automatically and transparently apply the reverse operation when
reading the data stored on disk. They can be applied in combination
with, or instead of, the lossless gzip compression filter.
**These compression filters are lossy, meaning that they modify the
data written to disk.**

*The filters reduce the accuracy of the stored data. No check is made
inside SWIFT to verify that the applied filters make sense. Poor
choices can lead to all the values of a given array being reduced to 0
or Inf, or to the data losing too much accuracy to be useful. The onus
is entirely on the user to choose wisely how they want to compress
their data.*
The filters are not applied when using parallel HDF5.
The available filters are listed below.
Scaling filters for floating-point numbers
------------------------------------------
The D-scale filters can be used to round floating-point values to a fixed
*absolute* accuracy.
They start by computing the minimum of an array, which is then
subtracted from all the values. The array is then multiplied by
:math:`10^n` and rounded to the nearest integer. These integers are
stored with the minimal number of bits required to hold the values.
That process is reversed when reading the data.
For an array of values
+--------+--------+-------+
| 1.2345| -0.1267| 0.0897|
+--------+--------+-------+
and :math:`n=2`, the values stored on disk (but hidden from the user) are:
+--------+--------+-------+
| 136 | 0 | 22|
+--------+--------+-------+
This can be stored with 8 bits instead of the 32 bits needed to store the
original values in floating-point precision, realising a gain of 4x.
When reading the values (for example via ``h5py`` or ``swiftsimio``), that
process is transparently reversed and we get:
+--------+--------+-------+
| 1.2333| -0.1267| 0.0933|
+--------+--------+-------+
Using a scaling of :math:`n=2` hence rounds the values to two digits after
the decimal point.
SWIFT implements four variants of this filter:
* ``DScale1`` scales by :math:`10^1`
* ``DScale2`` scales by :math:`10^2`
* ``DScale3`` scales by :math:`10^3`
* ``DScale6`` scales by :math:`10^6`
An example application is to store the positions with ``pc`` accuracy in
simulations that use ``Mpc`` as their base unit by using the ``DScale6``
filter.
The compression rate of these filters depends on the data. On an
EAGLE-like simulation, compressing the positions from ``Mpc`` to ``pc``
accuracy (via ``DScale6``) leads to a compression rate of around 2.2x.
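Under the hood, this corresponds to HDF5's *scale-offset* filter. As an
illustration (a minimal sketch of the generic HDF5 calls, not SWIFT's
actual implementation; the helper name is made up), a dataset compressed
with a ``DScale6``-like filter could be created as follows:

.. code:: C

   #include <hdf5.h>

   /* Sketch: create a 1D float dataset compressed with the scale-offset
    * filter, keeping 6 decimal digits (the equivalent of DScale6).
    * Error checking is omitted for brevity. */
   hid_t create_dscale6_dataset(hid_t loc, const char* name, hsize_t n) {
     hid_t space = H5Screate_simple(1, &n, NULL);
     hid_t prop = H5Pcreate(H5P_DATASET_CREATE);
     const hsize_t chunk = n < 1024 ? n : 1024;
     H5Pset_chunk(prop, 1, &chunk); /* filters require chunked storage */
     H5Pset_scaleoffset(prop, H5Z_SO_FLOAT_DSCALE, /* digits */ 6);
     hid_t dset = H5Dcreate(loc, name, H5T_NATIVE_FLOAT, space,
                            H5P_DEFAULT, prop, H5P_DEFAULT);
     H5Pclose(prop);
     H5Sclose(space);
     return dset;
   }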
Modified floating-point representation filters
----------------------------------------------
These filters modify the bit-representation of floating point numbers
to get a different *relative* accuracy.
In brief, floating point (FP) numbers are represented in memory as
:math:`(\pm 1)\times a \times 2^b` with a certain number of bits used to store each
of :math:`a` (the mantissa) and :math:`b` (the exponent) as well as
one bit for the overall sign [#f1]_. For example, a standard 4-byte
``float`` uses 23 bits for :math:`a` and 8 bits for :math:`b`. The
number of bits in the exponent mainly drives the range of values that
can be represented whilst the number of bits in the mantissa drives
the relative accuracy of the numbers.
Converting to the more familiar decimal notation, we get that the
number of decimal digits that are correctly represented is
:math:`\log_{10}(2^{n(a)+1})`, with :math:`n(x)` the number of bits in
:math:`x`. The range of positive numbers that can be represented is
given by :math:`[2^{-2^{n(b)-1}+2}, 2^{2^{n(b)-1}}]`. For a standard
``float``, this gives a relative accuracy of :math:`7.2` decimal digits
and a representable range of :math:`[1.17\times 10^{-38}, 3.40\times
10^{38}]`. Numbers above the upper limit are labeled as ``Inf`` and
numbers below this range default to zero.
The filters in this category change the number of bits in the mantissa
and exponent. When reading the values (for example via ``h5py`` or
``swiftsimio``), the numbers are transparently restored to regular
``float`` values but with 0s in the bits of the mantissa that were not
stored on disk, hence differing from the original values that were
compressed.
These filters offer a fixed compression ratio and a fixed relative
accuracy. The available options in SWIFT are:
+-----------------+--------------+--------------+-------------+---------------------------------------------------+-------------------+
| Filter name | :math:`n(a)` | :math:`n(b)` | Accuracy | Range | Compression ratio |
+=================+==============+==============+=============+===================================================+===================+
| No filter | 23 | 8 | 7.22 digits | :math:`[1.17\times 10^{-38}, 3.40\times 10^{38}]` | --- |
+-----------------+--------------+--------------+-------------+---------------------------------------------------+-------------------+
| ``FMantissa13`` | 13 | 8 | 4.21 digits | :math:`[1.17\times 10^{-38}, 3.40\times 10^{38}]` | 1.45x |
+-----------------+--------------+--------------+-------------+---------------------------------------------------+-------------------+
| ``FMantissa9`` | 9 | 8 | 3.01 digits | :math:`[1.17\times 10^{-38}, 3.40\times 10^{38}]` | 1.78x |
+-----------------+--------------+--------------+-------------+---------------------------------------------------+-------------------+
| ``BFloat16`` | 7 | 8 | 2.41 digits | :math:`[1.17\times 10^{-38}, 3.40\times 10^{38}]` | 2x |
+-----------------+--------------+--------------+-------------+---------------------------------------------------+-------------------+
| ``HalfFloat`` | 10 | 5 | 3.31 digits | :math:`[6.1\times 10^{-5}, 6.5\times 10^{4}]` | 2x |
+-----------------+--------------+--------------+-------------+---------------------------------------------------+-------------------+
The accuracy given in the table corresponds to the number of decimal digits
that can be correctly stored. The "no filter" row is displayed for
comparison purposes.
The first two filters are useful to keep the same range as a standard
``float`` but with a reduced accuracy of 3 or 4 decimal digits. The last two
are the two standard reduced precision options fitting within 16 bits: one
with a much reduced relative accuracy and one with a much reduced
representable range.
An example application is to store the densities with the ``FMantissa9``
filter as we rarely need more than 3 decimal digits of accuracy for this
quantity.
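These reduced-precision types can be built with HDF5's datatype API by
shrinking the mantissa and re-packing the bit fields. A minimal sketch
for an ``FMantissa9``-style type (illustrative only, assuming the
standard ``float`` exponent bias of 127; not necessarily identical to
SWIFT's ``set_hdf5_lossy_compression``):

.. code:: C

   #include <hdf5.h>

   /* Sketch: a float type with 1 sign bit, 8 exponent bits and a 9-bit
    * mantissa, i.e. 18 bits of precision stored in 3 bytes per value.
    * Pairing it with the N-bit filter (H5Pset_nbit) lets HDF5 pack only
    * the meaningful bits on disk. */
   hid_t make_fmantissa9_type(void) {
     const size_t m_bits = 9, e_bits = 8;
     const size_t prec = 1 + e_bits + m_bits; /* 18 bits in total */
     hid_t t = H5Tcopy(H5T_IEEE_F32LE);
     H5Tset_precision(t, prec);
     H5Tset_offset(t, 0);
     H5Tset_fields(t, /* sign bit */ prec - 1,
                   /* exponent */ m_bits, e_bits,
                   /* mantissa */ 0, m_bits);
     H5Tset_ebias(t, 127); /* same exponent bias as a regular float */
     H5Tset_size(t, 3);    /* three bytes per value on disk */
     return t;
   }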
------------------------
.. [#f1] Note that the representation in memory of FP numbers is more
complicated than this simple picture. See for instance this
`Wikipedia
<https://en.wikipedia.org/wiki/Single-precision_floating-point_format>`_
article.
.. Parameter File
Loic Hausammann, 1 June 2018
.. Output selection
.. _Output_list_label:
......@@ -77,7 +76,7 @@ top level of this file (in the exact same way as the file dumped by using the
`-o` option in SWIFT). By default, all fields are written, but by using the
"off" string, you can force the code to skip over unwanted outputs.
You must then, in the regular SWIFT parameter file, select the following
Users must then, in the regular SWIFT parameter file, select the following
options:
.. code:: YAML
......@@ -100,6 +99,21 @@ The corresponding section of the YAML file would look like:
Entries can simply be copied from the ``output.yml`` generated by the
``-o`` runtime flag.
Alternatively, instead of "on" or "off", users can also provide the name of
one of the :ref:`Compression_filters` to write the field with reduced
accuracy and reduced disk space. The corresponding YAML file would, for
example, look like:
.. code:: YAML
Default:
Coordinates_Gas: DScale6
Masses_Gas: off
Velocities_Gas: DScale1
Densities_Gas: FMantissa9
ParticleIDs_Gas: IntegerNBits
For convenience, there is also the option to set a default output status for
all fields of a particular particle type. This can be used, for example, to
skip an entire particle type in certain snapshots (see below for how to define
......@@ -142,10 +156,11 @@ cousins. To do this, we will define two top-level sections in our
Snapshot:
Masses_DM: off
# Turn off lots of stuff in snipshots!
# Turn off and compress lots of stuff in snipshots!
Snipshot:
Metal_Mass_Fractions_Gas: off
Element_Mass_Fractions_Gas: off
Densities_Gas: FMantissa9
...
To then select which outputs are 'snapshots' and which are 'snipshots', you
......
......@@ -1327,21 +1327,6 @@ Showing all the parameters for a basic cosmological test-case, one would have:
delta_time: 1.1 # Delta log-a between outputs
------------------------
.. [#f1] The thorough reader (or overly keen SWIFT tester) would find that the speed of light is :math:`c=1.8026\times10^{12}\,\rm{fur}\,\rm{ftn}^{-1}`, Newton's constant becomes :math:`G_N=4.896735\times10^{-4}~\rm{fur}^3\,\rm{fir}^{-1}\,\rm{ftn}^{-2}` and Planck's constant turns into :math:`h=4.851453\times 10^{-34}~\rm{fur}^2\,\rm{fir}\,\rm{ftn}^{-1}`.
.. [#f2] which would translate into a constant :math:`G_N=1.5517771\times10^{-9}~cm^{3}\,g^{-1}\,s^{-2}` if expressed in the CGS system.
.. [#f3] This feature only makes sense for non-cosmological runs for which the
internal time unit is such that when rounded to the nearest integer a
sensible number is obtained. A use-case for this feature would be to
compare runs over the same physical time but with different numbers of
snapshots. Snapshots at a given time would always have the same set of
digits irrespective of the number of snapshots produced before.
Gravity Force Checks
--------------------
......@@ -1362,3 +1347,18 @@ If ``only_when_all_active:1`` and ``only_at_snapshots:1`` are enabled together,
and not all the gparts are active during the timestep of the snapshot dump, the
exact forces computation is performed on the first timestep at which all the
gparts are active after that snapshot output timestep.
------------------------
.. [#f1] The thorough reader (or overly keen SWIFT tester) would find that the speed of light is :math:`c=1.8026\times10^{12}\,\rm{fur}\,\rm{ftn}^{-1}`, Newton's constant becomes :math:`G_N=4.896735\times10^{-4}~\rm{fur}^3\,\rm{fir}^{-1}\,\rm{ftn}^{-2}` and Planck's constant turns into :math:`h=4.851453\times 10^{-34}~\rm{fur}^2\,\rm{fir}\,\rm{ftn}^{-1}`.
.. [#f2] which would translate into a constant :math:`G_N=1.5517771\times10^{-9}~cm^{3}\,g^{-1}\,s^{-2}` if expressed in the CGS system.
.. [#f3] This feature only makes sense for non-cosmological runs for which the
internal time unit is such that when rounded to the nearest integer a
sensible number is obtained. A use-case for this feature would be to
compare runs over the same physical time but with different numbers of
snapshots. Snapshots at a given time would always have the same set of
digits irrespective of the number of snapshots produced before.
......@@ -59,7 +59,7 @@ include_HEADERS = space.h runner.h queue.h task.h lock.h cell.h part.h const.h \
velociraptor_struct.h velociraptor_io.h random.h memuse.h mpiuse.h memuse_rnodes.h \
black_holes.h black_holes_io.h black_holes_properties.h black_holes_struct.h \
feedback.h feedback_struct.h feedback_properties.h task_order.h \
space_unique_id.h line_of_sight.h
space_unique_id.h line_of_sight.h io_compression.h
# source files for EAGLE cooling
QLA_COOLING_SOURCES =
......@@ -112,7 +112,7 @@ AM_SOURCES = space.c runner_main.c runner_doiact_hydro.c runner_doiact_limiter.c
threadpool.c cooling.c star_formation.c \
statistics.c profiler.c dump.c logger.c \
part_type.c xmf.c gravity_properties.c gravity.c \
collectgroup.c hydro_space.c equation_of_state.c \
collectgroup.c hydro_space.c equation_of_state.c io_compression.c \
chemistry.c cosmology.c restart.c mesh_gravity.c velociraptor_interface.c \
output_list.c velociraptor_dummy.c logger_io.c memuse.c mpiuse.c memuse_rnodes.c fof.c \
hashmap.c pressure_floor.c space_unique_id.c output_options.c line_of_sight.c \
......
......@@ -867,6 +867,9 @@ void io_write_array(hid_t h_grp, const int n, const void* array,
error("Error while changing shape of %s %s data space.", name,
array_content);
/* Dataset type */
hid_t h_type = H5Tcopy(io_hdf5_type(type));
const hsize_t chunk[2] = {(1024 > n ? n : 1024), 1};
hid_t h_prop = H5Pcreate(H5P_DATASET_CREATE);
h_err = H5Pset_chunk(h_prop, 1, chunk);
......@@ -881,13 +884,14 @@ void io_write_array(hid_t h_grp, const int n, const void* array,
array_content);
/* Write */
hid_t h_data = H5Dcreate(h_grp, name, io_hdf5_type(type), h_space,
H5P_DEFAULT, h_prop, H5P_DEFAULT);
hid_t h_data =
H5Dcreate(h_grp, name, h_type, h_space, H5P_DEFAULT, h_prop, H5P_DEFAULT);
if (h_data < 0)
error("Error while creating dataspace for %s %s.", name, array_content);
h_err = H5Dwrite(h_data, io_hdf5_type(type), h_space, H5S_ALL, H5P_DEFAULT,
array);
if (h_err < 0) error("Error while writing %s %s.", name, array_content);
H5Tclose(h_type);
H5Dclose(h_data);
H5Pclose(h_prop);
H5Sclose(h_space);
......@@ -2588,9 +2592,9 @@ void io_prepare_output_fields(struct output_options* output_options,
for (int ptype = 0; ptype < swift_type_count; ptype++) {
/* Internally also verifies that the default level is allowed */
const enum compression_levels compression_level_current_default =
output_options_get_ptype_default(params, section_name,
(enum part_type)ptype);
const enum lossy_compression_schemes compression_level_current_default =
output_options_get_ptype_default_compression(params, section_name,
(enum part_type)ptype);
if (compression_level_current_default == compression_do_not_write) {
ptype_default_write_status[ptype] = 0;
......@@ -2655,7 +2659,8 @@ void io_prepare_output_fields(struct output_options* output_options,
int value_id = 0;
for (value_id = 0; value_id < compression_level_count; value_id++)
if (strcmp(param_value, compression_level_names[value_id]) == 0) break;
if (strcmp(param_value, lossy_compression_schemes_names[value_id]) == 0)
break;
if (value_id == compression_level_count)
error("Choice of output selection parameter %s ('%s') is invalid.",
......@@ -2666,7 +2671,8 @@ void io_prepare_output_fields(struct output_options* output_options,
if (param_is_known) {
const int is_on =
strcmp(param_value,
compression_level_names[compression_do_not_write]) != 0;
lossy_compression_schemes_names[compression_do_not_write]) !=
0;
if (is_on && !ptype_default_write_status[param_ptype]) {
/* Particle should be written even though default is off:
......
......@@ -79,12 +79,12 @@
* @todo A better version using HDF5 hyper-slabs to write the file directly from
* the part array will be written once the structures have been stabilized.
*/
void write_distributed_array(const struct engine* e, hid_t grp,
const char* fileName,
const char* partTypeGroupName,
const struct io_props props, const size_t N,
const struct unit_system* internal_units,
const struct unit_system* snapshot_units) {
void write_distributed_array(
const struct engine* e, hid_t grp, const char* fileName,
const char* partTypeGroupName, const struct io_props props, const size_t N,
const enum lossy_compression_schemes lossy_compression,
const struct unit_system* internal_units,
const struct unit_system* snapshot_units) {
const size_t typeSize = io_sizeof_type(props.type);
const size_t num_elements = N * props.dimension;
......@@ -111,7 +111,7 @@ void write_distributed_array(const struct engine* e, hid_t grp,
error("Error while creating data space for field '%s'.", props.name);
/* Decide what chunk size to use based on compression */
int log2_chunk_size = e->snapshot_compression > 0 ? 12 : 18;
int log2_chunk_size = 20;
int rank;
hsize_t shape[2];
......@@ -139,8 +139,11 @@ void write_distributed_array(const struct engine* e, hid_t grp,
if (h_err < 0)
error("Error while changing data space shape for field '%s'.", props.name);
/* Dataset type */
hid_t h_type = H5Tcopy(io_hdf5_type(props.type));
/* Dataset properties */
const hid_t h_prop = H5Pcreate(H5P_DATASET_CREATE);
hid_t h_prop = H5Pcreate(H5P_DATASET_CREATE);
/* Create filters and set compression level if we have something to write */
if (N > 0) {
......@@ -151,12 +154,12 @@ void write_distributed_array(const struct engine* e, hid_t grp,
error("Error while setting chunk size (%llu, %llu) for field '%s'.",
chunk_shape[0], chunk_shape[1], props.name);
/* Impose check-sum to verify data corruption */
h_err = H5Pset_fletcher32(h_prop);
if (h_err < 0)
error("Error while setting checksum options for field '%s'.", props.name);
/* Are we imposing some form of lossy compression filter? */
if (lossy_compression != compression_write_lossless)
set_hdf5_lossy_compression(&h_prop, &h_type, lossy_compression,
props.name);
/* Impose data compression */
/* Impose GZIP data compression */
if (e->snapshot_compression > 0) {
h_err = H5Pset_shuffle(h_prop);
if (h_err < 0)
......@@ -168,11 +171,16 @@ void write_distributed_array(const struct engine* e, hid_t grp,
error("Error while setting compression options for field '%s'.",
props.name);
}
/* Impose check-sum to verify data corruption */
h_err = H5Pset_fletcher32(h_prop);
if (h_err < 0)
error("Error while setting checksum options for field '%s'.", props.name);
}
/* Create dataset */
const hid_t h_data = H5Dcreate(grp, props.name, io_hdf5_type(props.type),
h_space, H5P_DEFAULT, h_prop, H5P_DEFAULT);
const hid_t h_data = H5Dcreate(grp, props.name, h_type, h_space, H5P_DEFAULT,
h_prop, H5P_DEFAULT);
if (h_data < 0) error("Error while creating dataspace '%s'.", props.name);
/* Write temporary buffer to HDF5 dataspace */
......@@ -217,6 +225,7 @@ void write_distributed_array(const struct engine* e, hid_t grp,
/* Free and close everything */
swift_free("writebuff", temp);
H5Tclose(h_type);
H5Pclose(h_prop);
H5Dclose(h_data);
H5Sclose(h_space);
......@@ -763,23 +772,25 @@ void write_output_distributed(struct engine* e,
/* Did the user specify a non-standard default for the entire particle
* type? */
const enum compression_levels compression_level_current_default =
output_options_get_ptype_default(output_options->select_output,
current_selection_name,
(enum part_type)ptype);
const enum lossy_compression_schemes compression_level_current_default =
output_options_get_ptype_default_compression(
output_options->select_output, current_selection_name,
(enum part_type)ptype);
/* Write everything that is not cancelled */
int num_fields_written = 0;
for (int i = 0; i < num_fields; ++i) {
/* Did the user cancel this field? */
const int should_write = output_options_should_write_field(
output_options, current_selection_name, list[i].name,
(enum part_type)ptype, compression_level_current_default);
const enum lossy_compression_schemes compression_level =
output_options_get_field_compression(
output_options, current_selection_name, list[i].name,
(enum part_type)ptype, compression_level_current_default);
if (should_write) {
if (compression_level != compression_do_not_write) {
write_distributed_array(e, h_grp, fileName, partTypeGroupName, list[i],
Nparticles, internal_units, snapshot_units);
Nparticles, compression_level, internal_units,
snapshot_units);
num_fields_written++;
}
}
......
This diff is collapsed.
/*******************************************************************************
* This file is part of SWIFT.
* Copyright (c) 2020 Matthieu Schaller (schaller@strw.leidenuniv.nl).
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Lesser General Public License as published
* by the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*
******************************************************************************/
#ifndef SWIFT_IO_COMPRESSION_H
#define SWIFT_IO_COMPRESSION_H
/* Config parameters. */
#include "../config.h"
/**
* @brief Compression levels for snapshot fields
*/
enum lossy_compression_schemes {
compression_do_not_write = 0, /*!< Do not write that field */
compression_write_lossless, /*!< Do not apply any lossy compression */
compression_write_d_scale_1, /*!< D-scale filter of magnitude 10^1 */
compression_write_d_scale_2, /*!< D-scale filter of magnitude 10^2 */
compression_write_d_scale_3, /*!< D-scale filter of magnitude 10^3 */
compression_write_d_scale_6, /*!< D-scale filter of magnitude 10^6 */
compression_write_f_mantissa_9, /*!< Conversion to 9-bit mantissa float */
compression_write_f_mantissa_13, /*!< Conversion to 13-bit mantissa float */
compression_write_half_float, /*!< Conversion to IEEE754 half-float */
compression_write_bfloat_16, /*!< Conversion to Bfloat16 */
/* Counter, always leave last */
compression_level_count,
};
/**
* @brief Names of the compression levels, used in the select_output.yml
* parameter file.
**/
extern const char* lossy_compression_schemes_names[];
enum lossy_compression_schemes compression_scheme_from_name(const char* name);
#ifdef HAVE_HDF5
#include <hdf5.h>
void set_hdf5_lossy_compression(hid_t* h_prop, hid_t* h_type,
const enum lossy_compression_schemes comp,
const char* field_name);
#endif /* HAVE_HDF5 */
#endif /* SWIFT_IO_COMPRESSION_H */
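Note: ``compression_scheme_from_name`` is declared above but its definition is
not part of this excerpt. A minimal sketch of a matching implementation,
assuming ``lossy_compression_schemes_names`` is indexed in the same order as
the enum:

#include <string.h>

#include "error.h"
#include "io_compression.h"

/* Map a scheme name from the parameter file to the matching enum entry. */
enum lossy_compression_schemes compression_scheme_from_name(const char* name) {
  for (int i = 0; i < compression_level_count; ++i)
    if (strcmp(name, lossy_compression_schemes_names[i]) == 0)
      return (enum lossy_compression_schemes)i;
  error("Invalid lossy compression scheme name: '%s'", name);
  return compression_do_not_write; /* never reached */
}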
......@@ -26,6 +26,7 @@
#include "common_io.h"
#include "error.h"
#include "inline.h"
#include "io_compression.h"
#include "part.h"
#include "units.h"
......@@ -116,6 +117,9 @@ struct io_props {
/* Pointer to the field of the first particle in the array */
char* field;
/* Lossy compression scheme to use for this field */
enum lossy_compression_schemes lossy_compression;
/* Pointer to the start of the temporary buffer used in i/o */
char* start_temp_c;
int* start_temp_i;
......
......@@ -35,10 +35,6 @@
#include "error.h"
#include "parser.h"
/* Compression level names. */
const char* compression_level_names[compression_level_count] = {
"off", "on", "low", "med", "high"};
/**
* @brief Initialise the output options struct with the information read
* from file. Only rank 0 reads from file; this data is then broadcast
......@@ -145,10 +141,10 @@ void output_options_struct_restore(struct output_options* output_options,
* @return should_write integer determining whether this field should be
* written
**/
int output_options_should_write_field(
enum lossy_compression_schemes output_options_get_field_compression(
const struct output_options* output_options, const char* snapshot_type,
const char* field_name, const enum part_type part_type,
const enum compression_levels compression_level_current_default) {
const enum lossy_compression_schemes compression_level_current_default) {
/* Full name for the field path */
char field[PARSER_MAX_LINE_SIZE];
......@@ -158,19 +154,20 @@ int output_options_should_write_field(
char compression_level[FIELD_BUFFER_SIZE];
parser_get_opt_param_string(
output_options->select_output, field, compression_level,
compression_level_names[compression_level_current_default]);
int should_write = strcmp(compression_level_names[compression_do_not_write],
compression_level);
lossy_compression_schemes_names[compression_level_current_default]);
#ifdef SWIFT_DEBUG_CHECKS
int should_write =
strcmp(lossy_compression_schemes_names[compression_do_not_write],
compression_level);
message(
"Check for whether %s should be written returned %s from a provided "
"value of \"%s\"",
field, should_write ? "True" : "False", compression_level);
#endif
return should_write;
return compression_scheme_from_name(compression_level);
}
/**
......@@ -183,7 +180,7 @@ int output_options_should_write_field(
* @param snapshot_type The type of snapshot we are writing
* @param part_type The #part_type we are considering.
*/
enum compression_levels output_options_get_ptype_default(
enum lossy_compression_schemes output_options_get_ptype_default_compression(
struct swift_params* output_params, const char* snapshot_type,
const enum part_type part_type) {
......@@ -195,12 +192,14 @@ enum compression_levels output_options_get_ptype_default(
char compression_level[FIELD_BUFFER_SIZE];
parser_get_opt_param_string(
output_params, field, compression_level,
compression_level_names[compression_level_default]);
lossy_compression_schemes_names[compression_level_default]);
/* Need to find out which of the entries this corresponds to... */
int level_index;
for (level_index = 0; level_index < compression_level_count; level_index++) {
if (!strcmp(compression_level_names[level_index], compression_level)) break;
if (!strcmp(lossy_compression_schemes_names[level_index],
compression_level))
break;
}
/* Make sure that the supplied default option is either on or off, not a
......@@ -228,7 +227,7 @@ enum compression_levels output_options_get_ptype_default(
level_index);
#endif
return (enum compression_levels)level_index;
return (enum lossy_compression_schemes)level_index;
}
/**
......
......@@ -20,35 +20,17 @@
#define SWIFT_OUTPUT_OPTIONS_H
/* Local headers. */
#include "io_compression.h"
#include "output_list.h"
#include "part_type.h"
#include "restart.h"
/**
* @brief Compression levels for snapshot fields
*/
enum compression_levels {
compression_do_not_write = 0,
compression_write_lossless,
compression_write_low_lossy,
compression_write_med_lossy,
compression_write_high_lossy,
/* Counter, always leave last */
compression_level_count,
};
/*! Default value for SelectOutput */
#define compression_level_default compression_write_lossless
/*! Default name for the SelectOutput header */
#define select_output_header_default_name "Default"
/**
* @brief Names of the compression levels, used in the select_output.yml
* parameter file.
**/
extern const char* compression_level_names[];
/**
* @brief Output selection properties, including the parsed files.
**/
......@@ -76,12 +58,12 @@ void output_options_struct_restore(struct output_options* output_options,
FILE* stream);
/* Logic functions */
int output_options_should_write_field(
enum lossy_compression_schemes output_options_get_field_compression(
const struct output_options* output_options, const char* snapshot_type,
const char* field_name, const enum part_type part_type,
const enum compression_levels comp_level_current_default);
const enum lossy_compression_schemes comp_level_current_default);
enum compression_levels output_options_get_ptype_default(
enum lossy_compression_schemes output_options_get_ptype_default_compression(
struct swift_params* output_params, const char* snapshot_type,
const enum part_type part_type);
......
......@@ -381,10 +381,11 @@ void read_array_parallel(hid_t grp, struct io_props props, size_t N,
* @param N_total The total number of particles to write in this array.
* @param snapshot_units The units used for the data in this snapshot.
*/
void prepare_array_parallel(struct engine* e, hid_t grp, const char* fileName,
FILE* xmfFile, char* partTypeGroupName,
struct io_props props, long long N_total,
const struct unit_system* snapshot_units) {
void prepare_array_parallel(
struct engine* e, hid_t grp, const char* fileName, FILE* xmfFile,
char* partTypeGroupName, struct io_props props, long long N_total,
const enum lossy_compression_schemes lossy_compression,
const struct unit_system* snapshot_units) {
/* Create data space */
const hid_t h_space = H5Screate(H5S_SIMPLE);
......@@ -416,21 +417,36 @@ void prepare_array_parallel(struct engine* e, hid_t grp, const char* fileName,
if (h_err < 0)
error("Error while changing data space shape for field '%s'.", props.name);
/* Dataset type */
hid_t h_type = H5Tcopy(io_hdf5_type(props.type));
/* Dataset properties */
hid_t h_prop = H5Pcreate(H5P_DATASET_CREATE);
/* Create property list for collective dataset write. */
const hid_t h_plist_id = H5Pcreate(H5P_DATASET_XFER);
H5Pset_dxpl_mpio(h_plist_id, H5FD_MPIO_COLLECTIVE);
/* Set chunk size */
/* h_err = H5Pset_chunk(h_prop, rank, chunk_shape); */
/* if (h_err < 0) { */
/* error("Error while setting chunk size (%llu, %llu) for field '%s'.", */
/* chunk_shape[0], chunk_shape[1], props.name); */
/* } */
// h_err = H5Pset_chunk(h_prop, rank, chunk_shape);
// if (h_err < 0) {
// error("Error while setting chunk size (%llu, %llu) for field '%s'.",
// chunk_shape[0], chunk_shape[1], props.name);
//}
/* Are we imposing some form of lossy compression filter? */
// if (lossy_compression != compression_write_lossless)
// set_hdf5_lossy_compression(&h_prop, &h_type, lossy_compression,
// props.name);
/* Impose check-sum to verify data corruption */
// h_err = H5Pset_fletcher32(h_prop);
// if (h_err < 0)
// error("Error while setting checksum options for field '%s'.", props.name);
/* Create dataset */
const hid_t h_data =
H5Dcreate(grp, props.name, io_hdf5_type(props.type), h_space, H5P_DEFAULT,
H5P_DEFAULT, H5P_DEFAULT);
const hid_t h_data = H5Dcreate(grp, props.name, h_type, h_space, H5P_DEFAULT,
h_prop, H5P_DEFAULT);
if (h_data < 0) error("Error while creating dataspace '%s'.", props.name);
/* Write unit conversion factors for this data set */
......@@ -474,6 +490,8 @@ void prepare_array_parallel(struct engine* e, hid_t grp, const char* fileName,
props.dimension, props.type);
/* Close everything */
H5Tclose(h_type);
H5Pclose(h_prop);
H5Pclose(h_plist_id);