SWIFTsim merge requests
https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests
2017-10-26T18:23:57Z

https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/440
Dopair2 vectorisation
2017-10-26T18:23:57Z
James Willis
Adds the following:
* Vectorised version of `runner_dopair2_force`
* Expands `testActivePair` to include more test cases and to also test the force pair tasks
* Creates a branching function for `DOPAIR2` so that the corner pairs are calculated using the serial version of `DOPAIR2`
* `pairs_all_force` now checks if particles are active before updating them
Vectorization of all the core SPH tasks
Matthieu Schaller

https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/446
Fix the exit iteration padding.
2017-10-30T14:19:24Z
James Willis
- Removes padding for the `exit_iteration` when looping over `cj`.
- Calculates the padding correctly for the `exit_iteration` when looping over `ci`
Matthieu Schaller

https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/448
Fix spelling
2017-10-30T17:18:09Z
Peter W. Draper
Just checking how to operate a fork merge request.
Peter W. Draper

https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/447
Fix pair vec exit iteration
2017-10-30T17:46:36Z
James Willis
Undo change for `exit_iteration`.
Matthieu Schaller

https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/441
List top-level cells with tasks
2017-11-07T18:55:53Z
Matthieu Schaller
This implements the idea behind #373 and should help with #366.
In this branch we:
- At rebuild time, construct a list of the top-level cells that have at least one task somewhere in their hierarchy.
- When unskipping tasks, launch the threadpool on this list rather than on the whole set of top-level cells.
- When collecting the ti_end_min, use the threadpool on this list.
One thing I am not very happy with is the last point. We don't have an elegant way of doing a reduction over the threadpool. The current version should be better than the old, scalar, version but we could reduce the number of locks by having a more advanced mechanism. Would require quite a bit of change to the threadpool infrastructure though.
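As a rough illustration of the idea, the sketch below builds the list of top-level cells that have tasks and then reduces `ti_end_min` over chunks of that list, taking the lock once per chunk rather than once per cell. All names (`toy_cell`, `toy_reduction`, `toy_collect_mapper`, ...) are hypothetical and are not taken from the SWIFT sources; the real threadpool interface may look quite different.

```c
#include <assert.h>
#include <limits.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a top-level cell. */
struct toy_cell {
  int nr_tasks;         /* Number of tasks anywhere in the cell hierarchy. */
  long long ti_end_min; /* Earliest end of time-step found in this cell. */
};

/* Shared result of the reduction, protected by a single lock. */
struct toy_reduction {
  pthread_mutex_t lock;
  long long ti_end_min;
};

/* Prepare the shared result before launching the mappers. */
void toy_reduction_init(struct toy_reduction *red) {
  assert(red != NULL);
  pthread_mutex_init(&red->lock, NULL);
  red->ti_end_min = LLONG_MAX;
}

/* At rebuild time: store the indices of the cells that have at least one
 * task into `list` (which must hold nr_cells entries). Returns the count. */
size_t toy_list_cells_with_tasks(const struct toy_cell *cells, size_t nr_cells,
                                 size_t *list) {
  size_t count = 0;
  for (size_t i = 0; i < nr_cells; i++)
    if (cells[i].nr_tasks > 0) list[count++] = i;
  return count;
}

/* Mapper that a threadpool could call on one chunk of the list. The minimum
 * is first computed locally, so the lock is taken once per chunk only. */
void toy_collect_mapper(const struct toy_cell *cells, const size_t *chunk,
                        size_t chunk_size, struct toy_reduction *red) {
  long long local_min = LLONG_MAX;
  for (size_t k = 0; k < chunk_size; k++)
    if (cells[chunk[k]].ti_end_min < local_min)
      local_min = cells[chunk[k]].ti_end_min;

  pthread_mutex_lock(&red->lock);
  if (local_min < red->ti_end_min) red->ti_end_min = local_min;
  pthread_mutex_unlock(&red->lock);
}
```

With this layout the number of lock acquisitions scales with the number of chunks rather than the number of cells, which is the kind of saving hinted at above. A per-thread partial result merged after the threadpool returns would avoid the lock entirely, at the cost of changes to the threadpool itself.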
Other possible improvements:
- Use the list of cells with tasks also when rebuilding.
- Use a similar list when splitting in the rebuild as we don't need to launch threads on empty top-level cells.
- Use a more advanced list (constructed each step during unskip) that contains only the active local cells to speed up the collection of time-steps.
Peter W. Draper

https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/445
Doself subset vec
2017-10-31T12:08:42Z
James Willis
Adds a vectorised version of `runner_doself_subset_density`.
Vectorization of all the core SPH tasks
Matthieu Schaller

https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/449
Resolve "Check for drifted state in vectorized tasks"
2017-10-31T22:58:53Z
James Willis
Closes #380
Matthieu Schaller

https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/453
Large hdf5 reads
2017-11-07T17:51:56Z
Matthieu Schaller
In the on-going series of i/o fixes here is the latest set of changes.
This applies the same trick as for the writes: if we attempt to read more than 2GB, we break the read up into smaller pieces. You can reach that regime when using the EAGLE-50 on 4 nodes or fewer, but that is a rather extreme setup.
Peter W. Draper

https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/452
Make loop bounds consistent.
2017-11-08T12:05:53Z
James Willis
Matthieu Schaller

https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/455
Add information to stdout and timestep.txt about what happened in the step
2017-11-08T12:24:42Z
Matthieu Schaller
Add information to stdout and timestep.txt about what happened in the step (rebuild, repartition, snapshot, ...)
Peter W. Draper

https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/442
Support for OSX
2017-11-08T13:45:22Z
Matthieu Schaller
Implements #364. We:
- Detect whether the POSIX library implements the barriers or not.
- If they are implemented define the SWIFT barriers as the POSIX ones.
- If not use an alternative, simple, implementation.
- Be verbose about the implementation being used.
- Update the autotools macro that sets pthread flags to the latest version. Should handle OSX in a better way.
- Detect whether FPEs can be raised on this system and set the appropriate macro.
- Update the INSTALL.swift with instructions for OSX.
- Detect Skylake (mobile and desktop) and set the appropriate architecture flags.
- Fix the dump and logger tests to write unique files to /tmp/ and delete them once done.
Peter W. Draper

https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/457
Use non-buffered MPI sends for small messages
2018-11-06T16:32:20Z
Peter W. Draper
An attempt to tune these task MPI particle exchanges without affecting the other parts, like repartitioning, which work better with buffered MPI.
Seems to give the same results for MPI tic and toc improvements as tuning in #366. Sadly
for longer runs the improvement is harder to find (other factors are far more dominant,
like longer running task chains), but it can be seen in EAGLE_50 runs, giving
a millisecond or two of improvement for small steps, so worth keeping.
Note that using eager sends like these works best when the receiving recvs are ready,
otherwise the remote node will need to buffer and copy the request anyway, so these should be
kept under control and not grown without suitable consideration.
Matthieu Schaller

https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/459
Task graph
2017-11-15T17:41:08Z
Loic Hausammann
Write a 'dot' file containing the task-dependency graph at the 0th time-step of a simulation. This allows the user to see what physics model is actually being run. We also add a bash script to generate a png file from the 'dot' file.
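For readers unfamiliar with the format, here is a minimal sketch of how a dependency graph can be written in Graphviz DOT syntax. The `toy_dependency` structure, the function name, the graph name and the task names in the usage note are assumptions made for illustration; they are not the code added by this merge request.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical edge of a task-dependency graph: `from` unlocks `to`. */
struct toy_dependency {
  const char *from;
  const char *to;
};

/* Write the dependencies to `out` as a directed graph in DOT syntax.
 * Returns the number of edges written, or -1 if a write failed. */
int toy_write_task_graph(FILE *out, const struct toy_dependency *deps,
                         int nr_deps) {
  assert(out != NULL);
  if (fprintf(out, "digraph task_dep {\n") < 0) return -1;
  for (int i = 0; i < nr_deps; i++) {
    /* One line per edge, with quoted node names. */
    if (fprintf(out, "  \"%s\" -> \"%s\";\n", deps[i].from, deps[i].to) < 0)
      return -1;
  }
  if (fprintf(out, "}\n") < 0) return -1;
  return nr_deps;
}
```

Calling it with the edges `density -> ghost` and `ghost -> force` produces a three-node chain. A file in this format can be turned into an image with Graphviz, for example `dot -Tpng graph.dot -o graph.png` (the file names here are placeholders).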
This implements #92.
Peter W. Draper

https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/458
Add time based edge repartitioning
2017-11-15T17:58:32Z
Peter W. Draper
Also rationalises the naming of the various repartitioning schemes so we have vertex/edge
naming (note that these are still fixed labels, you are not free to choose the vertex or
edge weights from all available).
The various weights of vertices and edges are now accumulated as doubles. This avoids
issues with the size of time (easily > 2**32) and a lot of rescaling. Also reduces the
number of MPI exchanges.
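The overflow concern can be illustrated with a small sketch. It assumes task costs are measured as 64-bit tick counts, which is an assumption of this example; the function name is hypothetical and not taken from the SWIFT sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum per-task costs (here: elapsed ticks) into a double-precision weight.
 * A double represents integers exactly up to 2**53, so sums well beyond
 * 2**32 are held without wrapping and without prior rescaling. */
double toy_accumulate_weight(const uint64_t *ticks, size_t nr_tasks) {
  assert(ticks != NULL || nr_tasks == 0);
  double weight = 0.0;
  for (size_t i = 0; i < nr_tasks; i++) weight += (double)ticks[i];
  return weight;
}
```

Two tasks of 3,000,000,000 ticks each sum to 6,000,000,000, which exceeds 2**32; stored in a 32-bit unsigned integer the same sum would wrap around to 1,705,032,704.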
Add scripts to process the cell dumps to show which cells are active and on the
edge of partitions.
Matthieu Schaller

https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/460
Improvements to i/o and parallel-i/o
2017-11-29T15:01:46Z
Matthieu Schaller
This implements multiple changes to the i/o, especially the parallel version.
- Wrap particles back into the box before writing them (Fix to #374).
- Use the threadpool to construct the internal buffers that are sent to HDF5.
- Make the buffer construction generic across all three i/o routines.
- Allow for more general functions to transform particles into i/o buffer quantities.
- Choose more optimised ROMIO algorithms for parallel-io and delay the writing of meta-data until the file is closed.
This last change allows for a writing speed of 6GB/s on the cosma-6 Lustre storage. (Solved #115)
Peter W. Draper

https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/461
Remove unnecessary dependency between sort and self/density tasks.
2017-11-29T16:54:59Z
Matthieu Schaller
Does what it says on the tin.
Peter W. Draper

https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/463
Add a print units
2017-12-01T12:51:07Z
Loic Hausammann
Peter W. Draper

https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/465
Find missing interactions in DOPAIR2 and vectorized version.
2017-12-08T19:31:26Z
James Willis
Fixes three small bugs in the neighbour search:
- Don't remove rshift from hi_max and pass h_max to populate_max_index_no_cache_force.
- Fixes missing interactions in runner_dopair2_force_vec.
- Fixes missing interactions in DOPAIR2 due to < versus <=.
Vectorization of all the core SPH tasks
Peter W. Draper

https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/443
Debug interactions
2017-12-12T12:09:34Z
James Willis
Adds an option to store the number of particle interactions for both the density and force tasks for each particle. Also stores a list of each particle's neighbours.
Matthieu Schaller

https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/466
Make sure that bash script creates a FAIL in Jenkins.
2017-12-12T16:57:59Z
James Willis
@pdraper
Peter W. Draper