SWIFTsim merge requests
https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests

!1624 Draft: Add magma2 SPH scheme (Zhen Xiang, 2023-07-15)
https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/1624
I have just added the MAGMA2 scheme on the MAGMA2 branch; is there any advice for the future implementation?
I added two extra profiles:
- hydro_iact_cosmo.h and hydro_iact_sphenix.h: one adds the cosmological term and the other uses SPHENIX's diffusion parameter.
- There is a parameter called h_crit in runner_iact_force and runner_iact_nonsym_force, which is the inverse of the resolution eta. At the moment I change this parameter manually every time.
- For the matrix inversion, I just pasted the function invert_dimension_by_dimension_matrix into SWIFT; it needs some clean-up so that the existing function is called instead.
- I currently use constant viscosity and diffusion parameters directly in the function without defining them in hydro_parameter.h. This also needs to change a bit.
- Lastly, MAGMA2 uses quadratic mid-point reconstruction; linear reconstruction is also worth testing, since it may increase the computing efficiency.

!1595 Remove engine_marktasks() entirely (Matthieu Schaller, 2024-03-18)
https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/1595
This is a relic of a long-gone past where marktasks and unskip needed to be different. Now it just makes us duplicate entire code blocks for no good reason.
`engine_unskip()` is superior as it only loops over the tree sections where actual work could happen, so that's the version we should keep.
This version keeps marktasks as a way to verify unskip's job. Once we are happy they are identical (as they should be), we can remove that check.
Actual changes to keep past these tests:
- Better names for the various BH ghosts between the task names and the cell-carried task pointers.
- More uniform activation (for readability) of the gravity drift related to MPI between unskip and marktasks.
- Remove the activation of some gas drifts when neighbouring a BH cell where the BH is not active.
- Fix a mismatch where marktasks was activating unnecessary star pairs.
- Fix a mismatch where marktasks was activating unnecessary BH ghosts.
- Fix a mismatch where marktasks was not activating star drifts in the case of stars doing only gravity.
- Fix a mismatch where marktasks was not activating RT_in tasks.
Assignee: Matthieu Schaller

!1530 Draft: Blank functions for MHD support (Matthieu Schaller, 2024-03-12)
https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/1530
Changes:
- Moved the signal velocity calculation to a new file that is SPH-scheme independent. There are two versions in there: one for MHD runs and one for non-MHD runs.
- Added the `mhd` directory for the scheme-specific things. In there, there is only the `None` scheme for now, i.e. for no-MHD runs.
- Added all the empty functions for the loops over neighbours
- Added all the empty functions for the things done in-between loops (drift, kick, reset...)
- Added a small structure that is carried by the particles to contain the MHD fields.
Todo:
- [ ] Deal with the case where the MHD wants three loops but the hydro only needs two...
- [x] Deal with Gasoline signal velocity
- [x] Prevent MHD + Gizmo
- [x] Unit tests
- [x] Cleanly deal with mu_0
- [x] Cleanly deal with stats
- [ ] Cleanly deal with div v
- [ ] Work out units of B^2 PS.

!1498 Draft: Runtime dependency check (Bert Vandenbroucke, 2022-05-19)
https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/1498
I will just leave this here with some extra documentation so that I can pick this up when I get back.
This MR implements the long-needed runtime dependency check, which checks, at runtime, whether any of the task dependencies get violated. In principle, the task dependency graph looks something like this (for a single cell):
![mini_graph_full](/uploads/4c518fd6ca2876377e1b3aca10b78afd/mini_graph_full.png)
But, because of inactive particles, it sometimes looks like this in practice:
![mini_graph_partial](/uploads/6f256125f52a14aa621d299e504e0289/mini_graph_partial.png)
Not unskipping tasks is equivalent to graying out parts of the graph, which also grays out (and effectively removes) the corresponding dependencies. In the example, task 2a (this could be a pair interaction between an active and an inactive cell) is effectively loose (i.e. has no dependencies), so it can be executed before, after, or (if the cell is not locked) while tasks 4-6 run. This is wrong, and it is what we want to detect at runtime.
The runtime dependency check starts from the basic assumption that the task graph for steps where all particles are active (as constructed in `engine_maketasks`) is complete and needs to be satisfied at all times. Based on this, we can construct a _dependency mask_ for every task that marks all the tasks that need to run *before* that task, like this mask for task 5a:
![mini_graph_check](/uploads/08b50c29607907df26b05ffda82c777e/mini_graph_check.png)
If every cell stores a counter with the number of tasks of each type (and subtype) it has already processed (this counter actually already exists if you use debugging checks; it is currently not used), then we could check that the counters marked by a task's mask are all zero when the task starts. If this is not the case, then we know the graph was violated, and we have detected a runtime dependency error.
To implement this kind of check, we have to make sure that each task has exactly one dependency mask. This is only true if the global graph (including all tasks at all levels and tasks involving proxies) is a Directed Acyclic Graph (DAG). This MR therefore includes some changes that convert the graph into a DAG by removing a circular dependency chain: stars sort, stars density, stars receive, and back to stars sort. Cyclic dependencies between a task and itself (because of level recursion) are allowed, since those are easy to detect.
The second step (currently only partially implemented) is to convert the task dependencies as created during `engine_maketasks` into a dependency mask for every task. I have a simple Python code that can do this from the regular `dependency_graph.csv` output. However, when I try to apply the same logic to the actual task dependencies, I end up missing some dependencies because of inter-level dependencies. This can be fixed by first creating the equivalent of the global graph and then running the algorithm on that. So that is the next step.
In the current implementation, the task masks are constructed and written to a file for inspection, but they are not actually used. Once they are complete, they need to be stored in the engine so that they are available to the runners at the start of a task in `runner_main`. Once that is done, the cell task counters need to be changed a bit so that they are updated recursively. With that, the runtime dependency check should work.
Assignee: Bert Vandenbroucke

!1469 WIP: Do not activate empty stars and BH pairs (Bert Vandenbroucke, 2023-01-31)
https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/1469
Supersedes !1457 after !1458 was merged.
Todo:
- [ ] Fix the dx_max incompatibility.
Assignee: Matthieu Schaller

!1437 Draft: Reduced MPI message size for hydro (Bert Vandenbroucke, 2021-09-29)
https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/1437
This merge request reduces the data volume that needs to be sent to update the proxies for the hydro interactions. In the original scheme, the entire particle was sent prior to the interaction, yielding a total communication volume of 2P or 3P, depending on the number of loops. However, because of the way the hydro interactions work, almost all variables stored in the particle change at most once during the loops: e.g. the position is drifted but remains constant after that, and the density is updated during the first loop but should not be used prior to that anyway. In other words, the total volume that actually needs to be sent (distributed over the 2 or 3 communications) should be more or less P. For the default SPHENIX hydro scheme, this means a potential volume reduction of 2/3!
To reduce the size of the messages, an additional "reduced particle" needs to be created for the various communications. Prior to the send, a new buffer of these "particles" is allocated and the required variables are copied over from the original particle into the reduced particle. This reduced buffer is then used for the send. On the receiving end, a new "reduced" receive buffer of the same size is allocated and upon successful completion of the communication the variables from the reduced buffer are copied over into the particles in the proxy. The latter is done during the recv task and will hence already show up in task plots and timers. The same is not true for the send, since that is emitted in the scheduler. The buffer allocation and copying for sends is therefore done in a new task that unlocks the send task, similar to what is already done for the limiter.
A major disadvantage of this new approach is that hydro schemes (and all subgrid schemes that use data stored in the part) can no longer be ignorant about what happens during the communications. To deal with this in a somewhat clean way, I have created a new file, `hydro_pack.h`, that deals with the definition of the "reduced particle" structs and all packing and unpacking related functions. This file needs to be implemented for each hydro scheme in a similar way as `hydro.h` and `hydro_part.h`, and should in turn use a similar file (with structs and functions) to deal with subgrid schemes. The developer/maintainer of a specific hydro scheme or subgrid scheme is then responsible for deciding which variables need to be sent prior to each loop, and for copying these variables between the particles on both ranks and the intermediate "reduced particle".
A lot of things remain to be done:
- [ ] right now, inactive cells (cells with only inactive particles) are only communicated once, i.e. by `send_xv`. Since inactive particles can be neighbours of active particles on the remote rank, we need to make sure all the required variables are sent through for these cells. This means either activating the other sends again for these cells, or creating a different send for inactive cells that sends more information than the reduced `send_xv`
- [ ] because of the above problem, the reduced `send_xv` still sends the whole particle, but in a less efficient way than what was done before (because we still use an intermediate buffer)
- [ ] the current implementation only filters out hydro variables; subgrid related variables are blindly `memcpy`d for now
- [ ] the current implementation only works for SPHENIX
- [ ] lots of documentation is missing
I have tested the new implementation on steps 100-120 of the EAGLE_12 low-z example (on 4 ranks and 3 threads per rank) and that seems to work. A preliminary test on a FLAMINGO benchmark run shows a total message volume reduction of 15% and a time gain of 5s (out of a total of 140s) for a representative large step. So some progress, but not spectacular.
Assignee: Matthieu Schaller

!1393 Slimming down of foreign gpart + reduced comm size (Matthieu Schaller, 2024-01-10)
https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/1393
Here are the changes to improve the foreign memory usage of `gpart`s and help with the related communications.
This includes:
- Creation of two new particle types. One for the regular foreign `gpart` and one for the `gpart` when running FOF.
- Modification of the gravity cache construction to use the foreign particle type when acting on a foreign cell.
- Usage of the new packing task to fish out the fields we want. (No need to unpack anything however)
- Modification of the `gpart` and `fof` communications.
- Modification of the construction of the foreign buffers to accommodate both types.
All the physics changes are the same as what was tried in !1318. The difference is that the packing is done here by hand and that the case of FOF is correctly handled.
Assignee: Peter W. Draper

!1353 Draft: Subtask speedup - Still requires work (Matthieu Schaller, 2022-08-06)
https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/1353
For now, compile with `CFLAGS=-DONLY_SUBTASKS` added to the configuration line.
This introduces significantly faster neighbour finding in particle distributions with strong density gradients, for instance in planetary applications or problematic feedback-disturbed galaxies.
Changes:
- Add the brute-force density checks to the planetary scheme (not just sphenix)
- Rewrite the recursion logic in the hydro and stars sub-task:
- The interaction functions have extra parameters to optionally consider particles only between 0.5 * width and width
  - The subtask recursion now continues to lower levels if we reach a level where h is too large. From that level on, we just use the feature of only considering particles in the appropriate range of h.
- When recursing, only the h_max of _active_ particles is considered, not all particles.
Notes:
- We will consider c7adb289391c613ccebf2be0bd630b83f86fe171 separately.
- In a second phase, I'll remove entirely the self + pair tasks and keep only the subs.
Todo:
- [x] Verify RT isn't broken
- [x] Verify MPI runs are happy.
- [x] Port changes to the other hydro schemes.
- [ ] Handle particles drifting out of their cells.
Fixes #688.

!995 WIP GIZMO entropy switch (Bert Vandenbroucke, 2022-06-27)
https://gitlab.cosma.dur.ac.uk/swift/swiftsim/-/merge_requests/995
Reimplementation of !816 on top of the refactored version of Gizmo. !816 can now be safely closed.
All relevant code from the original merge request has been reimplemented. I still need to run a complete set of hydro tests to make sure this does not break anything. I have not tested yet whether this fixes the issues with the Zeldovich pancake.
Assignee: Matthieu Schaller