Commit 86174363 authored by Matthieu Schaller

Updated the homepage information for computer scientists

parent af6812cd
1 merge request: !11 Updated homepage text for all 3 sections
# Computer Scientist
## Parallelisation strategy
SWIFT uses a hybrid MPI + threads parallelisation scheme with a
modified version of the publicly available lightweight tasking library
[QuickSched](https://gitlab.cosma.dur.ac.uk/swift/quicksched) as its
backbone. Communications between compute nodes are scheduled by the
library itself and use asynchronous calls to MPI to maximise the
overlap between communication and computation. The domain
decomposition itself is performed by splitting the graph of all the
compute tasks, using the METIS library, so as to minimise the number
of required MPI communications. The core calculations in SWIFT use
hand-written SIMD intrinsics to process multiple particles in parallel
and achieve maximal performance.
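
As a flavour of how such a scheme overlaps communication and computation, here is a minimal C sketch (not SWIFT's actual code) in which data exchanges with neighbouring nodes are posted as non-blocking MPI calls and local tasks are executed while the messages are in flight. `do_local_task`, `local_tasks_remaining`, and `enqueue_recv_task` are hypothetical stand-ins for a task scheduler such as QuickSched.

```c
#include <mpi.h>

#define N_NEIGHBOURS 4
#define BUF_SIZE 1024

/* Hypothetical hooks into a task scheduler (placeholders, not a real API). */
extern int local_tasks_remaining(void);
extern void do_local_task(void);
extern void enqueue_recv_task(int source);

void exchange_and_compute(double send[N_NEIGHBOURS][BUF_SIZE],
                          double recv[N_NEIGHBOURS][BUF_SIZE],
                          const int neighbour[N_NEIGHBOURS]) {
  MPI_Request reqs[2 * N_NEIGHBOURS];

  /* Post all receives and sends up front: MPI progresses them in the
   * background while the threads keep working on local tasks. */
  for (int i = 0; i < N_NEIGHBOURS; i++) {
    MPI_Irecv(recv[i], BUF_SIZE, MPI_DOUBLE, neighbour[i], 0, MPI_COMM_WORLD,
              &reqs[i]);
    MPI_Isend(send[i], BUF_SIZE, MPI_DOUBLE, neighbour[i], 0, MPI_COMM_WORLD,
              &reqs[N_NEIGHBOURS + i]);
  }

  /* Overlap phase: run purely local work while messages are in flight,
   * and poll the receives so that each completed message immediately
   * unlocks the tasks that depend on the remote data. */
  int completed = 0;
  while (completed < N_NEIGHBOURS || local_tasks_remaining()) {
    if (local_tasks_remaining()) do_local_task();

    int index, flag;
    MPI_Testany(N_NEIGHBOURS, reqs, &index, &flag, MPI_STATUS_IGNORE);
    if (flag && index != MPI_UNDEFINED) {
      enqueue_recv_task(neighbour[index]);
      completed++;
    }
  }

  /* Ensure the sends have drained before the buffers are reused. */
  MPI_Waitall(N_NEIGHBOURS, &reqs[N_NEIGHBOURS], MPI_STATUSES_IGNORE);
}
```

The key point is that `MPI_Testany` only polls: as long as independent local tasks remain, the cores never sit idle waiting for a message.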
## Strong- and weak-scaling
Cosmological simulations are typically very hard to scale to large
numbers of cores because information from every node is needed to
perform a given time-step. SWIFT uses smart domain decomposition,
vectorisation, and asynchronous communication to provide a 36.7x
speedup over the de-facto standard (the publicly available GADGET-2
code) and near-perfect weak scaling, even on problems larger than
those presented in the published astrophysics literature.
![SWIFT Scaling Plot](scalingplot.png)

The left panel ("Weak Scaling") shows how the run-time of a problem
changes when the number of threads is increased proportionally to the
number of particles in the system (i.e. a fixed 'load per thread').
The right panel ("Strong Scaling") shows how the run-time changes for
a fixed load as it is spread over more threads. The right panel shows
the 36.7x speedup that SWIFT offers over GADGET-2. This test uses a
representative problem: a snapshot of the
[EAGLE](http://adsabs.harvard.edu/abs/2014ApJS..210...14K) simulation
at late times, where the hierarchy of time-steps is very deep and
where most other codes struggle to extract any scaling or performance.
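
To make the two panels concrete, here is a tiny, self-contained helper (illustrative only, with made-up timings rather than SWIFT measurements) that computes the two quantities usually read off such plots: strong-scaling speedup for a fixed problem size and weak-scaling efficiency for a fixed load per thread.

```c
#include <stdio.h>

/* Strong scaling: same total work spread over p threads. Ideal value: p. */
static double strong_speedup(double t1, double tp) { return t1 / tp; }

/* Weak scaling: work per thread held fixed as p grows. Ideal value: 1.0
 * (i.e. a perfectly flat run-time curve). */
static double weak_efficiency(double t1, double tp) { return t1 / tp; }

int main(void) {
  /* Made-up example timings in seconds, not SWIFT measurements. */
  double t1_strong = 1000.0, t64_strong = 20.0;
  double t1_weak = 100.0, t64_weak = 105.0;

  printf("strong-scaling speedup on 64 threads: %.1fx (ideal 64x)\n",
         strong_speedup(t1_strong, t64_strong));
  printf("weak-scaling efficiency on 64 threads: %.2f (ideal 1.00)\n",
         weak_efficiency(t1_weak, t64_weak));
  return 0;
}
```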
## I/O performance
SWIFT uses the parallel HDF5 library to read and write snapshots
efficiently on distributed file systems. By carefully tuning the
Lustre parameters, SWIFT can write snapshots at the maximum disk
write speed of a given system.
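
As a rough illustration of this pattern, the sketch below shows the standard collective parallel-HDF5 write in C: every rank writes its own slice of one shared dataset, and Lustre striping hints are passed through MPI-IO. The file name, dataset name, and striping values are placeholders, not SWIFT's actual choices.

```c
#include <hdf5.h>
#include <mpi.h>

void write_snapshot(const double *buf, hsize_t n_local, hsize_t n_total,
                    hsize_t offset_local) {
  /* Lustre striping hints, forwarded to MPI-IO (values are examples). */
  MPI_Info info;
  MPI_Info_create(&info);
  MPI_Info_set(info, "striping_factor", "16");    /* number of OSTs */
  MPI_Info_set(info, "striping_unit", "4194304"); /* 4 MiB stripes */

  /* Open one file collectively across all ranks. */
  hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
  H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, info);
  hid_t file = H5Fcreate("snapshot.hdf5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

  /* One shared dataset; each rank selects its own hyperslab. */
  hid_t filespace = H5Screate_simple(1, &n_total, NULL);
  hid_t dset = H5Dcreate2(file, "Masses", H5T_NATIVE_DOUBLE, filespace,
                          H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
  H5Sselect_hyperslab(filespace, H5S_SELECT_SET, &offset_local, NULL,
                      &n_local, NULL);
  hid_t memspace = H5Screate_simple(1, &n_local, NULL);

  /* Collective write: ranks cooperate so the file system sees large,
   * well-aligned requests. */
  hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
  H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
  H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl, buf);

  H5Pclose(dxpl);
  H5Sclose(memspace);
  H5Sclose(filespace);
  H5Dclose(dset);
  H5Fclose(file);
  H5Pclose(fapl);
  MPI_Info_free(&info);
}
```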