Skip to content
Snippets Groups Projects
Commit 44683d78 authored by Matthieu Schaller's avatar Matthieu Schaller
Browse files

Webpage, wiggles and new conclusion section title

parent 6f5b2239
No related branches found
No related tags found
2 merge requests!136Master,!80PASC paper
......@@ -166,13 +166,15 @@ OpenMP\cite{ref:Dagum1998} and MPI\cite{ref:Snir1998}, and domain
decompositions based on space-filling curves \cite{warren1993parallel}.
The design and implementation of \swift \cite{gonnet2013swift,%
theuns2015swift,gonnet2015efficient}, a large-scale cosmological
simulation code built from scratch, provided the perfect
opportunity to test some newer approaches, i.e.~task-based parallelism,
fully asynchronous communication, and graph partition-based
domain decompositions.
This paper describes the results obtained with these parallelisation
techniques.
theuns2015swift,gonnet2015efficient}, a large-scale cosmological simulation
code built from scratch, provided the perfect opportunity to test some newer
approaches, i.e.~task-based parallelism, fully asynchronous communication, and
graph partition-based domain decompositions. The code is open-source and
available at the address \url{www.swiftsim.com} where all the test cases
presented in this paper can also be found.
This paper describes the results
obtained with these parallelisation techniques.
%#####################################################################################################
......@@ -570,7 +572,8 @@ algorithm described above in the case of 32 MPI ranks.
Using 16 threads per node (no use of hyper-threading) with one MPI
rank per node, a reasonable parallel efficiency is achieved when
increasing the thread count from 1 (1 node) to 256 (16 nodes) even
on a relatively small test case.
on a relatively small test case. Wiggles are likely due to the way thread
affinity is set by the operating system at run time.
\label{fig:cosma}}
\end{figure*}
......@@ -669,7 +672,7 @@ test are shown on Fig.~\ref{fig:JUQUEEN2}.
%#####################################################################################################
\section{Conclusions}
\section{Discussion \& Conclusion}
When running on the SuperMUC machine with 32 nodes (512 cores), each MPI rank
contains approximately $1.6\times10^7$ particles in $2.5\times10^5$
......
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please register or to comment