SWIFT / SWIFTsim, commit d55defab
Authored Jan 21, 2016 by Pedro Gonnet: "tweaked section 2."
Parent: d2fe743b
Changes (1): theory/paper_pasc/pasc_paper.tex
...
...
@@ -19,7 +19,9 @@
% Latex tricks
\newcommand{\oh}[1]{\mbox{${\mathcal O}(#1)$}}
\newcommand{\eqn}[1]{(\ref{eqn:#1})}
\makeatletter
\newcommand{\pushright}[1]{\ifmeasuring@#1\else\omit\hfill$\displaystyle#1$\fi\ignorespaces}
\makeatother

% Some acronyms
\newcommand{\gadget}{{\sc Gadget2}\xspace}
...
...
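As an aside, the two helper macros defined above would be used in running text roughly like this (an illustrative sketch, not part of the commit):

```latex
% Hypothetical usage of the \oh and \eqn macros defined above.
Sorting the particle indices costs \oh{N \log N} operations, and the
resulting accelerations enter the time integration via \eqn{dvdt}.
```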
@@ -167,20 +169,27 @@ tackle ever {\em larger} problems, but not fixed-size problems
Although this switch from growth in speed to growth in parallelism
has been anticipated and observed for quite some time, very little
has changed in terms of how we design and implement parallel
computations.
Branch-and-bound synchronous parallelism using
OpenMP \cite{ref:Dagum1998} and MPI \cite{ref:Snir1998}, as well as
domain decompositions based on geometry or space-filling curves
\cite{warren1993parallel}, are still commonplace, despite both the
architectures and problem scales having changed dramatically since
their introduction.
The design and implementation of \swift\footnote{\swift is an
open-source software project and the latest version of the source
code, along with all the data needed to run the test cases presented
in this paper, can be downloaded at \web.}
\cite{gonnet2013swift,theuns2015swift,gonnet2015efficient}, a
large-scale cosmological simulation code built from scratch, provided
the perfect opportunity to test some newer approaches,
i.e.~task-based parallelism, fully asynchronous communication, and
graph partition-based domain decompositions.
This paper describes these techniques, which are not exclusive to
cosmological simulations or any specific architecture, as well as
the results obtained with them.
%#####################################################################################################
...
...
@@ -221,12 +230,12 @@ Once the densities $\rho_i$ have been computed, the time derivatives of the
velocity and internal energy, which require $\rho_i$, are
computed as follows:
%
\begin{align}
  \frac{dv_i}{dt} & = -\sum_{j,~r_{ij} < \hat{h}_{ij}} m_j \left[
    \frac{P_i}{\Omega_i\rho_i^2} \nabla_r W(r_{ij},h_i) \right. +
    \label{eqn:dvdt} \\
  & \pushright{\left. \frac{P_j}{\Omega_j\rho_j^2} \nabla_r W(r_{ij},h_j)
    \right], \nonumber} \\
  \frac{du_i}{dt} & = \frac{P_i}{\Omega_i\rho_i^2} \sum_{j,~r_{ij} < h_i}
    m_j (\mathbf v_i - \mathbf v_j) \cdot \nabla_r W(r_{ij},h_i),
    \label{eqn:dudt}
\end{align}
%
where $\hat{h}_{ij} = \max\{h_i,h_j\}$, and the particle pressure
$P_i = \rho_i u_i (\gamma - 1)$ and correction term $\Omega_i = 1 +
...
...
@@ -256,7 +265,6 @@ separately:
Finding the interacting neighbours for each particle constitutes
the bulk of the computation.
Many codes, e.g.~in Astrophysics simulations, rely on spatial
{\em trees} for neighbour finding
\cite{Gingold1977,Hernquist1989,Springel2005,Wadsley2004},
i.e.~$k$-d trees \cite{Bentley1975} or octrees \cite{Meagher1982}
...
...
@@ -297,6 +305,7 @@ Finally, the necessary communication between nodes can itself be
modelled in a task-based way, interleaving communication seamlessly
with the rest of the computation.

\subsection{Task-based parallelism}

Task-based parallelism is a shared-memory parallel programming
...
...