Skip to content
GitLab
Explore
Sign in
Primary navigation
Search or go to…
Project
Q
QuickSched
Manage
Activity
Members
Labels
Plan
Issues
Issue boards
Milestones
Wiki
Code
Merge requests
Repository
Branches
Commits
Tags
Repository graph
Compare revisions
Deploy
Releases
Model registry
Monitor
Incidents
Analyze
Value stream analytics
Contributor analytics
Repository analytics
Model experiments
Help
Help
Support
GitLab documentation
Compare GitLab plans
Community forum
Contribute to GitLab
Provide feedback
Keyboard shortcuts
?
Snippets
Groups
Projects
Show more breadcrumbs
SWIFT
QuickSched
Commits
38010b7d
Commit
38010b7d
authored
10 years ago
by
Pedro Gonnet
Browse files
Options
Downloads
Patches
Plain Diff
fixed description of BH solver, need to add correct numbers and figures.
parent
6a42473a
No related branches found
No related tags found
No related merge requests found
Changes
1
Hide whitespace changes
Inline
Side-by-side
Showing
1 changed file
paper/paper.tex
+71
-79
71 additions, 79 deletions
paper/paper.tex
with
71 additions
and
79 deletions
paper/paper.tex
+
71
−
79
View file @
38010b7d
...
@@ -1088,46 +1088,61 @@ The function recurses as follows:
...
@@ -1088,46 +1088,61 @@ The function recurses as follows:
recurse over all the cell's sub-cells (line~9), and all
recurse over all the cell's sub-cells (line~9), and all
pairs of the cell's sub-cells (line~11),
pairs of the cell's sub-cells (line~11),
\item
If called with a single unsplit cell (line~13),
\item
If called with a single unsplit cell (line~13),
create a self-interaction task on that cell (line~14),
create a self-interaction task as well as a particle-cell
\item
If called with two cells that are sufficiently well
task on that cell (line~14),
separated (line~21), create two particle-cell pair
\item
If called with two non-neighbouring cells (line~21),
interactions (lines~23 and~28) over both cells in
do nothing, as these interactions
opposite orders, which depend on the center of mass
will be computed by the particle-cell task,
task of each cell,
\item
If called with two neighbouring cells and both cells
\item
If called with two cells that are not well
are split (line~33),
separated and both cells are split (line~33),
recurse over all pairs of sub-cells spanning
recurse over all pairs of sub-cells spanning
both cells (line~37), and
both cells (line~37), and
\item
If called with two
cells that are not well separated
\item
If called with two
neighbouring cells
and
either
of the cells are not split, create
and
one
of the cells are not split, create
a particle-particle pair task over both cells.
a particle-particle pair task over both cells.
\end{itemize}
\end{itemize}
\noindent
where every interaction task additionally locks
\noindent
where every interaction task additionally locks
the cells on which it operates (lines~16, 25, 30, and 42--43).
the cells on which it operates (lines~16, 25, 30, and 42--43).
In order to prevent generating
In order to prevent generating
a large number of very small tasks, the task generation only recurses
a large number of very small tasks, the task generation only recurses
if the cells contain more than a minimum number
$
n
_
\mathsf
{
task
}$
if the cells contain more than a minimum number
$
n
_
\mathsf
{
task
}$
of threads each (lines~7 and~34).
of threads each (lines~7 and~34).
The tasks themselves are then left to recurse over the sub-trees,
which is why in these cases, the tasks are made to depend on the
The particle-particle pair interaction tasks are implemented
center of mass tasks (lines~17--18 and~41--47)
by computing the interactions between all particle pairs spanning
which may be used in the ensuing interactions.
both cells in a double for-loop.
The particle-cell interactions for each leaf node are computed by
traversing the tree recursively starting from the root node:
\begin{itemize}
\item
If called with a node that is a hierarchical parent of
the leaf node, or with a node that is a direct neighbour of
a hierarchical parent of the leaf node, recurse over the
node's sub-cells,
\item
Otherwise, compute the interaction between the leaf node's
center of mass and all the particles in the leaf node.
\end{itemize}
This task decomposition differs from the traditional tree-walk
in the Barnes-Hut algorithm in that the particle-cell interactions
are grouped per leaf, with each leaf doing its own tree walk.
This approach was chosen to maximize the memory locality
of the particle-cell calculation, as the particles in the leaf,
which are traversed for each particle-cell interaction, are
contiguous in memory, and are thus more likely to remain in the
lowest-level cache.
This Barnes-Hut tree-code was used to approximate the gravitational
This Barnes-Hut tree-code was used to approximate the gravitational
N-Body problem for 1
\,
000
\,
000 particles with random coordinates
N-Body problem for 1
\,
000
\,
000 particles with
uniformly
random coordinates
in
$
[
0
,
1
]
^
3
$
.
in
$
[
0
,
1
]
^
3
$
.
The parameters
$
n
_
\mathsf
{
max
}
=
100
$
and
$
n
_
\mathsf
{
task
}
=
5000
$
The parameters
$
n
_
\mathsf
{
max
}
=
100
$
and
$
n
_
\mathsf
{
task
}
=
5000
$
were used to generate the tasks, and cell pairs were considered
were used to generate the tasks.
well separated if not directly adjacent.
Using the above scheme generated 97
\,
553 tasks, of which
Using the above scheme generated 97
\,
553 tasks, of which
512 self-interaction tasks, ??? particle-particle interaction
512 self-interaction tasks, ??? particle-particle interaction
task, ??? particle-cell interaction tasks, and 37
\,
449
task, and ??? particle-cell interaction tasks.
center of mass tasks.
A total of 141
\,
840 dependencies were generated, along with
A total of 141
\,
840 dependencies were generated, along with
104
\,
392 locks on 37
\,
449 resources.
104
\,
392 locks on 37
\,
449 resources.
For these tests,
OpenMP
parallelism was used and resource
For these tests,
{
\tt
pthread
}
s
parallelism was used and resource
re-owning was switched off.
re-owning was switched off.
Resource ownership was attributed by dividing the global
Resource ownership was attributed by dividing the global
{
\tt
parts
}
array by the number of queues and assigning each cell's
{
\tt
parts
}
array by the number of queues and assigning each cell's
...
@@ -1549,60 +1564,42 @@ and ID, respectively.
...
@@ -1549,60 +1564,42 @@ and ID, respectively.
\begin{figure}
\begin{figure}
\begin{center}\begin{minipage}
{
0.9
\textwidth
}
\begin{center}\begin{minipage}
{
0.9
\textwidth
}
\begin{lstlisting}
[basicstyle=
\scriptsize\tt
]
\begin{lstlisting}
[basicstyle=
\scriptsize\tt
]
void comp
_
com(struct cell *c)
{
int j, k;
c->com[0] = 0.0; c->com[1] = 0.0; c->com[2] = 0.0;
c->mass = 0.0;
if (c->split)
for (k = 0; k < 8; k++)
{
struct cell *cp = c->progeny[k];
for (j = 0; j < 3; j++)
c->com[j] += cp->com[j] * cp->mass;
c->mass += cp->mass;
}
else
for (k = 0; k < 8; k++)
{
struct part *p =
&
c->parts[k];
for (j = 0; j < 3; j++)
c->com[j] += p->x[j] * p->mass;
c->mass += p->mass;
}
c->com[0] /= c->mass; c->com[1] /= c->mass; c->com[2] /= c->mass;
}
void comp
_
self(struct cell *c)
{
void comp
_
self(struct cell *c)
{
int j, k;
if (c->split)
if (c->split)
for (j = 0; j < 8; j++)
{
for (
int
j = 0; j < 8; j++)
{
comp
_
self(c->progeny[j]);
comp
_
self(c->progeny[j]);
for (k = j + 1; k < 8; k++)
for (
int
k = j + 1; k < 8; k++)
comp
_
pair(c->progeny[j], c->progeny[k]);
comp
_
pair(c->progeny[j], c->progeny[k]);
}
}
else
else
for (j = 0; j < c->count; j++)
for (
int
j = 0; j < c->count; j++)
for (k = j + 1; k < c->count; k++)
for (
int
k = j + 1; k < c->count; k++)
interact c->parts[j] and c->parts[k].
interact c->parts[j] and c->parts[k].
}
}
void comp
_
pair(struct cell *ci, struct cell *cj)
{
void comp
_
pair(struct cell *ci, struct cell *cj)
{
int j, k;
if (ci and cj are not neighbours)
if (ci and cj well separated)
{
return;
comp
_
pair
_
pc(ci, cj);
if (ci->split
&&
cj->split)
{
comp
_
pair
_
pc(cj, ci);
for (int j = 0; j < 8; j++)
}
else if (ci->split
&&
cj->split)
for (int k = 0; k < 8; k++)
for (j = 0; j < 8; j++)
for (k = 0; k < 8; k++)
comp
_
pair(ci->progeny[j], cj->progeny[k]);
comp
_
pair(ci->progeny[j], cj->progeny[k]);
else
}
else
{
for (j = 0; j < ci->count; j++)
for (
int
j = 0; j < ci->count; j++)
for (k = 0; k < cj->count; k++)
for (
int
k = 0; k < cj->count; k++)
interact ci->parts[j] and cj->parts[k].
interact ci->parts[j] and cj->parts[k].
}
}
}
void comp
_
pair
_
cp(struct cell *ci, struct cell *cj)
{
void comp
_
pair
_
cp(struct cell *leaf, struct cell *c)
{
int k;
if (c is a parent of leaf ||
for (k = 0; k < ci->count; k++)
c is a neighbour of a parent of leaf)
{
interact ci->parts[k] and cj center of mass.
for (int k = 0; k < 8; k++)
comp
_
pair
_
cp(leaf, c->progeny[k]);
}
else if (leaf and c are not direct neighbours)
{
for (int k = 0; k < leaf->count; k++)
interact leaf->parts[k] and c center of mass.
}
}
}
\end{lstlisting}
\end{lstlisting}
\end{minipage}\end{center}
\end{minipage}\end{center}
...
@@ -1613,7 +1610,7 @@ void comp_pair_cp(struct cell *ci, struct cell *cj) {
...
@@ -1613,7 +1610,7 @@ void comp_pair_cp(struct cell *ci, struct cell *cj) {
\begin{figure}
\begin{figure}
\begin{center}\begin{minipage}
{
0.9
\textwidth
}
\begin{center}\begin{minipage}
{
0.9
\textwidth
}
\begin{lstlisting}
[basicstyle=
\scriptsize\tt
]
\begin{lstlisting}
[basicstyle=
\scriptsize\tt
]
enum
{
tSELF , tPAIR
_
PP , tPAIR
_
PC
, tCOM
}
;
enum
{
tSELF , tPAIR
_
PP , tPAIR
_
PC
}
;
void make
_
tasks(struct qsched *s, struct cell *ci, struct cell *cj)
{
void make
_
tasks(struct qsched *s, struct cell *ci, struct cell *cj)
{
int j, k;
int j, k;
qsched
_
task
_
t tid;
qsched
_
task
_
t tid;
...
@@ -1627,28 +1624,23 @@ void make_tasks(struct qsched *s, struct cell *ci, struct cell *cj) {
...
@@ -1627,28 +1624,23 @@ void make_tasks(struct qsched *s, struct cell *ci, struct cell *cj) {
}
}
else
{
else
{
tid = qsched
_
addtask(s, tSELF, qsched
_
flags
_
none,
&
ci,
tid = qsched
_
addtask(s, tSELF, qsched
_
flags
_
none,
&
ci,
sizeof(struct cell *), ci->count * ci->count);
sizeof(struct cell *),
ci->count * ci->count);
qsched
_
addlock(s, tid, ci->res);
tid = qsched
_
addtask(s, tPAIR
_
PC, qsched
_
flags
_
none,
&
ci,
sizeof(struct cell *), ci->count);
qsched
_
addlock(s, tid, ci->res);
qsched
_
addlock(s, tid, ci->res);
if (ci->split) qsched
_
addunlock(s, ci->com, tid);
}
}
}
else if (ci and cj are well separated)
{
}
else if (ci->split
&&
cj->split
&&
data[0] = ci; data[1] = cj;
ci->count * cj->count > n
_
task * n
_
task)
tid = qsched
_
addtask(s, tPAIR
_
PC, qsched
_
flags
_
none, data,
sizeof(struct cell *) * 2, ci->count);
qsched
_
addlock(s, tid, ci->res);
qsched
_
addunlock(s, cj->com, tid);
data[0] = cj; data[1] = ci;
tid = qsched
_
addtask(s, tPAIR
_
PC, qsched
_
flags
_
none, data,
sizeof(struct cell *) * 2, cj->count);
qsched
_
addlock(s, tid, cj->res);
qsched
_
addunlock(s, ci->com, tid);
}
else if (ci->split
&&
cj->split
&&
ci->count * cj->count > n
_
task * n
_
task)
for (j = 0; j < 8; j++)
for (j = 0; j < 8; j++)
for (k = 0; k < 8; k++) make
_
tasks(s, ci->progeny[j], cj->progeny[k]);
for (k = 0; k < 8; k++)
make
_
tasks(s, ci->progeny[j], cj->progeny[k]);
else
{
else
{
data[0] = ci; data[1] = cj;
data[0] = ci; data[1] = cj;
tid = qsched
_
addtask(s, tPAIR
_
PP, qsched
_
flags
_
none, data,
tid = qsched
_
addtask(s, tPAIR
_
PP, qsched
_
flags
_
none, data,
sizeof(struct cell *) * 2, ci->count * cj->count);
sizeof(struct cell *) * 2,
ci->count * cj->count);
qsched
_
addlock(s, tid, ci->res);
qsched
_
addlock(s, tid, ci->res);
qsched
_
addlock(s, tid, cj->res);
qsched
_
addlock(s, tid, cj->res);
if (ci->split
&&
cj->split)
{
if (ci->split
&&
cj->split)
{
...
@@ -1679,7 +1671,7 @@ void exec_fun ( int type , void *data ) {
...
@@ -1679,7 +1671,7 @@ void exec_fun ( int type , void *data ) {
comp
_
pair( cells[0] , cells[1] );
comp
_
pair( cells[0] , cells[1] );
break;
break;
case tPAIR
_
PC:
case tPAIR
_
PC:
comp
_
pair
_
pc( cells[0] ,
cells[1]
);
comp
_
pair
_
pc( cells[0] ,
root
);
break;
break;
case tCOM:
case tCOM:
comp
_
com( cells[0] );
comp
_
com( cells[0] );
...
...
This diff is collapsed.
Click to expand it.
Preview
0%
Loading
Try again
or
attach a new file
.
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Save comment
Cancel
Please
register
or
sign in
to comment