Commit 38010b7d, authored 10 years ago by Pedro Gonnet
fixed description of BH solver, need to add correct numbers and figures.
parent 6a42473a
paper/paper.tex: 71 additions, 79 deletions
...
...
@@ -1088,46 +1088,61 @@ The function recurses as follows:
recurse over all the cell's sub-cells (line~9), and all
pairs of the cell's sub-cells (line~11),
\item If called with a single unsplit cell (line~13),
- create a self-interaction task on that cell (line~14),
- \item If called with two cells that are sufficiently well
- separated (line~21), create two particle-cell pair
- interactions (lines~23 and~28) over both cells in
- opposite orders, which depend on the center of mass
- task of each cell,
- \item If called with two cells that are not well
- separated and both cells are split (line~33),
+ create a self-interaction task as well as a particle-cell
+ task on that cell (line~14),
+ \item If called with two non-neighbouring cells (line~21),
+ do nothing, as these interactions
+ will be computed by the particle-cell task,
+ \item If called with two neighbouring cells and both cells
+ are split (line~33),
recurse over all pairs of sub-cells spanning
both cells (line~37), and
- \item If called with two cells that are not well separated and either of the cells are not split, create
+ \item If called with two neighbouring cells and one of the cells is not split, create
a particle-particle pair task over both cells.
\end{itemize}
\noindent
where every interaction task additionally locks
the cells on which it operates (lines~16, 25, 30, and 42--43).
In order to prevent generating
a large number of very small tasks, the task generation only recurses
if the cells contain more than a minimum number $n_\mathsf{task}$
of particles each (lines~7 and~34).
The tasks themselves are then left to recurse over the sub-trees,
which is why in these cases, the tasks are made to depend on the
center of mass tasks (lines~17--18 and~41--47)
which may be used in the ensuing interactions.
The particle-particle pair interaction tasks are implemented
by computing the interactions between all particle pairs spanning
both cells in a double for-loop.
The particle-cell interactions for each leaf node are computed by
traversing the tree recursively starting from the root node:
\begin{itemize}
\item
If called with a node that is a hierarchical parent of
the leaf node, or with a node that is a direct neighbour of
a hierarchical parent of the leaf node, recurse over the
node's sub-cells,
\item
Otherwise, if the node is not a direct neighbour of the leaf node,
compute the interaction between the node's center of mass and
all the particles in the leaf node.
\end{itemize}
This task decomposition differs from the traditional tree-walk
in the Barnes-Hut algorithm in that the particle-cell interactions
are grouped per leaf, with each leaf doing its own tree walk.
This approach was chosen to maximize the memory locality
of the particle-cell calculation, as the particles in the leaf,
which are traversed for each particle-cell interaction, are
contiguous in memory, and are thus more likely to remain in the
lowest-level cache.
This Barnes-Hut tree-code was used to approximate the gravitational
- N-Body problem for 1\,000\,000 particles with random coordinates
+ N-Body problem for 1\,000\,000 particles with uniformly random coordinates
in $[0,1]^3$. The parameters $n_\mathsf{max}=100$ and $n_\mathsf{task}=5000$
- were used to generate the tasks, and cell pairs were considered
- well separated if not directly adjacent.
+ were used to generate the tasks.
Using the above scheme generated 97\,553 tasks, of which
512 self-interaction tasks, ??? particle-particle interaction
- task, ??? particle-cell interaction tasks, and 37\,449
- center of mass tasks.
+ task, and ??? particle-cell interaction tasks.
A total of 141\,840 dependencies were generated, along with
104\,392 locks on 37\,449 resources.
- For these tests, OpenMP parallelism was used and resource
+ For these tests, {\tt pthread}s parallelism was used and resource
re-owning was switched off.
Resource ownership was attributed by dividing the global
{\tt parts} array by the number of queues and assigning each cell's
...
...
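The leaf-driven particle-cell walk described in the hunk above can be illustrated with a small, self-contained C sketch. It is not the paper's code: it assumes a one-dimensional binary tree with all leaves at the same depth instead of an octree, and rather than evaluating forces it only sums the mass that a leaf receives through particle-cell interactions. `DEPTH`, `build`, `find_leaf` and `walk_pc` are hypothetical names introduced for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define DEPTH 4 /* assumed: every leaf sits at this level */

struct cell {
  int level, idx;          /* cell covers [idx, idx + 1) / 2^level */
  double mass;             /* total mass below this cell */
  struct cell *progeny[2]; /* both NULL for a leaf */
};

/* Build a complete binary tree; leaf idx gets the arbitrary mass 1 + idx. */
static struct cell *build(int level, int idx) {
  struct cell *c = malloc(sizeof *c);
  c->level = level;
  c->idx = idx;
  if (level == DEPTH) {
    c->progeny[0] = c->progeny[1] = NULL;
    c->mass = 1.0 + idx;
  } else {
    c->progeny[0] = build(level + 1, 2 * idx);
    c->progeny[1] = build(level + 1, 2 * idx + 1);
    c->mass = c->progeny[0]->mass + c->progeny[1]->mass;
  }
  return c;
}

/* Descend from c to the leaf with the given index. */
static struct cell *find_leaf(struct cell *c, int idx) {
  while (c->progeny[0] != NULL)
    c = c->progeny[(idx >> (DEPTH - c->level - 1)) & 1];
  return c;
}

/* Mass that `leaf` sees through particle-cell interactions below `c`. */
static double walk_pc(const struct cell *leaf, const struct cell *c) {
  assert(leaf->level >= c->level);
  /* Index of the leaf's ancestor (or the leaf itself) at c's level. */
  int anc = leaf->idx >> (leaf->level - c->level);
  int dist = abs(c->idx - anc);
  if (c->progeny[0] != NULL && dist <= 1) {
    /* c is a parent of the leaf, or a neighbour of one: open it. */
    return walk_pc(leaf, c->progeny[0]) + walk_pc(leaf, c->progeny[1]);
  }
  if (dist > 1)
    return c->mass; /* far enough: treat c as a single point mass */
  return 0.0; /* the leaf itself or a direct neighbour: no particle-cell term */
}
```

For a tree built with `build(0, 0)`, the total mass is 136 and `walk_pc(find_leaf(root, 5), root)` returns 118, which is the total minus the masses of leaf 5 and its two direct neighbours. Those remaining contributions are the ones the self and particle-particle pair tasks cover.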
@@ -1549,60 +1564,42 @@ and ID, respectively.
\begin{figure}
\begin{center}\begin{minipage}{0.9\textwidth}
\begin{lstlisting}[basicstyle=\scriptsize\tt]
  void comp_com(struct cell *c) {
    int j, k;
    c->com[0] = 0.0; c->com[1] = 0.0; c->com[2] = 0.0;
    c->mass = 0.0;
    if (c->split)
      for (k = 0; k < 8; k++) {
        struct cell *cp = c->progeny[k];
        for (j = 0; j < 3; j++)
          c->com[j] += cp->com[j] * cp->mass;
        c->mass += cp->mass;
      }
    else
      for (k = 0; k < c->count; k++) {
        struct part *p = &c->parts[k];
        for (j = 0; j < 3; j++)
          c->com[j] += p->x[j] * p->mass;
        c->mass += p->mass;
      }
    c->com[0] /= c->mass; c->com[1] /= c->mass; c->com[2] /= c->mass;
  }
  void comp_self(struct cell *c) {
    int j, k;
    if (c->split)
-     for (j = 0; j < 8; j++) {
+     for (int j = 0; j < 8; j++) {
        comp_self(c->progeny[j]);
-       for (k = j + 1; k < 8; k++)
+       for (int k = j + 1; k < 8; k++)
          comp_pair(c->progeny[j], c->progeny[k]);
      }
    else
-     for (j = 0; j < c->count; j++)
-       for (k = j + 1; k < c->count; k++)
+     for (int j = 0; j < c->count; j++)
+       for (int k = j + 1; k < c->count; k++)
          interact c->parts[j] and c->parts[k].
  }
  void comp_pair(struct cell *ci, struct cell *cj) {
    int j, k;
-   if (ci and cj well separated) {
-     comp_pair_pc(ci, cj);
-     comp_pair_pc(cj, ci);
-   }
-   else if (ci->split && cj->split)
-     for (j = 0; j < 8; j++)
-       for (k = 0; k < 8; k++)
+   if (ci and cj are not neighbours)
+     return;
+   if (ci->split && cj->split) {
+     for (int j = 0; j < 8; j++)
+       for (int k = 0; k < 8; k++)
          comp_pair(ci->progeny[j], cj->progeny[k]);
-   else
-     for (j = 0; j < ci->count; j++)
-       for (k = 0; k < cj->count; k++)
+   }
+   else {
+     for (int j = 0; j < ci->count; j++)
+       for (int k = 0; k < cj->count; k++)
          interact ci->parts[j] and cj->parts[k].
+   }
  }
- void comp_pair_cp(struct cell *ci, struct cell *cj) {
-   int k;
-   for (k = 0; k < ci->count; k++)
-     interact ci->parts[k] and cj center of mass.
+ void comp_pair_cp(struct cell *leaf, struct cell *c) {
+   if (c is a parent of leaf ||
+       c is a neighbour of a parent of leaf) {
+     for (int k = 0; k < 8; k++)
+       comp_pair_cp(leaf, c->progeny[k]);
+   }
+   else if (leaf and c are not direct neighbours) {
+     for (int k = 0; k < leaf->count; k++)
+       interact leaf->parts[k] and c center of mass.
+   }
  }
\end{lstlisting}
\end{minipage}\end{center}
...
...
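The double loops in `comp_self` and `comp_pair` above leave the interaction itself abstract. A minimal sketch, assuming a hypothetical callback type `kernel_fn` and hypothetical helpers `pair_pp`, `self_pp` and `tally_kernel` (none of which appear in QuickSched or the paper), makes the loop bounds concrete and lets them be checked by counting kernel calls:

```c
#include <assert.h>

struct part {
  double x[3]; /* position */
  double mass;
};

/* Hypothetical kernel type: stands in for "interact a and b". */
typedef void (*kernel_fn)(struct part *pa, struct part *pb, void *ctx);

/* Pair of unsplit cells: visit all ni * nj pairs spanning both cells. */
static void pair_pp(struct part *pi, int ni, struct part *pj, int nj,
                    kernel_fn interact, void *ctx) {
  for (int j = 0; j < ni; j++)
    for (int k = 0; k < nj; k++)
      interact(&pi[j], &pj[k], ctx);
}

/* Single unsplit cell: visit each unordered pair exactly once. */
static void self_pp(struct part *p, int n, kernel_fn interact, void *ctx) {
  for (int j = 0; j < n; j++)
    for (int k = j + 1; k < n; k++)
      interact(&p[j], &p[k], ctx);
}

/* Example kernel: count the calls and sum the mass products. */
struct tally {
  int calls;
  double mass_products;
};

static void tally_kernel(struct part *pa, struct part *pb, void *ctx) {
  struct tally *t = ctx;
  t->calls += 1;
  t->mass_products += pa->mass * pb->mass;
}
```

With cells of 3 and 2 particles, `pair_pp` makes 3 * 2 = 6 kernel calls, while `self_pp` over the 3-particle cell makes 3. The pair count is the same product, `ci->count * cj->count`, that the task-generation listing passes as the final argument of `qsched_addtask` for particle-particle pair tasks.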
@@ -1613,7 +1610,7 @@ void comp_pair_cp(struct cell *ci, struct cell *cj) {
  \begin{figure}
  \begin{center}\begin{minipage}{0.9\textwidth}
  \begin{lstlisting}[basicstyle=\scriptsize\tt]
- enum { tSELF , tPAIR_PP , tPAIR_PC , tCOM };
+ enum { tSELF , tPAIR_PP , tPAIR_PC };
  void make_tasks(struct qsched *s, struct cell *ci, struct cell *cj) {
    int j, k;
    qsched_task_t tid;
...
...
@@ -1627,28 +1624,23 @@ void make_tasks(struct qsched *s, struct cell *ci, struct cell *cj) {
    }
    else {
      tid = qsched_addtask(s, tSELF, qsched_flags_none, &ci,
-                          sizeof(struct cell *), ci->count * ci->count);
+                          sizeof(struct cell *),
+                          ci->count * ci->count);
      qsched_addlock(s, tid, ci->res);
+     tid = qsched_addtask(s, tPAIR_PC, qsched_flags_none, &ci,
+                          sizeof(struct cell *), ci->count);
+     qsched_addlock(s, tid, ci->res);
      if (ci->split) qsched_addunlock(s, ci->com, tid);
    }
- }
- else if (ci and cj are well separated) {
-   data[0] = ci; data[1] = cj;
-   tid = qsched_addtask(s, tPAIR_PC, qsched_flags_none, data,
-                        sizeof(struct cell *) * 2, ci->count);
-   qsched_addlock(s, tid, ci->res);
-   qsched_addunlock(s, cj->com, tid);
-   data[0] = cj; data[1] = ci;
-   tid = qsched_addtask(s, tPAIR_PC, qsched_flags_none, data,
-                        sizeof(struct cell *) * 2, cj->count);
-   qsched_addlock(s, tid, cj->res);
-   qsched_addunlock(s, ci->com, tid);
- }
- else if (ci->split && cj->split && ci->count * cj->count > n_task * n_task)
+ }
+ else if (ci->split && cj->split && ci->count * cj->count > n_task * n_task)
    for (j = 0; j < 8; j++)
-     for (k = 0; k < 8; k++) make_tasks(s, ci->progeny[j], cj->progeny[k]);
+     for (k = 0; k < 8; k++)
+       make_tasks(s, ci->progeny[j], cj->progeny[k]);
  else {
    data[0] = ci; data[1] = cj;
    tid = qsched_addtask(s, tPAIR_PP, qsched_flags_none, data,
-                        sizeof(struct cell *) * 2, ci->count * cj->count);
+                        sizeof(struct cell *) * 2,
+                        ci->count * cj->count);
    qsched_addlock(s, tid, ci->res);
    qsched_addlock(s, tid, cj->res);
    if (ci->split && cj->split) {
...
...
@@ -1679,7 +1671,7 @@ void exec_fun ( int type , void *data ) {
      comp_pair( cells[0] , cells[1] );
      break;
    case tPAIR_PC:
-     comp_pair_pc( cells[0] , cells[1] );
+     comp_pair_pc( cells[0] , root );
      break;
    case tCOM:
      comp_com( cells[0] );
...
...