Skip to content
Snippets Groups Projects
Commit 60244d24 authored by Matthieu Schaller's avatar Matthieu Schaller
Browse files

Added P2L kernel to the FMM document and script to generate them

parent e48aaf16
No related branches found
No related tags found
1 merge request!1077Improved multipole acceptance criterion (MAC)
...@@ -179,26 +179,35 @@ the other kernels: ...@@ -179,26 +179,35 @@ the other kernels:
a_x(\mathbf{x}_a) &= G\sum_{\mathbf{m}}^p \mathsf{M}_{\mathbf{m}} \mathsf{D}_{\mathbf{m}+\left(1,0,0\right)}(\mathbf{R}). a_x(\mathbf{x}_a) &= G\sum_{\mathbf{m}}^p \mathsf{M}_{\mathbf{m}} \mathsf{D}_{\mathbf{m}+\left(1,0,0\right)}(\mathbf{R}).
\label{eq:fmm:M2P} \label{eq:fmm:M2P}
\end{align} \end{align}
A traditional tree-code uses solely that kernel to obtain the forces A traditional tree-code uses solely that kernel to obtain the forces from
from the multipoles (or often just monopoles, i.e. setting $p=0$ throughout) the multipoles (or often just monopoles, i.e. setting $p=0$ throughout) to
to the particles.\\ the particles. Similarly, the field tensor of a cell can receive the
All the kernels (Eqs.~\ref{eq:fmm:P2M}-\ref{eq:fmm:M2P}) are rather contribution from a single particle at a distance $\mathbf{R} \equiv
straightforward to evaluate as they are only made of additions and \mathbf{z}_A - \mathbf{x}_b$ via the P2L kernel:
multiplications (provided $\mathsf{D}$ can be evaluated quickly), \begin{equation}
which are extremely efficient instructions on modern architectures \mathsf{F}_{\mathbf{n}}(\mathbf{z}_A) = G m_b
(see Appendix \ref{sec:pot_derivatives} for the full \mathsf{D}_{\mathbf{n}}(\mathbf{R}).
expressions). However, the fully expanded sums can lead to rather \label{eq:fmm:P2L}
large and prone to typo expressions. To avoid any mishaps, we use a \end{equation}
\texttt{python} script to generate C code in which all the sums are The M2P and P2L kernels can be used to speed up the calculations
unrolled and correct by construction. In \swift, we implemented the involving only as single particle. All the kernels
kernels up to order $p=5$, as it proved to be accurate enough for our (Eqs.~\ref{eq:fmm:P2M}-\ref{eq:fmm:P2L}) are rather straightforward to
purpose, but this could be extended to higher order easily. This evaluate as they are only made of additions and multiplications
implies storing $56$ numbers per cell for each $\textsf{M}$ and (provided $\mathsf{D}$ can be evaluated quickly), which are extremely
$\textsf{F}$ plus three numbers for the location of the centre of efficient instructions on modern architectures (see Appendix
mass. For leaf-cells with large numbers of particles, as in \swift, \ref{sec:pot_derivatives} for the full expressions). However, the
this is a small memory overhead. One further small improvement fully expanded sums can lead to rather large and prone to typo
consists in choosing $\mathbf{z}_A$ to be the centre of mass of cell expressions. To avoid any mishaps, we use a \texttt{python} script to
$A$ rather than its geometrical centre. The first order multipoles generate C code in which all the sums are unrolled and correct by
construction. In \swift, we implemented the kernels up to order $p=5$,
as it proved to be accurate enough for our purpose, but this could be
extended to higher order easily. This implies storing $56$ numbers per
cell for each $\textsf{M}$ and $\textsf{F}$ plus three numbers for the
location of the centre of mass. For leaf-cells with large numbers of
particles, as in \swift, this is a small memory overhead. One further
small improvement consists in choosing $\mathbf{z}_A$ to be the centre
of mass of cell $A$ rather than its geometrical centre. The first
order multipoles
($\mathsf{M}_{100},\mathsf{M}_{010},\mathsf{M}_{001}$) then vanish by ($\mathsf{M}_{100},\mathsf{M}_{010},\mathsf{M}_{001}$) then vanish by
construction. This allows us to simplify some of the expressions and construction. This allows us to simplify some of the expressions and
helps reduce, albeit by a small fraction, the memory footprint of the helps reduce, albeit by a small fraction, the memory footprint of the
......
...@@ -240,7 +240,10 @@ print("-------------------------------------------------\n") ...@@ -240,7 +240,10 @@ print("-------------------------------------------------\n")
if order > 0: if order > 0:
print("#if SELF_GRAVITY_MULTIPOLE_ORDER > %d" % (order - 1)) print("#if SELF_GRAVITY_MULTIPOLE_ORDER > %d" % (order - 1))
print("/* Shift %s order terms (1st order mpole (all 0) commented out) */" % ordinal(order)) print(
"/* Shift %s order terms (1st order mpole (all 0) commented out) */"
% ordinal(order)
)
# Create all the terms relevent for this order # Create all the terms relevent for this order
for i in range(order + 1): for i in range(order + 1):
...@@ -324,6 +327,26 @@ if order > 0: ...@@ -324,6 +327,26 @@ if order > 0:
print("#endif") print("#endif")
print("")
print("-------------------------------------------------")
print("gravity_P2L():")
print("-------------------------------------------------\n")
if order > 0:
print("#if SELF_GRAVITY_MULTIPOLE_ORDER > %d\n" % (order - 1))
# Loop over LHS order
for i in range(order + 1):
for j in range(order + 1):
for k in range(order + 1):
if i + j + k == order:
print("l_b->F_%d%d%d += m * D_%d%d%d;" % (i, j, k, i, j, k))
if order > 0:
print("#endif")
print("") print("")
print("-------------------------------------------------") print("-------------------------------------------------")
......
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please register or to comment