Commit c2c7d5ff authored by Matthieu Schaller's avatar Matthieu Schaller
Browse files

Updates to the FMM theory blurb.

parent 75d65343
......@@ -10,24 +10,19 @@ evaluate
\mathbf{x}_b)\qquad \forall~a\in N
\label{eq:fmm:n_body}
\end{equation}
efficiently for large numbers of particles $N$. In the case of collisionless
dynamics, the particles are a mere Monte-Carlo sampling of the
underlying coarse-grained phase-space distribution which justifies the
use of approximate method to evaluate Eq.~\ref{eq:fmm:n_body}. The
\emph{Fast Multipole Method} (FMM) \citep{Greengard1987, Cheng1999},
popularized in the field and adapted specifically for gravity solvers
by \cite{Dehnen2000, Dehnen2002}, is an $\mathcal{O}(N)$ method
designed to solve Eq.~\ref{eq:fmm:n_body} by expanding the potential both
around $\mathbf{x}_i$ and $\mathbf{x}_j$ and grouping similar terms
efficiently for large numbers of particles $N$. In the case of
collisionless dynamics, the particles are a mere Monte-Carlo sampling
of the underlying coarse-grained phase-space distribution which
justifies the use of approximate method to evaluate
Eq.~\ref{eq:fmm:n_body}. The \emph{Fast Multipole Method} (FMM)
\citep{Greengard1987, Cheng1999}, popularized in the field and adapted
specifically for gravity solvers by \cite{Dehnen2000, Dehnen2002}, is
an $\mathcal{O}(N)$ method designed to solve Eq.~\ref{eq:fmm:n_body}
by expanding the potential in Taylor series \emph{both} around
$\mathbf{x}_i$ and $\mathbf{x}_j$ and grouping similar terms
arising from nearby particles. \\
In what follows, we use the compact multi-index notation of
\cite{Dehnen2014} (repeated in appendix \ref{sec:multi_index_notation}
for completeness) to simplify expressions and ease
comparisons. $\mathbf{k}$, $\mathbf{m}$ and $\mathbf{n}$ are
multi-indices and $\mathbf{r}$, $\mathbf{R}$, $\mathbf{x}$,
$\mathbf{y}$ and $\mathbf{z}$ are vectors, whilst $a$ and $b$ are
particle indices.\\
\subsubsection{Double expansion of the potential}
\begin{figure}
\includegraphics[width=\columnwidth]{cells.pdf}
......@@ -37,15 +32,22 @@ particle indices.\\
around the distance vector $\mathbf{R}$ linking the two centres of mass
($\mathbf{z}_A$ and $\mathbf{z}_B$) of cell $A$ and $B$. The
expansion converges towards the exact expression provided
$|\mathbf{R}|<|\mathbf{r}_a + \mathbf{r}_b|$.}
$|\mathbf{R}|>|\mathbf{r}_a + \mathbf{r}_b|$.}
\label{fig:fmm:cells}
\end{figure}
In what follows, we use the compact multi-index notation of
\cite{Dehnen2014} (repeated in appendix \ref{sec:multi_index_notation}
for completeness) to simplify expressions and ease
comparisons. $\mathbf{k}$, $\mathbf{m}$ and $\mathbf{n}$ are
multi-indices and $\mathbf{r}$, $\mathbf{R}$, $\mathbf{x}$,
$\mathbf{y}$ and $\mathbf{z}$ are vectors, whilst $a$ and $b$ are
particle indices. Note also that we make no assumption on the specific
functional form of the potential $\varphi$.\\
For a single pair of particles $a$ and $b$ located in cell $A$ and $B$
with centres of mass $\mathbf{z}_A$ and $\mathbf{z}_B$
respectively, as shown on Fig.~\ref{fig:fmm:cells}, the potential
generated by $b$ at the location of $a$ can be rewritten as
with centres of mass $\mathbf{z}_A$ and $\mathbf{z}_B$ respectively,
as shown on Fig.~\ref{fig:fmm:cells}, the potential generated by $b$
at the location of $a$ can be rewritten as
\begin{align}
\varphi(\mathbf{x}_a - \mathbf{x}_b)
&= \varphi\left(\mathbf{x}_a - \mathbf{z}_A - \mathbf{x}_b +
......@@ -78,7 +80,7 @@ to order $p$, we get
\label{eq:fmm:fmm_one_part}
\end{equation}
with the approximation converging as $p\rightarrow\infty$ towards the
correct value provided $|\mathbf{R}|<|\mathbf{r}_a +
correct value provided $|\mathbf{R}|>|\mathbf{r}_a +
\mathbf{r}_b|$. If we now consider all the particles within $B$ and
combine their contributions to the potential at location
$\mathbf{x}_a$ in cell $A$, we get
......@@ -95,6 +97,8 @@ This last equation forms the basis of the FMM. The algorithm
decomposes the equation into three separated sums evaluated at
different stages.\\
\subsubsection{The FMM algorithm}
In a first step, multipoles are constructed from the
innermost sum. For each cell, we compute all the terms
\begin{equation}
......@@ -141,7 +145,6 @@ for the acceleration along $x$, we have:
\frac{1}{\mathbf{n}!} \mathbf{r}_a^{\mathbf{n}}
\mathsf{F}_{\mathbf{n}+\left(1,0,0\right)}(\mathbf{z}_A). \label{eq:fmm:L2P_force}
\end{equation}
In practice, the multipoles can be constructed recursively from the
leaves of the tree to the root and the local expansions from the root
to the leaves by shifting the $\mathsf{M}$ and $\mathsf{F}$ tensors
......@@ -159,12 +162,12 @@ read:
\frac{\mathbf{y}^\mathbf{m}}{\mathbf{m}!}\mathsf{F}_{\mathbf{m} +
\mathbf{n}}(\mathbf{x}). \label{eq:fmm:L2L}
\end{align}
All the kernels (Eqs.~\ref{eq:fmm:P2M}-\ref{eq:fmm:L2L}) are rather
straightforward to evaluate as they are only made of additions and
multiplications (provided $\mathsf{D}$ can be evaluated quickly),
which are extremely efficient instructions on modern
architectures. However, the fully expanded sums can lead to rather
which are extremely efficient instructions on modern architectures
(see Appendix \ref{sec:pot_derivatives} for the full
expressions). However, the fully expanded sums can lead to rather
large and prone to typo expressions. To avoid any mishaps, we use a
\texttt{python} script to generate C code in which all the sums are
unrolled and correct by construction. In \swift, we implemented the
......@@ -180,3 +183,5 @@ $A$ rather than its geometrical centre. The first order multipoles
construction. This allows us to simplify some of the expressions and
helps reduce, albeit by a small fraction, the memory footprint of the
tree structure.
\subsubsection{The Multipole acceptance criterion}
......@@ -21,7 +21,7 @@ function and write for the potential:
\end{align}
This function alongside the trunctation function used in
\gadget\footnote{For completeness, the \gadget expression reads:\\
$\chi(r, r_s) = \mathrm{erfc}(\frac{1}{2}\frac{r}{r_s})$.} is shown
$\varphi_s(r) = \frac{1}{r} \times \mathrm{erfc}(\frac{1}{2}\frac{r}{r_s})$.} is shown
on Fig.~\ref{fig:fmm:potential_short}. This choice of $\sigma(w)$ can
seem rather cumbersome at first but writing
$\alpha(w) \equiv (1+e^w)^{-1}$, one can express all derivatives of
......
......@@ -158,8 +158,8 @@ subplot(311, xscale="log", yscale="log")
plot(r_rs, phi_newton, '--', lw=1.4, label="${\\rm Newtonian}$", color=colors[0])
plot(r_rs, phi_short_gadget2, '-', lw=1.4, label="${\\rm Gadget}$", color=colors[2])
plot(r_rs, phi_short_swift, '-', lw=1.4, label="${\\rm SWIFT}$", color=colors[3])
plot(r_rs, phi_short_swift2, '-.', lw=1.4, color=colors[3])
plot([1., 1.], [1e-5, 1e5], 'k-', alpha=0.5, lw=0.5)
plot(r_rs, phi_short_swift2, ':', lw=1.4, color=colors[3])
plot([1., 1.], [1e-5, 1e5], 'k-.', alpha=0.5, lw=0.5)
xlim(1.1*r_min/ r_s, 0.9*r_max / r_s)
ylim(1.1/r_max, 0.9/r_min)
......@@ -172,11 +172,11 @@ subplot(312, xscale="log", yscale="log")
plot(r_rs, np.ones(np.size(r)), '--', lw=1.4, color=colors[0])
plot(r_rs, 1. - corr_short_gadget2, '-', lw=1.4, color=colors[2])
plot(r_rs, 1. - corr_short_swift, '-', lw=1.4, color=colors[3])
plot(r_rs, 1. - corr_short_swift2, '-.', lw=1.4, color=colors[3])
plot(r_rs, np.ones(np.size(r))*0.01, 'k:', alpha=0.5, lw=0.5)
plot([1., 1.], [-1e5, 1e5], 'k-', alpha=0.5, lw=0.5)
plot(r_rs, 1. - corr_short_swift2, ':', lw=1.4, color=colors[3])
plot(r_rs, np.ones(np.size(r))*0.01, 'k-.', alpha=0.5, lw=0.5)
plot([1., 1.], [-1e5, 1e5], 'k-.', alpha=0.5, lw=0.5)
plot([-1, -1], [-1, -1], 'k-', lw=1.2, label="${\\textrm{Exact}~e^x}$")
plot([-1, -1], [-1, -1], 'k-.', lw=1.2, label="${6^\\textrm{th}~\\textrm{order~series}~e^x}$")
plot([-1, -1], [-1, -1], 'k:', lw=1.2, label="${6^\\textrm{th}~\\textrm{order~series}~e^x}$")
yticks([1e-2, 1e-1, 1], ["$0.01$", "$0.1$", "$1$"])
xlim(1.1*r_min/r_s, 0.9*r_max/r_s)
......@@ -184,17 +184,17 @@ ylim(3e-3, 1.5)
#ylabel("$\\chi_s(r)$", labelpad=-3)
ylabel("$\\varphi_s(r) \\times r$", labelpad=-2)
legend(loc="center left", frameon=False, handletextpad=0.3, handlelength=2.2, fontsize=7)
legend(loc="center left", frameon=False, handletextpad=0.3, handlelength=1.6, fontsize=7)
# 1 - Correction
subplot(313, xscale="log", yscale="log")
plot(r_rs, corr_short_gadget2, '-', lw=1.4, color=colors[2])
plot(r_rs, corr_short_swift, '-', lw=1.4, color=colors[3])
plot(r_rs, corr_short_swift2, '-.', lw=1.4, color=colors[3])
plot(r_rs, corr_short_swift2, ':', lw=1.4, color=colors[3])
plot([1., 1.], [1e-5, 1e5], 'k-', alpha=0.5, lw=0.5)
plot(r_rs, np.ones(np.size(r)), 'k:', alpha=0.5, lw=0.5)
plot(r_rs, np.ones(np.size(r))*0.01, 'k:', alpha=0.5, lw=0.5)
plot([1., 1.], [1e-5, 1e5], 'k-.', alpha=0.5, lw=0.5)
plot(r_rs, np.ones(np.size(r)), 'k-.', alpha=0.5, lw=0.5)
plot(r_rs, np.ones(np.size(r))*0.01, 'k-.', alpha=0.5, lw=0.5)
xlim(1.1*r_min/r_s, 0.9*r_max/r_s)
ylim(3e-3, 1.5)
......@@ -216,8 +216,8 @@ plot(r_rs, force_newton, '--', lw=1.4, label="${\\rm Newtonian}$", color=colors[
plot(r_rs, force_short_gadget2, '-', lw=1.4, label="${\\rm Gadget}$", color=colors[2])
plot(r_rs, force_short_swift, '-', lw=1.4, label="${\\rm SWIFT}$", color=colors[3])
#plot(r_rs, (1./r**2) * force_corr, '-', lw=1.2, color='r')
plot(r_rs, force_short_swift2, '-.', lw=1.4, color=colors[3])
plot([1., 1.], [1e-5, 1e5], 'k-', alpha=0.5, lw=0.5)
plot(r_rs, force_short_swift2, ':', lw=1.4, color=colors[3])
plot([1., 1.], [1e-5, 1e5], 'k-.', alpha=0.5, lw=0.5)
xlim(1.1*r_min/ r_s, 0.9*r_max / r_s)
ylim(1.1/r_max**2, 0.9/r_min**2)
......@@ -231,33 +231,31 @@ subplot(312, xscale="log", yscale="log")
plot(r_rs, np.ones(np.size(r)), '--', lw=1.4, color=colors[0])
plot(r_rs, eta_short_gadget2, '-', lw=1.4, color=colors[2])
plot(r_rs, eta_short_swift, '-', lw=1.4, color=colors[3])
plot(r_rs, eta_short_swift2, '-.', lw=1.4, color=colors[3])
plot(r_rs, np.ones(np.size(r))*0.01, 'k:', alpha=0.5, lw=0.5)
plot([1., 1.], [-1e5, 1e5], 'k-', alpha=0.5, lw=0.5)
plot(r_rs, eta_short_swift2, ':', lw=1.4, color=colors[3])
plot(r_rs, np.ones(np.size(r))*0.01, 'k-.', alpha=0.5, lw=0.5)
plot([1., 1.], [-1e5, 1e5], 'k-.', alpha=0.5, lw=0.5)
plot([-1, -1], [-1, -1], 'k-', lw=1.2, label="${\\textrm{Exact}~e^x}$")
plot([-1, -1], [-1, -1], 'k-.', lw=1.2, label="${6^\\textrm{th}~\\textrm{order~series}~e^x}$")
plot([-1, -1], [-1, -1], 'k:', lw=1.2, label="${6^\\textrm{th}~\\textrm{order~series}~e^x}$")
yticks([1e-2, 1e-1, 1], ["$0.01$", "$0.1$", "$1$"])
xlim(1.1*r_min/r_s, 0.9*r_max/r_s)
ylim(3e-3, 1.5)
#ylabel("$\\eta_s(r)$", labelpad=-3)
ylabel("$|\\mathbf{f}_s(r)|\\times r^2$", labelpad=-2)
legend(loc="center left", frameon=False, handletextpad=0.3, handlelength=2.2, fontsize=7)
legend(loc="center left", frameon=False, handletextpad=0.3, handlelength=1.6, fontsize=7)
# 1 - Correction
subplot(313, xscale="log", yscale="log")
plot(r_rs, 1. - eta_short_gadget2, '-', lw=1.4, color=colors[2])
plot(r_rs, 1. - eta_short_swift, '-', lw=1.4, color=colors[3])
plot(r_rs, 1. - eta_short_swift2, '-.', lw=1.4, color=colors[3])
plot(r_rs, 1. - eta_short_swift2, ':', lw=1.4, color=colors[3])
plot([1., 1.], [1e-5, 1e5], 'k-', alpha=0.5, lw=0.5)
plot(r_rs, np.ones(np.size(r)), 'k:', alpha=0.5, lw=0.5)
plot(r_rs, np.ones(np.size(r))*0.01, 'k:', alpha=0.5, lw=0.5)
plot([1., 1.], [1e-5, 1e5], 'k-.', alpha=0.5, lw=0.5)
plot(r_rs, np.ones(np.size(r)), 'k-.', alpha=0.5, lw=0.5)
plot(r_rs, np.ones(np.size(r))*0.01, 'k-.', alpha=0.5, lw=0.5)
xlim(1.1*r_min/r_s, 0.9*r_max/r_s)
ylim(3e-3, 1.5)
#ylabel("$1 - \\eta_s(r)$", labelpad=-2)
ylabel("$1 - |\\mathbf{f}_s(r)|\\times r^2$", labelpad=-3)
yticks([1e-2, 1e-1, 1], ["$0.01$", "$0.1$", "$1$"])
xlabel("$r / r_s$", labelpad=1)
......@@ -273,7 +271,7 @@ subplot(311, xscale="log", yscale="log")
plot(k_rs, phit_newton, '--', lw=1.4, label="${\\rm Newtonian}$", color=colors[0])
plot(k_rs, phit_long_gadget2, '-', lw=1.4, label="${\\rm Gadget}$", color=colors[2])
plot(k_rs, phit_long_swift, '-', lw=1.4, label="${\\rm SWIFT}$", color=colors[3])
plot([1., 1.], [1e-5, 1e5], 'k-', alpha=0.5, lw=0.5)
plot([1., 1.], [1e-5, 1e5], 'k-.', alpha=0.5, lw=0.5)
legend(loc="lower left", frameon=True, handletextpad=0.3, handlelength=1.6, fontsize=8)
......@@ -288,8 +286,8 @@ subplot(312, xscale="log", yscale="log")
plot(k_rs, phit_newton * k**2, '--', lw=1.4, label="${\\rm Newtonian}$", color=colors[0])
plot(k_rs, phit_long_gadget2 * k**2, '-', lw=1.4, label="${\\rm Gadget}$", color=colors[2])
plot(k_rs, phit_long_swift * k**2, '-', lw=1.4, label="${\\rm SWIFT}$", color=colors[3])
plot([1., 1.], [1e-5, 1e5], 'k-', alpha=0.5, lw=0.5)
plot(r_rs, np.ones(np.size(r))*0.01, 'k:', alpha=0.5, lw=0.5)
plot([1., 1.], [1e-5, 1e5], 'k-.', alpha=0.5, lw=0.5)
plot(r_rs, np.ones(np.size(r))*0.01, 'k-.', alpha=0.5, lw=0.5)
xlim(1.1*r_min/ r_s, 0.9*r_max / r_s)
ylim(3e-3, 1.5)
......@@ -301,8 +299,8 @@ subplot(313, xscale="log", yscale="log")
plot(k_rs, 1. - phit_long_gadget2 * k**2, '-', lw=1.4, label="${\\rm Gadget}$", color=colors[2])
plot(k_rs, 1. - phit_long_swift * k**2, '-', lw=1.4, label="${\\rm SWIFT}$", color=colors[3])
plot([1., 1.], [1e-5, 1e5], 'k-', alpha=0.5, lw=0.5)
plot(r_rs, np.ones(np.size(r)), 'k:', alpha=0.5, lw=0.5)
plot(r_rs, np.ones(np.size(r))*0.01, 'k:', alpha=0.5, lw=0.5)
plot(r_rs, np.ones(np.size(r)), 'k-.', alpha=0.5, lw=0.5)
plot(r_rs, np.ones(np.size(r))*0.01, 'k-.', alpha=0.5, lw=0.5)
xlim(1.1*r_min/ r_s, 0.9*r_max / r_s)
ylim(3e-3, 1.5)
......
......@@ -34,9 +34,9 @@ params = {'axes.labelsize': 9,
'ytick.labelsize': 8,
'text.usetex': True,
'figure.figsize' : (3.15,3.15),
'figure.subplot.left' : 0.115,
'figure.subplot.left' : 0.14,
'figure.subplot.right' : 0.99 ,
'figure.subplot.bottom' : 0.065 ,
'figure.subplot.bottom' : 0.1 ,
'figure.subplot.top' : 0.99 ,
'figure.subplot.wspace' : 0. ,
'figure.subplot.hspace' : 0. ,
......@@ -117,12 +117,12 @@ colors=['#4477AA', '#CC6677', '#DDCC77', '#117733']
subplot(311)
plot(r, W_newton, '--', lw=1.4, label="${\\rm Newtonian}$", color=colors[0])
plot(r, W_plummer, ':', lw=1.4, label="${\\rm Plummer}$", color=colors[1])
plot(r, W_gadget2, '-', lw=1.4, label="${\\rm Gadget}$", color=colors[2])
plot(r, W_gadget2, '-', lw=1.4, label="${\\rm Spline}$", color=colors[2])
plot(r, W, '-', lw=1.4, label="${\\rm SWIFT}$", color=colors[3])
plot([epsilon, epsilon], [0, 10], 'k-', alpha=0.5, lw=0.5)
plot([epsilon/plummer_equivalent_factor, epsilon/plummer_equivalent_factor], [0, 10], 'k-', alpha=0.5, lw=0.5)
legend(loc="upper right", frameon=True, handletextpad=0.1, handlelength=3.2, fontsize=8)
legend(loc="upper right", frameon=True, handletextpad=0.3, handlelength=1.6, fontsize=8, framealpha=1.)
xlim(0,r_max_plot)
xticks([0., 0.5, 1., 1.5, 2., 2.5], ["", "", "", "", "", ""])
......@@ -135,7 +135,7 @@ ylabel("$\\rho(r)$", labelpad=2)
subplot(312)
plot(r, phi_newton, '--', lw=1.4, label="${\\rm Newtonian}$", color=colors[0])
plot(r, phi_plummer, ':', lw=1.4, label="${\\rm Plummer}$", color=colors[1])
plot(r, phi_gadget2, '-', lw=1.4, label="${\\rm Gadget}$", color=colors[2])
plot(r, phi_gadget2, '-', lw=1.4, label="${\\rm Spline}$", color=colors[2])
plot(r, phi, '-', lw=1.4, label="${\\rm SWIFT}$", color=colors[3])
plot([epsilon, epsilon], [-10, 10], 'k-', alpha=0.5, lw=0.5)
plot([epsilon/plummer_equivalent_factor, epsilon/plummer_equivalent_factor], [0, 10], 'k-', alpha=0.5, lw=0.5)
......@@ -160,7 +160,7 @@ text(epsilon/plummer_equivalent_factor+0.03, 0.05, "$\\epsilon_{\\rm Plummer}$",
xlim(0,r_max_plot)
xticks([0., 0.5, 1., 1.5, 2., 2.5], ["$%.1f$"%(0./epsilon), "", "$%.1f$"%(1./epsilon), "", "$%.1f$"%(2./epsilon)])
xlabel("$r/H$", labelpad=-7)
xlabel("$r/H$", labelpad=-2.)
ylim(0, 0.95)
ylabel("$|\\overrightarrow{\\nabla}\\varphi(r)|$", labelpad=0)
......
......@@ -11,9 +11,9 @@ is cheaper to compute and has a very similar overall shape. The C2
kernel has the advantage of being branch-free leading to an expression
which is faster to evaluate using vector units available on modern
architectures; it also does not require any divisions to evaluate the
softened forces. We set $\tilde\delta(\mathbf{x}) =
\rho(|\mathbf{x}|) = W(|\mathbf{x}|, 3\epsilon_{\rm Plummer})$, with
$W(r, H)$ given by
softened forces. We set
$\tilde\delta(\mathbf{x}) = \rho(|\mathbf{x}|) = W(|\mathbf{x}|,
3\epsilon_{\rm Plummer})$, with $W(r, H)$ given by
\begin{align}
W(r,H) &= \frac{21}{2\pi H^3} \times \nonumber \\
......@@ -35,24 +35,24 @@ and $u = r/H$. The potential $\varphi(r,H)$ corresponding to this density distri
\end{align}
These choices, lead to a potential at $|\mathbf{x}| = 0$ equal to the
central potential of a Plummer sphere (i.e. $1/\epsilon_{\rm
Plummer}$)\footnote{Note the factor $3$ in the definition of
$\rho(|\mathbf{x}|)$ which differs from the factor $2.8$ used
in \textsc{Gadget} as a consequence of the change of kernel
shape.}. The softened density profile, its corresponding potential and
resulting forces are shown on Fig. \ref{fig:fmm:softening} (for
details see Sec. 2 of~\cite{Price2007}).
central potential of a Plummer sphere (i.e.
$\varphi(0) = 1/\epsilon_{\rm Plummer}$)\footnote{Note the factor $3$
in the definition of $\rho(|\mathbf{x}|)$ which differs from the
factor $2.8$ used in \textsc{Gadget} as a consequence of the change
of kernel shape.}. The softened density profile, its corresponding
potential and resulting forces are shown on
Fig. \ref{fig:fmm:softening} (for details of these are obtained see
section 2 of~\cite{Price2007}). For comparison purposes, we also
implemented the more traditional spline-kernel softening in \swift.
\begin{figure}
\includegraphics[width=\columnwidth]{potential.pdf}
\caption{The density (top), potential (middle) and forces (bottom)
generated py a point mass in our softened gravitational scheme.
A Plummer-equivalent sphere is shown for comparison. The spline
kernel of \citet{Monaghan1985}, used in \textsc{Gadget}, is shown
for comparison but note that it has not been re-scaled to match the
Plummer-sphere potential at $r=0$. %(for completeness, we chose
%$\epsilon=2$).
}
generated py a point mass in our softened gravitational scheme. A
Plummer-equivalent sphere is shown for comparison. The spline kernel
of \citet{Monaghan1985}, used for instance in \textsc{Gadget}, is
shown for comparison but note that it has not been re-scaled to
match the Plummer-sphere potential at $r=0$. }
\label{fig:fmm:softening}
\end{figure}
Supports Markdown
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment