Periodic gravity speed and accuracy improvements
Merge request reports
Activity
36 36 return 1.f + x * (1.f + x * (0.5f + x * (1.f / 6.f + 1.f / 24.f * x))); 37 37 } 38 38 39 /** 40 * @brief Approximate version of expf(x) using a 6th order Taylor expansion 41 * 42 */ 43 __attribute__((always_inline)) INLINE static float good_approx_expf(float x) { 36 36 return 1.f + x * (1.f + x * (0.5f + x * (1.f / 6.f + 1.f / 24.f * x))); 37 37 } 38 38 39 /** 40 * @brief Approximate version of expf(x) using a 6th order Taylor expansion 41 * 42 */ 43 __attribute__((always_inline)) INLINE static float good_approx_expf(float x) { Actually, I need the best possible approximation to e^x / ( 1+e^x ) for x in [0, 6]. It can just be 1 for any large x and I don't care about negative values. Might be that there is a Pade approximant of order less than 6 that does better than my current use of the approximate exp(). Note though that I can re-use the same exp(x) in both the numerator and denominator.
I need to work out the approximation quality I need. Essentially if I am close enough to the true value then I can use the analytic expression for the corresponding function in Fourier space. This is already much much faster than before.
I need to work out what "close enough" means though in terms of accuracy of the whole method.
Edited by Matthieu Schaller
36 36 return 1.f + x * (1.f + x * (0.5f + x * (1.f / 6.f + 1.f / 24.f * x))); 37 37 } 38 38 39 /** 40 * @brief Approximate version of expf(x) using a 6th order Taylor expansion 41 * 42 */ 43 __attribute__((always_inline)) INLINE static float good_approx_expf(float x) { One more thing: With this choice of function and the right coefficients, you can show that the correction factor is close to 1 (within less than 0.001) for all distances that are less than 0.1 of the mesh cell size. That means that for all self/pairs that take place at 4 levels or more below the top-level of the tree (where the FFT is performed), there is no need to compute any of this. With the erf choice, this is only true at 0.005 of the top-level mesh size, which is rarely reached.
This feature comes from the sharper transition of this function compared to the erf and allows to avoid computing any of these terms for most of the interesting situations.
I still need to write the bit of code that exploits this feature but it will make most of the discussion less relevant.
36 36 return 1.f + x * (1.f + x * (0.5f + x * (1.f / 6.f + 1.f / 24.f * x))); 37 37 } 38 38 39 /** 40 * @brief Approximate version of expf(x) using a 6th order Taylor expansion 41 * 42 */ 43 __attribute__((always_inline)) INLINE static float good_approx_expf(float x) { 36 36 return 1.f + x * (1.f + x * (0.5f + x * (1.f / 6.f + 1.f / 24.f * x))); 37 37 } 38 38 39 /** 40 * @brief Approximate version of expf(x) using a 6th order Taylor expansion 41 * 42 */ 43 __attribute__((always_inline)) INLINE static float good_approx_expf(float x) { Alright, then lets spice it up a bit.
We also need to evaluate the first few derivatives of that thing for the multipole-multipole part of the calculation. The formulation I currently use is good as it does not require me to recompute more things. I can just build them using powers of 1 / ( 1 + e^x ). I think that this is a nice property to retain.
Added 66 commits:
-
886706a1...59abd8af - 65 commits from branch
master
- 60d8ad06 - Merge branch 'master' into periodic_gravity
-
886706a1...59abd8af - 65 commits from branch
Added 1 commit:
- 9571b2a3 - Code formatting
mentioned in commit eef5614e