Vectorise kernel
Should fix issue #154 (closed). Includes updates to testKernel.c that test kernel_deval_vec against the serial kernel_deval.
I have tested it with the Cubic Spline, Wendland C2 and Wendland C6 kernels.
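As a rough illustration of what a vector-versus-serial check looks like, the sketch below evaluates a polynomial and its derivative with Horner's scheme, once per scalar and once lane by lane, and reports the largest difference. It is not the code in testKernel.c: the names `demo_eval`, `demo_eval_lanes`, `demo_max_diff`, the width `DEMO_LANES` and the coefficients are all hypothetical, and plain arrays stand in for real SIMD registers.

```c
#define DEMO_DEGREE 3 /* hypothetical polynomial degree */
#define DEMO_LANES 4  /* hypothetical vector width */

/* Illustrative coefficients, highest power first: W(x) = 2x^3 - 3x^2 + 1.
 * These are NOT the coefficients of any real SPH kernel. */
static const float demo_coeffs[DEMO_DEGREE + 1] = {2.0f, -3.0f, 0.0f, 1.0f};

/* Scalar reference: Horner's scheme for the polynomial and its derivative. */
static void demo_eval(float x, float *w, float *dw_dx) {
  float wv = demo_coeffs[0] * x + demo_coeffs[1];
  float dv = demo_coeffs[0];
  for (int k = 2; k <= DEMO_DEGREE; k++) {
    dv = dv * x + wv; /* derivative uses the previous value of wv */
    wv = wv * x + demo_coeffs[k];
  }
  *w = wv;
  *dw_dx = dv;
}

/* Lane-wise version: the same recurrence applied to DEMO_LANES inputs. */
static void demo_eval_lanes(const float x[DEMO_LANES], float w[DEMO_LANES],
                            float dw_dx[DEMO_LANES]) {
  for (int l = 0; l < DEMO_LANES; l++) {
    w[l] = demo_coeffs[0] * x[l] + demo_coeffs[1];
    dw_dx[l] = demo_coeffs[0];
  }
  for (int k = 2; k <= DEMO_DEGREE; k++) {
    for (int l = 0; l < DEMO_LANES; l++) {
      dw_dx[l] = dw_dx[l] * x[l] + w[l];
      w[l] = w[l] * x[l] + demo_coeffs[k];
    }
  }
}

/* Largest absolute difference between the two versions over all lanes. */
static float demo_max_diff(const float x[DEMO_LANES]) {
  float w[DEMO_LANES], dw[DEMO_LANES], max = 0.0f;
  demo_eval_lanes(x, w, dw);
  for (int l = 0; l < DEMO_LANES; l++) {
    float ws, dws;
    demo_eval(x[l], &ws, &dws);
    float e1 = w[l] > ws ? w[l] - ws : ws - w[l];
    float e2 = dw[l] > dws ? dw[l] - dws : dws - dw[l];
    if (e1 > max) max = e1;
    if (e2 > max) max = e2;
  }
  return max;
}
```

A test built this way would typically assert that `demo_max_diff` stays below a small tolerance rather than demanding bitwise equality, since a compiler may fuse or reorder floating-point operations differently in the two versions.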
Activity
 * Return 0 if $u > \\gamma = H/h$
 *
 * @param u The ratio of the distance to the smoothing length $u = x/h$.
 * @param W (return) The value of the kernel function $W(x,h)$.
 * @param dW_dx (return) The norm of the gradient of $|\\nabla W(x,h)|$.
 */

__attribute__((always_inline))
INLINE static void kernel_deval_vec(vector *u, vector *w, vector *dw_dx) {

  vector ind, c[kernel_degree + 1], x;
  int j, k;

  /* Go to the range [0,1[ from [0,H[ */
  x.v = u->v * vec_set1((float)kernel_igamma);

  [...]
      c[j].f[k] = kernel_coeffs[ind.i[k] * (kernel_degree + 1) + j];

  /* Init the iteration for Horner's scheme. */
  w->v = (c[0].v * x.v) + c[1].v;
  dw_dx->v = c[0].v;

  /* And we're off! */
  for (int k = 2; k <= kernel_degree; k++) {
    dw_dx->v = (dw_dx->v * x.v) + w->v;
    w->v = (x.v * w->v) + c[k].v;
  }

  /* Return everything */
  w->v = w->v * vec_set1((float)kernel_constant) * vec_set1((float)kernel_igamma3);
  dw_dx->v = dw_dx->v * vec_set1((float)kernel_constant) * vec_set1((float)kernel_igamma4);

Added 1 commit:
- 5597fb55 - Improved output
I have taken the liberty of improving the output a bit (the scalar and vector versions now produce the same number of points) to ease the comparison.
I had almost written you an email complaining that the numbers didn't match before noticing that the two cases did not output the same number of lines.
Added 1 commit:
- d6739db2 - Defined vector kernel constants prior to function call, so that they are set only once.
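The idea behind this commit can be sketched as follows: broadcast each scalar constant into a vector once, outside the routine that uses it, instead of rebuilding it on every call. This is only a minimal illustration under assumptions; `demo_vec`, `demo_set1`, `demo_scale` and `demo_scale_all` are hypothetical names, and a plain array stands in for a SIMD register.

```c
#define DEMO_LANES 4 /* hypothetical vector width */

typedef struct {
  float f[DEMO_LANES];
} demo_vec;

/* Broadcast one scalar into every lane. */
static demo_vec demo_set1(float a) {
  demo_vec v;
  for (int l = 0; l < DEMO_LANES; l++) v.f[l] = a;
  return v;
}

/* The routine receives the already-broadcast constant instead of
 * calling demo_set1() itself on every invocation. */
static void demo_scale(const demo_vec *scale, demo_vec *w) {
  for (int l = 0; l < DEMO_LANES; l++) w->f[l] *= scale->f[l];
}

/* Caller: the constant is set up once and reused for every batch. */
static void demo_scale_all(demo_vec *batches, int n, float constant) {
  const demo_vec scale = demo_set1(constant); /* set only once */
  for (int i = 0; i < n; i++) demo_scale(&scale, &batches[i]);
}
```

Whether this saves anything in practice depends on the compiler, which may already hoist a loop-invariant broadcast on its own; making the hoisting explicit simply removes the dependence on that optimisation.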
Added 1 commit:
- bd3e3f2e - Code formatting
Added 1 commit:
- 660bef5f - Added macro FILL_VEC to setup constants as vectors, depending on the vector size.
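A macro of this kind might look roughly like the sketch below, which selects an initialiser with the right number of repeated values at compile time. The names `DEMO_VEC_SIZE`, `demo_vector` and `DEMO_FILL_VEC` are hypothetical stand-ins, and the actual FILL_VEC definition in the commit may differ.

```c
/* Hypothetical vector width; a real build would derive this from the
 * target instruction set. */
#ifndef DEMO_VEC_SIZE
#define DEMO_VEC_SIZE 8
#endif

/* Stand-in for a SIMD vector type with float and int views of the lanes. */
typedef union {
  float f[DEMO_VEC_SIZE];
  int i[DEMO_VEC_SIZE];
} demo_vector;

/* Expand to a brace initialiser that repeats the value once per lane. */
#if DEMO_VEC_SIZE == 4
#define DEMO_FILL_VEC(a) { .f = {a, a, a, a} }
#elif DEMO_VEC_SIZE == 8
#define DEMO_FILL_VEC(a) { .f = {a, a, a, a, a, a, a, a} }
#else
#error "DEMO_FILL_VEC: unsupported DEMO_VEC_SIZE"
#endif

/* Because the macro expands to a constant initialiser, the vector can be
 * defined at file scope and is set up only once. */
static const demo_vector demo_two = DEMO_FILL_VEC(2.0f);

/* Sum of all lanes, used here only to check the fill. */
static float demo_lane_sum(const demo_vector *v) {
  float s = 0.0f;
  for (int l = 0; l < DEMO_VEC_SIZE; l++) s += v->f[l];
  return s;
}
```

Using a brace initialiser rather than a function call is what allows the constant to live at file scope in C, where initialisers must be constant expressions.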
mentioned in commit efdf7330