Make mpicollectgroup1_reduce() loop over all the elements as described by the MPI standard.
I was looking at the implementation of a reduce operation over MPI for the multipoles and I stumbled upon this.
If my reading of section 5.9.5 of the MPI standard is correct, then I think we need to loop over the elements of both vectors passed into this function (in and inout), not just the in one. Since we are doing a reduce of len = 1 here, it likely makes no difference, but I'd rather be safe and use the recommended signature. What do you think?
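
For illustration, here is a minimal sketch of the element-wise loop I have in mind. It is not the actual patch: the struct `group1_sketch`, its fields, and the names `reduce_one()` and `reduce_sketch()` are hypothetical stand-ins. The last argument is declared as `void *` only so the sketch compiles without `mpi.h`; in the real callback it would be `MPI_Datatype *`.

```c
#include <stddef.h>

/* Hypothetical payload, standing in for the struct that is reduced. */
struct group1_sketch {
  long long updated; /* summed across ranks */
  double max_value;  /* maximum across ranks */
};

/* Combine a single pair of elements: inout = in (op) inout. */
static void reduce_one(const struct group1_sketch *in,
                       struct group1_sketch *inout) {
  inout->updated += in->updated;
  if (in->max_value > inout->max_value) inout->max_value = in->max_value;
}

/* Same shape as an MPI user-defined reduction function. The datatype
 * argument is void * here (assumption, to avoid needing mpi.h). */
void reduce_sketch(void *in, void *inout, int *len, void *datatype) {
  (void)datatype; /* unused in this sketch */

  const struct group1_sketch *invec = (const struct group1_sketch *)in;
  struct group1_sketch *inoutvec = (struct group1_sketch *)inout;

  /* Apply the operation to every element of both vectors, not just
   * the first one. */
  for (int i = 0; i < *len; i++) reduce_one(&invec[i], &inoutvec[i]);
}
```

With len = 1 this does the same work as handling only the first element, so behaviour should be unchanged for the current call; the loop only matters if the operation is ever used with a larger count.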