Reduction in data transfer tasks for the GPU
Currently we move all the particle data to/from the GPU every step, even if it's unused. I started discussing this in !355 (closed) but I need to finish it at some point. I hope it will help significantly for small steps.
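To make the idea concrete, here is a rough sketch of how the per-step transfer set could be worked out. It is not the existing implementation: the field names, byte sizes, `StepPlan`, and the helper functions are all hypothetical and only illustrate copying the fields a step actually touches instead of everything.

```python
from dataclasses import dataclass, field

# Hypothetical particle fields and per-particle sizes in bytes.
# These are illustrative assumptions, not taken from the codebase.
FIELD_BYTES = {"pos": 24, "vel": 24, "mass": 8, "density": 8}


@dataclass
class StepPlan:
    """Which fields the GPU work in one step reads and writes (assumed known)."""
    reads: set = field(default_factory=set)
    writes: set = field(default_factory=set)


def fields_to_upload(plan, stale_on_gpu):
    # Upload only fields the kernels read AND whose GPU copy is out of date.
    return plan.reads & stale_on_gpu


def fields_to_download(plan, needed_on_host):
    # Download only fields the kernels modified AND the host needs afterwards.
    return plan.writes & needed_on_host


def transfer_bytes(fields, n_particles):
    # Total bytes moved for the given set of fields.
    return sum(FIELD_BYTES[f] for f in fields) * n_particles


# Example: a small step that reads positions and masses, writes densities.
plan = StepPlan(reads={"pos", "mass"}, writes={"density"})
upload = fields_to_upload(plan, stale_on_gpu={"pos", "vel"})
download = fields_to_download(plan, needed_on_host={"density", "vel"})

selective = transfer_bytes(upload, 1000) + transfer_bytes(download, 1000)
everything = 2 * transfer_bytes(set(FIELD_BYTES), 1000)  # all fields, both ways
```

In this toy case only `pos` goes up and only `density` comes back, so the step moves a fraction of what a full copy in both directions would. The real saving depends on which fields each task actually needs.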