
proxy tags exchange

When running a zoom-like setup, i.e. a top-level cell layout with very different domain sizes and particle counts per cell, I was getting MPI hangs. I traced them back to proxy_tags_exchange, which loops over each proxy and each of its cells, with every cell creating its own MPI send/recv call (for this setup, thousands per proxy).

It looked like either there were too many calls going into the queue and it was getting confused or, as I suspect, the MPI tag (in this case the cid) was not remaining unique across all these calls.
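As an illustration only, and not a diagnosis of what the code actually does, here is a minimal sketch of one way per-cell tags could stop being unique. It assumes a hypothetical scheme in which a cell id is folded into the valid MPI tag range with a modulo. The one certain fact used is that the MPI standard only guarantees a tag upper bound of at least 32767 (the real limit is queried through the `MPI_TAG_UB` attribute). The names `toy_tag_from_cid` and `toy_count_tag_collisions` are made up for this example.

```c
/* Smallest tag upper bound an MPI implementation is allowed to provide.
   The actual limit is implementation-specific (MPI_TAG_UB attribute). */
#define TOY_MIN_TAG_UB 32767

/* HYPOTHETICAL tag derivation: fold a non-negative cell id into the
   range [0, tag_ub]. Not taken from the real code base. */
int toy_tag_from_cid(long long cid, int tag_ub) {
  return (int)(cid % ((long long)tag_ub + 1));
}

/* Count how many cells end up with a tag that an earlier cell in the
   list already uses. A non-zero result means the tags are not unique. */
int toy_count_tag_collisions(const long long *cids, int n, int tag_ub) {
  int collisions = 0;
  for (int i = 1; i < n; i++) {
    const int tag_i = toy_tag_from_cid(cids[i], tag_ub);
    for (int j = 0; j < i; j++) {
      if (toy_tag_from_cid(cids[j], tag_ub) == tag_i) {
        collisions++;
        break; /* Count each colliding cell once. */
      }
    }
  }
  return collisions;
}
```

Under this assumed scheme, any two cell ids that differ by a multiple of `tag_ub + 1` share a tag, so messages between the same pair of ranks could be matched to the wrong receive. Whether anything like this happens here is unverified.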

I'm not sure why this wasn't caught in any of the runs so far. I suspect this zoom setup creates an unusually large number of proxies and comm calls.

I've changed the proxy_tags_exchange function to communicate the proxy cells in one go: each proxy packs its cells and sends them off, rather than communicating cell by cell. This drastically reduces the number of comms, but naturally makes the code more complicated.
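The packing idea can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in this merge request: `toy_cell`, `toy_proxy` and all `toy_*` functions are hypothetical stand-ins, and the MPI calls are left out so that only the packing and the resulting message counts are shown. In a real exchange, each packed buffer would be handed to a single non-blocking send, with a matching single receive on the other rank.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a cell: only the value being exchanged. */
struct toy_cell {
  int tag;
};

/* Hypothetical stand-in for a proxy: the cells shared with one other rank. */
struct toy_proxy {
  int nr_cells;
  struct toy_cell *cells;
};

/* Pack the tags of all cells of a proxy into one contiguous buffer.
   Returns a malloc'ed array of p->nr_cells ints (caller frees),
   or NULL if the proxy is empty or the allocation fails. */
int *toy_proxy_pack_tags(const struct toy_proxy *p) {
  if (p->nr_cells <= 0) return NULL;
  int *buf = malloc(sizeof(int) * (size_t)p->nr_cells);
  if (buf == NULL) return NULL;
  for (int k = 0; k < p->nr_cells; k++) buf[k] = p->cells[k].tag;
  return buf;
}

/* Unpack a received buffer into the proxy's cells. Both sides must
   hold the cells in the same order for this to be meaningful. */
void toy_proxy_unpack_tags(struct toy_proxy *p, const int *buf) {
  assert(buf != NULL || p->nr_cells == 0);
  for (int k = 0; k < p->nr_cells; k++) p->cells[k].tag = buf[k];
}

/* Messages needed when every cell is sent on its own. */
long toy_count_messages_per_cell(const struct toy_proxy *proxies, int n) {
  long count = 0;
  for (int i = 0; i < n; i++) count += proxies[i].nr_cells;
  return count;
}

/* Messages needed when each non-empty proxy sends one packed buffer. */
long toy_count_messages_packed(const struct toy_proxy *proxies, int n) {
  long count = 0;
  for (int i = 0; i < n; i++)
    if (proxies[i].nr_cells > 0) count++;
  return count;
}
```

With one message per proxy, the message count scales with the number of neighbouring ranks rather than the number of cells, and only one message tag per proxy pair is needed. The price is the extra pack/unpack step and the requirement that sender and receiver agree on the cell ordering.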

I've tested it on my setup (DMO), and it gets around the hang issues I was having.
