Overestimation of the number of links

I am running an EAGLE-200 box with both hydro and gravity. It crashed because it ran out of memory, so I have restarted it with more information. One of the culprits is the number of foreign part/gpart we allocate (see #456 (closed)), but the more interesting one is the number of link "objects".

On this run, we have e->size_links == 4'568'936'336 (!!!) but e->nr_links == 1'145'103.

This means that the estimate of the number of links is too large by a factor of roughly 4000. It also means that for this run we are allocating more than 64GB per MPI rank for the links!
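As a rough check of these two figures, here is a minimal sketch of the arithmetic. The function names and `ASSUMED_BYTES_PER_LINK` are hypothetical, and the 16 bytes per link (two 8-byte pointers on a 64-bit platform) is an assumption, not a number taken from the run.

```c
#include <stdint.h>

/* Hypothetical size of one link object: two 8-byte pointers on a 64-bit
 * platform. This is an assumption, not a value reported by the run. */
#define ASSUMED_BYTES_PER_LINK 16ULL

/* Ratio of allocated link slots (size_links) to links in use (nr_links). */
double link_overestimation_factor(uint64_t size_links, uint64_t nr_links) {
  return (double)size_links / (double)nr_links;
}

/* Memory reserved for the link array, in bytes. */
uint64_t link_array_bytes(uint64_t size_links, uint64_t bytes_per_link) {
  return size_links * bytes_per_link;
}
```

Called with size_links = 4568936336 and nr_links = 1145103, the factor comes out just under 3990 and the array size is 73102981376 bytes (about 68 GiB), which matches the "more than 64GB" figure if the 16-byte assumption holds.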

The obvious fix is to come up with a better estimate of the number of links we need.

Interesting numbers for one rank are:

  • Number of top-level cells: 262'144 (64^3)
  • Number of cells: 24'644'944
  • Number of tasks: 1'743'339

@nnrw56 @pdraper you might be interested in this.