Skip to content
Snippets Groups Projects

Make some threadpool mapper functions visible as global symbols.

Merged Peter W. Draper requested to merge mpi-threadpool-debugging into master

When debugging the threadpool all mapper functions need to be visible as global symbols (so they can be found by dladdr()), so remove static and update the function names to their local namespaces as needed.

These are all associated with MPI.

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • Peter W. Draper requested review from @matthieu

    requested review from @matthieu

  • added 1 commit

    Compare with previous version

  • @dc-rope1 I think these are the functions being called '(null)' in the threadpool debugging logs.

  • Matthieu Schaller approved this merge request

    approved this merge request

  • @dc-rope1 could you merge this into your branch and confirm it solved the issue?

  • I have tested this and now (some of) the redistribute mappers appear in the threadpools:

    wedge_threadpool_32_rank0

    But there are still (null) threadpools. I had wondered if these were accumulate_sizes calls since they were also static. However, removing static does not cause them to appear nor (null) to disappear. I suspect this is because of the way these are implemented (?):

    /* Generic function for accumulating sized counts for TYPE parts. Note uses
     * local memory to reduce contention, the amount of memory required is
     * precalculated by an additional loop determining the range of cell IDs. */
    #define ACCUMULATE_SIZES_MAPPER(TYPE)                                          \
      accumulate_sizes_mapper_##TYPE(void *map_data, int num_elements,             \
                                     void *extra_data) {                           \
        struct TYPE *parts = (struct TYPE *)map_data;                              \
        struct counts_mapper_data *mydata =                                        \
            (struct counts_mapper_data *)extra_data;                               \
        double size = mydata->size;                                                \
        struct space *s = mydata->s;                                               \
        double dim[3] = {mydata->s->dim[0], mydata->s->dim[1], mydata->s->dim[2]}; \
        double *lcounts = NULL;                                                    \
        int lcid = mydata->s->nr_cells;                                            \
        int ucid = 0;                                                              \
        for (int k = 0; k < num_elements; k++) {                                   \
          for (int j = 0; j < 3; j++) {                                            \
            if (parts[k].x[j] < 0.0)                                               \
              parts[k].x[j] += dim[j];                                             \
            else if (parts[k].x[j] >= dim[j])                                      \
              parts[k].x[j] -= dim[j];                                             \
          }                                                                        \
          const int cid =                                                          \
              cell_getid_pos(s, parts[k].x[0], parts[k].x[1], parts[k].x[2]);      \
          if (cid > ucid) ucid = cid;                                              \
          if (cid < lcid) lcid = cid;                                              \
        }                                                                          \
        int nused = ucid - lcid + 1;                                               \
        if ((lcounts = (double *)calloc(sizeof(double), nused)) == NULL)           \
          error("Failed to allocate counts thread-specific buffer");               \
        for (int k = 0; k < num_elements; k++) {                                   \
          const int cid =                                                          \
              cell_getid_pos(s, parts[k].x[0], parts[k].x[1], parts[k].x[2]);      \
          lcounts[cid - lcid] += size;                                             \
        }                                                                          \
        for (int k = 0; k < nused; k++)                                            \
          atomic_add_d(&mydata->counts[k + lcid], lcounts[k]);                     \
        free(lcounts);                                                             \
      }
    
    /**
     * @brief Accumulate the sized counts of particles per cell.
     * Threadpool helper for accumulating the counts of particles per cell.
     *
     * part version.
     */
    void ACCUMULATE_SIZES_MAPPER(part);
    
    /**
     * @brief Accumulate the sized counts of particles per cell.
     * Threadpool helper for accumulating the counts of particles per cell.
     *
     * gpart version.
     */
    void ACCUMULATE_SIZES_MAPPER(gpart);
    
    /**
     * @brief Accumulate the sized counts of particles per cell.
     * Threadpool helper for accumulating the counts of particles per cell.
     *
     * spart version.
     */
    void ACCUMULATE_SIZES_MAPPER(spart);
    
    /* qsort support. */
    static int ptrcmp(const void *p1, const void *p2) {
      const double *v1 = *(const double **)p1;
      const double *v2 = *(const double **)p2;
      return (*v1) - (*v2);
    }

    If it is due to this implementation then I'm not sure how to get around this beyond hard coding each particle type variant. Note that this is probably also true for engine_redistribute_savelink_mapper which is implemented in the same way.

  • Strange not what I see:

    4274 partition_accumulate_sizes_mapper_gpart
    2199 partition_accumulate_sizes_mapper_part

    just grep-ing for that name and counting the occurrences in my logs. But I see your snapshot above uses the old names for these functions, not the partition_ prefixed new ones.

  • BTW, since these are macros they will be expanded before the actual compiler sees the code, so defining this way will not make any difference to things like use of static.

  • Is the branch you tested this on quite outdated and has more mapper functions that are static?

  • It's not been brought up to date with master since roughly Christmas. I'll rectify that when I work out why it grinds to a halt after a merge. However, having cherry-picked the necessary changes (thanks @pdraper) for this MR I do now get all the threadpools with no (null) entries.

  • Are there then any static mappers left in your branch then?

  • mentioned in commit b7fcac46

Please register or sign in to reply
Loading