
Removed calls to fminf and fmaxf in the critical sections of the code.

Merged Matthieu Schaller requested to merge no_call_to_fminf into master

Is there any reason not to do this? I have replaced the calls to fminf and fmaxf with ternary operators, which leads to massive speed-ups.

According to vTune, we were spending large amounts of time inside these functions.

Is there any reason to prefer the functions over the ternary operators? Do they have any special behaviour that we would need in some cases?

Merged by avatar (May 29, 2025 3:03am UTC)

  • Goes from this: Screenshot_from_2016-08-17_18-25-04

    to this: Screenshot_from_2016-08-17_18-25-14

    In both cases we have no vectorization, and I used the latest Intel compiler with the default optimizations activated. I was surprised to see this pop up as a function call. I thought compilers would optimize that out.

  • We'll contact Intel and get their opinion on this.

    In the meantime, I'll merge this, as it is a clear improvement.

  • Matthieu Schaller Status changed to merged

  • mentioned in commit 4d45b74f

  • So according to the gcc documentation:

    The ISO C99 functions _Exit, acoshf, acoshl, acosh, [...], fminf, [...], vfscanf, vscanf, vsnprintf and vsscanf are handled as built-in functions except in strict ISO C90 mode (-ansi or -std=c90).

    The default language mode is -std=gnu11, so this shouldn't be happening. @pdraper, any insight on whether this could be the cause?

    Otherwise, I'm curious to see what Intel have to say about it.

    Edited by Pedro Gonnet
  • For reference, both cases shown above are compiled as follows on cosma-5:

    module load swift swift/c5/intel/intelmpi/5.1.2
    ./configure --disable-vec --enable-debug