Commit 0ee9968a authored by Jonathan Frawley

More results

parent a23ffff1
 #!/bin/bash
 #SBATCH --job-name="swiftaps"
-#SBATCH --ntasks=1
+#SBATCH --ntasks=2
 #SBATCH --ntasks-per-node=1
 #SBATCH --output=swiftaps.out
 #SBATCH --error=swiftaps.err
 ......
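The job name suggests this first script runs the benchmark under Intel Application Performance Snapshot (APS). The script body is truncated above; a minimal sketch of how the launch step might look, assuming the aps tool is available on the cluster and using an illustrative binary path that is not taken from the commit:

# Hypothetical completion of the truncated swiftaps script body:
module load intel_comp          # assumed module name; adjust for the local COSMA setup
aps ./swiftsim/examples/swift --cosmology --self-gravity --threads=64 p-mill-768.yml
aps --report=aps_result_*       # summarise the directory produced by the run; exact flag depends on the APS version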
 #!/bin/bash
 #SBATCH --job-name="swiftarm"
-#SBATCH --ntasks=1
+#SBATCH --ntasks=2
 #SBATCH --ntasks-per-node=1
 #SBATCH --output=swiftarm.out
 #SBATCH --error=swiftarm.err
 ......
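The second script presumably produces a summary like the one below via ARM (Allinea) Performance Reports, whose perf-report command wraps the normal launch line. A sketch of that wrapped launch, with an assumed module name and an abbreviated initial-conditions path, neither of which is taken from the commit:

# Hypothetical completion of the truncated swiftarm script body:
module load allinea               # assumed module name for the ARM tools on COSMA
perf-report mpirun -np 1 \
    ./swiftsim/examples/swift_mpi --cosmology --self-gravity -v 1 --threads=64 -n 1 \
    -P Restarts:enable:0 -P InitialConditions:file_name:/path/to/PMill-768.hdf5 p-mill-768.yml
# perf-report writes .txt and .html summaries like the report shown below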
Command: /cosma/home/ds007/dc-fraw1/performance_analysis_workshop/swift-cs-performance-workshop-2021/benchmark-slow/swiftsim/examples/swift_mpi --cosmology --self-gravity -v 1 --threads=64 -n 1 -P Restarts:enable:0 -PInitialConditions:file_name:/cosma5/data/do008/dc-fraw1/swift_initial_conditions/pmillenium/PMill-768.hdf5 p-mill-768.yml
Resources: 1 node (32 physical, 64 logical cores per node)
Memory: 503 GiB per node
Tasks: 1 process
Machine: b108.pri.cosma7.alces.network
Start time: Thu Jan 21 15:56:08 2021
Total time: 1714 seconds (about 29 minutes)
Full path: /cosma/home/ds007/dc-fraw1/performance_analysis_workshop/swift-cs-performance-workshop-2021/benchmark-slow/swiftsim/examples
Summary: swift_mpi is Compute-bound in this configuration
Compute: 94.7% |========|
MPI: 0.1% ||
I/O: 5.2% ||
This application run was Compute-bound. A breakdown of this time and advice for investigating further is in the CPU section below.
As very little time is spent in MPI calls, this code may also benefit from running at larger scales.
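Running at a larger scale only requires changing the resource requests in the batch scripts above; an illustrative sketch in which the counts are examples rather than values from the commit:

#SBATCH --ntasks=8              # e.g. 8 MPI ranks instead of 1
#SBATCH --ntasks-per-node=1     # one rank per node, SWIFT threads fill the cores
#SBATCH --cpus-per-task=64      # matches the --threads=64 passed to swift_mpi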
CPU:
A breakdown of the 94.7% CPU time:
Scalar numeric ops: 12.5% ||
Vector numeric ops: 32.9% |==|
Memory accesses: 54.6% |====|
The per-core performance is memory-bound. Use a profiler to identify time-consuming loops and check their cache performance.
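One concrete way to act on this advice is Linux perf on a compute node; a minimal sketch, assuming perf is installed and reusing the run line from the report (the event names are generic aliases and vary by CPU):

# check cache behaviour of a short test run
perf stat -e cycles,instructions,cache-references,cache-misses \
    ./swift_mpi --cosmology --self-gravity --threads=64 p-mill-768.yml
# then find the time-consuming loops
perf record -g ./swift_mpi --cosmology --self-gravity --threads=64 p-mill-768.yml
perf report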
MPI:
A breakdown of the 0.1% MPI time:
Time in collective calls: 100.0% |=========|
Time in point-to-point calls: 0.0% |
Effective process collective rate: 750 MB/s
Effective process point-to-point rate: 0.00 bytes/s
I/O:
A breakdown of the 5.2% I/O time:
Time in reads: 100.0% |=========|
Time in writes: 0.0% |
Effective process read rate: 99.5 MB/s
Effective process write rate: 0.00 bytes/s
Most of the time is spent in read operations with a low effective transfer rate. This may be caused by contention for the filesystem or inefficient access patterns. Use an I/O profiler to investigate which read calls are affected.
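Short of a full I/O profiler such as Darshan, a quick check of where the read time goes is strace's syscall summary; a sketch for a short test run, not taken from the workshop material:

# count and time file-related system calls, following all threads
# (the %file class needs a reasonably recent strace)
strace -f -c -e trace=%file,read,pread64 \
    ./swift_mpi --cosmology --self-gravity --threads=64 p-mill-768.yml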
Threads:
A breakdown of how multiple threads were used:
Computation: 96.4% |=========|
Synchronization: 3.6% ||
Physical core utilization: 165.5% |================|
System load: 161.8% |===============|
The system load is high. Check that other jobs or system processes are not running on the same nodes.
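Two simple checks for interference from other jobs, assuming standard Slurm tooling on COSMA:

# in the batch script: request whole nodes so no other job shares them
#SBATCH --exclusive
# from a login node while the job runs: see which jobs Slurm has placed on that node
squeue --nodelist=b108.pri.cosma7.alces.network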
Memory:
Per-process memory usage may also affect scaling:
Mean process memory usage: 59.1 GiB
Peak process memory usage: 74.7 GiB
Peak node memory usage: 16.0% |=|
The peak node memory usage is very low. Larger problem sets can be run before scaling to multiple nodes.
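For context, the 74.7 GiB peak process footprint is about 15% of the 503 GiB available on the node (74.7 / 503 ≈ 0.149), in line with the reported 16% peak node usage once node-level overhead is counted, so the problem size could grow several-fold before a single node runs out of memory.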
Energy:
A breakdown of how energy was used:
CPU: not supported
System: not supported
Mean node power: not supported
Peak node power: 0.00 W
Energy metrics are not available on this system.
CPU metrics are not supported (no intel_rapl module)