# GPU Profiling
What is GPU profiling?
## Getting ```nvvp```
Getting nvvp, installing it, and finding it
## Compiling
Using the stuff
## ```nvprof```
Using nvprof on the cluster (options!)
## Using ```nvvp```
<VIDEO>
### GPU Profiling on CRAY
You can profile tests on the Cray after loading the ```craype-accel-nvidia35``` module. Then run ```nvprof ./binary``` to profile your binary. To get this to work with ```test_27_cells```, you will need to remove the memory clear at the end of the test, as there are some as-yet undiagnosed problems with it.
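As a rough sketch of the steps above, the following wraps the module load and the ```nvprof``` call in a small shell function. The function name ```profile_binary``` and the ```NVPROF``` override variable are made up for illustration; only the module name and the ```nvprof ./binary``` invocation come from the text above.

```shell
# Sketch only: profile_binary and NVPROF are illustrative names, not part of any tool.
profile_binary() {
    # $1: path to the executable to profile, e.g. ./binary
    binary="$1"

    # `module` only exists on systems with environment modules (such as the Cray).
    if command -v module >/dev/null 2>&1; then
        module load craype-accel-nvidia35
    fi

    # NVPROF can be overridden (e.g. NVPROF=echo) to check the command without running it.
    ${NVPROF:-nvprof} "$binary"
}
```

For example, ```profile_binary ./test_27_cells``` would load the module if it is available and then run the test under ```nvprof```.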
To analyse the data on your local machine, you can either use [this NVIDIA tool](https://github.com/NVIDIA/cuda-profiler/tree/master/one_hop_profiling) to profile in real time or (probably preferably) generate profile files on Piz Daint with ```nvprof``` and copy them to your local machine.
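To do that, run your code under ```nvprof``` twice, once to record the timeline and once to collect the metrics. Note that ```-o``` is the short form of ```--export-profile```, so each run writes a single output file. Replace ```./binary``` with your own executable:

```
nvprof --export-profile timeline.prof ./binary
nvprof --metrics achieved_occupancy,executed_ipc -o metrics.prof ./binary
```
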
This generates two files, ```metrics.prof``` and ```timeline.prof```. Copy these to your local machine and launch ```nvvp``` (which can be downloaded [here](https://developer.nvidia.com/cuda-downloads), around 1.4GB). More information on running the profiler can be found in [these manual pages](http://docs.nvidia.com/cuda/profiler-users-guide/index.html#collecting-remote-data), but to get started, use File -> Import, choose nvprof, and select your two files. Then import them and get to work!
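
The copy step could look something like the sketch below. The function name ```fetch_profiles```, the ```COPY``` override, and the example host and directory are all placeholders; substitute your own login and the directory where ```nvprof``` wrote its output.

```shell
# Sketch only: fetch_profiles and COPY are illustrative names; host and path are placeholders.
fetch_profiles() {
    # $1: remote login (user@host), $2: remote directory containing the .prof files
    remote="$1"
    dir="$2"

    # Copy both profile files into the current local directory.
    # COPY can be overridden (e.g. COPY=echo) to preview the transfers.
    for f in metrics.prof timeline.prof; do
        ${COPY:-scp} "$remote:$dir/$f" . || return 1
    done
}
```

For example, running ```fetch_profiles user@login-host /path/to/run``` on your local machine would pull both files into the current directory, ready to import into ```nvvp```.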