@@ -42,3 +42,31 @@ To do that, run your code with the following:
```
nvprof --metrics achieved_occupancy,executed_ipc -o metrics.prof --export-profile timeline.prof
```
### Testing a single task
On the cuda_test branch, you can edit and compile a test that runs a single task. To do so, copy the task you wish to test into the tests/testcuda.cu file and update do_test_pair or do_test accordingly. You will also need to switch runPair on or off in main.

To compile the test, run the following script (written for daint):
```
./make_cuda.sh
```
Then run the following job script:
```
#!/bin/bash -l
#SBATCH --job-name=job
#SBATCH --time=01:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-core=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=12
#SBATCH --partition=normal
#SBATCH --constraint=gpu
#SBATCH --res=eurohack

# Timeline of kernel launches and memory copies
nvprof --export-profile timeline.prof ./testcuda -p 8 -r 10
# Achieved occupancy and executed IPC for each kernel
nvprof --metrics achieved_occupancy,executed_ipc -o metrics.prof ./testcuda -p 8 -r 10
# PC sampling for source-level analysis
nvprof --source-level-analysis pc_sampling -o pcsampling.prof ./testcuda -p 8 -r 10
# Metrics required by the Visual Profiler's guided analysis
nvprof --analysis-metrics -o analysis_metrics.prof ./testcuda -p 8 -r 10
```
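Once the job has finished, it can be worth checking that every nvprof run actually wrote its output before you open the files in the Visual Profiler. The snippet below is a minimal sketch of such a check. `check_profiles` is a hypothetical helper, not part of the repository, and it assumes the four output names used in the job script above.

```shell
# Sketch only: check_profiles is a hypothetical helper, not part of the repo.
# It assumes the four output file names passed to nvprof in the job script.
check_profiles() {
    cp_dir="${1:-.}"    # directory the job ran in (defaults to the current one)
    cp_missing=0
    for cp_file in timeline.prof metrics.prof pcsampling.prof analysis_metrics.prof; do
        # -s is true only if the file exists and is not empty
        if [ ! -s "$cp_dir/$cp_file" ]; then
            echo "missing or empty: $cp_file"
            cp_missing=$((cp_missing + 1))
        fi
    done
    # Exit status is the number of missing profiles, so 0 means all are present
    return "$cp_missing"
}
```

For example, `check_profiles . && echo "all profiles written"` prints the confirmation only when all four files are present and non-empty.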