... | @@ -45,7 +45,9 @@ Example timeline once the data has loaded: |
### PC Sampling
[PC Sampling](http://docs.nvidia.com/cuda/profiler-users-guide/#pc-sampling) shows, on a line-by-line basis, where our code spends most of its time. Here's a [quick video](/uploads/8ab48e54f89e07520ed6679fa4558404/nvvp.mp4) that shows how to get to the PC Sampling area in `nvvp`.

At the moment, most of the time in our code is spent waiting for data to arrive from global memory into registers, which we hope to improve with caching.
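
As a rough illustration of the kind of caching meant here, the sketch below stages data in shared memory so that each thread block reads a tile from global memory once and then reuses it. It is not taken from our code: the kernel `pair_sum_tiled`, the host wrapper `pair_sum`, the `TILE` size and the toy interaction (a sum of squared differences) are all hypothetical, and shared memory is only one possible form of caching.

```cuda
#include <cassert>
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

constexpr int TILE = 128;  // threads per block == elements per shared tile (assumed value)

// Hypothetical toy pair kernel: out[i] = sum over j of (x[i] - x[j])^2.
// Must be launched with blockDim.x == TILE.
__global__ void pair_sum_tiled(const float *x, float *out, int n) {
  __shared__ float tile[TILE];
  const int i = blockIdx.x * blockDim.x + threadIdx.x;
  const float xi = (i < n) ? x[i] : 0.0f;
  float acc = 0.0f;

  for (int base = 0; base < n; base += TILE) {
    // Each thread loads one element, so the block reads this tile
    // from global memory only once.
    const int j = base + threadIdx.x;
    if (j < n) tile[threadIdx.x] = x[j];
    __syncthreads();

    // All threads in the block now reuse the tile from shared memory.
    const int count = min(TILE, n - base);
    for (int k = 0; k < count; ++k) {
      const float d = xi - tile[k];
      acc += d * d;
    }
    __syncthreads();  // do not overwrite the tile while it is still being read
  }
  if (i < n) out[i] = acc;
}

// Host wrapper (hypothetical helper; error checking omitted for brevity).
std::vector<float> pair_sum(const std::vector<float> &x) {
  const int n = static_cast<int>(x.size());
  std::vector<float> out(n);
  if (n == 0) return out;

  float *d_x = nullptr, *d_out = nullptr;
  cudaMalloc(&d_x, n * sizeof(float));
  cudaMalloc(&d_out, n * sizeof(float));
  cudaMemcpy(d_x, x.data(), n * sizeof(float), cudaMemcpyHostToDevice);

  const int blocks = (n + TILE - 1) / TILE;
  pair_sum_tiled<<<blocks, TILE>>>(d_x, d_out, n);

  // This blocking copy also waits for the kernel to finish.
  cudaMemcpy(out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
  cudaFree(d_x);
  cudaFree(d_out);
  return out;
}
```

Without the tile, every thread would read all `n` values of `x` straight from global memory. With it, each block issues one global load per element per tile. Whether this helps our kernels in practice would have to be confirmed by re-running the PC Sampling analysis.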
### Realtime Profiling on Piz Daint
... | @@ -54,8 +56,7 @@ To analyse the data on your local machine you can use either [this nvidia tool]( |
### Testing a single task
On the `cuda_test` branch, you can edit and compile a test that runs a single task. To do so, copy the task that you wish to test into the `tests/testcuda.cu` file and update `do_test_pair` or `do_test`. You will also need to switch `runPair` on or off in `main`.

To compile the test, run the build script (written for Piz Daint):

```
./make_cuda.sh
```