Summary of Eurohack17
=====================

We were unable to profile our code using the MegaKernel™ due to CUDA limitations, so our work focused on the individual tasks instead.

All speedups are given relative to the naive version and were measured on not-yet-fully-optimized code.

What we have tried:

* Shared memory: a speedup of about 3x (for the self-density task, with shared memory large enough to fit the cell)^1
* Symmetry: about 1.5x
* Sorted computation: 70x

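We take the "symmetry" item to mean computing each pair interaction once and scattering the result to both particles, which roughly halves the pair work. A minimal CPU sketch of that idea, with a made-up interaction kernel (the real SPH density kernel is not shown here):

```c
#include <stddef.h>
#include <math.h>

/* Both versions compute, for every particle, the sum of |x_i - x_j| over
 * all other particles (a stand-in for an SPH density contribution). */

/* Naive: each particle loops over all others, so every pair is visited twice. */
void density_naive(const double *x, double *rho, size_t n) {
    for (size_t i = 0; i < n; i++) {
        rho[i] = 0.0;
        for (size_t j = 0; j < n; j++)
            if (j != i) rho[i] += fabs(x[i] - x[j]);
    }
}

/* Symmetric: visit each pair once and update both particles, halving the
 * number of interactions actually computed. */
void density_symmetric(const double *x, double *rho, size_t n) {
    for (size_t i = 0; i < n; i++) rho[i] = 0.0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++) {
            double w = fabs(x[i] - x[j]);
            rho[i] += w;
            rho[j] += w;
        }
}
```

On a GPU the symmetric update needs atomic or otherwise serialized writes to `rho[j]`, which is presumably why the gain (about 1.5x) is less than the ideal 2x.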
What tricks we have learned:

* Avoid leaving threads in a warp waiting (e.g. in a loop, instead of `if (i == j) continue;`, manually increment the index so the thread does not idle while the others work)

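The trick above can be sketched in plain C (function names are ours). On the GPU, the `continue` branch makes one thread of the warp idle for an iteration; rewriting the loop so the self index is skipped by index arithmetic keeps every iteration doing useful work:

```c
#include <stddef.h>

/* Naive version: every iteration checks j == i, so the thread whose loop
 * reaches its own index branches and sits out one warp iteration. */
double sum_others_naive(const double *x, size_t n, size_t i) {
    double acc = 0.0;
    for (size_t j = 0; j < n; j++) {
        if (j == i) continue;  /* divergent branch on the GPU */
        acc += x[j];
    }
    return acc;
}

/* Transformed version: loop over n-1 iterations and step the index past
 * the self particle branch-free, so no thread waits on the others. */
double sum_others_skip(const double *x, size_t n, size_t i) {
    double acc = 0.0;
    for (size_t k = 0; k + 1 < n; k++) {
        size_t j = k + (k >= i);  /* maps 0..n-2 onto 0..n-1 skipping i */
        acc += x[j];
    }
    return acc;
}
```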
What we are working on:

* Shared memory with a smaller size than the cell size

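The natural way to use shared memory smaller than the cell is to process the cell in tiles: cooperatively load a chunk into the small buffer, interact everything with that chunk, then move on. A minimal CPU sketch under that assumption (`TILE`, the names, and the interaction kernel are ours; on the GPU the copy would be a cooperative load into `__shared__` memory followed by `__syncthreads()`):

```c
#include <stddef.h>
#include <math.h>

#define TILE 4  /* stand-in for the shared-memory capacity, in particles */

/* Accumulate a toy density rho_i = sum over j != i of |x_i - x_j|,
 * processing the cell's particles in TILE-sized chunks. */
void density_tiled(const double *x, double *rho, size_t n) {
    double buf[TILE];  /* the shared-memory analogue */
    for (size_t i = 0; i < n; i++) rho[i] = 0.0;
    for (size_t start = 0; start < n; start += TILE) {
        size_t len = (n - start < TILE) ? n - start : TILE;
        for (size_t k = 0; k < len; k++)       /* "shared" load of one tile */
            buf[k] = x[start + k];
        for (size_t i = 0; i < n; i++)         /* interact everyone with the tile */
            for (size_t k = 0; k < len; k++)
                if (start + k != i) rho[i] += fabs(x[i] - buf[k]);
    }
}
```

The result is independent of `TILE`, so the buffer size can be tuned to the hardware without changing the physics.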
![Screenshot_2017-09-25_14-15-12](/uploads/02029b0cb2021237c111f54b9a0e79c5/Screenshot_2017-09-25_14-15-12.png)

^1 Number quoted from memory, so it may not be exact.