From 6f4180093b36a7325b442b258321f55071a7aa10 Mon Sep 17 00:00:00 2001
From: "Peter W. Draper" <p.w.draper@durham.ac.uk>
Date: Tue, 24 Sep 2019 13:01:53 +0100
Subject: [PATCH] Add some markup

---
 README.md | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index ff7a86e..2601f6e 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,10 @@
+SWIFTmpistepsim
+===============
 
-This package is a standalone part of [SWIFT](http://www.swiftsim.com) that
+This project is a standalone part of [SWIFT](http://www.swiftsim.com) that
 aims to roughly simulate the MPI interactions that taking a single step of a
-SWIFT simulation makes.
+SWIFT simulation makes. This makes it easier to see the performance of MPI
+calls and makes investigations of possible tuning more obvious.
 
 The actual process within SWIFT is that queues of cell-based tasks are ran,
 with their priorities and dependencies determining the order that the tasks
@@ -9,7 +12,7 @@ are ran in. Tasks are only added to a queue when they are ready to run, that
 is they are not waiting for other tasks. This order also determines when the
 sends and recvs needed to update data on other ranks are initiated as this
 happens when the associated task is queued. The sends and recvs are considered
-to be complete when MPI_Test returns true and this unlocks any dependencies
+to be complete when `MPI_Test` returns true and this unlocks any dependencies
 they have. Obviously a step cannot complete until all the sends and recvs are
 themselves also complete, so the performance of the MPI library and lower
 layers is critical. This seems to be most significant, not when we have a lot
@@ -17,9 +20,9 @@ of work, or very little, but for intermediary busy steps, when the local work
 completes much sooner than the MPI exchanges.
 
 In SWIFT the enqueuing of tasks, thus send and recvs initiation (using
-MPI_Isend and MPI_Irecv) can happen from all the available threads, but the
-polling of MPI_Test is done primarily using two queues, but these can steal
-work from other queues, and other queues can steal MPI_Test calls as well.
+`MPI_Isend` and `MPI_Irecv`) can happen from all the available threads. The
+polling of `MPI_Test` is done primarily using two queues, although these can
+steal work from other queues, and other queues can steal `MPI_Test` calls too.
 Enqueuing and processing can happen at the same time.
 
 To keep this simple this package uses three threads to simulate all this, a
@@ -53,9 +56,9 @@ numbers of ranks as the original run of SWIFT.
 
 The verbose output and output log can be inspected to see what delays are
 driving the elapsed time for the step. Mainly these seem to be outlier
-MPI_Test calls that take tens of milliseconds.
+`MPI_Test` calls that take tens of milliseconds.
 
-A script post-process.py can be ran on the output log to pair the sends and
+A script `post-process.py` can be run on the output log to pair the sends and
 recvs across the ranks. This allows the inspection of how well things like
 eager exchanges are working and what effect the size of the packets has.
 
-- 
GitLab
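
The mechanism the patched README describes (non-blocking sends/recvs initiated
as tasks are queued, with `MPI_Test`-style polling that unlocks dependent tasks
on completion) can be sketched in Python without a real MPI runtime. This is a
minimal illustration only, not SWIFT's implementation; the class, request
names, and task names below are all hypothetical.

```python
import time
from collections import deque

class SimRequest:
    """Stand-in for an MPI request handle: 'completes' asynchronously
    after a fixed delay, like a posted MPI_Isend/MPI_Irecv."""
    def __init__(self, name, delay, dependents):
        self.name = name
        self.done_at = time.monotonic() + delay  # when the exchange "finishes"
        self.dependents = dependents             # tasks unlocked on completion

    def test(self):
        """Analogue of MPI_Test: non-blocking completion check."""
        return time.monotonic() >= self.done_at

def poll(requests):
    """Polling loop: repeatedly test outstanding requests and, when one
    completes, unlock its dependent tasks (here: collect their names).
    The step cannot finish until every request has completed."""
    unlocked = []
    pending = deque(requests)
    while pending:
        req = pending.popleft()
        if req.test():
            unlocked.extend(req.dependents)  # dependencies released
        else:
            pending.append(req)              # still in flight; retry later
    return unlocked

# Hypothetical requests: a send that completes before a recv.
reqs = [SimRequest("send/cell-12", 0.01, ["density task"]),
        SimRequest("recv/cell-7", 0.02, ["force task"])]
print(poll(reqs))  # → ['density task', 'force task']
```

In SWIFT itself the polling is spread across queues that can steal `MPI_Test`
calls from one another; this sketch collapses that to a single loop to show
only the test-until-complete pattern.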