Commit 9f5e8883 authored by Matthieu Schaller's avatar Matthieu Schaller

Merge branch 'scheduler_timeout' into 'master'

Deadlock detector

See merge request !1731
parents 1698369a 471b9972
@@ -31,7 +31,7 @@ This feature will create an individual file for each step specified by the ``Sch
Using this feature has several requirements:
- You need to compile SWIFT including either ``--enable-debugging-checks`` or ``--enable-cell-graph`` (see the example configure call after this list). Otherwise, cells won't have IDs.
- There is a limit on how many cell IDs SWIFT can handle while enforcing them to be reproducibly unique. That limit is up to 32 top level cells in any dimension, and up to 16 levels of depth. If any of these thresholds are exceeded, the cells will still have unique cell IDs, but the actual IDs will most likely vary between any two runs.
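A minimal sketch of a build with the debugging checks enabled could look as follows; the usual autotools workflow is assumed here, and you will want to add whatever compiler, MPI, and library options your setup needs:

.. code-block:: bash

   ./configure --enable-debugging-checks
   make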
To plot the task dependencies, you can use the same script as before: ``tools/plot_task_dependencies.py``. The dependency graph may now have some tasks with a pink-ish background colour: these tasks represent dependencies that are unlocked by some other task which is executed for the requested cell, while the cell itself doesn't have an (active) task of that type in that given step.
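As a sketch, assuming the dependency data for a step was written to a file named ``dependency_graph_0.csv`` (the file name here is illustrative and depends on your run), the invocation could be:

.. code-block:: bash

   python3 tools/plot_task_dependencies.py dependency_graph_0.csv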
@@ -43,7 +43,7 @@ At the beginning of each simulation the file ``task_level_0.txt`` is generated.
It contains the counts of all tasks at all levels (depths) in the tree.
The depths and counts of the tasks can be plotted with the script ``tools/plot_task_levels.py``.
It will display the individual tasks at the x-axis, the number of each task at a given level on the y-axis, and the level is shown as the colour of the plotted point.
Additionally, using the ``--count`` flag, the script can write in brackets next to each task's name on the x-axis the number of different levels on which the task exists.
Finally, in some cases the counts for different levels of a task may be very close to each other and overlap on the plot, making them barely visible.
This can be alleviated by using the ``--displace`` flag:
It will displace the plot points along the y-axis in an attempt to make them more visible; the displayed counts won't be exact in that case, however.
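A sketch of an invocation combining both flags, assuming the script accepts the ``task_level_0.txt`` file mentioned above as a positional argument:

.. code-block:: bash

   python3 tools/plot_task_levels.py task_level_0.txt --count --displace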
@@ -141,7 +141,7 @@ Each line of the logs contains the following information:
size: size, in bytes, of the request
sum: sum, in bytes, of all requests that are currently not logged as complete
The stic values should be synchronised between ranks as all ranks have a
barrier in place to make sure they start the step together, so should be
suitable for matching between ranks. The unique keys to associate records
between ranks (so that the MPI_Isend and MPI_Irecv pairs can be identified)
@@ -205,6 +205,8 @@ by using the size of the task data files to schedule parallel processes more
effectively (the ``--weights`` argument).
.. _dumperThread:
Live internal inspection using the dumper thread
------------------------------------------------
@@ -236,6 +238,38 @@ than once. For a non-MPI run the file is simply called ``.dump``; note that for MPI
you need to create one file per rank, so ``.dump.0``, ``.dump.1`` and so on.
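For example, to trigger a dump by hand from the run directory of a 4-rank MPI run (the rank count here is illustrative; for a non-MPI run a single ``touch .dump`` suffices):

.. code-block:: bash

   for rank in 0 1 2 3; do touch .dump.$rank; done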
Deadlock Detector
---------------------------
When configured with ``--enable-debugging-checks``, the parameter
.. code-block:: yaml
Scheduler:
deadlock_waiting_time_s: 300.
can be specified. It sets the time (in seconds) the scheduler should wait
for a new task to be executed during a simulation step (specifically: during a
call to ``engine_launch()``). After this time passes without any new tasks being
run, the scheduler assumes that the code has deadlocked. It then dumps the same
diagnostic data as :ref:`the dumper thread <dumperThread>` (active tasks, queued
tasks, and memuse/MPIuse reports, if SWIFT was configured with the corresponding
flags) and aborts.
A value of zero or a negative value for ``deadlock_waiting_time_s`` disables the
deadlock detector.
You are well advised to err on the generous side when choosing the
``deadlock_waiting_time_s`` parameter. A value on the order of several (tens of)
minutes is recommended. Too small a value might cause your run to erroneously
crash and burn despite not really being deadlocked, just slow or badly
balanced.
Neighbour search statistics
---------------------------
@@ -151,6 +151,7 @@ Scheduler:
task_level_output_frequency: 0 # (Optional) Dumping frequency of the task level data. By default, writes only at the first step.
free_foreign_during_restart: 0 # (Optional) Should the code free the foreign data when dumping restart files in order to get breathing space?
free_foreign_during_rebuild: 0 # (Optional) Should the code free the foreign data when calling a rebuild in order to get breathing space?
deadlock_waiting_time_s: 0. # (Optional) If runners didn't fetch a new task from a queue after this many seconds, assume swift deadlocked and abort. Non-positive values turn the detector off. Needs --enable-debugging-checks and MPI to take effect.
# Parameters governing the time integration (Set dt_min and dt_max to the same value for a fixed time-step run.)
TimeIntegration:
@@ -1731,6 +1731,10 @@ void engine_launch(struct engine *e, const char *call) {
e->sched.deadtime.active_ticks += active_time;
e->sched.deadtime.waiting_ticks += getticks() - tic;
#ifdef SWIFT_DEBUG_CHECKS
e->sched.last_successful_task_fetch = 0LL;
#endif
if (e->verbose)
message("(%s) took %.3f %s.", call, clocks_from_ticks(getticks() - tic),
clocks_getunit());
@@ -778,4 +778,7 @@ void engine_struct_dump(struct engine *e, FILE *stream);
void engine_struct_restore(struct engine *e, FILE *stream);
int engine_dump_restarts(struct engine *e, int drifted_all, int force);
/* dev/debug */
void engine_dump_diagnostic_data(struct engine *e);
#endif /* SWIFT_ENGINE_H */
@@ -52,6 +52,31 @@ extern int engine_max_parts_per_ghost;
extern int engine_max_sparts_per_ghost;
extern int engine_max_parts_per_cooling;
/**
* @brief dump diagnostic data on tasks, memuse, mpiuse, queues.
*
* @param e the #engine
*/
void engine_dump_diagnostic_data(struct engine *e) {
/* OK, do our work. */
message("Dumping engine tasks in step: %d", e->step);
task_dump_active(e);
#ifdef SWIFT_MEMUSE_REPORTS
/* Dump the currently logged memory. */
message("Dumping memory use report");
memuse_log_dump_error(e->nodeID);
#endif
#if defined(SWIFT_MPIUSE_REPORTS) && defined(WITH_MPI)
/* Dump the MPI interactions in the step. */
mpiuse_log_dump_error(e->nodeID);
#endif
/* Add more interesting diagnostics. */
scheduler_dump_queues(e);
}
/* Particle cache size. */
#define CACHE_SIZE 512
@@ -75,23 +100,7 @@ static void *engine_dumper_poll(void *p) {
while (1) {
if (access(dumpfile, F_OK) == 0) {
/* OK, do our work. */
message("Dumping engine tasks in step: %d", e->step);
task_dump_active(e);
#ifdef SWIFT_MEMUSE_REPORTS
/* Dump the currently logged memory. */
message("Dumping memory use report");
memuse_log_dump_error(e->nodeID);
#endif
#if defined(SWIFT_MPIUSE_REPORTS) && defined(WITH_MPI)
/* Dump the MPI interactions in the step. */
mpiuse_log_dump_error(e->nodeID);
#endif
/* Add more interesting diagnostics. */
scheduler_dump_queues(e);
engine_dump_diagnostic_data(e);
/* Delete the file. */
unlink(dumpfile);
@@ -263,6 +272,13 @@ void engine_config(int restart, int fof, struct engine *e,
error("Scheduler:task_level_output_frequency should be >= 0");
}
#if defined(SWIFT_DEBUG_CHECKS)
e->sched.deadlock_waiting_time_ms = parser_get_opt_param_float(
params, "Scheduler:deadlock_waiting_time_s", -1.f);
/* User provides parameter in s. We want it in ms. */
e->sched.deadlock_waiting_time_ms *= 1000.f;
#endif
/* Deal with affinity. For now, just figure out the number of cores. */
#if defined(HAVE_SETAFFINITY)
const int nr_cores = sysconf(_SC_NPROCESSORS_ONLN);
@@ -2811,6 +2811,77 @@ struct task *scheduler_unlock(struct scheduler *s, struct task *t) {
return NULL;
}
/**
* Take note of the time at which a task was successfully fetched from the
* queue.
*
* @param s The #scheduler.
*/
void scheduler_mark_last_fetch(struct scheduler *s) {
#if defined(SWIFT_DEBUG_CHECKS)
if (s->deadlock_waiting_time_ms <= 0.f) return;
ticks now = getticks();
ticks last = s->last_successful_task_fetch;
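/* Compare-and-swap loop: if another thread updated the timestamp between
 * our read and the swap, the swap fails; re-read the clock and the stored
 * value and try again. */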
while (atomic_cas(&s->last_successful_task_fetch, last, now) != last) {
now = getticks();
last = s->last_successful_task_fetch;
}
#endif
}
/**
* Abort the run if you're stuck doing nothing for too long.
* This function is intended to abort the mission if you're
* deadlocked somewhere and somehow. You might get core dumps
* this way. Alternatively, you might manually set a breakpoint
* with gdb when this function is called.
*
* @param s The #scheduler.
*/
void scheduler_check_deadlock(struct scheduler *s) {
#if defined(SWIFT_DEBUG_CHECKS)
if (s->deadlock_waiting_time_ms <= 0.f) return;
/* lock_lock(&s->last_task_fetch_lock); */
ticks now = getticks();
ticks last = s->last_successful_task_fetch;
if (last == 0LL) {
/* Ensure that the first check in each engine_launch doesn't fail. There is no
* guarantee how long it will take from the point where
* last_successful_task_fetch was reset to get to this point. A poorly
* chosen scheduler->deadlock_waiting_time_ms may abort a big run in places
* where there is no deadlock. Better safe than sorry, so at start-up, the
* last successful task fetch time is marked as 0. So we just exit without
* checking the time. */
while (atomic_cas(&s->last_successful_task_fetch, last, now) != last) {
now = getticks();
last = s->last_successful_task_fetch;
}
return;
}
/* ticks on different CPUs may disagree a bit. So we may end up
* with last > now, and consequently negative idle time, which
* then overflows unsigned long longs and gives false positives. */
const ticks big = max(now, last);
const ticks small = min(now, last);
const double idle_time = clocks_diff_ticks(big, small);
if (idle_time > s->deadlock_waiting_time_ms) {
message(
"Detected what looks like a deadlock after %g ms of no new task being "
"fetched from queues. Dumping diagnostic data.",
idle_time);
engine_dump_diagnostic_data(s->e);
error("Aborting now.");
}
#endif
}
/**
* @brief Get a task, preferably from the given queue.
*
@@ -2854,11 +2925,12 @@ struct task *scheduler_gettask(struct scheduler *s, int qid,
TIMER_TIC
res = queue_gettask(&s->queues[qids[ind]], prev, 0);
TIMER_TOC(timer_qsteal);
if (res != NULL) {
break;
} else {
qids[ind] = qids[--count];
}
}
if (res != NULL) break;
}
}
@@ -2877,10 +2949,13 @@
}
pthread_mutex_unlock(&s->sleep_mutex);
}
scheduler_check_deadlock(s);
}
if (res != NULL) {
scheduler_mark_last_fetch(s);
/* Start the timer on this task, if we got one. */
res->tic = getticks();
#ifdef SWIFT_DEBUG_TASKS
res->rid = qid;
@@ -2943,6 +3018,11 @@ void scheduler_init(struct scheduler *s, struct space *space, int nr_tasks,
s->tasks = NULL;
s->tasks_ind = NULL;
scheduler_reset(s, nr_tasks);
#if defined(SWIFT_DEBUG_CHECKS)
s->e = space->e;
s->last_successful_task_fetch = 0LL;
#endif
}
/**
@@ -125,6 +125,20 @@ struct scheduler {
/* Frequency of the task levels dumping. */
int frequency_task_levels;
#if defined(SWIFT_DEBUG_CHECKS)
/* Stuff for the deadlock detector */
/* How long to wait (in ms) before assuming we're in a deadlock */
float deadlock_waiting_time_ms;
/* Time at which last task was successfully retrieved from a queue */
ticks last_successful_task_fetch;
/* needed to dump queues on deadlock detection */
struct engine *e;
#endif /* SWIFT_DEBUG_CHECKS */
};
/* Inlined functions (for speed). */