Draft: RT Rescheduling: First version

Mladen Ivkovic requested to merge rt_reschedule into master

Hey y'all

I've gotten a first working version of the subcycling going. I figured it'd be best to let you have a look at it in its infant version and comment/restructure before I start stacking things on top of it.

The main idea is the following:

  • Add a new task at the end of the RT task group, rt_reschedule. (See dependency graph below.)
  • This task then "reschedules" the relevant RT tasks for the given cell. "Rescheduling" means it does two things: 1) set task->skip back to 0, and 2) set the number of unresolved dependencies, t->wait, to the correct value.
  • This task also blocks further progression down the task dependency graph, by increasing by 1 the number of locks on the dependent task that it would usually unlock.
  • Finally, it enqueues (using the proper scheduler_enqueue()) only the RT task at the top level of the RT task hierarchy. The other tasks should then be re-run automatically, as if it were their first time.
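To make the four steps above concrete, here's a minimal sketch using a toy task struct. It is not the actual code of this branch: toy_task, nr_deps, queued, toy_enqueue() and toy_rt_reschedule() are hypothetical names made up for illustration, and only skip and wait mirror the fields mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a scheduler task. Only `skip` and `wait` correspond to
 * fields named in the description; everything else is hypothetical. */
struct toy_task {
  int skip;    /* 1: the scheduler ignores this task */
  int wait;    /* number of unresolved dependencies */
  int nr_deps; /* assumed: value `wait` must be reset to before a re-run */
  int queued;  /* 1 once the task has been handed to the (toy) queue */
};

/* Stand-in for scheduler_enqueue(): only an active task without pending
 * dependencies is actually queued. */
static void toy_enqueue(struct toy_task *t) {
  if (!t->skip && t->wait == 0) t->queued = 1;
}

/* Sketch of the rt_reschedule work for one cell. rt_tasks[0] is assumed to
 * be the top of the RT task hierarchy; `next` is the task that
 * rt_reschedule would normally unlock. */
static void toy_rt_reschedule(struct toy_task **rt_tasks, size_t n,
                              struct toy_task *next) {
  assert(n > 0);
  for (size_t i = 0; i < n; i++) {
    rt_tasks[i]->skip = 0;                    /* 1) unskip */
    rt_tasks[i]->wait = rt_tasks[i]->nr_deps; /* 2) reset dependencies */
    rt_tasks[i]->queued = 0;
  }
  /* Block downstream progress: completing rt_reschedule removes one lock
   * from `next`, so add one to compensate. */
  next->wait += 1;
  /* Only the top-level RT task is enqueued; the rest follow through the
   * usual dependency resolution. */
  toy_enqueue(rt_tasks[0]);
}
```

In this toy model, calling toy_rt_reschedule() on a cell's RT tasks leaves only the top-level task queued, while the downstream task stays blocked even after rt_reschedule itself completes.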

There is a minor complication with that: The rt_reschedule task itself needs to be rescheduled as well. However, we can't do this inside the rt_reschedule task, since once it's done, scheduler_done() will mark it as to be skipped.

A possible solution was to add a second task, rt_requeue, which would come after rt_reschedule and reschedule it, while itself being rescheduled by the rt_reschedule task beforehand. However, in some cases that led to dependency issues: in the debugging mode for RT, the tasks themselves have almost no work to do, so other threads managed to finish all the other work and tried to reschedule the rt_requeue task while it wasn't finished yet, i.e. it hadn't completed the scheduler_done() call and was still marked as unskipped.

To keep things clean and avoid possible problems with that, I ended up using the following solution: After the rt_reschedule task has been called, and after the respective call to scheduler_done() in the runner_main() loop, we make a call to a requeue function (not task). This ensures that the rt_reschedule task is properly marked as done, and can be safely unskipped again.
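The ordering described here can be sketched roughly as follows. Again, this is a toy model rather than the real runner_main() loop: sub_task, sub_run(), sub_done(), sub_requeue(), sub_runner_step() and the subcycles_left argument are hypothetical, and sub_done() only mimics the one property of scheduler_done() that matters here (marking the task as skipped).

```c
#include <assert.h>

enum sub_type { sub_type_other, sub_type_rt_reschedule };

/* Toy task; `skip` and `wait` mirror the fields named in the description,
 * the remaining members are hypothetical. */
struct sub_task {
  enum sub_type type;
  int skip;    /* 1: marked as done / to be skipped */
  int wait;    /* number of unresolved dependencies */
  int nr_deps; /* assumed: value `wait` is reset to on requeue */
  int runs;    /* how often the task body was executed */
};

/* Stand-in for executing the task body. */
static void sub_run(struct sub_task *t) { t->runs++; }

/* Stand-in for scheduler_done(): a finished task is marked as skipped. */
static void sub_done(struct sub_task *t) { t->skip = 1; }

/* The requeue *function* (not a task). It is only safe to call once the
 * task has been marked as done. */
static void sub_requeue(struct sub_task *t) {
  assert(t->skip == 1);
  t->skip = 0;
  t->wait = t->nr_deps;
}

/* One iteration of a runner-like loop for a single task. */
static void sub_runner_step(struct sub_task *t, int subcycles_left) {
  sub_run(t);
  sub_done(t);
  /* Requeue strictly after the "done" call, and only for rt_reschedule. */
  if (t->type == sub_type_rt_reschedule && subcycles_left > 0) sub_requeue(t);
}
```

Because the requeue happens in the same thread right after the "done" call, no other thread can observe the rt_reschedule task in a half-finished state, which is the race the rt_requeue task approach ran into.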

To illustrate, here's a schematic image:

RTTaskDependencies-simplified.pdf

And as promised, a dependency graph: dependency_graph_0

Finally, I added a YAML parameter to enable the RT subcycling. By default, it is turned off (for now), so it shouldn't interfere with anybody else's work.

Edited by Matthieu Schaller
