Draft: RT Rescheduling: First version
Hey y'all
I've gotten a first working version of the subcycling going. I figured it'd be best to let you have a look at it in its infant version and comment/restructure before I start stacking things on top of it.
The main idea is the following:
- Add a new task, `rt_reschedule`, at the end of the RT task group. (See dependency graph below.)
- This task then "reschedules" the relevant RT tasks for the given cell. "Rescheduling" means it does two things: 1) set `task->skip` back to 0, and 2) set the number of dependencies, `t->wait`, to the correct value.
- This task also blocks further progression down the task dependency graph, by increasing by 1 the number of locks of the dependent task it would usually unlock.
- Finally, only the RT task at the top level of the RT task hierarchy is enqueued (using the proper `scheduler_enqueue()`). The remaining tasks should then be re-run automatically, as if it were their first time.
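To make the steps above concrete, here is a minimal sketch on a toy task model. It is not the actual code from this branch: `struct toy_task`, its `nr_deps` field, and both helper functions are hypothetical names made up for illustration. Only the `skip` and `wait` counters mirror the fields mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a scheduler task. NOT the real task struct. */
struct toy_task {
  int skip;    /* 1: task is inactive, 0: task should run */
  int wait;    /* number of dependencies that are still unresolved */
  int nr_deps; /* hypothetical: total number of tasks this one depends on */
};

/* Re-arm the RT tasks of one cell: unskip them and restore their wait
 * counters. Also hold back the task that would usually be unlocked next by
 * adding one extra lock, which cancels the unlock that happens when the
 * reschedule task itself finishes. */
static void toy_rt_reschedule(struct toy_task *rt_tasks, size_t n,
                              struct toy_task *next) {
  assert(rt_tasks != NULL && next != NULL);
  for (size_t i = 0; i < n; i++) {
    rt_tasks[i].skip = 0;
    rt_tasks[i].wait = rt_tasks[i].nr_deps;
  }
  next->wait += 1;
}

/* Return the index of the first active task without open dependencies, i.e.
 * the only one that needs to be enqueued by hand, or -1 if there is none. */
static int toy_find_top_level(const struct toy_task *rt_tasks, size_t n) {
  for (size_t i = 0; i < n; i++) {
    if (rt_tasks[i].skip == 0 && rt_tasks[i].wait == 0) return (int)i;
  }
  return -1;
}
```

In this sketch, everything below the top-level task keeps a non-zero `wait` and is therefore picked up through the normal dependency unlocking rather than being enqueued explicitly.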
There is a minor complication with that: the `rt_reschedule` task itself needs to be rescheduled as well. However, we can't do this inside the `rt_reschedule` task, since once it's done, `scheduler_done()` will mark it as to be skipped.
A possible solution was to add a second task, `rt_requeue`, which would come after `rt_reschedule` and re-schedule it, while itself being re-scheduled by the `rt_reschedule` task beforehand. However, in some cases that led to dependency issues: in the debugging mode for RT, the tasks themselves have almost no work to do, so other threads managed to finish all the other work and were trying to re-schedule the `rt_requeue` task while it wasn't finished yet, i.e. it hadn't completed the `scheduler_done()` call and was still marked as unskipped.
To keep things clean and avoid possible problems with that, I ended up using the following solution: after the `rt_reschedule` task has been called, and after the respective call to `scheduler_done()` in the `runner_main()` loop, we make a call to a requeue function (not task). This ensures that the `rt_reschedule` task is properly marked as done, and can be safely unskipped again.
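The ordering described here can be sketched roughly as follows. This is again a toy model, not the actual `runner_main()` code: `struct mini_task`, `toy_done()`, `toy_requeue()`, `toy_runner_step()`, and the `subcycles_left` argument are all hypothetical names. The only point is that the requeue happens strictly after the task has been marked as done.

```c
#include <assert.h>

/* Toy stand-in for a scheduler task. NOT the real task struct. */
struct mini_task {
  int skip; /* 1: task is inactive, 0: task should run */
  int wait; /* number of dependencies that are still unresolved */
};

/* Toy analogue of the "done" step: a finished task is marked as skipped. */
static void toy_done(struct mini_task *t) { t->skip = 1; }

/* Hypothetical requeue helper. It is a plain function, not a task, so it
 * cannot race with its own completion. */
static void toy_requeue(struct mini_task *t, int nr_deps) {
  assert(t->skip == 1); /* only valid once the task was marked as done */
  t->skip = 0;
  t->wait = nr_deps;
}

/* One pass of a runner-like loop over the reschedule task. */
static void toy_runner_step(struct mini_task *t, int nr_deps,
                            int subcycles_left) {
  /* ... the task body itself would run here ... */
  toy_done(t);                                     /* first: mark as done */
  if (subcycles_left > 0) toy_requeue(t, nr_deps); /* then: unskip again */
}
```

Because the unskipping happens in the same thread right after the "done" step, no other thread can see the task in a half-finished state, which was the problem with the separate `rt_requeue` task.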
To illustrate, here's a schematic image:
RTTaskDependencies-simplified.pdf
And as promised, a dependency graph:
Finally, I added a yaml parameter to enable the RT subcycling. By default, it is turned off (for now), so it shouldn't bother anybody else's work.