Operating Systems

Scheduling

Prof. Dr. Oliver Hahm

2024-12-12

What does the OS need to implement in order to enable multitasking?

Process Switching

Dispatcher

Dispatching and Scheduling

Tasks of multitasking OS are among others:
- Dispatching: Assign the CPU to another process (process switching)
- Scheduling: Determine the order of process execution and the exact point in time when the process switch occurs

The dispatcher carries out the state transitions of the processes
The scheduler determines these transitions happen

Performance Considerations

The scheduler may run …
- periodically (e.g., on Linux)
- for every interrupt (e.g., on RIOT)
Is called frequently and hence, should be as efficient as possible
Every call to the scheduler may trigger the dispatcher to run
Must be efficient as well

What does the dispatcher have to do?

The Dispatcher

We already know…

During process switching, the dispatcher removes the CPU from the running process and assigns it to the process, which is the first one in the queue
For transitions between the states ready and blocked, the dispatcher removes the corresponding process control blocks from the status lists and accordingly inserts them new
Transitions from or to the state running always imply a switch of the process, which is currently executed by the CPU

If a process switches into the state running or from the state running to another state, the dispatcher needs to…

store the context (register contents) of the executed process in the process control block (PCB)
assign the CPU to another process
restore the context (register contents) of the process, which will be executed next, from its process control block (PCB)

Which process is executed if no process is on the runqueue?

Idle Process

Many OS have a idle process
If no process is in the state ready an idle process gets the CPU assigned
The idle process is always ready to run and has the lowest priority
Many modern CPU provide power-saving modes \(\rightarrow\) most OS will enter a power-saving mode when the idle process is running
For each CPU core (in hyperthreading systems for each logical CPU) a system idle process exists

Scheduling

Scheduling Criteria and Scheduling Policies

The scheduler of an OS specifies the order in which the dispatcher puts the processes in the state ready
The best scheduling policy (or scheduling algorithm) depends on the use case
- No scheduling policy…
  - is optimally suited for every system and
  - can take all scheduling criteria optimal into account.
The scheduling policy is always a tradeoff between different scheduling criteria

What might be a scheduling criteria?

Scheduling criteria

Scheduling criteria are among others CPU load, response time (latency), turnaround time, throughput, efficiency, real-time behavior (compliance with deadlines), waiting time, overhead, fairness, consideration of priorities, even resource utilization

When to interrupt a running process?

Non-preemptive and preemptive Scheduling

Two types of scheduling policies exist
- Non-preemptive scheduling or cooperative scheduling
  - Any running process will either run until completion or voluntarily yields
  - Problematic: A process may occupy the CPU for as long as it wants
- Examples: Windows 3.x, MacOS 8/9, Windows 95/98/Me (for 16-Bit processes)
- Preemptive scheduling
  - The CPU may be removed from a process before its execution is completed
  - Drawback: Higher overhead compared with non-preemptive scheduling
- Examples: Linux, MacOS X, Windows 95/98/Me (for 32-Bit processes), Windows NT (incl. XP/Visa/7/8/10/11), FreeBSD, RIOT

Preemptive Scheduling in RIOT

In RIOT a running process is only removed from the run queue if a process with a higher priority becomes ready to run.

How can we measure
the performance of a
scheduling policy?

Performance Metrics

Waiting Time: The time a process has to wait before getting the CPU assigned
CPU Time: The time that the process needs to access the CPU to complete its execution
Runtime: = "lifetime" = time period between the creation and the termination of a process = (CPU time + waiting time)

Impact on the overall Performance of a Computer

This example demonstrates the impact of the scheduling method used on the overall performance of a computer
- The processes \(P_{A}\) and \(P_{B}\) are to be executed one after the other

Process	CPU time
A	\(24\) ms
B	\(2\) ms

Which order seems to be preferable?

If a short-running process runs before a long-running process, the runtime and waiting time of the long process process get slightly worse
If a long-running process runs before a short-running process, the runtime and waiting time of the short process get significantly worse

Execution	Runtime		Average	Waiting time		Average
order	A	B	runtime	A	B	waiting time
\(P_{A},P_{B}\)	24 ms	26 ms	\(\frac{24+26}{2} = 25\) ms	0 ms	24 ms	\(\frac{0+24}{2} = 12\) ms
\(P_{B},P_{A}\)	26 ms	2 ms	\(\frac{2+26}{2} = 14\) ms	2 ms	0 ms	\(\frac{0+2}{2} = 1\) ms

Schedule Representation

The execution order of processes according to a certain scheduling strategy can be represented as a Gantt Chart
A Gantt chart is a type of bar chart which can be used to illustrate a schedule
Gantt charts were designed by the engineer and consultant Henry Gantt

Scheduling Policies (Algorithms)

Scheduling Policies

Several scheduling policies exist
- Each policy tries to comply with the well-known scheduling criteria and principles in varying degrees
Some scheduling policies:
- Priority-driven scheduling
- First Come First Served (FCFS) = First In First Out (FIFO)
- Last Come First Served (LCFS)
- Round Robin (RR) with time quantum
- Shortest/Longest Job First (SJF/LJF)
- Shortest/Longest Remaining Time First (SRTF/LRTF)
- Highest Response Ratio Next (HRRN)
- Earliest Deadline First (EDF)
- Static multilevel scheduling
- Multilevel feedback scheduling
- Completely Fair Scheduler (CFS)

Priority-driven Scheduling

Processes are executed according to their priority (= importance or urgency)
The highest priority process in state ready gets the CPU assigned
Can be preemptive and non-preemptive
The priority values can be assigned static or dynamic
- Static priorities remain unchanged throughout the lifetime of a process and are often used in real-time systems
- Dynamic priorities are adjusted during a process’ lifetime
  \(\Longrightarrow\) Multilevel feedback scheduling (see slide )
Risk of (static) priority-driven scheduling: Processes with low priority values may starve (\(\Longrightarrow\) this is not fair)

Priority-driven Scheduling

Source: William Stallings. Operating Systems. 4^th edition. Prentice Hall (2001). P.401

Priority-driven Scheduling – Example

Four processes shall be processed on a system with a single CPU
All processes are at time point 0 in state ready
Execution order of the processes as Gantt chart (timeline)

Process	CPU time	Priority
A	8 ms	15
B	4 ms	3
C	7 ms	4
D	13 ms	8

Runtime of the processes

Process	A	B	C	D
Runtime	32	4	11	24

Avg. runtime = \(\frac{32+4+11+24}{4} = 17.75\mbox{ ms}\)

Waiting time of the processes

Process	A	B	C	D
Waiting time	24	0	4	11

Avg. waiting time = \(\frac{24+0+4+11}{4} = 9.75\mbox{ ms}\)

First Come First Served (FCFS)

Works according to the principle First In First Out (FIFO)
Running processes are not interrupted
- It is non-preemptive scheduling
FCFS is fair
- All processes are eventually executed
The average waiting time may be very high under certain circumstances
- The execution of short-lived processes may have to wait for a long time if processes with long execution times have arrived before
FCFS/FIFO can be used for \(\Longrightarrow\) batch processing
FIFO is used in Linux for non-preemptive real-time processes

First Come First Served – Example

Four processes shall be processed on a system with a single CPU
Execution order of the processes as Gantt chart

Process	CPU time	Creation time
A	8 ms	0 ms
B	4 ms	1 ms
C	7 ms	3 ms
D	13 ms	5 ms

Runtime of the processes

Process	A	B	C	D
Runtime	8	11	16	27

Avg. runtime = \(\frac{8+11+16+27}{4} = 15.5\mbox{ ms}\)

Waiting time of the processes

Process	A	B	C	D
Waiting time	0	7	9	14

Avg. waiting time = \(\frac{0+7+9+14}{4} = 7.5\mbox{ ms}\)

Last Come First Served (LCFS)

Works according to the principle Last In First Out (LIFO)
Processes are executed in the reverse order of creation
- The concept is equal with a stack
Running processes are not interrupted
- The processes have the CPU assigned until process termination or voluntary resigning
LCFS is not fair
- In case of continuous creation of new processes, the old processes are not taken into account and thus may starve
LCFS can be used for \(\Longrightarrow\) batch processing
- Is seldom used in pure form

Last Come First Served – Example

Four processes shall be processed on a system with a single CPU
Execution order of the processes as Gantt chart

Process	CPU time	Creation time
A	8 ms	0 ms
B	4 ms	1 ms
C	7 ms	3 ms
D	13 ms	5 ms

Runtime of the processes

Process	A	B	C	D
Runtime	8	31	25	16

\(\frac{8+31+25+16}{4} = 20\mbox{ ms}\)

Waiting time of the processes

Process	A	B	C	D
Waiting time	0	27	18	3

\(\frac{0+27+18+3}{4} = 12\mbox{ ms}\)

Last Come First Served – Preemptive Variant (LCFS-PR)

A new process in state ready replaces the currently executed processes from the CPU
- Preempted processes are enqueued at the end
- If no new processes are created, the running process has the CPU assigned until process termination or voluntary resigning
Prefers processes with a short execution time
- The execution of a process with a short execution time may be completed before new process are created
- Processes with a long execution time may get the CPU resigned several times and thus significantly delayed
LCFS-PR is not fair
- Processes with a long execution time may never get the CPU assigned and starve
Is seldom used in pure form

Last Come First Served Example – Preemptive Variant

Four processes shall be processed on a system with a single CPU
Execution order of the processes as Gantt chart

Process	CPU time	Creation time
A	8 ms	0 ms
B	4 ms	1 ms
C	7 ms	3 ms
D	13 ms	5 ms

Runtime of the processes

Process	A	B	C	D
Runtime	32	24	20	13

\(\frac{32+24+20+13}{4} = 22.25\mbox{ ms}\)

Waiting time of the processes

Process	A	B	C	D
Waiting time	24	20	13	0

\(\frac{24+20+13+0}{4} = 14.25\mbox{ ms}\)

Which scheduling strategy may be well suited for generic user space applications?

Round Robin – RR (1/2)

Time slices with a fixed duration (may be \(\infty\)!) are specified
The processes are queued in a cyclic queue according to the FIFO principle
- The first process of the queue gets the CPU assigned for the duration of a time slice
- After the expiration of the time slice, the process gets the CPU resigned (\(\Rightarrow\) is preempted) and is enqueued at the end of the queue
- Whenever a process is completed successfully, it is removed from the queue
  - New processes are inserted at the end of the queue

The CPU time is distributed fair among the processes
RR with time slice size \(\infty\) behaves like \(\longrightarrow\) FCFS

Round Robin – RR (2/2)

The longer the execution time of a process is, the more rounds are required for its complete execution
The duration of the time slices influences the performance of the system
- The shorter they are, the more process switches must take place
  \(\Longrightarrow\) increased overhead
- The longer they are, the more gets the simultaneousness lost
  \(\Longrightarrow\) The system hangs/becomes jerky
The usual duration of time slices is in single or double-digit millisecond range
Prefers processes with short execution time
Preemptive scheduling policy
Round Robin scheduling can be used for interactive systems
Round Robin is used in Linux for preemptive real-time processes

Round Robin – Example

Four processes shall be processed on a system with a single CPU
All processes are at time point 0 in state ready
Time quantum \(q = 1\) ms
Execution order of the processes as Gantt chart

Process	CPU time
A	8 ms
B	4 ms
C	7 ms
D	13 ms

Runtime of the processes

Process	A	B	C	D
Runtime	26	14	24	32

Avg. runtime = \(\frac{26+14+24+32}{4} = 24\mbox{ ms}\)

Waiting time of the processes

Process	A	B	C	D
Waiting time	18	10	17	19

Avg. waiting time = \(\frac{18+10+17+19}{4} = 16\mbox{ ms}\)

Shortest Job First (SJF) / Shortest Process Next (SPN)

The process with the shortest execution time get the CPU assigned first
Non-preemptive scheduling policy
Problem: The runtime of each process needs to be known in advance
Solution: Execution time is estimated by analyzing its behavior in the past
SJF is not fair
- Prefers processes, which have a short execution time
- Processes with a long execution time may get the CPU assigned only after a very long waiting period or starve
If the execution time of the processes can be estimated, SJF can be used for batch processing

Shortest Job First – Example

Four processes shall be processed on a system with a single CPU
All processes are at time point 0 in state ready
Execution order of the processes as Gantt chart

Process	CPU time
A	8 ms
B	4 ms
C	7 ms
D	13 ms

Runtime of the processes

Process	A	B	C	D
Runtime	19	4	11	32

\(\frac{19+4+11+32}{4} = 16.5\mbox{ ms}\)

Waiting time of the processes

Process	A	B	C	D
Waiting time	11	0	4	19

\(\frac{11+0+4+19}{4} = 8.5\mbox{ ms}\)

Shortest Remaining Time First (SRTF)

Preemptive SJF is called Shortest Remaining Time First (SRTF)
On process creation the remaining execution time of the running process is compared with each process in state ready in the queue
- If the currently running process has the shortest remaining execution time, the CPU remains assigned to this process
- If one or more processes in state ready have a shorter remaining execution time, the process with the shortest remaining execution time gets the CPU assigned
Estimation of runtime is required
As long as no new process is created, no running process gets interrupted
- The processes in state ready are compared with the running process only when a new process is created!
Processes with a long execution time may starve (\(\Longrightarrow\) not fair)

Shortest Remaining Time First – Example

Four processes shall be processed on a system with a single CPU
Execution order of the processes as Gantt chart

Process	CPU time	Creation time
A	8 ms	0 ms
B	4 ms	3 ms
C	7 ms	16 ms
D	13 ms	11 ms

Runtime of the processes

Process	A	B	C	D
Runtime	12	4	7	21

\(\frac{12+4+7+21}{4} = 11\mbox{ ms}\)

Waiting time of the processes

Process	A	B	C	D
Waiting time	4	0	0	8

\(\frac{4+0+0+8}{4} = 3\mbox{ ms}\)

Longest Job First (LJF)

The process with the longest execution time get the CPU assigned first
Non-preemptive scheduling policy
Estimation of runtime is required
LJF is not fair
- Prefers processes, which have a long execution time
- Processes with a short execution time may get the CPU assigned only after a very long waiting period or starve
If the execution time of the processes can be estimated, LJF can be used for batch processing

Longest Job First – Example

Four processes shall be processed on a system with a single CPU
All processes are at time point 0 in state ready
Execution order of the processes as Gantt chart

Process	CPU time
A	8 ms
B	4 ms
C	7 ms
D	13 ms

Runtime of the processes

Process	A	B	C	D
Runtime	21	32	28	13

\(\frac{21+32+28+13}{4} = 23.5\mbox{ ms}\)

Waiting time of the processes

Process	A	B	C	D
Waiting time	13	28	21	0

\(\frac{13+28+21+0}{4} = 15.5\mbox{ ms}\)

Longest Remaining Time First (LRTF)

Preemptive LJF is called Longest Remaining Time First (LRTF)
If a new process is created, the remaining execution time of the running process is compared with each process in state ready in the queue
- If the currently running process has the longest remaining execution time, the CPU remains assigned to this process
- If one or more processes in state ready have a longer remaining execution time, the process with the longest remaining execution time gets the CPU assigned
Estimation of runtime is required
As long as no new process is created, no running process gets interrupted
- The processes in state ready are compared with the running process only when a new process is created!
Processes with a short duration may starve (\(\Longrightarrow\) not fair)

Longest Remaining Time First – Example

Four processes shall be processed on a system with a single CPU
Execution order of the processes as Gantt chart

Process	CPU time	Creation time
A	8 ms	0 ms
B	4 ms	6 ms
C	7 ms	21 ms
D	13 ms	11 ms

Runtime of the processes

Process	A	B	C	D
Runtime	32	4	7	20

\(\frac{32+4+7+20}{4} = 15.75\mbox{ ms}\)

Waiting time of the processes

Process	A	B	C	D
Waiting time	24	0	0	7

\(\frac{24+0+0+7}{4} = 7.75\mbox{ ms}\)

Highest Response Ratio Next (HRRN)

Fair variant of SJF/SRTF/LJF/LRTF
- Takes the age of the process into account in order to avoid starvation
The response ratio is calculated for each process \[\mbox{Response ratio} = \frac{\mbox{Estimated execution time} + \mbox{Waiting time}}{\mbox{Estimated execution time}}\]
Response ratio value of a process after creation: 1.0
- The value rises fast for short processes
- Objective: Response ratio should be as small as possible for each process
After process termination or if a process becomes blocked, the CPU is assigned to the process with the highest response ratio
Just as with SJF/SRTF/LJF/LRTF, the execution times of the processes must be estimated via by statistical recordings
It is impossible that processes starve \(\Longrightarrow\) HRRN is fair

Earliest Deadline First (EDF)

Used in real-time operating operating systems (RTOS)
Objective: processes should comply with their deadlines when possible
Processes in ready state are arranged according to their deadline
- The process with the closest deadline gets the CPU assigned next
The queue is reviewed and reorganized whenever…
- a new process switches into state ready
- or an active process terminates
Can be implemented as preemptive and non-preemptive scheduling
- Preemptive EDF can be used in RTOS
- Non-preemptive EDF can be used for batch processing
EDF is used in Linux for preemptive real-time processes

Earliest Deadline First – Example

Four processes shall be processed on a system with a single CPU
All processes are at time point 0 in state ready
Execution order of the processes as Gantt chart

Process	CPU time	Deadline
A	8 ms	25
B	4 ms	18
C	7 ms	9
D	13 ms	34

Runtime of the processes

Process	A	B	C	D
Runtime	19	11	7	32

Avg. runtime = \(\frac{19+11+7+32}{4} = 17.25\mbox{ ms}\)

Waiting time of the processes

Process	A	B	C	D
Waiting time	11	7	0	19

Avg. waiting time = \(\frac{11+7+0+19}{4} = 9.25\mbox{ ms}\)

Multilevel Scheduling

Each scheduling policy require compromises wrt scheduling criteria
- Procedure in practice: Several scheduling strategies are combined
  \(\Longrightarrow\) Static or dynamic multilevel scheduling

Static Multilevel Scheduling

The list of processes of ready state is split into multiple sublists
- For each sublist, a different scheduling policy may be used

The sublists have different priorities or time multiplexes (e.g., 80%:20% or 60%:30%:10%)
- Makes it possible to separate time-critical from non-time-critical processes

Example of allocating the processes to different process classes (sublists) with different scheduling strategies:

Priority	Process class	Scheduling policy
1	Real-time processes (time-critical)	Priority-driven scheduling
2	Interactive processes	Round Robin
3	Compute-intensive batch processes	First Come First Served

Multilevel Feedback Scheduling (1/2)

It is impossible to predict the execution time precisely in advance
- Solution: Processes, which utilized much execution time in the past, get sanctioned
Multilevel feedback scheduling works with multiple queues
- Each queue has a different priority or time multiplex
  (e.g., 70%:15%:10%:5%)
Each new process is added to the top queue
- This way it has the highest priority
Each queue uses Round Robin
- If a process returns the CPU on voluntary basis, it is added to the same queue again
- If a process utilized its entire time slice, it is inserted in the next lower queue, with has a lower priority
  - The priorities are therefore dynamically assigned with this policy
Multilevel feedback scheduling is preemptive scheduling

Multilevel Feedback Scheduling (2/2)

Benefit:
- No complicated estimations!
  - New processes are quickly assigned to a priority category
Prefers new processes over older (longer-running) processes

Source: William Stallings. Operating Systems. 4^th edition. Prentice Hall (2001). P.413

Processes with many I/O operations are preferred because they typically yield when waiting for I/O
Older, longer-running processes are delayed

Many modern operating systems use variants of multilevel feedback scheduling for the scheduling of the processes.
Examples: Linux for regular processes (until Kernel 2.4), Mac OS X, FreeBSD, NetBSD, and the Windows NT family

Completely Fair Scheduler (Linux since 2.6.23)

The kernel implements a CFS for every CPU core and maintains a variable vruntime (virtual runtime) for every SCHED_OTHER process
- The value represents a virtual processor runtime in nanoseconds

Goal: All processes should get a similar (fair) share of computing time of the CPU core they are assigned to
\(\Longrightarrow\) For \(n\) processes, each process should get \(1/n\) of the CPU time

If a process gets replaced from the CPU core, the vruntime value is increased by the time the process did run on the CPU core
The nodes (processes) in the tree move continuously from right to left
\(\Longrightarrow\) fair distribution of CPU resources

vruntime indicates how long the particular process has already used the CPU core
- The process with the lowest vruntime gets access to the CPU core next
The management of the processes is done using a red-black tree (self-balancing binary search tree)
- The processes are sorted in the tree by their vruntime values

If a process got the CPU core assigned, it can run until its vruntime value has reached the targeted portion of \(1/n\) of the available CPU time
The scheduler aims for an equal vruntime value for all processes

The CFS only takes care of regular (i.e., non-real-time) processes that are assigned to the scheduling policy SCHED_OTHER

The scheduler takes into account the static process priorities (nice values) of the processes
The vruntime values are weighted differently depending on the nice value
- In other words: The virtual clock can run at different speeds

Classic and modern Scheduling Methods

	Scheduling		Fair	CPU time	Takes priorities
	NP	P		must be known	into account
Priority-driven scheduling	X	X	no	no	yes
First Come First Served = FIFO	X		yes	no	no
Last Come First Served	X	X	no	no	no
Round Robin		X	yes	no	no
Shortest/Longest Job First	X		no	yes	no
Shortest Remaining Time First		X	no	yes	no
Longest Remaining Time First		X	no	yes	no
Highest Response Ratio Next	X		yes	yes	no
Earliest Deadline First	X	X	yes	no	no
Static multilevel scheduling		X	no	no	yes (static)
Multilevel feedback scheduling		X	yes	no	yes (dynamic)
Completely Fair Scheduler		X	yes	no	yes

NP = non-preemptive scheduling, P = preemptive scheduling
A scheduling policy is fair when each process gets the CPU assigned at some point
It is impossible to calculate the execution time precisely in advance

Linux’ Scheduling Policies

In Linux e.g., each process is assigned to a specific scheduling policy
For real-time processes…
- SCHED_FIFO (priority-driven scheduling, non-preemptive)
- SCHED_RR (preemptive)
- SCHED_DEADLINE (EDF scheduling, preemptive)
For non real-time processes…
- SCHED_OTHER (default Linux time-sharing scheduling) implemented as…
  - Multilevel Feedback Scheduling (until Kernel 2.4)
  - O(1) scheduler (Kernel 2.6.0 until 2.6.22)
  - Completely Fair Scheduler (since Kernel 2.6.23)

The RIOT Scheduler – Example

Summary

You should now be able to answer the following questions:

What steps does the dispatcher need to carry out for switching between processes?
What is scheduling?
How do preemptive scheduling and non-preemptive scheduling work?
Explain the functioning of several common scheduling methods?
How does scheduling in modern operating systems works in detail?