The check for an active timeout in z_is_thread_ready() was originally
added to cover the case of a sleeping thread. However, since there is
now a bit in the thread state that indicates if the thread is sleeping
we can drop that superfluous check.
Making this change necessitates moving k_wakeup()'s call to
z_abort_thread_timeout() so that it is within the locked
_sched_spinlock section to ensure that we do not end up with
a stray thread timeout in the timeout list.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
SMP does not need to mark the current thread as queued in
k_yield() as that will naturally get done in do_swap().
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
When the PM subsystem is enabled, the idle thread locks the scheduler for
the duration the system is suspended. If a meta-IRQ preempts the idle
thread in this state, the idle thread is tracked in `metairq_preempted`.
However, when returning from the preemption, the idle thread is not removed
from `metairq_preempted`, unlike all the other threads. As a result, the
scheduler keeps running the idle thread even if there are higher priority
threads ready to run.
This change treats the idle thread the same way as all other threads when
returning from a meta-IRQ preemption.
Fixes#64705
Signed-off-by: Kalle Kietäväinen <kalle.kietavainen@silabs.com>
Define the generic _current directly and get rid of the generic
arch_current_get().
The SMP default implementation is now known as z_smp_current_get().
It is no longer inlined which saves significant binary size (about 10%
for some random test case I checked).
Introduce z_current_thread_set() and use it in place of
arch_current_thread_set() for updating the current thread pointer
given this is not necessarily an architecture specific operation.
The architecture specific optimization, when enabled, should only care
about its own things and not have to also update the generic
_current_cpu->current copy.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Mostly a revert of commit b1def7145f ("arch: deprecate `_current`").
This commit was part of PR #80716 whose initial purpose was about providing
an architecture specific optimization for _current. The actual deprecation
was sneaked in later on without proper discussion.
The Zephyr core always used _current before and that was fine. It is quite
prevalent as well and the alternative is proving rather verbose.
Furthermore, as a concept, the "current thread" is not something that is
necessarily architecture specific. Therefore the primary abstraction
should not carry the arch_ prefix.
Hence this revert.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Gives a hint to the compiler that the bail-out paths in both
k_thread_suspend() and k_thread_resume() are unlikely events.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Adds customized yield implementations based upon the selected
scheduler (dumb, multiq or scalable). Although each follows the
same broad outline, some of them allow for additional tweaking
to extract maximal performance. For example, the multiq variant
improves the performance of k_yield() by about 20%.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
The routine k_reschedule() allows an application to manually force
a schedule point. Although similar to k_yield(), it has different
properties. The most significant difference is that k_yield() if
invoked from a cooperative thread will voluntarily give up execution
control to the next thread of equal or higher priority while
k_reschedule() will not.
Applications that play with EDF deadlines via k_thread_deadline_set()
may need to use k_reschedule() to force a reschedule.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Sleeping and suspended are now orthogonal states. That is, a thread
may be both sleeping and suspended and the two do not interact. One
repercussion of this is that suspending a thread will no longer
abort its timeout.
Threads are now created in the 'sleeping' state instead of a
'suspended' state. This dovetails nicely with the start delay that
can be given to a newly created thread--it is as though the very
first operation that a thread with a start delay is a sleep.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Fix the issue: https://github.com/zephyrproject-rtos/zephyr/issues/79863
The expected_wakeup_ticks and sys_clock_tick_get_32() are uint32_t values,
and may wrap around individually.
If the expected_wakeup_ticks has a wraparound and sys_clock_tick_get_32()
doesn't, so expected_wakeup_ticks < sys_clock_tick_get_32(), the API return
value will be corrupted.
The API return value, that is the remaining time, should be calculated in
32bit-unsigned-integer manner, and any wraparound will be treated properly.
Signed-off-by: Akaiwa Wataru <akaiwa@sonas.co.jp>
Traditionally threads have been initialized with a PRESTART flag set,
which gets cleared when the thread runs for the first time via either
its timeout or the k_thread_start() API.
But if you think about it, this is no different, semantically, than
SUSPENDED: the thread is prevented from running until the flag is
cleared.
So unify the two. Start threads in the SUSPENDED state, point
everyone looking at the PRESTART bit to the SUSPENDED flag, and make
k_thread_start() be a synonym for k_thread_resume().
There is some mild code size savings from the eliminated duplication,
but the real win here is that we make space in the thread flags byte,
which had run out.
Signed-off-by: Andy Ross <andyross@google.com>
k_thread_suspend() is an async API intended to stop any thread in any
state from any context. Some apps just want to use it to "suspend
myself", which is a much (!) simpler operation. Detect that specific
usage as a performance case.
Signed-off-by: Andy Ross <andyross@google.com>
`_current` is now functionally equals to `arch_curr_thread()`, remove
its usage in-tree and deprecate it instead of removing it outright,
as it has been with us since forever.
Signed-off-by: Yong Cong Sin <ycsin@meta.com>
Signed-off-by: Yong Cong Sin <yongcong.sin@gmail.com>
Add the following arch-specific APIs:
- arch_curr_thread()
- arch_set_curr_thread()
which allow SMP architectures to implement a faster "get current
thread pointer" than the default provided by the kernel. The 'set'
function is required for the 'get' to work, more on that later.
When `CONFIG_ARCH_HAS_CUSTOM_CURRENT_IMPL` is selected, calls to
`_current` & `k_sched_current_thread_query()` will be redirected to
`arch_curr_thread()`, which ideally should translate into a single
instruction read, avoiding the current
"lock > read CPU > read current thread > unlock" path in SMP
architectures and thus greatly improves the read performance.
However, since the kernel relies on a copy of the "current thread"s on
every CPU for certain operations (i.e. to compare the priority of the
currently scheduled thread on another CPU to determine if IPI should be
sent), we can't eliminate the copy of "current thread" (`current`) from
the `struct _cpu` and therefore the kernel now has to invoke
`arch_set_curr_thread()` in addition to what it has been doing. This
means that it will take slightly longer (most likely one instruction
write) to change the current thread pointer on the current
CPU.
Signed-off-by: Yong Cong Sin <ycsin@meta.com>
Signed-off-by: Yong Cong Sin <yongcong.sin@gmail.com>
Move the initialization of the priority q for running out of sched.c to
remove one more ifdef from sched.c. No change in functionality but
better matches the rest of sched.c and priority_q.h such that the
ifdefry needed is done in in priority_q.h.
Signed-off-by: Tom Burdick <thomas.burdick@intel.com>
Inlining z_unpend_first_thread() has been observed to give a
+8% and +16% performance boost to the thread_metric benchmark's
message processing and synchronization tests respectively.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Removing the routine z_ready_thread_locked() as it is not
used anywhere. It was a leftover artefact from development
that previously escaped cleanup.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Applies the 'unlikely' attribute to various kernel objects that
use z_unpend_first_thread() to optimize for the non-blocking path.
This boosts the thread_metric synchronization benchmark numbers
on the frdm_k64f board by about 10%.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
This improves context switching by 7% when measured using the
thread_metric benchmark.
Before:
**** Thread-Metric Preemptive Scheduling Test **** Relative Time: 120
Time Period Total: 5451879
After:
**** Thread-Metric Preemptive Scheduling Test **** Relative Time: 30
Time Period Total: 5853535
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Implements a set of tests designed to show how the performance of the
three scheduler queue implementations (DUMB, SCALABLE and MULTIQ)
varies with respect to the number of threads in the ready queue.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Removes the ISR restriction from k_thread_priority_set().
Since the first commit, the routine k_thread_priority_set() has
had a restriction preventing it from being called from within
the context of an ISR. As there does not (any longer) appear to
be a reason why the restriction exists, it is being removed.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Removes the ALWAYS_INLINE attribute from the definition of the
routine z_unpend_thread_no_timeout() to fix an unresolved symbol
error that was occurring with some compilers.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Utilize a code spell-checking tool to scan for and correct spelling errors
in all files within the `kernel` directory.
Signed-off-by: Pisit Sawangvonganan <pisit@ndrsolution.com>
When this new Kconfig option is enabled, preempted threads on an SMP
system may generate additional IPIs when they are switched out.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
The deadline of deadline scheduler should lager than zero
because if deadline is negative, it menas the task should
be finished in past.
Signed-off-by: TaiJu Wu <tjwu1217@gmail.com>
Platforms that support IPIs allow them to be broadcast via the
new arch_sched_broadcast_ipi() routine (replacing arch_sched_ipi()).
Those that also allow IPIs to be directed to specific CPUs may
use arch_sched_directed_ipi() to do so.
As the kernel has the capability to track which CPUs may need an IPI
(see CONFIG_IPI_OPTIMIZE), this commit updates the signalling of
tracked IPIs to use the directed version if supported; otherwise
they continue to use the broadcast version.
Platforms that allow directed IPIs may see a significant reduction
in the number of IPI related ISRs when CONFIG_IPI_OPTIMIZE is
enabled and the number of CPUs increases. These platforms can be
identified by the Kconfig option CONFIG_ARCH_HAS_DIRECTED_IPIS.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
The CONFIG_IPI_OPTIMIZE configuration option allows for the flagging
and subsequent signaling of IPIs to be optimized.
It does this by making each bit in the kernel's pending_ipi field
a flag that indicates whether the corresponding CPU might need an IPI
to trigger the scheduling of a new thread on that CPU.
When a new thread is made ready, we compare that thread against each
of the threads currently executing on the other CPUs. If there is a
chance that that thread should preempt the thread on the other CPU
then we flag that an IPI is needed for that CPU. That is, a clear bit
indicates that the CPU absolutely will not need to reschedule, while a
set bit indicates that the target CPU must make that determination for
itself.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
1. The flagging of IPIs is moved out of k_thread_priority_set() into
z_thread_prio_set(). This allows for an IPI to be done for a thread
that had its priority bumped due to the handling of priority
inheritance from a mutex.
2. k_thread_priority_set()'s check for sched_locked only applies to
non-SMP builds that are using the old arch_swap() framework to switch
between threads.
Incidentally, nearly all calls to flag_ipi() are now performed with
sched_spinlock being locked. The only exception is in slice_timeout().
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
Namespaced the generated headers with `zephyr` to prevent
potential conflict with other headers.
Introduce a temporary Kconfig `LEGACY_GENERATED_INCLUDE_PATH`
that is enabled by default. This allows the developers to
continue the use of the old include paths for the time being
until it is deprecated and eventually removed. The Kconfig will
generate a build-time warning message, similar to the
`CONFIG_TIMER_RANDOM_GENERATOR`.
Updated the includes path of in-tree sources accordingly.
Most of the changes here are scripted, check the PR for more
info.
Signed-off-by: Yong Cong Sin <ycsin@meta.com>
The previous abort-lifecycle fix missed a case: other threads can
enter k_thread_join(), see that the thread is already dead, and then
need to call z_thread_switch_spin() to wait for a context switch. But
the new "dummification" code was (by design!) terminating the thread
such that no context would be saved to it. So switch_handle stayed
NULL and if you hit that timing case correctly[1] you'd deadlock
waiting for a switch that would never come.
Fix is just to set switch_handle when dummifying to any non-NULL
value.
Also add an assertion to catch the obvious case that a thread is
actually dead on the exit path of k_thread_abort() to make sure the
variant path continues to set flags correctly
[1] CI was doing it fairly reliably via tests/kernel/smp_abort on
qemu_cortex_a53 only. Only one of my dev systems could see it,
and then only about 15% of the time.
Signed-off-by: Andy Ross <andyross@google.com>
We've had threads spinning on the thread state bits, but weren't being
careful to ensure that those bits were the last things seen to change
in a halting thread. Move it to the end, and add a barrier for
correctness.
Signed-off-by: Andy Ross <andyross@google.com>
Nicolas Pitre points out that since these thread structs are just
dummies for the context swtiching, they can be presumed to be "write
only" and thus there's no point in having one per CPU, everyone can
share the same one.
The only gotcha is that we never really documented (nor really have a
place to document) that rule, so it's not theoretically impossible for
an architecture to read back what it might have written underneath
arch_switch(). Leave this in a separate commit for bisection
purposes, but the risk seems very low.
Signed-off-by: Andy Ross <andyross@google.com>
After a k_thread_abort(), the resulting thread struct is documented as
unused/free memory that may be re-used (for example, to respawn a new
thread).
But in the special case of aborting the current thread from within an
ISR, that wasn't quite happening. The scheduler cleanup would
complete, but the architecture layer would still try to context switch
away from the aborted thread on exit, and that can include writes to
the now-reused thread struct! The specifics will depend on
architecture (some do a full context save on entry, most don't), but
in the case of USE_SWITCH=y it will at the very least write the
switch_handle field.
Fix this simply, with a per-cpu "switch dummy" thread struct for use
as a target for context switches like this. There is some non-trivial
memory cost to that; thread structs on many architectures are large.
Pleasingly, this also addresses a known deadlock on SMP: because the
"spin in ISR" step now happens as the very last stage of
k_thread_abort() handling, the existing scheduler lock works to
serialize calls such that it's impossible for a cycle of threads to
independently decide to spin on each other: at least one will see
itself as "already aborting" and break the cycle.
Fixes#64646
Signed-off-by: Andy Ross <andyross@google.com>
Updates z_get_next_switch_handle() to set the new thread's base.cpu
value as it is done in do_swap(). This helps to ensure that the
last CPU on which the thread executed remains current.
Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
This reverts commit 61c70626a5.
This PR introduced 2 regressions in main CI:
71977 & 71978
Let's revert it by now to get main's CI passing again.
Signed-off-by: Alberto Escolar Piedras <alberto.escolar.piedras@nordicsemi.no>
This reverts commit 20611f13ca.
This PR introduced 2 regressions in main CI:
71977 & 71978
Let's revert it by now to get main's CI passing again.
Signed-off-by: Alberto Escolar Piedras <alberto.escolar.piedras@nordicsemi.no>
This reverts commit 02b24911f7.
This PR introduced 2 regressions in main CI:
71977 & 71978
Let's revert it by now to get main's CI passing again.
Signed-off-by: Alberto Escolar Piedras <alberto.escolar.piedras@nordicsemi.no>
We've had threads spinning on the thread state bits, but weren't being
careful to ensure that those bits were the last things seen to change
in a halting thread. Move it to the end, and add a barrier for
correctness.
Signed-off-by: Andy Ross <andyross@google.com>