Some thread fields were 32-bit wide, when they are not even close to
using that full range of values. They are instead changed to 8-bit fields.
- prio can fit in one byte, limiting the priorities range to -128 to 127
- recursive scheduler locking can be limited to 255; a rollover results
most probably from a logic error
- flags are split into execution flags and thread states; 8 bits is
enough for each of them currently, with at worst two states and four
flags to spare (on x86, on other archs, there are six flags to spare)
Doing this saves 8 bytes per stack. It also sets up an incoming
enhancement when checking if the current thread is preemptible on
interrupt exit.
Change-Id: Ieb5321a5b99f99173b0605dd4a193c3bc7ddabf4
Signed-off-by: Benjamin Walsh <benjamin.walsh@windriver.com>
Remove legacy option and use SYS_CLOCK_EXISTS where appropriate.
Change-Id: I3d524ea2776e638683f0196c0cc342359d5d810f
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Use least significant bits for common flags and high bits for
arch-specific ones.
Change-Id: I982719de4a24d3588c19a0d30bbe7a27d9a99f13
Signed-off-by: Benjamin Walsh <benjamin.walsh@windriver.com>
Not needed, since only the thread itself can modifiy its own
sched_locked count.
Change-Id: I3d3d8be548d2b24ca14f51637cc58bda66f8b9ee
Signed-off-by: Benjamin Walsh <benjamin.walsh@windriver.com>
Also remove mentions of unified kernel in various places in the kernel,
samples and documentation.
Change-Id: Ice43bc73badbe7e14bae40fd6f2a302f6528a77d
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
The way the ready thread cache was implemented caused it to not always
be "hot", i.e. there could be some misses, which happened when the
cached thread was taken out of the ready queue. When that happened, it
was not replaced immediately, since doing so could mean that the
replacement might not run because the flow could be interrupted and
another thread could take its place. This was the more conservative
approach that insured that moving a thread to the cache would never be
wasted.
However, this caused two problems:
1. The cache could not be refilled until another thread context-switched
in, since there was no thread in the cache to compare priorities
against.
2. Interrupt exit code would always have to call into C to find what
thread to run when the current thread was not coop and did not have the
scheduler locked. Furthermore, it was possible for this code path to
encounter a cold cache and then it had to find out what thread to run
the long way.
To fix this, filling the cache is now more aggressive, i.e. the next
thread to put in the cache is found even in the case the current cached
thread is context-switched out. This ensures the interrupt exit code is
much faster on the slow path. In addition, since finding the next thread
to run is now always "get it from the cache", which is a simple fetch
from memory (_kernel.ready_q.cache), there is no need to call the more
complex C code.
On the ARM FRDM K64F board, this improvement is seen:
Before:
1- Measure time to switch from ISR back to interrupted task
switching time is 215 tcs = 1791 nsec
2- Measure time from ISR to executing a different task (rescheduled)
switch time is 315 tcs = 2625 nsec
After:
1- Measure time to switch from ISR back to interrupted task
switching time is 130 tcs = 1083 nsec
2- Measure time from ISR to executing a different task (rescheduled)
switch time is 225 tcs = 1875 nsec
These are the most dramatic improvements, but most of the numbers
generated by the latency_measure test are improved.
Fixes ZEP-1401.
Change-Id: I2eaac147048b1ec71a93bd0a285e743a39533973
Signed-off-by: Benjamin Walsh <benjamin.walsh@windriver.com>
The fact that a thread is timing out was tracked via two flags: the
K_TIMING thread flag bit, and the thread's timeout's
delta_ticks_from_prev being -1 or not. This duplication could
potentially cause discrepancies if the two flags got out-of-sync, and
there was no benfits to having both.
Since timeouts that are not parts of a thread rely on the value of
delta_ticks_from_prev, standardize on it.
Since the K_TIMING bit is removed from the thread's flags, K_READY would
not reflect the reality anymore. It is removed and replaced by
_is_thread_prevented_froM_running(), which looks at the state flags that
are relevant. A thread that is ready now is not prevented from running
and does not have an active timeout.
Change-Id: I902ef9fb7801b00626df491f5108971817750daa
Signed-off-by: Benjamin Walsh <benjamin.walsh@windriver.com>
Also remove NO_METRIC, which is not referenced anywhere anymore.
Change-Id: Ieaedf075af070a13aa3d975fee9b6b332203bfec
Signed-off-by: Benjamin Walsh <benjamin.walsh@windriver.com>
Move _thread_base initialization to _init_thread_base(), remove mention
of "nano" in timeouts init and move timeout init to _init_thread_base().
Initialize all base fields via the _init_thread_base in semaphore groups
code.
Change-Id: I05b70b06261f4776bda6d67f358190428d4a954a
Signed-off-by: Benjamin Walsh <benjamin.walsh@windriver.com>
In addition to more priorities taking more memory to host them, finding
the next thread to run when it is not cached is slower since each extra
set of 32 priorities maps to a loop iteration. That loop is remove
entirely when the number of priorities is less than 32 (31 + the idle
thread).
Fixes ZEP-1303.
Change-Id: I3205df90d379a0f4456ff1d7f1aaa67ad2cddf15
Signed-off-by: Benjamin Walsh <benjamin.walsh@windriver.com>
There was a lot of duplication between architectures for the definition
of threads and the "nanokernel" guts. These have been consolidated.
Now, a common file kernel/unified/include/kernel_structs.h holds the
common definitions. Architectures provide two files to complement it:
kernel_arch_data.h and kernel_arch_func.h. The first one contains at
least the struct _thread_arch and struct _kernel_arch data structures,
as well as the struct _callee_saved and struct _caller_saved register
layouts. The second file contains anything that needs what is provided
by the common stuff in kernel_structs.h. Those two files are only meant
to be included in kernel_structs.h in very specific locations.
The thread data structure has been separated into three major parts:
common struct _thread_base and struct k_thread, and arch-specific struct
_thread_arch. The first and third ones are included in the second.
The struct s_NANO data structure has been split into two: common struct
_kernel and arch-specific struct _kernel_arch. The latter is included in
the former.
Offsets files have also changed: nano_offsets.h has been renamed
kernel_offsets.h and is still included by the arch-specific offsets.c.
Also, since the thread and kernel data structures are now made of
sub-structures, offsets have to be added to make up the full offset.
Some of these additions have been consolidated in shorter symbols,
available from kernel/unified/include/offsets_short.h, which includes an
arch-specific offsets_arch_short.h. Most of the code include
offsets_short.h now instead of offsets.h.
Change-Id: I084645cb7e6db8db69aeaaf162963fe157045d5a
Signed-off-by: Benjamin Walsh <benjamin.walsh@windriver.com>