Expands the caching documentation to include a high-level overview of caching strategies and the features available in Zephyr.

Signed-off-by: Dane Wagner <dane.wagner@gmail.com>

pull/82886/head
7 changed files with 222 additions and 14 deletions
@@ -0,0 +1,183 @@
.. _cache_guide:

Caching Basics
##############

This section discusses the basics of cache coherency and the situations in which
a user needs to deal with caching explicitly. For more detail on Zephyr's
caching tools, see :ref:`cache_config` for the Kconfig options or
:ref:`cache_api` for the API reference. This section focuses primarily on the
data cache, although systems with cache support typically have an instruction
cache as well.

.. note::

   The information here assumes that the architecture-specific MPU support is
   enabled. See the architecture-specific documentation for details.

.. note::

   While cache coherence can be a concern for data shared between SMP cores,
   Zephyr in general ensures that memory will be seen in a coherent state from
   multiple cores. Most applications will only need to use the cache APIs for
   interaction with external hardware like DMA controllers or foreign CPUs
   running a different OS image. For more information on cache coherence
   between SMP cores, see :kconfig:option:`CONFIG_KERNEL_COHERENCE`.

When dealing with memory shared between a processor core and other bus masters,
cache coherency needs to be considered. Typically, processor caches sit as
close to each processor core as possible to maximize the performance gain.
Because of this, a DMA engine that moves data into or out of memory bypasses
the cache. The processor may keep reading a stale cached copy of a buffer that
the DMA engine has just written, and the DMA engine may read memory that does
not yet contain the processor's latest writes. Either case looks like corrupt
data. If you are moving data using DMA and the processor doesn't see the data
you expect, cache coherency may be the issue.

There are multiple approaches to ensuring that the data seen by the processor
core and peripherals is coherent. The simplest is to disable caching entirely,
but this defeats the purpose of having a hardware cache in the first place and
results in a significant performance hit. Many architectures provide methods for
disabling caching for only a portion of memory. This can be useful when cache
coherence is more important than performance, such as when using DMA with SPI.
Finally, there is the option to flush or invalidate the cache for regions of
memory at runtime.

Globally Disabling the Data Cache
---------------------------------

As mentioned above, globally disabling data caching can have a significant
performance impact but can be useful for debugging.

Requirements:

* :kconfig:option:`CONFIG_DCACHE`: DCACHE control enabled in Zephyr.

* :kconfig:option:`CONFIG_CACHE_MANAGEMENT`: cache API enabled.

* Call :c:func:`sys_cache_data_disable()` to globally disable the data cache.

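As a rough illustration of the last requirement, the sketch below wraps a
suspect code path in a pair of disable and enable calls while debugging. The
helper name ``run_with_dcache_disabled`` and its callback parameter are
hypothetical and not part of Zephyr. ``sys_cache_data_enable()`` is the
counterpart call from the same cache API.

.. code-block:: c

   #include <zephyr/kernel.h>
   #include <zephyr/cache.h>

   /* Hypothetical debug helper, not part of Zephyr. */
   void run_with_dcache_disabled(void (*suspect)(void))
   {
           /* Turn the data cache off for the duration of the call. */
           sys_cache_data_disable();
           suspect();
           /* Restore caching once the experiment is done. */
           sys_cache_data_enable();
   }

If a problem disappears while the data cache is disabled, cache coherency is a
likely cause.
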
Disabling Caching for a Memory Region
-------------------------------------

Disabling caching for only a portion of memory can be a good compromise if
performance on the uncached memory is not critical to the application. This is
a good option if the application requires many small, unrelated buffers that
are smaller than a cache line.

Requirements:

* :kconfig:option:`CONFIG_DCACHE`: DCACHE control enabled in Zephyr.

* :kconfig:option:`CONFIG_MEM_ATTR`: enable the ``mem-attr`` library for
  handling memory attributes in the device tree.

* Annotate your device tree according to :ref:`mem_mgmt_api`.

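The device tree annotation in the last requirement might look like the
following sketch for a target with an Arm MPU. The node label, address, and
size are placeholders chosen for illustration; take the real values from your
SoC's memory map.

.. code-block:: devicetree

   #include <zephyr/dt-bindings/memory-attr/memory-attr-arm.h>

   / {
           /* Placeholder address and size, for illustration only */
           sram4: memory@38000000 {
                   compatible = "zephyr,memory-region", "mmio-sram";
                   reg = <0x38000000 0x10000>;
                   zephyr,memory-region = "SRAM4";
                   zephyr,memory-attr = <( DT_MEM_ARM(ATTR_MPU_RAM_NOCACHE) )>;
           };
   };

In this sketch, the ``zephyr,memory-region`` property gives the region a name
that the linker can place buffers into, and ``zephyr,memory-attr`` carries the
uncached attribute.
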
Assuming the MPU driver is enabled, it will configure the specified regions
according to their memory attributes during kernel initialization. When using a
dedicated uncached region of memory, the linker needs to be instructed to place
buffers into that region. This can be accomplished by specifying the memory
region explicitly using ``Z_GENERIC_SECTION``:

.. code-block:: c

   /* SRAM4 marked as uncached in device tree */
   uint8_t buffer[BUF_SIZE] Z_GENERIC_SECTION("SRAM4");

.. note::

   Configuring a distinct memory region with separate caching rules requires
   the use of an MPU region, which may be a limited resource on some
   architectures. MPU regions may be needed by other memory protection
   features such as :ref:`userspace <mpu_userspace>`,
   :ref:`stack protection <mpu_stack_objects>`, or
   :ref:`memory domains <memory_domain>`.

Automatically Disabling Caching by Variable
-------------------------------------------

Zephyr can automatically define an uncached region in memory and allocate
variables to it using ``__nocache``. Any variable marked with this attribute is
placed in a special ``nocache`` linker region. The MPU driver configures this
region as uncached during initialization. This is a simpler option than
explicitly declaring a region of memory uncached, but it provides less control
over the placement of these variables, as the linker may allocate this region
anywhere in RAM.

Requirements:

* :kconfig:option:`CONFIG_DCACHE`: DCACHE control enabled in Zephyr.

* :kconfig:option:`CONFIG_NOCACHE_MEMORY`: enable allocation of the ``nocache``
  linker region and configure it as uncached.

* Add the ``__nocache`` attribute at the end of any uncached buffer definition:

.. code-block:: c

   uint8_t buffer[BUF_SIZE] __nocache;

.. note::

   See the note above regarding possible limitations on MPU regions. The
   ``nocache`` region is still a distinct MPU region even though it is
   automatically created by Zephyr instead of being explicitly defined by the
   user.

Runtime Cache Control
---------------------

The most performant but most complex option is to control data caching at
runtime. The two most relevant cache operations in this case are **flushing**
and **invalidating**. Both operate on the smallest unit of cacheable memory,
the cache line. Data cache lines are typically 16 to 128 bytes. See
:kconfig:option:`CONFIG_DCACHE_LINE_SIZE`. Cache line sizes are typically fixed
in hardware and not configurable, but Zephyr needs to know the size of cache
lines in order to manage the cache correctly and efficiently. If the buffers in
question are smaller than the data cache line size, it may be more efficient to
place them in an uncached region, as unrelated data packed into the same cache
line may be destroyed when invalidating.

Flushing the cache involves writing all modified cache lines in a specified
region back to shared memory. Flush the cache associated with a buffer after the
processor has written to it and before a remote bus master reads from that
region.

.. note::

   Some architectures support a cache configuration called **write-through**
   caching in which data writes from the processor core propagate through to
   shared memory. While this solves the cache coherence problem for CPU
   writes, it also results in more traffic to main memory which may result in
   performance degradation.

Invalidating the cache works similarly but in the other direction. It marks
cache lines in the specified region as stale, ensuring that the cache line will
be refreshed from main memory when the processor next reads from the specified
region. After a peripheral has written to a buffer, invalidate the data cache
for that buffer before the processor reads from it.

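To make the two directions concrete, here is a toy model in plain C of a single
write-back cache line sitting in front of one line of main memory. It is only a
sketch for building intuition: the ``toy_cache`` structure and all of its
functions are invented for this illustration and have nothing to do with
Zephyr's cache API or with any real cache controller.

.. code-block:: c

   #include <stdbool.h>
   #include <stddef.h>
   #include <stdint.h>
   #include <string.h>

   #define LINE_SIZE 32

   /* Toy model: one cache line in front of one line of "main memory". */
   struct toy_cache {
           uint8_t mem[LINE_SIZE];  /* what a DMA engine reads and writes */
           uint8_t line[LINE_SIZE]; /* the processor's cached copy */
           bool valid;              /* line currently holds a copy of mem */
           bool dirty;              /* line has writes not yet in mem */
   };

   /* Processor read: refill from memory on a miss, then serve from the cache. */
   static uint8_t cpu_read(struct toy_cache *c, size_t i)
   {
           if (!c->valid) {
                   memcpy(c->line, c->mem, LINE_SIZE);
                   c->valid = true;
                   c->dirty = false;
           }
           return c->line[i];
   }

   /* Processor write (write-back policy): only the cached copy changes. */
   static void cpu_write(struct toy_cache *c, size_t i, uint8_t value)
   {
           (void)cpu_read(c, i); /* make sure the line is present first */
           c->line[i] = value;
           c->dirty = true;
   }

   /* Flush: copy a modified line back to memory. */
   static void flush(struct toy_cache *c)
   {
           if (c->valid && c->dirty) {
                   memcpy(c->mem, c->line, LINE_SIZE);
                   c->dirty = false;
           }
   }

   /* Invalidate: forget the cached copy; unflushed writes are lost. */
   static void invalidate(struct toy_cache *c)
   {
           c->valid = false;
           c->dirty = false;
   }

In this model, a value stored with ``cpu_write()`` only reaches ``mem`` after
``flush()``, and a byte changed directly in ``mem`` (standing in for a DMA
write) is only returned by ``cpu_read()`` after ``invalidate()``. Calling
``invalidate()`` on a dirty line discards the processor's write, which is the
destructive case described earlier for buffers that share a cache line.
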
In some cases, the same buffer may be reused, for example for both DMA reads
and DMA writes. In that case it is possible to first flush the cache associated
with the buffer and then invalidate it, ensuring that the cache will be
refreshed the next time the processor reads from the buffer.

Requirements:

* :kconfig:option:`CONFIG_DCACHE`: DCACHE control enabled in Zephyr.

* :kconfig:option:`CONFIG_CACHE_MANAGEMENT`: cache API enabled.

* Call :c:func:`sys_cache_data_flush_range()` to flush a memory region.

* Call :c:func:`sys_cache_data_invd_range()` to invalidate a memory region.

* Call :c:func:`sys_cache_data_flush_and_invd_range()` to flush and invalidate.

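The calls listed above could be combined roughly as in the following sketch of
a transmit and a receive path. The ``dma_start_tx()`` and ``dma_wait_rx()``
functions are hypothetical driver hooks, not Zephyr APIs, and ``BUF_SIZE`` is
assumed to be a multiple of a non-zero :kconfig:option:`CONFIG_DCACHE_LINE_SIZE`.
Return values are ignored for brevity.

.. code-block:: c

   #include <zephyr/kernel.h>
   #include <zephyr/cache.h>
   #include <string.h>

   #define BUF_SIZE 256 /* assumed to be a multiple of the cache line size */

   static uint8_t tx_buf[BUF_SIZE] __aligned(CONFIG_DCACHE_LINE_SIZE);
   static uint8_t rx_buf[BUF_SIZE] __aligned(CONFIG_DCACHE_LINE_SIZE);

   /* Hypothetical driver hooks, not part of Zephyr. */
   void dma_start_tx(const uint8_t *buf, size_t len);
   void dma_wait_rx(uint8_t *buf, size_t len);

   void send_frame(const uint8_t *data, size_t len)
   {
           size_t n = MIN(len, sizeof(tx_buf));

           memcpy(tx_buf, data, n);
           /* Write the processor's changes back to memory before DMA reads. */
           sys_cache_data_flush_range(tx_buf, sizeof(tx_buf));
           dma_start_tx(tx_buf, n);
   }

   uint8_t receive_first_byte(void)
   {
           dma_wait_rx(rx_buf, sizeof(rx_buf));
           /* Drop any stale cached copy before the processor reads. */
           sys_cache_data_invd_range(rx_buf, sizeof(rx_buf));
           return rx_buf[0];
   }

Keeping each DMA buffer in its own aligned, line-sized allocation means the
invalidate call cannot discard unrelated data that happens to share a cache
line.
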
Alignment
---------

As mentioned in the documentation for :c:func:`sys_cache_data_invd_range()` and
associated functions, buffers should be aligned to the cache line size. This can
be accomplished by using ``__aligned``:

.. code-block:: c

   uint8_t buffer[BUF_SIZE] __aligned(CONFIG_DCACHE_LINE_SIZE);
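
To see why alignment matters, it can help to work out which bytes a cache
operation actually touches. The helpers below are a plain C sketch of that
arithmetic; their names are invented for this illustration and are not part of
Zephyr, and the line size is assumed to be a power of two.

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>

   /* Round addr down to a multiple of line (line must be a power of two). */
   static inline uintptr_t line_align_down(uintptr_t addr, size_t line)
   {
           return addr & ~((uintptr_t)line - 1);
   }

   /* Round addr up to a multiple of line (line must be a power of two). */
   static inline uintptr_t line_align_up(uintptr_t addr, size_t line)
   {
           return (addr + line - 1) & ~((uintptr_t)line - 1);
   }

   /* Bytes covered by the cache lines that overlap [addr, addr + size). */
   static inline size_t line_span(uintptr_t addr, size_t size, size_t line)
   {
           return (size_t)(line_align_up(addr + size, line) -
                           line_align_down(addr, line));
   }

With 32-byte lines, an 8-byte buffer at address ``0x101c`` straddles two lines,
so a cache operation on it covers 64 bytes, most of which belong to other
data. A 64-byte buffer at the aligned address ``0x1000`` covers exactly its own
64 bytes.
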