CHERIoT Programmers' Guide

David Chisnall

4. C/C++ extensions for CHERIoT

The CHERIoT platform adds a small number of C/C++ annotations to support the compartment model.

4.1. Exposing compartment entry points

Compartments are discussed in detail in Chapter 5. Compartments and libraries. A compartment can expose functions as entry points via a simple attribute.

The cheri_compartment({name}) attribute specifies the name of the compartment that defines a function. It is used in concert with the -cheri-compartment= compiler flag. Together, these let the compiler know whether a particular function (which may be in another compilation unit) is defined in the same compartment as the current compilation unit, allowing direct calls for functions in the same compartment and cross-compartment calls for other cases.

This can be used on either definitions or declarations but is most commonly used on declarations.

If a function is defined while compiling a compilation unit belonging to a different compartment then the compiler will raise an error. In CHERIoT RTOS, this attribute is always used via the __cheri_compartment({name}) macro. This makes it possible to simply use #define __cheri_compartment(x) when compiling for other platforms.
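As a minimal sketch of that portability trick, the fragment below defines the macro away only when no CHERIoT RTOS header has already provided it, so the same annotated prototype compiles on a conventional host. The counter variable and the read_counter helper are hypothetical names added for illustration and are not part of the RTOS.

```cpp
#include <cassert>

// Assumption: when building for a non-CHERIoT platform, no header has
// defined __cheri_compartment, so it is defined away to nothing here.
#ifndef __cheri_compartment
#	define __cheri_compartment(name)
#endif

// The annotated prototype, as it would appear in a shared header.
__cheri_compartment("example_compartment") int increment();

// Hypothetical private state for this sketch.
static int counter = 0;

// The definition does not repeat the annotation.
int increment()
{
	counter++;
	return 0;
}

// Hypothetical helper so that a host-side test can observe the counter.
int read_counter()
{
	return counter;
}
```

On a host build the annotation disappears entirely and increment is an ordinary function, which makes it possible to unit-test compartment logic without the CHERIoT toolchain.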

Most of the time, you will not need to worry about the compiler flags directly. The xmake provided by CHERIoT RTOS will set the compiler flags for you automatically. Listing 8 shows the prototype of a trivial function that increments an integer that is private to a compartment.

/**
 * A function to increment a private variable inside a
 * compartment.
 */
__cheri_compartment("example_compartment") int increment();

Listing 8. Exporting a function for use by other compartments from a header. (examples/compartment_annotation/interface.h)

The body of this function is then shown in Listing 9. Note that the definition does not require the attribute: it is inherited from the prototype. If you forget to include the header, you will see a linker error about a missing symbol.

int increment()
{
	counter++;
	return 0;
}

Listing 9. The body of a function that is exposed for cross-compartment calls. (examples/compartment_annotation/compartment.cc)

The build system specifies the -cheri-compartment= flag based on the compartment target definition in the xmake.lua. Listing 10 shows this for the simple example compartment.

-- An example compartment that we can call
compartment("example_compartment")
	add_files("compartment.cc")

Listing 10. Build system code for defining a compartment. (examples/compartment_annotation/xmake.lua)

If you get the compartment name wrong, the compiler will generate an error. For example, if you change the compartment name in this example to "hello" in the source code, you will see the following:

error: entry.cc:15:35: error: CHERI compartment entry declared for compartment 'hello' but implemented in 'entry' (provided with -cheri-compartment=)
   15 | void __cheri_compartment("hello") entry()
      |                                   ^
1 error generated.

4.2. Passing callbacks to other compartments

The cheri_ccallback attribute specifies a function that can be used as an entry point by compartments that are passed a function pointer to it. This attribute must also be used on the type of function pointers that hold cross-compartment invocations. Any time the address of such a function is taken, the result will be a sealed capability that can be used to invoke the compartment and call this function.

The compiler does not know, when calling a callback, whether it points to the current compartment. As such, calling a CHERI callback function will always be a cross-compartment call, even if the target is in the current compartment.

This attribute can also be used via the __cheri_callback macro, which allows it to be defined away when targeting other platforms.

Listing 11 shows both how to declare a typedef for a function pointer type that can be used for cross-compartment callbacks and how to expose a function that takes one. This is a simple function that will increment a private counter and invoke the callback.


/**
 * A cross-compartment callback that takes an integer and
 * returns an integer.
 */
typedef __cheri_callback int (*Callback)(int);

/**
 * Example of a function that takes a cross-compartment
 * callback as an argument.
 */
__cheri_compartment("example_compartment") int monotonic(
  Callback);

Listing 11. Exposing a function that takes a cross-compartment callback for use by other compartments. (examples/compartment_annotation/interface.h)

The implementation of this function (Listing 12) calls the callback just as it would call any other function pointer. The difference is dealt with entirely by the compiler. For a normal call, the compiler will emit a simple jump-and-link to the address, whereas in this case it will invoke the switcher (see Section 2.2. Changing trust domain with the switcher) with the callback as an extra argument.

Every function that's exposed for cross-compartment invocation has an entry in the compartment's export table, containing the metadata that the switcher will use. Every function that is directly called by another compartment will then have an entry in the calling compartment's import table that the loader will initialise with a sealed capability to the export table entry. Callback functions work in a similar way, except that the import-table entry is for the compartment that exposes the callback.

When you take the address of a callback function, the compiler simply inserts a load of the import-table entry, giving exactly the same kind of sealed capability that you would use for direct cross-compartment calls. At the call site, the only difference between a direct cross-compartment call and a callback is that the former will contain the load from the import table, whereas the latter will simply move the callback into the register that is used to pass the callee to the switcher.

int monotonic(Callback callback)
{
	return callback(++counter);
}

Listing 12. The body of a function that invokes a cross-compartment callback. (examples/compartment_annotation/compartment.cc)

The callback is then declared just like any other function, but with the correct attribute, as shown in Listing 13.

The function attributes can be provided either before the start of the function or before the function name (after the return type). In some cases, the latter can avoid ambiguity (the attribute definitely applies to the function, not to the return type), but both are equivalent the rest of the time.

int __cheri_callback callback(int counter)
{
	printf("Counter value: %d\n", counter);
	return 0;
}

Listing 13. A function that can be invoked as a cross-compartment callback. (examples/compartment_annotation/entry.cc)

The callback function is passed just like any other function pointer, as shown in Listing 14. Note that the two ways of taking the address of a function in C/C++ (callback and &callback) are equivalent. Both work: some people prefer the former because it is more concise; others prefer the latter because it is a visual marker that a pointer is being constructed.

	increment();
	monotonic(callback);
	monotonic(&callback);

Listing 14. Passing a cross-compartment callback to another compartment. (examples/compartment_annotation/entry.cc)
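The pieces of the callback example can be exercised on a conventional host by defining both annotations away. The sketch below is an illustration under that assumption: it loses the cross-compartment behaviour entirely and only shows that the declarations compose and that both address-of spellings yield the same pointer. The record and last_seen functions are hypothetical stand-ins for the printing callback.

```cpp
#include <cassert>

// Assumption: on a non-CHERIoT host the RTOS headers are absent, so
// both annotations are defined away.
#ifndef __cheri_compartment
#	define __cheri_compartment(name)
#endif
#ifndef __cheri_callback
#	define __cheri_callback
#endif

// Callback type, as declared in the shared header.
typedef __cheri_callback int (*Callback)(int);

__cheri_compartment("example_compartment") int monotonic(Callback);

static int counter = 0;

// Increments the private counter and passes the new value on.
int monotonic(Callback callback)
{
	return callback(++counter);
}

// Hypothetical callback that records the value it was given.
static int lastSeen = 0;
int __cheri_callback record(int value)
{
	lastSeen = value;
	return 0;
}

// Hypothetical accessor for host-side tests.
int last_seen()
{
	return lastSeen;
}
```

Calling monotonic(record) and then monotonic(&record) delivers 1 and then 2 to the callback, because both expressions name the same function.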

When you run this example

4.3. Exposing library entry points

Libraries are discussed in detail in Chapter 5. Compartments and libraries. Like compartments, they can export functions, via a simple annotation. Unlike compartments, they are simply a mechanism for code sharing, not a security boundary. Libraries do not have mutable globals and each call to a library is assumed to have access to everything in the caller. Libraries are intended to provide almost the same abstraction as if you'd copied and pasted code into each compartment that calls them, though without the accompanying code duplication.

The cheri_libcall attribute specifies that this function is provided by a library (shared between compartments). This attribute is implicit for all compiler built-in functions, including memcpy and similar freestanding C environment functions. As with cheri_compartment(), this may be used on both definitions and declarations.

Unlike the compartment annotation, the library annotation does not specify the library that provides the function (though you can validate this later with the auditing tooling, as described in Chapter 10. Auditing firmware images). This allows library functions to be moved between libraries easily, a refactoring that does not affect most of the security model. For example, the RTOS used to provide a library that implemented all of the helpers for atomic operations. This was later split into separate libraries for different sized objects, allowing code to link only the atomic operations for types that it uses.

This attribute can also be used via the __cheri_libcall macro, which allows it to be defined away when targeting other platforms. This is how it is used in Listing 15, which declares a simple library function.

/**
 * A simple example library function.
 */
__cheri_libcall void library_function();

Listing 15. A declaration of a library function. (examples/library_annotation/interface.h)

As with the compartment annotations, these don't need to be placed on both the prototype and the definition. Listing 16 shows the definition, which omits the attribute.

void library_function()
{
	// Print the stack capability from within the library.
	Debug::log("Stack pointer: {}",
	           __builtin_cheri_stack_get());
}

Listing 16. A definition of a library function. (examples/library_annotation/library.cc)

Both the library function and the call site, shown in Listing 17, use the CHERIoT RTOS debugging APIs that are described in detail in Chapter 8. Features for debug builds. These allow you to, among other things, pretty-print capabilities. Both call a compiler builtin to get the capability to the stack and then print it.

/// Thread entry point.
void __cheri_compartment("entry") entry()
{
	// Print the current stack capability.
	Debug::log("Stack pointer: {}",
	           __builtin_cheri_stack_get());
	// Call the function exported from the library.
	library_function();
}

Listing 17. Calling a simple library function. (examples/library_annotation/entry.cc)

When you run this example, you should see the stack capability printed twice, once by the entry compartment and once by the library. The library is called from the compartment and so you should see the stack pointer move, but the bounds will remain the same. When you run it, you should see something like this:

Entry compartment: Stack pointer: 0x80000af0 (v:1 0x80000720-0x80000b20 l:0x400 o:0x0 p: - RWcgml -- ---)
Library: Stack pointer: 0x80000ad0 (v:1 0x80000720-0x80000b20 l:0x400 o:0x0 p: - RWcgml -- ---)

The bounds (0x80000720-0x80000b20) remain constant across the call. This means that malicious code in the library could inspect or modify everything on the caller's stack. In contrast, if you try the same thing in a compartment, you will see this stack truncated.

Try modifying this example to place the function in a compartment instead of a library. Don't forget to modify the xmake.lua file to change the library target to compartment.

4.4. Interrupt state control

The cheri_interrupt_state attribute (commonly used as a C++11 / C23 attribute spelled cheri::interrupt_state) is applied to functions and takes an argument that is one of the following:

enabled
Interrupts are enabled when calling this function.
disabled
Interrupts are disabled when calling this function.
inherit
The interrupt state is unchanged (inherited from the caller) when invoking the function.

For most functions, inherit is the default. For cross-compartment calls, enabled is the default and inherit is not permitted.

The compiler may not inline functions at call sites that would change the interrupt state and will always call them via a sentry capability set up by the loader. This makes it possible to statically reason about interrupt state in lexical scopes.

If a compartment is able to provide arbitrary interrupt-disabled functions, that compartment is in the TCB for availability. It is a good idea to move interrupts-disabled code into library functions where the contents of the library can be audited and the exact binary for the interrupt-disabled function can be part of a software bill of materials (SBOM), which can then allow you to reason about the whole system's availability guarantees.

If you need to wrap a few statements to run with interrupts disabled, you can use the convenience helper CHERI::with_interrupts_disabled. This is annotated with the attribute that disables interrupts and invokes the passed lambda. This maintains the structured-programming discipline for code running with interrupts disabled: it is coupled to a lexical scope.

Documentation for the with_interrupts_disabled function
template<typename T>
	[[cheri::interrupt_state(disabled)]] auto with_interrupts_disabled(T &&fn)
	{
		return fn();
	}

Invokes the passed callable object with interrupts disabled.
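To illustrate the lexical-scope discipline without CHERIoT hardware, the following host-side sketch models the interrupt state as a plain flag. It is not how the real helper works (the real one relies on the cheri::interrupt_state attribute); the sketch namespace, the interruptsEnabled flag, and the swap_in function are all hypothetical.

```cpp
#include <cassert>
#include <utility>

namespace sketch
{
	// Hypothetical stand-in for the hardware interrupt-enable state.
	static bool interruptsEnabled = true;

	// Host-side model only: clears the flag, runs the callable, and
	// restores the previous state when the scope ends.
	template<typename T>
	auto with_interrupts_disabled(T &&fn)
	{
		struct Restore
		{
			bool saved;
			~Restore()
			{
				interruptsEnabled = saved;
			}
		} restore{interruptsEnabled};
		interruptsEnabled = false;
		return std::forward<T>(fn)();
	}
} // namespace sketch

// Example use: replace a value and return the old one as one
// uninterrupted step.
int swap_in(int &slot, int value)
{
	return sketch::with_interrupts_disabled([&]() {
		assert(!sketch::interruptsEnabled);
		int old = slot;
		slot    = value;
		return old;
	});
}
```

The statements that run with interrupts disabled are exactly those inside the lambda, and the previous state is back in force as soon as the call returns. On CHERIoT, the same shape would be written with CHERI::with_interrupts_disabled.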

You need to be very careful using this attribute. Listing 18 shows a very simple example of how disabling interrupts can have adverse effects. The spin_for_ticks function in this example will simply spin for the requested number of ticks, reading the cycle counter until enough time has elapsed. This is called by a thread entry-point function that runs with low priority, with increasing tick counts.

The rdcycle64 function reads the cycle timer. The thread_sleep call is sleeping for a single scheduler tick. This function and the meaning of a scheduler tick are explained in more detail in Chapter 6. Communicating between threads. For now, assume that the thread is attempting to sleep for the number of cycles shown by the printf call at the start, outside of the loop.


/**
 * A function that runs with interrupts disabled and
 * consumes CPU for the requested number of ticks.
 */
[[cheri::interrupt_state(disabled)]] void
spin_for_ticks(uint32_t ticks)
{
	uint64_t end =
	  rdcycle64() + (uint64_t(ticks) * TIMERCYCLES_PER_TICK);
	while (rdcycle64() < end) {}
}

/// Low-priority thread entry point.
void __cheri_compartment("interrupts") low()
{
	int sleeps = 2;
	while (true)
	{
		printf("low-priority thread running\n");
		spin_for_ticks(sleeps++);
	}
}

Listing 18. A low-priority thread that uses an interrupts-disabled function to consume CPU. (examples/interrupts_disabled/interrupts.cc)

The other thread in this program is shown in Listing 19. This runs with high priority and so will always preempt the low-priority thread when it is able to, but disabling interrupts means that preemption is impossible. Timer interrupts do not fire and so the scheduler cannot interrupt the function.

/// High-priority thread entry point.
void __cheri_compartment("interrupts") high()
{
	printf("One tick is %d cycles\n", TIMERCYCLES_PER_TICK);
	while (true)
	{
		// Get the current cycle time
		uint64_t start = rdcycle64();
		// Sleep for one scheduler tick
		Timeout t{1};
		thread_sleep(&t);
		// Report how long the sleep was
		printf("Cycles elapsed with high-priority thread "
		       "yielding: %lld\n",
		       rdcycle64() - start);
	}
}

Listing 19. A high-priority thread that is starved by an interrupts-disabled function called from a low-priority thread. (examples/interrupts_disabled/interrupts.cc)

When you run this, you will see that the actual time spent sleeping increases each iteration:

One tick is 10000 cycles
low-priority thread running
Cycles elapsed with high-priority thread yielding: 23461
low-priority thread running
Cycles elapsed with high-priority thread yielding: 33450
low-priority thread running
Cycles elapsed with high-priority thread yielding: 43449
low-priority thread running
Cycles elapsed with high-priority thread yielding: 53448

The low-priority thread is allowed to start running when the high-priority thread yields but then prevents any other thread in the system from running. If you did anything like this in a realtime system, it would guarantee that you would miss your realtime deadlines.

The key problem here is that the interrupts-disabled function has an unbounded run time. It will consume the CPU for a data-dependent amount of time with no practical upper bound. When you are building realtime systems, even very soft realtime systems, you must ensure that the worst-case execution time for responding to events is bounded.
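One common way to restore a bound is to split data-dependent work into fixed-size pieces so that no single interrupts-disabled region handles more than a known maximum. The sketch below shows the idea on a host. MaxChunk, critical_section, and sum_in_bounded_chunks are hypothetical names, and critical_section merely counts entries where a real system would disable interrupts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical upper bound on the work done in one critical section.
constexpr std::size_t MaxChunk = 4;

// Number of critical sections entered, so a host test can observe them.
static std::size_t sectionsEntered = 0;

// Stand-in for an interrupts-disabled call; on a host it only counts
// how many times it was entered.
template<typename T>
auto critical_section(T &&fn)
{
	++sectionsEntered;
	return fn();
}

// Sums `length` values, never handling more than MaxChunk elements
// inside a single critical section.
long sum_in_bounded_chunks(const int *values, std::size_t length)
{
	long total = 0;
	for (std::size_t start = 0; start < length; start += MaxChunk)
	{
		std::size_t end = std::min(start + MaxChunk, length);
		total += critical_section([&]() {
			long partial = 0;
			for (std::size_t i = start; i < end; i++)
			{
				partial += values[i];
			}
			return partial;
		});
	}
	return total;
}
```

The worst-case time spent in any one critical section now depends on MaxChunk rather than on the caller's data, and other threads get a chance to run between chunks. This only helps when the work does not need to be atomic as a whole.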

4.5. Importing MMIO access

The MMIO_CAPABILITY({type}, {name}) macro is used to access memory-mapped I/O devices. These are specified in the board definition file by the build system. The DEVICE_EXISTS({name}) macro can be used to detect whether the current target provides a device with the specified name.

The type parameter is the type used to represent the MMIO region. The macro evaluates to a volatile {type} *, so MMIO_CAPABILITY(struct UART, uart) will provide a volatile struct UART * pointing (and bounded) to the device that the board definition exposes as uart.
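The usage pattern can be sketched as follows. So that the fragment compiles on a conventional host, it supplies stand-in definitions of both macros that point at an ordinary global; those stand-ins, the two-register UART layout, and the put_char function are assumptions for illustration and do not describe any real board.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical register layout; a real board's UART will differ.
struct UART
{
	uint32_t data;
	uint32_t status;
};

// Assumption: on a host build the RTOS macros are absent, so these
// stand-ins refer to an ordinary global instead of a device.
#ifndef MMIO_CAPABILITY
static UART hostFake_uart;
#	define DEVICE_EXISTS(name) 1
#	define MMIO_CAPABILITY(type, name)                                \
		((volatile type *)&hostFake_##name)
#endif

// Writes one byte if the target exposes a device named `uart`.
bool put_char(char c)
{
#if DEVICE_EXISTS(uart)
	volatile struct UART *uart = MMIO_CAPABILITY(struct UART, uart);
	uart->data = static_cast<uint32_t>(static_cast<unsigned char>(c));
	return true;
#else
	return false;
#endif
}
```

Guarding the access with DEVICE_EXISTS lets the same source build for boards that lack the device, falling back to the branch that reports failure.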

4.6. Manipulating capabilities with C builtins

The compiler provides a set of built-in functions for manipulating capabilities. These are typically of the form __builtin_cheri_{noun}_{verb}. You can read all of the fields of a CHERI capability with get as the verb and the following nouns:

address
The current address that's used when the capability is used as a pointer.
base
The lowest address that this authorises access to.
top
The address immediately after the end of the range that this authorises access to.
length
The distance between the base and the top.
perms
The architectural permissions that this capability holds.
sealed
Is this a sealed capability?
tag
Is this a valid capability?
type
The type of this capability (zero means unsealed).

The verbs vary because they express the guarded manipulation guarantees for CHERI capabilities. You can't, for example, arbitrarily set the permissions on a capability, you can only remove permissions. Capabilities can be modified with the nouns and verbs listed in Table 3. CHERI capability manipulation builtin functions.

Noun     Modification verb  Operation
address  set                Set the address for the capability.
bounds   set                Sets the base at or below the current address and the length at or above the requested length, as closely as possible to give a valid capability.
bounds   set_exact          Sets the base to the current address and the length to the requested length or returns an untagged capability if the result is not representable.
perms    and                Clears all permissions except those provided as the argument.
tag      clear              Invalidates the capability but preserves all other fields.
Table 3. CHERI capability manipulation builtin functions
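The guarded-manipulation idea can be illustrated with a small software model. This is a deliberately simplified sketch that runs on any host: ModelCap, its field layout, and the two permission constants are hypothetical and ignore the compressed encoding. The point is only that every operation either preserves or reduces the rights of its input.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical permission bits for the model (not the hardware values).
constexpr uint32_t PermLoad  = 1u << 0;
constexpr uint32_t PermStore = 1u << 1;

// Simplified software model of a capability.
struct ModelCap
{
	uint32_t address;
	uint32_t base;
	uint32_t top;
	uint32_t perms;
	bool     tag;
};

// Model of `perms and`: the result keeps only permissions present in
// both the capability and the mask, so permissions can never be added.
ModelCap perms_and(ModelCap c, uint32_t mask)
{
	c.perms &= mask;
	return c;
}

// Model of `tag clear`: invalidates but keeps the other fields.
ModelCap tag_clear(ModelCap c)
{
	c.tag = false;
	return c;
}

// Model of narrowing bounds to start at the current address. In this
// model, a request that would exceed the existing bounds yields an
// untagged result instead of widening access.
ModelCap bounds_set(ModelCap c, uint32_t length)
{
	uint64_t newTop = uint64_t(c.address) + length;
	if (c.address < c.base || newTop > c.top)
	{
		c.tag = false;
	}
	else
	{
		c.base = c.address;
		c.top  = uint32_t(newTop);
	}
	return c;
}
```

Applying perms_and twice with a wider mask the second time does not restore the store permission, which is the monotonicity property that the verbs in the table express.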

Setting the object type is more complex. This is done with __builtin_cheri_seal, which takes an authorising capability (something with the permit-seal permission) as the second argument and sets the object type of the result to the address of the sealing capability. Conversely, __builtin_cheri_unseal uses a capability with the permit-unseal permission and an address matching the object type to restore the original unsealed value.

Most of the time, C code will avoid using the builtins directly and instead use the wrappers defined in cheri-builtins.h. This file contains a set of macros that wrap the builtins to remove the __builtin_ prefix.

Although most of the macros in cheri-builtins.h match the names of the underlying builtins, the permissions ones follow the CHERIoT RTOS coding convention of avoiding abbreviations and so use permissions instead of perms. The predicates prefix the operation with is_, so __builtin_cheri_equal_exact becomes cheri_is_equal_exact.

You can see how to use most of the introspection builtins via their macro wrappers in Listing 20. This prints a capability, showing its address, tag (valid) bit, length, bounds, object type, and permissions. The permissions are expanded as the letters from the tables in Section 1.2. Decomposing permissions in CHERIoT. The builtins are thin wrappers around the instructions, which represent the permissions as a bitmask. Individual bits must be extracted with a bitwise AND operation.

void print_capability(void *ptr)
{
	unsigned permissions = cheri_permissions_get(ptr);
	printf(
	  "0x%x (valid:%d length: 0x%x 0x%x-0x%x otype:%d "
	  "permissions: %c "
	  "%c%c%c%c%c%c %c%c %c)\n",
	  cheri_address_get(ptr),
	  cheri_tag_get(ptr),
	  cheri_length_get(ptr),
	  cheri_base_get(ptr),
	  cheri_top_get(ptr),
	  cheri_type_get(ptr),
	  (permissions & CHERI_PERM_GLOBAL) ? 'G' : '-',
	  (permissions & CHERI_PERM_LOAD) ? 'R' : '-',
	  (permissions & CHERI_PERM_STORE) ? 'W' : '-',
	  (permissions & CHERI_PERM_LOAD_STORE_CAP) ? 'c' : '-',
	  (permissions & CHERI_PERM_LOAD_GLOBAL) ? 'g' : '-',
	  (permissions & CHERI_PERM_LOAD_MUTABLE) ? 'm' : '-',
	  (permissions & CHERI_PERM_STORE_LOCAL) ? 'l' : '-',
	  (permissions & CHERI_PERM_SEAL) ? 'S' : '-',
	  (permissions & CHERI_PERM_UNSEAL) ? 'U' : '-',
	  (permissions & CHERI_PERM_USER0) ? '0' : '-');
}

Listing 20. Pretty printing a capability using the C builtin wrappers. (examples/manipulate_capabilities_c/example.c)

Listing 21 uses this function to print some initial capabilities to stack and heap memory and then manipulates the heap capability. First, it explicitly sets the bounds of the heap capability to 23 bytes, then removes all permissions except load.

	// A stack allocation
	char stackBuffer[23];
	print_capability(stackBuffer);
	// A heap allocation
	char *heapBuffer = malloc(23);
	print_capability(heapBuffer);
	// Setting the bounds of a heap capability
	char *bounded = cheri_bounds_set(heapBuffer, 23);
	print_capability(bounded);
	// Removing permissions from a heap capability
	bounded = cheri_permissions_and(bounded, CHERI_PERM_LOAD);
	print_capability(bounded);
	print_capability(heapBuffer);

Listing 21. Manipulating capabilities using the C builtin wrappers. (examples/manipulate_capabilities_c/example.c)

When you run this example, you should see something like this (the exact addresses may vary):

0x80000ae1 (valid:1 length: 0x17 0x80000ae1-0x80000af8 otype:0 permissions: - RWcgml -- -)
0x80006710 (valid:1 length: 0x18 0x80006710-0x80006728 otype:0 permissions: G RWcgm- -- -)
0x80006710 (valid:1 length: 0x17 0x80006710-0x80006727 otype:0 permissions: G RWcgm- -- -)
0x80006710 (valid:1 length: 0x17 0x80006710-0x80006727 otype:0 permissions: - R----- -- -)
0x80006710 (valid:1 length: 0x18 0x80006710-0x80006728 otype:0 permissions: G RWcgm- -- -)

First, note the difference between the permissions on the stack and heap allocation. The heap allocation has global permission: it may be stored anywhere. The stack allocation lacks global, but has store-local permission, which allows it to be used to store other capabilities that lack the global permission. This ensures that stack pointers can be stored only on the stack.

The bounds on the original heap allocation are rounded up to a multiple of the heap's allocation granule size. The CHERIoT allocator allocates 8-byte chunks, so the 23-byte request is rounded up to 24 (0x18) bytes. For a capability this small, CHERIoT can precisely represent the desired size and so the bounds-setting operation succeeds and you can derive a capability with exactly the bounds that were requested. Next, the example removes all permissions except load. This pointer now provides a read-only view of the data, which cannot be stored anywhere except on the stack and which cannot be used to load capabilities.
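The granule rounding described above is ordinary round-up-to-a-multiple arithmetic, which can be checked with a short sketch. The function name is hypothetical, and the real allocator may apply further constraints that are not modelled here.

```cpp
#include <cassert>
#include <cstddef>

// Granule size stated in the text above.
constexpr std::size_t AllocationGranule = 8;

// Rounds a requested size up to the next multiple of the granule.
// This bit trick relies on the granule being a power of two.
constexpr std::size_t round_to_granule(std::size_t size)
{
	return (size + AllocationGranule - 1) & ~(AllocationGranule - 1);
}

static_assert(round_to_granule(23) == 24,
              "23 bytes occupy three 8-byte granules");
static_assert(round_to_granule(24) == 24,
              "exact multiples are unchanged");
```

This matches the output above: the 23-byte request yields a heap capability with a length of 0x18.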

Finally, this example prints the heap allocation again to remind you that these permissions and bounds apply to the pointer and not to the object. We have not removed permissions from an object, we have created a pointer that has fewer permissions to that object. There is no limit to the number of pointers that can exist to a single object.

4.7. Comparing capabilities with C builtins

By default, the C/C++ == operator on capabilities compares only the address.

This is subject to change in a future revision of CHERI C. It makes porting some existing code easier, but breaks the substitution principle (if a == b, you would expect to be able to use b or a interchangeably).

You can compare capabilities for exact equality with __builtin_cheri_equal_exact. This returns true if the two capabilities that are passed to it are identical, false otherwise. Exact equality means that the address, bounds, permissions, object type, and tag are all identical. It is, effectively, a bitwise comparison of all of the bits in the two capabilities, including the tag bits.

Ordered comparisons, using operators such as less-than or greater-than, always operate on the address. There is no total ordering over capabilities. Two capabilities with different bounds or different permissions but the same address will return false when compared with either < or >.

This is fine according to a strict representation of the C abstract machine because comparing two pointers to different objects is undefined behaviour. It can be confusing but, unfortunately, there is no good alternative. Comparison of pointers is commonly used for keying in collections. For example, the C++ std::map class uses the ordered comparison operators for building a tree and relies on it working correctly for keys that are pointers. Ideally, these would explicitly operate over the address, but that would require invasive modifications when porting to CHERI platforms.

In general, in new code, you should avoid comparing pointers for anything other than exact equality, unless you are certain that they have the same base and bounds. Instead, be explicit about exactly what you are testing. Do you care if the permissions are different? Do you care about the bounds? Do you care if the value is tagged? Or do you care only about the address? In each case, you should explicitly compare the components of the capability that you care about.
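The difference between these questions is easier to see with named predicates. The sketch below uses a hypothetical software model of a capability so that it runs on any host. On CHERIoT, each field read would instead go through the matching wrapper (cheri_address_get, cheri_base_get, cheri_top_get, cheri_permissions_get, cheri_type_get, cheri_tag_get), and the last predicate corresponds to cheri_is_equal_exact.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical software model of the fields that a comparison may
// care about; this is not the hardware representation.
struct ModelCap
{
	uint32_t address;
	uint32_t base;
	uint32_t top;
	uint32_t perms;
	uint32_t type;
	bool     tag;
};

// "Do these point at the same place?" (what == tests by default).
bool same_address(const ModelCap &a, const ModelCap &b)
{
	return a.address == b.address;
}

// "Do these authorise access to the same range?"
bool same_bounds(const ModelCap &a, const ModelCap &b)
{
	return a.base == b.base && a.top == b.top;
}

// "Are these interchangeable?" Every field, including the tag.
bool exactly_equal(const ModelCap &a, const ModelCap &b)
{
	return same_address(a, b) && same_bounds(a, b) &&
	       a.perms == b.perms && a.type == b.type && a.tag == b.tag;
}
```

With the heap capability from Listing 21 and its 23-byte derivative, same_address holds but same_bounds and exactly_equal do not, which is the case where a plain == comparison would be misleading.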

You can also compare capabilities for subset relationships with __builtin_cheri_subset_test. This returns true if the second argument is a subset of the first. A capability is a subset of another if every right that it conveys is held by the other. This means that the bounds of the subset capability must lie within the bounds of the superset and all permissions held by the subset must also be held by the superset.
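The subset rule can be written down directly as a predicate over bounds and permissions. The following host-side sketch models only what the paragraph above describes; ModelRights and is_subset are hypothetical names, and any additional checks that the hardware instruction performs are not modelled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the rights conveyed by a capability.
struct ModelRights
{
	uint32_t base;
	uint32_t top;
	uint32_t perms;
};

// Returns true if `subset` conveys no right that `superset` lacks.
// The argument order mirrors the builtin: superset first.
bool is_subset(const ModelRights &superset, const ModelRights &subset)
{
	bool boundsWithin =
	  subset.base >= superset.base && subset.top <= superset.top;
	bool noExtraPermissions = (subset.perms & ~superset.perms) == 0;
	return boundsWithin && noExtraPermissions;
}
```

A capability is always a subset of itself under this definition, and the relationship is not symmetric: narrowing bounds or dropping permissions produces a subset, never a superset.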

4.8. Sizing allocations

CHERI capabilities cannot represent arbitrary bases and bounds. The larger the bounds, the more strongly aligned the base and bounds must be.

NOTE: The current CHERIoT encoding gives byte-granularity bounds for objects up to 511 bytes, then requires one more bit of alignment for each additional bit needed to represent the size, up to 8 MiB. Capabilities larger than 8 MiB cover the entire address space. This is ample for small embedded systems where most compartments or heap objects are expected to be under tens of KiBs. Other CHERI systems make different trade-offs.

Calculating the length can be non-trivial and can vary across CHERI systems. The compiler provides two builtins that help.

The first, __builtin_cheri_round_representable_length, returns the smallest length that is larger than (or equal to) the requested length and can be accurately represented. The compressed bounds encoding requires both the top and base to be aligned on the same amount and so there's a corresponding mask that needs to be used for alignment. The __builtin_cheri_representable_alignment_mask builtin returns the mask that can be applied to the base and top addresses to align them.
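To make the relationship between the two results concrete, here is a simplified host-side model derived only from the NOTE above. It assumes a 9-bit size field (values up to 511) scaled by a power of two with an exponent of at most 14. The exponent limit, the names, and the search loop are assumptions for illustration and are not the hardware algorithm, so real code should always call the builtins.

```cpp
#include <cassert>
#include <cstdint>

// Result pair: the rounded length and the mask for aligning the base.
struct Representable
{
	uint32_t length;
	uint32_t alignmentMask;
};

// Assumptions taken from the NOTE: sizes up to 511 are exact, and
// each extra size bit costs one bit of alignment.
constexpr uint32_t MaxMantissa = 511;
constexpr unsigned MaxExponent = 14;

// Finds the smallest alignment at which the rounded-up length fits in
// the modelled size field. Lengths beyond the model return {0, 0}.
Representable round_representable(uint32_t length)
{
	if (length > (MaxMantissa << MaxExponent))
	{
		return {0, 0};
	}
	for (unsigned e = 0; e <= MaxExponent; e++)
	{
		uint32_t granule = uint32_t(1) << e;
		uint32_t rounded = (length + granule - 1) & ~(granule - 1);
		if ((rounded >> e) <= MaxMantissa)
		{
			return {rounded, ~(granule - 1)};
		}
	}
	return {0, 0};
}
```

In this model a 23-byte request is unchanged with no alignment requirement, whereas a 513-byte request becomes 514 bytes and its base must be 2-byte aligned. Applying the mask to a candidate base address with a bitwise AND rounds it down to a suitably aligned value.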

4.9. Manipulating capabilities with CHERI::Capability

The raw C builtins can be somewhat verbose. CHERIoT RTOS provides a CHERI::Capability class in cheri.hh to simplify inspecting and manipulating CHERI capabilities.

This class provides methods that are modelled to allow you to pretend that they give direct access to the fields of the capability. For example, you can write:

capability.address() += 4;
capability.permissions() &= permissionSet;

This modifies the address of capability, increasing it by four, and removes all permissions not present in permissionSet. Other operations are also defined to be orthogonal.
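One way such field-like accessors can be built is with small proxy objects that forward each permitted operation to the underlying value. The sketch below is a host-side illustration of that general technique, not the implementation in cheri.hh: ModelCapability and both proxy classes are hypothetical, and permissions are modelled as a plain bitmask rather than a PermissionSet.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical host-side model; not the CHERI::Capability class.
class ModelCapability
{
	uint32_t addressValue;
	uint32_t permissionBits;

	public:
	ModelCapability(uint32_t address, uint32_t permissions)
	  : addressValue(address), permissionBits(permissions)
	{
	}

	// Proxy for the address: readable and freely adjustable.
	class AddressProxy
	{
		ModelCapability &cap;

		public:
		explicit AddressProxy(ModelCapability &c) : cap(c) {}
		operator uint32_t() const
		{
			return cap.addressValue;
		}
		AddressProxy &operator+=(uint32_t delta)
		{
			cap.addressValue += delta;
			return *this;
		}
	};

	// Proxy for permissions: readable, but only &= is provided, so
	// code cannot even express adding a permission.
	class PermissionsProxy
	{
		ModelCapability &cap;

		public:
		explicit PermissionsProxy(ModelCapability &c) : cap(c) {}
		operator uint32_t() const
		{
			return cap.permissionBits;
		}
		PermissionsProxy &operator&=(uint32_t mask)
		{
			cap.permissionBits &= mask;
			return *this;
		}
	};

	AddressProxy address()
	{
		return AddressProxy{*this};
	}
	PermissionsProxy permissions()
	{
		return PermissionsProxy{*this};
	}
};
```

With this shape, c.address() += 4 and c.permissions() &= mask both compile and update the object, while c.permissions() |= mask is rejected at compile time because the proxy defines no such operator.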

Permissions are exposed as a PermissionSet object. This is a constexpr class that provides a rich set of operations on permissions. This can be used as a template parameter and can be used in static assertions for compile-time validation of derivation chains. The loader makes extensive use of this class to ensure correctness.

The equality comparison for CHERI::Capability uses exact comparison, unlike raw C/C++ pointer comparison. This is less confusing for new code (it respects the substitution principle) but users may be confused that a == b is true but Capability{a} == Capability{b} is false.

See cheri.hh for more details and for other convenience wrappers around the compiler builtins.