4. C/C++ extensions for CHERIoT

The CHERIoT platform adds a small number of C/C++ annotations to support the compartment model.

4.1. Exposing compartment entry points

Compartments are discussed in detail in Chapter 5. Compartments and libraries. A compartment can expose functions as entry points via a simple attribute.

The cheri_compartment({name}) attribute specifies the name of the compartment that defines a function. This is used in concert with the -cheri-compartment= compiler flag. Together, they tell the compiler whether a particular function (which may be in another compilation unit) is defined in the same compartment as the current compilation unit. The compiler can then emit direct calls for functions in the same compartment and cross-compartment calls in all other cases.

This can be used on either definitions or declarations but is most commonly used on declarations.

If a function is defined while compiling a compilation unit belonging to a different compartment, the compiler will raise an error. In CHERIoT RTOS, this attribute is always used via the __cheriot_compartment({name}) macro. This makes it possible to simply use #define __cheriot_compartment(x) when compiling for other platforms.

Most of the time, you will not need to worry about the compiler flags directly. The xmake build rules provided by CHERIoT RTOS will set the compiler flags for you automatically. Listing 8 shows the prototype of a trivial function that increments an integer that is private to a compartment.

/**
 * A function to increment a private variable inside a
 * compartment.
 */
__cheriot_compartment(
  "example_compartment") int increment();

Listing 8. Exporting a function for use by other compartments from a header. examples/compartment_annotation/interface.h

The body of this function is then shown in Listing 9. Note that the definition does not require the attribute: it is inherited from the prototype. If you forget to include the header, you will see a linker error about a missing symbol.

int increment()
{
	counter++;
	return 0;
}

Listing 9. The body of a function that is exposed for cross-compartment calls. examples/compartment_annotation/compartment.cc

The build system specifies the -cheri-compartment= flag based on the compartment target definition in the xmake.lua. Listing 10 shows this for the simple example compartment.

-- An example compartment that we can call
compartment("example_compartment")
	add_files("compartment.cc")

Listing 10. Build system code for defining a compartment. examples/compartment_annotation/xmake.lua

If you get the compartment name wrong, the compiler will generate an error. For example, if you change the compartment name in Listing 8 to "wrong_compartment", you will see the following error when compiling compartment.cc, which contains the definition of this function:

error: compartment.cc:21:5: error: CHERI compartment entry declared for compartment 'wrong_compartment' but implemented in 'example_compartment' (provided with -cheri-compartment=)
   21 | int monotonic(Callback callback)
      |     ^

4.2. Passing callbacks to other compartments

The cheri_callback attribute specifies a function that can be used as an entry point by compartments that are passed a function pointer to it. This attribute must also be used on the type of function pointers that hold cross-compartment invocations. Any time the address of such a function is taken, the result will be a sealed capability that can be used to invoke the compartment and call this function.

The compiler does not know, when calling a callback, whether it points to the current compartment. As such, calling a CHERI callback function will always be a cross-compartment call, even if the target is in the current compartment.

This attribute can also be used via the __cheriot_callback macro, which allows it to be defined away when targeting other platforms.

Listing 11 shows both how to declare a typedef for a function pointer type that can be used for cross-compartment callbacks and how to expose a function that takes one. This is a simple function that will increment a private counter and invoke the callback.

/**
 * A cross-compartment callback that takes an integer and
 * returns an integer.
 */
typedef __cheriot_callback int (*Callback)(int);

/**
 * Example of a function that takes a cross-compartment
 * callback as an argument.
 */
__cheriot_compartment("example_compartment") int monotonic(
  Callback);

Listing 11. Exposing a function that takes a cross-compartment callback for use by other compartments. examples/compartment_annotation/interface.h

The implementation of this function (Listing 12) calls the callback just as it would call any other function pointer. The difference is dealt with entirely by the compiler. For a normal call, the compiler will emit a simple jump-and-link to the address, whereas in this case it will invoke the switcher (see Section 2.2. Changing trust domain with the switcher) with the callback as an extra argument.

Every function that's exposed for cross-compartment invocation has an entry in the compartment's export table, containing the metadata that the switcher will use. Every function that is directly called by another compartment will then have an entry in the calling compartment's import table that the loader will initialise with a sealed capability to the export table entry. Callback functions work in a similar way, except that the import table entry belongs to the compartment that exposes the callback, because that is where the address of the function is taken.

When you take the address of a callback function, the compiler simply inserts a load of the import table entry, giving exactly the same kind of sealed capability that you would use for direct cross-compartment calls. At the call site, the only difference between a direct cross-compartment call and a callback is that the former will contain the load from the import table, whereas the latter will simply move the callback into the register that is used to pass the callee to the switcher.

int monotonic(Callback callback)
{
	return callback(++counter);
}

Listing 12. The body of a function that invokes a cross-compartment callback. examples/compartment_annotation/compartment.cc

The callback is then declared just like any other function, but with the correct attribute, as shown in Listing 13.

The function attributes can be provided either before the start of the function or before the function name (after the return type). In some cases, the latter can avoid ambiguity (the attribute definitely applies to the function, not to the return type), but both are equivalent the rest of the time.

int __cheriot_callback callback(int counter)
{
	printf("Counter value: %d\n", counter);
	return 0;
}

Listing 13. A function that can be invoked as a cross-compartment callback. examples/compartment_annotation/entry.cc

The callback function is passed just like any other function pointer, as shown in Listing 14. Note that the two ways of taking the address of a function in C/C++ (callback and &callback) are equivalent. Both work; some people prefer the former because it is more concise, others prefer the latter because it is a visual marker that a pointer is being constructed.

	increment();
	monotonic(callback);
	monotonic(&callback);

Listing 14. Passing a cross-compartment callback to a function in another compartment. examples/compartment_annotation/entry.cc

When you run this example, you should see:

Counter value: 2
Counter value: 3

The callback is invoked in the compartment that implements it and has access to the copy of the counter (passed by value) that the caller provides, but it cannot modify the counter.
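Because both annotations can be defined away, the same code can also be built as an ordinary program on a non-CHERIoT host. The following is a minimal sketch under that assumption. The #ifndef guards, the seen array, and the run_example helper are illustrative additions that are not part of the example sources, and every call here is a plain function call rather than a compartment switch.

```cpp
#include <cassert>
#include <cstdio>

// Assumption: on a non-CHERIoT host the RTOS headers are absent, so both
// annotations are defined away, as the text suggests for other platforms.
#ifndef __cheriot_compartment
#	define __cheriot_compartment(x)
#endif
#ifndef __cheriot_callback
#	define __cheriot_callback
#endif

typedef __cheriot_callback int (*Callback)(int);

__cheriot_compartment("example_compartment") int increment();
__cheriot_compartment("example_compartment") int monotonic(Callback);

// State that would be private to example_compartment on CHERIoT.
static int counter;

int increment()
{
	counter++;
	return 0;
}

int monotonic(Callback callback)
{
	// The callback receives a copy of the counter, never a pointer to it.
	return callback(++counter);
}

// Hypothetical bookkeeping so that the observed values can be checked.
static int seen[2];
static int seenCount;

int __cheriot_callback callback(int value)
{
	printf("Counter value: %d\n", value);
	assert(seenCount < 2);
	seen[seenCount++] = value;
	return 0;
}

// Mirrors the three calls in Listing 14.
void run_example()
{
	increment();
	monotonic(callback);
	monotonic(&callback);
}
```

On a host build this only checks the logic: the isolation between the caller and the callback exists only when the annotations expand to the real attributes.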

4.3. Exposing library entry points

Libraries are discussed in detail in Chapter 5. Compartments and libraries. Like compartments, they can export functions via a simple annotation. Unlike compartments, they are simply a mechanism for code sharing, not a security boundary. Libraries do not have mutable globals and each call to a library is assumed to have access to everything in the caller. Libraries are intended to provide almost the same abstraction as if you'd copied and pasted code into each compartment that calls them, though without the accompanying code duplication.

The cheri_libcall attribute specifies that this function is provided by a library (shared between compartments). This attribute is implicit for all compiler built-in functions, including memcpy and similar freestanding C environment functions. As with cheri_compartment(), this may be used on both definitions and declarations.

Unlike the compartment annotation, the library annotation does not specify the library that provides the function (though you can validate this later with the auditing tools, as described in Chapter 10. Auditing firmware images). This allows library functions to be moved between libraries easily, a refactoring that does not affect most of the security model. For example, the RTOS used to provide a library that implemented all of the helpers for atomic operations. This was later split into separate libraries for different sized objects, allowing code to link only the atomic operations for types that it uses.

This attribute can also be used via the __cheriot_libcall macro, which allows it to be defined away when targeting other platforms. This is how it is used in Listing 15, which declares a simple library function.

/**
 * A simple example library function.
 */
__cheriot_libcall void library_function();

Listing 15. A declaration of a library function. examples/library_annotation/interface.h

As with the compartment annotations, these don't need to be placed on both the prototype and the definition. Listing 16 shows the definition, which omits the attribute.

void library_function()
{
	// Print the stack capability from within the library.
	Debug::log("Stack pointer: {}",
	           __builtin_cheri_stack_get());
}

Listing 16. A definition of a library function. examples/library_annotation/library.cc

Both the library function and the call site, shown in Listing 17, use the CHERIoT RTOS debugging APIs that are described in detail in Chapter 8. Features for debug builds. Among other things, these allow you to pretty-print capabilities. Both functions use a compiler builtin to get the stack capability and then print it.

/// Thread entry point.
void __cheriot_compartment("entry") entry()
{
	// Print the current stack capability.
	Debug::log("Stack pointer: {}",
	           __builtin_cheri_stack_get());
	// Call the function exported from the library.
	library_function();
}

Listing 17. Calling a simple library function. examples/library_annotation/entry.cc

When you run this example, you should see the stack capability printed twice, once by the entry compartment and once by the library. The library is called from the compartment so you should see the stack pointer move, but the bounds will remain the same. When you run it, you should see something like this:

Entry compartment: Stack pointer: 0x80000af0 (v:1 0x80000720-0x80000b20 l:0x400 o:0x0 p: - RWcgml -- ---)
Library: Stack pointer: 0x80000ad0 (v:1 0x80000720-0x80000b20 l:0x400 o:0x0 p: - RWcgml -- ---)

The bounds (0x80000720-0x80000b20) remain constant across the call. This means that malicious code in the library could inspect or modify everything on the caller's stack. In contrast, if you try the same thing in a compartment, you will see this stack truncated.

Try modifying this example to place the function in a compartment instead of a library. Don't forget to modify the xmake.lua file to change the library target to compartment.

4.4. Interrupt state control

The cheriot_interrupt_state attribute (commonly used as the C++11 / C23 attribute cheriot::interrupt_state) is applied to functions and takes an argument that is one of the following:

enabled: Interrupts are enabled when calling this function.
disabled: Interrupts are disabled when calling this function.
inherit: The interrupt state is unchanged (inherited from the caller) when invoking this function.

For most functions, inherit is the default. For cross-compartment calls, enabled is the default and inherit is not permitted.

The compiler will not inline functions at call sites where doing so would change the interrupt state, and it will always call them via a sentry capability set up by the loader. This makes it possible to statically reason about interrupt state in lexical scopes.

If a compartment is able to provide arbitrary interrupt-disabled functions, that compartment is in the trusted computing base (TCB) for availability. It is a good idea to move interrupt-disabled code into library functions where the contents of the library can be audited and the exact binary for the interrupt-disabled function can be part of a software bill of materials (SBOM), which can then allow you to reason about the whole system's availability guarantees.

If you need to wrap a few statements to run with interrupts disabled, you can use the convenience helper CHERI::with_interrupts_disabled. This is annotated with the attribute that disables interrupts and invokes the passed lambda. This maintains the structured-programming discipline for code running with interrupts disabled: it is coupled to a lexical scope.

Documentation for the with_interrupts_disabled function

template<typename T>
	[[cheriot::interrupt_state(disabled)]] auto with_interrupts_disabled(T &&fn)

Invokes the passed callable object with interrupts disabled.
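The lexical-scope coupling can be illustrated with a host-side model. This is only a sketch: the model namespace, the interruptsEnabled flag, and take_pending_events are hypothetical names, and on CHERIoT the interrupt state is changed by the call through the annotated function rather than by a variable. The shape of the call site is the part intended to carry over.

```cpp
#include <cassert>
#include <utility>

namespace model
{
	// Stand-in for the core's interrupt-enable state (hypothetical).
	static bool interruptsEnabled = true;

	// Host-side model with the same shape as the documented helper:
	// it takes a callable, runs it, and returns whatever it returns.
	template<typename T>
	auto with_interrupts_disabled(T &&fn)
	{
		// Restore the previous state when this scope ends, however
		// the callable returns.
		struct Restore
		{
			bool saved;
			~Restore()
			{
				interruptsEnabled = saved;
			}
		} restore{interruptsEnabled};
		interruptsEnabled = false;
		return std::forward<T>(fn)();
	}
} // namespace model

// Hypothetical state shared with an interrupt-driven producer.
static int pendingEvents = 3;

// Reads and clears the shared state as one uninterruptible step.
int take_pending_events()
{
	return model::with_interrupts_disabled([]() {
		assert(!model::interruptsEnabled);
		int taken     = pendingEvents;
		pendingEvents = 0;
		return taken;
	});
}
```

The body of the lambda is short and has a fixed amount of work, which is the property that makes an interrupts-disabled region safe to reason about.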

You need to be very careful using this attribute. Listing 18 shows a very simple example of how disabling interrupts can have adverse effects. The spin_for_ticks function in this example will simply spin for the requested number of ticks, reading the cycle counter until enough time has elapsed. This is called by a thread entry-point function that runs with low priority, with increasing tick counts.

The rdcycle64 function reads the cycle timer. The thread_sleep call is sleeping for a single scheduler tick. This function and the meaning of a scheduler tick are explained in more detail in Chapter 6. Communicating between threads. For now, assume that the thread is attempting to sleep for the number of cycles shown by the printf call at the start, outside of the loop.


/**
 * A function that runs with interrupts disabled and
 * consumes CPU for the requested number of ticks.
 */
[[cheriot::interrupt_state(disabled)]] void
spin_for_ticks(uint32_t ticks)
{
	uint64_t end =
	  rdcycle64() + (uint64_t(ticks) * TIMERCYCLES_PER_TICK);
	while (rdcycle64() < end) {}
}

/// Low-priority thread entry point.
void __cheriot_compartment("interrupts") low()
{
	int sleeps = 2;
	while (true)
	{
		printf("low-priority thread running\n");
		spin_for_ticks(sleeps++);
	}
}

Listing 18. A low-priority thread that uses an interrupts-disabled function to consume CPU. examples/interrupts_disabled/interrupts.cc

The other thread in this program is shown in Listing 19. This runs with high priority and so will always preempt the low-priority thread when it is able to, but disabling interrupts means that preemption is impossible. Timer interrupts do not fire and so the scheduler cannot interrupt the function.

/// High-priority thread entry point.
void __cheriot_compartment("interrupts") high()
{
	printf("One tick is %d cycles\n", TIMERCYCLES_PER_TICK);
	while (true)
	{
		// Get the current cycle time
		uint64_t start = rdcycle64();
		// Sleep for one scheduler tick
		Timeout t{1};
		thread_sleep(&t);
		// Report how long the sleep was
		printf("Cycles elapsed with high-priority thread "
		       "yielding: %lld\n",
		       rdcycle64() - start);
	}
}

Listing 19. A high-priority thread that is starved by an interrupts-disabled function called from a low-priority thread. examples/interrupts_disabled/interrupts.cc

When you run this, you will see that the actual time spent sleeping increases each iteration:

One tick is 10000 cycles
low-priority thread running
Cycles elapsed with high-priority thread yielding: 23461
low-priority thread running
Cycles elapsed with high-priority thread yielding: 33450
low-priority thread running
Cycles elapsed with high-priority thread yielding: 43449
low-priority thread running
Cycles elapsed with high-priority thread yielding: 53448

The low-priority thread is allowed to start running when the high-priority thread yields but then prevents any other thread in the system from running. If you did anything like this in a realtime system, you would be guaranteed to miss your realtime deadlines.

The key problem here is that the interrupts-disabled function has an unbounded run time. It will consume the CPU for a data-dependent amount of time with no practical upper bound. When you are building realtime systems, even very soft realtime systems, you must ensure that the worst-case execution time for responding to events is bounded.
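One general way to restore a bound is to split data-dependent work into fixed-size pieces and to disable interrupts only around each piece. The sketch below models that idea on a host. It is not taken from the example sources: MaxBytesPerCriticalSection, sum_chunk, and sum_bounded are hypothetical names, and the counters exist only so that the bound can be checked.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical upper bound on the work done in one interrupts-disabled
// region.
constexpr size_t MaxBytesPerCriticalSection = 16;

// Bookkeeping for the model: how many regions were entered and the
// largest amount of work done in any one of them.
static size_t criticalSections;
static size_t longestSection;

// Stand-in for a function that would carry
// [[cheriot::interrupt_state(disabled)]] on CHERIoT. Its run time is
// bounded by MaxBytesPerCriticalSection, not by the caller's data.
static uint32_t sum_chunk(const uint8_t *data, size_t length)
{
	assert(length <= MaxBytesPerCriticalSection);
	criticalSections++;
	if (length > longestSection)
	{
		longestSection = length;
	}
	uint32_t sum = 0;
	for (size_t i = 0; i < length; i++)
	{
		sum += data[i];
	}
	return sum;
}

// Runs with interrupts enabled. Between chunks, a higher-priority
// thread would be able to preempt this one.
uint32_t sum_bounded(const uint8_t *data, size_t length)
{
	uint32_t sum = 0;
	while (length > 0)
	{
		size_t chunk = length < MaxBytesPerCriticalSection
		                 ? length
		                 : MaxBytesPerCriticalSection;
		sum += sum_chunk(data, chunk);
		data += chunk;
		length -= chunk;
	}
	return sum;
}
```

This only works when the operation does not need to be atomic as a whole. If it does, the bound has to come from limiting the size of the input instead.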

4.5. Importing MMIO access

The MMIO_CAPABILITY({type}, {name}) macro is used to access memory-mapped I/O devices. These are specified in the board definition file by the build system. The DEVICE_EXISTS({name}) macro can be used to detect whether the current target provides a device with the specified name.

The type parameter is the type used to represent the MMIO region. The macro evaluates to a volatile {type} *, so MMIO_CAPABILITY(struct UART, uart) will provide a volatile struct UART * pointing (and bounded) to the device that the board definition exposes as uart. This is precisely what happens in Listing 20, which prints 'Hello world!' to the UART directly.

	static const char hello[] = "Hello world!\n";
	for (char c : hello)
	{
		MMIO_CAPABILITY(Uart, uart)->blocking_write(c);
	}

Listing 20. Retrieving a pointer to a UART's MMIO space and using it. examples/raw_uart/raw_uart.cc

4.6. Sealing opaque types

Sealed capabilities were introduced in Section 1.6. Sealing pointers for tamper proofing. They provide a simple hardware-enforced mechanism for providing type-safe opaque types.

You normally implement opaque types in C/C++ by forward-declaring a struct type and then handing out pointers to that type. For example, you might write something like this:

struct MyType;
MyType *create_my_type();

This function will return a new instance of some type, but the caller can't see the implementation details. They can, of course, cast it to a char* or similar and read and write the underlying data. The opaque type is a software-engineering boundary telling the caller that they should not depend on the representation of this type.

Sealing on CHERIoT makes it easy to turn that software-engineering boundary into a security boundary. The same interface can be written for CHERIoT as:

struct MyType;
MyType *__sealed_capability create_my_type();

The returned value is marked as being tamper proof. The hardware ensures that the caller cannot modify the underlying object. If the caller casts this to a char* and tries to modify it then they will get a run-time trap. The __sealed_capability qualifier ensures that callers don't do this accidentally. The compiler will error if you try to dereference a sealed capability.

You can implicitly cast the MyType *__sealed_capability to void* but not to a MyType *. You can explicitly cast away the __sealed_capability qualifier but that just lets you compile things that will trap at run time.

The builtins for sealing and unsealing respect these types, as do the RTOS APIs (Section 7.7. Allocating on behalf of a caller) that use them. This means that you can write a function that expects a MyType *__sealed_capability and preserve type safety throughout your code and untrusted code. When a caller gives you back this kind of pointer and you unseal it, you will get a MyType * that is either a valid value or untagged.

4.7. Manipulating capabilities with C builtins

The compiler provides a set of built-in functions for manipulating capabilities. These are typically of the form __builtin_cheri_{noun}_{verb}. You can read all of the fields of a CHERI capability with get as the verb and the following nouns:

address: The current address that's used when the capability is used as a pointer.
base: The lowest address that this authorises access to.
top: The address immediately after the end of the range that this authorises access to.
length: The distance between the base and the top.
perms: The architectural permissions that this capability holds.
sealed: Is this a sealed capability?
tag: Is this a valid capability?
type: The type of this capability (zero means unsealed).

The verbs vary because they express the guarded manipulation guarantees for CHERI capabilities. You can't, for example, arbitrarily set the permissions on a capability; you can only remove permissions. Capabilities can be modified with the nouns and verbs listed in Table 3. CHERI capability manipulation builtin functions.

Table 3. CHERI capability manipulation builtin functions
Noun	Modification verb	Operation
address	set	Sets the address of the capability.
bounds	set	Sets the base at or below the current address and the length at or above the requested length, as closely as possible while still giving a valid capability.
bounds	set_exact	Sets the base to the current address and the length to the requested length, or returns an untagged capability (one that will trap if used) if the result is not representable.
perms	and	Clears all permissions except those provided as the argument.
tag	clear	Invalidates the capability but preserves all other fields.
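The and verb for permissions is what makes the operation safe to expose: a bitwise AND can clear bits but can never set them. The following host-side sketch models just that property. The bit assignments are illustrative and are not the CHERIoT encodings, and permissions_and, is_permission_subset, and make_read_only are hypothetical names for the model.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative bit assignments only; these are NOT the CHERIoT encodings.
enum : uint32_t
{
	PermLoad   = 1u << 0,
	PermStore  = 1u << 1,
	PermGlobal = 1u << 2,
};

// Model of the perms/and operation: keep only the permissions that are
// both currently held and present in the mask.
constexpr uint32_t permissions_and(uint32_t current, uint32_t mask)
{
	return current & mask;
}

// True if `derived` holds no permission that is absent from `original`.
constexpr bool is_permission_subset(uint32_t original, uint32_t derived)
{
	return (derived & ~original) == 0;
}

// Asking for a permission that is not held grants nothing new.
static_assert(permissions_and(PermLoad, PermLoad | PermGlobal) == PermLoad,
              "AND cannot add permissions");

// Derive a read-only view; the result is always a subset of the input.
uint32_t make_read_only(uint32_t current)
{
	uint32_t result = permissions_and(current, PermLoad);
	assert(is_permission_subset(current, result));
	return result;
}
```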

Setting the object type for sealed capabilities is more complex and requires a second capability that authorises sealing. The address field for capabilities with permit-seal or permit-unseal permissions refers to the object-type space, rather than the memory address space. The __builtin_cheri_seal function takes an authorising capability (something with the permit-seal permission) as the second argument and sets the object type of the result to the address of the sealing capability. Conversely, __builtin_cheri_unseal uses a capability with the permit-unseal permission and an address matching the object type to restore the original unsealed value.
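The relationship between the key's address and the object type can be sketched as a small software model. This is an illustration of the rules described above, not the hardware behaviour: ModelCapability, model_seal, and model_unseal are hypothetical names, and returning an untagged value on every failure is an assumption of the model.

```cpp
#include <cassert>
#include <cstdint>

// Software model only: real sealing is enforced by the hardware.
struct ModelCapability
{
	uintptr_t address;    // Memory address, or object type for a key.
	uint32_t  objectType; // Zero means unsealed.
	bool      permitSeal;
	bool      permitUnseal;
	bool      tag;
};

// Model assumption: every failure yields an untagged result.
static ModelCapability untagged(ModelCapability value)
{
	value.tag = false;
	return value;
}

// The key is the second argument; its address becomes the object type.
ModelCapability model_seal(ModelCapability value, ModelCapability key)
{
	if (!value.tag || !key.tag || !key.permitSeal ||
	    value.objectType != 0 || key.address == 0)
	{
		return untagged(value);
	}
	value.objectType = static_cast<uint32_t>(key.address);
	return value;
}

// Unsealing needs permit-unseal and an address matching the object type.
ModelCapability model_unseal(ModelCapability value, ModelCapability key)
{
	if (!value.tag || !key.tag || !key.permitUnseal ||
	    value.objectType == 0 || value.objectType != key.address)
	{
		return untagged(value);
	}
	value.objectType = 0;
	return value;
}
```

In this model, a holder of the sealed value who lacks a matching key can pass it around but cannot recover the unsealed value.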

Most of the time, C code will avoid using the builtins directly and instead use the wrappers defined in cheri-builtins.h. This file contains a set of macros that wrap the builtins to remove the __builtin_ prefix.

Although most of the macros in cheri-builtins.h match the names of the underlying builtins, the permissions macros follow the CHERIoT RTOS coding convention of avoiding abbreviations and so use permissions instead of perms. The predicates prefix the operation with is_, so __builtin_cheri_equal_exact becomes cheri_is_equal_exact.

You can see how to use most of the introspection builtins via their macro wrappers in Listing 21. This prints a capability, showing its address, tag (valid) bit, length, bounds, and permissions. The permissions are expanded as the letters from the tables in Section 1.4. Decomposing permissions in CHERIoT. The builtins are thin wrappers around the instructions, which represent the permissions as a bitmask. Individual bits must be extracted by a bitwise AND operation.

void print_capability(void *ptr)
{
	unsigned permissions = cheri_permissions_get(ptr);
	printf(
	  "0x%x (valid:%d length: 0x%x 0x%x-0x%x otype:%d "
	  "permissions: %c "
	  "%c%c%c%c%c%c %c%c %c)\n",
	  cheri_address_get(ptr),
	  cheri_tag_get(ptr),
	  cheri_length_get(ptr),
	  cheri_base_get(ptr),
	  cheri_top_get(ptr),
	  cheri_type_get(ptr),
	  (permissions & CHERI_PERM_GLOBAL) ? 'G' : '-',
	  (permissions & CHERI_PERM_LOAD) ? 'R' : '-',
	  (permissions & CHERI_PERM_STORE) ? 'W' : '-',
	  (permissions & CHERI_PERM_LOAD_STORE_CAP) ? 'c' : '-',
	  (permissions & CHERI_PERM_LOAD_GLOBAL) ? 'g' : '-',
	  (permissions & CHERI_PERM_LOAD_MUTABLE) ? 'm' : '-',
	  (permissions & CHERI_PERM_STORE_LOCAL) ? 'l' : '-',
	  (permissions & CHERI_PERM_SEAL) ? 'S' : '-',
	  (permissions & CHERI_PERM_UNSEAL) ? 'U' : '-',
	  (permissions & CHERI_PERM_USER0) ? '0' : '-');
}

Listing 21. Pretty-printing a capability using the C builtin wrappers. examples/manipulate_capabilities_c/example.c

Listing 22 uses this function to print some initial capabilities from both heap and stack memory and then manipulates them. First, it explicitly sets the bounds of the heap capability to 23 bytes, then removes all permissions except load.

	// A stack allocation
	char stackBuffer[23];
	print_capability(stackBuffer);
	// A heap allocation
	char *heapBuffer = malloc(23);
	print_capability(heapBuffer);
	// Setting the bounds of a heap capability
	char *bounded = cheri_bounds_set(heapBuffer, 23);
	print_capability(bounded);
	// Removing permissions from a heap capability
	bounded = cheri_permissions_and(bounded, CHERI_PERM_LOAD);
	print_capability(bounded);
	print_capability(heapBuffer);

Listing 22. Manipulating capabilities using the C builtin wrappers. examples/manipulate_capabilities_c/example.c

When you run this example, you should see something like this (the exact addresses may vary):

0x80000ae1 (valid:1 length: 0x17 0x80000ae1-0x80000af8 otype:0 permissions: - RWcgml -- -)
0x80006710 (valid:1 length: 0x18 0x80006710-0x80006728 otype:0 permissions: G RWcgm- -- -)
0x80006710 (valid:1 length: 0x17 0x80006710-0x80006727 otype:0 permissions: G RWcgm- -- -)
0x80006710 (valid:1 length: 0x17 0x80006710-0x80006727 otype:0 permissions: - R----- -- -)
0x80006710 (valid:1 length: 0x18 0x80006710-0x80006728 otype:0 permissions: G RWcgm- -- -)

First, note the difference between the permissions on the stack and heap allocations. The heap allocation has global permission: it may be stored anywhere. The stack allocation lacks global but has store-local permission, which allows capabilities that lack the global permission to be stored through it. These two conditions ensure that stack pointers (which lack global) can be stored only on the stack (the only memory that has store-local permission).

The bounds on the original heap allocation are rounded up to a multiple of the heap's allocation granule size. The CHERIoT allocator allocates in multiples of 8 bytes, so the 23-byte request is rounded up to 24 (0x18) bytes. For a capability this small, CHERIoT can precisely represent the desired size, so the bounds-setting operation succeeds and yields a capability with exactly the 23-byte (0x17) bounds that were requested.
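The rounding itself is ordinary integer arithmetic. As a minimal sketch, assuming only the 8-byte granule stated above and ignoring any allocator header or minimum size, it can be written as follows (AllocationGranule and round_up_to_granule are hypothetical names, not allocator APIs):

```cpp
#include <cassert>
#include <cstddef>

// Allocation granule size taken from the text.
constexpr size_t AllocationGranule = 8;

// Round a requested size up to the next multiple of the granule.
constexpr size_t round_up_to_granule(size_t size)
{
	return (size + AllocationGranule - 1) / AllocationGranule *
	       AllocationGranule;
}

// A 23-byte request occupies three granules: 24 (0x18) bytes.
static_assert(round_up_to_granule(23) == 24, "23 rounds up to 24");
// A size that is already a multiple of the granule is unchanged.
static_assert(round_up_to_granule(24) == 24, "24 is already aligned");

// Bytes of padding that lie beyond the request but inside the bounds.
size_t padding_for(size_t size)
{
	size_t rounded = round_up_to_granule(size);
	assert(rounded >= size);
	return rounded - size;
}
```

The one byte of padding for a 23-byte request is why the second line of the output shows a length of 0x18 while the third, after the bounds are set explicitly, shows 0x17.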

Next, the example removes all permissions except load. This pointer now provides a read-only view of the data, which cannot be stored anywhere except on the stack and which cannot be used to load capabilities.

Finally, this example prints the heap allocation again to remind you that these permissions and bounds apply to the pointer and not to the object. We have not removed permissions from an object, we have created a pointer that has fewer permissions to that object. There is no limit to the number of pointers that can exist to a single object.

4.8. Comparing capabilities with C builtins

By default, the C/C++ == operator on capabilities compares only the address.

This is subject to change in a future revision of CHERI C. It makes porting some existing code easier, but breaks the substitution principle (if a == b, you would expect to be able to use b or a interchangeably).

You can compare capabilities for exact equality with the __builtin_cheri_equal_exact builtin, or with the cheri_is_equal_exact macro that wraps it. This returns true if the two capabilities that are passed to it are identical, false otherwise. Exact equality means that the address, bounds, permissions, object type, and tag are all identical. It is, effectively, a bitwise comparison of all of the bits in the two capabilities, including the tag bits.

You can see the difference between the two in Listing 23. This creates a capability with a small offset into an on-stack buffer and then restricts the bounds and removes permissions from it, then compares them for equality using both the == operator and the cheri_is_equal_exact macro.

	// A stack allocation
	char  stackBuffer[23];
	char *offset = stackBuffer + 4;
	print_capability(offset);
	// Reduce the bounds
	char *bounded = cheri_bounds_set(offset, 4);
	print_capability(bounded);
	printf("Equal? %d\n", bounded == offset);
	printf("Exactly equal? %d\n",
	       cheri_is_equal_exact(bounded, offset));
	// Remove permissions
	char *restricted =
	  cheri_permissions_and(bounded, CHERI_PERM_LOAD);
	print_capability(restricted);
	printf("Equal? %d\n", bounded == restricted);
	printf("Exactly equal? %d\n",
	       cheri_is_equal_exact(bounded, restricted));
	char *untagged = cheri_tag_clear(restricted);
	print_capability(untagged);
	printf("Equal? %d\n", untagged == restricted);
	printf("Exactly equal? %d\n",
	       cheri_is_equal_exact(untagged, restricted));

Listing 23. Comparing two capabilities for equality. examples/compare_capabilities/example.c

When you run this example, you should see output that looks something like this:

0x80000ae5 (valid:1 length: 0x17 0x80000ae1-0x80000af8 otype:0 permissions: - RWcgml -- -)
0x80000ae5 (valid:1 length: 0x4 0x80000ae5-0x80000ae9 otype:0 permissions: - RWcgml -- -)
Equal? 1
Exactly equal? 0
0x80000ae5 (valid:1 length: 0x4 0x80000ae5-0x80000ae9 otype:0 permissions: - R----- -- -)
Equal? 1
Exactly equal? 0
0x80000ae5 (valid:0 length: 0x4 0x80000ae5-0x80000ae9 otype:0 permissions: - R----- -- -)
Equal? 1
Exactly equal? 0

First it shows the original capability, which grants complete access to a stack allocation and has its address four bytes offset into the object. Then it shows the bounded capability, which has the same address and permissions but different bounds. These compare equal with address-based comparison but are not exactly equal.

Next it removes permissions from the derived capability and compares these. Again, the difference in permissions is not reflected in the address-based equality but is in the exact equality.

The final case is the most interesting and the one where this can be the most confusing. The last pointer constructed in this example is not a capability. This is constructed by clearing the tag, which is the bit that indicates that the capability-sized word is, in fact, a capability. Losing the tag bit means that this is not a capability at all, merely 64 bits of data that happen to be loaded into a capability register. With the C equality operator, this still compares equal to any of the other capabilities, but the exact-equality comparison fails.

Ordered comparison, using operators such as less-than or greater-than, always operates on the address. There is no total ordering over capabilities. Two capabilities with different bounds or different permissions but the same address will return false when compared with either < or >.

This is fine according to a strict interpretation of the C abstract machine because comparing two pointers to different objects is undefined behaviour. It can be confusing but, unfortunately, there is no good alternative. Comparison of pointers is commonly used for keying in collections. For example, the C++ std::map class uses the ordered comparison operators for building a tree and relies on them working correctly for keys that are pointers. Ideally, these would explicitly operate over the address, but that would require invasive modifications when porting to CHERI platforms.

You can see the case that can make this confusing in Listing 24. This compares two capabilities using the ordered operators and then exact equality.

	if (bounded > offset)
	{
		printf("bounded > offset\n");
	}
	else if (bounded < offset)
	{
		printf("bounded < offset\n");
	}
	else if (cheri_is_equal_exact(bounded, offset))
	{
		printf("bounded exactly equals offset\n");
	}
	else
	{
		printf("bounded is not greater than, less than, nor "
		       "equal to, offset\n");
	}

Listing 24. Trying to construct an ordering over two capabilities. examples/compare_capabilities/example.c

When you run this example, it will print:

bounded is not greater than, less than, nor equal to, offset

This highlights that, within the C abstract machine, there is no good choice for what == should do on capabilities. In the current version, it breaks the substitution principle: you cannot use a and b interchangeably if a == b. In the alternative version, existing code that does a < b and a > b and assumes that a == b holds if both ordered comparisons fail would now be incorrect.

In general, in new code, you should avoid comparing pointers for anything other than exact equality, unless you are certain that they have the same base and bounds. Instead, be explicit about exactly what you are testing. Do you care if the permissions are different? Do you care about the bounds? Do you care if the value is tagged? Or do you just want to care about the address? In each case, you should explicitly compare the components of the capability that you care about.
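As a way of picturing what "compare the components that you care about" means, the sketch below models a capability as a plain C struct and gives each question its own helper. This is a software model for illustration only: struct model_cap, model_cap_create, and the helper names are hypothetical and are not part of the CHERIoT headers. On a real CHERIoT target you would read the components with the compiler builtins or their wrappers instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical software model of a capability, for illustration
 * only. The field names are assumptions made for this sketch.
 */
struct model_cap
{
	uint32_t address;     /* Where the pointer currently points. */
	uint32_t base;        /* Lowest accessible address. */
	uint32_t top;         /* One past the highest accessible address. */
	uint32_t permissions; /* One bit per permission. */
	bool     tag;         /* True only for a valid capability. */
};

/* Build a tagged model capability that points at its own base. */
struct model_cap
model_cap_create(uint32_t base, uint32_t top, uint32_t permissions)
{
	assert(base <= top);
	struct model_cap cap = {base, base, top, permissions, true};
	return cap;
}

/* The only component that ==, < and > look at. */
bool same_address(struct model_cap a, struct model_cap b)
{
	return a.address == b.address;
}

/* True if both grant access to the same range of memory. */
bool same_bounds(struct model_cap a, struct model_cap b)
{
	return (a.base == b.base) && (a.top == b.top);
}

/* True if both carry the same permission bits. */
bool same_permissions(struct model_cap a, struct model_cap b)
{
	return a.permissions == b.permissions;
}

/* Models exact equality: every component, including the tag. */
bool exactly_equal(struct model_cap a, struct model_cap b)
{
	return same_address(a, b) && same_bounds(a, b) &&
	       same_permissions(a, b) && (a.tag == b.tag);
}
```

In this model, two values created with the same base but different tops satisfy same_address and fail both same_bounds and exactly_equal, which mirrors the result printed for bounded and offset above.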

You can also compare capabilities for subset relationships with __builtin_cheri_subset_test. This returns true if the second argument is a subset of the first. A capability is a subset of another if every right that it conveys is held by the other. This means the bounds of the subset capability must be smaller than or equal to the superset and all permissions held by the subset must be held by the superset.

You can see this for the capabilities that we've been looking at in Listing 25.

	printf("bounded ⊂ offset? %d\n",
	       cheri_subset_test(offset, bounded));
	printf("restricted ⊂ bounded? %d\n",
	       cheri_subset_test(bounded, restricted));
	printf("untagged ⊂ restricted? %d\n",
	       cheri_subset_test(restricted, untagged));
	printf("offset ⊂ bounded? %d\n",
	       cheri_subset_test(bounded, offset));

Listing 25. Subset relationships over two capabilities.examples/compare_capabilities/example.c

When you run this, the output is:

bounded ⊂ offset? 1
restricted ⊂ bounded? 1
untagged ⊂ restricted? 0
offset ⊂ bounded? 0

Most of these lines should not be a surprise. The bounded capability is a subset of the original because it was created by subsetting the bounds. The capability that was created by subsetting the rights on the bounded version is, in turn, a subset of the bounded version. Finally, the original is not a subset of the bounded version.

The surprising entry might be that the untagged capability is not a subset of the original. In a set-theoretic sense, this would be incorrect: The empty set is a subset of any other set. In practice, this degenerate case is not useful.
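The subset rule described above can be written down as a small software model. This is a sketch under stated assumptions, not the hardware's implementation: struct model_cap and model_subset_test are hypothetical names, the permissions are an arbitrary bitmask, and the model assumes that two values with different tags are never reported as being in a subset relationship.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the parts of a capability that matter here. */
struct model_cap
{
	uint32_t base;        /* Lowest accessible address. */
	uint32_t top;         /* One past the highest accessible address. */
	uint32_t permissions; /* One bit per permission. */
	bool     tag;         /* True only for a valid capability. */
};

/*
 * Returns true if `subset` conveys no rights beyond those of
 * `superset`. The argument order follows the description above:
 * the possible superset comes first.
 */
bool model_subset_test(struct model_cap superset,
                       struct model_cap subset)
{
	/* Assumption: differing tags always give false. */
	if (superset.tag != subset.tag)
	{
		return false;
	}
	/* The subset's bounds must lie entirely inside the superset's. */
	bool boundsInside = (subset.base >= superset.base) &&
	                    (subset.top <= superset.top);
	/* The subset must hold no permission that the superset lacks. */
	bool permissionsInside =
	  (subset.permissions & ~superset.permissions) == 0;
	return boundsInside && permissionsInside;
}
```

With values shaped like the ones in the listing (a wider capability, a narrower one, a narrower one with fewer permissions, and an untagged copy), this model gives the same four answers as the output shown above.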

The test-subset operation gives a unidirectional substitution property (i.e. any operation that is safe to do with the subset is safe to do with the superset) but this is not usually something that you care about. The test is most useful for telling if one capability is derived from another (or, at least, could have a derivation path from a specific common root). For example, we can tell that (ignoring stack lifetime errors) bounded and restricted are both derived from the original stack allocation. It happens that, in this specific case, untagged was derived from the same stack allocation but the lack of a tag bit means that there are no provenance guarantees. For untagged values, we can make no claims about whether they were derived from any other capabilities.

This is useful for checking whether a particular pointer that you've been given is derived from something that you already own. The claims mechanism (described in Section 7.8. Ensuring that heap objects are not deallocated) uses this, for example, to allow threads to keep an object alive if they hold a pointer derived from the original object pointer. The temporal-safety properties of CHERIoT ensure that any dangling pointer to a heap object will be untagged and so any valid (tagged) pointer that is a subset of a heap allocation must be derived from the return from the original call to malloc or some similar function.

4.9. Sizing allocations

CHERI capabilities cannot represent arbitrary bases and bounds. The original CHERI prototypes, with a 64-bit address, encoded a 64-bit top address and a 64-bit base. This made capabilities 256 bits in total (four times the address size), which was not feasible for production implementations (though having lots of space was very useful for prototyping). Fortunately, there is a lot of redundancy between these three values. Generally, for any allocation, the high bits of the base, top, and some in-bounds address will all be the same.

CHERI systems since around 2016 have used compressed bounds encodings that take advantage of this redundancy. Rather than storing a complete address for the top and bottom, they store a floating-point value that is the distance from the address to the top and from the address to the base. The exponent bits are shared between the two. This means that, the larger the bounds, the more strongly aligned the base and bounds must be.
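To make the redundancy concrete, the sketch below counts how many high bits two 32-bit addresses have in common. The function name shared_high_bits is hypothetical and the code is plain C with no CHERI dependencies; it illustrates the observation only, not how any encoding is implemented.

```c
#include <stdint.h>

/* Count the leading bits that two 32-bit addresses share. */
unsigned shared_high_bits(uint32_t a, uint32_t b)
{
	/* Bits that differ between the two values are set here. */
	uint32_t difference = a ^ b;
	unsigned shared     = 0;
	/* Walk down from the top bit until the values first differ. */
	for (uint32_t bit = UINT32_C(1) << 31; bit != 0; bit >>= 1)
	{
		if (difference & bit)
		{
			break;
		}
		shared++;
	}
	return shared;
}
```

For the base 0x80006800 and top 0x8002da00 that appear in the allocation example in this section, this returns 14: the top 14 bits of the two addresses are identical, so storing them twice conveys no extra information.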

The current CHERIoT encoding gives byte-granularity bounds for objects up to 511 bytes, then requires one more bit of alignment for each additional bit needed to represent the size, up to 8 MiB. Capabilities larger than 8 MiB must be aligned on an 8 MiB boundary for their base and top. This is ample for small embedded systems where most compartments or heap objects are expected to be under tens of KiB. This is a slightly simplified version of the original CHERI scheme, which shortens the critical path on short pipelines. A microcontroller may have a simple pipeline with only the traditional fetch, decode, and execute phases (or even fewer). The critical path is the path with the most logic chained together in a single stage. This limits the maximum clock speed for the device because a signal must be able to propagate through all of this logic in a single cycle. Future versions of CHERIoT are likely to support slightly more expressive formats on longer pipelines, where the decoding can be split between two or more stages. Other CHERI systems make different trade-offs.

Calculating the length can be non-trivial and can vary across CHERI systems. The compiler provides two builtins that help.

The first, __builtin_cheri_round_representable_length, returns the smallest length that is larger than (or equal to) the requested length and can be accurately represented. The compressed bounds encoding requires both the top and base to be aligned by the same amount and so there's a corresponding mask that needs to be used for alignment. The second, __builtin_cheri_representable_alignment_mask, returns the mask that can be applied to the base and top addresses to align them.

Listing 26 shows how to use these builtins via their wrappers to find the smallest representable size for a requested size.

	const size_t Size = 160000;
	printf("Smallest representable size of %d-byte "
	       "allocation: %d (0x%x). Alignment mask: 0x%x\n",
	       Size,
	       cheri_round_representable_length(Size),
	       cheri_round_representable_length(Size),
	       cheri_representable_alignment_mask(Size));
	void *allocation = malloc(Size);
	print_capability(allocation);

Listing 26. Rounding up sizes for representable allocations.examples/bounds_lengths/example.c

The allocator uses these internally when it determines the size to provide for a request and the alignment that it needs to find. When you run the example, you may see something like this.

Smallest representable size of 160000-byte allocation: 160256 (0x27200). Alignment mask: 0xfffffe00
0x80006800 (valid:1 length: 0x27200 0x80006800-0x8002da00 otype:0 permissions: G RWcgm- -- -)

The requested size needs to be rounded up to 160,256 bytes. The hex representation makes it easier to see the alignment is 0x200, or 512 in decimal. The top and bottom of an allocation that can accurately represent the requested size must be 512-byte aligned. The alignment mask is simply another way of representing this: it is nine zeroes in the low bits and ones in all of the high bits.
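The arithmetic above can be reproduced with a small software model. This is a sketch inferred from the description in this section, not the algorithm that the hardware or the real builtins use. It assumes that lengths up to 511 are byte-granular, that each further doubling costs one bit of alignment, and that the largest exponent is 14; the model_ function names and the two constants are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions taken from the text, not from a specification. */
enum
{
	ModelMantissaMax = 511, /* Largest byte-granular length. */
	ModelMaxExponent = 14   /* Assumed largest alignment exponent. */
};

/* Smallest exponent e such that length fits in 511 << e. */
unsigned model_exponent(uint32_t length)
{
	/* This sketch does not model larger lengths. */
	assert(length <= ((uint32_t)ModelMantissaMax << ModelMaxExponent));
	unsigned exponent = 0;
	while (((uint32_t)ModelMantissaMax << exponent) < length)
	{
		exponent++;
	}
	return exponent;
}

/* Mask with zeroes in the low bits that must be clear. */
uint32_t model_representable_alignment_mask(uint32_t length)
{
	uint32_t alignment = UINT32_C(1) << model_exponent(length);
	return (uint32_t)~(alignment - 1);
}

/* Round the length up to the next multiple of the alignment. */
uint32_t model_round_representable_length(uint32_t length)
{
	uint32_t mask = model_representable_alignment_mask(length);
	return (length + (uint32_t)~mask) & mask;
}
```

For a 160,000-byte request the model picks an exponent of 9, because 511 << 8 is 130,816 (too small) and 511 << 9 is 261,632. That gives a mask of 0xfffffe00 and a rounded length of 160,256, matching the output shown above.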

When the allocator returns a value for this requested size, the length is rounded up as you'd expect. You can test that the alignment is adequate by doing a bitwise AND (C operator: &) of the base or top with the alignment mask. This leaves both values unmodified. The base is, in this specific case, more strongly aligned than required (its low 11 bits are zero, where only nine are needed), most likely because this is the first malloc call in the program and so it is as strongly aligned as the heap base.
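The alignment check described above is a single mask operation. The sketch below wraps it in a function so that it can be tried with the addresses printed by the example; is_aligned_to_mask is a hypothetical name and the code is plain C that treats the base and top as ordinary 32-bit integers.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if applying the alignment mask leaves the address
 * unchanged, which means that every bit the mask clears was
 * already zero.
 */
bool is_aligned_to_mask(uint32_t address, uint32_t mask)
{
	return (address & mask) == address;
}
```

With the mask 0xfffffe00, this returns true for both 0x80006800 and 0x8002da00 and false for an address such as 0x80006900, which has one of the low nine bits set.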

4.10. Manipulating capabilities with CHERI::Capability

The raw C builtins can be somewhat verbose. CHERIoT RTOS provides a CHERI::Capability class in cheri.hh to simplify inspecting and manipulating CHERI capabilities.

This class provides methods that are designed to let you treat them as if they gave direct access to the fields of the capability. The manipulate_capabilities_cxx example shows how to do the same things as the manipulate_capabilities_c example, this time with the C++ APIs. First, Listing 27 reimplements the print_capability function using CHERI::Capability. This is slightly more verbose because it prints with the printf function, which is a C variadic and so cannot take the result of ptr.address(), which is a proxy that allows you to manipulate the address.

void print_capability(CHERI::Capability<void> ptr)
{
	using P                          = CHERI::Permission;
	ptraddr_t            address     = ptr.address();
	CHERI::PermissionSet permissions = ptr.permissions();
	printf("0x%x (valid:%d length: 0x%x 0x%x-0x%x otype:%d "
	       "permissions: %c "
	       "%c%c%c%c%c%c %c%c %c)\n",
	       address,
	       ptr.is_valid(),
	       ptr.length(),
	       ptr.base(),
	       ptr.top(),
	       ptr.type(),
	       (permissions.contains(P::Global)) ? 'G' : '-',
	       (permissions.contains(P::Load)) ? 'R' : '-',
	       (permissions.contains(P::Store)) ? 'W' : '-',
	       (permissions.contains(P::LoadStoreCapability))
	         ? 'c'
	         : '-',
	       (permissions.contains(P::LoadGlobal)) ? 'g' : '-',
	       (permissions.contains(P::LoadMutable)) ? 'm' : '-',
	       (permissions.contains(P::StoreLocal)) ? 'l' : '-',
	       (permissions.contains(P::Seal)) ? 'S' : '-',
	       (permissions.contains(P::Unseal)) ? 'U' : '-',
	       (permissions.contains(P::User0)) ? '0' : '-');
}

Listing 27. Pretty-printing a capability using the C++ APIs.examples/manipulate_capabilities_cxx/example.cc

The using P = CHERI::Permission is not good style. It is done here so that the example code fits in a narrow page. In normal code, a more descriptive name would be better.

Note the CHERI::PermissionSet class here. This is a (constexpr) class that encapsulates a CHERI permission set. The C version of this exposed the permissions in their raw form as a word where each bit represented a permission. The C++ version uses a rich type, with methods for subsetting. This can be used as a template parameter and can be used in static assertions for compile-time validation of derivation chains. The loader makes extensive use of this class to ensure correctness, with compile-time checks that operations on permission-set objects are valid.

This part of the example uses the contains() method to query whether a specific permission is present. This is strongly typed; it takes a CHERI::Permission, not an arbitrary integer. It is also a variadic template function: you can pass it multiple permissions and it will return true if and only if the permission set has all of them.
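The "all of them" rule has a simple equivalent over the raw bitmask form that the C version exposes. The sketch below shows that equivalent; the bit assignments and the model_ names are hypothetical and chosen only for this example, and it lacks the type safety that the C++ class provides.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, for this sketch only. */
enum model_permission
{
	ModelGlobal = 1u << 0,
	ModelLoad   = 1u << 1,
	ModelStore  = 1u << 2
};

/*
 * Returns true if and only if every permission in `required` is
 * present in `set`. Several permissions can be ORed together to
 * query them all at once.
 */
bool model_contains(uint32_t set, uint32_t required)
{
	return (set & required) == required;
}
```

A set holding ModelLoad and ModelStore contains ModelLoad, and contains both together, but does not contain the combination of ModelLoad and ModelGlobal. Nothing here stops a caller from passing an arbitrary integer, which is the mistake that the strongly typed C++ interface rules out.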

Next, Listing 28 does the same set of manipulations as Listing 22. This uses a CHERI::Capability<void> rather than a void* to hold the pointers.

	// A stack allocation
	char stackBuffer[23];
	print_capability(stackBuffer);
	// A heap allocation
	CHERI::Capability<void> heapBuffer = new char[23];
	print_capability(heapBuffer);
	// Setting the bounds of a heap capability
	auto bounded     = heapBuffer;
	bounded.bounds() = 23;
	print_capability(bounded);
	// Removing permissions from a heap capability
	bounded.permissions() &= CHERI::Permission::Load;
	print_capability(bounded);
	print_capability(heapBuffer);

Listing 28. Manipulating capabilities using the C++ APIs.examples/manipulate_capabilities_cxx/example.cc

The bounded.bounds() = 23 expression shows how the methods act like fields. This is doing a set-bounds operation on the capability. Similarly, the &= operation on the result of calling permissions() is an and-permissions operation. This lets you operate on the permissions as a CHERI::PermissionSet directly.

The equality comparison for CHERI::Capability uses exact comparison, unlike raw C/C++ pointer comparison. This is less confusing for new code (it respects the substitution principle) but users may be confused that a == b is true but Capability{a} == Capability{b} is false. Listing 29 shows the various forms of comparison.

	printf("heapBuffer == bounded? %d\n",
	       heapBuffer == bounded);
	printf("heapBuffer == bounded (as raw pointers)? %d\n",
	       heapBuffer.get() == bounded.get());
	printf(
	  "heapBuffer == bounded (as address comparison)? %d\n",
	  heapBuffer.address() == bounded.address());

Listing 29. Comparing capabilities using the C++ APIs.examples/manipulate_capabilities_cxx/example.cc

When you run this, you will see:

heapBuffer == bounded? 0
heapBuffer == bounded (as raw pointers)? 1
heapBuffer == bounded (as address comparison)? 1

The last two comparisons are equivalent, but the third is more explicit. If you want to compare two pointers by address alone, comparing their addresses explicitly makes the intent clear.

See cheri.hh for more details and for other convenience wrappers around the compiler builtins.

CHERIoT Programmers' Guide