Functions to extract arithmetic binary values from a byte buffer (in either endian order) to native format; conversely, given an arithmetic binary value, append it to a buffer of bytes in the specified endian order.

#include "serialize/byteswap.hpp"
#include <concepts>
#include <algorithm>
#include <bit>
#include <array>
#include <cstddef>
#include <cstdint>
#include <type_traits>

Concepts
concept	chops::integral_or_byte

Functions
template<std::endian BufEndian, integral_or_byte T>
constexpr T	chops::extract_val (const std::byte *buf) noexcept
	Extract a value from a `std::byte` buffer into a fundamental integral or `std::byte` type in native endianness, swapping bytes as needed.

template<std::endian BufEndian, integral_or_byte T>
constexpr std::size_t	chops::append_val (std::byte *buf, const T &val) noexcept
	Append an integral or `std::byte` value to a `std::byte` buffer, swapping into the specified endian order as needed.

template<std::unsigned_integral T>
constexpr std::size_t	chops::append_var_int (std::byte *output, T val)
	Encode an unsigned integer into a variable length buffer of bytes using the MSB (most significant bit) algorithm.

template<std::unsigned_integral T>
constexpr T	chops::extract_var_int (const std::byte *input, std::size_t input_size)
	Given a buffer of `std::bytes` that hold a variable sized integer, decode them into an unsigned integer.

Detailed Description

Functions to extract arithmetic binary values from a byte buffer (in either endian order) to native format; conversely, given an arithmetic binary value, append it to a buffer of bytes in the specified endian order.

The functions in this file are low-level. They handle fundamental integral types and std::byte, extracting from or appending to std::byte buffers. They are meant to be the lower layer of serializing utilities, where the next higher layer provides buffer management, sequences, and overloads for specific types such as std::string, bool, and std::optional.

Note: The variable sized integer functions (extract_var_int, append_var_int) support the variable byte integer type in MQTT (Message Queuing Telemetry Transport), a commonly used IoT protocol. The code for these functions is adapted from a Techoverflow.net article by Uli Koehler, published under the CC0 1.0 Universal license: https://techoverflow.net/2013/01/25/efficiently-encoding-variable-length-integers-in-cc/

Author: Cliff Green, Roxanne Agerone, Uli Koehler

Copyright: (c) 2019-2024 by Cliff Green, Roxanne Agerone

Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE.txt or copy at http://www.boost.org/LICENSE_1_0.txt)

Function Documentation

◆ append_val()

template<std::endian BufEndian, integral_or_byte T>

std::size_t chops::append_val	(	std::byte *	buf,
		const T &	val )

constexpr noexcept

Append an integral or std::byte value to a std::byte buffer, swapping into the specified endian order as needed.

Template Parameters

BufEndian	The endianness of the buffer.
T	Type of value to append to buffer.

The BufEndian enum must be specified, but the type of the passed in value can be deduced.

Parameters

buf	Pointer to array of `std::bytes` big enough to hold the bytes of the value.
val	Value in native endian order to append to buf.

Returns: Number of bytes copied into the std::byte buffer.

Precondition: The buffer must already be allocated to hold at least sizeof(T) bytes.

Note: See the note about floating point values in the extract_val() documentation.

◆ append_var_int()

template<std::unsigned_integral T>

std::size_t chops::append_var_int	(	std::byte *	output,
		T	val )

constexpr

Encode an unsigned integer into a variable length buffer of bytes using the MSB (most significant bit) algorithm.

Given an integer, store the value in one, two, three, or more bytes depending on its magnitude. Each output byte carries seven bits of the value, and the most significant bit of a byte is set to 1 when another byte follows. A small value (under 128) is therefore stored in one byte. If the value is greater than 127, the most significant bit in the first byte is set to 1 and the value continues into a second byte. This logic is repeated as necessary.

This algorithm optimizes space when most of the values are small. If most of the values are large, this algorithm is inefficient, needing more buffer space for the encoded integers than if fixed size integer buffers were used.

The output of this function is (by definition) in little-endian order. However, as long as the two corresponding functions (or equivalent algorithms) are used consistently, the endianness will not matter. There is no byte swapping performed, and encoding and decoding will result in the native endianness of the platform. I.e. this works whether serialization is big-endian or little-endian.

Note: Signed types are not supported.

Parameters

val	The input value. Any standard unsigned integer type is allowed.
output	A pointer to a preallocated array of `std::byte` values big enough for the output. A safe minimum size is 3 bytes for 16-bit integers, 5 bytes for 32-bit integers, and 10 bytes for 64-bit integers.

Returns: The number of bytes written to the output array.

Precondition: The output buffer must already be allocated large enough to hold the result.

◆ extract_val()

template<std::endian BufEndian, integral_or_byte T>

T chops::extract_val ( const std::byte * buf )

constexpr noexcept

Extract a value from a std::byte buffer into a fundamental integral or std::byte type in native endianness, swapping bytes as needed.

Template Parameters

BufEndian	The endianness of the `std::byte` buffer.
T	Type of return value.

Since T cannot be deduced, it must be specified when calling the function. If the endianness of the buffer matches the native endianness, no swapping is performed.

Parameters

buf	Pointer to an array of `std::byte` values containing an object of type T in the byte order given by BufEndian.

Returns: A value in native endian order.

Precondition: The buffer must contain at least sizeof(T) bytes.

Note: Floating point swapping is not supported.

Earlier versions did support floating point, but that support was brittle: the floating point representation must exactly match on both sides of the serialization (most modern processors use IEEE 754 floating point representations). A byte swapped floating point value cannot be directly accessed (e.g. passed by value), because its bit pattern may represent a NaN value, which can generate hardware traps, either causing runtime crashes or silently changing bits within the floating point number. An integer value, however, will always have a valid bit pattern, even when byte swapped.

◆ extract_var_int()

template<std::unsigned_integral T>

T chops::extract_var_int	(	const std::byte *	input,
		std::size_t	input_size )

constexpr

Given a buffer of std::bytes that hold a variable sized integer, decode them into an unsigned integer.

For consistency with the append_var_int function, only unsigned integers are supported for the output type of this function.

Note: Signed types are not supported.

Parameters

input	A variable-length encoded integer stored in a buffer of `std::bytes`.
input_size	Number of bytes representing the integer.

Returns: The value in native unsigned integer format.