Binary Serialize
extract_append.hpp File Reference

Functions to extract arithmetic binary values from a byte buffer (in either endian order) to native format; conversely, given an arithmetic binary value, append it to a buffer of bytes in the specified endian order.

#include "serialize/byteswap.hpp"
#include <concepts>
#include <algorithm>
#include <bit>
#include <array>
#include <cstddef>
#include <cstdint>
#include <type_traits>

Concepts

concept  chops::integral_or_byte
 

Functions

template<std::endian BufEndian, integral_or_byte T>
constexpr T chops::extract_val (const std::byte *buf) noexcept
 Extract a value from a std::byte buffer into a fundamental integral or std::byte type in native endianness, swapping bytes as needed.
 
template<std::endian BufEndian, integral_or_byte T>
constexpr std::size_t chops::append_val (std::byte *buf, const T &val) noexcept
 Append an integral or std::byte value to a std::byte buffer, swapping into the specified endian order as needed.
 
template<std::unsigned_integral T>
constexpr std::size_t chops::append_var_int (std::byte *output, T val)
 Encode an unsigned integer into a variable length buffer of bytes using the MSB (most significant bit) algorithm.
 
template<std::unsigned_integral T>
constexpr T chops::extract_var_int (const std::byte *input, std::size_t input_size)
 Given a buffer of std::bytes that hold a variable sized integer, decode them into an unsigned integer.
 

Detailed Description

Functions to extract arithmetic binary values from a byte buffer (in either endian order) to native format; conversely, given an arithmetic binary value, append it to a buffer of bytes in the specified endian order.

The functions in this file are low-level. They extract fundamental integral and std::byte values from std::byte buffers and append such values to them. They are meant to be the lower layer of the serializing utilities; the next higher layer provides buffer management, sequences, and overloads for specific types such as std::string, bool, and std::optional.

Note
The variable sized integer functions (extract_var_int, append_var_int) support the variable byte integer type in MQTT (Message Queuing Telemetry Transport), a commonly used IoT protocol. The code in this header is adapted from a Techoverflow.net article by Uli Koehler, published under the CC0 1.0 Universal license: https://techoverflow.net/2013/01/25/efficiently-encoding-variable-length-integers-in-cc/
Author
Cliff Green, Roxanne Agerone, Uli Koehler

Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE.txt or copy at http://www.boost.org/LICENSE_1_0.txt)

Function Documentation

◆ append_val()

template<std::endian BufEndian, integral_or_byte T>
std::size_t chops::append_val ( std::byte * buf,
const T & val )
constexpr noexcept

Append an integral or std::byte value to a std::byte buffer, swapping into the specified endian order as needed.

Template Parameters
BufEndian  The endianness of the buffer.
T  Type of value to append to buffer.

The BufEndian value must be specified explicitly, but the type T of the passed-in value can be deduced.

Parameters
buf  Pointer to an array of std::bytes big enough to hold the bytes of the value.
val  Value in native endian order to append to buf.
Returns
Number of bytes copied into the std::byte buffer.
Precondition
The buffer must already be allocated to hold at least sizeof(T) bytes.
Note
See the note under extract_val() about floating point values.

◆ append_var_int()

template<std::unsigned_integral T>
std::size_t chops::append_var_int ( std::byte * output,
T val )
constexpr

Encode an unsigned integer into a variable length buffer of bytes using the MSB (most significant bit) algorithm.

Given an integer, store the value in one or more bytes depending on its magnitude. Each output byte carries seven bits of the value, least significant group first. A value under 128 is stored in one byte. For a value greater than 127, the most significant bit of the first byte is set to 1 to signal that another byte follows, and the remaining bits go into the next byte. This logic is repeated as necessary.

This algorithm optimizes space when most of the values are small. If most of the values are large, this algorithm is inefficient, needing more buffer space for the encoded integers than if fixed size integer buffers were used.

The output of this function is (by definition) in little-endian order. However, as long as the two corresponding functions (or equivalent algorithms) are used consistently, the endianness of the platform does not matter. No byte swapping is performed, and decoding yields a value in the native endianness of the platform. In other words, this works whether the rest of the serialization is big-endian or little-endian.

Note
Signed types are not supported.
Parameters
val  The input value. Any standard unsigned integer type is allowed.
output  A pointer to a preallocated array of std::bytes big enough for the output. A safe minimum size is 3 bytes for 16 bit integers, 5 bytes for 32 bit integers, and 10 bytes for 64 bit integers.
Returns
The number of bytes written to the output array.
Precondition
The output buffer must already be allocated large enough to hold the result.

◆ extract_val()

template<std::endian BufEndian, integral_or_byte T>
T chops::extract_val ( const std::byte * buf)
constexpr noexcept

Extract a value from a std::byte buffer into a fundamental integral or std::byte type in native endianness, swapping bytes as needed.

Template Parameters
BufEndian  The endianness of the std::byte buffer.
T  Type of return value.

Since T cannot be deduced, it must be specified when calling the function. If the endianness of the buffer matches the native endianness, no swapping is performed.

Parameters
buf  Pointer to an array of std::bytes containing an object of type T in the byte order given by BufEndian.
Returns
A value in native endian order.
Precondition
The buffer must contain at least sizeof(T) bytes.
Note
Floating point swapping is not supported.

Earlier versions supported floating point, but that support was brittle: the floating point representation must match exactly on both sides of the serialization (most modern processors use IEEE 754 floating point representations). A byte-swapped floating point value cannot be safely accessed directly (e.g. passed by value), because its bit pattern may represent a NaN value, which can generate hardware traps, either causing runtime crashes or silently changing bits within the floating point number. An integer value, by contrast, always has a valid bit pattern, even when byte swapped.

◆ extract_var_int()

template<std::unsigned_integral T>
T chops::extract_var_int ( const std::byte * input,
std::size_t input_size )
constexpr

Given a buffer of std::bytes that hold a variable sized integer, decode them into an unsigned integer.

For consistency with the append_var_int function, only unsigned integers are supported for the output type of this function.

Note
Signed types are not supported.
Parameters
input  A variable-length encoded integer stored in a buffer of std::bytes.
input_size  Number of bytes representing the integer.
Returns
The value in native unsigned integer format.