Notice
This project was made with 3-5 users in mind. Thus I am only developing it until the group's needs are met.
If you are planning to use this in any capacity, please reach out for support, feature requests, and anything else.
Contact information available on the main page.
This is a heavy work in progress. I will remove this part of the notice when no more major changes are planned. Until then, this document may change without notice.
OTBV file format specification
Table of contents
- 1. Introduction
- 2. Data representation
- 3. File structure
- 4. Encoder implementation guidelines
- 5. Decoder implementation guidelines
1. Introduction
1.1 Rationale
The Octree Binary Volume (OTBV) format specifies the algorithm for compressing and storing binary volumetric data. The compression algorithm is lossless. The format has been developed for the specific purpose of optimising the size of large datasets.
The OTBV format was initially created as a part of a research project concerning large amounts of structural data. The size of the datasets used (>1M samples at 128^3 resolution) made them impractical to transfer between computers working on different parts of the project. Given the known properties of the data (exactly 3 dimensions, binary values), the OTBV file format was developed to minimise the file size of the dataset.
1.2 Scope
This specification defines:
- The algorithms for data representation (Section 2)
- The algorithms for data encoding and decoding
- The structure of an .otbv file
- The expected encoder and decoder software behaviour
The format does not specify any computational optimisations related to volumetric data. Any details regarding such are up to the implementation of a specific encoder/decoder.
1.3 Possible future extensions
It is understood that the current specification of the format makes its use limited. As of now, this is by design, as the specificity allows for maximum data density. In the future, it is possible to extend the format to allow for different data types. The other limitation is the requirement that the data be exactly 3-dimensional. While this can potentially be generalised, such an extension is better suited to a separate format.
2. Data representation
2.1 Conventions
Numeric values are stored with bytes in network order (MSB first). All integers are unsigned, unless specified otherwise.
2.2 Octree encoding
The data part that represents the octree is encoded with the following pattern:
- A Leaf Node is represented by 2 bits. The first bit is always "0", and the second bit represents the value of the Leaf Node (either "0" or "1").
- An Internal Node is represented by a single "1" bit. The data following an Internal Node represents the first child (see Volume encoding ordering), until the child's termination, after which follows the second child's data, and so on, until the last child terminates.
If the volume is not homogeneous, the first node is always an Internal Node. The termination of the last child of the root Internal Node marks the end of the Data chunk.
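The bit pattern above can be illustrated with a minimal Python sketch. It is not a reference implementation. The names `CHILD_OFFSETS`, `encode_node`, and `decode_node` are hypothetical, the volume is assumed to be a nested list `volume[x][y][z]` of 0/1 values with a power-of-2 edge length, and the child order used here is only a placeholder assumption, because this document does not yet define the ordering.

```python
# ASSUMPTION: placeholder child order (x varies slowest, z fastest).
# The specification's ordering section is not written yet.
CHILD_OFFSETS = [(dx, dy, dz) for dx in (0, 1) for dy in (0, 1) for dz in (0, 1)]


def encode_node(volume, x, y, z, size, bits):
    """Append the bits for the cube of edge `size` whose corner is (x, y, z)."""
    first = volume[x][y][z]
    homogeneous = all(
        volume[i][j][k] == first
        for i in range(x, x + size)
        for j in range(y, y + size)
        for k in range(z, z + size)
    )
    if homogeneous:
        bits.extend([0, first])  # Leaf Node: "0", then the value bit
        return
    bits.append(1)  # Internal Node: a single "1" bit, children follow
    half = size // 2
    for dx, dy, dz in CHILD_OFFSETS:
        encode_node(volume, x + dx * half, y + dy * half, z + dz * half, half, bits)


def decode_node(bits, pos, volume, x, y, z, size):
    """Read one node starting at bits[pos], fill `volume`, return the next position."""
    if bits[pos] == 0:
        value = bits[pos + 1]
        for i in range(x, x + size):
            for j in range(y, y + size):
                for k in range(z, z + size):
                    volume[i][j][k] = value
        return pos + 2
    pos += 1  # skip the Internal Node marker
    half = size // 2
    for dx, dy, dz in CHILD_OFFSETS:
        pos = decode_node(bits, pos, volume, x + dx * half, y + dy * half, z + dz * half, half)
    return pos
```

In this sketch a homogeneous volume encodes to a single Leaf Node (2 bits), and the position returned by `decode_node` for the root node marks the end of the octree data.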
2.3 Volume encoding ordering
3. File structure
The OTBV file format contains the following chunks, in order:
- Signature
- Metadata
- Data
The lengths of the Signature and the Metadata chunks are constant. The length of the Data chunk is stored in the Metadata chunk.
3.1 Signature
The first 5 bytes of the file compose the signature which identifies the OTBV file format. All OTBV files must have a valid signature.
The signature bytes are the same for all OTBV files:
HEX   4f 54 42 56 96
ASCII O  T  B  V  \226
The first 4 bytes name the file type. The 5th byte is a non-ASCII character, to prevent misidentification of text files that start with the letters "OTBV" as OTBV files.
If a decoder fails to validate the signature, the file should not be read further and the user should be notified that the file is malformed.
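A signature check along these lines could look like the following Python sketch. The names `OTBV_SIGNATURE` and `has_valid_signature` are hypothetical and not part of the format. How the user is notified is left to the implementation.

```python
# The 5 signature bytes given above: "O", "T", "B", "V", then 0x96.
OTBV_SIGNATURE = bytes([0x4F, 0x54, 0x42, 0x56, 0x96])


def has_valid_signature(data: bytes) -> bool:
    """Return True if `data` starts with the OTBV signature."""
    return data[:len(OTBV_SIGNATURE)] == OTBV_SIGNATURE
```

A decoder would typically call this on the first bytes of the file and stop reading if it returns False.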
3.2 Metadata
The Metadata chunk stores additional data needed to read the Data chunk.
The first 3 bits of the first byte (the most significant bits, with values 128, 64, and 32) store the number of padding bits at the start of the Data chunk.
Bit 4 of the first byte denotes whether the volume was padded to a cube when encoding.
- If this bit is "0", the decoder should only read the X resolution and assume the volume is cubic.
- If this bit is "1", the decoder should read all dimensions. See the proper algorithm in 5. Decoder implementation guidelines.
Bits 5-8 of the first byte are reserved for custom flags.
Bytes 2-5 store the edge length (resolution) of the volume. If bit 4 of the first byte is set, this is the X resolution.
Bytes 6-9 and 10-13 store the Y and Z resolution respectively (if bit 4 of the first byte is not set, these should be 0).
Bytes 14-17 store the length (in bytes) of the Data portion of the file.
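As an illustration of this layout, the following Python sketch unpacks the 17-byte Metadata chunk with the standard `struct` module. The names `Metadata`, `METADATA_FORMAT`, and `parse_metadata` are hypothetical. The sketch assumes that bit 1 is the most significant bit of the first byte (so bit 4 has the value 16) and that each resolution field and the Data length are 4-byte unsigned integers in network order.

```python
import struct
from typing import NamedTuple

# One flags byte followed by four big-endian unsigned 32-bit integers.
METADATA_FORMAT = ">BIIII"
METADATA_SIZE = struct.calcsize(METADATA_FORMAT)  # 17 bytes


class Metadata(NamedTuple):
    padding_bits: int     # bits 1-3 of the first byte
    padded_to_cube: bool  # bit 4 of the first byte
    custom_flags: int     # bits 5-8 of the first byte
    x: int                # bytes 2-5
    y: int                # bytes 6-9
    z: int                # bytes 10-13
    data_length: int      # bytes 14-17, length of the Data chunk in bytes


def parse_metadata(chunk: bytes) -> Metadata:
    """Unpack a Metadata chunk; raise ValueError if its length is wrong."""
    if len(chunk) != METADATA_SIZE:
        raise ValueError(f"expected {METADATA_SIZE} bytes, got {len(chunk)}")
    flags, x, y, z, data_length = struct.unpack(METADATA_FORMAT, chunk)
    return Metadata(
        padding_bits=flags >> 5,
        padded_to_cube=bool(flags & 0x10),
        custom_flags=flags & 0x0F,
        x=x,
        y=y,
        z=z,
        data_length=data_length,
    )
```

The chunk passed in would be the 17 bytes that follow the 5-byte signature.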
3.3 Data
The Data chunk stores the binary representation of the octree that encodes the volume.
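One possible way to turn the Data chunk into a sequence of octree bits is sketched below in Python. The function name `data_bits` is hypothetical. The sketch assumes that bits within each byte are read from the most significant bit first, which this document does not state explicitly, and that the padding bits counted in the Metadata chunk are simply skipped at the start.

```python
from typing import List


def data_bits(data: bytes, padding_bits: int) -> List[int]:
    """Expand the Data chunk into 0/1 values and drop the leading padding bits."""
    # ASSUMPTION: within each byte, the most significant bit comes first.
    bits = [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]
    return bits[padding_bits:]
```

For example, a one-byte Data chunk with 3 padding bits yields the 5 remaining bits of that byte.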
4. Encoder implementation guidelines
5. Decoder implementation guidelines
Decoding the resolution
If bit 4 of the first byte of the Metadata chunk is not set, the volume should be read as cubic. In this case, bytes 2-5 of the Metadata chunk store the edge length of the volume. If that number is X, the volume should be interpreted as having a resolution of X*X*X.
If bit 4 is set, bytes 2-5 store the real resolution in the X dimension, and bytes 6-9 and 10-13 store the Y and Z resolutions respectively. The real resolution of the volume is X*Y*Z. The encoded volume has a resolution of N*N*N, where N is the smallest power of 2 that is larger than X, Y, and Z. The decoder should interpret the data in the file as being of resolution N*N*N, then trim the resulting volume to X*Y*Z.
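The two cases can be sketched in Python as follows. The function names `real_resolution` and `trim_volume` are hypothetical, and the decoded volume is assumed to be a nested list `volume[x][y][z]`. The sketch also assumes that the real data occupies the low-index corner of the padded cube, which this document does not specify.

```python
def real_resolution(padded_to_cube, x, y, z):
    """Return the real (X, Y, Z) resolution from the Metadata fields."""
    if padded_to_cube:
        return (x, y, z)  # bit 4 set: all three fields are meaningful
    return (x, x, x)      # bit 4 not set: cubic volume with edge length X


def trim_volume(volume, x, y, z):
    """Cut an N*N*N nested-list volume down to X*Y*Z.

    ASSUMPTION: padding was added at the high-index end of each axis.
    """
    return [[row[:z] for row in plane[:y]] for plane in volume[:x]]
```

A decoder following this sketch would decode the full N*N*N cube first and call `trim_volume` only when bit 4 is set.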