Bit Packing Libraries — API & Interface Research

June 21, 2026 at 12:26 UTC

Research across: dgryski/go-bitstream (Go), Prometheus chunkenc, InfluxDB tsm1, Rust bitstream-io, Python bitstring, Java BitSet.


Common API Pattern

The stream-oriented implementations surveyed (dgryski, Prometheus, InfluxDB, bitstream-io) converge on the same shape:

  • Separate Reader and Writer types — not a single bidirectional object
  • Single bit: ReadBit() (bool, error) / WriteBit(bool)
  • N bits: ReadBits(n) (uint64, error) / WriteBits(v uint64, n)
  • Finalise writes: Flush()
  • Errors returned as values, not stored in struct (the InfluxDB approach of r.Err() is the outlier — less idiomatic)
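The shape above can be sketched as a pair of small in-memory types. This is a minimal illustration, not code from any of the libraries surveyed: the Writer and Reader types, their fields, the MSB-first bit order, and the zero padding of the final byte are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"io"
)

// Writer packs bits MSB-first into a byte slice.
// Hypothetical type for illustration only.
type Writer struct {
	buf  []byte
	free uint8 // unused low bits in the last byte; 0 means start a new byte
}

// WriteBit appends one bit.
func (w *Writer) WriteBit(bit bool) {
	if w.free == 0 {
		w.buf = append(w.buf, 0)
		w.free = 8
	}
	w.free--
	if bit {
		w.buf[len(w.buf)-1] |= 1 << w.free
	}
}

// WriteBits appends the n least significant bits of v, most significant first.
func (w *Writer) WriteBits(v uint64, n uint8) {
	for i := int(n) - 1; i >= 0; i-- {
		w.WriteBit(v>>uint(i)&1 == 1)
	}
}

// Flush returns the packed bytes; a partial final byte is zero-padded.
func (w *Writer) Flush() []byte { return w.buf }

// Reader consumes bits MSB-first from a byte slice.
type Reader struct {
	buf []byte
	pos int // index of the next bit to read
}

// ReadBit returns the next bit, or io.EOF when the input is exhausted.
func (r *Reader) ReadBit() (bool, error) {
	if r.pos >= len(r.buf)*8 {
		return false, io.EOF
	}
	bit := r.buf[r.pos/8]>>(7-uint(r.pos%8))&1 == 1
	r.pos++
	return bit, nil
}

// ReadBits reads n bits and returns them in the low bits of the result.
func (r *Reader) ReadBits(n uint8) (uint64, error) {
	var v uint64
	for i := uint8(0); i < n; i++ {
		bit, err := r.ReadBit()
		if err != nil {
			return 0, err
		}
		v <<= 1
		if bit {
			v |= 1
		}
	}
	return v, nil
}

func main() {
	w := &Writer{}
	w.WriteBit(true)
	w.WriteBits(0x2A, 7) // 0101010
	out := w.Flush()
	fmt.Printf("%08b\n", out[0]) // 10101010

	r := &Reader{buf: out}
	first, _ := r.ReadBit()
	rest, _ := r.ReadBits(7)
	fmt.Println(first, rest) // true 42
}
```

Keeping Reader and Writer separate means each type carries only the state it needs: a free-bit count on the write side and a bit position on the read side.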

Library-by-Library Breakdown

Go: dgryski/go-bitstream

func NewReader(r io.Reader) *BitReader
func NewWriter(w io.Writer) *BitWriter

type Bit bool

func (b *BitReader) ReadBit() (Bit, error)
func (b *BitReader) ReadBits(nbits int) (uint64, error)

func (b *BitWriter) WriteBit(bit Bit) error
func (b *BitWriter) WriteBits(u uint64, nbits int) error
func (b *BitWriter) Flush(bit Bit) error

The reference Go implementation. Wraps io.Reader/io.Writer — stream-oriented. Flush pads the final byte with the supplied fill bit.
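To make the padding behaviour concrete, here is a small sketch of what filling a partial byte with a fill bit amounts to. The padFinalByte helper is hypothetical and written for this note. It mirrors the behaviour described above rather than reproducing the library's code, and it assumes written bits occupy the high end of the byte.

```go
package main

import "fmt"

// padFinalByte completes a partially written byte. The `used` high bits of
// partial are kept; the remaining low bits are all set to the fill bit.
// Hypothetical helper, not part of go-bitstream.
func padFinalByte(partial byte, used uint8, fill bool) byte {
	if used >= 8 {
		return partial // nothing left to pad
	}
	mask := byte(0xFF) >> used // selects the unused low bits
	if fill {
		return partial | mask
	}
	return partial &^ mask
}

func main() {
	// Three bits written so far (101), stored in the high bits: 0xA0.
	partial := byte(0xA0)
	fmt.Printf("%08b\n", padFinalByte(partial, 3, false)) // 10100000
	fmt.Printf("%08b\n", padFinalByte(partial, 3, true))  // 10111111
}
```

Whichever fill bit is used, a decoder cannot tell padding from data by looking at the byte alone, so the format has to carry a bit count or an end marker.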


Prometheus: chunkenc/xor.go (internal bstream)

type bstream struct {
    stream []byte
    count  uint8
}

func (b *bstream) writeBit(v bit)
func (b *bstream) writeByte(byt byte)
func (b *bstream) writeBits(u uint64, nbits int)

func (b *bstreamReader) readBit() (bit, error)
func (b *bstreamReader) readBits(nbits int) (uint64, error)
func (b *bstreamReader) readByte() (byte, error)

Internal (unexported) but arguably the most battle-tested Go implementation. Same fundamental shape as dgryski. bit is just a bool typedef.
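The stream-plus-count layout is easiest to understand by following a byte written across a byte boundary. The sketch below reuses the struct and method names listed above, but the method bodies are an assumption about one reasonable implementation and should not be read as the Prometheus source. Here count is taken to mean the number of free bits left in the last byte.

```go
package main

import "fmt"

type bit bool

// bstream follows the struct listed above.
type bstream struct {
	stream []byte
	count  uint8 // assumed meaning: free bits remaining in the last byte
}

func (b *bstream) writeBit(v bit) {
	if b.count == 0 {
		b.stream = append(b.stream, 0)
		b.count = 8
	}
	if v {
		b.stream[len(b.stream)-1] |= 1 << (b.count - 1)
	}
	b.count--
}

// writeByte splits byt across the current byte and a fresh one,
// so count is unchanged afterwards.
func (b *bstream) writeByte(byt byte) {
	if b.count == 0 {
		b.stream = append(b.stream, 0)
		b.count = 8
	}
	i := len(b.stream) - 1
	b.stream[i] |= byt >> (8 - b.count)   // high bits fill the free space
	b.stream = append(b.stream, byt<<b.count) // low bits spill into a new byte
}

// writeBits writes the nbits low bits of u, using whole bytes where possible.
func (b *bstream) writeBits(u uint64, nbits int) {
	u <<= 64 - uint(nbits) // left-align the payload
	for nbits >= 8 {
		b.writeByte(byte(u >> 56))
		u <<= 8
		nbits -= 8
	}
	for nbits > 0 {
		b.writeBit(u>>63 == 1)
		u <<= 1
		nbits--
	}
}

func main() {
	b := &bstream{}
	b.writeBit(true)
	b.writeBits(0xABC, 12)
	fmt.Printf("%08b %08b free=%d\n", b.stream[0], b.stream[1], b.count)
	// 11010101 11100000 free=3
}
```

Writing whole bytes first keeps the per-bit loop to at most seven iterations per call.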


InfluxDB: tsm1 BitReader/BitWriter

type BitReader struct {
    buf [8]byte
    // ...
    err error  // ← stored in struct, not returned
}

func (r *BitReader) ReadBit() bool  // error via r.Err()
func (r *BitReader) ReadBits(nbits int) uint64

func (w *BitWriter) WriteBit(v bool)
func (w *BitWriter) WriteBits(u uint64, nbits int)
func (w *BitWriter) Flush()

Stores the error in the struct: callers check r.Err() after reads. This is less idiomatic Go. It works, but every call site has to remember the check, which makes the error-handling model harder to compose.
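The difference between the two error models shows up at the call site. The sketch below contrasts them using two hypothetical readers, stickyReader and valueReader, written for this note; neither is InfluxDB's code, and both read from a slice of bools to keep the example short.

```go
package main

import (
	"errors"
	"fmt"
)

var errShort = errors.New("bitreader: out of data")

// stickyReader records the first failure and turns later reads into no-ops.
// Hypothetical type illustrating the stored-error model.
type stickyReader struct {
	bits []bool
	pos  int
	err  error
}

func (r *stickyReader) ReadBit() bool {
	if r.err != nil {
		return false
	}
	if r.pos >= len(r.bits) {
		r.err = errShort
		return false
	}
	b := r.bits[r.pos]
	r.pos++
	return b
}

func (r *stickyReader) Err() error { return r.err }

// valueReader returns the error with every read.
// Hypothetical type illustrating the returned-error model.
type valueReader struct {
	bits []bool
	pos  int
}

func (r *valueReader) ReadBit() (bool, error) {
	if r.pos >= len(r.bits) {
		return false, errShort
	}
	b := r.bits[r.pos]
	r.pos++
	return b, nil
}

func main() {
	s := &stickyReader{bits: []bool{true}}
	a, b := s.ReadBit(), s.ReadBit() // the second read fails silently
	fmt.Println(a, b, s.Err())       // true false bitreader: out of data

	v := &valueReader{bits: []bool{true}}
	_, err1 := v.ReadBit()
	_, err2 := v.ReadBit() // the failure is visible where it happens
	fmt.Println(err1, err2) // <nil> bitreader: out of data
}
```

With the stored-error model, a false bit and a failed read look identical until Err() is consulted, so a decoder can go on consuming bogus zeros. The returned-error model forces the decision at each call.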


Rust: bitstream-io

let mut reader = BitReader::endian(cursor, BigEndian);
let bit: bool = reader.read_bit()?;
let value: u32 = reader.read(8)?;  // type inferred from context

let mut writer = BitWriter::endian(vec, BigEndian);
writer.write_bit(true)?;
writer.write(8, 255u32)?;
writer.byte_align()?;

Same conceptual shape. Endianness is explicit (constructor param). byte_aligned() check available. Type-safe via generics — no casting to uint64 needed.


Python: bitstring

s = ConstBitStream(bytes=data)
value = s.read('uint:8')   # format string
bool_val = s.read('bool')

bs = BitStream()
bs.append('uint:8=255')
bs.append('bool=True')

Format-string based. High-level but heavyweight. Not suitable as a model for a low-level library.


Java: java.util.BitSet

BitSet bs = new BitSet(64);
bs.set(3);    // set bit at index 3
bs.get(3);    // read bit at index 3

Index-based, not stream-oriented. Not comparable: it serves a different use case (a set of individually addressable bit flags, not packed encoding).


Your API vs the Field

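Method               | Your API                        | dgryski                                | Prometheus                    | InfluxDB
---------------------+---------------------------------+----------------------------------------+-------------------------------+------------------------------
Write single bit     | WriteBit(bool)                  | WriteBit(Bit) error                    | writeBit(bit)                 | WriteBit(bool)
Write n bits         | WriteBits(uint64, uint8)        | WriteBits(uint64, int) error           | writeBits(uint64, int)        | WriteBits(uint64, int)
Flush + get bytes    | Flush() []byte                  | Flush(Bit) error (writes to io.Writer) | n/a (access .stream)          | Flush() (writes to io.Writer)
Non-destructive peek | Snapshot() []byte               | none found                             | none found                    | none found
Read single bit      | ReadBit() (bool, error)         | ReadBit() (Bit, error)                 | readBit() (bit, error)        | ReadBit() bool
Read n bits          | ReadBits(uint8) (uint64, error) | ReadBits(int) (uint64, error)          | readBits(int) (uint64, error) | ReadBits(int) uint64

(Bit in the dgryski column and bit in the Prometheus column are each library's named bool type.)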

Where your API differs:

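  1. uint8 for bit count (vs a wider integer type elsewhere): a tighter contract that signals the small valid range. It still admits values up to 255, so the 64-bit maximum needs a runtime check or documented behaviour.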
  2. Flush() []byte returns bytes directly — more convenient than requiring an io.Writer. No equivalent found elsewhere; it's a clean ergonomic improvement.
  3. Snapshot() []byte — no equivalent found. Non-destructive mid-stream peek at current bytes. Useful for the TychoDB use case.
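The distinction between Snapshot() and Flush() can be sketched as follows. The Writer below is a hypothetical stand-in for the API under review, not its actual code. It assumes that Snapshot returns a copy, so later writes cannot change it, and that Flush hands back the buffer and resets the writer. In this sketch a bit count above 64 simply produces leading zero bits.

```go
package main

import "fmt"

// Writer is a hypothetical stand-in for the API under review.
type Writer struct {
	buf  []byte
	free uint8 // unused low bits in the last byte
}

// WriteBits appends the n least significant bits of v, most significant first.
func (w *Writer) WriteBits(v uint64, n uint8) {
	for i := int(n) - 1; i >= 0; i-- {
		if w.free == 0 {
			w.buf = append(w.buf, 0)
			w.free = 8
		}
		w.free--
		if v>>uint(i)&1 == 1 {
			w.buf[len(w.buf)-1] |= 1 << w.free
		}
	}
}

// Snapshot returns a copy of the bytes written so far (zero-padded)
// and leaves the writer untouched.
func (w *Writer) Snapshot() []byte {
	out := make([]byte, len(w.buf))
	copy(out, w.buf)
	return out
}

// Flush returns the packed bytes and resets the writer (assumed semantics).
func (w *Writer) Flush() []byte {
	out := w.buf
	w.buf, w.free = nil, 0
	return out
}

func main() {
	w := &Writer{}
	w.WriteBits(0x5, 3) // 101
	snap := w.Snapshot()
	w.WriteBits(0x1F, 5) // 11111 fills the rest of the same byte
	out := w.Flush()
	fmt.Printf("%08b %08b\n", snap[0], out[0]) // 10100000 10111111
}
```

Copying in Snapshot costs an allocation per call. Returning the internal slice would be cheaper, but later writes into a shared partial byte would then show through to the caller.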

What "Hiding Complexity" Means in Practice

The libraries that feel simple share two traits:

1. No configuration on construction. NewWriter() takes no options. You write bits, call Flush, get bytes. That's it. Complexity (endianness, padding, buffering) is handled internally with sensible defaults.

2. uint64 as the universal value type. Callers don't think about int sizes. Pass a uint64, specify how many bits. Casting is the caller's problem if they need a smaller type, not the library's.
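As an illustration of what caller-side casting looks like, the sketch below converts signed and floating-point values to and from uint64 fields. The packSigned and unpackSigned helpers are hypothetical names invented for this example. The float conversion uses the standard library's math.Float64bits and math.Float64frombits.

```go
package main

import (
	"fmt"
	"math"
)

// packSigned keeps the n low bits of v (two's complement), ready to be
// passed to a WriteBits-style call. Hypothetical helper.
func packSigned(v int64, n uint8) uint64 {
	if n >= 64 {
		return uint64(v)
	}
	return uint64(v) & (1<<n - 1)
}

// unpackSigned sign-extends an n-bit field back to int64.
// Hypothetical helper.
func unpackSigned(u uint64, n uint8) int64 {
	if n == 0 || n >= 64 {
		return int64(u)
	}
	shift := 64 - n
	return int64(u<<shift) >> shift // arithmetic shift restores the sign
}

func main() {
	// A signed delta stored in a 5-bit field.
	u := packSigned(-3, 5)
	fmt.Println(u, unpackSigned(u, 5)) // 29 -3

	// A float64 travels as its IEEE 754 bit pattern in a 64-bit field.
	f := math.Float64frombits(math.Float64bits(1.5))
	fmt.Println(f) // 1.5
}
```

The library only ever sees a uint64 and a width. The caller decides how a value maps onto those bits.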

The libraries that feel complex require either format strings (Python bitstring) or index-based access (Java BitSet) — neither matches the mental model of "stream of bits".


Verdict

Your existing API is already at the right level of abstraction. It matches Prometheus's internal bstream (arguably the most battle-tested Go implementation) almost line for line, with three refinements:

  • Flush() returns []byte directly instead of writing to an io.Writer
  • uint8 for bit count is more honest than int
  • Snapshot() is a genuinely useful addition with no equivalent elsewhere

No changes needed to the API design. It is simple, it hides complexity, and it aligns with the stream-oriented Go implementations surveyed above.