# TychoDB — Learn Go With Tests Guide Build the full TychoDB time series store test-first. ~80 minutes. **Prerequisite:** `github.com/tychodb/bitpack` must be built and passing tests before you start. Guide: https://pasteai.io/d/afbbf561-892b-488d-8a62-0f264d45f26e The pattern throughout: write a failing test → `go test ./...` → write minimum code to pass → move on. --- ## Setup (3 min) ```bash mkdir tychodb && cd tychodb go mod init tychodb touch tsdb.go tsdb_test.go ``` `tsdb.go` — imports only: ```go package tsdb import ( "math" "math/bits" "sort" "strings" "github.com/tychodb/bitpack" ) ``` `tsdb_test.go`: ```go package tsdb import "testing" ``` **Local replace directive** — until `bitpack` is published on GitHub: ```bash go mod edit -replace github.com/tychodb/bitpack=../bitpack go mod edit -require github.com/tychodb/bitpack@v0.0.0 ``` Your `go.mod`: ``` module tychodb go 1.22 require github.com/tychodb/bitpack v0.0.0 replace github.com/tychodb/bitpack => ../bitpack ``` When `bitpack` is published, remove the `replace` line and run `go get github.com/tychodb/bitpack@v0.1.0`. --- ## Stage 1 — The Array (15 min) ### Test 1: Write and query ```go func TestArrayStore_WriteAndQuery(t *testing.T) { s := &ArrayStore{} labels := map[string]string{"host": "web-01"} s.Write("cpu", labels, 100, 42.0) s.Write("cpu", labels, 200, 43.0) s.Write("cpu", labels, 300, 44.0) got := s.Query("cpu", labels, 100, 200) if len(got) != 2 { t.Fatalf("want 2 samples, got %d", len(got)) } if got[0].Value != 42.0 { t.Errorf("want 42.0, got %f", got[0].Value) } if got[1].Timestamp != 200 { t.Errorf("want ts=200, got %d", got[1].Timestamp) } } ``` ```go type Sample struct { Timestamp int64 Value float64 } type Store interface { Write(metric string, labels map[string]string, ts int64, val float64) Query(metric string, labels map[string]string, from, to int64) []Sample } type RawSample struct { Metric string Labels map[string]string Timestamp int64 Value float64 } type ArrayStore struct { samples []RawSample } func (s *ArrayStore) Write(metric string, labels map[string]string, ts int64, val float64) { s.samples = append(s.samples, RawSample{ Metric: metric, Labels: labels, Timestamp: ts, Value: val, }) } func (s *ArrayStore) Query(metric string, labels map[string]string, from, to int64) []Sample { var result []Sample for _, sample := range s.samples { if sample.Metric == metric && sample.Labels["host"] == labels["host"] && sample.Timestamp >= from && sample.Timestamp <= to { result = append(result, Sample{sample.Timestamp, sample.Value}) } } return result } ``` ### Test 2: Label isolation (free) ```go func TestArrayStore_LabelIsolation(t *testing.T) { s := &ArrayStore{} s.Write("cpu", map[string]string{"host": "web-01"}, 100, 1.0) s.Write("cpu", map[string]string{"host": "web-02"}, 100, 2.0) got := s.Query("cpu", map[string]string{"host": "web-01"}, 0, 200) if len(got) != 1 || got[0].Value != 1.0 { t.Errorf("label isolation failed: got %v", got) } } ``` PASS immediately. --- ## Stage 2 — HashMap (20 min) ### Test 3: seriesKey is deterministic ```go func TestSeriesKey_Deterministic(t *testing.T) { a := seriesKey("cpu", map[string]string{"host": "web-01", "env": "prod"}) b := seriesKey("cpu", map[string]string{"env": "prod", "host": "web-01"}) if a != b { t.Errorf("keys differ: %q vs %q", a, b) } want := `cpu{env=prod,host=web-01}` if a != want { t.Errorf("want %q, got %q", want, a) } } ``` ```go func seriesKey(metric string, labels map[string]string) string { keys := make([]string, 0, len(labels)) for k := range labels { keys = append(keys, k) } sort.Strings(keys) var b strings.Builder b.WriteString(metric) b.WriteByte('{') for i, k := range keys { if i > 0 { b.WriteByte(',') } b.WriteString(k) b.WriteByte('=') b.WriteString(labels[k]) } b.WriteByte('}') return b.String() } ``` ### Test 4: HashMapStore write and binary search query ```go func TestHashMapStore_WriteAndQuery(t *testing.T) { s := NewHashMapStore() labels := map[string]string{"host": "web-01"} for i := 0; i < 100; i++ { s.Write("cpu", labels, int64(i*60), float64(i)) } // ts 120 to 300 = i=2,3,4,5 → 4 samples got := s.Query("cpu", labels, 120, 300) if len(got) != 4 { t.Fatalf("want 4, got %d: %v", len(got), got) } if got[0].Timestamp != 120 { t.Errorf("wrong first sample: %+v", got[0]) } } ``` ```go type HashMapStore struct { series map[string][]Sample } func NewHashMapStore() *HashMapStore { return &HashMapStore{series: make(map[string][]Sample)} } func (s *HashMapStore) Write(metric string, labels map[string]string, ts int64, val float64) { key := seriesKey(metric, labels) s.series[key] = append(s.series[key], Sample{ts, val}) } func (s *HashMapStore) Query(metric string, labels map[string]string, from, to int64) []Sample { key := seriesKey(metric, labels) samples := s.series[key] lo := sort.Search(len(samples), func(i int) bool { return samples[i].Timestamp >= from }) hi := sort.Search(len(samples), func(i int) bool { return samples[i].Timestamp > to }) return samples[lo:hi] } ``` ### Test 5: Series isolation (free) ```go func TestHashMapStore_SeriesIsolation(t *testing.T) { s := NewHashMapStore() s.Write("cpu", map[string]string{"host": "web-01"}, 100, 1.0) s.Write("cpu", map[string]string{"host": "web-02"}, 100, 2.0) s.Write("mem", map[string]string{"host": "web-01"}, 100, 3.0) got := s.Query("cpu", map[string]string{"host": "web-01"}, 0, 200) if len(got) != 1 || got[0].Value != 1.0 { t.Errorf("got %v", got) } } ``` PASS immediately. --- ## Stage 3 — Gorilla Compression (45 min) The bit-stream layer is already done — import it. Build zigzag → chunk encoding → store. ### Test 6: ZigZag is symmetric and small ```go func TestZigZag(t *testing.T) { cases := []int64{0, 1, -1, 100, -100, 1<<32 - 1, -(1 << 32)} for _, v := range cases { if got := zigzagDecode(zigzagEncode(v)); got != v { t.Errorf("round-trip(%d) = %d", v, got) } } if zigzagEncode(-1) != 1 { t.Errorf("zigzagEncode(-1) want 1, got %d", zigzagEncode(-1)) } if zigzagEncode(1) != 2 { t.Errorf("zigzagEncode(1) want 2, got %d", zigzagEncode(1)) } } ``` ```go func zigzagEncode(v int64) uint64 { return uint64((v << 1) ^ (v >> 63)) } func zigzagDecode(n uint64) int64 { return int64((n >> 1) ^ -(n & 1)) } ``` ### Test 7: Chunk encode/decode round-trip ```go func TestChunkRoundtrip(t *testing.T) { samples := []Sample{ {1717756800, 42.3}, {1717756860, 42.1}, {1717756920, 42.4}, {1717756980, 42.4}, // same value — must cost 1 bit {1717757040, 0.0}, // edge: zero value } encoded := encodeChunk(samples) decoded := decodeChunk(encoded, len(samples)) if len(decoded) != len(samples) { t.Fatalf("want %d, got %d", len(samples), len(decoded)) } for i, s := range samples { if decoded[i].Timestamp != s.Timestamp { t.Errorf("[%d] ts: want %d got %d", i, s.Timestamp, decoded[i].Timestamp) } if decoded[i].Value != s.Value { t.Errorf("[%d] val: want %f got %f", i, s.Value, decoded[i].Value) } } } ``` **writeEncodedDOD** — timestamp delta-of-delta with prefix code: ```go func writeEncodedDOD(w *bitpack.Writer, dod int64) { zz := zigzagEncode(dod) switch { case zz == 0: w.WriteBit(false) case zz < 128: w.WriteBit(true); w.WriteBit(false); w.WriteBits(zz, 7) case zz < 512: w.WriteBit(true); w.WriteBit(true); w.WriteBit(false); w.WriteBits(zz, 9) case zz < 4096: w.WriteBit(true); w.WriteBit(true); w.WriteBit(true); w.WriteBit(false); w.WriteBits(zz, 12) default: w.WriteBit(true); w.WriteBit(true); w.WriteBit(true); w.WriteBit(true); w.WriteBits(zz, 64) } } ``` **writeXORValue** — float64 as XOR against previous: ```go func writeXORValue(w *bitpack.Writer, prev, cur float64) { xor := math.Float64bits(prev) ^ math.Float64bits(cur) if xor == 0 { w.WriteBit(false) return } w.WriteBit(true) leading := uint64(bits.LeadingZeros64(xor)) trailing := uint64(bits.TrailingZeros64(xor)) meaningful := 64 - leading - trailing w.WriteBits(leading, 6) w.WriteBits(meaningful, 6) w.WriteBits(xor>>trailing, uint8(meaningful)) } ``` **encodeChunk** — first sample in full, rest compressed: ```go func encodeChunk(samples []Sample) []byte { if len(samples) == 0 { return nil } w := bitpack.NewWriter() w.WriteBits(uint64(samples[0].Timestamp), 64) w.WriteBits(math.Float64bits(samples[0].Value), 64) prevDelta := int64(0) prev := samples[0] for _, s := range samples[1:] { delta := s.Timestamp - prev.Timestamp dod := delta - prevDelta prevDelta = delta writeEncodedDOD(w, dod) writeXORValue(w, prev.Value, s.Value) prev = s } return w.Flush() } ``` **decodeChunk** — exact mirror: `ReadBit` and `ReadBits` return errors. We discard them with `_` — a truncated chunk is a programming error, not a runtime condition the caller can recover from. ```go func decodeChunk(data []byte, n int) []Sample { if len(data) == 0 { return nil } r := bitpack.NewReader(data) samples := make([]Sample, n) ts, _ := r.ReadBits(64) samples[0].Timestamp = int64(ts) val, _ := r.ReadBits(64) samples[0].Value = math.Float64frombits(val) prevDelta := int64(0) prevTs := samples[0].Timestamp prevVal := samples[0].Value for i := 1; i < n; i++ { ones := 0 for ones < 4 { b, _ := r.ReadBit() if !b { break } ones++ } var zz uint64 switch ones { case 1: zz, _ = r.ReadBits(7) case 2: zz, _ = r.ReadBits(9) case 3: zz, _ = r.ReadBits(12) case 4: zz, _ = r.ReadBits(64) } delta := prevDelta + zigzagDecode(zz) prevDelta = delta prevTs += delta samples[i].Timestamp = prevTs if xorBit, _ := r.ReadBit(); xorBit { leadV, _ := r.ReadBits(6) leading := int(leadV) meaningfulV, _ := r.ReadBits(6) meaningful := int(meaningfulV) trailing := 64 - leading - meaningful xorV, _ := r.ReadBits(uint8(meaningful)) prevVal = math.Float64frombits(math.Float64bits(prevVal) ^ (xorV << uint(trailing))) } samples[i].Value = prevVal } return samples } ``` `go test ./...` — PASS. ### Test 8: GorillaStore write and query ```go func TestGorillaStore_WriteAndQuery(t *testing.T) { s := NewGorillaStore() labels := map[string]string{"host": "web-01"} for i := 0; i < 60; i++ { s.Write("cpu", labels, int64(i*60), float64(i)*0.5) } // ts 120 to 300 = i=2,3,4,5 → 4 samples got := s.Query("cpu", labels, 120, 300) if len(got) != 4 { t.Fatalf("want 4, got %d: %v", len(got), got) } if got[0].Timestamp != 120 { t.Errorf("wrong start: %+v", got[0]) } } ``` ```go type GorillaStore struct { chunks map[string][]byte counts map[string]int } func NewGorillaStore() *GorillaStore { return &GorillaStore{ chunks: make(map[string][]byte), counts: make(map[string]int), } } func (s *GorillaStore) Write(metric string, labels map[string]string, ts int64, val float64) { key := seriesKey(metric, labels) samples := decodeChunk(s.chunks[key], s.counts[key]) samples = append(samples, Sample{ts, val}) s.chunks[key] = encodeChunk(samples) s.counts[key] = len(samples) } func (s *GorillaStore) Query(metric string, labels map[string]string, from, to int64) []Sample { key := seriesKey(metric, labels) samples := decodeChunk(s.chunks[key], s.counts[key]) lo := sort.Search(len(samples), func(i int) bool { return samples[i].Timestamp >= from }) hi := sort.Search(len(samples), func(i int) bool { return samples[i].Timestamp > to }) return samples[lo:hi] } func (s *GorillaStore) EncodedBytes() int { n := 0 for _, chunk := range s.chunks { n += len(chunk) } return n } ``` --- ## Final: Blog Tests (5 min) Paste these in as-is — no new code. ```go const ( baseTimestamp = int64(1_717_756_800_000_000_000) scrapeInterval = int64(60_000_000_000) bytesPerRaw = 100 ) func TestCompressionSizes(t *testing.T) { cpuSeq := []float64{42.0, 42.0, 42.25, 42.25, 42.0} populate := func(s Store) { for _, host := range []string{"web-01", "web-02"} { labels := map[string]string{"host": host, "env": "prod"} for i := 0; i < 60; i++ { ts := baseTimestamp + int64(i)*scrapeInterval s.Write("cpu_usage", labels, ts, cpuSeq[i%len(cpuSeq)]) } } } a := &ArrayStore{} h := NewHashMapStore() g := NewGorillaStore() populate(a) populate(h) populate(g) n := float64(120) cases := []struct { name string bytes int }{ {"Array", len(a.samples) * bytesPerRaw}, {"HashMap", func() int { total := 0 for _, s := range h.series { total += len(s) * 16 } return total }()}, {"Gorilla", g.EncodedBytes()}, } for _, tc := range cases { t.Run(tc.name, func(t *testing.T) { t.Logf("%d bytes (%.1f bytes/point)", tc.bytes, float64(tc.bytes)/n) }) } ratio := float64(cases[0].bytes) / float64(cases[2].bytes) t.Logf("Ratio: %.0fx smaller (Array -> Gorilla)", ratio) if ratio < 50 { t.Errorf("expected at least 50x compression, got %.0fx", ratio) } } func TestQueryConsistency(t *testing.T) { cpuSeq := []float64{42.0, 42.0, 42.25, 42.25, 42.0} labels := map[string]string{"host": "web-01", "env": "prod"} a := &ArrayStore{} h := NewHashMapStore() for i := 0; i < 60; i++ { ts := baseTimestamp + int64(i)*scrapeInterval v := cpuSeq[i%len(cpuSeq)] a.Write("cpu_usage", labels, ts, v) h.Write("cpu_usage", labels, ts, v) } from := baseTimestamp to := from + 30*scrapeInterval arrayResult := a.Query("cpu_usage", labels, from, to) hashmapResult := h.Query("cpu_usage", labels, from, to) if len(arrayResult) != len(hashmapResult) { t.Errorf("stores disagree: array=%d hashmap=%d", len(arrayResult), len(hashmapResult)) } } ``` Run from `tychodb/`: ```bash go test -v ./... ``` Expected: ``` --- PASS: TestArrayStore_WriteAndQuery --- PASS: TestArrayStore_LabelIsolation --- PASS: TestSeriesKey_Deterministic --- PASS: TestHashMapStore_WriteAndQuery --- PASS: TestHashMapStore_SeriesIsolation --- PASS: TestZigZag --- PASS: TestChunkRoundtrip --- PASS: TestGorillaStore_WriteAndQuery --- PASS: TestCompressionSizes/Array 12000 bytes (100.0 bytes/point) --- PASS: TestCompressionSizes/HashMap 1920 bytes (16.0 bytes/point) --- PASS: TestCompressionSizes/Gorilla 114 bytes (1.0 bytes/point) Ratio: 105x smaller (Array -> Gorilla) --- PASS: TestQueryConsistency ``` --- ## Benchmarks Add to `tsdb_test.go`. Run with: ```bash go test -bench=. -benchmem -benchtime=3s ./... ``` ```go func BenchmarkArrayStore_Write(b *testing.B) { labels := map[string]string{"host": "web-01", "env": "prod"} b.ReportAllocs() for n := 0; n < b.N; n++ { s := &ArrayStore{} for i := 0; i < 1000; i++ { s.Write("cpu", labels, int64(i*60), float64(i)) } } } func BenchmarkHashMapStore_Write(b *testing.B) { labels := map[string]string{"host": "web-01", "env": "prod"} b.ReportAllocs() for n := 0; n < b.N; n++ { s := NewHashMapStore() for i := 0; i < 1000; i++ { s.Write("cpu", labels, int64(i*60), float64(i)) } } } func BenchmarkGorillaStore_Write(b *testing.B) { labels := map[string]string{"host": "web-01", "env": "prod"} b.ReportAllocs() for n := 0; n < b.N; n++ { s := NewGorillaStore() for i := 0; i < 1000; i++ { s.Write("cpu", labels, int64(i*60), float64(i)) } } } const benchSamples = 10_000 func populateArray(labels map[string]string) *ArrayStore { s := &ArrayStore{} for i := 0; i < benchSamples; i++ { s.Write("cpu", labels, int64(i*60), float64(i%100)) } return s } func populateHashMap(labels map[string]string) *HashMapStore { s := NewHashMapStore() for i := 0; i < benchSamples; i++ { s.Write("cpu", labels, int64(i*60), float64(i%100)) } return s } func populateGorilla(labels map[string]string) *GorillaStore { s := NewGorillaStore() for i := 0; i < benchSamples; i++ { s.Write("cpu", labels, int64(i*60), float64(i%100)) } return s } func BenchmarkArrayStore_Query(b *testing.B) { labels := map[string]string{"host": "web-01", "env": "prod"} s := populateArray(labels) from, to := int64(1000*60), int64(2000*60) b.ResetTimer() b.ReportAllocs() for n := 0; n < b.N; n++ { _ = s.Query("cpu", labels, from, to) } } func BenchmarkHashMapStore_Query(b *testing.B) { labels := map[string]string{"host": "web-01", "env": "prod"} s := populateHashMap(labels) from, to := int64(1000*60), int64(2000*60) b.ResetTimer() b.ReportAllocs() for n := 0; n < b.N; n++ { _ = s.Query("cpu", labels, from, to) } } func BenchmarkGorillaStore_Query(b *testing.B) { labels := map[string]string{"host": "web-01", "env": "prod"} s := populateGorilla(labels) from, to := int64(1000*60), int64(2000*60) b.ResetTimer() b.ReportAllocs() for n := 0; n < b.N; n++ { _ = s.Query("cpu", labels, from, to) } } ``` --- ## Publishing bitpack When `bitpack` is on GitHub: ```bash go mod edit -dropreplace github.com/tychodb/bitpack go get github.com/tychodb/bitpack@v0.1.0 go mod tidy go test ./... ``` --- ## Summary | # | Test | Drives | |---|------|--------| | 1 | ArrayStore_WriteAndQuery | Sample, Store, RawSample, ArrayStore | | 2 | ArrayStore_LabelIsolation | free | | 3 | SeriesKey_Deterministic | seriesKey | | 4 | HashMapStore_WriteAndQuery | HashMapStore + binary search | | 5 | HashMapStore_SeriesIsolation | free | | 6 | ZigZag | zigzagEncode, zigzagDecode | | 7 | ChunkRoundtrip | writeEncodedDOD, writeXORValue, encodeChunk, decodeChunk | | 8 | GorillaStore_WriteAndQuery | GorillaStore | | 9 | CompressionSizes | blog benchmark — validation | | 10 | QueryConsistency | blog correctness — validation | Ten tests. One module. bitpack is a separate library built from its own guide.