Documentation
¶
Overview ¶
Package imetadata provides internal metadata indexing for efficient filtering.
This package implements the inverted index used for metadata filtering. It is separate from the public metadata package to avoid import cycles.
Architecture ¶
UnifiedIndex combines primary storage with an inverted index:
Primary: map[RowID]InternedDocument - O(1) document retrieval Inverted: map[field]map[value]*Bitmap - O(1) filter compilation
Filter Compilation ¶
Filters are compiled into bitmap operations:
OpEqual: field == value → single bitmap lookup OpIn: field IN [...] → union of bitmaps (OR) Multiple: AND all filter bitmaps
Unsupported operators (OpLessThan, etc.) fall back to document scanning.
Memory Efficiency ¶
The package uses several techniques to minimize memory:
- String interning via unique.Handle[string] (Go 1.23+)
- Roaring Bitmaps for compressed posting lists
- Value.Key() deduplication in inverted index
Thread Safety ¶
UnifiedIndex is fully thread-safe. All public methods acquire appropriate locks (RLock for reads, Lock for writes). For streaming filters, callers must hold RLock manually via RLock()/RUnlock() methods.
Serialization ¶
The inverted index supports binary serialization via WriteInvertedIndex and ReadInvertedIndex for persistence in disk segments.
Index ¶
- Constants
- func PutMatcherScratch(s *MatcherScratch)
- func PutPooledBitmap(b *LocalBitmap)
- func PutQueryScratch(qs *QueryScratch)
- func ReleaseFilterCursor(cursor FilterCursor)
- type AllCursor
- type AlwaysFalseMatcher
- type AlwaysTrueMatcher
- type BitmapBuilder
- func (bb *BitmapBuilder) Append(id uint32)
- func (bb *BitmapBuilder) AppendMany(ids []uint32)
- func (bb *BitmapBuilder) Finalize() *LocalBitmap
- func (bb *BitmapBuilder) FinalizeInto(dst *LocalBitmap)
- func (bb *BitmapBuilder) Len() int
- func (bb *BitmapBuilder) Reset(hitsHint int)
- func (bb *BitmapBuilder) RowIDs() []uint32
- type BitmapCursor
- type BitmapMatcher
- type Column
- type ColumnFilterCursor
- type CompositeMatcher
- type DenseNumericMatcher
- type DocumentProvider
- type EmptyCursor
- type FilterCursor
- type FilterMatcher
- type FilterMode
- type FilterResult
- func AdaptiveResult(rows []uint32, segmentSize uint32) FilterResult
- func AllResult() FilterResult
- func EmptyResult() FilterResult
- func FilterResultAnd(a, b FilterResult, qs *QueryScratch) FilterResult
- func QueryBitmapResult(qb *bitmap.QueryBitmap) FilterResult
- func RangeResult(start, end uint32) FilterResult
- func RowsResult(rows []uint32) FilterResult
- func (fr FilterResult) AsBitmap() FilterResultBitmap
- func (fr FilterResult) Cardinality() int
- func (fr FilterResult) CardinalityUint64() uint64
- func (fr FilterResult) Clone() FilterResult
- func (fr FilterResult) CloneInto(dst []uint32) (FilterResult, []uint32)
- func (fr FilterResult) Contains(id uint32) bool
- func (fr FilterResult) ForEach(fn func(uint32) bool)
- func (fr FilterResult) IsAll() bool
- func (fr FilterResult) IsEmpty() bool
- func (fr FilterResult) IsRange() bool
- func (fr FilterResult) Mode() FilterMode
- func (fr FilterResult) QueryBitmap() *bitmap.QueryBitmap
- func (fr FilterResult) Range() (start, end uint32)
- func (fr FilterResult) Rows() []uint32
- func (fr FilterResult) ToArray(scratch []uint32) []uint32
- func (fr FilterResult) ToArrayInto(dst []uint32) []uint32
- type FilterResultBitmap
- type FilterResultCursor
- type LocalBitmap
- func (b *LocalBitmap) Add(id uint32)
- func (b *LocalBitmap) AddMany(ids []uint32)
- func (b *LocalBitmap) And(other *LocalBitmap)
- func (b *LocalBitmap) AndNot(other *LocalBitmap)
- func (b *LocalBitmap) Cardinality() uint64
- func (b *LocalBitmap) Clear()
- func (b *LocalBitmap) Clone() *LocalBitmap
- func (b *LocalBitmap) CloneTo(dst *LocalBitmap)
- func (b *LocalBitmap) Contains(id uint32) bool
- func (b *LocalBitmap) ForEach(fn func(id uint32) bool)
- func (b *LocalBitmap) GetSizeInBytes() uint64
- func (b *LocalBitmap) IsEmpty() bool
- func (b *LocalBitmap) Iterator() iter.Seq[model.RowID]
- func (b *LocalBitmap) Or(other *LocalBitmap)
- func (b *LocalBitmap) ReadFrom(r io.Reader) (int64, error)
- func (b *LocalBitmap) Remove(id uint32)
- func (b *LocalBitmap) RunOptimize()
- func (b *LocalBitmap) ToArray() []uint32
- func (b *LocalBitmap) ToArrayInto(dst []uint32) []uint32
- func (b *LocalBitmap) WriteTo(w io.Writer) (int64, error)
- type MatcherCursor
- type MatcherScratch
- type MultiBitmapMatcher
- type NotBitmapMatcher
- type NumericIndex
- func (ni *NumericIndex) Add(fieldKey string, value metadata.Value, rowID model.RowID)
- func (ni *NumericIndex) Cardinality() int
- func (ni *NumericIndex) EstimateSelectivity(fieldKey string, op metadata.Operator, filterVal float64) float64
- func (ni *NumericIndex) EvaluateFilter(f metadata.Filter) *LocalBitmap
- func (ni *NumericIndex) EvaluateFilterInto(f metadata.Filter, dst *LocalBitmap)
- func (ni *NumericIndex) FieldCount() int
- func (ni *NumericIndex) ForEachMatch(fieldKey string, op metadata.Operator, filterVal float64, ...) bool
- func (ni *NumericIndex) GetFieldMatcher(fieldKey string, op metadata.Operator, filterVal float64) func(rowID uint32) bool
- func (ni *NumericIndex) GetNumericMatcher(fieldKey string, op metadata.Operator, filterVal float64) FilterMatcher
- func (ni *NumericIndex) GetStats(fieldKey string) (minVal, maxVal float64, cardinality int)
- func (ni *NumericIndex) HasField(fieldKey string) bool
- func (ni *NumericIndex) IsLowCardinality(fieldKey string) bool
- func (ni *NumericIndex) IsSealed() bool
- func (ni *NumericIndex) Len(fieldKey string) int
- func (ni *NumericIndex) MatchRowID(fieldKey string, op metadata.Operator, filterVal float64, rowID uint32) bool
- func (ni *NumericIndex) QueryRange(fieldKey string, minVal, maxVal float64, includeMin, includeMax bool, ...)
- func (ni *NumericIndex) QueryRangeSIMD(fieldKey string, minVal, maxVal float64, dst *LocalBitmap)
- func (ni *NumericIndex) ReadFrom(r io.Reader) (int64, error)
- func (ni *NumericIndex) Remove(fieldKey string, value metadata.Value, rowID model.RowID)
- func (ni *NumericIndex) Seal()
- func (ni *NumericIndex) WriteTo(w io.Writer) (int64, error)
- type NumericOp
- type QueryScratch
- type RangeCursor
- type ReleasableCursor
- type RowsCursor
- type SparseNumericMatcher
- type Stats
- type UnifiedIndex
- func (ui *UnifiedIndex) AddInvertedIndex(id model.RowID, doc metadata.Document)
- func (ui *UnifiedIndex) CompileFilter(fs *metadata.FilterSet) *LocalBitmap
- func (ui *UnifiedIndex) CompileFilterTo(fs *metadata.FilterSet, dst *LocalBitmap) bool
- func (ui *UnifiedIndex) CreateFilterFunc(fs *metadata.FilterSet) func(model.RowID) bool
- func (ui *UnifiedIndex) CreateStreamingFilter(ctx context.Context, fs *metadata.FilterSet) func(model.RowID) bool
- func (ui *UnifiedIndex) Delete(id model.RowID)
- func (ui *UnifiedIndex) EvaluateFilter(fs *metadata.FilterSet) *LocalBitmap
- func (ui *UnifiedIndex) EvaluateFilterResult(fs *metadata.FilterSet, qs *QueryScratch) FilterResult
- func (ui *UnifiedIndex) FilterCursor(fs *metadata.FilterSet, rowCount uint32) FilterCursor
- func (ui *UnifiedIndex) FilterCursorWithMatcher(fs *metadata.FilterSet, rowCount uint32, scratch *MatcherScratch) *MatcherCursor
- func (ui *UnifiedIndex) Get(ctx context.Context, id model.RowID) (metadata.Document, bool)
- func (ui *UnifiedIndex) GetFilterMatcher(fs *metadata.FilterSet, scratch *MatcherScratch) FilterMatcher
- func (ui *UnifiedIndex) GetStats() Stats
- func (ui *UnifiedIndex) Len() int
- func (ui *UnifiedIndex) Query(fs *metadata.FilterSet) (*LocalBitmap, error)
- func (ui *UnifiedIndex) RLock()
- func (ui *UnifiedIndex) RUnlock()
- func (ui *UnifiedIndex) ReadInvertedIndex(r io.Reader) error
- func (ui *UnifiedIndex) ScanFilter(fs *metadata.FilterSet) []model.RowID
- func (ui *UnifiedIndex) SealNumericIndex()
- func (ui *UnifiedIndex) Set(id model.RowID, doc metadata.Document)
- func (ui *UnifiedIndex) SetDocumentProvider(provider DocumentProvider)
- func (ui *UnifiedIndex) ToMap() map[model.RowID]metadata.Document
- func (ui *UnifiedIndex) WriteInvertedIndex(w io.Writer) error
Constants ¶
const DefaultUniverseSize = 1 << 20
DefaultUniverseSize is the default size for query-time bitmaps. 1M covers typical segment sizes; larger segments trigger reallocation.
const LowCardinalityThreshold = 512
LowCardinalityThreshold is the maximum distinct values for bitmap precomputation. Below this threshold, we precompute value→bitmap maps for O(1) equality lookups. Aligned with roaring bitmap container boundaries (array vs bitmap cutoff).
Variables ¶
This section is empty.
Functions ¶
func PutMatcherScratch ¶ added in v0.0.14
func PutMatcherScratch(s *MatcherScratch)
PutMatcherScratch returns a MatcherScratch to the pool.
func PutPooledBitmap ¶
func PutPooledBitmap(b *LocalBitmap)
PutPooledBitmap returns a pooled bitmap to the pool. Only call this for bitmaps obtained via GetPooledBitmap. Passing nil is safe and will be ignored. WARNING: Do not use the bitmap after calling this function.
func PutQueryScratch ¶
func PutQueryScratch(qs *QueryScratch)
PutQueryScratch returns a QueryScratch to the pool. Passing nil is safe and will be ignored.
func ReleaseFilterCursor ¶ added in v0.0.14
func ReleaseFilterCursor(cursor FilterCursor)
ReleaseFilterCursor releases a cursor if it implements ReleasableCursor. This is a convenience function for callers that want to release cursors without type assertion boilerplate.
Safe to call with nil or non-releasable cursors.
Types ¶
type AllCursor ¶
type AllCursor struct {
// contains filtered or unexported fields
}
AllCursor represents "all rows match" (unfiltered query). Iterates 0..size-1.
func NewAllCursor ¶
NewAllCursor creates a FilterCursor that matches all rows in [0, size).
func (*AllCursor) EstimateCardinality ¶
type AlwaysFalseMatcher ¶ added in v0.0.14
type AlwaysFalseMatcher struct{}
AlwaysFalseMatcher always returns false (impossible filter).
func GetAlwaysFalseMatcher ¶ added in v0.0.14
func GetAlwaysFalseMatcher() *AlwaysFalseMatcher
func (*AlwaysFalseMatcher) Matches ¶ added in v0.0.14
func (m *AlwaysFalseMatcher) Matches(uint32) bool
func (*AlwaysFalseMatcher) Release ¶ added in v0.0.14
func (m *AlwaysFalseMatcher) Release()
type AlwaysTrueMatcher ¶ added in v0.0.14
type AlwaysTrueMatcher struct{}
AlwaysTrueMatcher always returns true (no filter).
func GetAlwaysTrueMatcher ¶ added in v0.0.14
func GetAlwaysTrueMatcher() *AlwaysTrueMatcher
func (*AlwaysTrueMatcher) Matches ¶ added in v0.0.14
func (m *AlwaysTrueMatcher) Matches(uint32) bool
func (*AlwaysTrueMatcher) Release ¶ added in v0.0.14
func (m *AlwaysTrueMatcher) Release()
type BitmapBuilder ¶
type BitmapBuilder struct {
// contains filtered or unexported fields
}
BitmapBuilder provides zero-allocation bitmap construction. Instead of building bitmaps by mutation (Add calls), it collects rowIDs into a preallocated slice, then builds the bitmap once via AddMany.
This eliminates:
- arrayContainer growth allocations
- container cloning during OR operations
- Iterator allocations
Usage:
bb := NewBitmapBuilder()
bb.Reset(estimatedHits)
for _, id := range matches {
bb.Append(id)
}
bitmap := bb.Finalize() // or bb.RowIDs() for direct slice access
func NewBitmapBuilder ¶
func NewBitmapBuilder() *BitmapBuilder
NewBitmapBuilder creates a new BitmapBuilder with default capacity.
func (*BitmapBuilder) Append ¶
func (bb *BitmapBuilder) Append(id uint32)
Append adds a rowID to the builder. This is pure slice append, extremely cache-friendly.
func (*BitmapBuilder) AppendMany ¶
func (bb *BitmapBuilder) AppendMany(ids []uint32)
AppendMany adds multiple rowIDs to the builder.
func (*BitmapBuilder) Finalize ¶
func (bb *BitmapBuilder) Finalize() *LocalBitmap
Finalize builds the bitmap from collected rowIDs. RowIDs are sorted (required for optimal AddMany) and added in one call. Returns the reusable bitmap (do NOT pool it).
func (*BitmapBuilder) FinalizeInto ¶
func (bb *BitmapBuilder) FinalizeInto(dst *LocalBitmap)
FinalizeInto builds the bitmap into the provided LocalBitmap. This allows using a pooled bitmap while still benefiting from AddMany.
func (*BitmapBuilder) Len ¶
func (bb *BitmapBuilder) Len() int
Len returns the number of rowIDs collected.
func (*BitmapBuilder) Reset ¶
func (bb *BitmapBuilder) Reset(hitsHint int)
Reset prepares the builder for a new query. Call this before appending IDs. Zero allocations.
func (*BitmapBuilder) RowIDs ¶
func (bb *BitmapBuilder) RowIDs() []uint32
RowIDs returns the collected rowIDs directly. For low cardinality filters, use this to skip bitmap entirely. The returned slice is only valid until the next Reset call.
type BitmapCursor ¶
type BitmapCursor struct {
// contains filtered or unexported fields
}
BitmapCursor wraps a LocalBitmap for direct iteration. Uses the bitmap's native iterator, avoiding slice extraction. Zero allocations during iteration.
func NewBitmapCursor ¶
func NewBitmapCursor(bitmap *LocalBitmap) *BitmapCursor
NewBitmapCursor creates a FilterCursor from a bitmap. The bitmap is NOT copied - caller must ensure it outlives the cursor.
func (*BitmapCursor) EstimateCardinality ¶
func (c *BitmapCursor) EstimateCardinality() int
func (*BitmapCursor) ForEach ¶
func (c *BitmapCursor) ForEach(fn func(rowID uint32) bool)
func (*BitmapCursor) IsAll ¶
func (c *BitmapCursor) IsAll() bool
func (*BitmapCursor) IsEmpty ¶
func (c *BitmapCursor) IsEmpty() bool
type BitmapMatcher ¶ added in v0.0.14
type BitmapMatcher struct {
// contains filtered or unexported fields
}
BitmapMatcher checks if rowID exists in a bitmap. Used for OpEqual and OpIn filters.
func GetBitmapMatcher ¶ added in v0.0.14
func GetBitmapMatcher(bm *LocalBitmap) *BitmapMatcher
GetBitmapMatcher returns a pooled BitmapMatcher.
func (*BitmapMatcher) Matches ¶ added in v0.0.14
func (m *BitmapMatcher) Matches(rowID uint32) bool
func (*BitmapMatcher) Release ¶ added in v0.0.14
func (m *BitmapMatcher) Release()
type Column ¶
type Column interface {
// Matches checks if row at index i matches the filter condition.
Matches(i int, val metadata.Value, op metadata.Operator) bool
}
Column is the interface for typed column data. This matches the existing memtable column interface.
type ColumnFilterCursor ¶
type ColumnFilterCursor struct {
// contains filtered or unexported fields
}
ColumnFilterCursor evaluates filters directly against column data. This is the push-based replacement for EvaluateFilterResult. Filters are evaluated lazily during ForEach - no intermediate bitmap.
For AND logic (default): rows must pass ALL filters. Short-circuit evaluation stops at first non-match.
func NewColumnFilterCursor ¶
func NewColumnFilterCursor(columns map[string]Column, filters []metadata.Filter, universeSize uint32) *ColumnFilterCursor
NewColumnFilterCursor creates a cursor that evaluates filters against columns. Columns and filters are borrowed - caller must ensure they outlive the cursor.
func (*ColumnFilterCursor) EstimateCardinality ¶
func (c *ColumnFilterCursor) EstimateCardinality() int
func (*ColumnFilterCursor) ForEach ¶
func (c *ColumnFilterCursor) ForEach(fn func(rowID uint32) bool)
func (*ColumnFilterCursor) IsAll ¶
func (c *ColumnFilterCursor) IsAll() bool
func (*ColumnFilterCursor) IsEmpty ¶
func (c *ColumnFilterCursor) IsEmpty() bool
type CompositeMatcher ¶ added in v0.0.14
type CompositeMatcher struct {
// contains filtered or unexported fields
}
CompositeMatcher evaluates multiple matchers with AND semantics. Short-circuits on first false (most selective first for best perf).
func GetCompositeMatcher ¶ added in v0.0.14
func GetCompositeMatcher(matchers []FilterMatcher) *CompositeMatcher
GetCompositeMatcher returns a pooled CompositeMatcher.
func (*CompositeMatcher) Matches ¶ added in v0.0.14
func (m *CompositeMatcher) Matches(rowID uint32) bool
func (*CompositeMatcher) Release ¶ added in v0.0.14
func (m *CompositeMatcher) Release()
type DenseNumericMatcher ¶ added in v0.0.14
type DenseNumericMatcher struct {
// contains filtered or unexported fields
}
DenseNumericMatcher uses array indexing for dense rowID ranges. Optimal for immutable segments where rowIDs are contiguous.
func GetDenseNumericMatcher ¶ added in v0.0.14
func GetDenseNumericMatcher(rowIDToIndex []int32, values []float64, op NumericOp, filterVal float64) *DenseNumericMatcher
GetDenseNumericMatcher returns a pooled DenseNumericMatcher.
func (*DenseNumericMatcher) Matches ¶ added in v0.0.14
func (m *DenseNumericMatcher) Matches(rowID uint32) bool
func (*DenseNumericMatcher) Release ¶ added in v0.0.14
func (m *DenseNumericMatcher) Release()
type DocumentProvider ¶
DocumentProvider is a function that retrieves a document by ID. The context should be used for cancellation and timeouts during I/O.
type EmptyCursor ¶
type EmptyCursor struct{}
EmptyCursor represents "no rows match".
func (*EmptyCursor) EstimateCardinality ¶
func (c *EmptyCursor) EstimateCardinality() int
func (*EmptyCursor) ForEach ¶
func (c *EmptyCursor) ForEach(fn func(rowID uint32) bool)
func (*EmptyCursor) IsAll ¶
func (c *EmptyCursor) IsAll() bool
func (*EmptyCursor) IsEmpty ¶
func (c *EmptyCursor) IsEmpty() bool
type FilterCursor ¶
type FilterCursor interface {
// ForEach iterates over matching row IDs, calling fn for each.
// Iteration stops early if fn returns false.
// Zero allocations in the iteration path.
ForEach(fn func(rowID uint32) bool)
// EstimateCardinality returns an estimate of matching row IDs.
// This is used for selectivity-based execution planning.
// The estimate may be approximate (especially for complex filters).
EstimateCardinality() int
// IsEmpty returns true if no rows match (fast path).
IsEmpty() bool
// IsAll returns true if all rows match (fast path for no-filter case).
IsAll() bool
}
FilterCursor is a push-based iterator for filter results. Unlike materialized FilterResult (which builds a bitmap/array), FilterCursor pushes matching row IDs directly to the consumer.
This design eliminates:
- Roaring bitmap allocations (OR, Clone, Iterator)
- Memory materialization (no intermediate arrays)
- Early termination overhead (consumer controls flow)
Architecture: Filters push rows, search pulls nothing.
Example usage:
cursor.ForEach(func(rowID uint32) bool {
distance := computeDistance(query, vectors[rowID])
heap.PushMaybe(rowID, distance)
return heap.Len() < k*10 // early termination
})
func GetEmptyCursor ¶
func GetEmptyCursor() FilterCursor
GetEmptyCursor returns a singleton empty cursor.
type FilterMatcher ¶ added in v0.0.14
type FilterMatcher interface {
// Matches returns true if the rowID passes the filter.
// MUST be safe for concurrent use.
Matches(rowID uint32) bool
// Release returns the matcher to its pool.
// MUST be called when done to avoid pool exhaustion.
Release()
}
FilterMatcher is the zero-allocation interface for filter evaluation. Replaces closures with interface dispatch to eliminate heap escapes.
DESIGN RATIONALE: Go closures that capture variables escape to heap (~50 bytes each). For filtered search with N filters, this causes N allocations per query. Interface dispatch uses vtables (static) instead of closures (heap).
Performance:
- Closure: ~5ns + 1 alloc
- Interface: ~2ns + 0 alloc (vtable is static)
type FilterMode ¶
type FilterMode uint8
FilterMode indicates the representation mode of a FilterResult.
const ( // FilterNone indicates empty result (no matches). FilterNone FilterMode = iota // FilterAll indicates no filtering - all rows match. // This is semantically equivalent to a nil bitmap in the old API. FilterAll // FilterRows indicates result is stored as a sorted []uint32. FilterRows // FilterBitmap indicates result is stored as a QueryBitmap (SIMD, zero-alloc). FilterBitmap // FilterRange indicates result is a contiguous range [start, end). // Extremely efficient for temporal/sequential data. FilterRange )
type FilterResult ¶
type FilterResult struct {
// contains filtered or unexported fields
}
FilterResult is a dual-mode filter result that avoids allocations. For low cardinality, it stores rows as []uint32 (SIMD-friendly). For high cardinality, it uses QueryBitmap (SIMD-accelerated, zero-alloc). For contiguous ranges, it stores [start, end) bounds (zero storage overhead).
IMPORTANT: FilterResult now uses QueryBitmap instead of roaring. This removes Roaring from all query hot paths.
Critical: FilterResult does NOT own its memory. The rows slice and bitmap pointer are borrowed from QueryScratch. Lifetime is query-scoped.
func AdaptiveResult ¶
func AdaptiveResult(rows []uint32, segmentSize uint32) FilterResult
AdaptiveResult creates a FilterResult choosing the optimal representation based on cardinality and data characteristics. This is the recommended constructor when the optimal representation is not known in advance.
Thresholds:
- Empty → FilterNone
- 1 element → FilterRows (most cache-friendly)
- ≤1024 elements → FilterRows (fits in L1 cache for iteration)
- Contiguous range → FilterRange (zero storage overhead)
- >1024 elements → FilterBitmap (SIMD-accelerated ops)
The segmentSize parameter enables range detection and FilterAll optimization.
func AllResult ¶
func AllResult() FilterResult
AllResult creates a FilterResult that matches all rows. This indicates no filtering should be applied (equivalent to nil bitmap in old API).
func FilterResultAnd ¶
func FilterResultAnd(a, b FilterResult, qs *QueryScratch) FilterResult
FilterResultAnd performs AND operation on two FilterResults. Design rules:
- Rows ∩ Rows → Rows (two-pointer merge, zero-alloc)
- Rows ∩ Bitmap → Rows (probe QueryBitmap, zero-alloc)
- Rows ∩ Range → Rows (filter by range bounds, zero-alloc)
- Range ∩ Range → Range (intersect bounds, O(1))
- Bitmap ∩ Bitmap → SIMD AND, return rows if small
- Range ∩ Bitmap → materialize range intersection
Uses QueryScratch for scratch space (zero allocations in steady state).
func QueryBitmapResult ¶
func QueryBitmapResult(qb *bitmap.QueryBitmap) FilterResult
QueryBitmapResult creates a FilterResult from a QueryBitmap. The bitmap is NOT copied - caller must ensure it outlives the FilterResult. Zero allocations.
func RangeResult ¶
func RangeResult(start, end uint32) FilterResult
RangeResult creates a FilterResult from a contiguous range [start, end). This is extremely efficient for sequential/temporal data where many consecutive rows match. Zero storage overhead beyond two integers. Returns FilterNone if start >= end.
func RowsResult ¶
func RowsResult(rows []uint32) FilterResult
RowsResult creates a FilterResult from a []uint32 slice. The slice is NOT copied - caller must ensure it outlives the FilterResult. Zero allocations.
func (FilterResult) AsBitmap ¶
func (fr FilterResult) AsBitmap() FilterResultBitmap
AsBitmap returns a FilterResultBitmap wrapper that implements segment.Bitmap. Zero allocations.
func (FilterResult) Cardinality ¶
func (fr FilterResult) Cardinality() int
Cardinality returns the number of matching row IDs.
func (FilterResult) CardinalityUint64 ¶
func (fr FilterResult) CardinalityUint64() uint64
CardinalityUint64 returns the cardinality as uint64 (for segment.Bitmap interface).
func (FilterResult) Clone ¶
func (fr FilterResult) Clone() FilterResult
Clone creates an independent copy of the FilterResult. For FilterRows mode, this copies the underlying slice to prevent aliasing issues when the original slice is reused. For FilterBitmap mode, this clones the QueryBitmap. For FilterRange mode, returns the same value (no data to copy). For FilterNone/FilterAll, returns the same value (no data to copy).
func (FilterResult) CloneInto ¶
func (fr FilterResult) CloneInto(dst []uint32) (FilterResult, []uint32)
CloneInto copies the FilterResult into the provided buffer, returning a new FilterResult that owns its data. This avoids allocations when the caller provides a buffer with sufficient capacity. The returned FilterResult owns the slice (no aliasing with dst).
For FilterRows mode: appends rows to dst and returns a FilterResult pointing to the new slice. For FilterBitmap mode: extracts to rows mode (QueryBitmap is zero-alloc, no Clone needed). For FilterNone/FilterAll: returns the same value (no data to copy).
Example:
buf := make([]uint32, 0, 1024)
for _, fr := range filterResults {
cloned, buf = fr.CloneInto(buf)
collected = append(collected, cloned)
}
func (FilterResult) Contains ¶
func (fr FilterResult) Contains(id uint32) bool
Contains checks if the given row ID is in the result.
func (FilterResult) ForEach ¶
func (fr FilterResult) ForEach(fn func(uint32) bool)
ForEach iterates over all row IDs in the result. The callback should return true to continue, false to stop. Zero allocations.
func (FilterResult) IsAll ¶
func (fr FilterResult) IsAll() bool
IsAll returns true if the result matches all rows (no filtering). This is semantically equivalent to a nil bitmap in the old API.
func (FilterResult) IsEmpty ¶
func (fr FilterResult) IsEmpty() bool
IsEmpty returns true if the result contains no matches.
func (FilterResult) IsRange ¶
func (fr FilterResult) IsRange() bool
IsRange returns true if the result is stored as a range.
func (FilterResult) Mode ¶
func (fr FilterResult) Mode() FilterMode
Mode returns the current mode of the FilterResult.
func (FilterResult) QueryBitmap ¶
func (fr FilterResult) QueryBitmap() *bitmap.QueryBitmap
QueryBitmap returns the underlying QueryBitmap (only valid for FilterBitmap mode). Returns nil for other modes.
func (FilterResult) Range ¶
func (fr FilterResult) Range() (start, end uint32)
Range returns the range bounds (only valid for FilterRange mode). Returns (0, 0) for other modes.
func (FilterResult) Rows ¶
func (fr FilterResult) Rows() []uint32
Rows returns the underlying rows slice (only valid for FilterRows mode). Returns nil for other modes.
func (FilterResult) ToArray ¶
func (fr FilterResult) ToArray(scratch []uint32) []uint32
ToArray extracts all row IDs into a slice. For FilterRows, returns the slice directly (no copy). For FilterBitmap, uses the provided scratch buffer. For FilterRange, expands range into scratch buffer.
func (FilterResult) ToArrayInto ¶
func (fr FilterResult) ToArrayInto(dst []uint32) []uint32
ToArrayInto is an alias for ToArray to satisfy segment.Bitmap interface. Copies all elements into dst, returning the populated slice. If dst has insufficient capacity, a new slice may be allocated.
type FilterResultBitmap ¶
type FilterResultBitmap struct {
// contains filtered or unexported fields
}
FilterResultBitmap wraps FilterResult to implement segment.Bitmap interface. This allows FilterResult to be used anywhere segment.Bitmap is expected. Zero allocations - the wrapper is a value type.
func (FilterResultBitmap) Cardinality ¶
func (frb FilterResultBitmap) Cardinality() uint64
Cardinality returns the number of elements in the set.
func (FilterResultBitmap) Contains ¶
func (frb FilterResultBitmap) Contains(id uint32) bool
Contains reports whether id is present in the set.
func (FilterResultBitmap) ForEach ¶
func (frb FilterResultBitmap) ForEach(fn func(id uint32) bool)
ForEach calls fn for each id in the set.
func (FilterResultBitmap) ToArrayInto ¶
func (frb FilterResultBitmap) ToArrayInto(dst []uint32) []uint32
ToArrayInto copies all elements into dst, returning the populated slice.
type FilterResultCursor ¶
type FilterResultCursor struct {
// contains filtered or unexported fields
}
FilterResultCursor wraps an existing FilterResult as a cursor. This provides backward compatibility during migration.
func NewFilterResultCursor ¶
func NewFilterResultCursor(fr FilterResult) *FilterResultCursor
NewFilterResultCursor wraps a FilterResult as a FilterCursor.
func (*FilterResultCursor) EstimateCardinality ¶
func (c *FilterResultCursor) EstimateCardinality() int
func (*FilterResultCursor) ForEach ¶
func (c *FilterResultCursor) ForEach(fn func(rowID uint32) bool)
func (*FilterResultCursor) IsAll ¶
func (c *FilterResultCursor) IsAll() bool
func (*FilterResultCursor) IsEmpty ¶
func (c *FilterResultCursor) IsEmpty() bool
type LocalBitmap ¶
type LocalBitmap struct {
// contains filtered or unexported fields
}
LocalBitmap implements a 32-bit Roaring Bitmap. It wraps the official roaring implementation. Used for internal row filtering (RowID).
func GetPooledBitmap ¶
func GetPooledBitmap() *LocalBitmap
GetPooledBitmap gets a bitmap from the pool for temporary use. IMPORTANT: Caller MUST call PutPooledBitmap when done to avoid pool exhaustion. The bitmap is cleared before being returned. Use this for temporary bitmaps in hot paths (e.g., filter evaluation). For owned bitmaps that outlive a function call, use NewLocalBitmap instead.
func NewLocalBitmap ¶
func NewLocalBitmap() *LocalBitmap
NewLocalBitmap creates a new empty local bitmap. The returned bitmap is owned by the caller and should NOT be returned to the pool. Use this for long-lived bitmaps (e.g., inverted index storage, API boundaries). For temporary bitmaps in hot paths, use GetPooledBitmap instead.
func (*LocalBitmap) AddMany ¶
func (b *LocalBitmap) AddMany(ids []uint32)
AddMany adds multiple RowIDs to the bitmap in batch. This is more efficient than calling Add repeatedly.
func (*LocalBitmap) And ¶
func (b *LocalBitmap) And(other *LocalBitmap)
And computes the intersection of two bitmaps.
func (*LocalBitmap) AndNot ¶
func (b *LocalBitmap) AndNot(other *LocalBitmap)
AndNot computes the difference: b = b AND NOT other. This removes all elements in other from b.
func (*LocalBitmap) Cardinality ¶
func (b *LocalBitmap) Cardinality() uint64
Cardinality returns the number of elements in the bitmap.
func (*LocalBitmap) Clear ¶
func (b *LocalBitmap) Clear()
Clear removes all elements from the bitmap.
func (*LocalBitmap) Clone ¶
func (b *LocalBitmap) Clone() *LocalBitmap
Clone returns a deep copy of the bitmap. The returned bitmap is owned (not pooled) and allocates a new roaring.Bitmap. For hot paths where you want to avoid allocation, use CloneTo with a pooled bitmap.
func (*LocalBitmap) CloneTo ¶
func (b *LocalBitmap) CloneTo(dst *LocalBitmap)
CloneTo copies this bitmap's contents into the destination bitmap. The destination is cleared first, then filled via Or operation. This avoids roaring's internal Clone allocation when used with pooled bitmaps. Example:
dst := GetPooledBitmap() src.CloneTo(dst) // use dst... PutPooledBitmap(dst)
func (*LocalBitmap) Contains ¶
func (b *LocalBitmap) Contains(id uint32) bool
Contains checks if a RowID is in the bitmap.
func (*LocalBitmap) ForEach ¶
func (b *LocalBitmap) ForEach(fn func(id uint32) bool)
ForEach iterates over the bitmap.
func (*LocalBitmap) GetSizeInBytes ¶
func (b *LocalBitmap) GetSizeInBytes() uint64
GetSizeInBytes returns the size of the bitmap in bytes.
func (*LocalBitmap) IsEmpty ¶
func (b *LocalBitmap) IsEmpty() bool
IsEmpty returns true if the bitmap is empty.
func (*LocalBitmap) Iterator ¶
func (b *LocalBitmap) Iterator() iter.Seq[model.RowID]
Iterator returns an iterator over the bitmap.
func (*LocalBitmap) Or ¶
func (b *LocalBitmap) Or(other *LocalBitmap)
Or computes the union of two bitmaps.
func (*LocalBitmap) ReadFrom ¶
func (b *LocalBitmap) ReadFrom(r io.Reader) (int64, error)
ReadFrom reads the bitmap from an io.Reader.
func (*LocalBitmap) Remove ¶
func (b *LocalBitmap) Remove(id uint32)
Remove removes a RowID from the bitmap.
func (*LocalBitmap) RunOptimize ¶
func (b *LocalBitmap) RunOptimize()
RunOptimize attempts to further compress the bitmap by converting containers to run-length encoding where beneficial. This should be called after the bitmap is fully built and before it's used in queries.
This is especially effective for:
- Bitmaps with consecutive ranges (e.g., prefix cumulative bitmaps)
- Bitmaps built from sorted data
- Dense regions followed by sparse regions
Trade-off: Slightly slower to build, but faster queries and smaller memory.
func (*LocalBitmap) ToArray ¶
func (b *LocalBitmap) ToArray() []uint32
ToArray returns all elements in the bitmap as a slice. This allocates a new slice. For zero-alloc extraction, use ToArrayInto.
func (*LocalBitmap) ToArrayInto ¶
func (b *LocalBitmap) ToArrayInto(dst []uint32) []uint32
ToArrayInto copies all elements into the provided buffer, returning the populated slice. If dst has insufficient capacity, a new slice is allocated and returned. This is the zero-alloc path for extracting rowIDs in hot loops.
Example:
rowIDs := b.ToArrayInto(scratch[:0])
type MatcherCursor ¶ added in v0.0.14
type MatcherCursor struct {
// contains filtered or unexported fields
}
MatcherCursor iterates over row IDs using a FilterMatcher for zero-allocation evaluation. This is the preferred cursor for hot paths - eliminates closure allocations.
func NewMatcherCursor ¶ added in v0.0.14
func NewMatcherCursor(matcher FilterMatcher, rowCount uint32, estimate int) *MatcherCursor
NewMatcherCursor creates a cursor from a FilterMatcher. Caller MUST call Release() when done.
func (*MatcherCursor) EstimateCardinality ¶ added in v0.0.14
func (c *MatcherCursor) EstimateCardinality() int
func (*MatcherCursor) ForEach ¶ added in v0.0.14
func (c *MatcherCursor) ForEach(fn func(rowID uint32) bool)
func (*MatcherCursor) IsAll ¶ added in v0.0.14
func (c *MatcherCursor) IsAll() bool
func (*MatcherCursor) IsEmpty ¶ added in v0.0.14
func (c *MatcherCursor) IsEmpty() bool
func (*MatcherCursor) Release ¶ added in v0.0.14
func (c *MatcherCursor) Release()
Release returns the cursor and its matcher to pools.
type MatcherScratch ¶ added in v0.0.14
type MatcherScratch struct {
Matchers []FilterMatcher
Bitmaps []*LocalBitmap // Scratch for collecting IN bitmaps
}
MatcherScratch is a pooled slice for collecting matchers.
func GetMatcherScratch ¶ added in v0.0.14
func GetMatcherScratch() *MatcherScratch
GetMatcherScratch returns a pooled MatcherScratch.
type MultiBitmapMatcher ¶ added in v0.0.14
type MultiBitmapMatcher struct {
// contains filtered or unexported fields
}
MultiBitmapMatcher checks if rowID exists in any of the bitmaps. Used for OpIn filters with multiple values.
func GetMultiBitmapMatcher ¶ added in v0.0.14
func GetMultiBitmapMatcher(bitmaps []*LocalBitmap) *MultiBitmapMatcher
GetMultiBitmapMatcher returns a pooled MultiBitmapMatcher.
func (*MultiBitmapMatcher) Matches ¶ added in v0.0.14
func (m *MultiBitmapMatcher) Matches(rowID uint32) bool
func (*MultiBitmapMatcher) Release ¶ added in v0.0.14
func (m *MultiBitmapMatcher) Release()
type NotBitmapMatcher ¶ added in v0.0.14
type NotBitmapMatcher struct {
// contains filtered or unexported fields
}
NotBitmapMatcher checks if rowID does NOT exist in a bitmap. Used for OpNotEqual filters.
func GetNotBitmapMatcher ¶ added in v0.0.14
func GetNotBitmapMatcher(bm *LocalBitmap) *NotBitmapMatcher
GetNotBitmapMatcher returns a pooled NotBitmapMatcher.
func (*NotBitmapMatcher) Matches ¶ added in v0.0.14
func (m *NotBitmapMatcher) Matches(rowID uint32) bool
func (*NotBitmapMatcher) Release ¶ added in v0.0.14
func (m *NotBitmapMatcher) Release()
type NumericIndex ¶
type NumericIndex struct {
// contains filtered or unexported fields
}
NumericIndex provides O(log n) range queries for numeric fields. Uses a columnar layout for optimal cache locality during binary search.
Architecture:
- Columnar storage: values []float64 and rowIDs []uint32 (aligned)
- Binary search on values for range boundaries
- Batch bitmap add via AddMany for efficient result building
- Precomputed bitmaps for low-cardinality fields (O(1) equality)
Performance:
- Range queries: O(log n + m) where m is the number of matches
- Equality (low cardinality): O(1) bitmap lookup
- Cache-optimal: binary search touches only values array
- Memory: 12 bytes per entry (8 float64 + 4 uint32) + bitmaps for low-card fields
When to use NumericIndex vs InvertedIndex:
- NumericIndex: high cardinality fields (timestamps, prices, scores)
- InvertedIndex: low cardinality fields (categories, status, types)
func NewNumericIndex ¶
func NewNumericIndex() *NumericIndex
NewNumericIndex creates a new empty numeric index.
func (*NumericIndex) Add ¶
Add adds a numeric value for a field and rowID. If the value is not numeric (int or float), it is ignored. Thread-safety: Caller must hold write lock.
func (*NumericIndex) Cardinality ¶
func (ni *NumericIndex) Cardinality() int
Cardinality returns the total number of (value, rowID) pairs across all fields.
func (*NumericIndex) EstimateSelectivity ¶
func (ni *NumericIndex) EstimateSelectivity(fieldKey string, op metadata.Operator, filterVal float64) float64
EstimateSelectivity estimates the selectivity of a filter on a field. Returns the fraction of rows that match (0.0 to 1.0).
func (*NumericIndex) EvaluateFilter ¶
func (ni *NumericIndex) EvaluateFilter(f metadata.Filter) *LocalBitmap
EvaluateFilter evaluates a numeric filter using adaptive execution. - Low cardinality fields: O(1) bitmap lookup (precomputed) - High cardinality fields: O(log n) binary search on column Returns a bitmap of matching rowIDs. Supports: OpLessThan, OpLessEqual, OpGreaterThan, OpGreaterEqual, OpNotEqual, OpEqual Thread-safety: Caller must hold read lock.
NOTE: The returned bitmap is from a pool. Caller should call PutPooledBitmap when done. For zero-alloc hot paths, use EvaluateFilterInto instead.
func (*NumericIndex) EvaluateFilterInto ¶
func (ni *NumericIndex) EvaluateFilterInto(f metadata.Filter, dst *LocalBitmap)
EvaluateFilterInto is the zero-allocation version of EvaluateFilter. Results are added (OR'd) into the provided destination bitmap. The destination is NOT cleared first - this allows building up results from multiple filters. For a fresh result, caller should clear dst before calling.
Performance: This is the hot-path optimized version. - No allocations in steady state - Caller controls bitmap lifecycle (pooled or owned)
Thread-safety: Caller must hold read lock.
func (*NumericIndex) FieldCount ¶
func (ni *NumericIndex) FieldCount() int
FieldCount returns the number of indexed fields.
func (*NumericIndex) ForEachMatch ¶
func (ni *NumericIndex) ForEachMatch( fieldKey string, op metadata.Operator, filterVal float64, fn func(rowID uint32) bool, ) bool
ForEachMatch iterates over matching row IDs without allocating a bitmap. This is the zero-allocation path for cursor-based filter evaluation. Returns true if the iteration was completed, false if fn returned false (early termination).
func (*NumericIndex) GetFieldMatcher ¶
func (ni *NumericIndex) GetFieldMatcher(fieldKey string, op metadata.Operator, filterVal float64) func(rowID uint32) bool
GetFieldMatcher returns a closure that matches rowIDs against a filter condition. The field is looked up ONCE at call time, eliminating per-rowID string map lookups. This is the optimized hot path for FilterCursor.
Returns nil if the field doesn't exist.
func (*NumericIndex) GetNumericMatcher ¶ added in v0.0.14
func (ni *NumericIndex) GetNumericMatcher(fieldKey string, op metadata.Operator, filterVal float64) FilterMatcher
GetNumericMatcher returns a pooled FilterMatcher for zero-allocation filter evaluation. This replaces closure-based GetFieldMatcher to eliminate heap escapes.
Returns nil if the field doesn't exist or isn't sealed. Caller MUST call Release() on the returned matcher when done.
func (*NumericIndex) GetStats ¶
func (ni *NumericIndex) GetStats(fieldKey string) (minVal, maxVal float64, cardinality int)
GetStats returns minVal, maxVal, and cardinality for a field. Returns zeros if field doesn't exist or is empty.
func (*NumericIndex) HasField ¶
func (ni *NumericIndex) HasField(fieldKey string) bool
HasField returns true if the field has numeric entries.
func (*NumericIndex) IsLowCardinality ¶
func (ni *NumericIndex) IsLowCardinality(fieldKey string) bool
IsLowCardinality returns true if the field has precomputed bitmap index.
func (*NumericIndex) IsSealed ¶
func (ni *NumericIndex) IsSealed() bool
IsSealed returns true if the index has been sealed. After sealing, the index's sorted arrays and bitmap indexes are immutable and can be read without locks (except for pendingDeletes).
func (*NumericIndex) Len ¶
func (ni *NumericIndex) Len(fieldKey string) int
Len returns the number of entries for a field.
func (*NumericIndex) MatchRowID ¶
func (ni *NumericIndex) MatchRowID(fieldKey string, op metadata.Operator, filterVal float64, rowID uint32) bool
MatchRowID checks if a specific rowID matches a filter condition. Uses O(1) lookup via rowIDToIndex map (built during Seal). This is the zero-allocation hot path for FilterCursor range filters.
Returns false if:
- Field doesn't exist
- RowID not found in field
- RowID has a pending delete
- Value doesn't match the filter condition
Thread-safety: Caller must hold read lock.
func (*NumericIndex) QueryRange ¶
func (ni *NumericIndex) QueryRange( fieldKey string, minVal, maxVal float64, includeMin, includeMax bool, dst *LocalBitmap, )
QueryRange finds all rowIDs where fieldKey is in [minVal, maxVal]. Set includeMin/includeMax to control boundary inclusion. Results are added to dst using batch operations for efficiency. Skips pending deletes if any.
Thread-safety: Caller must hold read lock. Note: For unsorted data (before Seal()), this uses SIMD full-column scan. For sorted data (after Seal()), this uses binary search which is faster.
func (*NumericIndex) QueryRangeSIMD ¶
func (ni *NumericIndex) QueryRangeSIMD( fieldKey string, minVal, maxVal float64, dst *LocalBitmap, )
QueryRangeSIMD performs a SIMD-accelerated full-column scan for range queries. Unlike QueryRange which uses binary search on sorted data, this scans all values using vectorized comparisons.
When to use QueryRangeSIMD vs QueryRange:
- QueryRangeSIMD: unsorted data (during bulk load, before Seal())
- QueryRange: sorted data (after Seal()) - binary search is faster for sorted data
Benchmarks show binary search outperforms SIMD scan on sorted data because:
- Binary search touches O(log n) cache lines vs O(n) for full scan
- Modern CPUs have excellent branch prediction for binary search
The range is always inclusive: [minVal, maxVal]. For exclusive bounds, adjust minVal/maxVal with math.Nextafter. Thread-safety: Caller must hold read lock.
func (*NumericIndex) ReadFrom ¶
func (ni *NumericIndex) ReadFrom(r io.Reader) (int64, error)
ReadFrom reads the numeric index from an io.Reader.
func (*NumericIndex) Remove ¶
Remove marks a rowID for deferred deletion. Actual removal happens on Seal() to avoid hot-path slice reallocations. The value parameter is used only for type validation (must be numeric). Thread-safety: Caller must hold write lock.
func (*NumericIndex) Seal ¶
func (ni *NumericIndex) Seal()
Seal sorts all fields, compacts pending deletes, and builds bitmap indexes. Call this after bulk loading or before persisting. Uses parallel processing for large fields.
After Seal():
- The sorted arrays (values, rowIDs) are immutable
- The bitmap indexes (bitmapIndex, prefixBitmaps) are immutable
- Only pendingDeletes can be modified
Thread-safety: Caller must hold write lock.
type NumericOp ¶ added in v0.0.14
type NumericOp uint8
NumericOp represents a numeric comparison operation.
func NumericOpFromOperator ¶ added in v0.0.14
NumericOpFromOperator converts metadata.Operator to NumericOp.
type QueryScratch ¶
type QueryScratch struct {
// Tmp1 is a SIMD-friendly scratch bitmap for intermediate results.
// This is the execution-time bitmap - NOT roaring.
Tmp1 *bitmap.QueryBitmap
// Tmp2 is a second SIMD-friendly scratch bitmap for complex operations.
Tmp2 *bitmap.QueryBitmap
// TmpStorage is a roaring-based bitmap for storage-layer operations.
// Used as a bridge between storage (LocalBitmap) and execution (QueryBitmap).
// Example: NumericIndex.EvaluateFilterInto outputs here, then convert to QueryBitmap.
TmpStorage *LocalBitmap
// TmpRowIDs is a reusable buffer for collecting rowIDs
TmpRowIDs []uint32
// TmpRowIDs2 is a secondary buffer for memtable shard accumulation
// (avoids allocation when shards overwrite TmpRowIDs)
TmpRowIDs2 []uint32
// TmpIndices is a reusable buffer for SIMD index operations (int32 for SIMD compatibility)
TmpIndices []int32
}
QueryScratch provides reusable scratch space for query-local bitmap operations. This reduces allocations by reusing the same bitmaps across filter stages within a single query, rather than getting/putting from the pool per stage.
Architecture:
- Tmp1/Tmp2: QueryBitmap (SIMD, zero-alloc) for execution-time operations
- TmpStorage: LocalBitmap (roaring) for storage-layer bridge operations
The separation ensures Roaring never appears in hot execution paths while still supporting storage-layer operations that output to LocalBitmap.
Usage:
qs := GetQueryScratch() defer PutQueryScratch(qs) // use qs.Tmp1, qs.Tmp2 for SIMD operations // use qs.TmpStorage for storage-layer bridge
func GetQueryScratch ¶
func GetQueryScratch() *QueryScratch
GetQueryScratch gets a QueryScratch from the pool. All scratch buffers are cleared before being returned. Caller MUST call PutQueryScratch when done.
type RangeCursor ¶
type RangeCursor struct {
// contains filtered or unexported fields
}
RangeCursor iterates over a contiguous range [start, end). Extremely efficient for temporal/sequential data. Zero storage, zero allocations.
func NewRangeCursor ¶
func NewRangeCursor(start, end uint32) *RangeCursor
NewRangeCursor creates a FilterCursor for a contiguous range.
func (*RangeCursor) EstimateCardinality ¶
func (c *RangeCursor) EstimateCardinality() int
func (*RangeCursor) ForEach ¶
func (c *RangeCursor) ForEach(fn func(rowID uint32) bool)
func (*RangeCursor) IsAll ¶
func (c *RangeCursor) IsAll() bool
func (*RangeCursor) IsEmpty ¶
func (c *RangeCursor) IsEmpty() bool
type ReleasableCursor ¶ added in v0.0.14
type ReleasableCursor interface {
FilterCursor
// Release returns the cursor and any pooled resources (matchers, scratch buffers)
// back to their respective pools. Safe to call multiple times.
Release()
}
ReleasableCursor is an optional interface for cursors that hold pooled resources. Callers should check if the cursor implements this interface and call Release() after using the cursor to return resources to pools.
Usage:
cursor := segment.FilterCursor(filter)
defer func() {
if r, ok := cursor.(ReleasableCursor); ok {
r.Release()
}
}()
cursor.ForEach(fn)
type RowsCursor ¶
type RowsCursor struct {
// contains filtered or unexported fields
}
RowsCursor wraps a sorted []uint32 slice as a FilterCursor. This is the most common cursor type for low-cardinality results. Zero allocations - borrows the slice.
func NewRowsCursor ¶
func NewRowsCursor(rows []uint32) *RowsCursor
NewRowsCursor creates a FilterCursor from a sorted row slice. The slice is NOT copied - caller must ensure it outlives the cursor.
func (*RowsCursor) EstimateCardinality ¶
func (c *RowsCursor) EstimateCardinality() int
func (*RowsCursor) ForEach ¶
func (c *RowsCursor) ForEach(fn func(rowID uint32) bool)
func (*RowsCursor) IsAll ¶
func (c *RowsCursor) IsAll() bool
func (*RowsCursor) IsEmpty ¶
func (c *RowsCursor) IsEmpty() bool
type SparseNumericMatcher ¶ added in v0.0.14
type SparseNumericMatcher struct {
// contains filtered or unexported fields
}
SparseNumericMatcher uses map lookup for sparse rowID sets. Used for memtables and segments with non-contiguous rowIDs.
func GetSparseNumericMatcher ¶ added in v0.0.14
func GetSparseNumericMatcher(rowIDToIndex map[uint32]int, values []float64, op NumericOp, filterVal float64) *SparseNumericMatcher
GetSparseNumericMatcher returns a pooled SparseNumericMatcher.
func (*SparseNumericMatcher) Matches ¶ added in v0.0.14
func (m *SparseNumericMatcher) Matches(rowID uint32) bool
func (*SparseNumericMatcher) Release ¶ added in v0.0.14
func (m *SparseNumericMatcher) Release()
type Stats ¶
type Stats struct {
DocumentCount int // Total documents
FieldCount int // Number of indexed fields
BitmapCount int // Total number of bitmaps
TotalCardinality uint64 // Sum of all bitmap cardinalities
MemoryBytes uint64 // Estimated memory usage
}
Stats returns statistics about the unified index.
type UnifiedIndex ¶
type UnifiedIndex struct {
// contains filtered or unexported fields
}
UnifiedIndex combines metadata storage with inverted indexing using Bitmaps. This provides efficient hybrid vector + metadata search with minimal memory overhead.
Architecture:
- Primary storage: map[model.RowID]InternedDocument (metadata by RowID, interned keys)
- Inverted index: map[key]map[valueKey]*LocalBitmap (efficient posting lists)
- Numeric index: sorted (value, rowID) pairs for O(log n) range queries
Benefits:
- Memory efficient (Bitmap compression + String Interning)
- Fast filter compilation (Bitmap AND/OR operations)
- O(log n) numeric range queries (vs O(cardinality) without NumericIndex)
- Simple API (single unified type)
func NewUnifiedIndex ¶
func NewUnifiedIndex() *UnifiedIndex
NewUnifiedIndex creates a new unified metadata index.
func (*UnifiedIndex) AddInvertedIndex ¶
func (ui *UnifiedIndex) AddInvertedIndex(id model.RowID, doc metadata.Document)
AddInvertedIndex adds a document to the inverted index without storing the document itself. This is useful for building an index for immutable segments where documents are stored separately. Note: This does not support updates/deletes correctly as the old document is not known.
func (*UnifiedIndex) CompileFilter ¶
func (ui *UnifiedIndex) CompileFilter(fs *metadata.FilterSet) *LocalBitmap
CompileFilter compiles a FilterSet into a fast bitmap-based filter. Returns a bitmap of matching IDs, or nil if compilation fails.
Supported operators:
- OpEqual: field == value
- OpIn: field IN (value1, value2, ...)
- OpNotEqual, OpGreaterThan, etc.: Falls back to scanning
func (*UnifiedIndex) CompileFilterTo ¶
func (ui *UnifiedIndex) CompileFilterTo(fs *metadata.FilterSet, dst *LocalBitmap) bool
CompileFilterTo compiles a FilterSet into the provided destination bitmap. Returns true if compilation succeeded (all operators supported), false otherwise. The destination bitmap is cleared before use.
func (*UnifiedIndex) CreateFilterFunc ¶
CreateFilterFunc creates a filter function from a FilterSet. This is used by hybrid search to efficiently test membership.
Returns:
- Fast path: If compilation succeeds, returns bitmap-based O(1) lookup
- Slow path: Falls back to scanning + evaluating each document (locks per call)
func (*UnifiedIndex) CreateStreamingFilter ¶
func (ui *UnifiedIndex) CreateStreamingFilter(ctx context.Context, fs *metadata.FilterSet) func(model.RowID) bool
CreateStreamingFilter creates a filter function that checks the index without allocating intermediate bitmaps. The caller MUST hold the read lock (RLock) while using the returned function. The context is used for provider lookups on fallback paths.
func (*UnifiedIndex) Delete ¶
func (ui *UnifiedIndex) Delete(id model.RowID)
Delete removes metadata for an ID and updates the inverted index.
func (*UnifiedIndex) EvaluateFilter ¶
func (ui *UnifiedIndex) EvaluateFilter(fs *metadata.FilterSet) *LocalBitmap
EvaluateFilter evaluates a filter set against the inverted index and returns a bitmap. This method supports all operators, including numeric comparisons like OpLessThan.
Query Planning:
- Estimates selectivity for each filter
- Evaluates most selective filters first (cost-based optimization)
- Short-circuits on empty intermediate results
Lazy Execution:
- First filter uses zero-copy reference when possible (immutable OpEqual lookups)
- Cloning only occurs when mutation is needed (AND operations)
- This reduces N-1 bitmap allocations to 0-1 for typical multi-filter queries
For numeric comparisons (< > <= >= !=):
- Uses NumericIndex with O(log n + matches) binary search if field has numeric entries
- Falls back to O(cardinality) scan only for non-numeric or missing fields
NOTE: The returned bitmap is from a pool. Caller should call PutPooledBitmap when done.
func (*UnifiedIndex) EvaluateFilterResult ¶
func (ui *UnifiedIndex) EvaluateFilterResult(fs *metadata.FilterSet, qs *QueryScratch) FilterResult
EvaluateFilterResult evaluates a filter set using the dual-mode FilterResult. This is the zero-allocation version that avoids roaring bitmaps for low cardinality.
Design:
- Low cardinality: returns FilterRows mode (SIMD-friendly []uint32)
- High cardinality: returns FilterBitmap mode (roaring.Bitmap)
- No allocations in hot path (uses QueryScratch)
Returns:
- FilterResult with mode indicating the representation
- Caller does NOT need to return anything to pool (memory is query-scoped)
Thread-safety: Thread-safe (holds RLock during evaluation)
func (*UnifiedIndex) FilterCursor ¶
func (ui *UnifiedIndex) FilterCursor(fs *metadata.FilterSet, rowCount uint32) FilterCursor
FilterCursor returns a push-based cursor for filtered iteration. This is the zero-allocation hot path that:
- Avoids Roaring bitmap OR operations (no dst.Or(bitmap) calls)
- Evaluates filters lazily during iteration
- Supports early termination
For equality/IN filters: O(1) bitmap Contains() check per row For range filters: O(1) numeric comparison per row
The caller MUST hold the read lock (RLock) while using the cursor.
func (*UnifiedIndex) FilterCursorWithMatcher ¶ added in v0.0.14
func (ui *UnifiedIndex) FilterCursorWithMatcher(fs *metadata.FilterSet, rowCount uint32, scratch *MatcherScratch) *MatcherCursor
FilterCursorWithMatcher returns a zero-allocation cursor using FilterMatcher. This is the preferred API for hot paths.
Returns:
- cursor: The filter cursor (MUST call cursor.Release() when done)
- estimate: Estimated cardinality
The caller MUST hold the read lock (RLock) while using the cursor.
func (*UnifiedIndex) Get ¶
Get retrieves metadata for an ID. Returns nil if the ID doesn't exist. The context is used for provider lookups on fallback paths.
func (*UnifiedIndex) GetFilterMatcher ¶ added in v0.0.14
func (ui *UnifiedIndex) GetFilterMatcher(fs *metadata.FilterSet, scratch *MatcherScratch) FilterMatcher
GetFilterMatcher returns a pooled FilterMatcher for zero-allocation filter evaluation. This is the preferred API for hot paths - uses interface dispatch instead of closures.
Returns:
- AlwaysTrueMatcher if fs is nil or empty
- AlwaysFalseMatcher if any filter is impossible
- CompositeMatcher for multiple filters (AND semantics)
- Single matcher for single-filter cases
Caller MUST call Release() on the returned matcher when done. The caller MUST hold the read lock (RLock) while using the matcher.
func (*UnifiedIndex) GetStats ¶
func (ui *UnifiedIndex) GetStats() Stats
GetStats returns statistics about the index.
func (*UnifiedIndex) Len ¶
func (ui *UnifiedIndex) Len() int
Len returns the number of documents in the index.
func (*UnifiedIndex) Query ¶
func (ui *UnifiedIndex) Query(fs *metadata.FilterSet) (*LocalBitmap, error)
Query evaluates a filter set and returns a bitmap of matching documents. Returns nil if the filter set matches all documents (empty filter). Returns error if the filter contains unsupported operators (e.g. range queries).
func (*UnifiedIndex) RUnlock ¶
func (ui *UnifiedIndex) RUnlock()
RUnlock unlocks the index for reading.
func (*UnifiedIndex) ReadInvertedIndex ¶
func (ui *UnifiedIndex) ReadInvertedIndex(r io.Reader) error
ReadInvertedIndex reads the inverted index from the reader.
func (*UnifiedIndex) ScanFilter ¶
func (ui *UnifiedIndex) ScanFilter(fs *metadata.FilterSet) []model.RowID
ScanFilter evaluates a FilterSet by scanning all documents. This is slower than CompileFilter but supports all operators. Use this as a fallback when CompileFilter returns nil.
func (*UnifiedIndex) SealNumericIndex ¶
func (ui *UnifiedIndex) SealNumericIndex()
SealNumericIndex seals the numeric index for sorted binary search. This must be called after bulk loading with AddInvertedIndex is complete. After sealing, numeric range queries use O(log n) binary search instead of O(n) scan.
func (*UnifiedIndex) Set ¶
func (ui *UnifiedIndex) Set(id model.RowID, doc metadata.Document)
Set stores metadata for an ID and updates the inverted index. This replaces any existing metadata for the ID.
func (*UnifiedIndex) SetDocumentProvider ¶
func (ui *UnifiedIndex) SetDocumentProvider(provider DocumentProvider)
SetDocumentProvider sets the document provider for fallback retrieval.
func (*UnifiedIndex) ToMap ¶
func (ui *UnifiedIndex) ToMap() map[model.RowID]metadata.Document
ToMap returns a copy of all documents as a map. This is useful for serialization and snapshot creation.
func (*UnifiedIndex) WriteInvertedIndex ¶
func (ui *UnifiedIndex) WriteInvertedIndex(w io.Writer) error
WriteInvertedIndex writes the inverted index to the writer.