imetadata

package
v0.0.15 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jan 19, 2026 License: Apache-2.0 Imports: 17 Imported by: 0

Documentation

Overview

Package imetadata provides internal metadata indexing for efficient filtering.

This package implements the inverted index used for metadata filtering. It is separate from the public metadata package to avoid import cycles.

Architecture

UnifiedIndex combines primary storage with an inverted index:

Primary:  map[RowID]InternedDocument      - O(1) document retrieval
Inverted: map[field]map[value]*Bitmap     - O(1) filter compilation

Filter Compilation

Filters are compiled into bitmap operations:

OpEqual:  field == value  → single bitmap lookup
OpIn:     field IN [...]  → union of bitmaps (OR)
Multiple: AND all filter bitmaps

Unsupported operators (OpLessThan, etc.) fall back to document scanning.

Memory Efficiency

The package uses several techniques to minimize memory:

  • String interning via unique.Handle[string] (Go 1.23+)
  • Roaring Bitmaps for compressed posting lists
  • Value.Key() deduplication in inverted index

Thread Safety

UnifiedIndex is fully thread-safe. All public methods acquire appropriate locks (RLock for reads, Lock for writes). For streaming filters, callers must hold RLock manually via RLock()/RUnlock() methods.

Serialization

The inverted index supports binary serialization via WriteInvertedIndex and ReadInvertedIndex for persistence in disk segments.

Index

Constants

View Source
const DefaultUniverseSize = 1 << 20

DefaultUniverseSize is the default size for query-time bitmaps. 1M covers typical segment sizes; larger segments trigger reallocation.

View Source
const LowCardinalityThreshold = 512

LowCardinalityThreshold is the maximum distinct values for bitmap precomputation. Below this threshold, we precompute value→bitmap maps for O(1) equality lookups. Aligned with roaring bitmap container boundaries (array vs bitmap cutoff).

Variables

This section is empty.

Functions

func PutMatcherScratch added in v0.0.14

func PutMatcherScratch(s *MatcherScratch)

PutMatcherScratch returns a MatcherScratch to the pool.

func PutPooledBitmap

func PutPooledBitmap(b *LocalBitmap)

PutPooledBitmap returns a pooled bitmap to the pool. Only call this for bitmaps obtained via GetPooledBitmap. Passing nil is safe and will be ignored. WARNING: Do not use the bitmap after calling this function.

func PutQueryScratch

func PutQueryScratch(qs *QueryScratch)

PutQueryScratch returns a QueryScratch to the pool. Passing nil is safe and will be ignored.

func ReleaseFilterCursor added in v0.0.14

func ReleaseFilterCursor(cursor FilterCursor)

ReleaseFilterCursor releases a cursor if it implements ReleasableCursor. This is a convenience function for callers that want to release cursors without type assertion boilerplate.

Safe to call with nil or non-releasable cursors.

Types

type AllCursor

type AllCursor struct {
	// contains filtered or unexported fields
}

AllCursor represents "all rows match" (unfiltered query). Iterates 0..size-1.

func NewAllCursor

func NewAllCursor(size uint32) *AllCursor

NewAllCursor creates a FilterCursor that matches all rows in [0, size).

func (*AllCursor) EstimateCardinality

func (c *AllCursor) EstimateCardinality() int

func (*AllCursor) ForEach

func (c *AllCursor) ForEach(fn func(rowID uint32) bool)

func (*AllCursor) IsAll

func (c *AllCursor) IsAll() bool

func (*AllCursor) IsEmpty

func (c *AllCursor) IsEmpty() bool

type AlwaysFalseMatcher added in v0.0.14

type AlwaysFalseMatcher struct{}

AlwaysFalseMatcher always returns false (impossible filter).

func GetAlwaysFalseMatcher added in v0.0.14

func GetAlwaysFalseMatcher() *AlwaysFalseMatcher

func (*AlwaysFalseMatcher) Matches added in v0.0.14

func (m *AlwaysFalseMatcher) Matches(uint32) bool

func (*AlwaysFalseMatcher) Release added in v0.0.14

func (m *AlwaysFalseMatcher) Release()

type AlwaysTrueMatcher added in v0.0.14

type AlwaysTrueMatcher struct{}

AlwaysTrueMatcher always returns true (no filter).

func GetAlwaysTrueMatcher added in v0.0.14

func GetAlwaysTrueMatcher() *AlwaysTrueMatcher

func (*AlwaysTrueMatcher) Matches added in v0.0.14

func (m *AlwaysTrueMatcher) Matches(uint32) bool

func (*AlwaysTrueMatcher) Release added in v0.0.14

func (m *AlwaysTrueMatcher) Release()

type BitmapBuilder

type BitmapBuilder struct {
	// contains filtered or unexported fields
}

BitmapBuilder provides zero-allocation bitmap construction. Instead of building bitmaps by mutation (Add calls), it collects rowIDs into a preallocated slice, then builds the bitmap once via AddMany.

This eliminates:

  • arrayContainer growth allocations
  • container cloning during OR operations
  • Iterator allocations

Usage:

bb := NewBitmapBuilder()
bb.Reset(estimatedHits)
for _, id := range matches {
    bb.Append(id)
}
bitmap := bb.Finalize() // or bb.RowIDs() for direct slice access

func NewBitmapBuilder

func NewBitmapBuilder() *BitmapBuilder

NewBitmapBuilder creates a new BitmapBuilder with default capacity.

func (*BitmapBuilder) Append

func (bb *BitmapBuilder) Append(id uint32)

Append adds a rowID to the builder. This is pure slice append, extremely cache-friendly.

func (*BitmapBuilder) AppendMany

func (bb *BitmapBuilder) AppendMany(ids []uint32)

AppendMany adds multiple rowIDs to the builder.

func (*BitmapBuilder) Finalize

func (bb *BitmapBuilder) Finalize() *LocalBitmap

Finalize builds the bitmap from collected rowIDs. RowIDs are sorted (required for optimal AddMany) and added in one call. Returns the reusable bitmap (do NOT pool it).

func (*BitmapBuilder) FinalizeInto

func (bb *BitmapBuilder) FinalizeInto(dst *LocalBitmap)

FinalizeInto builds the bitmap into the provided LocalBitmap. This allows using a pooled bitmap while still benefiting from AddMany.

func (*BitmapBuilder) Len

func (bb *BitmapBuilder) Len() int

Len returns the number of rowIDs collected.

func (*BitmapBuilder) Reset

func (bb *BitmapBuilder) Reset(hitsHint int)

Reset prepares the builder for a new query. Call this before appending IDs. Zero allocations.

func (*BitmapBuilder) RowIDs

func (bb *BitmapBuilder) RowIDs() []uint32

RowIDs returns the collected rowIDs directly. For low cardinality filters, use this to skip bitmap entirely. The returned slice is only valid until the next Reset call.

type BitmapCursor

type BitmapCursor struct {
	// contains filtered or unexported fields
}

BitmapCursor wraps a LocalBitmap for direct iteration. Uses the bitmap's native iterator, avoiding slice extraction. Zero allocations during iteration.

func NewBitmapCursor

func NewBitmapCursor(bitmap *LocalBitmap) *BitmapCursor

NewBitmapCursor creates a FilterCursor from a bitmap. The bitmap is NOT copied - caller must ensure it outlives the cursor.

func (*BitmapCursor) EstimateCardinality

func (c *BitmapCursor) EstimateCardinality() int

func (*BitmapCursor) ForEach

func (c *BitmapCursor) ForEach(fn func(rowID uint32) bool)

func (*BitmapCursor) IsAll

func (c *BitmapCursor) IsAll() bool

func (*BitmapCursor) IsEmpty

func (c *BitmapCursor) IsEmpty() bool

type BitmapMatcher added in v0.0.14

type BitmapMatcher struct {
	// contains filtered or unexported fields
}

BitmapMatcher checks if rowID exists in a bitmap. Used for OpEqual and OpIn filters.

func GetBitmapMatcher added in v0.0.14

func GetBitmapMatcher(bm *LocalBitmap) *BitmapMatcher

GetBitmapMatcher returns a pooled BitmapMatcher.

func (*BitmapMatcher) Matches added in v0.0.14

func (m *BitmapMatcher) Matches(rowID uint32) bool

func (*BitmapMatcher) Release added in v0.0.14

func (m *BitmapMatcher) Release()

type Column

type Column interface {
	// Matches checks if row at index i matches the filter condition.
	Matches(i int, val metadata.Value, op metadata.Operator) bool
}

Column is the interface for typed column data. This matches the existing memtable column interface.

type ColumnFilterCursor

type ColumnFilterCursor struct {
	// contains filtered or unexported fields
}

ColumnFilterCursor evaluates filters directly against column data. This is the push-based replacement for EvaluateFilterResult. Filters are evaluated lazily during ForEach - no intermediate bitmap.

For AND logic (default): rows must pass ALL filters. Short-circuit evaluation stops at first non-match.

func NewColumnFilterCursor

func NewColumnFilterCursor(columns map[string]Column, filters []metadata.Filter, universeSize uint32) *ColumnFilterCursor

NewColumnFilterCursor creates a cursor that evaluates filters against columns. Columns and filters are borrowed - caller must ensure they outlive the cursor.

func (*ColumnFilterCursor) EstimateCardinality

func (c *ColumnFilterCursor) EstimateCardinality() int

func (*ColumnFilterCursor) ForEach

func (c *ColumnFilterCursor) ForEach(fn func(rowID uint32) bool)

func (*ColumnFilterCursor) IsAll

func (c *ColumnFilterCursor) IsAll() bool

func (*ColumnFilterCursor) IsEmpty

func (c *ColumnFilterCursor) IsEmpty() bool

type CompositeMatcher added in v0.0.14

type CompositeMatcher struct {
	// contains filtered or unexported fields
}

CompositeMatcher evaluates multiple matchers with AND semantics. Short-circuits on first false (most selective first for best perf).

func GetCompositeMatcher added in v0.0.14

func GetCompositeMatcher(matchers []FilterMatcher) *CompositeMatcher

GetCompositeMatcher returns a pooled CompositeMatcher.

func (*CompositeMatcher) Matches added in v0.0.14

func (m *CompositeMatcher) Matches(rowID uint32) bool

func (*CompositeMatcher) Release added in v0.0.14

func (m *CompositeMatcher) Release()

type DenseNumericMatcher added in v0.0.14

type DenseNumericMatcher struct {
	// contains filtered or unexported fields
}

DenseNumericMatcher uses array indexing for dense rowID ranges. Optimal for immutable segments where rowIDs are contiguous.

func GetDenseNumericMatcher added in v0.0.14

func GetDenseNumericMatcher(rowIDToIndex []int32, values []float64, op NumericOp, filterVal float64) *DenseNumericMatcher

GetDenseNumericMatcher returns a pooled DenseNumericMatcher.

func (*DenseNumericMatcher) Matches added in v0.0.14

func (m *DenseNumericMatcher) Matches(rowID uint32) bool

func (*DenseNumericMatcher) Release added in v0.0.14

func (m *DenseNumericMatcher) Release()

type DocumentProvider

type DocumentProvider func(ctx context.Context, id model.RowID) (metadata.Document, bool)

DocumentProvider is a function that retrieves a document by ID. The context should be used for cancellation and timeouts during I/O.

type EmptyCursor

type EmptyCursor struct{}

EmptyCursor represents "no rows match".

func (*EmptyCursor) EstimateCardinality

func (c *EmptyCursor) EstimateCardinality() int

func (*EmptyCursor) ForEach

func (c *EmptyCursor) ForEach(fn func(rowID uint32) bool)

func (*EmptyCursor) IsAll

func (c *EmptyCursor) IsAll() bool

func (*EmptyCursor) IsEmpty

func (c *EmptyCursor) IsEmpty() bool

type FilterCursor

type FilterCursor interface {
	// ForEach iterates over matching row IDs, calling fn for each.
	// Iteration stops early if fn returns false.
	// Zero allocations in the iteration path.
	ForEach(fn func(rowID uint32) bool)

	// EstimateCardinality returns an estimate of matching row IDs.
	// This is used for selectivity-based execution planning.
	// The estimate may be approximate (especially for complex filters).
	EstimateCardinality() int

	// IsEmpty returns true if no rows match (fast path).
	IsEmpty() bool

	// IsAll returns true if all rows match (fast path for no-filter case).
	IsAll() bool
}

FilterCursor is a push-based iterator for filter results. Unlike materialized FilterResult (which builds a bitmap/array), FilterCursor pushes matching row IDs directly to the consumer.

This design eliminates:

  • Roaring bitmap allocations (OR, Clone, Iterator)
  • Memory materialization (no intermediate arrays)
  • Early termination overhead (consumer controls flow)

Architecture: Filters push rows, search pulls nothing.

Example usage:

cursor.ForEach(func(rowID uint32) bool {
    distance := computeDistance(query, vectors[rowID])
    heap.PushMaybe(rowID, distance)
    return heap.Len() < k*10 // early termination
})

func GetEmptyCursor

func GetEmptyCursor() FilterCursor

GetEmptyCursor returns a singleton empty cursor.

type FilterMatcher added in v0.0.14

type FilterMatcher interface {
	// Matches returns true if the rowID passes the filter.
	// MUST be safe for concurrent use.
	Matches(rowID uint32) bool

	// Release returns the matcher to its pool.
	// MUST be called when done to avoid pool exhaustion.
	Release()
}

FilterMatcher is the zero-allocation interface for filter evaluation. Replaces closures with interface dispatch to eliminate heap escapes.

DESIGN RATIONALE: Go closures that capture variables escape to heap (~50 bytes each). For filtered search with N filters, this causes N allocations per query. Interface dispatch uses vtables (static) instead of closures (heap).

Performance:

  • Closure: ~5ns + 1 alloc
  • Interface: ~2ns + 0 alloc (vtable is static)

type FilterMode

type FilterMode uint8

FilterMode indicates the representation mode of a FilterResult.

const (
	// FilterNone indicates empty result (no matches).
	FilterNone FilterMode = iota
	// FilterAll indicates no filtering - all rows match.
	// This is semantically equivalent to a nil bitmap in the old API.
	FilterAll
	// FilterRows indicates result is stored as a sorted []uint32.
	FilterRows
	// FilterBitmap indicates result is stored as a QueryBitmap (SIMD, zero-alloc).
	FilterBitmap
	// FilterRange indicates result is a contiguous range [start, end).
	// Extremely efficient for temporal/sequential data.
	FilterRange
)

type FilterResult

type FilterResult struct {
	// contains filtered or unexported fields
}

FilterResult is a dual-mode filter result that avoids allocations. For low cardinality, it stores rows as []uint32 (SIMD-friendly). For high cardinality, it uses QueryBitmap (SIMD-accelerated, zero-alloc). For contiguous ranges, it stores [start, end) bounds (zero storage overhead).

IMPORTANT: FilterResult now uses QueryBitmap instead of roaring. This removes Roaring from all query hot paths.

Critical: FilterResult does NOT own its memory. The rows slice and bitmap pointer are borrowed from QueryScratch. Lifetime is query-scoped.

func AdaptiveResult

func AdaptiveResult(rows []uint32, segmentSize uint32) FilterResult

AdaptiveResult creates a FilterResult choosing the optimal representation based on cardinality and data characteristics. This is the recommended constructor when the optimal representation is not known in advance.

Thresholds:

  • Empty → FilterNone
  • 1 element → FilterRows (most cache-friendly)
  • ≤1024 elements → FilterRows (fits in L1 cache for iteration)
  • Contiguous range → FilterRange (zero storage overhead)
  • >1024 elements → FilterBitmap (SIMD-accelerated ops)

The segmentSize parameter enables range detection and FilterAll optimization.

func AllResult

func AllResult() FilterResult

AllResult creates a FilterResult that matches all rows. This indicates no filtering should be applied (equivalent to nil bitmap in old API).

func EmptyResult

func EmptyResult() FilterResult

EmptyResult creates an empty FilterResult.

func FilterResultAnd

func FilterResultAnd(a, b FilterResult, qs *QueryScratch) FilterResult

FilterResultAnd performs AND operation on two FilterResults. Design rules:

  • Rows ∩ Rows → Rows (two-pointer merge, zero-alloc)
  • Rows ∩ Bitmap → Rows (probe QueryBitmap, zero-alloc)
  • Rows ∩ Range → Rows (filter by range bounds, zero-alloc)
  • Range ∩ Range → Range (intersect bounds, O(1))
  • Bitmap ∩ Bitmap → SIMD AND, return rows if small
  • Range ∩ Bitmap → materialize range intersection

Uses QueryScratch for scratch space (zero allocations in steady state).

func QueryBitmapResult

func QueryBitmapResult(qb *bitmap.QueryBitmap) FilterResult

QueryBitmapResult creates a FilterResult from a QueryBitmap. The bitmap is NOT copied - caller must ensure it outlives the FilterResult. Zero allocations.

func RangeResult

func RangeResult(start, end uint32) FilterResult

RangeResult creates a FilterResult from a contiguous range [start, end). This is extremely efficient for sequential/temporal data where many consecutive rows match. Zero storage overhead beyond two integers. Returns FilterNone if start >= end.

func RowsResult

func RowsResult(rows []uint32) FilterResult

RowsResult creates a FilterResult from a []uint32 slice. The slice is NOT copied - caller must ensure it outlives the FilterResult. Zero allocations.

func (FilterResult) AsBitmap

func (fr FilterResult) AsBitmap() FilterResultBitmap

AsBitmap returns a FilterResultBitmap wrapper that implements segment.Bitmap. Zero allocations.

func (FilterResult) Cardinality

func (fr FilterResult) Cardinality() int

Cardinality returns the number of matching row IDs.

func (FilterResult) CardinalityUint64

func (fr FilterResult) CardinalityUint64() uint64

CardinalityUint64 returns the cardinality as uint64 (for segment.Bitmap interface).

func (FilterResult) Clone

func (fr FilterResult) Clone() FilterResult

Clone creates an independent copy of the FilterResult. For FilterRows mode, this copies the underlying slice to prevent aliasing issues when the original slice is reused. For FilterBitmap mode, this clones the QueryBitmap. For FilterRange mode, returns the same value (no data to copy). For FilterNone/FilterAll, returns the same value (no data to copy).

func (FilterResult) CloneInto

func (fr FilterResult) CloneInto(dst []uint32) (FilterResult, []uint32)

CloneInto copies the FilterResult into the provided buffer, returning a new FilterResult that owns its data. This avoids allocations when the caller provides a buffer with sufficient capacity. The returned FilterResult owns the slice (no aliasing with dst).

For FilterRows mode: appends rows to dst and returns a FilterResult pointing to the new slice. For FilterBitmap mode: extracts to rows mode (QueryBitmap is zero-alloc, no Clone needed). For FilterNone/FilterAll: returns the same value (no data to copy).

Example:

buf := make([]uint32, 0, 1024)
for _, fr := range filterResults {
    cloned, buf = fr.CloneInto(buf)
    collected = append(collected, cloned)
}

func (FilterResult) Contains

func (fr FilterResult) Contains(id uint32) bool

Contains checks if the given row ID is in the result.

func (FilterResult) ForEach

func (fr FilterResult) ForEach(fn func(uint32) bool)

ForEach iterates over all row IDs in the result. The callback should return true to continue, false to stop. Zero allocations.

func (FilterResult) IsAll

func (fr FilterResult) IsAll() bool

IsAll returns true if the result matches all rows (no filtering). This is semantically equivalent to a nil bitmap in the old API.

func (FilterResult) IsEmpty

func (fr FilterResult) IsEmpty() bool

IsEmpty returns true if the result contains no matches.

func (FilterResult) IsRange

func (fr FilterResult) IsRange() bool

IsRange returns true if the result is stored as a range.

func (FilterResult) Mode

func (fr FilterResult) Mode() FilterMode

Mode returns the current mode of the FilterResult.

func (FilterResult) QueryBitmap

func (fr FilterResult) QueryBitmap() *bitmap.QueryBitmap

QueryBitmap returns the underlying QueryBitmap (only valid for FilterBitmap mode). Returns nil for other modes.

func (FilterResult) Range

func (fr FilterResult) Range() (start, end uint32)

Range returns the range bounds (only valid for FilterRange mode). Returns (0, 0) for other modes.

func (FilterResult) Rows

func (fr FilterResult) Rows() []uint32

Rows returns the underlying rows slice (only valid for FilterRows mode). Returns nil for other modes.

func (FilterResult) ToArray

func (fr FilterResult) ToArray(scratch []uint32) []uint32

ToArray extracts all row IDs into a slice. For FilterRows, returns the slice directly (no copy). For FilterBitmap, uses the provided scratch buffer. For FilterRange, expands range into scratch buffer.

func (FilterResult) ToArrayInto

func (fr FilterResult) ToArrayInto(dst []uint32) []uint32

ToArrayInto is an alias for ToArray to satisfy segment.Bitmap interface. Copies all elements into dst, returning the populated slice. If dst has insufficient capacity, a new slice may be allocated.

type FilterResultBitmap

type FilterResultBitmap struct {
	// contains filtered or unexported fields
}

FilterResultBitmap wraps FilterResult to implement segment.Bitmap interface. This allows FilterResult to be used anywhere segment.Bitmap is expected. Zero allocations - the wrapper is a value type.

func (FilterResultBitmap) Cardinality

func (frb FilterResultBitmap) Cardinality() uint64

Cardinality returns the number of elements in the set.

func (FilterResultBitmap) Contains

func (frb FilterResultBitmap) Contains(id uint32) bool

Contains reports whether id is present in the set.

func (FilterResultBitmap) ForEach

func (frb FilterResultBitmap) ForEach(fn func(id uint32) bool)

ForEach calls fn for each id in the set.

func (FilterResultBitmap) ToArrayInto

func (frb FilterResultBitmap) ToArrayInto(dst []uint32) []uint32

ToArrayInto copies all elements into dst, returning the populated slice.

type FilterResultCursor

type FilterResultCursor struct {
	// contains filtered or unexported fields
}

FilterResultCursor wraps an existing FilterResult as a cursor. This provides backward compatibility during migration.

func NewFilterResultCursor

func NewFilterResultCursor(fr FilterResult) *FilterResultCursor

NewFilterResultCursor wraps a FilterResult as a FilterCursor.

func (*FilterResultCursor) EstimateCardinality

func (c *FilterResultCursor) EstimateCardinality() int

func (*FilterResultCursor) ForEach

func (c *FilterResultCursor) ForEach(fn func(rowID uint32) bool)

func (*FilterResultCursor) IsAll

func (c *FilterResultCursor) IsAll() bool

func (*FilterResultCursor) IsEmpty

func (c *FilterResultCursor) IsEmpty() bool

type LocalBitmap

type LocalBitmap struct {
	// contains filtered or unexported fields
}

LocalBitmap implements a 32-bit Roaring Bitmap. It wraps the official roaring implementation. Used for internal row filtering (RowID).

func GetPooledBitmap

func GetPooledBitmap() *LocalBitmap

GetPooledBitmap gets a bitmap from the pool for temporary use. IMPORTANT: Caller MUST call PutPooledBitmap when done to avoid pool exhaustion. The bitmap is cleared before being returned. Use this for temporary bitmaps in hot paths (e.g., filter evaluation). For owned bitmaps that outlive a function call, use NewLocalBitmap instead.

func NewLocalBitmap

func NewLocalBitmap() *LocalBitmap

NewLocalBitmap creates a new empty local bitmap. The returned bitmap is owned by the caller and should NOT be returned to the pool. Use this for long-lived bitmaps (e.g., inverted index storage, API boundaries). For temporary bitmaps in hot paths, use GetPooledBitmap instead.

func (*LocalBitmap) Add

func (b *LocalBitmap) Add(id uint32)

Add adds a RowID to the bitmap.

func (*LocalBitmap) AddMany

func (b *LocalBitmap) AddMany(ids []uint32)

AddMany adds multiple RowIDs to the bitmap in batch. This is more efficient than calling Add repeatedly.

func (*LocalBitmap) And

func (b *LocalBitmap) And(other *LocalBitmap)

And computes the intersection of two bitmaps.

func (*LocalBitmap) AndNot

func (b *LocalBitmap) AndNot(other *LocalBitmap)

AndNot computes the difference: b = b AND NOT other. This removes all elements in other from b.

func (*LocalBitmap) Cardinality

func (b *LocalBitmap) Cardinality() uint64

Cardinality returns the number of elements in the bitmap.

func (*LocalBitmap) Clear

func (b *LocalBitmap) Clear()

Clear removes all elements from the bitmap.

func (*LocalBitmap) Clone

func (b *LocalBitmap) Clone() *LocalBitmap

Clone returns a deep copy of the bitmap. The returned bitmap is owned (not pooled) and allocates a new roaring.Bitmap. For hot paths where you want to avoid allocation, use CloneTo with a pooled bitmap.

func (*LocalBitmap) CloneTo

func (b *LocalBitmap) CloneTo(dst *LocalBitmap)

CloneTo copies this bitmap's contents into the destination bitmap. The destination is cleared first, then filled via Or operation. This avoids roaring's internal Clone allocation when used with pooled bitmaps. Example:

dst := GetPooledBitmap()
src.CloneTo(dst)
// use dst...
PutPooledBitmap(dst)

func (*LocalBitmap) Contains

func (b *LocalBitmap) Contains(id uint32) bool

Contains checks if a RowID is in the bitmap.

func (*LocalBitmap) ForEach

func (b *LocalBitmap) ForEach(fn func(id uint32) bool)

ForEach iterates over the bitmap.

func (*LocalBitmap) GetSizeInBytes

func (b *LocalBitmap) GetSizeInBytes() uint64

GetSizeInBytes returns the size of the bitmap in bytes.

func (*LocalBitmap) IsEmpty

func (b *LocalBitmap) IsEmpty() bool

IsEmpty returns true if the bitmap is empty.

func (*LocalBitmap) Iterator

func (b *LocalBitmap) Iterator() iter.Seq[model.RowID]

Iterator returns an iterator over the bitmap.

func (*LocalBitmap) Or

func (b *LocalBitmap) Or(other *LocalBitmap)

Or computes the union of two bitmaps.

func (*LocalBitmap) ReadFrom

func (b *LocalBitmap) ReadFrom(r io.Reader) (int64, error)

ReadFrom reads the bitmap from an io.Reader.

func (*LocalBitmap) Remove

func (b *LocalBitmap) Remove(id uint32)

Remove removes a RowID from the bitmap.

func (*LocalBitmap) RunOptimize

func (b *LocalBitmap) RunOptimize()

RunOptimize attempts to further compress the bitmap by converting containers to run-length encoding where beneficial. This should be called after the bitmap is fully built and before it's used in queries.

This is especially effective for:

  • Bitmaps with consecutive ranges (e.g., prefix cumulative bitmaps)
  • Bitmaps built from sorted data
  • Dense regions followed by sparse regions

Trade-off: Slightly slower to build, but faster queries and smaller memory.

func (*LocalBitmap) ToArray

func (b *LocalBitmap) ToArray() []uint32

ToArray returns all elements in the bitmap as a slice. This allocates a new slice. For zero-alloc extraction, use ToArrayInto.

func (*LocalBitmap) ToArrayInto

func (b *LocalBitmap) ToArrayInto(dst []uint32) []uint32

ToArrayInto copies all elements into the provided buffer, returning the populated slice. If dst has insufficient capacity, a new slice is allocated and returned. This is the zero-alloc path for extracting rowIDs in hot loops.

Example:

rowIDs := b.ToArrayInto(scratch[:0])

func (*LocalBitmap) WriteTo

func (b *LocalBitmap) WriteTo(w io.Writer) (int64, error)

WriteTo writes the bitmap to an io.Writer.

type MatcherCursor added in v0.0.14

type MatcherCursor struct {
	// contains filtered or unexported fields
}

MatcherCursor iterates over row IDs using a FilterMatcher for zero-allocation evaluation. This is the preferred cursor for hot paths - eliminates closure allocations.

func NewMatcherCursor added in v0.0.14

func NewMatcherCursor(matcher FilterMatcher, rowCount uint32, estimate int) *MatcherCursor

NewMatcherCursor creates a cursor from a FilterMatcher. Caller MUST call Release() when done.

func (*MatcherCursor) EstimateCardinality added in v0.0.14

func (c *MatcherCursor) EstimateCardinality() int

func (*MatcherCursor) ForEach added in v0.0.14

func (c *MatcherCursor) ForEach(fn func(rowID uint32) bool)

func (*MatcherCursor) IsAll added in v0.0.14

func (c *MatcherCursor) IsAll() bool

func (*MatcherCursor) IsEmpty added in v0.0.14

func (c *MatcherCursor) IsEmpty() bool

func (*MatcherCursor) Release added in v0.0.14

func (c *MatcherCursor) Release()

Release returns the cursor and its matcher to pools.

type MatcherScratch added in v0.0.14

type MatcherScratch struct {
	Matchers []FilterMatcher
	Bitmaps  []*LocalBitmap // Scratch for collecting IN bitmaps
}

MatcherScratch is a pooled slice for collecting matchers.

func GetMatcherScratch added in v0.0.14

func GetMatcherScratch() *MatcherScratch

GetMatcherScratch returns a pooled MatcherScratch.

type MultiBitmapMatcher added in v0.0.14

type MultiBitmapMatcher struct {
	// contains filtered or unexported fields
}

MultiBitmapMatcher checks if rowID exists in any of the bitmaps. Used for OpIn filters with multiple values.

func GetMultiBitmapMatcher added in v0.0.14

func GetMultiBitmapMatcher(bitmaps []*LocalBitmap) *MultiBitmapMatcher

GetMultiBitmapMatcher returns a pooled MultiBitmapMatcher.

func (*MultiBitmapMatcher) Matches added in v0.0.14

func (m *MultiBitmapMatcher) Matches(rowID uint32) bool

func (*MultiBitmapMatcher) Release added in v0.0.14

func (m *MultiBitmapMatcher) Release()

type NotBitmapMatcher added in v0.0.14

type NotBitmapMatcher struct {
	// contains filtered or unexported fields
}

NotBitmapMatcher checks if rowID does NOT exist in a bitmap. Used for OpNotEqual filters.

func GetNotBitmapMatcher added in v0.0.14

func GetNotBitmapMatcher(bm *LocalBitmap) *NotBitmapMatcher

GetNotBitmapMatcher returns a pooled NotBitmapMatcher.

func (*NotBitmapMatcher) Matches added in v0.0.14

func (m *NotBitmapMatcher) Matches(rowID uint32) bool

func (*NotBitmapMatcher) Release added in v0.0.14

func (m *NotBitmapMatcher) Release()

type NumericIndex

type NumericIndex struct {
	// contains filtered or unexported fields
}

NumericIndex provides O(log n) range queries for numeric fields. Uses a columnar layout for optimal cache locality during binary search.

Architecture:

  • Columnar storage: values []float64 and rowIDs []uint32 (aligned)
  • Binary search on values for range boundaries
  • Batch bitmap add via AddMany for efficient result building
  • Precomputed bitmaps for low-cardinality fields (O(1) equality)

Performance:

  • Range queries: O(log n + m) where m is the number of matches
  • Equality (low cardinality): O(1) bitmap lookup
  • Cache-optimal: binary search touches only values array
  • Memory: 12 bytes per entry (8 float64 + 4 uint32) + bitmaps for low-card fields

When to use NumericIndex vs InvertedIndex:

  • NumericIndex: high cardinality fields (timestamps, prices, scores)
  • InvertedIndex: low cardinality fields (categories, status, types)

func NewNumericIndex

func NewNumericIndex() *NumericIndex

NewNumericIndex creates a new empty numeric index.

func (*NumericIndex) Add

func (ni *NumericIndex) Add(fieldKey string, value metadata.Value, rowID model.RowID)

Add adds a numeric value for a field and rowID. If the value is not numeric (int or float), it is ignored. Thread-safety: Caller must hold write lock.

func (*NumericIndex) Cardinality

func (ni *NumericIndex) Cardinality() int

Cardinality returns the total number of (value, rowID) pairs across all fields.

func (*NumericIndex) EstimateSelectivity

func (ni *NumericIndex) EstimateSelectivity(fieldKey string, op metadata.Operator, filterVal float64) float64

EstimateSelectivity estimates the selectivity of a filter on a field. Returns the fraction of rows that match (0.0 to 1.0).

func (*NumericIndex) EvaluateFilter

func (ni *NumericIndex) EvaluateFilter(f metadata.Filter) *LocalBitmap

EvaluateFilter evaluates a numeric filter using adaptive execution. - Low cardinality fields: O(1) bitmap lookup (precomputed) - High cardinality fields: O(log n) binary search on column Returns a bitmap of matching rowIDs. Supports: OpLessThan, OpLessEqual, OpGreaterThan, OpGreaterEqual, OpNotEqual, OpEqual Thread-safety: Caller must hold read lock.

NOTE: The returned bitmap is from a pool. Caller should call PutPooledBitmap when done. For zero-alloc hot paths, use EvaluateFilterInto instead.

func (*NumericIndex) EvaluateFilterInto

func (ni *NumericIndex) EvaluateFilterInto(f metadata.Filter, dst *LocalBitmap)

EvaluateFilterInto is the zero-allocation version of EvaluateFilter. Results are added (OR'd) into the provided destination bitmap. The destination is NOT cleared first - this allows building up results from multiple filters. For a fresh result, caller should clear dst before calling.

Performance: This is the hot-path optimized version. - No allocations in steady state - Caller controls bitmap lifecycle (pooled or owned)

Thread-safety: Caller must hold read lock.

func (*NumericIndex) FieldCount

func (ni *NumericIndex) FieldCount() int

FieldCount returns the number of indexed fields.

func (*NumericIndex) ForEachMatch

func (ni *NumericIndex) ForEachMatch(
	fieldKey string,
	op metadata.Operator,
	filterVal float64,
	fn func(rowID uint32) bool,
) bool

ForEachMatch iterates over matching row IDs without allocating a bitmap. This is the zero-allocation path for cursor-based filter evaluation. Returns true if the iteration was completed, false if fn returned false (early termination).

func (*NumericIndex) GetFieldMatcher

func (ni *NumericIndex) GetFieldMatcher(fieldKey string, op metadata.Operator, filterVal float64) func(rowID uint32) bool

GetFieldMatcher returns a closure that matches rowIDs against a filter condition. The field is looked up ONCE at call time, eliminating per-rowID string map lookups. This is the optimized hot path for FilterCursor.

Returns nil if the field doesn't exist.

func (*NumericIndex) GetNumericMatcher added in v0.0.14

func (ni *NumericIndex) GetNumericMatcher(fieldKey string, op metadata.Operator, filterVal float64) FilterMatcher

GetNumericMatcher returns a pooled FilterMatcher for zero-allocation filter evaluation. This replaces closure-based GetFieldMatcher to eliminate heap escapes.

Returns nil if the field doesn't exist or isn't sealed. Caller MUST call Release() on the returned matcher when done.

func (*NumericIndex) GetStats

func (ni *NumericIndex) GetStats(fieldKey string) (minVal, maxVal float64, cardinality int)

GetStats returns minVal, maxVal, and cardinality for a field. Returns zeros if field doesn't exist or is empty.

func (*NumericIndex) HasField

func (ni *NumericIndex) HasField(fieldKey string) bool

HasField returns true if the field has numeric entries.

func (*NumericIndex) IsLowCardinality

func (ni *NumericIndex) IsLowCardinality(fieldKey string) bool

IsLowCardinality returns true if the field has precomputed bitmap index.

func (*NumericIndex) IsSealed

func (ni *NumericIndex) IsSealed() bool

IsSealed returns true if the index has been sealed. After sealing, the index's sorted arrays and bitmap indexes are immutable and can be read without locks (except for pendingDeletes).

func (*NumericIndex) Len

func (ni *NumericIndex) Len(fieldKey string) int

Len returns the number of entries for a field.

func (*NumericIndex) MatchRowID

func (ni *NumericIndex) MatchRowID(fieldKey string, op metadata.Operator, filterVal float64, rowID uint32) bool

MatchRowID checks if a specific rowID matches a filter condition. Uses O(1) lookup via rowIDToIndex map (built during Seal). This is the zero-allocation hot path for FilterCursor range filters.

Returns false if:

  • Field doesn't exist
  • RowID not found in field
  • RowID has a pending delete
  • Value doesn't match the filter condition

Thread-safety: Caller must hold read lock.

func (*NumericIndex) QueryRange

func (ni *NumericIndex) QueryRange(
	fieldKey string,
	minVal, maxVal float64,
	includeMin, includeMax bool,
	dst *LocalBitmap,
)

QueryRange finds all rowIDs where fieldKey is in [minVal, maxVal]. Set includeMin/includeMax to control boundary inclusion. Results are added to dst using batch operations for efficiency. Skips pending deletes if any.

Thread-safety: Caller must hold read lock. Note: For unsorted data (before Seal()), this uses SIMD full-column scan. For sorted data (after Seal()), this uses binary search which is faster.

func (*NumericIndex) QueryRangeSIMD

func (ni *NumericIndex) QueryRangeSIMD(
	fieldKey string,
	minVal, maxVal float64,
	dst *LocalBitmap,
)

QueryRangeSIMD performs a SIMD-accelerated full-column scan for range queries. Unlike QueryRange which uses binary search on sorted data, this scans all values using vectorized comparisons.

When to use QueryRangeSIMD vs QueryRange:

  • QueryRangeSIMD: unsorted data (during bulk load, before Seal())
  • QueryRange: sorted data (after Seal()) - binary search is faster for sorted data

Benchmarks show binary search outperforms SIMD scan on sorted data because:

  • Binary search touches O(log n) cache lines vs O(n) for full scan
  • Modern CPUs have excellent branch prediction for binary search

The range is always inclusive: [minVal, maxVal]. For exclusive bounds, adjust minVal/maxVal with math.Nextafter. Thread-safety: Caller must hold read lock.

func (*NumericIndex) ReadFrom

func (ni *NumericIndex) ReadFrom(r io.Reader) (int64, error)

ReadFrom reads the numeric index from an io.Reader.

func (*NumericIndex) Remove

func (ni *NumericIndex) Remove(fieldKey string, value metadata.Value, rowID model.RowID)

Remove marks a rowID for deferred deletion. Actual removal happens on Seal() to avoid hot-path slice reallocations. The value parameter is used only for type validation (must be numeric). Thread-safety: Caller must hold write lock.

func (*NumericIndex) Seal

func (ni *NumericIndex) Seal()

Seal sorts all fields, compacts pending deletes, and builds bitmap indexes. Call this after bulk loading or before persisting. Uses parallel processing for large fields.

After Seal():

  • The sorted arrays (values, rowIDs) are immutable
  • The bitmap indexes (bitmapIndex, prefixBitmaps) are immutable
  • Only pendingDeletes can be modified

Thread-safety: Caller must hold write lock.

func (*NumericIndex) WriteTo

func (ni *NumericIndex) WriteTo(w io.Writer) (int64, error)

WriteTo writes the numeric index to an io.Writer in a compact binary format. Uses buffered writing for better I/O performance.

type NumericOp added in v0.0.14

type NumericOp uint8

NumericOp represents a numeric comparison operation.

const (
	NumericOpEq NumericOp = iota
	NumericOpNe
	NumericOpLt
	NumericOpLe
	NumericOpGt
	NumericOpGe
)

func NumericOpFromOperator added in v0.0.14

func NumericOpFromOperator(op metadata.Operator) NumericOp

NumericOpFromOperator converts metadata.Operator to NumericOp.

type QueryScratch

type QueryScratch struct {
	// Tmp1 is a SIMD-friendly scratch bitmap for intermediate results.
	// This is the execution-time bitmap - NOT roaring.
	Tmp1 *bitmap.QueryBitmap
	// Tmp2 is a second SIMD-friendly scratch bitmap for complex operations.
	Tmp2 *bitmap.QueryBitmap
	// TmpStorage is a roaring-based bitmap for storage-layer operations.
	// Used as a bridge between storage (LocalBitmap) and execution (QueryBitmap).
	// Example: NumericIndex.EvaluateFilterInto outputs here, then convert to QueryBitmap.
	TmpStorage *LocalBitmap
	// TmpRowIDs is a reusable buffer for collecting rowIDs
	TmpRowIDs []uint32
	// TmpRowIDs2 is a secondary buffer for memtable shard accumulation
	// (avoids allocation when shards overwrite TmpRowIDs)
	TmpRowIDs2 []uint32
	// TmpIndices is a reusable buffer for SIMD index operations (int32 for SIMD compatibility)
	TmpIndices []int32
}

QueryScratch provides reusable scratch space for query-local bitmap operations. This reduces allocations by reusing the same bitmaps across filter stages within a single query, rather than getting/putting from the pool per stage.

Architecture:

  • Tmp1/Tmp2: QueryBitmap (SIMD, zero-alloc) for execution-time operations
  • TmpStorage: LocalBitmap (roaring) for storage-layer bridge operations

The separation ensures Roaring never appears in hot execution paths while still supporting storage-layer operations that output to LocalBitmap.

Usage:

qs := GetQueryScratch()
defer PutQueryScratch(qs)
// use qs.Tmp1, qs.Tmp2 for SIMD operations
// use qs.TmpStorage for storage-layer bridge

func GetQueryScratch

func GetQueryScratch() *QueryScratch

GetQueryScratch gets a QueryScratch from the pool. All scratch buffers are cleared before being returned. Caller MUST call PutQueryScratch when done.

type RangeCursor

type RangeCursor struct {
	// contains filtered or unexported fields
}

RangeCursor iterates over a contiguous range [start, end). Extremely efficient for temporal/sequential data. Zero storage, zero allocations.

func NewRangeCursor

func NewRangeCursor(start, end uint32) *RangeCursor

NewRangeCursor creates a FilterCursor for a contiguous range.

func (*RangeCursor) EstimateCardinality

func (c *RangeCursor) EstimateCardinality() int

func (*RangeCursor) ForEach

func (c *RangeCursor) ForEach(fn func(rowID uint32) bool)

func (*RangeCursor) IsAll

func (c *RangeCursor) IsAll() bool

func (*RangeCursor) IsEmpty

func (c *RangeCursor) IsEmpty() bool

type ReleasableCursor added in v0.0.14

type ReleasableCursor interface {
	FilterCursor
	// Release returns the cursor and any pooled resources (matchers, scratch buffers)
	// back to their respective pools. Safe to call multiple times.
	Release()
}

ReleasableCursor is an optional interface for cursors that hold pooled resources. Callers should check if the cursor implements this interface and call Release() after using the cursor to return resources to pools.

Usage:

cursor := segment.FilterCursor(filter)
defer func() {
    if r, ok := cursor.(ReleasableCursor); ok {
        r.Release()
    }
}()
cursor.ForEach(fn)

type RowsCursor

type RowsCursor struct {
	// contains filtered or unexported fields
}

RowsCursor wraps a sorted []uint32 slice as a FilterCursor. This is the most common cursor type for low-cardinality results. Zero allocations - borrows the slice.

func NewRowsCursor

func NewRowsCursor(rows []uint32) *RowsCursor

NewRowsCursor creates a FilterCursor from a sorted row slice. The slice is NOT copied - caller must ensure it outlives the cursor.

func (*RowsCursor) EstimateCardinality

func (c *RowsCursor) EstimateCardinality() int

func (*RowsCursor) ForEach

func (c *RowsCursor) ForEach(fn func(rowID uint32) bool)

func (*RowsCursor) IsAll

func (c *RowsCursor) IsAll() bool

func (*RowsCursor) IsEmpty

func (c *RowsCursor) IsEmpty() bool

type SparseNumericMatcher added in v0.0.14

type SparseNumericMatcher struct {
	// contains filtered or unexported fields
}

SparseNumericMatcher uses map lookup for sparse rowID sets. Used for memtables and segments with non-contiguous rowIDs.

func GetSparseNumericMatcher added in v0.0.14

func GetSparseNumericMatcher(rowIDToIndex map[uint32]int, values []float64, op NumericOp, filterVal float64) *SparseNumericMatcher

GetSparseNumericMatcher returns a pooled SparseNumericMatcher.

func (*SparseNumericMatcher) Matches added in v0.0.14

func (m *SparseNumericMatcher) Matches(rowID uint32) bool

func (*SparseNumericMatcher) Release added in v0.0.14

func (m *SparseNumericMatcher) Release()

type Stats

type Stats struct {
	DocumentCount    int    // Total documents
	FieldCount       int    // Number of indexed fields
	BitmapCount      int    // Total number of bitmaps
	TotalCardinality uint64 // Sum of all bitmap cardinalities
	MemoryBytes      uint64 // Estimated memory usage
}

Stats returns statistics about the unified index.

type UnifiedIndex

type UnifiedIndex struct {
	// contains filtered or unexported fields
}

UnifiedIndex combines metadata storage with inverted indexing using Bitmaps. This provides efficient hybrid vector + metadata search with minimal memory overhead.

Architecture:

  • Primary storage: map[model.RowID]InternedDocument (metadata by RowID, interned keys)
  • Inverted index: map[key]map[valueKey]*LocalBitmap (efficient posting lists)
  • Numeric index: sorted (value, rowID) pairs for O(log n) range queries

Benefits:

  • Memory efficient (Bitmap compression + String Interning)
  • Fast filter compilation (Bitmap AND/OR operations)
  • O(log n) numeric range queries (vs O(cardinality) without NumericIndex)
  • Simple API (single unified type)

func NewUnifiedIndex

func NewUnifiedIndex() *UnifiedIndex

NewUnifiedIndex creates a new unified metadata index.

func (*UnifiedIndex) AddInvertedIndex

func (ui *UnifiedIndex) AddInvertedIndex(id model.RowID, doc metadata.Document)

AddInvertedIndex adds a document to the inverted index without storing the document itself. This is useful for building an index for immutable segments where documents are stored separately. Note: This does not support updates/deletes correctly as the old document is not known.

func (*UnifiedIndex) CompileFilter

func (ui *UnifiedIndex) CompileFilter(fs *metadata.FilterSet) *LocalBitmap

CompileFilter compiles a FilterSet into a fast bitmap-based filter. Returns a bitmap of matching IDs, or nil if compilation fails.

Supported operators:

  • OpEqual: field == value
  • OpIn: field IN (value1, value2, ...)
  • OpNotEqual, OpGreaterThan, etc.: Falls back to scanning

func (*UnifiedIndex) CompileFilterTo

func (ui *UnifiedIndex) CompileFilterTo(fs *metadata.FilterSet, dst *LocalBitmap) bool

CompileFilterTo compiles a FilterSet into the provided destination bitmap. Returns true if compilation succeeded (all operators supported), false otherwise. The destination bitmap is cleared before use.

func (*UnifiedIndex) CreateFilterFunc

func (ui *UnifiedIndex) CreateFilterFunc(fs *metadata.FilterSet) func(model.RowID) bool

CreateFilterFunc creates a filter function from a FilterSet. This is used by hybrid search to efficiently test membership.

Returns:

  • Fast path: If compilation succeeds, returns bitmap-based O(1) lookup
  • Slow path: Falls back to scanning + evaluating each document (locks per call)

func (*UnifiedIndex) CreateStreamingFilter

func (ui *UnifiedIndex) CreateStreamingFilter(ctx context.Context, fs *metadata.FilterSet) func(model.RowID) bool

CreateStreamingFilter creates a filter function that checks the index without allocating intermediate bitmaps. The caller MUST hold the read lock (RLock) while using the returned function. The context is used for provider lookups on fallback paths.

func (*UnifiedIndex) Delete

func (ui *UnifiedIndex) Delete(id model.RowID)

Delete removes metadata for an ID and updates the inverted index.

func (*UnifiedIndex) EvaluateFilter

func (ui *UnifiedIndex) EvaluateFilter(fs *metadata.FilterSet) *LocalBitmap

EvaluateFilter evaluates a filter set against the inverted index and returns a bitmap. This method supports all operators, including numeric comparisons like OpLessThan.

Query Planning:

  • Estimates selectivity for each filter
  • Evaluates most selective filters first (cost-based optimization)
  • Short-circuits on empty intermediate results

Lazy Execution:

  • First filter uses zero-copy reference when possible (immutable OpEqual lookups)
  • Cloning only occurs when mutation is needed (AND operations)
  • This reduces N-1 bitmap allocations to 0-1 for typical multi-filter queries

For numeric comparisons (< > <= >= !=):

  • Uses NumericIndex with O(log n + matches) binary search if field has numeric entries
  • Falls back to O(cardinality) scan only for non-numeric or missing fields

NOTE: The returned bitmap is from a pool. Caller should call PutPooledBitmap when done.

func (*UnifiedIndex) EvaluateFilterResult

func (ui *UnifiedIndex) EvaluateFilterResult(fs *metadata.FilterSet, qs *QueryScratch) FilterResult

EvaluateFilterResult evaluates a filter set using the dual-mode FilterResult. This is the zero-allocation version that avoids roaring bitmaps for low cardinality.

Design:

  • Low cardinality: returns FilterRows mode (SIMD-friendly []uint32)
  • High cardinality: returns FilterBitmap mode (roaring.Bitmap)
  • No allocations in hot path (uses QueryScratch)

Returns:

  • FilterResult with mode indicating the representation
  • Caller does NOT need to return anything to pool (memory is query-scoped)

Thread-safety: Thread-safe (holds RLock during evaluation)

func (*UnifiedIndex) FilterCursor

func (ui *UnifiedIndex) FilterCursor(fs *metadata.FilterSet, rowCount uint32) FilterCursor

FilterCursor returns a push-based cursor for filtered iteration. This is the zero-allocation hot path that:

  • Avoids Roaring bitmap OR operations (no dst.Or(bitmap) calls)
  • Evaluates filters lazily during iteration
  • Supports early termination

For equality/IN filters: O(1) bitmap Contains() check per row For range filters: O(1) numeric comparison per row

The caller MUST hold the read lock (RLock) while using the cursor.

func (*UnifiedIndex) FilterCursorWithMatcher added in v0.0.14

func (ui *UnifiedIndex) FilterCursorWithMatcher(fs *metadata.FilterSet, rowCount uint32, scratch *MatcherScratch) *MatcherCursor

FilterCursorWithMatcher returns a zero-allocation cursor using FilterMatcher. This is the preferred API for hot paths.

Returns:

  • cursor: The filter cursor (MUST call cursor.Release() when done)
  • estimate: Estimated cardinality

The caller MUST hold the read lock (RLock) while using the cursor.

func (*UnifiedIndex) Get

Get retrieves metadata for an ID. Returns nil if the ID doesn't exist. The context is used for provider lookups on fallback paths.

func (*UnifiedIndex) GetFilterMatcher added in v0.0.14

func (ui *UnifiedIndex) GetFilterMatcher(fs *metadata.FilterSet, scratch *MatcherScratch) FilterMatcher

GetFilterMatcher returns a pooled FilterMatcher for zero-allocation filter evaluation. This is the preferred API for hot paths - uses interface dispatch instead of closures.

Returns:

  • AlwaysTrueMatcher if fs is nil or empty
  • AlwaysFalseMatcher if any filter is impossible
  • CompositeMatcher for multiple filters (AND semantics)
  • Single matcher for single-filter cases

Caller MUST call Release() on the returned matcher when done. The caller MUST hold the read lock (RLock) while using the matcher.

func (*UnifiedIndex) GetStats

func (ui *UnifiedIndex) GetStats() Stats

GetStats returns statistics about the index.

func (*UnifiedIndex) Len

func (ui *UnifiedIndex) Len() int

Len returns the number of documents in the index.

func (*UnifiedIndex) Query

func (ui *UnifiedIndex) Query(fs *metadata.FilterSet) (*LocalBitmap, error)

Query evaluates a filter set and returns a bitmap of matching documents. Returns nil if the filter set matches all documents (empty filter). Returns error if the filter contains unsupported operators (e.g. range queries).

func (*UnifiedIndex) RLock

func (ui *UnifiedIndex) RLock()

RLock locks the index for reading.

func (*UnifiedIndex) RUnlock

func (ui *UnifiedIndex) RUnlock()

RUnlock unlocks the index for reading.

func (*UnifiedIndex) ReadInvertedIndex

func (ui *UnifiedIndex) ReadInvertedIndex(r io.Reader) error

ReadInvertedIndex reads the inverted index from the reader.

func (*UnifiedIndex) ScanFilter

func (ui *UnifiedIndex) ScanFilter(fs *metadata.FilterSet) []model.RowID

ScanFilter evaluates a FilterSet by scanning all documents. This is slower than CompileFilter but supports all operators. Use this as a fallback when CompileFilter returns nil.

func (*UnifiedIndex) SealNumericIndex

func (ui *UnifiedIndex) SealNumericIndex()

SealNumericIndex seals the numeric index for sorted binary search. This must be called after bulk loading with AddInvertedIndex is complete. After sealing, numeric range queries use O(log n) binary search instead of O(n) scan.

func (*UnifiedIndex) Set

func (ui *UnifiedIndex) Set(id model.RowID, doc metadata.Document)

Set stores metadata for an ID and updates the inverted index. This replaces any existing metadata for the ID.

func (*UnifiedIndex) SetDocumentProvider

func (ui *UnifiedIndex) SetDocumentProvider(provider DocumentProvider)

SetDocumentProvider sets the document provider for fallback retrieval.

func (*UnifiedIndex) ToMap

func (ui *UnifiedIndex) ToMap() map[model.RowID]metadata.Document

ToMap returns a copy of all documents as a map. This is useful for serialization and snapshot creation.

func (*UnifiedIndex) WriteInvertedIndex

func (ui *UnifiedIndex) WriteInvertedIndex(w io.Writer) error

WriteInvertedIndex writes the inverted index to the writer.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL