Table of Contents

Class BTreeIndex

Namespace
AiDotNet.RetrievalAugmentedGeneration.Graph
Assembly
AiDotNet.dll

Simple file-based index for mapping string keys to file offsets.

public class BTreeIndex : IDisposable
Inheritance
BTreeIndex
Implements
Inherited Members

Remarks

This class provides a persistent index structure that maps string keys (e.g., node IDs) to byte offsets in data files. The index is stored on disk and reloaded on restart, enabling fast lookups without scanning entire data files.

For Beginners: Think of this like a book's index at the back.

Without an index:

  • To find "photosynthesis", you'd read every page from start to finish
  • Very slow for large books (or large data files)

With an index:

  • Look up "photosynthesis" → find it's on page 157
  • Jump directly to page 157
  • Much faster!

This class does the same for graph data:

  • Key: "node_alice_001"
  • Value: byte offset 45678 in nodes.dat file
  • We can jump directly to byte 45678 to read Alice's data

The index itself is stored in a file so it survives application restarts.

Implementation Note: This is a simplified index using a sorted dictionary. For production systems with millions of entries, consider implementing a true B-Tree structure with splitting/merging nodes, or use an embedded database like SQLite or LevelDB.

Constructors

BTreeIndex(string)

Initializes a new instance of the BTreeIndex class.

public BTreeIndex(string indexFilePath)

Parameters

indexFilePath string

The path to the index file on disk.

Properties

Count

Gets the number of entries in the index.

public int Count { get; }

Property Value

int

Methods

Add(string, long)

Adds or updates a key-offset mapping in the index.

public void Add(string key, long offset)

Parameters

key string

The key to index (e.g., node ID).

offset long

The byte offset in the data file.

Remarks

For Beginners: This adds an entry to the index.

Example:

  • index.Add("alice", 1024)
  • This means: "Alice's data starts at byte 1024 in the file"

If "alice" already exists, it updates to the new offset. The index is marked as "dirty" and will be saved to disk later.

Clear()

Removes all entries from the index.

public void Clear()

Contains(string)

Checks if a key exists in the index.

public bool Contains(string key)

Parameters

key string

The key to check.

Returns

bool

True if the key exists; otherwise, false.

Dispose()

Disposes the index, ensuring all changes are saved to disk.

public void Dispose()

Dispose(bool)

Releases resources used by the index.

protected virtual void Dispose(bool disposing)

Parameters

disposing bool

True if called from Dispose(), false if called from finalizer.

Flush()

Saves the index to disk if it has been modified.

public void Flush()

Remarks

This method writes the index to disk in a simple text format: - Each line: "key|offset" - Example: "alice|1024"

The file is only written if changes have been made (_isDirty flag). This is called automatically by Dispose() to ensure data isn't lost.

For Beginners: This saves the index to a file.

Think of it like saving your work:

  • You've been adding entries to the index (in memory)
  • Flush() writes everything to disk
  • Now the index survives if the program crashes or restarts

Format example (nodes_index.db):

alice|1024
bob|2048
charlie|3072

Get(string)

Retrieves the file offset associated with a key.

public long Get(string key)

Parameters

key string

The key to look up.

Returns

long

The byte offset if found; otherwise, -1.

Remarks

For Beginners: This looks up where data is stored.

Example:

  • var offset = index.Get("alice")
  • Returns: 1024 (meaning Alice's data is at byte 1024)
  • Or returns -1 if "alice" is not in the index

GetAllKeys()

Gets all keys in the index.

public IEnumerable<string> GetAllKeys()

Returns

IEnumerable<string>

Collection of all keys.

Remove(string)

Removes a key from the index.

public bool Remove(string key)

Parameters

key string

The key to remove.

Returns

bool

True if the key was found and removed; otherwise, false.

Remarks

For Beginners: This removes an entry from the index.

Example:

  • index.Remove("alice")
  • Now we can no longer look up where Alice's data is
  • The actual data file is NOT modified - just the index

Note: For production systems, you'd typically mark entries as deleted rather than removing them, to avoid fragmenting the data file.