Class FileGraphStore<T>

Namespace
AiDotNet.RetrievalAugmentedGeneration.Graph
Assembly
AiDotNet.dll

File-based implementation of IGraphStore<T> with persistent storage on disk.

public class FileGraphStore<T> : IGraphStore<T>, IDisposable

Type Parameters

T

The numeric type used for vector operations.

Inheritance
object
FileGraphStore<T>
Implements
IGraphStore<T>
IDisposable

Remarks

This implementation provides persistent graph storage using files:

  • nodes.dat: Binary file containing serialized nodes
  • edges.dat: Binary file containing serialized edges
  • node_index.db: B-Tree index mapping node IDs to file offsets
  • edge_index.db: B-Tree index mapping edge IDs to file offsets

For Beginners: This stores your graph on disk so it survives restarts.

How it works:

  1. When you add a node, it's written to nodes.dat
  2. The position (offset) is recorded in node_index.db
  3. To retrieve a node, we look up its offset and read from that position
  4. Everything is saved to disk automatically
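
The steps above can be sketched in a few lines. This is a minimal illustration, not the AiDotNet implementation: the class name OffsetIndexedFile is hypothetical, the payload is a plain string rather than a serialized node, and an in-memory Dictionary stands in for the B-Tree index file.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical stand-in: an append-only data file plus a map from
// record ID to the byte offset where that record starts.
public sealed class OffsetIndexedFile : IDisposable
{
    private readonly FileStream _data;
    private readonly Dictionary<string, long> _offsets = new Dictionary<string, long>();

    public OffsetIndexedFile(string path)
    {
        _data = new FileStream(path, FileMode.Create, FileAccess.ReadWrite);
    }

    // Steps 1 and 2: append the record, then remember where it starts.
    public void Put(string id, string payload)
    {
        long offset = _data.Seek(0, SeekOrigin.End);
        using (var writer = new BinaryWriter(_data, Encoding.UTF8, leaveOpen: true))
        {
            writer.Write(payload); // written as a length-prefixed string
        }
        _data.Flush();
        _offsets[id] = offset; // an update simply points at the newest copy
    }

    // Step 3: look up the offset, seek to it, and read the record back.
    public string? Get(string id)
    {
        if (!_offsets.TryGetValue(id, out long offset)) return null;
        _data.Seek(offset, SeekOrigin.Begin);
        using (var reader = new BinaryReader(_data, Encoding.UTF8, leaveOpen: true))
        {
            return reader.ReadString();
        }
    }

    public void Dispose() => _data.Dispose();
}
```

In this sketch an updated record is appended rather than overwritten, so the data file only grows; whether FileGraphStore<T> reclaims the old space is not stated here.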

Pros:

  • 💾 Data persists across restarts
  • 🔄 Can handle graphs larger than RAM
  • 📁 Simple file-based storage (no database required)

Cons:

  • 🐌 Slower than in-memory (disk I/O overhead)
  • 🔒 Not suitable for concurrent access from multiple processes
  • 📦 No compression (files can be large)

Good for:

  • Applications that need to save graph state
  • Graphs up to a few million nodes
  • Single-process applications

Not good for:

  • Real-time systems requiring sub-millisecond latency
  • Multi-process concurrent access
  • Distributed systems (use Neo4j or similar instead)

Constructors

FileGraphStore(string, WriteAheadLog?)

Initializes a new instance of the FileGraphStore<T> class.

public FileGraphStore(string storageDirectory, WriteAheadLog? wal = null)

Parameters

storageDirectory string

The directory where graph data files will be stored.

wal WriteAheadLog

Optional Write-Ahead Log for ACID transactions and crash recovery.

Properties

EdgeCount

Gets the total number of edges in the graph store.

public int EdgeCount { get; }

Property Value

int

Remarks

For Beginners: This tells you how many relationships/connections exist between entities in the graph.

NodeCount

Gets the total number of nodes in the graph store.

public int NodeCount { get; }

Property Value

int

Remarks

For Beginners: This tells you how many entities (people, places, things) are stored in the graph.

Methods

AddEdge(GraphEdge<T>)

Adds an edge to the graph representing a relationship between two nodes.

public void AddEdge(GraphEdge<T> edge)

Parameters

edge GraphEdge<T>

The edge to add.

Remarks

This method creates a relationship between two existing nodes. Both the source and target nodes must already exist in the graph, otherwise an exception is thrown. The edge is indexed for efficient traversal from both directions.

For Beginners: This adds a connection between two entities.

Like saying "Alice knows Bob":

  • edge.SourceId = "alice_001"
  • edge.RelationType = "KNOWS"
  • edge.TargetId = "bob_002"
  • edge.Weight = 0.9 (how strong the relationship is)

Both Alice and Bob must already be added as nodes first!
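
A rough sketch of the two rules described above: endpoints must exist, and each edge is indexed in both directions. EdgeSketch and EdgeIndexSketch are hypothetical stand-ins, and InvalidOperationException is an assumed exception type, since this page does not say which exception AddEdge throws.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for GraphEdge<T>; only the four fields named above are modeled.
public sealed class EdgeSketch
{
    public string SourceId { get; }
    public string RelationType { get; }
    public string TargetId { get; }
    public double Weight { get; }

    public EdgeSketch(string sourceId, string relationType, string targetId, double weight)
    {
        SourceId = sourceId;
        RelationType = relationType;
        TargetId = targetId;
        Weight = weight;
    }
}

// Hypothetical in-memory index; a node "exists" once it has an entry in both maps.
public sealed class EdgeIndexSketch
{
    private readonly Dictionary<string, List<EdgeSketch>> _outgoing = new Dictionary<string, List<EdgeSketch>>();
    private readonly Dictionary<string, List<EdgeSketch>> _incoming = new Dictionary<string, List<EdgeSketch>>();

    public void AddNode(string nodeId)
    {
        if (_outgoing.ContainsKey(nodeId)) return;
        _outgoing[nodeId] = new List<EdgeSketch>();
        _incoming[nodeId] = new List<EdgeSketch>();
    }

    public void AddEdge(EdgeSketch edge)
    {
        // Both endpoints must already be present (exception type is an assumption).
        if (!_outgoing.ContainsKey(edge.SourceId))
            throw new InvalidOperationException("Source node '" + edge.SourceId + "' does not exist.");
        if (!_incoming.ContainsKey(edge.TargetId))
            throw new InvalidOperationException("Target node '" + edge.TargetId + "' does not exist.");

        // Index the same edge from both directions.
        _outgoing[edge.SourceId].Add(edge);
        _incoming[edge.TargetId].Add(edge);
    }

    public IReadOnlyList<EdgeSketch> GetOutgoingEdges(string nodeId)
    {
        return _outgoing.TryGetValue(nodeId, out var edges)
            ? (IReadOnlyList<EdgeSketch>)edges
            : Array.Empty<EdgeSketch>();
    }

    public IReadOnlyList<EdgeSketch> GetIncomingEdges(string nodeId)
    {
        return _incoming.TryGetValue(nodeId, out var edges)
            ? (IReadOnlyList<EdgeSketch>)edges
            : Array.Empty<EdgeSketch>();
    }
}
```

Keeping two maps doubles the bookkeeping on insert, but it means a traversal in either direction never has to scan every edge.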

AddEdgeAsync(GraphEdge<T>)

Asynchronously adds an edge to the graph.

public Task AddEdgeAsync(GraphEdge<T> edge)

Parameters

edge GraphEdge<T>

The edge to add.

Returns

Task

A task representing the asynchronous operation.

AddNode(GraphNode<T>)

Adds a node to the graph or updates it if it already exists.

public void AddNode(GraphNode<T> node)

Parameters

node GraphNode<T>

The node to add.

Remarks

This method stores a node in the graph. If a node with the same ID already exists, it will be updated with the new data. The node is automatically indexed by its label for efficient label-based queries.

For Beginners: This adds a new entity to the graph.

Like adding a person to a social network:

  • node.Id = "alice_001"
  • node.Label = "PERSON"
  • node.Properties = { "name": "Alice Smith", "age": 30 }

If Alice already exists, her information gets updated.
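
The add-or-update behavior and the label index can be pictured with a small in-memory sketch. NodeSketch and LabelIndexSketch are hypothetical stand-ins, not AiDotNet types, and the handling of a changed label on update is an assumption about what a consistent label index would need.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for GraphNode<T>, modeling only Id, Label, and Properties.
public sealed class NodeSketch
{
    public string Id { get; }
    public string Label { get; }
    public Dictionary<string, object> Properties { get; } = new Dictionary<string, object>();

    public NodeSketch(string id, string label)
    {
        Id = id;
        Label = label;
    }
}

public sealed class LabelIndexSketch
{
    private readonly Dictionary<string, NodeSketch> _nodes = new Dictionary<string, NodeSketch>();
    private readonly Dictionary<string, HashSet<string>> _idsByLabel = new Dictionary<string, HashSet<string>>();

    public int NodeCount => _nodes.Count;

    public void AddNode(NodeSketch node)
    {
        // Update path: drop the old label entry in case the label changed.
        if (_nodes.TryGetValue(node.Id, out var existing) &&
            _idsByLabel.TryGetValue(existing.Label, out var oldIds))
        {
            oldIds.Remove(existing.Id);
        }

        _nodes[node.Id] = node; // insert or overwrite

        if (!_idsByLabel.TryGetValue(node.Label, out var ids))
        {
            ids = new HashSet<string>();
            _idsByLabel[node.Label] = ids;
        }
        ids.Add(node.Id);
    }

    // Label lookup touches only the IDs filed under that label.
    public IEnumerable<NodeSketch> GetNodesByLabel(string label)
    {
        if (!_idsByLabel.TryGetValue(label, out var ids)) yield break;
        foreach (string id in ids) yield return _nodes[id];
    }
}
```

Adding "alice_001" twice leaves NodeCount at 1 in this sketch, which mirrors the update-if-exists behavior described above.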

AddNodeAsync(GraphNode<T>)

Asynchronously adds a node to the graph or updates it if it already exists.

public Task AddNodeAsync(GraphNode<T> node)

Parameters

node GraphNode<T>

The node to add.

Returns

Task

A task representing the asynchronous operation.

Remarks

This is the async version of AddNode(GraphNode<T>). Use this for file-based or database-backed stores to avoid blocking the thread during I/O operations.

For Beginners: This does the same as AddNode but doesn't block your app.

When should you use async?

  • FileGraphStore: Yes! (writes to disk)
  • MemoryGraphStore: Optional (no I/O, but provided for consistency)
  • Database stores: Definitely! (network I/O)

Example:

await store.AddNodeAsync(node);  // Non-blocking
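
To illustrate why awaiting disk writes helps, here is a self-contained sketch that does not use FileGraphStore<T> at all. AsyncWriteSketch and AppendAllAsync are hypothetical names; the only real API used is File.AppendAllTextAsync from System.IO.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

public static class AsyncWriteSketch
{
    // Hypothetical helper: appends one line per node ID and reports how many were written.
    public static async Task<int> AppendAllAsync(string path, IEnumerable<string> nodeIds)
    {
        int written = 0;
        foreach (string id in nodeIds)
        {
            // Awaiting each write keeps the lines in order; the calling thread
            // is released while the operating system performs the disk I/O.
            await File.AppendAllTextAsync(path, id + Environment.NewLine);
            written++;
        }
        return written;
    }
}
```

A caller would write `int count = await AsyncWriteSketch.AppendAllAsync(path, ids);` from an async method, in the same way the example above awaits AddNodeAsync.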

Clear()

Removes all nodes and edges from the graph.

public void Clear()

Remarks

This method clears the entire graph, removing all data. For persistent stores, this may involve deleting files or database records. Use with extreme caution!

For Beginners: This deletes EVERYTHING from the graph.

Like wiping the entire social network clean - all people and all connections gone!

⚠️ WARNING: This cannot be undone! Make backups first!

ClearAsync()

Asynchronously removes all nodes and edges from the graph.

public Task ClearAsync()

Returns

Task

A task representing the asynchronous operation.

Dispose()

Disposes the file graph store, ensuring all changes are flushed to disk.

public void Dispose()
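
The flush-on-dispose idea can be shown with a small stand-in class. BufferedLogSketch is hypothetical and far simpler than a graph store; it only demonstrates that buffered writes are guaranteed to reach the file once Dispose has run.

```csharp
using System;
using System.IO;

// Hypothetical stand-in: buffers lines in a StreamWriter and flushes them on Dispose.
public sealed class BufferedLogSketch : IDisposable
{
    private readonly StreamWriter _writer;
    private bool _disposed;

    public BufferedLogSketch(string path)
    {
        _writer = new StreamWriter(path, append: false);
    }

    public void Append(string line)
    {
        if (_disposed) throw new ObjectDisposedException(nameof(BufferedLogSketch));
        _writer.WriteLine(line); // may sit in the writer's buffer for now
    }

    public void Dispose()
    {
        if (_disposed) return; // safe to call more than once
        _writer.Flush();       // push buffered data to the file
        _writer.Dispose();     // release the file handle
        _disposed = true;
    }
}
```

Because FileGraphStore<T> implements IDisposable, the same `using` pattern applies to it: wrapping the store in a `using` block or declaration ensures Dispose runs even if an exception is thrown.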

GetAllEdges()

Gets all edges currently stored in the graph.

public IEnumerable<GraphEdge<T>> GetAllEdges()

Returns

IEnumerable<GraphEdge<T>>

Collection of all edges.

Remarks

This method retrieves every edge without any filtering. Use with caution on large graphs as it may be memory-intensive.

For Beginners: This gets every single relationship in the graph.

Like asking: "Show me every connection between all entities"

Warning: Large graphs can have millions of relationships!

GetAllEdgesAsync()

Asynchronously gets all edges currently stored in the graph.

public Task<IEnumerable<GraphEdge<T>>> GetAllEdgesAsync()

Returns

Task<IEnumerable<GraphEdge<T>>>

A task that represents the asynchronous operation. The task result contains all edges.

GetAllNodes()

Gets all nodes currently stored in the graph.

public IEnumerable<GraphNode<T>> GetAllNodes()

Returns

IEnumerable<GraphNode<T>>

Collection of all nodes.

Remarks

This method retrieves every node without any filtering. Use with caution on large graphs as it may be memory-intensive.

For Beginners: This gets every single entity in the graph.

Like asking: "Show me everyone and everything in the network"

Warning: If you have millions of entities, this could be slow and use lots of memory!

GetAllNodesAsync()

Asynchronously gets all nodes currently stored in the graph.

public Task<IEnumerable<GraphNode<T>>> GetAllNodesAsync()

Returns

Task<IEnumerable<GraphNode<T>>>

A task that represents the asynchronous operation. The task result contains all nodes.

GetEdge(string)

Retrieves an edge by its unique identifier.

public GraphEdge<T>? GetEdge(string edgeId)

Parameters

edgeId string

The unique identifier of the edge.

Returns

GraphEdge<T>

The edge if found; otherwise, null.

Remarks

For Beginners: This gets a specific relationship if you know its ID.

Edge IDs are usually auto-generated like: "alice_001_KNOWS_bob_002"

GetEdgeAsync(string)

Asynchronously retrieves an edge by its unique identifier.

public Task<GraphEdge<T>?> GetEdgeAsync(string edgeId)

Parameters

edgeId string

The unique identifier of the edge.

Returns

Task<GraphEdge<T>>

A task that represents the asynchronous operation. The task result contains the edge if found; otherwise, null.

GetIncomingEdges(string)

Gets all incoming edges to a specific node.

public IEnumerable<GraphEdge<T>> GetIncomingEdges(string nodeId)

Parameters

nodeId string

The target node ID.

Returns

IEnumerable<GraphEdge<T>>

Collection of incoming edges to the node.

Remarks

Incoming edges represent relationships where this node is the target. For example, if Alice KNOWS Bob, the "KNOWS" edge is incoming to Bob.

For Beginners: This finds all relationships coming IN to an entity.

If you ask for Bob's incoming edges, you get:

  • Alice KNOWS Bob
  • Charlie WORKS_WITH Bob
  • CompanyY EMPLOYS Bob

These are relationships others have WITH Bob.

GetIncomingEdgesAsync(string)

Asynchronously gets all incoming edges to a specific node.

public Task<IEnumerable<GraphEdge<T>>> GetIncomingEdgesAsync(string nodeId)

Parameters

nodeId string

The target node ID.

Returns

Task<IEnumerable<GraphEdge<T>>>

A task that represents the asynchronous operation. The task result contains the collection of incoming edges.

GetNode(string)

Retrieves a node by its unique identifier.

public GraphNode<T>? GetNode(string nodeId)

Parameters

nodeId string

The unique identifier of the node.

Returns

GraphNode<T>

The node if found; otherwise, null.

Remarks

For Beginners: This gets a specific entity if you know its ID.

Like asking: "Show me the person with ID 'alice_001'"

GetNodeAsync(string)

Asynchronously retrieves a node by its unique identifier.

public Task<GraphNode<T>?> GetNodeAsync(string nodeId)

Parameters

nodeId string

The unique identifier of the node.

Returns

Task<GraphNode<T>>

A task that represents the asynchronous operation. The task result contains the node if found; otherwise, null.

GetNodesByLabel(string)

Gets all nodes with a specific label.

public IEnumerable<GraphNode<T>> GetNodesByLabel(string label)

Parameters

label string

The node label to filter by (e.g., "PERSON", "COMPANY", "LOCATION").

Returns

IEnumerable<GraphNode<T>>

Collection of nodes with the specified label.

Remarks

Labels are used to categorize nodes by type. This enables efficient queries like "find all PERSON nodes" or "find all COMPANY nodes".

For Beginners: This finds all entities of a specific type.

Like asking: "Show me all PERSON nodes"
Returns: Alice, Bob, Charlie (all people in the graph)

Or: "Show me all COMPANY nodes"
Returns: Microsoft, Google, Amazon (all companies)

Labels are like categories or tags for organizing your entities.

GetNodesByLabelAsync(string)

Asynchronously gets all nodes with a specific label.

public Task<IEnumerable<GraphNode<T>>> GetNodesByLabelAsync(string label)

Parameters

label string

The node label to filter by.

Returns

Task<IEnumerable<GraphNode<T>>>

A task that represents the asynchronous operation. The task result contains the collection of nodes with the specified label.

GetOutgoingEdges(string)

Gets all outgoing edges from a specific node.

public IEnumerable<GraphEdge<T>> GetOutgoingEdges(string nodeId)

Parameters

nodeId string

The source node ID.

Returns

IEnumerable<GraphEdge<T>>

Collection of outgoing edges from the node.

Remarks

Outgoing edges represent relationships where this node is the source. For example, if Alice KNOWS Bob, the "KNOWS" edge is outgoing from Alice.

For Beginners: This finds all relationships going OUT from an entity.

If you ask for Alice's outgoing edges, you get:

  • Alice KNOWS Bob
  • Alice WORKS_AT CompanyX
  • Alice LIVES_IN Seattle

These are relationships that start at Alice and point to something else.

GetOutgoingEdgesAsync(string)

Asynchronously gets all outgoing edges from a specific node.

public Task<IEnumerable<GraphEdge<T>>> GetOutgoingEdgesAsync(string nodeId)

Parameters

nodeId string

The source node ID.

Returns

Task<IEnumerable<GraphEdge<T>>>

A task that represents the asynchronous operation. The task result contains the collection of outgoing edges.

RemoveEdge(string)

Removes an edge from the graph.

public bool RemoveEdge(string edgeId)

Parameters

edgeId string

The unique identifier of the edge to remove.

Returns

bool

True if the edge was found and removed; otherwise, false.

Remarks

For Beginners: This deletes a specific relationship.

Like saying "Alice no longer knows Bob" - removes just that connection, but Alice and Bob still exist in the graph.

RemoveEdgeAsync(string)

Asynchronously removes an edge from the graph.

public Task<bool> RemoveEdgeAsync(string edgeId)

Parameters

edgeId string

The unique identifier of the edge to remove.

Returns

Task<bool>

A task that represents the asynchronous operation. The task result is true if the edge was found and removed; otherwise, false.

RemoveNode(string)

Removes a node and all its connected edges from the graph.

public bool RemoveNode(string nodeId)

Parameters

nodeId string

The unique identifier of the node to remove.

Returns

bool

True if the node was found and removed; otherwise, false.

Remarks

This method removes a node and automatically cleans up all edges connected to it (both incoming and outgoing). This ensures the graph remains consistent.

For Beginners: This deletes an entity and all its connections.

Like removing Alice from the network:

  • Alice's profile is deleted
  • All "Alice KNOWS Bob" relationships are deleted
  • All "Bob KNOWS Alice" relationships are deleted

This keeps the graph clean - no broken connections!
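
The cascading cleanup can be sketched as follows. CascadeRemoveSketch is a hypothetical stand-in; for brevity it scans every edge to find the ones touching the node, whereas a store that indexes edges by source and target could look them up directly.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in showing node removal with edge cleanup.
public sealed class CascadeRemoveSketch
{
    private readonly HashSet<string> _nodes = new HashSet<string>();
    private readonly Dictionary<string, (string SourceId, string TargetId)> _edges =
        new Dictionary<string, (string SourceId, string TargetId)>();

    public int NodeCount => _nodes.Count;
    public int EdgeCount => _edges.Count;

    public void AddNode(string nodeId) => _nodes.Add(nodeId);

    public void AddEdge(string edgeId, string sourceId, string targetId)
    {
        _edges[edgeId] = (sourceId, targetId);
    }

    public bool RemoveNode(string nodeId)
    {
        if (!_nodes.Remove(nodeId)) return false; // nothing to remove

        // Collect first, then delete: a dictionary cannot be modified while enumerating it.
        var doomed = new List<string>();
        foreach (var pair in _edges)
        {
            if (pair.Value.SourceId == nodeId || pair.Value.TargetId == nodeId)
                doomed.Add(pair.Key);
        }
        foreach (string edgeId in doomed) _edges.Remove(edgeId);
        return true;
    }
}
```

After removing "alice_001" in this sketch, edges in both directions between Alice and Bob are gone, while an unrelated edge between Bob and Charlie survives.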

RemoveNodeAsync(string)

Asynchronously removes a node and all its connected edges from the graph.

public Task<bool> RemoveNodeAsync(string nodeId)

Parameters

nodeId string

The unique identifier of the node to remove.

Returns

Task<bool>

A task that represents the asynchronous operation. The task result is true if the node was found and removed; otherwise, false.