Class MemoryNetwork<T>

Namespace
AiDotNet.NeuralNetworks
Assembly
AiDotNet.dll

Represents a Memory Network, a neural network architecture designed with explicit memory components for improved reasoning and question answering capabilities.

public class MemoryNetwork<T> : NeuralNetworkBase<T>, INeuralNetworkModel<T>, INeuralNetwork<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable

Type Parameters

T

The numeric type used for calculations, typically float or double.

Inheritance
NeuralNetworkBase<T> → MemoryNetwork<T>
Implements
IFullModel<T, Tensor<T>, Tensor<T>>
IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>
IParameterizable<T, Tensor<T>, Tensor<T>>
ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>
IGradientComputable<T, Tensor<T>, Tensor<T>>

Remarks

Memory Networks combine neural network components with a long-term memory matrix that can be read from and written to. This architecture allows the network to store information persistently and access it selectively when needed, making it particularly effective for tasks requiring reasoning over facts or answering questions based on provided context.

For Beginners: A Memory Network is a special type of neural network that has its own "memory storage" component.

Think of it like a person who has:

  • A notebook (the memory) where they can write down important facts
  • The ability to read specific information from their notebook when needed
  • The ability to add new information to their notebook as they learn

For example, if you provided a Memory Network with several facts about a topic:

  • It would store these facts in its memory matrix
  • When asked a question, it would search its memory for relevant information
  • It would use this retrieved information to formulate an answer

Memory Networks are particularly good at:

  • Question answering based on a given context
  • Reasoning tasks that require remembering multiple facts
  • Dialog systems that need to maintain conversation history
  • Tasks where information needs to be remembered and used later
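The store-facts-then-answer flow described above can be sketched end to end. This is a hypothetical usage sketch: `BuildArchitecture` and `EncodeText` are assumed helpers standing in for your own architecture setup and text preprocessing, not part of this API.

```csharp
// Hypothetical end-to-end sketch. `BuildArchitecture` and `EncodeText`
// are assumed helpers from your own code, not part of AiDotNet.
NeuralNetworkArchitecture<double> arch = BuildArchitecture();
var network = new MemoryNetwork<double>(arch, memorySize: 128, embeddingSize: 64);

// Store a few facts in the memory matrix.
network.StoreFact(EncodeText("Mary went to the garden."));
network.StoreFact(EncodeText("John picked up the milk."));

// Ask a question; the network retrieves relevant facts to answer.
Tensor<double> answer = network.AnswerQuestion(EncodeText("Who has the milk?"));
```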

Constructors

MemoryNetwork(NeuralNetworkArchitecture<T>, int, int, ILossFunction<T>?)

Initializes a new instance of the MemoryNetwork<T> class with the specified architecture and memory parameters.

public MemoryNetwork(NeuralNetworkArchitecture<T> architecture, int memorySize, int embeddingSize, ILossFunction<T>? lossFunction = null)

Parameters

architecture NeuralNetworkArchitecture<T>

The neural network architecture defining the structure of the network.

memorySize int

The number of memory slots in the memory matrix.

embeddingSize int

The dimensionality of each memory embedding vector.

lossFunction ILossFunction<T>

The loss function used to measure training error. Optional; when null, a default loss function is used.

Remarks

This constructor creates a Memory Network with the specified architecture and memory configuration. It initializes the memory matrix to the given dimensions and sets up the network layers based on the provided architecture or default configuration if no specific layers are provided.

For Beginners: This creates a new Memory Network with your chosen settings.

When creating a Memory Network, you specify three main things:

  1. Architecture: The basic structure of your network (input/output sizes, etc.)

  2. Memory Size: How many separate facts the network can remember

    • Like choosing how many pages your notebook has
    • More memory slots = more storage capacity

  3. Embedding Size: How detailed each stored fact can be

    • Like deciding how much detail to write on each notebook page
    • Larger embeddings = more detailed representations

Once created, the network initializes an empty memory matrix (like a blank notebook) and sets up the layers needed for processing inputs and interacting with memory.
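As a sketch (the `BuildArchitecture` helper below is a hypothetical stand-in for however you construct a NeuralNetworkArchitecture<T> in your project):

```csharp
// A network with 128 memory slots, each a 64-dimensional embedding.
// Omitting lossFunction (null) selects the default loss.
NeuralNetworkArchitecture<double> arch = BuildArchitecture(); // hypothetical helper
var network = new MemoryNetwork<double>(arch, memorySize: 128, embeddingSize: 64);
```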

Methods

AnswerQuestion(Tensor<T>)

Queries the memory network with a question and returns the answer.

public Tensor<T> AnswerQuestion(Tensor<T> question)

Parameters

question Tensor<T>

The question tensor.

Returns

Tensor<T>

The answer tensor.

Remarks

This method is a wrapper around the Predict method that is semantically meaningful for question answering tasks, which are a common use case for Memory Networks.

For Beginners: This asks the Memory Network a question and gets an answer.

When asking a question:

  • The question is processed through the network
  • The network accesses relevant information from its memory
  • It combines the question with retrieved memory to generate an answer

This is the most common way to use a Memory Network once it's trained, especially for question answering and reasoning tasks.
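For example, assuming `network` is a trained MemoryNetwork<double> and `EncodeQuestion` is a hypothetical helper that turns text into a Tensor<double> (encoding depends on your own preprocessing pipeline):

```csharp
Tensor<double> question = EncodeQuestion("Where is the football?"); // hypothetical encoder
Tensor<double> answer = network.AnswerQuestion(question);
// Equivalent to network.Predict(question), but reads more naturally in QA code.
```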

CreateNewInstance()

Creates a new instance of the Memory Network with the same architecture and configuration.

protected override IFullModel<T, Tensor<T>, Tensor<T>> CreateNewInstance()

Returns

IFullModel<T, Tensor<T>, Tensor<T>>

A new Memory Network instance with the same architecture and configuration.

Remarks

This method creates a new instance of the Memory Network with the same architecture and memory configuration as the current instance. It's used in scenarios where a fresh copy of the model is needed while maintaining the same configuration.

For Beginners: This method creates a brand new copy of the Memory Network with the same setup.

Think of it like creating a clone of the network:

  • The new network has the same architecture (structure)
  • It has the same memory size and embedding size
  • But it's a completely separate instance with its own memory matrix
  • The memory starts fresh (empty) rather than copying the current memory contents

This is useful when you want to:

  • Train multiple versions of the same memory network architecture
  • Start with a clean memory but the same network structure
  • Compare how different training approaches affect learning with the same configuration

DeserializeNetworkSpecificData(BinaryReader)

Deserializes memory network-specific data from a binary reader.

protected override void DeserializeNetworkSpecificData(BinaryReader reader)

Parameters

reader BinaryReader

The binary reader to read from.

Remarks

This method loads the state of a previously saved Memory Network from a binary stream. It restores memory network-specific parameters like the memory matrix contents, allowing the model to continue from exactly where it left off.

For Beginners: This loads a complete Memory Network from a saved file.

When loading the Memory Network:

  • Memory matrix contents are restored (all the stored facts)
  • Configuration parameters are restored
  • Neural network parameters are restored

This lets you:

  • Continue working with a model exactly where you left off
  • Use a model that someone else has trained
  • Deploy pre-trained models in applications
  • Restore the memory of facts the network had previously learned

GetModelMetadata()

Gets metadata about the memory network model.

public override ModelMetadata<T> GetModelMetadata()

Returns

ModelMetadata<T>

A ModelMetadata<T> object containing information about the memory network.

Remarks

This method returns comprehensive metadata about the Memory Network, including its architecture, memory configuration, and other relevant parameters. This information is useful for model management, tracking experiments, and reporting.

For Beginners: This provides detailed information about your Memory Network.

The metadata includes:

  • What this model is and what it does
  • Details about the network architecture
  • Information about the memory configuration
  • Statistics about the current memory state

This information is useful for:

  • Documentation
  • Comparing different memory network configurations
  • Debugging and analysis
  • Tracking memory usage and performance
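A minimal usage sketch (the individual members of ModelMetadata<T> are not documented on this page, so the example only captures and logs the object):

```csharp
ModelMetadata<double> metadata = network.GetModelMetadata();
Console.WriteLine(metadata); // e.g. log alongside experiment results
```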

InitializeLayers()

Initializes the layers of the Memory Network based on the provided architecture.

protected override void InitializeLayers()

Remarks

This method sets up the layers of the Memory Network. If the architecture provides specific layers, those are used directly. Otherwise, default layers appropriate for a Memory Network are created, including input encoding, memory reading, memory writing, and output layers configured with the specified memory parameters.

For Beginners: This method sets up the building blocks of your Memory Network.

When initializing the network:

  • If you provided specific layers in the architecture, those are used
  • If not, the network creates standard Memory Network layers automatically

The standard Memory Network layers typically include:

  1. Input Encoding Layer: Converts raw input into a format suitable for memory operations
  2. Memory Read Layer: Allows the network to retrieve relevant information from memory
  3. Memory Write Layer: Allows the network to update memory with new information
  4. Output Layer: Produces the final answer based on the input and retrieved memory

This process is like assembling all the components before the network starts processing data. Each layer has a specific role in how the network interacts with its memory.

Predict(Tensor<T>)

Processes input through the memory network to generate predictions.

public override Tensor<T> Predict(Tensor<T> input)

Parameters

input Tensor<T>

The input tensor to process.

Returns

Tensor<T>

The output tensor after processing.

Remarks

This method implements the forward pass of the Memory Network. It encodes the input, uses it to calculate attention over memory slots, retrieves relevant information from memory, and combines it with the input to generate the final output.

For Beginners: This method processes an input through the Memory Network to get an answer.

The prediction process works like this:

  1. Input Encoding: Convert the input (like a question) into an internal representation
  2. Memory Attention: Determine which parts of memory are most relevant to this input
  3. Memory Reading: Retrieve information from the most relevant memory slots
  4. Response Generation: Combine the input with the retrieved memory to generate an answer

This is similar to how you might answer a question:

  • First understand the question
  • Then recall relevant information from your memory
  • Finally formulate an answer based on both the question and what you remembered
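A minimal sketch, assuming `network` is a MemoryNetwork<double> and `input` is a Tensor<double> shaped as the architecture expects (built by your own preprocessing):

```csharp
Tensor<double> output = network.Predict(input);
// `output` combines the encoded input with whatever the network
// retrieved from its memory matrix via attention.
```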

SerializeNetworkSpecificData(BinaryWriter)

Serializes memory network-specific data to a binary writer.

protected override void SerializeNetworkSpecificData(BinaryWriter writer)

Parameters

writer BinaryWriter

The binary writer to write to.

Remarks

This method saves the state of the Memory Network to a binary stream. It serializes memory network-specific parameters like the memory matrix contents, allowing the complete state to be restored later.

For Beginners: This saves the complete state of your Memory Network to a file.

When saving the Memory Network:

  • Memory matrix contents are saved (all the stored facts and information)
  • Configuration parameters are saved
  • Neural network parameters are saved

This allows you to:

  • Save your progress and continue training later
  • Share trained models with others
  • Deploy models in applications
  • Preserve the memory of facts the network has learned
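This method is protected and runs as part of whole-model serialization. As an illustration only, the sketch below assumes IModelSerializer exposes `Serialize` and `Deserialize` entry points wrapping a BinaryWriter/BinaryReader; that is an assumption, so check the IModelSerializer documentation for the actual member names.

```csharp
// Save: the assumed Serialize entry point internally calls
// SerializeNetworkSpecificData to persist the memory matrix.
using (var writer = new BinaryWriter(File.Create("memory-network.bin")))
{
    network.Serialize(writer);
}

// Load into a separate instance (`restored`) later:
using (var reader = new BinaryReader(File.OpenRead("memory-network.bin")))
{
    restored.Deserialize(reader);
}
```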

StoreFact(Tensor<T>)

Stores a new fact in memory.

public void StoreFact(Tensor<T> fact)

Parameters

fact Tensor<T>

The fact to store, as a tensor.

Remarks

This method adds a new fact to the memory of the network. It encodes the fact using the input encoding layers, finds the least recently used memory slot, and stores the encoded fact there. This allows for explicit memory updates beyond what happens during normal training.

For Beginners: This directly adds a new fact to the network's memory.

When adding a fact:

  • The fact is first encoded into an embedding (internal representation)
  • The system finds an appropriate memory slot to store it
  • The encoded fact is written to that memory slot

This provides a way to directly add knowledge to the memory network without having to train it on examples, which can be useful for quickly updating the network's knowledge base.
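For example (`TextToTensor` is a hypothetical helper producing the raw fact tensor; per the remarks above, StoreFact encodes it internally before writing it to the least recently used memory slot):

```csharp
Tensor<double> fact = TextToTensor("The milk is in the kitchen."); // hypothetical helper
network.StoreFact(fact);
```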

Train(Tensor<T>, Tensor<T>)

Trains the memory network on input-output pairs.

public override void Train(Tensor<T> input, Tensor<T> expectedOutput)

Parameters

input Tensor<T>

The input tensor for training.

expectedOutput Tensor<T>

The expected output tensor.

Remarks

This method trains the Memory Network using backpropagation. It performs a forward pass through the network, calculates the loss between predictions and expected outputs, computes gradients, and updates the network parameters.

For Beginners: This method teaches the Memory Network to give correct answers.

The training process works like this:

  1. Make a prediction using the current network parameters
  2. Compare the prediction to the expected output
  3. Calculate the error (how wrong the prediction was)
  4. Update the network parameters to reduce this error
  5. Optionally update memory with new information

Over time, this process helps the network:

  • Learn how to encode inputs effectively
  • Learn which memory slots to pay attention to for different inputs
  • Learn how to combine memory and input to produce correct outputs
  • Build up a useful memory of facts and information
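The training process above can be sketched as a simple epoch loop (assuming `trainingPairs` is a prepared collection of input/expected-output tensor pairs):

```csharp
// Hypothetical training loop; `trainingPairs` is assumed to be an
// IEnumerable<(Tensor<double> Input, Tensor<double> Expected)>.
for (int epoch = 0; epoch < 10; epoch++)
{
    foreach (var (input, expected) in trainingPairs)
    {
        network.Train(input, expected); // forward pass, loss, backprop, update
    }
}
```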

UpdateParameters(Vector<T>)

Updates the parameters of all layers in the network using the provided parameter vector.

public override void UpdateParameters(Vector<T> parameters)

Parameters

parameters Vector<T>

A vector containing updated parameters for all layers.

Remarks

This method distributes the provided parameter values to each layer in the network. It extracts the appropriate segment of the parameter vector for each layer based on the layer's parameter count. This allows for updating the learned weights and biases in the network's layers after training.

For Beginners: This method updates all the learnable values in the network's layers.

During training, a Memory Network learns many values (called parameters) that determine how it processes information. These include:

  • How to encode inputs
  • How to determine which memory slots to access
  • How to update memory with new information
  • How to produce outputs based on memory and input

This method:

  1. Takes a long list of all these parameters
  2. Figures out which parameters belong to which layers
  3. Updates each layer with its corresponding parameters

Note that this updates the network's processing mechanisms but not the content of the memory itself. The memory content is updated during normal operation through the memory write layers.
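A sketch of the flatten/update round trip; this assumes the IParameterizable interface exposes a `GetParameters()` accessor returning the flat parameter vector (an assumption, so verify the actual member name):

```csharp
// Read the flat parameter vector, then write it back.
// An optimizer would normally modify the vector between these two calls.
Vector<double> parameters = network.GetParameters(); // assumed accessor
network.UpdateParameters(parameters);
```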