Class ContentClassifierBase<T>
- Namespace: AiDotNet.AdversarialRobustness.Safety
- Assembly: AiDotNet.dll
Base class for ML-based content classifiers.
public abstract class ContentClassifierBase<T> : IContentClassifier<T>, IModelSerializer
Type Parameters
T: The numeric data type used for calculations.
- Inheritance: object → ContentClassifierBase<T>
- Implements: IContentClassifier<T>, IModelSerializer
Remarks
This abstract class provides common functionality for content classifiers, including threshold-based filtering, category management, and result formatting. Subclasses implement the actual ML model for classification.
For Beginners: This is a template that makes it easier to build different types of content classifiers. It handles the common tasks like comparing scores to thresholds and formatting results, so you can focus on the actual classification logic in your subclass.
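The division of labor described above can be sketched with simplified stand-in types. Everything in this sketch is hypothetical: `SketchClassifierBase`, `MeanScoreClassifier`, and `Score` are illustration-only names, plain `double` stands in for `T`, a string array stands in for `ContentClassificationResult<T>`, and the `>=` comparison is an assumption, since this page does not state how the real class compares scores to the threshold.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the base class: it owns the threshold and the
// category list, and turns raw scores into a list of flagged categories.
public abstract class SketchClassifierBase
{
    protected double DetectionThreshold { get; }
    protected string[] SupportedCategories { get; }

    protected SketchClassifierBase(double threshold = 0.5, string[]? categories = null)
    {
        DetectionThreshold = threshold;
        // Assumed fallback categories, for illustration only.
        SupportedCategories = categories ?? new[] { "toxicity", "spam" };
    }

    // The only piece a subclass must supply: one score per category.
    protected abstract Dictionary<string, double> Score(double[] content);

    // Shared logic: flag every category whose score reaches the threshold.
    public string[] Classify(double[] content) =>
        Score(content)
            .Where(kv => kv.Value >= DetectionThreshold)
            .Select(kv => kv.Key)
            .OrderBy(name => name, StringComparer.Ordinal)
            .ToArray();
}

// Hypothetical subclass: a trivial "model" that scores every category
// with the mean of the input values.
public sealed class MeanScoreClassifier : SketchClassifierBase
{
    public MeanScoreClassifier(double threshold) : base(threshold) { }

    protected override Dictionary<string, double> Score(double[] content)
    {
        double mean = content.Length == 0 ? 0.0 : content.Average();
        return SupportedCategories.ToDictionary(c => c, _ => mean);
    }
}
```

With a threshold of 0.5, `new MeanScoreClassifier(0.5).Classify(new[] { 0.4, 0.8 })` flags both stand-in categories because the mean reaches the threshold, while an input of `{ 0.1, 0.2 }` flags none.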
Constructors
ContentClassifierBase(double, string[]?)
Initializes a new instance of the content classifier.
protected ContentClassifierBase(double threshold = 0.5, string[]? categories = null)
Parameters
threshold (double): The detection threshold (default: 0.5).
categories (string[]?): The supported categories (default: null).
Fields
DefaultCategories
The default categories for content classification.
protected static readonly string[] DefaultCategories
Field Value
- string[]
Remarks
Declared as a static read-only field so that the constructor can supply default categories without calling a virtual member. Subclasses can provide their own categories via the constructor parameter.
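The concern behind this design is a general C# behavior: a base constructor runs before the derived constructor body, so a virtual call made from the base constructor can observe derived state that has not been assigned yet. A minimal sketch of the problem and of the static-field alternative follows. The classes `UnsafeBase`, `UnsafeDerived`, `SafeBase`, and `SafeDerived` are hypothetical and are not part of AiDotNet.

```csharp
#nullable enable
using System;

public class UnsafeBase
{
    public int CategoryCount { get; }

    public UnsafeBase()
    {
        // Virtual call from a constructor: the derived constructor body has
        // not run yet, so the override may return unassigned (null) state.
        CategoryCount = GetCategories()?.Length ?? -1;
    }

    protected virtual string[]? GetCategories() => new[] { "default" };
}

public class UnsafeDerived : UnsafeBase
{
    private readonly string[] _categories;

    public UnsafeDerived()
    {
        // Assigned only after the base constructor has already finished.
        _categories = new[] { "spam", "toxicity" };
    }

    protected override string[]? GetCategories() => _categories;
}

public class SafeBase
{
    // Static data is available before any instance constructor runs.
    protected static readonly string[] DefaultCategories = { "default" };

    public int CategoryCount { get; }

    public SafeBase(string[]? categories = null)
    {
        // No virtual call: the subclass passes its categories as an argument.
        CategoryCount = (categories ?? DefaultCategories).Length;
    }
}

public class SafeDerived : SafeBase
{
    public SafeDerived() : base(new[] { "spam", "toxicity" }) { }
}
```

In this sketch `new UnsafeDerived().CategoryCount` is -1 because the override ran before `_categories` was assigned, whereas `new SafeDerived().CategoryCount` is 2.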
NumOps
Numeric operations for type T.
protected static readonly INumericOperations<T> NumOps
Field Value
- INumericOperations<T>
Properties
DetectionThreshold
The detection threshold for classifying content as harmful.
protected T DetectionThreshold { get; set; }
Property Value
- T
SupportedCategories
The supported content categories for this classifier.
protected string[] SupportedCategories { get; set; }
Property Value
- string[]
Methods
Classify(Vector<T>)
Classifies content and returns the classification result.
public abstract ContentClassificationResult<T> Classify(Vector<T> content)
Parameters
content (Vector<T>): The content to classify as a vector representation.
Returns
- ContentClassificationResult<T>
The classification result with category predictions and confidence scores.
ClassifyBatch(Matrix<T>)
Classifies a batch of content items.
public virtual ContentClassificationResult<T>[] ClassifyBatch(Matrix<T> contents)
Parameters
contents (Matrix<T>): Matrix where each row is a content item to classify.
Returns
- ContentClassificationResult<T>[]
Array of classification results, one per input row.
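The row-per-item contract can be illustrated with a small sketch. This is not the library's implementation: `BatchSketch` and `ClassifyRows` are hypothetical names, a rectangular `double[,]` array stands in for `Matrix<T>`, and the per-item classifier is passed in as a delegate.

```csharp
using System;

public static class BatchSketch
{
    // Applies a per-item classifier to each row of a rectangular array and
    // returns one result per row, in row order.
    public static TResult[] ClassifyRows<TResult>(
        double[,] contents, Func<double[], TResult> classify)
    {
        int rows = contents.GetLength(0);
        int cols = contents.GetLength(1);
        var results = new TResult[rows];

        for (int r = 0; r < rows; r++)
        {
            // Copy row r into its own vector before classifying it.
            var row = new double[cols];
            for (int c = 0; c < cols; c++)
            {
                row[c] = contents[r, c];
            }
            results[r] = classify(row);
        }

        return results;
    }
}
```

A subclass might override the real `ClassifyBatch` when its model can score all rows in a single pass more efficiently than a row-by-row loop like this one.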
ClassifyText(string)
Classifies content provided as text.
public virtual ContentClassificationResult<T> ClassifyText(string text)
Parameters
text (string): The text content to classify.
Returns
- ContentClassificationResult<T>
The classification result with category predictions and confidence scores.
CreateResultFromScores(Dictionary<string, T>)
Creates a classification result from category scores.
protected ContentClassificationResult<T> CreateResultFromScores(Dictionary<string, T> categoryScores)
Parameters
categoryScores (Dictionary<string, T>): Dictionary of category names to scores.
Returns
- ContentClassificationResult<T>
Formatted classification result.
Deserialize(byte[])
Loads a previously serialized model from binary data.
public abstract void Deserialize(byte[] data)
Parameters
data (byte[]): The byte array containing the serialized model data.
Remarks
This method takes binary data created by the Serialize method and uses it to restore a model to its previous state.
For Beginners: This is like opening a saved file to continue your work.
When you call this method:
- You provide the binary data (bytes) that was previously created by Serialize
- The model rebuilds itself using this data
- After deserializing, the model is exactly as it was when serialized
- It's ready to make predictions without needing to be trained again
For example:
- You download a pre-trained model file for detecting spam emails
- You deserialize this file into your application
- Immediately, your application can detect spam without any training
- The model has all the knowledge that was built into it by its original creator
This is particularly useful when:
- You want to use a model that took days to train
- You need to deploy the same model across multiple devices
- You're creating an application that non-technical users will use
Think of it like installing the brain of a trained expert directly into your application.
GetSupportedCategories()
Gets the list of content categories this classifier can detect.
public virtual string[] GetSupportedCategories()
Returns
- string[]
Array of category names supported by this classifier.
IsReady()
Checks if the classifier is ready to make predictions.
public abstract bool IsReady()
Returns
- bool
True if the model is loaded and ready, false otherwise.
LoadModel(string)
Loads the model from a file.
public abstract void LoadModel(string filePath)
Parameters
filePath (string): The path to the file containing the saved model.
Remarks
This method provides a convenient way to load a model directly from disk. It combines file I/O operations with deserialization.
For Beginners: This is like clicking "Open" in a document editor. Instead of manually reading from a file and then calling Deserialize(), this method does both steps for you.
Exceptions
- FileNotFoundException
Thrown when the specified file does not exist.
- IOException
Thrown when an I/O error occurs while reading from the file or when the file contains corrupted or invalid model data.
SaveModel(string)
Saves the model to a file.
public abstract void SaveModel(string filePath)
Parameters
filePath (string): The path where the model should be saved.
Remarks
This method provides a convenient way to save the model directly to disk. It combines serialization with file I/O operations.
For Beginners: This is like clicking "Save As" in a document editor. Instead of manually calling Serialize() and then writing to a file, this method does both steps for you.
Exceptions
- IOException
Thrown when an I/O error occurs while writing to the file.
- UnauthorizedAccessException
Thrown when the caller does not have the required permission to write to the specified file path.
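Because `LoadModel` and `SaveModel` are abstract, each subclass decides how to combine file I/O with serialization. One possible shape is sketched below, under the assumption that the model state is a single `double`. `FileBackedModelSketch` is a hypothetical class, not an AiDotNet type, and the exception choices simply mirror the ones documented above.

```csharp
using System;
using System.IO;

public class FileBackedModelSketch
{
    // Assumed model state for this sketch: a single threshold value.
    public double Threshold { get; set; } = 0.5;

    public byte[] Serialize() => BitConverter.GetBytes(Threshold);

    public void Deserialize(byte[] data)
    {
        if (data is null || data.Length != sizeof(double))
        {
            throw new IOException("Model data is corrupted or has an unexpected length.");
        }
        Threshold = BitConverter.ToDouble(data, 0);
    }

    // Serialize, then write the bytes to disk in one step.
    public void SaveModel(string filePath) =>
        File.WriteAllBytes(filePath, Serialize());

    // Read the bytes from disk, then deserialize in one step.
    public void LoadModel(string filePath)
    {
        if (!File.Exists(filePath))
        {
            throw new FileNotFoundException("Model file not found.", filePath);
        }
        Deserialize(File.ReadAllBytes(filePath));
    }
}
```

`File.WriteAllBytes` itself can throw `IOException` or `UnauthorizedAccessException`, so a wrapper of this kind surfaces the documented exceptions without extra handling.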
Serialize()
Converts the current state of a machine learning model into a binary format.
public abstract byte[] Serialize()
Returns
- byte[]
A byte array containing the serialized model data.
Remarks
This method captures all the essential information about a trained model and converts it into a sequence of bytes that can be stored or transmitted.
For Beginners: This is like exporting your work to a file.
When you call this method:
- The model's current state (all its learned patterns and parameters) is captured
- This information is converted into a compact binary format (bytes)
- You can then save these bytes to a file, database, or send them over a network
For example:
- After training a model to recognize cats vs. dogs in images
- You can serialize the model to save all its learned knowledge
- Later, you can use this saved data to recreate the model exactly as it was
- The recreated model will make the same predictions as the original
Think of it like taking a snapshot of your model's brain at a specific moment in time.
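The snapshot idea can be sketched as a byte-level round trip. The binary layout here (threshold, then category count, then each category name) is an assumption made for illustration, and `SerializableSketch` is a hypothetical class. The format used by actual AiDotNet classifiers is defined by each subclass and is not documented on this page.

```csharp
using System;
using System.IO;

public class SerializableSketch
{
    public double Threshold { get; private set; }
    public string[] Categories { get; private set; }

    public SerializableSketch(double threshold, string[] categories)
    {
        Threshold = threshold;
        Categories = categories;
    }

    // Writes the state in a fixed order: threshold, count, then each name.
    public byte[] Serialize()
    {
        using var stream = new MemoryStream();
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(Threshold);
            writer.Write(Categories.Length);
            foreach (var category in Categories)
            {
                writer.Write(category);
            }
        }
        // ToArray remains valid after the writer has closed the stream.
        return stream.ToArray();
    }

    // Reads the state back in exactly the order it was written.
    public void Deserialize(byte[] data)
    {
        using var reader = new BinaryReader(new MemoryStream(data));
        Threshold = reader.ReadDouble();
        int count = reader.ReadInt32();
        var categories = new string[count];
        for (int i = 0; i < count; i++)
        {
            categories[i] = reader.ReadString();
        }
        Categories = categories;
    }
}
```

Deserializing the bytes produced by one instance into another instance leaves both with the same threshold and categories, which is the property the remarks above describe.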
TextToVector(string)
Converts text to a vector representation for classification.
protected virtual Vector<T> TextToVector(string text)
Parameters
text (string): The text to convert.
Returns
- Vector<T>
Vector representation of the text.
Remarks
Override this method to implement custom text encoding (e.g., tokenization, embeddings). The default implementation creates a simple character-frequency representation.
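A character-frequency encoding of the kind mentioned above might look like the following sketch. The details are assumptions, since this page does not specify the default implementation: the sketch counts only the letters a to z (case-insensitive), produces a fixed 26-element vector, and normalizes by the number of letters counted. `TextEncodingSketch` and `CharFrequency` are hypothetical names, and `double[]` stands in for `Vector<T>`.

```csharp
using System;

public static class TextEncodingSketch
{
    // Returns a 26-element vector holding the relative frequency of each
    // letter a..z in the text; all other characters are ignored.
    public static double[] CharFrequency(string text)
    {
        var vector = new double[26];
        if (string.IsNullOrEmpty(text))
        {
            return vector;
        }

        int counted = 0;
        foreach (char ch in text.ToLowerInvariant())
        {
            if (ch >= 'a' && ch <= 'z')
            {
                vector[ch - 'a']++;
                counted++;
            }
        }

        // Normalize so texts of different lengths are comparable.
        if (counted > 0)
        {
            for (int i = 0; i < vector.Length; i++)
            {
                vector[i] /= counted;
            }
        }

        return vector;
    }
}
```

An encoding like this ignores word order and meaning, which is why overriding `TextToVector` with a tokenizer or an embedding model is usually preferable for real classifiers.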