Class RuleBasedContentClassifier<T>
- Namespace: AiDotNet.AdversarialRobustness.Safety
- Assembly: AiDotNet.dll
A rule-based content classifier that uses pattern matching for classification.
public class RuleBasedContentClassifier<T> : ContentClassifierBase<T>, IContentClassifier<T>, IModelSerializer
Type Parameters
- T
The numeric data type used for calculations.
- Inheritance
- ContentClassifierBase<T>
- RuleBasedContentClassifier<T>
- Implements
- IContentClassifier<T>
- IModelSerializer
Remarks
This classifier serves as a baseline or fallback when ML models are not available. It uses configurable regex patterns to detect various categories of harmful content.
For Beginners: This is a simple classifier that looks for specific words and patterns in text. While it's less sophisticated than ML-based classifiers, it's fast, interpretable, and doesn't require training data.
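The idea described above can be sketched without the library at all. The following is a minimal, self-contained illustration of regex-based category matching with a threshold. It is not the class's actual implementation: the category names, the patterns, the helper names (`ScoreText`, `FlaggedCategories`), and the scoring rule (fraction of a category's patterns that match) are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical rule set: category name -> regex patterns (illustrative only).
var categoryPatterns = new Dictionary<string, List<string>>
{
    ["spam"] = new List<string> { @"\bfree money\b", @"\bclick here\b" },
    ["profanity"] = new List<string> { @"\bdarn\b" },
};

// Assumed scoring rule: the fraction of a category's patterns that match the text.
Dictionary<string, double> ScoreText(string text)
{
    var scores = new Dictionary<string, double>();
    foreach (var (category, patterns) in categoryPatterns)
    {
        int hits = patterns.Count(p => Regex.IsMatch(text, p, RegexOptions.IgnoreCase));
        scores[category] = patterns.Count == 0 ? 0.0 : (double)hits / patterns.Count;
    }
    return scores;
}

// A category is flagged when its score reaches the threshold.
List<string> FlaggedCategories(string text, double threshold = 0.5) =>
    ScoreText(text)
        .Where(kv => kv.Value >= threshold)
        .Select(kv => kv.Key)
        .OrderBy(name => name)
        .ToList();

var flagged = FlaggedCategories("Click here for a prize");
Console.WriteLine(string.Join(", ", flagged)); // prints "spam"
```

Because every decision traces back to a named pattern, a flagged result can be explained by listing the patterns that matched, which is the interpretability advantage the remarks mention.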
Constructors
RuleBasedContentClassifier(Dictionary<string, List<string>>, double)
Initializes a new rule-based content classifier with custom patterns.
public RuleBasedContentClassifier(Dictionary<string, List<string>> categoryPatterns, double threshold = 0.5)
Parameters
- categoryPatterns (Dictionary<string, List<string>>)
Dictionary mapping categories to their detection patterns.
- threshold (double)
Detection threshold (default: 0.5).
RuleBasedContentClassifier(double)
Initializes a new rule-based content classifier with default patterns.
public RuleBasedContentClassifier(double threshold = 0.5)
Parameters
- threshold (double)
Detection threshold (default: 0.5).
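A short sketch of both constructors, using only the signatures listed above. It assumes the AiDotNet package is referenced and that `double` is an acceptable choice for `T`. The category names and patterns are hypothetical.

```csharp
using System.Collections.Generic;
using AiDotNet.AdversarialRobustness.Safety;

// Hypothetical categories and regex patterns, chosen only for illustration.
var categoryPatterns = new Dictionary<string, List<string>>
{
    ["spam"] = new List<string> { @"\bfree money\b", @"\bclick here\b" },
    ["phishing"] = new List<string> { @"verify your (account|password)" },
};

// Custom patterns with a threshold above the 0.5 default.
var custom = new RuleBasedContentClassifier<double>(categoryPatterns, threshold: 0.7);

// Default patterns with the default threshold of 0.5.
var builtIn = new RuleBasedContentClassifier<double>();
```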
Methods
AddPattern(string, string)
Adds a detection pattern for a category.
public void AddPattern(string category, string pattern)
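Parameters
- category (string)
The category to add the pattern to.
- pattern (string)
The detection pattern to add.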
Classify(Vector<T>)
Classifies content and returns the classification result.
public override ContentClassificationResult<T> Classify(Vector<T> content)
Parameters
- content (Vector<T>)
The content to classify as a vector representation.
Returns
- ContentClassificationResult<T>
The classification result with category predictions and confidence scores.
Remarks
Important: Rule-based pattern matching is of limited use on vector input. This classifier's regex patterns match text, and the original text cannot be recovered from a vector representation, so this method cannot detect harmful content reliably.
For reliable harmful content detection, use ClassifyText(string) instead.
ClassifyText(string)
Classifies content provided as text.
public override ContentClassificationResult<T> ClassifyText(string text)
Parameters
- text (string)
The text content to classify.
Returns
- ContentClassificationResult<T>
The classification result with category predictions and confidence scores.
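A minimal usage sketch for text classification, assuming the AiDotNet package is referenced and `T` is `double`. The "spam" category and its pattern are hypothetical. The members of ContentClassificationResult<T> are not listed on this page, so the example stops at obtaining the result.

```csharp
using AiDotNet.AdversarialRobustness.Safety;

var classifier = new RuleBasedContentClassifier<double>(threshold: 0.5);

// "spam" and the pattern below are hypothetical; substitute your own rules.
classifier.AddPattern("spam", @"\bfree money\b");

// Pass the raw text so the regex patterns have something to match.
var result = classifier.ClassifyText("Claim your free money today");

// Inspect `result` using the members documented for ContentClassificationResult<T>.
```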
ClearCategory(string)
Removes all patterns for a category.
public void ClearCategory(string category)
Parameters
- category (string)
The category to clear.
Deserialize(byte[])
Loads a previously serialized model from binary data.
public override void Deserialize(byte[] data)
Parameters
- data (byte[])
The byte array containing the serialized model data.
Remarks
This method takes binary data created by the Serialize method and uses it to restore a model to its previous state.
For Beginners: This is like opening a saved file to continue your work.
When you call this method:
- You provide the binary data (bytes) that was previously created by Serialize
- The model rebuilds itself using this data
- After deserializing, the model is exactly as it was when serialized
- It's ready to make predictions without needing to be trained again
For example:
- You download a pre-trained model file for detecting spam emails
- You deserialize this file into your application
- Immediately, your application can detect spam without any training
- The model has all the knowledge that was built into it by its original creator
This is particularly useful when:
- You want to use a model that took days to train
- You need to deploy the same model across multiple devices
- You're creating an application that non-technical users will use
Think of it like installing the brain of a trained expert directly into your application.
IsReady()
Checks if the classifier is ready to make predictions.
public override bool IsReady()
Returns
- bool
true if the model is loaded and ready; otherwise, false.
LoadModel(string)
Loads the model from a file.
public override void LoadModel(string filePath)
Parameters
- filePath (string)
The path to the file containing the saved model.
Remarks
This method provides a convenient way to load a model directly from disk. It combines file I/O operations with deserialization.
For Beginners: This is like clicking "Open" in a document editor. Instead of manually reading from a file and then calling Deserialize(), this method does both steps for you.
Exceptions
- FileNotFoundException
Thrown when the specified file does not exist.
- IOException
Thrown when an I/O error occurs while reading from the file or when the file contains corrupted or invalid model data.
SaveModel(string)
Saves the model to a file.
public override void SaveModel(string filePath)
Parameters
- filePath (string)
The path where the model should be saved.
Remarks
This method provides a convenient way to save the model directly to disk. It combines serialization with file I/O operations.
For Beginners: This is like clicking "Save As" in a document editor. Instead of manually calling Serialize() and then writing to a file, this method does both steps for you.
Exceptions
- IOException
Thrown when an I/O error occurs while writing to the file.
- UnauthorizedAccessException
Thrown when the caller does not have the required permission to write to the specified file path.
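A sketch of saving and reloading through a file, handling the exceptions documented for SaveModel and LoadModel. It assumes the AiDotNet package is referenced. The file name and the "spam" rule are hypothetical, and the value IsReady() returns after loading is not asserted here.

```csharp
using System;
using System.IO;
using AiDotNet.AdversarialRobustness.Safety;

// Hypothetical file location used only for this example.
string path = Path.Combine(Path.GetTempPath(), "content-rules.bin");

var classifier = new RuleBasedContentClassifier<double>();
classifier.AddPattern("spam", @"\bfree money\b"); // hypothetical rule

try
{
    // Serialize and write to disk in one call.
    classifier.SaveModel(path);

    // Read from disk and deserialize into a fresh instance.
    var restored = new RuleBasedContentClassifier<double>();
    restored.LoadModel(path);
    Console.WriteLine($"Ready after load: {restored.IsReady()}");
}
catch (FileNotFoundException ex)
{
    // Caught before IOException because it derives from it.
    Console.Error.WriteLine($"Model file not found: {ex.Message}");
}
catch (UnauthorizedAccessException ex)
{
    Console.Error.WriteLine($"No permission for {path}: {ex.Message}");
}
catch (IOException ex)
{
    Console.Error.WriteLine($"I/O failure or invalid model data: {ex.Message}");
}
```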
Serialize()
Converts the current state of a machine learning model into a binary format.
public override byte[] Serialize()
Returns
- byte[]
A byte array containing the serialized model data.
Remarks
This method captures all the essential information about a trained model and converts it into a sequence of bytes that can be stored or transmitted.
For Beginners: This is like exporting your work to a file.
When you call this method:
- The model's current state (all its learned patterns and parameters) is captured
- This information is converted into a compact binary format (bytes)
- You can then save these bytes to a file, database, or send them over a network
For example:
- After training a model to recognize cats vs. dogs in images
- You can serialize the model to save all its learned knowledge
- Later, you can use this saved data to recreate the model exactly as it was
- The recreated model will make the same predictions as the original
Think of it like taking a snapshot of your model's brain at a specific moment in time.
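For this classifier, a round trip through Serialize and Deserialize might look like the sketch below. It assumes the AiDotNet package is referenced and uses a hypothetical "spam" rule. This page does not state exactly which state is captured (patterns, threshold, or both), so verify that against your own rules before relying on it.

```csharp
using System;
using AiDotNet.AdversarialRobustness.Safety;

var original = new RuleBasedContentClassifier<double>(threshold: 0.6);
original.AddPattern("spam", @"\bclick here\b"); // hypothetical rule

// Capture the classifier state as bytes.
byte[] snapshot = original.Serialize();

// Restore that state into a fresh instance.
var copy = new RuleBasedContentClassifier<double>();
copy.Deserialize(snapshot);

Console.WriteLine($"Serialized {snapshot.Length} bytes; copy ready: {copy.IsReady()}");
```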