Class TextProcessingHelper
Provides text processing utilities for splitting and tokenizing text.
public static class TextProcessingHelper
- Inheritance
- object
TextProcessingHelper
Methods
SplitIntoSentences(string)
Splits text into sentences based on common sentence-ending punctuation. Handles periods, exclamation marks, and question marks followed by spaces or newlines.
public static List<string> SplitIntoSentences(string text)
Parameters
text (string): The text to split into sentences.
Returns
List<string>: The sentences extracted from the text.
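The behavior described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the class name SentenceSplitSketch is hypothetical, and the regex-based approach, the handling of empty input, and the choice to keep the ending punctuation attached to each sentence are all assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical stand-in for illustration; not the actual TextProcessingHelper code.
public static class SentenceSplitSketch
{
    public static List<string> SplitIntoSentences(string text)
    {
        // Assumption: null, empty, or whitespace-only input yields an empty list.
        if (string.IsNullOrWhiteSpace(text))
            return new List<string>();

        // Split at runs of whitespace (spaces or newlines) that directly
        // follow '.', '!' or '?'. The lookbehind keeps the punctuation
        // attached to the sentence it ends.
        return Regex.Split(text, @"(?<=[.!?])\s+")
            .Select(s => s.Trim())
            .Where(s => s.Length > 0)
            .ToList();
    }
}
```

With this sketch, the input "Hello world. How are you? Fine!" yields three entries: "Hello world.", "How are you?", and "Fine!". A punctuation-only rule like this one also splits after abbreviations such as "Dr. Smith", which may or may not match what the real method does.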
Tokenize(string)
Tokenizes text by splitting on whitespace and common punctuation marks. Converts text to lowercase and removes empty tokens.
public static List<string> Tokenize(string text)
Parameters
text (string): The text to tokenize.
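A comparable sketch of the tokenization described above is shown here. The class name TokenizeSketch and the exact delimiter set are assumptions, since the page does not list which punctuation marks count as "common"; the real method may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for illustration; not the actual TextProcessingHelper code.
public static class TokenizeSketch
{
    // Assumed delimiter set: whitespace plus a handful of punctuation marks.
    private static readonly char[] Delimiters =
        { ' ', '\t', '\r', '\n', '.', ',', ';', ':', '!', '?', '(', ')', '"' };

    public static List<string> Tokenize(string text)
    {
        // Assumption: null or empty input yields an empty list.
        if (string.IsNullOrEmpty(text))
            return new List<string>();

        // Lowercase first, then split on the delimiters and drop empty tokens.
        return text.ToLowerInvariant()
            .Split(Delimiters, StringSplitOptions.RemoveEmptyEntries)
            .ToList();
    }
}
```

With this sketch, the input "Hello, World! How are you?" yields the tokens hello, world, how, are, you. The apostrophe and hyphen are deliberately left out of the assumed delimiter set so that words like "don't" and "well-known" stay whole.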