CLI.utils.sentence_tokenizer
About:
The sina_sentence_tokenize command splits text into sentences using the SinaTools sentence tokenizer. Flags select the boundaries to split at: dots, question marks, exclamation marks, and new lines.
Usage:
The usage information below can be displayed by running sina_sentence_tokenize --help.
Usage:
sina_sentence_tokenize --text=TEXT [options]
sina_sentence_tokenize --file=FILE [options]
Options:
--text TEXT
Text to be tokenized into sentences.
--file FILE
File containing the text to be tokenized into sentences.
--dot
Tokenize at dots.
--new_line
Tokenize at new lines.
--question_mark
Tokenize at question marks.
--exclamation_mark
Tokenize at exclamation marks.
Examples:
sina_sentence_tokenize --text "Your text here. Does it work? Yes! Try with new lines." --dot --question_mark --exclamation_mark
sina_sentence_tokenize --file "path/to/your/file.txt" --dot --question_mark --exclamation_mark
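
To make the effect of the boundary flags concrete, here is a minimal Python sketch of how enabling or disabling each boundary could change the resulting sentences. It is an illustration only, not the SinaTools implementation: the function name split_sentences and its keyword arguments are hypothetical and simply mirror the CLI flags, and the actual command may treat whitespace, repeated punctuation, or other edge cases differently.

```python
def split_sentences(text, dot=False, new_line=False,
                    question_mark=False, exclamation_mark=False):
    """Split text after each enabled boundary character.

    Hypothetical helper that mirrors the CLI flags; it is NOT the
    SinaTools implementation.
    """
    # Collect the boundary characters selected by the flags.
    boundaries = set()
    if dot:
        boundaries.add(".")
    if new_line:
        boundaries.add("\n")
    if question_mark:
        boundaries.add("?")
    if exclamation_mark:
        boundaries.add("!")

    sentences = []
    current = []
    for ch in text:
        current.append(ch)
        if ch in boundaries:
            # Close the current sentence at the boundary character.
            sentence = "".join(current).strip()
            if sentence:
                sentences.append(sentence)
            current = []

    # Keep any trailing text that does not end with a boundary.
    tail = "".join(current).strip()
    if tail:
        sentences.append(tail)
    return sentences


if __name__ == "__main__":
    sample = "Your text here. Does it work? Yes! Try with new lines."
    for s in split_sentences(sample, dot=True, question_mark=True,
                             exclamation_mark=True):
        print(s)
```

With all three punctuation boundaries enabled, the sample text yields four sentences. With only dot=True, the question and the exclamation remain inside the second sentence, which is why the choice of flags matters for text that mixes punctuation.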