Sentence Splitter

Script to split documents into sentences.

Setup

conda env create
conda activate sentence-splitter
spacy download en_core_web_sm

Usage

Pass an input file with one document per line:

./sentence_splitter.py INPUT_FILE > OUTPUT_FILE

The output contains one sentence per line, with an empty line between documents. Alternatively, you can pass the input via stdin instead of a file argument.
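
To make the input and output layout concrete, here is a minimal Python sketch of the format described above. It is not the code in sentence_splitter.py: the function names (naive_split, split_documents) are hypothetical, and a naive regex splitter stands in for the spaCy model so that the example runs without any setup. It also assumes that the empty line appears only between documents, not after the last one.

```python
import re
from typing import Callable, Iterable, Iterator, List

# Stand-in splitter (assumption, not the script's method): break on
# whitespace that follows '.', '!' or '?'.
_NAIVE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def naive_split(document: str) -> List[str]:
    """Split one document into sentences with a simple regex heuristic."""
    return [s for s in _NAIVE_BOUNDARY.split(document.strip()) if s]


def split_documents(
    lines: Iterable[str],
    split: Callable[[str], List[str]] = naive_split,
) -> Iterator[str]:
    """Yield output lines: one sentence per line, '' between documents.

    `lines` holds one document per line, as in the input file.
    """
    first = True
    for line in lines:
        if not first:
            yield ""  # empty line separating consecutive documents
        first = False
        yield from split(line)


# A spaCy-backed splitter could be passed in instead, for example:
#   import spacy
#   nlp = spacy.load("en_core_web_sm")
#   spacy_split = lambda text: [s.text.strip() for s in nlp(text).sents]
```

Calling split_documents on the two documents "Hello world. How are you?" and "One sentence only." yields the two sentences of the first document, then an empty line, then the single sentence of the second. The regex heuristic mishandles abbreviations such as "Dr.", which is one reason to prefer a trained model for real input.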

To see more information about the script and its available options, run:

./sentence_splitter.py --help