Performing sentence splitting
Many NLP processes require splitting a large amount of text into sentences. This may seem to be a simple task, but it can be problematic for computers, because a period does not always end a sentence: it also appears in abbreviations, decimal numbers, and names such as ASP.NET. A simple sentence splitter can look only for periods (.), while more robust ones use other algorithms such as predictive classifiers. We will examine two means of sentence splitting with NLTK.
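To see why looking only for periods falls short, here is a minimal sketch of such a splitter in plain Python. It is not one of the NLTK approaches covered in this recipe; the function name naive_split and the sample string are made up for illustration.

```python
def naive_split(text):
    """Split on every period (a deliberately simplistic, hypothetical helper)."""
    parts = text.split(".")
    # Drop surrounding whitespace and any empty trailing piece.
    return [p.strip() for p in parts if p.strip()]


# Illustrative sample: two sentences, one containing the name "ASP.NET".
sample = "We need ASP.NET and SQL Server skills. We are a fast-paced team."

sentences = naive_split(sample)
for s in sentences:
    print(s)
```

The sample holds two sentences, but the splitter returns three pieces because it also cuts at the period inside ASP.NET. It also discards the periods themselves and would miss sentences ending in ! or ?.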
How to do it
We will use the text stored in the 07/sentence1.txt file. It has the following content, which was pulled from a random job listing on Stack Overflow:
We are seeking developers with demonstrable experience in: ASP.NET, C#, SQL Server, and AngularJS. We are a fast-paced, highly iterative team that has to adapt quickly as our factory grows. We need people who are comfortable tackling new problems, innovating solutions, and interacting with every facet of the company on a daily basis. Creative, motivated, able to take responsibility and support the applications you create. Help us get rockets out the door faster!
The first example...