Summarizing text with OTS
The Open Text Summarizer (OTS) is an application that removes the fluff from a piece of text to create a succinct summary.
Getting ready
The ots package is not part of most standard Linux distributions, but it can be installed with the following command:
apt-get install libots-devel
How to do it...
The OTS application is easy to use. It reads text from a file or from stdin and writes the summary to stdout.
ots LongFile.txt | less
Or
cat LongFile.txt | ots | less
The OTS application can also be used with curl to summarize information from websites. For example, you can use ots to summarize long-winded blogs:
curl http://BlogSite.org | sed -r 's/<[^>]+>//g' | ots | less
How it works...
The curl command retrieves the page from a blog site and passes it to sed. The sed command uses a regular expression to replace every HTML tag (a string that starts with a less-than symbol and ends with a greater-than symbol) with an empty string. The stripped text is passed to ots ...
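The tag-stripping step can be tried on its own, without curl or ots. The following is a minimal sketch: strip_tags is a hypothetical helper name that wraps the sed expression from the recipe, and the HTML fragment is made-up sample input standing in for a downloaded page.

```shell
# strip_tags: hypothetical helper (not part of ots or sed) that wraps
# the sed expression used in the recipe. -r enables extended regular
# expressions, so + means "one or more".
strip_tags() {
    # <[^>]+> matches a '<', one or more characters that are not '>',
    # and the closing '>'. Each match is replaced with nothing.
    sed -r 's/<[^>]+>//g'
}

# Made-up sample input standing in for a page fetched with curl.
printf '<p>Hello <b>world</b></p>\n' | strip_tags
# prints: Hello world
```

Because sed processes its input one line at a time, a tag that is split across two lines is not removed by this expression, and the contents of script or style elements are left in the text. For a quick summary of a blog post this is usually acceptable.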