Text summarization
Text summarization is the process of condensing an input document into a short summary, so that we can grasp its main content quickly. Fundamentally, there are two types of summarization: extractive and abstractive. We will briefly describe each of them.
Extractive summarization
In this type of summarization, the most important sentences, phrases, or keywords in a document are extracted and concatenated to form a short summary.
The main advantage of this method is that it is simple and robust, since the extracted text is taken directly from the document. The disadvantage is that it cannot paraphrase: the summary is limited to wording already present in the document, so it may lack the clarity that rephrasing could provide. In the next section, we will briefly look at extractive summarization using gensim.
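Before turning to gensim, the extract-and-concatenate idea can be illustrated with a minimal sketch in plain Python. This is not gensim's algorithm; it is a simple word-frequency heuristic chosen for illustration. The function names (split_sentences, tokenize, extractive_summary), the tiny stop-word list, and the scoring rule are all assumptions made for this example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real systems use larger lists).
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to", "in", "and",
              "it", "that", "this", "for", "on", "with", "as", "by", "be"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic words that are not stop words."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOP_WORDS]


def extractive_summary(text, num_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    # Word frequencies over the whole document.
    word_freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        # Average document frequency of the sentence's words.
        words = tokenize(sentence)
        return sum(word_freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices by score, then restore document order.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in chosen)
```

Calling extractive_summary(document, num_sentences=2) returns two sentences copied verbatim from the document. This shows both properties discussed above: the output is always faithful to the source text, but nothing in it is ever reworded.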
Summarization using gensim
Gensim has a summarizer that is based on an improved version of the TextRank algorithm by Rada Mihalcea...