Packt+ | Advance your knowledge in tech

You're reading from Hands-On Natural Language Processing with Python A practical guide to applying deep learning architectures to your NLP applications

Product type Paperback

Published in Jul 2018

Publisher Packt

ISBN-13 9781789139495

Length 312 pages

Edition 1st Edition

Languages

Processing

Tools

NLTK

Concepts

Deep Learning

Authors (2):

Rajalingappaa Shanmugamani

Rajesh Arumugam

View More author details

Table of Contents (20) Chapters

Title Page

Packt Upsell

Foreword

Contributors

Preface

1. Getting Started

2. Text Classification and POS Tagging Using NLTK FREE CHAPTER

3. Deep Learning and TensorFlow

4. Semantic Embedding Using Shallow Models

5. Text Classification Using LSTM

6. Searching and DeDuplicating Using CNNs

7. Named Entity Recognition Using Character LSTM

8. Text Generation and Summarization Using GRUs

9. Question-Answering and Chatbots Using Memory Networks

10. Machine Translation Using the Attention-Based Model

11. Speech Recognition Using DeepSpeech

12. Text-to-Speech Using Tacotron

13. Deploying Trained Models

1. Other Books You May Enjoy

Leave a review - let other readers know what you think

Index

Topic modeling

When we have a collection of documents for which we do not clearly know the categories, topic models help us to roughly find the categorization. The model treats each document as a mixture of topics, probably with one dominating topic.

For example, let's suppose we have the following sentences:

Eating fruits as snacks is a healthy habit
Exercising regularly is an important part of a healthy lifestyle
Grapefruit and oranges are citrus fruits

A topic model of these sentences may output the following:

Topic A: 40% healthy, 20% fruits, 10% snacks
Topic B: 20% Grapefruit, 20% oranges, 10% citrus
Sentence 1 and 2: 80% Topic A, 20% Topic B
Sentence 3: 100% Topic B

From the output of the model, we can guess that Topic A is about health and Topic B is about fruits. Though these topics are not known apriori, the model outputs corresponding probabilities for words associated with health, exercising, and fruits in the documents.

It is clear from these examples that topic modeling is an unsupervised...