You're reading from Hands-On Natural Language Processing with Python A practical guide to applying deep learning architectures to your NLP applications

Product type Paperback

Published in Jul 2018

Publisher Packt

ISBN-13 9781789139495

Length 312 pages

Edition 1st Edition

Authors (2):

Rajalingappaa Shanmugamani

Rajesh Arumugam

Building an RNN model for speech recognition

We will use the Free Spoken Digit Dataset (FSDD), available at https://github.com/Jakobovski/free-spoken-digit-dataset/tree/master/recordings, for our basic model. Download the recordings to any directory on your system. In the example code, replace the path to the .wav files with the directory where you saved the data.

Note

We have split the data into a training set of 1,470 files and a test set of 30 files.
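The text does not say how the 30 test files were chosen, so the following is only a minimal sketch of one way to produce a split of that size: a seeded random shuffle that holds out 30 recordings. The helper names `split_recordings` and `digit_label` are hypothetical, the placeholder file names stand in for the downloaded recordings, and the label parsing assumes the dataset's `{digit}_{speaker}_{index}.wav` naming.

```python
import random
from pathlib import Path


def split_recordings(paths, n_test=30, seed=0):
    """Shuffle the recordings and hold out n_test of them for testing.

    Assumption: a seeded random hold-out; the original split method is not stated.
    """
    paths = sorted(paths)        # sort first so the split is reproducible
    rng = random.Random(seed)
    rng.shuffle(paths)
    return paths[n_test:], paths[:n_test]


def digit_label(path):
    """Read the digit from a name such as 7_jackson_32.wav (assumed layout)."""
    return int(Path(path).name.split("_")[0])


# Placeholder names standing in for the downloaded recordings. In practice,
# something like sorted(Path(data_dir).glob("*.wav")) would be used instead.
example_paths = [f"{d}_speaker_{i}.wav" for d in range(10) for i in range(150)]

train_paths, test_paths = split_recordings(example_paths)
print(len(train_paths), len(test_paths))  # 1470 30
```

Fixing the seed keeps the held-out files the same across runs, which makes test results comparable between experiments.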

Before we get into the details of the model itself, we will look at how to prepare the data for training. The most common preprocessing step in practice is to transform the raw audio into its frequency spectrum. The frequency spectrum, or power spectrum, acts like a fingerprint of the signal: the raw audio is broken down into its constituent frequencies. This representation shows which frequencies (low or high pitch) carry the most power or energy in the signal. We will now look at how...
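To make the idea of a power spectrum concrete, here is a minimal sketch that uses NumPy's real-input FFT on a synthetic tone rather than a real recording. The function name `power_spectrum`, the 8 kHz sample rate, and the test frequencies are assumptions chosen for illustration, and this is not necessarily the preprocessing used later in the chapter.

```python
import numpy as np


def power_spectrum(signal, sample_rate):
    """Return (frequencies in Hz, power) for a 1-D real-valued signal."""
    spectrum = np.fft.rfft(signal)                    # one-sided FFT of a real signal
    power = np.abs(spectrum) ** 2 / len(signal)       # squared magnitude per frequency bin
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return freqs, power


# Synthetic one-second signal: a strong 440 Hz tone plus a weaker 1000 Hz tone.
sample_rate = 8000  # assumed sample rate, used only for this illustration
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 440 * t) + 0.3 * np.sin(2 * np.pi * 1000 * t)

freqs, power = power_spectrum(signal, sample_rate)
dominant = freqs[np.argmax(power)]
print(f"Dominant frequency: {dominant:.1f} Hz")
```

The bin with the most power lands at the 440 Hz component, which is the sense in which the spectrum reveals the dominant frequencies of a signal. For real speech, the same computation is usually applied to short overlapping windows to obtain a spectrogram.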