Neural Q-learning
Most reinforcement learning algorithms boil down to just three main steps: infer, do, and learn. During the first step, the algorithm selects the best action a in a given state s using the knowledge it has accumulated so far. Next, it performs the action and observes the reward r as well as the next state s'.
Then it improves its understanding of the world using the newly acquired experience (s, a, r, s'). These steps can be formalized through the Q-learning algorithm, which is more or less at the core of Deep Reinforcement Learning.
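To make the loop concrete, here is a minimal sketch in Python. The environment, the agent interface, and the method names (reset, step, select_action, update) are hypothetical stand-ins for illustration, not code from the text:

```python
import random

class ToyEnv:
    """One-step toy environment: action 1 in state 0 earns a reward of 1."""
    def reset(self):
        self.s = 0
        return self.s
    def step(self, a):
        r = 1.0 if (self.s == 0 and a == 1) else 0.0
        self.s, done = 1, True
        return self.s, r, done

class RandomAgent:
    def select_action(self, s):         # infer: pick an action (randomly here)
        return random.choice([0, 1])
    def update(self, s, a, r, s_next):  # learn: a real agent would improve here
        pass

env, agent = ToyEnv(), RandomAgent()
s, done = env.reset(), False
while not done:
    a = agent.select_action(s)          # infer
    s_next, r, done = env.step(a)       # do: observe reward r and next state s'
    agent.update(s, a, r, s_next)       # learn from the experience (s, a, r, s')
    s = s_next
```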
Introduction to Q-learning
Computing the acquired knowledge from a single experience (s, a, r, s') is a naive way to calculate utility. A more robust approach is to compute the utility of a particular state-action pair (s, a) by recursively considering the utilities of future actions. The utility of your current action is influenced not only by the immediate reward but also by the next best action, as shown in the following formula:
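The recurrence described here is the standard Q-learning formula. In conventional notation (the discount factor γ, which weights future utility against immediate reward, is standard Q-learning notation rather than a symbol defined above):

$$ Q(s, a) = r(s, a) + \gamma \max_{a'} Q(s', a') $$

As a sketch of how this recurrence becomes an update rule, here is a minimal tabular version in Python; the learning rate alpha, the discount gamma, and the toy action set are illustrative assumptions, not values from the text:

```python
from collections import defaultdict

# Minimal tabular Q-learning update (a sketch, not the chapter's code).
# alpha (learning rate) and gamma (discount factor) are assumed values.
alpha, gamma = 0.1, 0.9
Q = defaultdict(float)      # maps (state, action) pairs to utility estimates
actions = [0, 1]            # toy discrete action set

def q_update(s, a, r, s_next):
    # Blend the immediate reward with the utility of the best
    # action available from the next state s'.
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

# One update from zero-initialized values moves Q(s, a) toward the reward.
q_update(s=0, a=1, r=1.0, s_next=1)
print(Q[(0, 1)])            # prints 0.1
```

Iterating this update over many (s, a, r, s') experiences drives Q toward the recurrence above; the neural variant of the algorithm replaces the table with a function approximator.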