You're reading from Hands-On Graph Analytics with Neo4j: Perform graph processing and visualization techniques using connected data across your enterprise

Product type Paperback

Published in Aug 2020

Publisher Packt

ISBN-13 9781839212611

Length 510 pages

Edition 1st Edition

Languages: Cypher

Tools: Neo4j

Concepts: Database Programming

Author: Estelle Scifo

Table of Contents (18) Chapters

Preface

1. Section 1: Graph Modeling with Neo4j

2. Graph Databases

3. The Cypher Query Language

4. Empowering Your Business with Pure Cypher

5. Section 2: Graph Algorithms

6. The Graph Data Science Library and Path Finding

7. Spatial Data

8. Node Importance

9. Community Detection and Similarity Measures

10. Section 3: Machine Learning on Graphs

11. Using Graph-based Features in Machine Learning

12. Predicting Relationships

13. Graph Embedding - from Graphs to Matrices

14. Section 4: Neo4j for Production

15. Using Neo4j in Your Web Application

16. Neo4j at Scale

17. Other Books You May Enjoy

One-hot encoding

Let's consider the following quotation, uttered by the famous detective Sherlock Holmes in the novel A Study in Scarlet, by Arthur Conan Doyle:

It is a capital mistake to theorize before one has data.

First, we will simplify this sentence by removing words that do not carry any information, such as a and the (known as stop words in NLP), and by reducing each conjugated verb to its base form:

be capital mistake theorize before one have data

In most cases, we would also sort the words, say alphabetically, and remove duplicates, which leaves us with the following words to encode:

be before capital data have mistake one theorize
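The two preprocessing steps above can be sketched in a few lines of Python. This is only a minimal illustration, not the book's implementation: the `STOP_WORDS` set and the `LEMMAS` lookup table are hypothetical and hand-written for this single sentence, whereas a real pipeline would normally rely on an NLP library for stop word lists and lemmatization.

```python
import string

# Hypothetical, hand-written lookup tables that only cover this one sentence.
STOP_WORDS = {"it", "a", "to", "the"}
LEMMAS = {"is": "be", "has": "have"}


def preprocess(sentence):
    """Lowercase, strip punctuation, map verbs to their base form, drop stop words."""
    cleaned = sentence.lower().translate(str.maketrans("", "", string.punctuation))
    tokens = [LEMMAS.get(word, word) for word in cleaned.split()]
    return [word for word in tokens if word not in STOP_WORDS]


sentence = "It is a capital mistake to theorize before one has data."
tokens = preprocess(sentence)

# Sorting the set of tokens gives the alphabetical, duplicate-free word list.
vocabulary = sorted(set(tokens))

print(" ".join(tokens))      # be capital mistake theorize before one have data
print(" ".join(vocabulary))  # be before capital data have mistake one theorize
```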

In order to represent each word of this corpus with a vector, we can use the one-hot encoding technique. For each word, this involves creating a vector whose size is equal to the number of distinct words in the corpus, filled with zeros everywhere except for a 1 at the index of that word. This is illustrated in the following diagram:

The word be is...
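As a rough sketch of the encoding described above, the following Python snippet builds the one-hot vector of each word in the eight-word list. The `one_hot` helper is a name introduced here for illustration only, and plain Python lists are used instead of any particular numerical library.

```python
# The sorted, duplicate-free words of the simplified sentence.
vocabulary = ["be", "before", "capital", "data", "have", "mistake", "one", "theorize"]


def one_hot(word, vocabulary):
    """Return a list of zeros with a single 1 at the position of `word`."""
    vector = [0] * len(vocabulary)
    vector[vocabulary.index(word)] = 1
    return vector


for word in vocabulary:
    print(f"{word:<9}", one_hot(word, vocabulary))

# First line printed:
# be        [1, 0, 0, 0, 0, 0, 0, 0]
```

Each vector has as many entries as there are words to encode, so its size grows with the vocabulary, and any two distinct words are represented by vectors that share no non-zero position.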

Estelle Scifo has over 7 years of experience as a data scientist, after receiving her PhD from the Laboratoire de l'Accélérateur Linéaire, Orsay (affiliated to CERN in Geneva). As a Neo4j certified professional, she uses graph databases on a daily basis and takes full advantage of their features to build efficient machine learning models from this data. She is also a data science mentor who guides newcomers into the field. Her domain expertise and deep insight into beginners' needs make her an excellent teacher.