You're reading from Hands-On Graph Analytics with Neo4j: Perform graph processing and visualization techniques using connected data across your enterprise

Product type Paperback

Published in Aug 2020

Publisher Packt

ISBN-13 9781839212611

Length 510 pages

Edition 1st Edition

Languages

Cypher

Tools

Neo4j

Concepts

Database Programming

Author (1):

Scifo

Training a model

In this chapter, we will use a simple decision tree classifier. It can be trained with scikit-learn as follows:

from sklearn.tree import DecisionTreeClassifier
# Fix the random seed for reproducibility; require at least 10 samples in each leaf
clf = DecisionTreeClassifier(random_state=123, min_samples_leaf=10)
clf.fit(X_train, y_train)

However, if you run this code on our current dataset, you will get an error, because the decision tree does not know how to handle NaN (missing) values, and a couple of rows in our dataset have missing information.
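
Before imputing anything, it can help to check how many values are actually missing. The following is a minimal sketch, assuming the features are held in a pandas DataFrame; the small frame `X_demo` and its column names are hypothetical stand-ins, not the chapter's dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for X_train; the column names are illustrative only.
X_demo = pd.DataFrame({
    "feature_a": [1.0, 2.0, np.nan, 4.0],
    "feature_b": [0.5, np.nan, np.nan, 2.0],
})

# Number of missing values in each column.
missing_per_column = X_demo.isna().sum()

# Number of rows that contain at least one missing value.
rows_with_missing = int(X_demo.isna().any(axis=1).sum())

print(missing_per_column)
print(rows_with_missing)
```

If only a few rows are affected, as is the case here, filling the gaps is usually preferable to dropping those rows.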

To fill these NaN values, we will use a SimpleImputer, a transformer that replaces each NaN value with the mean value of its feature. Following the scikit-learn API, we first need to fit the transformer on our training sample:

from sklearn.impute import SimpleImputer
imp = SimpleImputer(strategy='mean')
# Learn the mean of each feature from the training sample only
imp.fit(X_train)

We then need to apply the transformation to both our training and test samples:

# Fill the NaN values in both samples with the means learned from the training sample
X_train = imp.transform(X_train)
X_test = imp.transform(X_test)
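
The steps above can be put together in a small end-to-end sketch. It is only an illustration under stated assumptions: the toy arrays and the `_demo` names are hypothetical, and `min_samples_leaf=10` is left out because four training rows could never satisfy it.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data standing in for the chapter's samples.
X_train_demo = np.array([[1.0, 10.0], [3.0, np.nan], [np.nan, 30.0], [5.0, 20.0]])
y_train_demo = np.array([0, 0, 1, 1])
X_test_demo = np.array([[np.nan, 40.0], [2.0, np.nan]])

# Fit the imputer on the training rows only, so that no information
# from the test sample leaks into the learned means.
imp_demo = SimpleImputer(strategy="mean")
imp_demo.fit(X_train_demo)

# The learned per-column means ignore NaN entries:
# (1 + 3 + 5) / 3 = 3.0 and (10 + 30 + 20) / 3 = 20.0
print(imp_demo.statistics_)

# Both samples are filled with the means learned from the training sample.
X_train_filled = imp_demo.transform(X_train_demo)
X_test_filled = imp_demo.transform(X_test_demo)

# With the NaN values gone, the decision tree can be trained and used.
clf_demo = DecisionTreeClassifier(random_state=123)
clf_demo.fit(X_train_filled, y_train_demo)
predictions = clf_demo.predict(X_test_filled)
print(predictions)
```

Note that the missing entries in the test sample are replaced with the training means (3.0 and 20.0), not with means computed from the test sample itself.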

Once the data has...

Authors (1)

Scifo

Estelle Scifo has over seven years of experience as a data scientist, after receiving her PhD from the Laboratoire de l'Accélérateur Linéaire, Orsay (affiliated to CERN in Geneva). As a Neo4j certified professional, she uses graph databases on a daily basis and takes full advantage of their features to build efficient machine learning models out of this data. She is also a data science mentor who guides newcomers into the field. Her domain expertise and deep insight into beginners' needs make her an excellent teacher.
