





















































In this article by Ankur Goel, author of the book, Neo4j Cookbook, we will cover the following recipes:
By definition, any database that is optimized to store and query data representing objects defined in a geometric space is called a spatial database. Although Neo4j is primarily a graph database, the importance of geospatial data in today's world has led to a spatial extension, which is installed in Neo4j as an unmanaged extension. It offers most of the facilities provided by common geospatial databases, along with the connectivity through edges that Neo4j provides as a graph database.
In this article, we will take a look at some of the widely used use cases of Neo4j as a spatial database, and you will learn how typical geospatial operations can be performed on it.
Before proceeding further, you need to install Neo4j on your system using one of the recipes that follow. The installation depends on your system type.
Neo4j is a highly scalable graph database that runs over all the common platforms; it can be used as is or can be embedded inside applications as well. The following recipe will show you how to set up a single instance of Neo4j over the Linux operating system.
Perform the following steps to get started with this recipe:
$ wget http://dist.neo4j.org/neo4j-community-2.2.0-M02-unix.tar.gz
$ echo $JAVA_HOME
If this command throws no output, install Java for your Linux distribution and also set the JAVA_HOME path
Now, let's install Neo4j over the Linux operating system, which is simple, as shown in the following steps:
$ tar -zxvf neo4j-community-2.2.0-M02-unix.tar.gz
$ ls
$ cd <neo4j-community-2.2.0-M02>/bin/
$ ./neo4j start
$ ./neo4j status
Neo4j can also be monitored using the web console. Open http://<ip>:7474/webadmin, as shown in the following screenshot:
The preceding diagram is a screenshot of the web console of Neo4j through which the server can be monitored and different Cypher queries can be run over the graph database.
Neo4j comes with prebuilt binaries for the Linux operating system, which can be extracted and run directly. It also comes with both web-based and terminal-based consoles, through which the graph database can be explored.
During installation, you may face several kinds of issues, such as the limit on the maximum number of open files. For more information, check out http://neo4j.com/docs/stable/server-installation.html#linux-install.
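The open-file limit mentioned above can be inspected before starting the server. The following Python 3 sketch uses the standard resource module (available on Unix-like systems only). The value of 40000 is an assumption for illustration; check the installation guide for the figure that applies to your Neo4j version.

```python
import resource

# Assumed target value for illustration; confirm it in the installation guide.
RECOMMENDED_OPEN_FILES = 40000


def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def limit_too_low(soft_limit, recommended=RECOMMENDED_OPEN_FILES):
    """Return True when the soft limit is finite and below the recommended value."""
    if soft_limit == resource.RLIM_INFINITY:
        return False
    return soft_limit < recommended


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print("soft limit: %s, hard limit: %s" % (soft, hard))
    if limit_too_low(soft):
        print("Consider raising the open-file limit before starting Neo4j.")
```

The check only reports the limit of the current process; the limit that matters is the one in effect for the user that runs the Neo4j server.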
Neo4j is a highly scalable graph database that runs over all the common platforms; it can be used as is or can be embedded inside applications. The following recipe will show you how to set up a single instance of Neo4j over the Windows operating system.
Perform the following steps to get started with this recipe:
This has both 32- and 64-bit prebuilt binaries
echo %JAVA_HOME%
If this command throws no output, install Java for your Windows distribution and also set the JAVA_HOME path
Now, let's install Neo4j over the Windows operating system, which is simple, as shown here:
The preceding screenshot shows the Windows installer running.
The preceding screenshot shows the Windows installer asking for the graph database's location.
Neo4j comes with prebuilt binaries for the Windows operating system, which can be installed and run directly. It also comes with both web-based and terminal-based consoles, through which the graph database can be explored.
During installation, you might face several kinds of issues such as max open files and so on. For more information, check out http://neo4j.com/docs/stable/server-installation.html#windows-install.
Neo4j is a highly scalable graph database that runs over all the common platforms; it can be used as is or can be embedded inside applications. The following recipe will show you how to set up a single instance of Neo4j over the OS X operating system.
Perform the following steps to get started with this recipe:
$ wget http://dist.neo4j.org/neo4j-community-2.2.0-M02-unix.tar.gz
$ echo $JAVA_HOME
If this command throws no output, install Java for your Mac OS X distribution and also set the JAVA_HOME path
Now, let's install Neo4j over the OS X operating system, which is very simple, as shown in the following steps:
$ tar -zxvf neo4j-community-2.2.0-M02-unix.tar.gz
$ ls
$ cd <neo4j-community-2.2.0-M02>/bin/
$ ./neo4j start
$ ./neo4j status
Neo4j comes with prebuilt binaries for the OS X operating system, which can be extracted and run directly. It also comes with both web-based and terminal-based consoles, through which the graph database can be explored.
Neo4j can also be installed on Mac OS X using brew, as explained here.
Run the following commands over the shell:
$ brew update
$ brew install neo4j
After this, Neo4j can be started using the start option with the Neo4j command:
$ neo4j start
This will start the Neo4j server, which can be accessed from the default URL (http://localhost:7474).
The installation can be reached using the following commands:
$ cd /usr/local/Cellar/neo4j/
$ cd {NEO4J_VERSION}/libexec/
You can learn more about OS X installation from http://neo4j.com/docs/stable/server-installation.html#osx-install.
Due to the limited space available in this article, we assume that you already know how to perform basic operations with Neo4j, such as creating a graph, importing data from different formats, and applying the common Neo4j configurations.
Neo4j Spatial is a library of utilities for Neo4j that facilitates the enabling of spatial operations on the data. Even on the existing data, geospatial indexes can be added and many geospatial operations can be performed on it.
In this recipe, you will learn how to install the Neo4j Spatial extension.
The following steps will get you started with this recipe:
apt-get install maven
yum install apache-maven
Now, let's install the Neo4j Spatial plugin, which is very simple to do, by following these steps:
git clone git://github.com/neo4j/spatial spatial
cd spatial
mvn clean install
unzip target/neo4j/neo4j-spatial-0.11-SNAPSHOT-server-plugin.zip -d $NEO4J_ROOT_DIR/plugins/
$NEO4J_ROOT_DIR/bin/neo4j restart
curl -L http://<neo4j_server_ip>:<port>/db/data
If you are using Neo4j 2.2 or higher, then use the following command:
curl --user neo4j:<password> http://localhost:7474/db/data/
The output will look like what is shown in the following screenshot, which shows the Neo4j Spatial plugin installed:
Neo4j Spatial is a library of utilities that helps perform geospatial operations on the dataset present in the Neo4j graph database. You can add geospatial indexes to existing data and perform operations such as finding data within a specified region or within some distance of a point of interest.
Neo4j Spatial comes as an unmanaged extension, which can be easily installed as well as removed from Neo4j. The extension does not interfere with any of the core functionality.
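The check performed with curl in the steps above can also be scripted. The following Python 3 sketch assumes that the /db/data/ response is a JSON document whose extensions object lists the installed plugins by name. The sample document is hand-written for illustration and was not captured from a real server.

```python
import json


def has_spatial_plugin(service_root):
    """Return True if the service root document lists a SpatialPlugin extension."""
    extensions = service_root.get("extensions", {})
    return "SpatialPlugin" in extensions


# Illustrative, hand-written response fragment (not captured from a real server).
sample = json.loads("""
{
  "extensions": {
    "SpatialPlugin": {
      "addSimplePointLayer": "http://localhost:7474/db/data/ext/SpatialPlugin/graphdb/addSimplePointLayer"
    }
  },
  "node": "http://localhost:7474/db/data/node"
}
""")

print(has_spatial_plugin(sample))
```

In practice, the dictionary passed to has_spatial_plugin would come from parsing the body returned by the curl command.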
To read more about Neo4j Spatial extension, we encourage users to visit the GitHub repository at https://github.com/neo4j-contrib/spatial.
Also, it will be good to read about the Neo4j unmanaged extension in general (http://neo4j.com/docs/stable/server-unmanaged-extensions.html).
The shapefile format is a popular geospatial vector data format for Geographic Information System (GIS) software. It is developed and regulated by Esri as an open specification for data interoperability among Esri and other GIS software products. Because of this popularity, geospatial data is frequently distributed as Esri shapefiles.
The main file is the .shp file, which contains the geometry data. The binary data file consists of a single, fixed-length header followed by variable-length data records.
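To make the fixed-length header concrete, the following Python 3 sketch decodes the first 100 bytes of a .shp file using the field layout published in the Esri shapefile specification. The function names are hypothetical, and build_test_header only produces a synthetic header so that the parser can be tried without a real file.

```python
import struct

HEADER_SIZE = 100  # the main file header is always 100 bytes


def parse_shp_header(data):
    """Decode the fixed-length header at the start of a .shp file."""
    if len(data) < HEADER_SIZE:
        raise ValueError("a .shp header needs at least 100 bytes")
    # Bytes 0-3: file code (big-endian); bytes 24-27: file length in 16-bit words.
    file_code, = struct.unpack(">i", data[0:4])
    length_words, = struct.unpack(">i", data[24:28])
    # Bytes 28-35: version and shape type (little-endian integers).
    version, shape_type = struct.unpack("<ii", data[28:36])
    # Bytes 36-67: bounding box as four little-endian doubles.
    xmin, ymin, xmax, ymax = struct.unpack("<4d", data[36:68])
    if file_code != 9994:
        raise ValueError("not a shapefile: unexpected file code %d" % file_code)
    return {
        "length_bytes": length_words * 2,
        "version": version,
        "shape_type": shape_type,
        "bbox": (xmin, ymin, xmax, ymax),
    }


def build_test_header(shape_type, bbox):
    """Build a synthetic header (with no records) for demonstration purposes."""
    header = struct.pack(">i", 9994) + b"\x00" * 20  # file code + 5 unused ints
    header += struct.pack(">i", HEADER_SIZE // 2)     # length in 16-bit words
    header += struct.pack("<ii", 1000, shape_type)    # version, shape type
    header += struct.pack("<4d", *bbox)               # Xmin, Ymin, Xmax, Ymax
    header += b"\x00" * 32                            # Zmin, Zmax, Mmin, Mmax
    return header
```

With a real file, the same parser can be applied to the first 100 bytes, for example parse_shp_header(open("/data/AFG_adm1.shp", "rb").read(100)).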
In this recipe, you will learn how to import the Esri shapefiles in the Neo4j graph database.
Perform the following steps to get started with this recipe:
$NEO4J_ROOT_DIR/bin/neo4j restart
Since the Esri shapefile format is, by default, supported by the Neo4j Spatial extension, it is very easy to import data using the Java API from it using the following steps:
Execute the following commands:
wget http://biogeo.ucdavis.edu/data/diva/adm/AFG_adm.zip
unzip AFG_adm.zip
mv AFG_adm1.* /data
// Import a single shapefile into a layer
GraphDatabaseService esri_database = new GraphDatabaseFactory().newEmbeddedDatabase(storeDir);
try {
    ShapefileImporter importer = new ShapefileImporter(esri_database);
    importer.importFile("/data/AFG_adm1.shp", "layer_afganistan");
} finally {
    esri_database.shutdown();
}
// Import every .shp file found in a directory, using an importer created as in the previous step
File dir = new File("/data");
FilenameFilter filter = new FilenameFilter() {
    public boolean accept(File dir, String name) {
        return name.endsWith(".shp");
    }
};
File[] listOfFiles = dir.listFiles(filter);
for (final File fileEntry : listOfFiles) {
    System.out.println("FileEntry Directory " + fileEntry);
    try {
        importer.importFile(fileEntry.toString(), "layer_afganistan");
    } catch (Exception e) {
        esri_database.shutdown();
    }
}
The Neo4j Spatial extension natively supports the import of data in the shapefile format. Using the ShapefileImporter class, any SHP file can be easily imported into Neo4j. Its importFile method takes two arguments: the first is the path to the SHP file and the second is the name of the layer into which the data should be imported.
We will encourage you to read more about shapefiles and layers in general; for this, please visit the following URLs for more information:
OpenStreetMap is a powerhouse of data when it comes to geospatial data. It is a collaborative project to create a free, editable map of the world. OpenStreetMap provides geospatial data in the .osm file format. To read more about .osm files in general, check out http://wiki.openstreetmap.org/wiki/.osm.
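An .osm file is an XML document in which node elements carry an id, a latitude, and a longitude, optionally with key/value tags. The following Python 3 sketch reads those nodes from a small hand-written fragment. It only illustrates the file structure and is not part of the Neo4j Spatial importer.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment for illustration; real extracts are far larger.
SAMPLE_OSM = """<osm version="0.6" generator="example">
  <node id="1" lat="-24.6282" lon="25.9231">
    <tag k="name" v="Gaborone"/>
  </node>
  <node id="2" lat="-21.1700" lon="27.5079"/>
  <way id="10">
    <nd ref="1"/>
    <nd ref="2"/>
  </way>
</osm>
"""


def read_nodes(osm_xml):
    """Return a list of (id, lat, lon, tags) tuples for every <node> element."""
    root = ET.fromstring(osm_xml)
    nodes = []
    for node in root.findall("node"):
        # Each <tag k="..." v="..."/> child becomes one dictionary entry.
        tags = {t.get("k"): t.get("v") for t in node.findall("tag")}
        nodes.append((int(node.get("id")), float(node.get("lat")),
                      float(node.get("lon")), tags))
    return nodes


for node in read_nodes(SAMPLE_OSM):
    print(node)
```

For country-sized extracts, a streaming parser such as xml.etree.ElementTree.iterparse would be a better fit than loading the whole document into memory.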
In this recipe, you will learn how to import the .osm files in the Neo4j graph database.
Perform the following steps to get started with this recipe:
$NEO4J_ROOT_DIR/bin/neo4j restart
Since the OSM file format is, by default, supported by the Neo4j Spatial extension, it is very easy to import data from it using the following steps:
Execute the following commands:
wget http://download.geofabrik.de/africa-latest.osm.bz2
bunzip2 africa-latest.osm.bz2
mv africa-latest.osm /data
OSMImporter importer = new OSMImporter("africa");
try {
    // Phase 1: insert the OSM data downloaded in the previous step
    importer.importFile(osm_database, "/data/africa-latest.osm", false, 5000, true);
} catch (Exception e) {
    osm_database.shutdown();
}
// Phase 2: rebuild the spatial indexes
importer.reIndex(osm_database, 10000);
File dir = new File("/data");
FilenameFilter filter = new FilenameFilter() {
    public boolean accept(File dir, String name) {
        return name.endsWith(".osm");
    }
};
File[] listOfFiles = dir.listFiles(filter);
for (final File fileEntry : listOfFiles) {
    System.out.println("FileEntry Directory " + fileEntry);
    try {
        importer.importFile(osm_database, fileEntry.toString(), false, 5000, true);
        importer.reIndex(osm_database, 10000);
    } catch (Exception e) {
        osm_database.shutdown();
    }
}
This is slightly more complex as it requires two phases: the first phase requires a batch inserter performing insertions into the database, and the second phase requires reindexing of the database with the spatial indexes.
We will encourage you to read more about the OSM file and the batch inserter in general; for this, visit the following URLs:
The recipes that you have learned until now consist of Java code, which is used to import spatial data into Neo4j. However, by using any other programming language, such as Python or Ruby, spatial data can be easily imported into Neo4j using the REST interface.
In this recipe, you will learn how to import geospatial data using the REST interface.
Perform the following steps to get started with this recipe:
$NEO4J_ROOT_DIR/bin/neo4j restart
Importing geospatial data into the Neo4j graph database server through the REST interface is a simple three-stage process. For the sake of simplicity, Python code is used to explain this recipe, although you can also use curl:
# Python 2 code (uses urllib2)
import json
import urllib2

# Create geom index
url = "http://<neo4j_server_ip>:<port>/db/data/index/node/"
payload = {
    "name": "geom",
    "config": {
        "provider": "spatial",
        "geometry_type": "point",
        "lat": "lat",
        "lon": "lon"
    }
}
req = urllib2.Request(url)
req.add_header('Content-Type', 'application/json')
response = urllib2.urlopen(req, json.dumps(payload))

# Create a node that carries the lat and lon properties
url = "http://<neo4j_server_ip>:<port>/db/data/node"
payload = {'lon': 38.6, 'lat': 67.88, 'name': 'abc'}
req = urllib2.Request(url)
req.add_header('Content-Type', 'application/json')
response = urllib2.urlopen(req, json.dumps(payload))
node = json.loads(response.read())['self']

# Add node to geom index
url = "http://<neo4j_server_ip>:<port>/db/data/index/node/geom"
payload = {'value': 'dummy', 'key': 'dummy', 'uri': node}
req = urllib2.Request(url)
req.add_header('Content-Type', 'application/json')
response = urllib2.urlopen(req, json.dumps(payload))
print response.read()
The data will look like what is shown in the following screenshot after the addition of a few more nodes; this screenshot depicts the Neo4j Spatial data that has been imported:
The following screenshot depicts the properties of a single node, which has been imported into Neo4j:
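Adding geospatial data using the REST API is a three-step process, listed as follows:
Create the geom index by posting its configuration to http://<neo4j_server_ip>:<port>/db/data/index/node/
Create a node that carries the lat and lon properties by posting to http://<neo4j_server_ip>:<port>/db/data/node
Add the node to the geom index by posting its URI to http://<neo4j_server_ip>:<port>/db/data/index/node/geom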
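The three steps can be sketched as plain functions that return the target URL and JSON body for each POST, which keeps the order and the payloads easy to inspect before anything is sent. This is a minimal Python 3 sketch: BASE_URL and the function names are assumptions made for illustration, while the payload fields are the ones used in this recipe.

```python
import json

BASE_URL = "http://localhost:7474/db/data"  # assumed server address


def index_request(name="geom"):
    """Step 1: describe the POST that creates a spatial point index."""
    payload = {"name": name,
               "config": {"provider": "spatial", "geometry_type": "point",
                          "lat": "lat", "lon": "lon"}}
    return BASE_URL + "/index/node/", json.dumps(payload)


def node_request(lat, lon, **properties):
    """Step 2: describe the POST that creates a node carrying lat/lon."""
    payload = dict(properties, lat=lat, lon=lon)
    return BASE_URL + "/node", json.dumps(payload)


def add_to_index_request(node_uri, name="geom"):
    """Step 3: describe the POST that adds an existing node to the index."""
    payload = {"key": "dummy", "value": "dummy", "uri": node_uri}
    return BASE_URL + "/index/node/" + name, json.dumps(payload)


if __name__ == "__main__":
    for url, body in (index_request(),
                      node_request(67.88, 38.6, name="abc"),
                      add_to_index_request(BASE_URL + "/node/1")):
        print(url, body)
```

The node URI passed to the third step is the value of the self field returned by the second request; the /node/1 URI above is only a placeholder.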
We encourage you to read more about the spatial REST interfaces in general (http://neo4j-contrib.github.io/spatial/).
In this recipe, you will learn how to create a point layer using the REST API interface.
Perform the following steps to get started with this recipe:
$NEO4J_ROOT_DIR/bin/neo4j restart
In this recipe, we will use the http://<neo4j_server_ip>:<port>/db/data/ext/SpatialPlugin/graphdb/addSimplePointLayer endpoint to create a simple point layer.
Let's add a simple point layer, as shown in the following code:
"layer" : "geom",
"lat" : "lat",
"lon" : "lon"

import json
import requests

url = "http://<neo4j_server_ip>:<port>/db/data/ext/SpatialPlugin/graphdb/addSimplePointLayer"
headers = {'Content-Type': 'application/json'}
payload = {
    "layer": "geom",
    "lat": "lat",
    "lon": "lon",
}
r = requests.post(url, data=json.dumps(payload), headers=headers)
The data will look like what is shown in the following screenshot; this screenshot shows the output of the create point in layer query:
The create point layer query is based on the REST interface that the Neo4j Spatial plugin provides.
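If the requests library is not available, the same call can be prepared with the Python 3 standard library. The sketch below builds the request object without sending it; the helper name build_json_post and the localhost address are assumptions for illustration.

```python
import json
import urllib.request


def build_json_post(url, payload):
    """Build (but do not send) a JSON POST request for a plugin endpoint."""
    body = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Content-Type", "application/json")
    request.add_header("Accept", "application/json")
    return request


request = build_json_post(
    "http://localhost:7474/db/data/ext/SpatialPlugin/graphdb/addSimplePointLayer",
    {"layer": "geom", "lat": "lat", "lon": "lon"},
)
# Sending is left to the caller, for example:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Keeping the construction separate from the network call makes it possible to inspect the method, headers, and body first, and the same helper can be reused for the other plugin endpoints in this article.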
We will encourage you to read more about spatial REST interfaces in general; to do this, visit http://neo4j-contrib.github.io/spatial/.
In this recipe, you will learn how to find all the geometries within the bounding box using the spatial REST interface.
Perform the following steps to get started with this recipe:
$NEO4J_ROOT_DIR/bin/neo4j restart
In this recipe, we will use the following endpoint to find all the geometries within the bounding box: http://<neo4j_server_ip>:<port>/db/data/ext/SpatialPlugin/graphdb/findGeometriesInBBox.
Let's find all the geometries using the following information:
"minx" : 0.0,
"maxx" : 100.0,
"miny" : 0.0,
"maxy" : 100.0

import json
import requests

url = "http://<neo4j_server_ip>:<port>/db/data/ext/SpatialPlugin/graphdb/findGeometriesInBBox"
headers = {'Content-Type': 'application/json'}
payload = {
    "layer": "geom",
    "minx": 0.0,
    "maxx": 100.3,
    "miny": 0.0,
    "maxy": 100.0
}
r = requests.post(url, data=json.dumps(payload), headers=headers)
The data will look like what is shown in the following screenshot; this screenshot shows the output of the bounding box query:
Finding geometries in the bounding box is based on the REST interface that the Neo4j Spatial plugin provides. The output of the REST call contains an array of nodes, each with the node's id, lat/lon, and its incoming/outgoing relationships. In the preceding output, you can see that the node with id 54 is returned.
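Conceptually, a bounding box query keeps the points whose longitude falls between minx and maxx and whose latitude falls between miny and maxy. The following Python 3 sketch shows that test on plain dictionaries. It is not the plugin's implementation, which relies on a spatial index rather than a linear scan, and the second point is made up for contrast.

```python
def in_bbox(lon, lat, minx, maxx, miny, maxy):
    """Return True if the point (lon, lat) lies inside or on the bounding box."""
    return minx <= lon <= maxx and miny <= lat <= maxy


points = [
    {"name": "abc", "lon": 38.6, "lat": 67.88},          # values from the REST import recipe
    {"name": "west", "lon": -114.0117, "lat": 46.8625},  # made-up point outside the box
]
inside = [p["name"] for p in points
          if in_bbox(p["lon"], p["lat"], 0.0, 100.0, 0.0, 100.0)]
print(inside)
```

Note that x corresponds to longitude and y to latitude, which is easy to get the wrong way round when filling in the payload.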
We will encourage you to read more about spatial REST interfaces in general; to do this, visit http://neo4j-contrib.github.io/spatial/.
In this recipe, you will learn how to find all the geometries within a distance using the spatial REST interface.
Perform the following steps to get started with this recipe:
$NEO4J_ROOT_DIR/bin/neo4j restart
In this recipe, we will use the following endpoint to find all the geometries within a certain distance: http://<neo4j_server_ip>:<port>/db/data/ext/SpatialPlugin/graphdb/findGeometriesWithinDistance.
Let's find all the geometries within the specified distance using the following information:
"pointX" : -116.67,
"pointY" : 46.89,
"distanceInKm" : 500

import json
import requests

url = "http://<neo4j_server_ip>:<port>/db/data/ext/SpatialPlugin/graphdb/findGeometriesWithinDistance"
headers = {'Content-Type': 'application/json'}
payload = {
    "layer": "geom",
    "pointY": 46.8625,
    "pointX": -114.0117,
    "distanceInKm": 500,
}
r = requests.post(url, data=json.dumps(payload), headers=headers)
The data will look like what is shown in the following screenshot; this screenshot shows the output of a withinDistance query:
Finding geometries within a distance is based on the REST interface that the Neo4j Spatial plugin provides. The output of the REST call contains an array of nodes, each with the node's id, lat/lon, and its incoming/outgoing relationships. In the preceding output, we can see that the node with id 71 is returned.
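To get a feel for what a distance query computes, the following Python 3 sketch filters points by great-circle distance using the haversine formula. The haversine formula and the mean Earth radius are assumptions chosen for illustration; the article does not state which distance calculation the plugin uses internally.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_distance(points, lat, lon, distance_km):
    """Keep the (name, lat, lon) tuples that lie within distance_km of (lat, lon)."""
    return [p for p in points
            if haversine_km(lat, lon, p[1], p[2]) <= distance_km]


points = [("near", 46.89, -116.67), ("far", 67.88, 38.6)]
print(within_distance(points, 46.8625, -114.0117, 500))
```

Here pointY corresponds to the latitude and pointX to the longitude of the search center.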
We encourage you to read more about the spatial REST interfaces in general (http://neo4j-contrib.github.io/spatial/).
In this recipe, you will learn how to find all the geometries within a distance using the Cypher query.
Perform the following steps to get started with this recipe:
$NEO4J_ROOT_DIR/bin/neo4j restart
In this recipe, we will use the following endpoint to find all the geometries within a certain distance:
http://<neo4j_server_ip>:<port>/db/data/cypher
Let's find all the geometries within a distance using a Cypher query:
"pointX" : -116.67,
"pointY" : 46.89,
"distanceInKm" : 500

import json
import requests

url = "http://<neo4j_server_ip>:<port>/db/data/cypher"
headers = {'Content-Type': 'application/json'}
payload = {
    "query": "START n=node:geom('withinDistance:[46.9163, -114.0905, 500.0]') RETURN n"
}
r = requests.post(url, data=json.dumps(payload), headers=headers)
The data will look like what is shown in the following screenshot; this screenshot shows the output of the withinDistance query that uses Cypher:
The following is the Cypher output in the Neo4j console:
Through the spatial index, Cypher supports a withinDistance query, which takes three parameters: lat, lon, and the search distance in kilometers.
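Since the three parameters are embedded in the index query string, a small helper can assemble it from numeric values. The following Python 3 sketch reproduces the query format used in this recipe; the function name and the default index name are assumptions for illustration.

```python
import json


def within_distance_query(lat, lon, distance_km, index="geom"):
    """Build the START clause that queries a spatial index by distance."""
    return ("START n=node:%s('withinDistance:[%s, %s, %s]') RETURN n"
            % (index, float(lat), float(lon), float(distance_km)))


query = within_distance_query(46.9163, -114.0905, 500)
payload = json.dumps({"query": query})
print(payload)
```

The parameter order is latitude first and then longitude, which is the reverse of the pointX/pointY order used by the REST plugin endpoints.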
We will encourage you to read more about the spatial REST interfaces in general (http://neo4j-contrib.github.io/spatial/).
This article, Developing Location-based Services with Neo4j, showed you how to deal with location, one of the most important aspects of today's data, in Neo4j. You have learned how to import geospatial data into Neo4j and run queries such as proximity searches, bounding box searches, and so on.