You're reading from Seven NoSQL Databases in a Week: Get up and running with the fundamentals and functionalities of seven of the most popular NoSQL databases

Product type Paperback

Published in Mar 2018

Publisher Packt

ISBN-13 9781787288867

Length 308 pages

Edition 1st Edition

Languages

Java

Tools

Cassandra

Concepts

Database Programming

Authors (2):

Sudarshan Kadambi

Xun (Brian) Wu

Table of Contents (16) Chapters

Title Page

Dedication

Packt Upsell

Contributors

Preface

1. Introduction to NoSQL Databases

2. MongoDB

3. Neo4j

4. Redis

5. Cassandra

6. HBase

7. DynamoDB

8. InfluxDB

Other Books You May Enjoy

Leave a review - let other readers know what you think

Index

A

ACID properties / ACID guarantees
advanced packaging tool (APT) / Installation using apt-get
Apache Software Foundation (ASF) / Introduction to Cassandra
Append-only Files (AOF) / Tunable data durability
applications, MongoDB
- user profiles / MongoDB documents
- product and catalog data / MongoDB documents
- metadata / MongoDB documents
- content / MongoDB documents
Atomicity, Consistency, Isolation, and Durability (ACID) / Modeling relational data
attributes, DynamoDB / Tables, items, and attributes
attributes, MongoDB / MongoDB documents
availability / Consistency versus availability
availability zone / Node configuration
AWS
- used, for setting up DynamoDB / Setting up using AWS

B

BATCH statements
- incorrect use / Incorrect use of BATCH statements
- ByteOrderedPartitioner, using / Using Byte Ordered Partitioner
- load balancer, using / Using a load balancer in front of Cassandra nodes
- framework driver, using / Using a framework driver
best practices, DynamoDB
- table best practices / Best practices
- item best practices / Best practices
- query and scan best practices / Best practices
- local secondary indexes best practices / Best practices
binary large object (BLOB) / Storing binary large object data
Byte Ordered Partitioner (BOP) / Using Byte Ordered Partitioner

C

cache sharding / Cache sharding
CAP theorem / ACID guarantees
Cassandra
- about / Introduction to Cassandra, Introduction to InfluxDB
- using / What problems does Cassandra solve?, Using Cassandra
- key features / What are the key features of Cassandra?
- tunable consistency / Tunable consistency
- data center awareness / Data center awareness
- linear scalability / Linear scalability
- building, with JVM / Built on the JVM
- use cases / Appropriate use cases for Cassandra
- internals, overview / Overview of the internals
- data modeling / Data modeling in Cassandra
- partition keys / Partition keys
- clustering keys / Clustering keys
- implementing / Putting it all together
- optimal use cases / Optimal use cases
- hardware, selecting / Cassandra hardware selection, installation, and configuration
- RAM / RAM
- CPU / CPU
- disk / Disk
- operating system / Operating system
- network/firewall / Network/firewall
- installation, with apt-get / Installation using apt-get
- tarball, installing / Tarball installation
- JVM, installing / JVM installation
- running / Running Cassandra
- node, adding to cluster / Adding a new node to the cluster
- nodetool / Nodetool
- CQLSH / CQLSH
- Python, using / Python
- Java, using / Java
- used, for backing up / Taking a backup with Cassandra
- snapshot, restoring from / Restoring from a snapshot
- executing, on Linux / Run Cassandra on Linux
- 7000 port / Open ports 7199, 7000, 7001, and 9042
- 7001 port / Open ports 7199, 7000, 7001, and 9042
- 7199 port / Open ports 7199, 7000, 7001, and 9042
- 9042 port / Open ports 7199, 7000, 7001, and 9042
- security, enabling / Enable security
- solid state drives (SSDs), using / Use solid state drives (SSDs) if possible
- seed nodes per data center, configuring / Configure only one or two seed nodes per data center
- weekly repairs, scheduling / Schedule weekly repairs
- compaction, avoiding / Do not force a major compaction
- mutation / Remember that every mutation is a write
- data model / The data model is key
- support contract, considering / Consider a support contract
- references / References
Cassandra, anti-patterns
- about / Cassandra anti-patterns
- frequently-updated data / Frequently updated data
- frequently-deleted data / Frequently deleted data
- queues / Queues or queue-like data
- solutions, with query flexibility / Solutions requiring query flexibility
- solutions, with table scans / Solutions requiring full table scans
- BATCH statements, incorrect use / Incorrect use of BATCH statements
Cassandra Query Language (CQL) / Solutions requiring full table scans
Cassandra Query Language Shell (CQLSH) / CQLSH
causal clustering / Clustering, Causal clustering
clustering key / Data modeling in Cassandra
collections, MongoDB / MongoDB collections
comma-separated values (CSV) / CQLSH
compaction / Overview of the internals
components, HBase
- Zookeeper / Zookeeper
- HDFS / HDFS
Concurrent Mark Sweep (CMS) / Configuration
conditional operator
- applying, on filter parameter / Applying conditional and logical operators on the filter parameter
consistency models, NoSQL databases
- strong consistency / Consistency versus availability
- timeline consistency / Consistency versus availability
- eventual consistency / Consistency versus availability
coprocessors, HBase
- observers / HBase coprocessors
- endpoints / HBase coprocessors
CRUD operations, DynamoDB / Data models and CRUD operations in DynamoDB
Cypher / Cypher

D

database, MongoDB / The MongoDB database
data models, InfluxDB / Data model and storage engine
data models, MongoDB
- about / Data models in MongoDB
- references document data model / The references document data model
- embedded data model / The embedded data model
data structure server / Introduction to Redis
data types, DynamoDB
- scalar type / Data types
- document types / Data types
- set types / Data types
data types, MongoDB
- null / MongoDB data types
- boolean / MongoDB data types
- number / MongoDB data types
- string / MongoDB data types
- date / MongoDB data types
- array / MongoDB data types
- embedded document / MongoDB data types
documents, MongoDB / MongoDB documents
document types, DynamoDB
- list / Data types
- map / Data types
domain-specific language (DSL) / Kapacitor
downloadable DynamoDB
- versus DynamoDB web services / The difference between downloadable DynamoDB and DynamoDB web services
DynamoDB
- versus SQL / The difference between SQL and DynamoDB
- advantages / The difference between SQL and DynamoDB
- disadvantages / The difference between SQL and DynamoDB
- setting up / Setting up DynamoDB
- setting up, locally / Setting up locally
- setting up, AWS used / Setting up using AWS
- data types / DynamoDB data types and terminology, Data types
- tables / Tables, items, and attributes
- attributes / Tables, items, and attributes
- items / Tables, items, and attributes
- primary key / Primary key
- secondary indexes / Secondary indexes
- stream feature / Streams
- queries / Queries
- scan operation / Scan
- CRUD operations / Data models and CRUD operations in DynamoDB
- limitations / Limitations of DynamoDB
- best practices / Best practices
DynamoDB streams
- enabling / Streams
- disabling / Streams
DynamoDB web services
- versus downloadable DynamoDB / The difference between downloadable DynamoDB and DynamoDB web services

E

embedded data model / The embedded data model
Enterprise Management Associates (EMA) / Network management

F

features, Neo4j
- clustering / Clustering
- Neo4j Browser / Neo4j Browser
- cache sharding / Cache sharding
- help for beginners / Help for beginners
fields
- filters, applying on / Applying filters on fields
file system (FS) cache / How does Neo4j work?
filter parameter
- conditional operator, applying on / Applying conditional and logical operators on the filter parameter
- logical operators, applying on / Applying conditional and logical operators on the filter parameter
filters
- applying, on fields / Applying filters on fields
First In First Out (FIFO) / Queues

G

Garbage-First Garbage Collector (G1GC) / Node configuration
Gossiper / Introduction to Cassandra
graph database management systems (GDBMS) / Analytics

H

Hadoop Distributed File System (HDFS) / HDFS
hardware calculator feature, Neo4j, Inc.
- reference / Disk
hardware selection, Neo4j
- random access memory (RAM) / Random access memory
- CPU / CPU
- disk / Disk
- operating system / Operating system
- network/firewall / Network/firewall
HBase
- table / Architecture, Logical and physical data models
- namespace / Architecture, Logical and physical data models
- region / Architecture
- RegionServer / Architecture
- components / Components in the HBase stack
- Zookeeper, using / Zookeeper
- system trade-offs / System trade-offs
- logical data model / Logical and physical data models
- physical data model / Logical and physical data models
- high availability / HBase high availability
- replicated reads / Replicated reads
- in multiple regions / HBase in multiple regions
- coprocessors / HBase coprocessors
- versus SQL / SQL over HBase
HBase architecture / Architecture
HBase Client API / Interacting with HBase – the HBase Client API
HBase clusters
- interacting with / Interacting with secure HBase clusters
HBase compactions / HBase compactions
HBase master / HBase master
HBase read path / The HBase read path
HBase RegionServers / HBase RegionServers
HBase shell / Interacting with HBase – the HBase shell
HBase write path
- about / The HBase write path
- design motivation / HBase writes – design motivation
high-availability (HA) / How does Neo4j work?
Hive / Introduction to InfluxDB

I

Industrial Internet of Things (IIoT) / Introduction to InfluxDB
InfluxDB
- about / Introduction to InfluxDB
- key concepts / Key concepts and terms of InfluxDB
- data model / Data model and storage engine
- storage engine / Data model and storage engine, Storage engine
- installing / Installing InfluxDB
- installation link / Installing InfluxDB
- configuring / Configuring InfluxDB
- production deployment considerations / Production deployment considerations
- query language / Query language
- query pagination / Query pagination
- query performance optimizations / Query performance optimizations
- interaction, via REST API / Interaction via Rest API
- with Java client / InfluxDB with Java client
- with Python client / InfluxDB with a Python client
- with Go client / InfluxDB with Go client
- clustering and HA / Clustering and HA
- Retention Policy (RP) / Retention policy
- monitoring / Monitoring
InfluxDB API client / InfluxDB API client
InfluxDB ecosystem
- about / InfluxDB ecosystem
- Telegraf / Telegraf
- Kapacitor / Kapacitor
InfluxDB operations
- about / InfluxDB operations
- backup / Backups
- restore / Restore
Integrated Development Environment (IDE) / Java
Internet of Things (IoT) / Introduction to InfluxDB
items, DynamoDB / Tables, items, and attributes

J

Java
- Neo4j, using with / Java
Java Management Extensions (JMX) / Built on the JVM
Java virtual machine (JVM) / Random access memory
Jedis
- reference / Java

K

Kapacitor / Kapacitor
key concepts, InfluxDB
- measurement / Key concepts and terms of InfluxDB
- field set / Key concepts and terms of InfluxDB
- field key / Key concepts and terms of InfluxDB
- field value / Key concepts and terms of InfluxDB
- tags / Key concepts and terms of InfluxDB
- continuous query / Key concepts and terms of InfluxDB
- line protocol / Key concepts and terms of InfluxDB
- point / Key concepts and terms of InfluxDB
- Retention Policy (RP) / Key concepts and terms of InfluxDB
- series / Key concepts and terms of InfluxDB
- timestamps / Key concepts and terms of InfluxDB
- Time Structured Merge (TSM) tree / Key concepts and terms of InfluxDB
- Write Ahead Log (WAL) / Key concepts and terms of InfluxDB

L

labels / What is Neo4j?
Last In First Out (LIFO) / Queues
LazyWebCypher loader
- reference / Cypher
least-frequently-used (LFU) policy / How does Neo4j work?
legacy indexes / Indexing everything
Linux
- Cassandra, executing / Run Cassandra on Linux
log-structured merge (LSM) database / In-place updates versus appends
Log-Structured Merge-Tree (LSM Tree) / Storage engine
logical operators
- applying, on filter parameter / Applying conditional and logical operators on the filter parameter

M

MongoDB
- download link / Installing of MongoDB
- installing / Installing of MongoDB
- data types / MongoDB data types
- database / The MongoDB database
- collections / MongoDB collections
- documents / MongoDB documents
- versus SQL / MongoDB documents
- advantages, over RDBMS / MongoDB documents
- uses / MongoDB documents
- applications / MongoDB documents
- limitations / MongoDB documents
- data models / Data models in MongoDB
- replication / Replication in MongoDB
- large data, storing / Storing large data in MongoDB
MongoDB CRUD operations
- create operation / The create operation
- read operation / The read operation
- update operation / The update operation
- delete operation / The delete operation
MongoDB indexing
- about / Introduction to MongoDB indexing
- default _id index / The default _id index
- single field / The default _id index
- compound index / The default _id index
- multikey index / The default _id index
- text indexes / The default _id index
- hashed index / The default _id index
- unique indexes / The default _id index
- partial indexes / The default _id index
- sparse index / The default _id index
- TTL index / The default _id index
- limitations / The default _id index

N

namespace, HBase / Architecture, Logical and physical data models
Neo4j
- about / What is Neo4j?
- working / How does Neo4j work?
- features / Features of Neo4j
- use cases / Evaluating your use case
- anti-patterns / Neo4j anti-patterns
- relational modeling techniques, applying / Applying relational modeling techniques in Neo4j
- using, for first time / Using Neo4j for the first time on something mission-critical
- entities, storing within entities / Storing entities and relationships within entities
- relationships, storing within entities / Storing entities and relationships within entities
- improper usage, of relationship types / Improper use of relationship types
- binary large object data, storing / Storing binary large object data
- indexes types / Indexing everything
- hardware selection / Neo4j hardware selection, installation, and configuration
- installing / Installation
- JVM, installing / Installing JVM
- configuration / Configuration
- high-availability clustering / High-availability clustering
- causal clustering / Causal clustering
- using / Using Neo4j
- using, with Python / Python
- using, with Java / Java
- backup, taking / Taking a backup with Neo4j
- restore, performing / Backup/restore with Neo4j Enterprise
- tips, for success / Tips for success
- references / Tips for success, References
Neo4j Browser
- about / Neo4j Browser
- running / Neo4j Browser
Neo4j Community
- backup, performing / Backup/restore with Neo4j Community
- restore, performing / Backup/restore with Neo4j Community
- versus Neo4j Enterprise / Backup/restore with Neo4j Community
Neo4j Enterprise
- backup, performing / Backup/restore with Neo4j Enterprise
network attached storage (NAS) / Disk
network partition tolerance / What problems does Cassandra solve?
node
- about / What is Neo4j?, Introduction to Cassandra
- configuring / Node configuration
node/relationship cache / How does Neo4j work?
nodetool / Using Cassandra
NoSQL databases
- consistency models / Consistency versus availability
- hash, versus range partition / Hash versus range partition
- update, versus append / In-place updates versus appends
- storage models, comparing / Row versus column versus column-family storage models
- strongly, versus loosely enforced schemas / Strongly versus loosely enforced schemas

O

online analytical processing (OLAP) / Analytics
online transaction processing (OLTP) / Analytics

P

partition key / Data modeling in Cassandra
primary key, DynamoDB
- about / Primary key
- partition key / Primary key
- composite primary key / Primary key
production deployment considerations, InfluxDB
- high availability / Production deployment considerations
- backups / Production deployment considerations
- security / Production deployment considerations
proof-of-concept (POC) / Tips for success
property graph model / What is Neo4j?
Python
- Neo4j, using with / Python
- Cassandra, working with / Python

Q

queries, DynamoDB / Queries
query language, InfluxDB / Query language
query pagination, InfluxDB / Query pagination
query performance optimizations, InfluxDB / Query performance optimizations

R

random access memory (RAM) / Random access memory
Redis, anti-patterns
- about / Redis anti-patterns
- dataset / Dataset cannot fit into RAM
- relational data, modeling / Modeling relational data
- improper connection management / Improper connection management
- security / Security
- KEYS command, using / Using the KEYS command
- network time, reducing / Unnecessary trips over the network
redundant array of independent disks (RAID) / Disk
references document data model / The references document data model
region, HBase / Architecture
RegionServer, HBase / Architecture
relational database management systems (RDBMS) / Analytics
relational databases
- ACID properties / ACID guarantees
relational modeling techniques
- applying, in Neo4j / Applying relational modeling techniques in Neo4j
REmote DIctionary Server (Redis)
- about / Introduction to Redis
- key features / What are the key features of Redis?
- performance / Performance
- tunable data durability / Tunable data durability
- publish/subscribe / Publish/Subscribe
- data types / Useful data types
- data, expiring over time / Expiring data over time
- counters / Counters, Counters
- server-side Lua scripting / Server-side Lua scripting
- use cases / Appropriate use cases for Redis
- data / Data fits into RAM
- data durability / Data durability is not a concern
- data, scaling / Data at scale
- data model / Simple data model
- use case / Features of Redis matching part of your use case
- used, for data modeling / Data modeling and application design with Redis
- used, for application design / Data modeling and application design with Redis
- data structures, advantages / Taking advantage of Redis' data structures
- queues / Queues
- sets / Sets
- notifications / Notifications
- caching / Caching
- setting up / Redis setup, installation, and configuration
- installation / Redis setup, installation, and configuration, Installation
- configuration / Redis setup, installation, and configuration
- virtualization, versus on-the-metal / Virtualization versus on-the-metal
- RAM / RAM
- CPU / CPU
- disk / Disk
- operating system / Operating system
- network/firewall / Network/firewall
- configuration files / Configuration files
- using / Using Redis
- redis-cli / redis-cli
- Lua / Lua
- Python / Python
- Java / Java
- used, for obtaining backup / Taking a backup with Redis
- restoring, from backup / Restoring from a backup
repair / Overview of the internals
replicated reads, HBase / Replicated reads
replication / Replication
replication, MongoDB
- about / Replication in MongoDB
- automatic failover / Automatic failover in replication
- read operations / Read operations
replication factor (RF) / CQLSH
Retention Policy (RP) / Retention policy
ring / Introduction to Cassandra

S

scalar types, DynamoDB
- string / Data types
- number / Data types
- Boolean / Data types
- binary / Data types
- null / Data types
scaling up / What problems does Cassandra solve?
scan operation, DynamoDB / Scan
schema indexes / Indexing everything
secondary indexes, DynamoDB
- global secondary index / Secondary indexes
- local secondary index / Secondary indexes
set types, DynamoDB / Data types
sharding
- about / Sharding
- components / Sharded clusters
- advantages / Advantages of sharding
Sorted String Table (SSTable) / Storage engine
SQL
- versus MongoDB / MongoDB documents
- versus HBase / SQL over HBase
- versus DynamoDB / The difference between SQL and DynamoDB
storage engine, InfluxDB / Data model and storage engine, Storage engine
stream feature, DynamoDB / Streams

T

tables, DynamoDB / Tables, items, and attributes
tables, HBase / Architecture, Logical and physical data models
Telegraf
- about / Telegraf
- data management / Telegraf data management
time-series data
- use case / Introduction to InfluxDB
Time Structured Merge Tree (TSM Tree) / Storage engine
time to live / Expiring data over time
tombstones / Overview of the internals
top-level / Introduction to Cassandra
transparent huge pages (THP) / Not disabling THP

U

Ubuntu, in VirtualBox
- reference / Installing InfluxDB
universally unique identifiers (UUIDs) / Data model and storage engine
Usage Data Collector (UDC) / Configuration
use cases, Neo4j
- social networks / Social networks
- matchmaking / Matchmaking
- network management / Network management
- analytics / Analytics
- recommendation engines / Recommendation engines