MongoDB
MongoDB is a NoSQL database, classified as a document-oriented database. Its development began in 2007
The name MongoDB derives from the word humongous
Introduction
MongoDB is written in C++ language
MongoDB pairs each key with a data structure called a document
MongoDB stores data in flexible, JSON-like documents. The document model maps to the objects in your application code, making data easy to work with
A MongoDB database is a set of collections, and each collection holds documents
Collections in Mongo are equivalent to tables in relational databases. They can hold multiple JSON documents
Documents are equivalent to records or rows of data in SQL. While a SQL row can reference data in other tables, Mongo documents usually embed that data in a single document
Fields or attributes are similar to columns in a SQL table
In Mongoose, models are higher-order constructors that take a schema and create document instances, which are equivalent to records in a relational database
There is no schema enforcement in MongoDB by default (you may implement it). SQL defines a schema via the table definition. A Mongoose schema is a document data structure (or shape of the document) that is enforced via the application layer
MongoDB stores documents in a binary-encoded format called BSON (Binary JSON). BSON extends the JSON data model with additional data types, such as dates and binary data
MongoDB is a distributed database at its core, so it has high availability, horizontal scaling, and geographic distribution built in and easy to use
Typically each application uses its own database, and each database contains multiple collections of documents
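Data that is accessed together is usually stored together in one document
There are no tables to define up front; a collection is created implicitly when the first document is inserted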
Great for applications with a lot of reads and writes
It is very fast
MongoDB Flow:
WiredTiger is the default storage engine used by MongoDB; other supported storage engines can be configured instead
MongoDB can be used by any user, but usually it is used by the Three A's of MongoDB:
- A - Application
- A - Analytics
- A - Admin
MongoDB is recommended:
- where a very low Total Cost of Ownership (TCO) is required
- when replication across multiple data centers globally is needed
- where rapid deployment and faster scaling are required
- when data must be easy to load both initially and over time
- when massive concurrency is demanded by users
- when no downtime can be tolerated
- when the database needs to grow rapidly as per user needs
- when high uncertainty in sizing exists
- where a seamless and consistent experience is expected
NoSQL - Overview
NoSQL databases store and retrieve data using models other than the tabular relations of relational databases
- NoSQL does not use the relational model
- Most NoSQL databases are designed to run on clusters
- NoSQL is mostly used in Big data and Real-time web applications
- Commonly used Data structures are Document, Graph, Key-Value, and Wide Column
Document Databases
Document-oriented databases are a type of NoSQL database used for managing semi-structured data
Document databases pair each key with a complex data structure, commonly a block of XML or JSON, called a document
The document is the most basic unit of data here; it is similar to a row in SQL
Document databases contain collections (roughly a table in SQL terms), and each collection contains many documents
{
  "BookID": "987-1-556-5546",
  "Title": "The Hobbit, or There and Back Again",
  "Author": "J. R. R. Tolkien"
}
Use Cases
- Event logging
- Blogs and website content management
- Web analytics or Real-Time analytics
- E-Commerce
Features
- Flexible data modelling
- Fast querying
- Faster write performance
Some of the commonly used Document databases are MongoDB, Couchbase, OrientDB, and RavenDB
Features of MongoDB
Indexing
Following are the indexes supported in MongoDB:
Default _id: Each collection has a default unique index on the _id field
Single Field: Indexes one field to support queries and sorts on it. The index can be in ascending or descending order
Compound Index: Used for multiple fields.
Multikey Index: These are used to index array data.
Geospatial Index: The supported types are 2d (two-dimensional) and 2dsphere (geolocation on a sphere).
Text Search Indexes
Hashed Indexes
Clustered Indexes
Load Balancing
Sharding is a technique used to distribute data across multiple servers
MongoDB supports horizontal scaling through sharding
Mongo uses sharding to split a large collection among multiple servers
This lets MongoDB support deployments with very large data sets and high-throughput operations
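The idea of splitting a collection among servers can be sketched in plain JavaScript. This is only an in-memory illustration, not MongoDB's actual routing: the hash function, the shard count, and the userId shard key are all assumptions made for the example.

```javascript
// In-memory sketch of hash-based sharding (not MongoDB's real hashing or routing).
const SHARD_COUNT = 3; // hypothetical number of shards

// A small FNV-1a-style string hash, used only as a stand-in hash function.
function hashKey(key) {
  let hash = 0x811c9dc5;
  for (const ch of String(key)) {
    hash ^= ch.codePointAt(0);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash;
}

// Pick the shard that owns a document, based on its shard key value.
function shardFor(shardKeyValue, shardCount = SHARD_COUNT) {
  return hashKey(shardKeyValue) % shardCount;
}

// Distribute documents across shards by userId (the hypothetical shard key).
function distribute(docs, shardCount = SHARD_COUNT) {
  const shards = Array.from({ length: shardCount }, () => []);
  for (const doc of docs) {
    shards[shardFor(doc.userId, shardCount)].push(doc);
  }
  return shards;
}

const docs = [
  { userId: "u1", item: "book" },
  { userId: "u2", item: "pen" },
  { userId: "u1", item: "lamp" },
];
const shards = distribute(docs);
```

Documents with the same shard key value always land on the same shard, which is what lets a query on that key be sent to a single server.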
Capped Collections
MongoDB supports capped collections. A capped collection is a fixed-size collection that preserves insertion order; once the specified size is reached, the oldest documents are overwritten by new ones
It acts as a circular queue. You cap the collection by size in bytes and, optionally, by a maximum number of documents
Syntax:
db.createCollection(<CollectionName>, {capped: <true/false>, size: Number, max:number })
Not to be confused with capped collections, GridFS stores files in two regular collections:
fs.files is used to store the file metadata
fs.chunks is used to store the file chunks
Example:
Suppose we have an eCommerce application that logs user data, and the log should never hold more than four documents. In such a scenario, we use a capped collection.
db.createCollection("LogUsers", { capped: true, size: 100, max: 4 })
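The circular-queue behaviour can be sketched without a database. The class below is a hypothetical in-memory stand-in, not MongoDB's implementation, and it only models the max document count (the size limit in bytes is ignored).

```javascript
// In-memory sketch of capped-collection behaviour (illustrative only).
class CappedCollection {
  constructor(max) {
    this.max = max; // maximum number of documents, like the `max` option
    this.docs = []; // documents kept in insertion order
  }

  insertOne(doc) {
    this.docs.push(doc);
    // Once the limit is exceeded, drop the oldest document.
    if (this.docs.length > this.max) {
      this.docs.shift();
    }
  }

  find() {
    return [...this.docs]; // insertion order is preserved
  }
}

const logUsers = new CappedCollection(4);
for (let i = 1; i <= 6; i++) {
  logUsers.insertOne({ user: `user${i}` });
}
// Only the four most recent entries remain: user3, user4, user5, user6.
const remaining = logUsers.find().map((d) => d.user);
```

After six inserts into a collection capped at four, the two oldest entries are gone and the rest keep their original order.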
Adhoc Querying
MongoDB supports ad hoc queries using:
Single-value field matches
Range queries
Conditional operators
Regular expression searches
Replication
MongoDB uses replica sets for high availability
Replica sets contain two or more copies of the data. Each member of a replica set acts as either the primary or a secondary. By default, read and write operations are performed on the primary. Each secondary maintains a copy of the primary's data
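As a rough mental model, the primary records every write in an operation log (the oplog) and the secondaries replay it. The class below is a deliberately simplified, hypothetical sketch: real replication is asynchronous and also involves elections and failover, none of which is modelled here.

```javascript
// Simplified in-memory model of a replica set (illustrative only).
class ReplicaSetSketch {
  constructor(secondaryCount) {
    this.primary = new Map(); // data held by the primary
    this.oplog = [];          // ordered log of writes applied on the primary
    this.secondaries = Array.from({ length: secondaryCount }, () => ({
      data: new Map(),
      applied: 0, // how many oplog entries this secondary has replayed
    }));
  }

  // Writes go to the primary and are recorded in the oplog.
  write(id, doc) {
    this.primary.set(id, doc);
    this.oplog.push({ id, doc });
  }

  // By default, reads are served by the primary.
  read(id) {
    return this.primary.get(id);
  }

  // Each secondary replays the oplog entries it has not applied yet.
  replicate() {
    for (const secondary of this.secondaries) {
      for (; secondary.applied < this.oplog.length; secondary.applied++) {
        const { id, doc } = this.oplog[secondary.applied];
        secondary.data.set(id, doc);
      }
    }
  }
}

const rs = new ReplicaSetSketch(2);
rs.write(1, { name: "Prabhu" });
const beforeSync = rs.secondaries[0].data.has(1); // false: not replicated yet
rs.replicate();
const afterSync = rs.secondaries[0].data.has(1);  // true: secondary caught up
```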
Storage Mechanisms
MongoDB supports different storage engines:
MMAPv1: Default storage engine before MongoDB 3.2; it was deprecated in 4.0 and removed in 4.2
WiredTiger: Default storage engine starting from MongoDB 3.2
In-Memory Storage Engine: Available in the Enterprise edition. It retains documents in memory
MongoDB uses the GridFS specification for storing and retrieving large files
GridFS is a convention for storing files within MongoDB collections. It splits a large file into smaller chunks and stores each chunk in a separate document, with a default chunk size of 255 kB
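The chunking step can be sketched with a Node.js Buffer. This is not the driver's GridFS code, just an illustration of the splitting; the file id and the 600 KiB file are made up, while files_id, n, and data mirror the fields of a chunk document.

```javascript
// Sketch of GridFS-style chunking in memory (illustrative, not the driver's code).
const CHUNK_SIZE = 255 * 1024; // 255 KiB, the default GridFS chunk size

// Split a Buffer into chunk documents shaped like those in fs.chunks.
function splitIntoChunks(fileId, data, chunkSize = CHUNK_SIZE) {
  const chunks = [];
  for (let offset = 0, n = 0; offset < data.length; offset += chunkSize, n++) {
    chunks.push({
      files_id: fileId, // which file the chunk belongs to
      n,                // sequence number of the chunk within the file
      data: data.subarray(offset, offset + chunkSize),
    });
  }
  return chunks;
}

const file = Buffer.alloc(600 * 1024, 1); // hypothetical 600 KiB file
const chunks = splitIntoChunks("file-1", file);
// 600 KiB splits into three chunks: 255 KiB, 255 KiB, and 90 KiB.
```

Reading the file back is the reverse: fetch the chunks for a files_id, order them by n, and concatenate their data.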
Aggregation
In MongoDB, aggregation processes records and returns computed results.
Aggregation can be categorized as:
Pipeline Aggregation: Documents pass through a multi-stage processing pipeline, and each stage transforms them until a final aggregated result is produced.
Map-Reduce: It splits a larger problem into smaller chunks that can be processed on different machines. It comprises two phases: map and reduce.
Single Purpose: These operations aggregate documents from a single collection.
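The pipeline idea, where each stage receives the output of the previous one, can be sketched over a plain array. The stage helpers below are hypothetical and only loosely mirror the $match, $group, and $sort stages; the orders data is invented for the example.

```javascript
// In-memory sketch of a staged pipeline (illustrative, not MongoDB's engine).
// Each helper returns a stage: a function from documents to documents.
const match = (predicate) => (docs) => docs.filter(predicate);

const group = (keyFn, field) => (docs) => {
  const totals = new Map();
  for (const doc of docs) {
    const key = keyFn(doc);
    totals.set(key, (totals.get(key) || 0) + doc[field]);
  }
  return [...totals].map(([_id, total]) => ({ _id, total }));
};

const sort = (compare) => (docs) => [...docs].sort(compare);

// Feed the output of each stage into the next one.
const runPipeline = (docs, stages) =>
  stages.reduce((current, stage) => stage(current), docs);

const orders = [
  { customer: "A", status: "paid", amount: 10 },
  { customer: "B", status: "paid", amount: 25 },
  { customer: "A", status: "paid", amount: 5 },
  { customer: "B", status: "open", amount: 99 },
];

const result = runPipeline(orders, [
  match((o) => o.status === "paid"),   // keep paid orders only
  group((o) => o.customer, "amount"),  // sum amount per customer
  sort((a, b) => b.total - a.total),   // largest total first
]);
// result: [{ _id: "B", total: 25 }, { _id: "A", total: 15 }]
```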
MongoDB Architecture
The architecture of MongoDB comprises:
Application Driver
Databases
Collections
Documents
Indexes
Security Features
Storage Engine
Application Drivers
Drivers are client libraries that provide interfaces and methods for applications to interact with a MongoDB database
Drivers handle the translation of documents between BSON and the native data structures of the programming language
Drivers are available for C++, Java, .NET, Ruby, PHP, Node.js, Python, Perl, and Scala, among others
Database
A database is a physical container for collections. A MongoDB server can have one or more databases.
The shell uses the test database by default. If you do not switch to another database, collections are stored in test.
The command to check databases in MongoDB Server:
show dbs
Document
A document is a set of key-value pairs that supports a dynamic schema. A document is similar to a row in an RDBMS. In relational databases, schemas must be defined before we add any data, whereas MongoDB allows the insertion of data without a predefined schema.
Dynamic schema implies that the documents stored in the database can have different fields, with different types for each field.
CRUD
C - Create operations
R - Read operations
U - Update operations
D - Delete operations
Create Operations
Create operations insert new documents, creating the collection first if it does not exist.
We can add a single entry or multiple entries in one go.
Each entry gets an auto-generated _id. We can supply our own _id by including a value for it (not recommended).
REMEMBER
_id must always be unique for each entry.
db.collectionName.insertOne({.}, options)
- Creates a collection if not present and inserts one document.
db.collectionName.insertMany([{.},..,{.}], options)
- Creates a collection if not present and inserts multiple documents.
db.collectionName.insert({.}, options)
- Legacy method that inserts one or more documents; prefer insertOne and insertMany.
Read Operations
Read operations search for documents.
That is done using:
db.collectionName.find(filter, options)
- Get all matches.
db.collectionName.findOne(filter, options)
- Get the first match.
Some read operations are mostly used through the respective driver, like:
db.collectionName.find().forEach((a) => {printjson(a)})
- Loop through and perform an operation on each entry.
find().forEach() runs on the client's system. This means find() fetches all the data and forEach() is then used to pick out only the necessary information. This approach eats up a lot of network bandwidth, as more data is received than necessary.
So, we pass options (a projection) to find() to get only the information that is necessary. A projection is a set of 0 or 1 flags against particular fields, where 1 means include the field and 0 means include everything except this field. For example, db.collectionName.find({}, {email: 1, _id: 0}).
When we fetch information from the database, a cursor is returned: a pointer to the result set that delivers the documents in batches.
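How the 0 and 1 projection flags select fields can be sketched as a small function. This is a simplified stand-in, not the server's logic: it handles only top-level fields, and the project helper and sample user are hypothetical.

```javascript
// Simplified sketch of projection flags (top-level fields only, illustrative).
function project(doc, projection) {
  const flags = { ...projection };
  const includeId = flags._id !== 0; // _id is returned unless explicitly excluded
  delete flags._id;

  // If any field is flagged 1, only flagged fields are returned (inclusion mode).
  // Otherwise every field is returned except those flagged 0 (exclusion mode).
  const isInclusion = Object.keys(flags).some((field) => flags[field] === 1);

  const out = {};
  for (const [key, value] of Object.entries(doc)) {
    if (key === "_id") {
      if (includeId) out._id = value;
    } else if (isInclusion ? flags[key] === 1 : flags[key] !== 0) {
      out[key] = value;
    }
  }
  return out;
}

const user = { _id: 7, name: "Ana", email: "ana@example.com", password: "x" };

const onlyEmail = project(user, { email: 1, _id: 0 });
// onlyEmail: { email: "ana@example.com" }

const noPassword = project(user, { password: 0 });
// noPassword: { _id: 7, name: "Ana", email: "ana@example.com" }
```

Because the projection is applied on the server, only the selected fields cross the network, which is the bandwidth saving described above.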
Update Operations
Update operations modify or replace documents.
That is done using:
db.collectionName.updateOne(filter, data, options)
- Update the first document that matches the filter.
db.collectionName.updateMany(filter, data, options)
- Update all documents that match the filter.
db.collectionName.replaceOne(filter, data, options)
- Replace the first document that matches the filter.
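The difference between updating fields and replacing a document is easy to miss, so here is a minimal in-memory sketch. The helper functions are hypothetical and only imitate the effect of $set, $unset, and a full replacement on a single document.

```javascript
// In-memory sketch of update versus replace semantics (illustrative only).

// Like $set: merge the given fields into the existing document.
function applySet(doc, fields) {
  return { ...doc, ...fields };
}

// Like $unset: remove the named fields and keep everything else.
function applyUnset(doc, fields) {
  const out = { ...doc };
  for (const key of Object.keys(fields)) {
    delete out[key];
  }
  return out;
}

// Like a replacement: keep only _id, everything else comes from the new body.
function replaceDoc(doc, replacement) {
  return { _id: doc._id, ...replacement };
}

const customer = { _id: 1, first_name: "Prabhu", last_name: "Hiremath" };

const withAge = applySet(customer, { age: 25 });
// withAge keeps first_name and last_name and gains age.

const withoutAge = applyUnset(withAge, { age: 1 });
// withoutAge is back to the original fields.

const replaced = replaceDoc(customer, { first_name: "Prabhu" });
// replaced has lost last_name, because it was not in the replacement body.
```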
Delete Operations
Delete operations remove documents.
That is done using:
db.collectionName.deleteOne(filter, options)
- Delete the first document that matches the filter.
db.collectionName.deleteMany(filter, options)
- Delete all documents that match the filter.
Delete operations need to be used very carefully; use a precise filter and options to make them failsafe.
MongoDB Schema
Schema
Modelling Database
- What are the predefined data sets needed?
- Where is the data being used?
- How much filtering does each query need?
- How many queries are being fired?
- How often will you change the data?
Database Design for Mere Mortals: A Hands-On Guide to Relational Database Design (3rd Edition)
Relations
A relation is the way in which one document is associated with another document.
- One to One Relationship
- One to Many Relationship
- Many to Many Relationship
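These relationships are usually modelled either by embedding documents or by storing references. The plain-object sketch below shows both styles; all collection names, fields, and helper functions are made up for illustration.

```javascript
// Sketch of modelling relations with plain objects (names are hypothetical).

// One to many by embedding: the comments live inside the post document.
const postEmbedded = {
  _id: 1,
  title: "Post One",
  comments: [
    { user: "Mary Williams", body: "Comment One" },
    { user: "Harry White", body: "Comment Two" },
  ],
};

// One to many by reference: comments are separate documents pointing at the post.
const posts = [{ _id: 1, title: "Post One" }];
const comments = [
  { _id: 101, postId: 1, body: "Comment One" },
  { _id: 102, postId: 1, body: "Comment Two" },
  { _id: 103, postId: 2, body: "Comment on another post" },
];

// Resolving references is done in application code, much like a manual join.
function commentsFor(postId) {
  return comments.filter((comment) => comment.postId === postId);
}

// Many to many: one side stores an array of ids of the other side.
const students = [
  { _id: "s1", name: "Ana", courseIds: ["c1", "c2"] },
  { _id: "s2", name: "Ben", courseIds: ["c2"] },
];
const courses = [
  { _id: "c1", title: "Math" },
  { _id: "c2", title: "Art" },
];

function coursesFor(student) {
  return courses.filter((course) => student.courseIds.includes(course._id));
}

function studentsIn(courseId) {
  return students.filter((student) => student.courseIds.includes(courseId));
}
```

Embedding suits data that is always read together with its parent, while references avoid duplicating data that is shared by many documents.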
Practices
First create a schema for the collection
The schema is stored in a separate file for each collection. Use a singular noun for file names
Import mongoose module:
const mongoose = require("mongoose");
We are creating a schema, so let's define a constant for that:
const Schema = mongoose.Schema;
Now define your schema
const StudentSchema = new Schema({ name: String });
Now give a name for this model:
const Student = mongoose.model("student", StudentSchema);
Export this model to use it in other places to create Student documents
Now let's make a connection to MongoDB
- Import mongoose, then connect to the database:
mongoose.connect("mongodb://localhost/<databaseName>", {useNewUrlParser: true});
Example
Create a new database named mycustomers:
use mycustomers
Add user to the database:
db.createUser({
  user: 'accountUser',
  pwd: passwordPrompt(), // Or "password"
  roles: ['readWrite', 'dbAdmin'],
});
Add collections (similar to tables):
db.createCollection('customers');
Add document to the collection:
db.customers.insert({ first_name: 'Prabhu', last_name: 'Hiremath' });
Now update the document:
db.customers.update(
  { first_name: 'Prabhu' },
  { first_name: 'Prabhu', last_name: 'Hiremath', age: 25 }
);
// or update a specific property
db.customers.update({ first_name: 'Prabhu' }, { $set: { age: 25 } });
Remove a field:
db.customers.update({ first_name: 'Prabhu' }, { $unset: { age: 1 } });
In the shell, keys need not be enclosed in quotes unless they contain special characters such as dots or spaces.
MongoDB Cheat Sheet
Show All Databases
show dbs
Show Current Database
db;
Create Or Switch Database
use acme
Drop
db.dropDatabase();
Create Collection
db.createCollection('posts');
Show Collections
show collections
Insert Row
db.posts.insert({
title: 'Post One',
body: 'Body of post one',
category: 'News',
tags: ['news', 'events'],
user: {
name: 'John Doe',
status: 'author',
},
date: Date(),
});
Insert Multiple Rows
db.posts.insertMany([
{
title: 'Post Two',
body: 'Body of post two',
category: 'Technology',
date: Date(),
},
{
title: 'Post Three',
body: 'Body of post three',
category: 'News',
date: Date(),
},
{
title: 'Post Four',
body: 'Body of post four',
category: 'Entertainment',
date: Date(),
},
]);
Get All Rows
db.posts.find();
Get All Rows Formatted
db.posts.find().pretty();
Find Rows
db.posts.find({ category: 'News' });
Sort Rows
// asc
db.posts.find().sort({ title: 1 }).pretty()
// desc
db.posts.find().sort({ title: -1 }).pretty()
Count Rows
db.posts.find().count();
db.posts.find({ category: 'News' }).count();
Limit Rows
db.posts.find().limit(2).pretty();
Chaining
db.posts.find().limit(2).sort({ title: 1 }).pretty();
Foreach
db.posts.find().forEach(function (doc) {
print('Blog Post: ' + doc.title);
});
Find One Row
db.posts.findOne({ category: 'News' });
Find Specific Fields
db.posts.find(
{ title: 'Post One' },
{
title: 1,
author: 1,
}
);
Update Row
db.posts.update(
{ title: 'Post Two' },
{
title: 'Post Two',
body: 'New body for post 2',
date: Date(),
},
{
upsert: true,
}
);
Update Specific Field
db.posts.update(
{ title: 'Post Two' },
{
$set: {
body: 'Body for post 2',
category: 'Technology',
},
}
);
Increment Field ($inc)
db.posts.update(
{ title: 'Post Two' },
{
$inc: {
likes: 5,
},
}
);
Rename Field
db.posts.update(
{ title: 'Post Two' },
{
$rename: {
likes: 'views',
},
}
);
Delete Row
db.posts.remove({ title: 'Post Four' });
Sub-Documents
db.posts.update(
{ title: 'Post One' },
{
$set: {
comments: [
{
body: 'Comment One',
user: 'Mary Williams',
date: Date(),
},
{
body: 'Comment Two',
user: 'Harry White',
date: Date(),
},
],
},
}
);
Find By Element in Array ($elemMatch)
db.posts.find({
comments: {
$elemMatch: {
user: 'Mary Williams',
},
},
});
Add Index
db.posts.createIndex({ title: 'text' });
Text Search
db.posts.find({
$text: {
$search: '"Post O"',
},
});
Greater & Less Than
db.posts.find({ views: { $gt: 2 } });
db.posts.find({ views: { $gte: 7 } });
db.posts.find({ views: { $lt: 7 } });
db.posts.find({ views: { $lte: 7 } });