Once the installation is done, we can start working with MongoDB by creating a database and a collection. Before going deeper, let us cover some basics: what MongoDB is, and how and where to use it.
Compared with a traditional RDBMS or a Hive database, the terminology differs. An RDBMS has tables, whereas MongoDB has collections. What an RDBMS calls a row (or record), MongoDB calls a document. In MongoDB we work with collections and documents. Let's get started.
What is MongoDB:
MongoDB is an open-source NoSQL database management system that stores JSON-like data in the form of documents. MongoDB does not use SQL; queries are expressed in its own document-based query language. Because MongoDB is a NoSQL database, fields can vary from document to document. An RDBMS has tables and rows, whereas MongoDB has collections and documents.
| RDBMS | MongoDB |
| --- | --- |
| Database | Database |
| Table | Collection |
| Row | Document |
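
To make the mapping concrete, here is a minimal sketch in plain Python. It is not MongoDB code and uses no driver: a collection is simply modelled as a list of dictionaries, and each dictionary stands in for one document. The `employees` collection, its field names, and the `field_names` helper are all made up for illustration.

```python
import json

# Hypothetical "employees" collection modelled as a plain Python list.
# Each dict stands in for one document; names and fields are illustrative only.
employees = [
    {"_id": 1, "name": "Asha", "dept": "Sales"},
    {"_id": 2, "name": "Ravi", "dept": "IT", "skills": ["Hive", "MongoDB"]},
    {"_id": 3, "name": "Meera", "address": {"city": "Pune"}},
]


def field_names(collection):
    """Return the sorted set of all field names used across the documents."""
    names = set()
    for document in collection:
        names.update(document.keys())
    return sorted(names)


all_fields = field_names(employees)
print(all_fields)
# One document serialised as JSON text, the shape MongoDB documents resemble.
print(json.dumps(employees[1]))
```

Unlike rows in a table, the three documents do not share one fixed set of columns: the third has no `dept` field at all, and the second carries a list that a flat row could not hold directly.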
MongoDB Architecture
MongoDB was developed by combining the critical capabilities of traditional RDBMS systems with the innovations of NoSQL technologies.
| RDBMS features in MongoDB | NoSQL features in MongoDB |
| --- | --- |
| Expressive query language | Flexibility |
| Secondary indexes | Performance |
| Strong consistency | Scalability |
Features of MongoDB
- Varying fields in each document:
  - Each collection consists of documents, and MongoDB allows the number of fields to vary from document to document. The size and contents of one document can differ from those of another.
- Schema-less collection:
  - In MongoDB there is no need to define the schema of a collection at creation time. Fields can be created on the fly.
- Queries:
  - Supports ad hoc queries.
- Indexing:
  - Supports primary and secondary indexes. Any field in a document can be indexed.
- Replication:
  - One of the important features of MongoDB. MongoDB provides high availability through replica sets.
  - A replica set keeps two or more copies of the data. Each member acts as either primary or secondary at any given time. By default, read and write operations are performed on the primary.
  - If the primary fails, a secondary takes over as the new primary. Reads can also be served by secondaries, but data read from a secondary is only eventually consistent by default.
- Load Balancing:
  - Another important feature of MongoDB is automatic load balancing.
  - It scales horizontally through sharding, which distributes the data across multiple physical partitions called shards.
  - Data in a collection is distributed based on the shard key chosen by the user. The data is split into ranges, and the ranges are distributed among multiple shards.
- File Storage:
  - MongoDB uses GridFS to store files. Files are divided into chunks (parts), and each chunk is stored as a document.
- Document Oriented Storage:
  - MongoDB uses the BSON format, which is a binary, JSON-like format.
- Capped Collection:
  - Fixed-size collections in MongoDB are called capped collections. A capped collection preserves insertion order and behaves like a circular queue: once it is full, new documents overwrite the oldest ones.
- Transactions:
  - Version 4.0 and above support multi-document ACID transactions.
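
The range-based distribution described under Load Balancing can be pictured with a small sketch. This is plain Python for illustration only, not how MongoDB implements sharding internally: the split points, the shard names, the `order_id` shard key, and the helper functions `shard_for` and `distribute` are all assumptions made up for the example.

```python
from bisect import bisect_right

# Hypothetical split points on the shard key; they define three ranges:
#   shard-0: key < 100, shard-1: 100 <= key < 200, shard-2: key >= 200
SPLIT_POINTS = [100, 200]
SHARDS = ["shard-0", "shard-1", "shard-2"]


def shard_for(key):
    """Pick the shard whose range contains the given shard-key value."""
    return SHARDS[bisect_right(SPLIT_POINTS, key)]


def distribute(documents, shard_key):
    """Group documents by the shard that owns their shard-key value."""
    placement = {name: [] for name in SHARDS}
    for doc in documents:
        placement[shard_for(doc[shard_key])].append(doc)
    return placement


# Illustrative documents; "order_id" plays the role of the shard key.
orders = [{"order_id": 42}, {"order_id": 100}, {"order_id": 150}, {"order_id": 999}]
placement = distribute(orders, "order_id")
for name, docs in placement.items():
    print(name, [d["order_id"] for d in docs])
```

Each document lands on exactly one shard, decided only by its shard-key value, which is why the choice of shard key determines how evenly the data spreads.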
In the next post, I will continue with MongoDB and start with the basic commands. Kindly share your feedback and questions, and subscribe to Hadoop Tech and our Facebook page (Hadoop Tech) for more articles.
