Introduction

MongoDB is a document-oriented database. It does not enforce a fixed schema and stores documents in a JSON-like format (BSON, a binary encoding of JSON). It's intuitive for those familiar with JavaScript and easy to work with for storing complex, nested data. The current version at the time of writing is 7.0.
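
As a rough illustration of the document model, the sketch below builds one nested record as a Python dict and round-trips it through JSON using only the standard library. The field names (`name`, `address`, `orders`) and values are hypothetical, and no MongoDB server or driver is involved.

```python
import json

# A hypothetical customer document: nested objects and arrays live in one record.
customer = {
    "_id": 1,
    "name": "Asha",
    "address": {"city": "Pune", "zip": "411001"},
    "orders": [
        {"sku": "A100", "qty": 2},
        {"sku": "B200", "qty": 1},
    ],
}

# Serialize to JSON text and read it back; the nesting survives the round trip.
as_json = json.dumps(customer, indent=2)
restored = json.loads(as_json)

# Nested fields are reached by walking the structure, with no join required.
city = restored["address"]["city"]
total_qty = sum(item["qty"] for item in restored["orders"])
print(city, total_qty)
```

Keeping the address and orders inside the customer record is the kind of nesting that would usually be split across several tables in a relational design.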

As a technology, MySQL and MongoDB are very different, but I will use MySQL references wherever possible to make it easier to understand.

In CAP theorem terms, MongoDB historically leans towards Consistency and Partition Tolerance (CP).

| MySQL | MongoDB |
| --- | --- |
| Database | Database |
| Table | Collection |
| Row | Document |
| Column | Field |
| Index | Index |
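
To make the terminology mapping concrete, here is a minimal sketch that reads one row from a relational table and reshapes it into a document-like dict. It assumes `sqlite3` from the standard library as a stand-in for MySQL and a plain Python list as a stand-in for a collection; the `users` table and its columns are hypothetical.

```python
import sqlite3

# Relational side: sqlite3 (standard library) stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("INSERT INTO users (id, name, city) VALUES (1, 'Asha', 'Pune')")

cursor = conn.execute("SELECT id, name, city FROM users WHERE id = 1")
columns = [col[0] for col in cursor.description]  # column names of the table
row = cursor.fetchone()                           # one row of the table

# Document side: columns become fields, and the row becomes one document.
document = dict(zip(columns, row))
users_collection = [document]  # a plain list used as a stand-in for a collection

print(document)
conn.close()
```

In MongoDB itself the primary key field is named `_id`, so a real migration would typically map the `id` column to `_id`.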

Pros

  1. Scalability: MongoDB is designed to scale horizontally by distributing data across multiple servers, making it suitable for handling large amounts of data and high traffic loads.

  2. Flexibility: MongoDB's document data model allows for flexible and dynamic schema design, making it easier to handle evolving data structures.

  3. High Performance: MongoDB's embedded data model and indexing capabilities can provide high read and write performance for specific workloads.

  4. Rich Query Language: MongoDB's query language supports various operations, including ad-hoc queries, text searches, and geospatial queries.

  5. Ease of Use: MongoDB's syntax and query language are relatively straightforward, making it easier for developers to learn and use than many other NoSQL databases.

  6. Replication and High Availability: MongoDB supports built-in replication and automatic failover, ensuring high availability and data redundancy.

  7. Sharding: MongoDB's sharding feature allows for horizontal scaling by distributing data across multiple shards (partitions), enabling support for larger datasets.
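
The idea behind sharding can be sketched in a few lines: hash a shard-key value and use the result to pick a partition. This is a deliberately simplified illustration, not how MongoDB places data internally (MongoDB assigns ranges of shard-key values to shards rather than taking a modulo). `NUM_SHARDS`, `shard_for`, and `distribute` are hypothetical names.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 3  # hypothetical cluster size


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a shard-key value to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def distribute(documents, key_field):
    """Group documents by the shard their key hashes to."""
    shards = defaultdict(list)
    for doc in documents:
        shards[shard_for(str(doc[key_field]))].append(doc)
    return dict(shards)


docs = [{"user_id": f"user-{i}", "score": i} for i in range(12)]
placement = distribute(docs, "user_id")
for shard_id in sorted(placement):
    print(shard_id, [d["user_id"] for d in placement[shard_id]])
```

Because the hash is stable, the same key always lands on the same shard, which is what lets a router send a query for one key to a single partition.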

Cons

  1. Lack of Strict Schema: While flexibility is a strength, the lack of a strict schema can lead to data inconsistencies and make it more challenging to maintain data integrity.

  2. Limited Transactions: Before version 4.0 (released in 2018), only single-document operations were atomic. Version 4.0 introduced multi-document ACID transactions for replica sets, and version 4.2 extended them to sharded clusters.

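  3. Limited Join Support: MongoDB's document data model is not built around joins. The aggregation pipeline's $lookup stage provides a left outer join, but it is less flexible than SQL joins, which can make it more challenging to handle complex relational data structures.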

  4. Memory Usage: MongoDB's data model can lead to higher memory usage than traditional relational databases, especially for workloads with high write throughput or large documents.

  5. Potential Data Duplication: Denormalization, often used in MongoDB to improve read performance, can lead to data duplication and potential inconsistencies.

  6. Lack of Mature Tools: While the MongoDB ecosystem is growing, some developers may find the tooling and ecosystem less mature than long-established relational databases.

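  7. Single Primary Per Shard: In sharded environments, each shard is a replica set whose single primary node accepts all writes for that shard, which can limit write scalability for specific workloads (for example, when the shard key concentrates writes on one shard).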
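
The first drawback, schema drift, is easy to reproduce. The sketch below is a minimal application-side check, assuming a hypothetical expected shape (`REQUIRED_FIELDS`) and a hypothetical helper `find_schema_problems`; it runs on plain Python dicts and does not talk to MongoDB.

```python
REQUIRED_FIELDS = {"name": str, "age": int}  # hypothetical expected shape


def find_schema_problems(documents):
    """Return (index, message) pairs for documents that break the expected shape."""
    problems = []
    for i, doc in enumerate(documents):
        for field, expected_type in REQUIRED_FIELDS.items():
            if field not in doc:
                problems.append((i, f"missing field '{field}'"))
            elif not isinstance(doc[field], expected_type):
                actual = type(doc[field]).__name__
                problems.append(
                    (i, f"field '{field}' is {actual}, expected {expected_type.__name__}")
                )
    return problems


people = [
    {"name": "Asha", "age": 31},
    {"name": "Ben", "age": "42"},      # age stored as a string
    {"full_name": "Chen", "age": 27},  # field renamed in a later app version
]

problems = find_schema_problems(people)
for index, message in problems:
    print(index, message)
```

A database with a strict schema would reject the second and third records at insert time. MongoDB can enforce similar rules on the server through collection validators based on `$jsonSchema`, but they are opt-in.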

