Big Data & Tools with NoSQL
  • Big Data & Tools
  • ReadMe
  • Big Data Overview
    • Overview
    • Job Opportunities
    • What is Data?
    • How does it help?
    • Types of Data
    • The Big 4 V's
      • Variety
      • Volume
      • Velocity
      • Veracity
      • Other V's
    • Trending Technologies
    • Big Data Concerns
    • Big Data Challenges
    • Data Integration
    • Scaling
      • CAP Theorem
      • Optimistic concurrency
      • Eventual consistency
      • Concurrent vs. Parallel Programming
    • Big Data Tools
    • No SQL Databases
    • What does Big Data learning means?
  • Linux & Tools
    • Overview
    • Linux Commands - 01
    • Linux Commands - 02
    • AWK
    • CSVKIT
    • CSVSQL
    • CSVGREP
  • Data Format
    • Storage Formats
    • CSV/TSV/Parquet
    • Parquet Example
    • JSON
    • HTTP & REST API
      • Terms to Know
        • Statefulness
        • Statelessness
        • Monolithic Architecture
        • Microservices
        • Idempotency
    • REST API
    • Python
      • Setup
      • Decorator
      • Unit Testing
      • Flask Demo
      • Flask Demo - 01
      • Flask Demo - 02
      • Flask Demo - 03
      • Flask Demo - 04
      • Flask Demo - 06
    • API Testing
    • Flask Demo Testing
    • API Performance
    • API in Big Data World
  • NoSQL
    • Types of NoSQL Databases
    • Redis
      • Overview
      • Terms to know
      • Redis - (RDBMS) MySql
      • Redis Cache Demo
      • Use Cases
      • Data Structures
        • Strings
        • List
        • Set
        • Hash
        • Geospatial Index
        • Pub/Sub
        • Redis - Python
      • Redis JSON
      • Redis Search
      • Persistence
      • Databases
      • Timeseries
    • Neo4J
      • Introduction
      • Neo4J Terms
      • Software
      • Neo4J Components
      • Hello World
      • Examples
        • MySQL: Neo4J
        • Sample Transactions
        • Sample
        • Create Nodes
        • Update Nodes
        • Relation
        • Putting it all together
        • Commonly used Functions
        • Data Profiling
        • Queries
        • Python Scripts
      • More reading
    • MongoDB
      • Sample JSON
      • Introduction
      • Software
      • MongoDB Best Practices
      • MongoDB Commands
      • Insert Document
      • Querying MongoDB
      • Update & Remove
      • Import
      • Logical Operators
      • Data Types
      • Operators
      • Aggregation Pipeline
      • Further Reading
      • Fun Task
        • Sample
    • InfluxDB
      • Data Format
      • Scripts
  • Python
    • Python Classes
    • Serialization-Deserialization
  • Tools
    • JQ
    • DUCK DB
    • CICD Intro
    • CICD Tools
      • CI YAML
      • CD Yaml
    • Containers
      • VMs or Containers
      • What container does
      • Podman
      • Podman Examples
  • Cloud Everywhere
    • Overview
    • Types of Cloud Services
    • Challenges of Cloud Computing
    • High Availability
    • Azure Cloud
      • Services
      • Storages
      • Demo
    • Terraform
  • Data Engineering
    • Batch vs Streaming
    • Kafka
      • Introduction
      • Kafka Use Cases
      • Kafka Software
      • Python Scripts
      • Different types of Streaming
    • Quality & Governance
    • Medallion Architecture
    • Data Engineering Model
    • Data Mesh
  • Industry Trends
    • Roadmap - Data Engineer
    • Good Reads
      • IP & SUBNET
Powered by GitBook
On this page
  • How to measure data?
  • When does data become Big?
  1. Big Data Overview

What is Data?

Data is nothing but facts and figures. When the data is processed, it's called Information.

Everything is data.

Whatever we say, where ever we go, whatever we do, everything is captured and stored somewhere in some form.

How to measure data?

byte        - 1 letter
1 Kilobyte  - 1024 B
1 Megabyte  - 1024 KB
1 Gigabyte  - 1024 MB
1 Terabyte  - 1024 GB    
(1,099,511,627,776 Bytes)
1 Petabyte  - 1024 TB
1 Exabyte   - 1024 PB
1 Zettabyte - 1024 EB
1 Yottabyte - 1024 ZB

Examples of Traditional Data

  • Banking

  • Student

  • Employee

  • Customer

  • Sales

When does data become Big?

When does data expand?

  • Banking - Instead of one, if we start to consolidate all branches across the world. Ex: Citi Bank.

  • Student - Not just one college. Student info across the country. Ex: US News

  • Media to Social Media.

When does data get granular?

  • Monitor PC CPU/Memory usage every minute/second.

  • Cell phone data.

  • IoT (Internet of Things) sensor data.

  • Social Media posts.

  • Traffic Information.

Everything is now captured and used for analysis.

Why collect lots of data?

  • Hardware is getting cheaper and better.

  • Technology is advanced, so it enables us to consume more.

  • Companies want to pursue something new to grow their business.

PreviousJob OpportunitiesNextHow does it help?