New DW Concepts

  1. Cloud Data Warehousing: As cloud computing has grown, cloud data warehousing has become a popular approach. Data is stored and processed in a cloud-hosted warehouse rather than an on-premises one. Storage and compute can be scaled on demand, which can improve flexibility and reduce infrastructure costs.

Examples include Databricks, Snowflake, and Azure Synapse.

  2. Data Virtualization: Data virtualization lets consumers access and integrate data from multiple sources without physically moving or replicating it. This can reduce redundant copies and improve data consistency.

  3. Self-Service BI: Self-service BI allows business users to access and analyze data without relying on IT or data analysts. It has become popular as user-friendly data visualization tools have made it possible for users to build their own reports and dashboards.

  4. Big Data Analytics: Big data analytics applies advanced analytics techniques to large and complex datasets. This typically requires distributed processing tools, such as Hadoop and Spark, to handle the data volume.

  5. Data Governance: Data governance involves establishing policies, standards, and procedures for managing data assets. This helps ensure that data is accurate, consistent, and secure, and that its use aligns with organizational goals and objectives.

  6. Delta Sharing: Delta Sharing is an open protocol for secure data sharing. With it, organizations can share data with partners, customers, and other stakeholders without having to move or copy the data. This can help reduce data duplication and improve data governance while allowing for more collaborative and agile data sharing.
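As a rough illustration of the data virtualization idea from the list above, the sketch below uses Python's built-in `sqlite3` module. Two attached in-memory databases stand in for two independent source systems, and a temporary view plays the role of the virtual layer. The database aliases (`crm`, `billing`), table names, and sample rows are all hypothetical; a real virtualization product would federate queries across separate servers rather than attached SQLite databases.

```python
import sqlite3

# Two attached in-memory databases stand in for two independent source systems.
# The aliases "crm" and "billing" are hypothetical names for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("ATTACH DATABASE ':memory:' AS billing")

conn.execute(
    "CREATE TABLE crm.customers (customer_id INTEGER PRIMARY KEY, name TEXT)"
)
conn.execute(
    "CREATE TABLE billing.invoices ("
    "invoice_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)
conn.executemany(
    "INSERT INTO billing.invoices VALUES (?, ?, ?)",
    [(10, 1, 120.0), (11, 1, 30.0), (12, 2, 75.5)],
)

# The "virtual" layer: a view that joins both sources at query time.
# No rows are copied into a third table. SQLite only lets TEMP views
# reference tables in other attached databases, hence CREATE TEMP VIEW.
conn.execute(
    """
    CREATE TEMP VIEW customer_revenue AS
    SELECT c.customer_id, c.name, SUM(i.amount) AS total_amount
    FROM crm.customers AS c
    JOIN billing.invoices AS i ON i.customer_id = c.customer_id
    GROUP BY c.customer_id, c.name
    """
)


def revenue_by_customer(connection):
    """Query the virtual view; the join runs against the sources each time."""
    return connection.execute(
        "SELECT name, total_amount FROM customer_revenue ORDER BY customer_id"
    ).fetchall()


before = revenue_by_customer(conn)

# A change in one source is visible through the view without any reload step.
conn.execute("INSERT INTO billing.invoices VALUES (13, 2, 24.5)")
after = revenue_by_customer(conn)

print(before)
print(after)
```

Because the view holds no data of its own, the second query picks up the new invoice immediately. The trade-off is that every query pays the cost of reaching the underlying sources.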

Overall, these concepts focus on improving the speed, flexibility, and accessibility of data while ensuring that it is used in ways that support organizational objectives.

  1. DataOps: DataOps is a methodology that emphasizes collaboration, automation, and monitoring to improve the speed and quality of data analytics. It combines DevOps and agile methods to create a more efficient and streamlined data pipeline.

  2. Data Mesh: Data Mesh is an architectural approach that emphasizes decentralized, domain-driven ownership of data. Instead of routing everything through one central team, each business domain manages and publishes its own data, which helps break down data silos and creates a more flexible and scalable architecture that aligns with business needs.

  3. Augmented Analytics: Augmented analytics is a technique that uses machine learning and artificial intelligence to automate data preparation, insight generation, and insight sharing. It aims to improve the speed and accuracy of data analytics while reducing the reliance on data scientists and analysts.

  4. Real-time Data Warehousing: Real-time data warehousing uses streaming technologies such as Apache Kafka to capture and process data as it arrives. This enables organizations to analyze and act on data in real time rather than waiting for batch processing cycles.

  5. Data Privacy and Ethics: Data privacy and ethics are becoming increasingly important in data warehousing and analytics. Organizations are focusing on collecting, storing, and using data ethically and responsibly, and on complying with data privacy regulations such as GDPR and CCPA.
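One common technique related to the data privacy item above is pseudonymizing direct identifiers before they are loaded into the warehouse. The sketch below is a minimal example using only the Python standard library: it replaces chosen columns with a keyed HMAC-SHA256 digest so that the same input always maps to the same token, which keeps joins and counts working. The column names, sample rows, and `SECRET_KEY` value are hypothetical, and this is an illustration of the idea rather than a complete compliance measure.

```python
import hashlib
import hmac

# Hypothetical key for illustration only; a real key would come from a
# secrets manager and never be hard-coded.
SECRET_KEY = b"example-key-do-not-use"


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a hex-encoded keyed SHA-256 digest of a normalized value."""
    normalized = value.strip().lower()
    return hmac.new(
        secret_key, normalized.encode("utf-8"), hashlib.sha256
    ).hexdigest()


def pseudonymize_rows(rows, pii_columns, secret_key):
    """Return new row dicts with the given columns replaced by tokens."""
    return [
        {
            column: pseudonymize(value, secret_key)
            if column in pii_columns
            else value
            for column, value in row.items()
        }
        for row in rows
    ]


# Hypothetical source rows; "email" is treated as the identifying column.
rows = [
    {"email": "Ada@Example.com", "country": "UK", "amount": 120.0},
    {"email": "ada@example.com ", "country": "UK", "amount": 30.0},
    {"email": "grace@example.com", "country": "US", "amount": 75.5},
]

masked = pseudonymize_rows(rows, {"email"}, SECRET_KEY)

for row in masked:
    print(row)
```

The first two rows differ only in letter case and trailing whitespace, so after normalization they receive the same token and can still be grouped as one customer. Using a keyed digest rather than a plain hash means the tokens cannot be reproduced by someone who does not hold the key.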

These are just a few of the data warehousing concepts emerging in response to the changing data landscape. As data volumes grow and technologies evolve, we can expect continued innovation in data warehousing and analytics.


Last updated 2 years ago