YouTube API: Data Warehouse & Analytics Solution
This repository demonstrates a complete data pipeline that extracts data from the YouTube Data API, models it using the Medallion Architecture, and delivers business-ready insights via Grafana dashboards.
Project Summary
This project implements a modern analytics pipeline with:
- Medallion Architecture: Structured into Bronze, Silver, and Gold layers for scalable data processing.
- ETL Workflows: Automated extraction, transformation, and loading using Apache Airflow.
- Data Modeling: Dimensional modeling in PostgreSQL for optimized querying.
- Dashboards: Real-time reporting using Grafana, powered by SQL.
Tech Stack
- PostgreSQL: Central data warehouse
- Apache Airflow: Workflow orchestration
- Grafana: Real-time data visualization
- Linux VM: Compute environment for pipeline execution
- Python: API ingestion & transformation logic
Project Objectives
Build a production-ready analytics solution to analyze YouTube channel and video performance:
- Source structured data from the YouTube Data API
- Clean, validate, and model the data for business intelligence
- Persist historical metrics (views, likes, etc.) for trend analysis
- Deliver actionable insights via dashboards and SQL queries
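The objective of persisting historical metrics can be sketched as a daily snapshot table. This is only an illustration under stated assumptions: `sqlite3` stands in for PostgreSQL so the example runs anywhere, and the table name `video_stats_history`, its columns, and the helper `record_snapshot` are hypothetical rather than taken from this repository.

```python
import sqlite3

# Assumption: sqlite3 replaces PostgreSQL for this sketch; names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE video_stats_history (
        video_id      TEXT    NOT NULL,
        snapshot_date TEXT    NOT NULL,
        view_count    INTEGER NOT NULL,
        like_count    INTEGER NOT NULL,
        PRIMARY KEY (video_id, snapshot_date)
    )
    """
)


def record_snapshot(conn, video_id, snapshot_date, view_count, like_count):
    """Store one snapshot per video and day; a re-run for the same day overwrites it."""
    conn.execute(
        """
        INSERT INTO video_stats_history
            (video_id, snapshot_date, view_count, like_count)
        VALUES (?, ?, ?, ?)
        ON CONFLICT (video_id, snapshot_date) DO UPDATE SET
            view_count = excluded.view_count,
            like_count = excluded.like_count
        """,
        (video_id, snapshot_date, view_count, like_count),
    )


# Two days of history for one video, with the second day loaded twice.
record_snapshot(conn, "vid_1", "2024-01-01", 100, 10)
record_snapshot(conn, "vid_1", "2024-01-02", 150, 12)
record_snapshot(conn, "vid_1", "2024-01-02", 160, 13)

history = conn.execute(
    "SELECT snapshot_date, view_count FROM video_stats_history ORDER BY snapshot_date"
).fetchall()
```

Keying on the video and the snapshot date keeps one row per day, so a repeated pipeline run does not duplicate history and trends can be read straight from the table.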
Data Architecture (Medallion Model)
This project follows a Bronze → Silver → Gold pipeline:

Bronze Layer
Raw ingestion from the YouTube API (JSON format)
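A minimal sketch of what raw ingestion could look like: the API response is stored unchanged, wrapped only with ingestion metadata. The function `write_bronze`, the folder layout, and the envelope fields are assumptions for illustration; the repository may store Bronze data differently, for example in a PostgreSQL table.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def write_bronze(payload: dict, endpoint: str, root: Path) -> Path:
    """Write the untouched API payload to <root>/<endpoint>/<date>/<time>.json."""
    ingested_at = datetime.now(timezone.utc)
    target_dir = root / endpoint / ingested_at.strftime("%Y-%m-%d")
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{ingested_at.strftime('%H%M%S%f')}.json"
    # The payload is not modified; only metadata about the load is added around it.
    envelope = {
        "ingested_at": ingested_at.isoformat(),
        "endpoint": endpoint,
        "payload": payload,
    }
    target.write_text(json.dumps(envelope), encoding="utf-8")
    return target


# Usage with a temporary directory and a stand-in payload.
with tempfile.TemporaryDirectory() as tmp:
    path = write_bronze({"items": []}, "videos", Path(tmp))
    stored = json.loads(path.read_text(encoding="utf-8"))
```

Keeping the payload untouched means later layers can be rebuilt from Bronze if the cleaning rules change.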
Silver Layer
Cleaned, validated, and structured data (see data flow and model below)
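As an illustration of the cleaning step, the sketch below flattens one item from a YouTube Data API `videos.list` response into a typed row. The API returns statistics as strings and can omit a counter such as `likeCount`, so the sketch casts values and keeps missing ones as `None`. The function name `to_silver_row` and the output column names are hypothetical.

```python
from datetime import datetime
from typing import Optional


def _to_int(value: Optional[str]) -> Optional[int]:
    """Cast a string counter to int, keeping a missing counter as None."""
    return int(value) if value is not None else None


def to_silver_row(item: dict) -> dict:
    """Flatten one videos.list item into a validated, typed row."""
    snippet = item["snippet"]
    stats = item.get("statistics", {})
    return {
        "video_id": item["id"],
        "channel_id": snippet["channelId"],
        "title": snippet["title"].strip(),
        # Timestamps arrive as ISO 8601 with a trailing "Z" for UTC.
        "published_at": datetime.fromisoformat(
            snippet["publishedAt"].replace("Z", "+00:00")
        ),
        "view_count": _to_int(stats.get("viewCount")),
        "like_count": _to_int(stats.get("likeCount")),
        "comment_count": _to_int(stats.get("commentCount")),
    }


# Made-up sample item in the shape of a videos.list response.
sample_item = {
    "id": "abc123",
    "snippet": {
        "channelId": "chan_a",
        "title": "  Intro to the pipeline ",
        "publishedAt": "2023-01-15T10:00:00Z",
    },
    "statistics": {"viewCount": "1200", "commentCount": "7"},
}
row = to_silver_row(sample_item)
```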
Data Flow

Data Model

Gold Layer
Aggregated data used to generate KPIs and dashboards in Grafana
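The kind of aggregation a Gold table might hold can be sketched in plain Python: per channel and day, total views, total likes, and likes per 1,000 views. The input row shape, the function `daily_channel_kpis`, and the KPI choice are assumptions for illustration; in the repository this logic would more likely live in SQL.

```python
from collections import defaultdict


def daily_channel_kpis(rows):
    """Aggregate per-video snapshot rows into one KPI row per channel and day."""
    totals = defaultdict(lambda: {"views": 0, "likes": 0})
    for row in rows:
        key = (row["channel_id"], row["snapshot_date"])
        totals[key]["views"] += row["view_count"]
        totals[key]["likes"] += row["like_count"]

    kpis = []
    for (channel_id, snapshot_date), t in sorted(totals.items()):
        # Guard against division by zero for channels without views.
        rate = round(1000 * t["likes"] / t["views"], 2) if t["views"] else 0.0
        kpis.append(
            {
                "channel_id": channel_id,
                "snapshot_date": snapshot_date,
                "total_views": t["views"],
                "total_likes": t["likes"],
                "likes_per_1k_views": rate,
            }
        )
    return kpis


# Made-up Silver-style rows for two channels on one day.
silver_rows = [
    {"channel_id": "chan_a", "snapshot_date": "2024-01-01", "view_count": 1000, "like_count": 50},
    {"channel_id": "chan_a", "snapshot_date": "2024-01-01", "view_count": 3000, "like_count": 150},
    {"channel_id": "chan_b", "snapshot_date": "2024-01-01", "view_count": 0, "like_count": 0},
]
gold_rows = daily_channel_kpis(silver_rows)
```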
Visualization Sample
BI Use Cases
Dashboards and SQL queries answer key questions such as:
- What are the top-performing videos per channel?
- How is each channel performing over time?
- What are the daily trends for views and engagement?
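The first question, top-performing videos per channel, can be answered with a window function. The sketch below is an assumption-laden example: it runs against an in-memory `sqlite3` database instead of PostgreSQL (the query itself uses standard `ROW_NUMBER()` syntax), and the table `video_stats` with its columns and sample data is hypothetical. With a PostgreSQL driver such as psycopg2, the `?` placeholder would become `%s`.

```python
import sqlite3

# Assumption: sqlite3 stands in for the warehouse; table and data are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE video_stats (channel_id TEXT, video_id TEXT, view_count INTEGER)"
)
conn.executemany(
    "INSERT INTO video_stats VALUES (?, ?, ?)",
    [
        ("chan_a", "a1", 500),
        ("chan_a", "a2", 900),
        ("chan_a", "a3", 100),
        ("chan_b", "b1", 300),
    ],
)

# Rank videos inside each channel by views, then keep the top N ranks.
TOP_VIDEOS_SQL = """
SELECT channel_id, video_id, view_count
FROM (
    SELECT channel_id,
           video_id,
           view_count,
           ROW_NUMBER() OVER (
               PARTITION BY channel_id
               ORDER BY view_count DESC
           ) AS rn
    FROM video_stats
) AS ranked
WHERE rn <= ?
ORDER BY channel_id, rn
"""

top_videos = conn.execute(TOP_VIDEOS_SQL, (2,)).fetchall()
```

A query of this shape can also back a Grafana table panel, with the top-N limit exposed as a dashboard variable.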
Repository Structure
├── README.md
├── channel_lists.py
├── channel_overview.py
├── channel_videos.py
├── __pycache__/ # Compiled Python files
├── project_files/
│   ├── Architecture/ # Draw.io and PNG files for architecture
│   ├── ddl_update_scripts/ # SQL DDLs and procedures
│   ├── dim_channels.sql
│   ├── dim_videos.sql
│   ├── fct_subscribers_views_video_count.sql
│   └── fct_video_statistics.sql
└── requirements.txt # Python dependencies
Access the Code
Browse the full codebase here