A comprehensive SQL-based exploratory data analysis (EDA) project built on top of a production-ready Sales Data Warehouse. This project demonstrates advanced analytical techniques including time-series analysis, customer segmentation, product performance tracking, and business intelligence reporting using T-SQL.
- Project Overview
- Project Context
- Analysis Categories
- Key Features
- Key Insights & Metrics
- Prerequisites
- Next Steps
- Acknowledgments
- License
- Author
This project extends the Sales Data Warehouse (Medallion Architecture: Bronze → Silver → Gold) with a comprehensive suite of analytical queries and business intelligence views. The analysis focuses on extracting actionable insights from sales, customer, and product data to support data-driven decision-making.
Business Objectives:
- Understand customer behavior and lifecycle patterns
- Analyze product performance and profitability
- Track sales trends and temporal patterns
- Enable strategic segmentation for targeted marketing
- Create reusable analytical views for BI reporting
This analysis layer builds upon the Sales Data Warehouse project, which implements:
- Medallion Architecture (Bronze → Silver → Gold layers)
- Star Schema design with fact and dimension tables
- ETL pipelines for data integration from CRM and ERP systems
- Data quality framework ensuring analytical accuracy
For more details on the underlying data warehouse, see the main project documentation.
- Metadata discovery using `INFORMATION_SCHEMA`
- Table and column structure inspection
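As a minimal sketch of the metadata discovery step, the standard `INFORMATION_SCHEMA` views can be queried as follows. The `gold` schema and the `dim_customers` table come from the prerequisites in this README; the choice of columns to display is only an illustration.

```sql
-- List every table and view in the current database, grouped by schema.
SELECT TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE
FROM INFORMATION_SCHEMA.TABLES
ORDER BY TABLE_SCHEMA, TABLE_NAME;

-- Inspect the column structure of one Gold-layer table.
SELECT COLUMN_NAME, DATA_TYPE, IS_NULLABLE, CHARACTER_MAXIMUM_LENGTH
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'gold'
  AND TABLE_NAME = 'dim_customers'
ORDER BY ORDINAL_POSITION;
```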
- Dimension cardinality profiling
- Distinct value enumeration for categorical attributes
- Customer demographics profiling (country, gender, marital status)
- Product hierarchy exploration (categories, subcategories, product lines)
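The dimension profiling described above might look like the sketch below. The column names (`country`, `gender`, `marital_status`, `category`, `subcategory`, `product_line`) are assumptions inferred from the attributes named in the bullets, not confirmed names from the warehouse.

```sql
-- Cardinality of the categorical customer attributes (column names are assumed).
SELECT
    COUNT(DISTINCT country)        AS distinct_countries,
    COUNT(DISTINCT gender)         AS distinct_genders,
    COUNT(DISTINCT marital_status) AS distinct_marital_statuses
FROM gold.dim_customers;

-- Enumerate the distinct values of one attribute.
SELECT DISTINCT country
FROM gold.dim_customers
ORDER BY country;

-- Walk the product hierarchy from category down to product line.
SELECT DISTINCT category, subcategory, product_line
FROM gold.dim_products
ORDER BY category, subcategory, product_line;
```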
- Date range boundaries for sales transactions
- Customer age distribution analysis
- Time span calculations for data coverage
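A possible shape for the date range checks is sketched here, assuming `gold.fact_sales` has an `order_date` column and `gold.dim_customers` has a `birthdate` column (both names are assumptions).

```sql
-- Boundaries and coverage of the sales data.
SELECT
    MIN(order_date)                                   AS first_order_date,
    MAX(order_date)                                   AS last_order_date,
    DATEDIFF(MONTH, MIN(order_date), MAX(order_date)) AS order_range_months
FROM gold.fact_sales;

-- Oldest and youngest customers.
-- DATEDIFF(YEAR, ...) counts calendar-year boundaries, so ages are approximate.
SELECT
    MIN(birthdate)                            AS oldest_birthdate,
    DATEDIFF(YEAR, MIN(birthdate), GETDATE()) AS oldest_age,
    MAX(birthdate)                            AS youngest_birthdate,
    DATEDIFF(YEAR, MAX(birthdate), GETDATE()) AS youngest_age
FROM gold.dim_customers;
```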
- Core KPI calculations (total sales, quantity, average price)
- Customer and product counts
- Consolidated business performance report using `UNION ALL`
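The consolidated report can be sketched as one row per KPI stacked with `UNION ALL`. The measure columns (`sales_amount`, `quantity`, `price`) and key columns (`product_key`, `customer_key`) are assumed names used for illustration.

```sql
-- One row per KPI; UNION ALL keeps every row without a de-duplication pass.
SELECT 'Total Sales' AS measure_name, SUM(sales_amount) AS measure_value
FROM gold.fact_sales
UNION ALL
SELECT 'Total Quantity', SUM(quantity)
FROM gold.fact_sales
UNION ALL
SELECT 'Average Price', AVG(price)
FROM gold.fact_sales
UNION ALL
SELECT 'Total Products', COUNT(DISTINCT product_key)
FROM gold.dim_products
UNION ALL
SELECT 'Total Customers', COUNT(DISTINCT customer_key)
FROM gold.dim_customers;
```

The column names of the result come from the first `SELECT`; the later branches only need compatible data types.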
- Aggregated metrics by dimensions (country, gender, category)
- Revenue distribution analysis
- Customer and product concentration patterns
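A magnitude query of this kind groups one measure by one dimension. The sketch below assumes the fact table joins to the dimensions on `product_key`, and that `country` and `category` are the dimension column names.

```sql
-- Customer concentration by country.
SELECT country, COUNT(customer_key) AS total_customers
FROM gold.dim_customers
GROUP BY country
ORDER BY total_customers DESC;

-- Revenue distribution by product category.
SELECT p.category, SUM(f.sales_amount) AS total_revenue
FROM gold.fact_sales AS f
LEFT JOIN gold.dim_products AS p
    ON p.product_key = f.product_key
GROUP BY p.category
ORDER BY total_revenue DESC;
```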
- Top/Bottom performers identification
- Window function rankings (`DENSE_RANK()`, `ROW_NUMBER()`)
- Comparative analysis across products, customers, countries, and categories
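A top-N ranking with a window function could be written as in this sketch. The cutoff of 5 and the column names `product_name` and `sales_amount` are illustrative assumptions.

```sql
-- Top 5 products by revenue. DENSE_RANK() keeps ties, so more than five rows
-- can be returned; ROW_NUMBER() would return exactly five.
SELECT product_name, total_revenue, revenue_rank
FROM (
    SELECT
        p.product_name,
        SUM(f.sales_amount) AS total_revenue,
        DENSE_RANK() OVER (ORDER BY SUM(f.sales_amount) DESC) AS revenue_rank
    FROM gold.fact_sales AS f
    LEFT JOIN gold.dim_products AS p
        ON p.product_key = f.product_key
    GROUP BY p.product_name
) AS ranked
WHERE revenue_rank <= 5
ORDER BY revenue_rank;
```

Reversing the sort inside `OVER (...)` to ascending gives the bottom performers instead.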
- Temporal granularity analysis (yearly, monthly)
- Date aggregation techniques (`YEAR()`, `MONTH()`, `DATETRUNC()`, `FORMAT()`)
- Sales trend visualization preparation
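Two of the date aggregation techniques listed above are sketched here, assuming an `order_date` column on `gold.fact_sales`. Note that `DATETRUNC()` is only available from SQL Server 2022 onward, while `YEAR()` and `MONTH()` work on older versions.

```sql
-- Monthly trend using YEAR() and MONTH().
SELECT
    YEAR(order_date)             AS order_year,
    MONTH(order_date)            AS order_month,
    SUM(sales_amount)            AS total_sales,
    COUNT(DISTINCT customer_key) AS total_customers,
    SUM(quantity)                AS total_quantity
FROM gold.fact_sales
WHERE order_date IS NOT NULL
GROUP BY YEAR(order_date), MONTH(order_date)
ORDER BY order_year, order_month;

-- Same grain with DATETRUNC() (SQL Server 2022+): one date value per month,
-- which sorts chronologically and plots directly on a time axis.
SELECT
    DATETRUNC(month, order_date) AS order_month,
    SUM(sales_amount)            AS total_sales
FROM gold.fact_sales
WHERE order_date IS NOT NULL
GROUP BY DATETRUNC(month, order_date)
ORDER BY order_month;
```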
- Running totals using `SUM() OVER()`
- Moving averages with window functions
- Cumulative metrics for trend analysis
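A running total and a moving average can be computed over a monthly aggregate, as in the sketch below. The three-month window is an arbitrary choice for illustration, and the column names are assumptions.

```sql
WITH monthly_sales AS (
    SELECT
        -- First day of each month; DATEFROMPARTS avoids the need for DATETRUNC().
        DATEFROMPARTS(YEAR(order_date), MONTH(order_date), 1) AS order_month,
        SUM(sales_amount) AS total_sales
    FROM gold.fact_sales
    WHERE order_date IS NOT NULL
    GROUP BY DATEFROMPARTS(YEAR(order_date), MONTH(order_date), 1)
)
SELECT
    order_month,
    total_sales,
    -- Running total from the first month up to the current row.
    SUM(total_sales) OVER (ORDER BY order_month
                           ROWS UNBOUNDED PRECEDING) AS running_total_sales,
    -- Average of the current row and the two preceding rows.
    AVG(total_sales * 1.0) OVER (ORDER BY order_month
                                 ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg_3_rows
FROM monthly_sales
ORDER BY order_month;
```

The moving average is row-based, so it only corresponds to three calendar months when no month is missing from the data.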
- Product performance benchmarking against historical averages
- YoY growth/decline tracking using `LAG()`
- Performance classification (Above/Below Average, Increase/Decrease)
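The benchmarking and year-over-year comparison described above might be combined as in this sketch. The labels follow the classifications named in the bullets; the column names and the fallback labels (`'Average'`, `'No Change'`) are assumptions.

```sql
WITH yearly_product_sales AS (
    SELECT
        YEAR(f.order_date)  AS order_year,
        p.product_name,
        SUM(f.sales_amount) AS current_sales
    FROM gold.fact_sales AS f
    LEFT JOIN gold.dim_products AS p
        ON p.product_key = f.product_key
    WHERE f.order_date IS NOT NULL
    GROUP BY YEAR(f.order_date), p.product_name
)
SELECT
    order_year,
    product_name,
    current_sales,
    -- Benchmark: the product's own average across all of its years.
    AVG(current_sales) OVER (PARTITION BY product_name) AS avg_sales,
    CASE
        WHEN current_sales > AVG(current_sales) OVER (PARTITION BY product_name) THEN 'Above Average'
        WHEN current_sales < AVG(current_sales) OVER (PARTITION BY product_name) THEN 'Below Average'
        ELSE 'Average'
    END AS avg_comparison,
    -- Year-over-year: LAG() returns NULL for a product's first year.
    LAG(current_sales) OVER (PARTITION BY product_name ORDER BY order_year) AS previous_year_sales,
    CASE
        WHEN current_sales > LAG(current_sales) OVER (PARTITION BY product_name ORDER BY order_year) THEN 'Increase'
        WHEN current_sales < LAG(current_sales) OVER (PARTITION BY product_name ORDER BY order_year) THEN 'Decrease'
        ELSE 'No Change'  -- also covers the first year, where there is no prior value
    END AS yoy_change
FROM yearly_product_sales
ORDER BY product_name, order_year;
```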
- Category sales contribution percentages
- Market share calculations
- Portfolio composition analysis
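A part-to-whole calculation divides each category's sales by the grand total, which an empty `OVER ()` clause provides without a second query. This is a sketch under the same assumed column names as above.

```sql
WITH category_sales AS (
    SELECT
        p.category,
        SUM(f.sales_amount) AS total_sales
    FROM gold.fact_sales AS f
    LEFT JOIN gold.dim_products AS p
        ON p.product_key = f.product_key
    GROUP BY p.category
)
SELECT
    category,
    total_sales,
    SUM(total_sales) OVER () AS overall_sales,
    -- CAST avoids integer division when sales_amount is stored as an integer.
    ROUND(CAST(total_sales AS FLOAT) / SUM(total_sales) OVER () * 100, 2) AS pct_of_total
FROM category_sales
ORDER BY total_sales DESC;
```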
- Product cost tier segmentation (Budget/Mid-Range/Premium/Luxury)
- Customer lifecycle segmentation (VIP/Regular/New)
- Behavioral clustering using `CASE` logic
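Both segmentations rely on `CASE` expressions, as in the sketch below. The cost tier boundaries follow the ranges stated under the product metrics in this README, with the handling of the exact boundary values being an assumption. The VIP/Regular/New thresholds are not given in this README, so `@vip_min_spend` and `@min_tenure_months` are hypothetical placeholders.

```sql
-- Product cost tiers (how the exact values 100, 500 and 1000 are assigned is assumed).
SELECT cost_tier, COUNT(product_key) AS total_products
FROM (
    SELECT
        product_key,
        CASE
            WHEN cost < 100   THEN 'Budget'
            WHEN cost < 500   THEN 'Mid-Range'
            WHEN cost <= 1000 THEN 'Premium'
            WHEN cost > 1000  THEN 'Luxury'
            ELSE 'Unknown'    -- NULL cost
        END AS cost_tier
    FROM gold.dim_products
) AS tiered
GROUP BY cost_tier
ORDER BY total_products DESC;

-- Customer lifecycle segments; both thresholds are hypothetical placeholders.
DECLARE @vip_min_spend DECIMAL(18, 2) = 5000;
DECLARE @min_tenure_months INT = 12;

WITH customer_spending AS (
    SELECT
        customer_key,
        SUM(sales_amount) AS total_spending,
        DATEDIFF(MONTH, MIN(order_date), MAX(order_date)) AS lifespan_months
    FROM gold.fact_sales
    GROUP BY customer_key
)
SELECT customer_segment, COUNT(customer_key) AS total_customers
FROM (
    SELECT
        customer_key,
        CASE
            WHEN lifespan_months >= @min_tenure_months
                 AND total_spending > @vip_min_spend THEN 'VIP'
            WHEN lifespan_months >= @min_tenure_months THEN 'Regular'
            ELSE 'New'
        END AS customer_segment
    FROM customer_spending
) AS segmented
GROUP BY customer_segment
ORDER BY total_customers DESC;
```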
- Comprehensive customer intelligence view
- RFM metrics (Recency, Frequency, Monetary)
- Demographic and behavioral segmentation
- Customer lifetime value indicators
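A customer view exposing RFM metrics could be structured as in this sketch. It is not the project's actual `gold.report_customers` definition: the view name `gold.report_customers_sketch` is hypothetical so that it cannot overwrite the real view, and the columns `first_name`, `last_name`, `birthdate`, and `order_number` are assumed.

```sql
CREATE OR ALTER VIEW gold.report_customers_sketch AS
WITH customer_orders AS (
    SELECT
        c.customer_key,
        CONCAT(c.first_name, ' ', c.last_name)   AS customer_name,
        DATEDIFF(YEAR, c.birthdate, GETDATE())   AS age,  -- approximate, counts year boundaries
        COUNT(DISTINCT f.order_number)           AS total_orders,
        SUM(f.sales_amount)                      AS total_sales,
        MAX(f.order_date)                        AS last_order_date,
        DATEDIFF(MONTH, MIN(f.order_date), MAX(f.order_date)) AS lifespan_months
    FROM gold.fact_sales AS f
    INNER JOIN gold.dim_customers AS c
        ON c.customer_key = f.customer_key
    WHERE f.order_date IS NOT NULL
    GROUP BY c.customer_key, c.first_name, c.last_name, c.birthdate
)
SELECT
    customer_key,
    customer_name,
    age,
    DATEDIFF(MONTH, last_order_date, GETDATE()) AS recency_months,  -- Recency
    total_orders                                AS frequency,       -- Frequency
    total_sales                                 AS monetary,        -- Monetary
    lifespan_months,
    -- NULLIF guards against division by zero; * 1.0 avoids integer division.
    total_sales * 1.0 / NULLIF(total_orders, 0) AS avg_order_value
FROM customer_orders;
```

`CREATE VIEW` must be the first statement in its batch, so in a script this block would be preceded by `GO`.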
- Product performance intelligence view
- Sales velocity and lifecycle metrics
- Profitability analysis
- Performance tier classification
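A product view along the same lines is sketched below. Again this is not the project's actual `gold.report_products` definition: the view name is hypothetical, the profit estimate assumes `cost` is a per-unit cost on `gold.dim_products`, and the revenue thresholds for the tiers are placeholders because the README does not state them.

```sql
CREATE OR ALTER VIEW gold.report_products_sketch AS
WITH product_sales AS (
    SELECT
        p.product_key,
        p.product_name,
        p.category,
        SUM(f.sales_amount)                       AS total_sales,
        SUM(f.sales_amount - p.cost * f.quantity) AS estimated_profit,  -- assumes per-unit cost
        MAX(f.order_date)                         AS last_sale_date,
        DATEDIFF(MONTH, MIN(f.order_date), MAX(f.order_date)) AS lifespan_months
    FROM gold.fact_sales AS f
    INNER JOIN gold.dim_products AS p
        ON p.product_key = f.product_key
    WHERE f.order_date IS NOT NULL
    GROUP BY p.product_key, p.product_name, p.category
)
SELECT
    product_key,
    product_name,
    category,
    total_sales,
    estimated_profit,
    last_sale_date,
    DATEDIFF(MONTH, last_sale_date, GETDATE())     AS recency_months,
    -- Sales velocity: average revenue per month of the product's selling lifespan.
    total_sales * 1.0 / NULLIF(lifespan_months, 0) AS avg_monthly_revenue,
    CASE
        WHEN total_sales > 50000  THEN 'High-Performer'  -- hypothetical threshold
        WHEN total_sales >= 10000 THEN 'Mid-Range'       -- hypothetical threshold
        ELSE 'Low-Performer'
    END AS performance_tier
FROM product_sales;
```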
- 18,484 unique customers across 6 countries
- Customer segmentation: VIP / Regular / New based on spend and tenure
- Age group distribution: Gen Z to Senior (6 categories)
- RFM analysis: Recency, Frequency, Monetary value tracking
- 295 active products across 4 categories
- Cost tiers: Budget (<€100) / Mid-Range (€100-€500) / Premium (€500-€1K) / Luxury (>€1K)
- Performance classification: High/Mid/Low performers based on revenue
- Product lifecycle tracking with last sale date and recency metrics
- 60,398 transactions spanning 48 months (2011-2014)
- Time-series analysis: yearly, monthly, daily granularities
- Running totals and moving averages for trend analysis
- Year-over-year comparisons for growth tracking
- Database: SQL Server 2017 or higher (queries that use `DATETRUNC()` require SQL Server 2022 or later)
- Data Warehouse: Existing implementation with Gold layer (star schema)
- Tables Required:
  - `gold.fact_sales` (sales transactions)
  - `gold.dim_customers` (customer dimension)
  - `gold.dim_products` (product dimension)
- Permissions: `SELECT` on Gold layer, `CREATE VIEW` for BI views
Planned Enhancements:
- Tableau Dashboard: Interactive visualizations for sales performance, customer analytics, and product insights
- Power BI Dashboard: Executive dashboards with KPIs, trends, and drill-down capabilities
- BI Tool Integration: Direct connection of the `gold.report_customers` and `gold.report_products` views to visualization platforms
This project builds upon the foundational Sales Data Warehouse project, implementing best practices in:
- Data Warehousing (Medallion Architecture, Star Schema)
- SQL Analytics (Window Functions, CTEs, Advanced Aggregations)
- Business Intelligence (KPI Design, Segmentation, Reporting)
Special thanks to Baraa Khatib Salkini (Data With Baraa) for educational guidance in data engineering and analytics.
This project is licensed under the MIT License - see the LICENSE file for details.
Youhad Ayoub
- GitHub: @YOUHAD08
- LinkedIn: Ayoub Youhad
- Email: ayoubyouhad79@gmail.com
Last Updated: January 2026