01
Azure Data Factory Retail ETL Pipeline
Built an end-to-end Azure Data Factory ETL pipeline to move Mockaroo-generated retail CSV data from Azure Blob Storage into Azure SQL Database. The pipeline loads data into staging tables, runs a SQL stored procedure to transform the data, and validates the final reporting table using SQL queries.
Azure Data Factory, Azure Blob Storage, Azure SQL Database, SQL, Mockaroo, ETL
→ View on GitHub
02
SQL Server to Azure SQL Database Migration Project
Migrated a local SQL Server retail database to Azure SQL Database using Python and pyodbc. Created SQL tables, stored procedures, validation queries, and data-quality checks to verify successful migration.
SQL Server, Azure SQL Database, Python, SQL, ETL
→ View on GitHub
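One of the data-quality checks mentioned above, comparing row counts between the source and the migrated database, might look roughly like the sketch below. This is an illustration, not the project's actual code: the helper names (`count_rows`, `compare_row_counts`, `make_demo_db`) and the `customers`/`orders` tables are hypothetical, and in-memory SQLite databases stand in for SQL Server and Azure SQL Database so the example runs without a server. The comparison functions only use the standard DB-API cursor calls, so they should also accept pyodbc connections.

```python
import sqlite3


def count_rows(conn, table):
    """Count rows in one table through any DB-API 2.0 connection."""
    cursor = conn.cursor()
    # Table names cannot be bound as parameters, so pass only trusted names.
    cursor.execute(f"SELECT COUNT(*) FROM {table}")
    return cursor.fetchone()[0]


def compare_row_counts(source_conn, target_conn, tables):
    """Return {table: (source_count, target_count)} for tables whose counts differ."""
    mismatches = {}
    for table in tables:
        src = count_rows(source_conn, table)
        dst = count_rows(target_conn, table)
        if src != dst:
            mismatches[table] = (src, dst)
    return mismatches


def make_demo_db(order_rows):
    """Build an in-memory database standing in for one side of the migration."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical tables; the real retail schema is not shown in the write-up.
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO customers (id) VALUES (?)", [(1,), (2,)])
    conn.executemany(
        "INSERT INTO orders (id) VALUES (?)", [(i,) for i in range(order_rows)]
    )
    return conn
```

An empty result from `compare_row_counts` means every listed table has the same number of rows on both sides. Matching counts do not prove the contents match, so a check like this would normally sit alongside column-level validation queries.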
03
CRM Sales Data Lakehouse using Databricks
Built an end-to-end CRM sales analytics lakehouse using Databricks, PySpark, Delta Lake, and SQL. Designed Bronze, Silver, and Gold data layers to process customer, sales, product, employee, and support ticket data. Created Gold analytics tables and a Databricks dashboard to analyze revenue, customer value, product performance, regional sales, and support ticket trends.
Databricks, PySpark, Delta Lake, SQL, ETL, Data Engineering, Dashboard
→ View on GitHub
04
Canadian Data Analyst Job Market Analysis
Built a data analysis and automation project to study the Canadian data analyst job market. Collected job postings from public APIs, cleaned and analyzed the data using Python and Pandas, and visualized insights in Tableau. Later extended the project with an n8n workflow that automates weekly job collection, duplicate removal, Google Sheets updates, and new job alerts.
Python, Web Scraping, Pandas, Tableau, REST API, NLP, n8n, JavaScript, Automation
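The duplicate-removal and new-job-alert steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's n8n workflow: the `JobPosting` fields, the choice of title, company, and location as the matching key, and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobPosting:
    # Hypothetical fields; the real API payloads are not shown in the write-up.
    title: str
    company: str
    location: str
    url: str


def posting_key(job):
    """Normalize the fields used to decide whether two postings are the same job."""
    return (
        job.title.strip().lower(),
        job.company.strip().lower(),
        job.location.strip().lower(),
    )


def find_new_postings(fetched, seen_keys):
    """Return postings not already seen, dropping duplicates within the batch."""
    new_jobs = []
    batch_keys = set()
    for job in fetched:
        key = posting_key(job)
        if key in seen_keys or key in batch_keys:
            continue  # already stored earlier, or repeated in this batch
        batch_keys.add(key)
        new_jobs.append(job)
    return new_jobs
```

In a weekly run, `seen_keys` would be built from the rows already stored (for example, the existing sheet), and the returned list would drive both the sheet update and the alert. Matching on normalized text is a simple heuristic, so reposted jobs with slightly reworded titles would still pass through as new.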
05
Energy Production Data Warehouse & Analytics Dashboard
Designed and implemented an end-to-end SQL Server data warehouse to track and analyze asset downtime across energy production facilities. Built a fully automated ETL pipeline to ingest, clean, and transform raw operational data into structured fact and dimension tables. Developed an interactive Power BI dashboard that reduced manual reporting time by ~60% and enabled teams to identify high-downtime assets and pinpoint failure trends.
SQL Server, ETL, Power BI, Data Warehousing
→ View on GitHub
06
Hospital Admission Analysis
Analyzed 200,000+ New York hospital admission records to uncover patterns in patient volume, length of stay, and resource utilization across departments. Designed an interactive Tableau dashboard with drill-down filters by hospital, diagnosis, and time period. Key findings included significant weekend admission spikes and seasonal trends that can inform staffing and bed allocation strategies.
Tableau, Healthcare Analytics, EDA
→ View on GitHub
07
Fake Job Classifier
Built a machine learning classifier to detect fraudulent job postings, trained on 17,000+ job listings. Engineered features using TF-IDF and trained a Logistic Regression model with SMOTE to handle class imbalance, achieving an AUC of 0.96. Deployed as an interactive Streamlit web application: paste any job description and get an instant real/fake prediction.
Python, NLP, Scikit-learn, Streamlit
08
Employee Sales Performance Insights
Developed a complete BI pipeline simulating a realistic multi-channel sales environment. Used PL/SQL stored procedures to automate data aggregation, calculate KPIs, and generate ranked employee summaries across regions and product lines. Delivered a Tableau dashboard giving sales managers a clear view of top performers, underperforming channels, and revenue distribution.
Tableau, PL/SQL, Business Intelligence, Data Aggregation
→ View on GitHub
09
Crime Severity Index in Canadian Provinces
Conducted a multi-year analysis of Statistics Canada's Crime Severity Index across all Canadian provinces from 2019 to 2023, with a focused 2022 vs. 2023 year-over-year comparison. Built a Tableau visualization suite featuring choropleth maps and trend lines that communicate regional disparities, useful for policy researchers, journalists, and public safety analysts.
Tableau, Public Data, Data Storytelling, Canadian Stats
→ View on GitHub