
Projects


Automated Ingestion for Summer Splash

Designed and implemented an end-to-end automated ingestion pipeline in Microsoft Fabric to eliminate manual data handling for Urban Apparel’s “Summer Splash” marketing campaign. The solution ingests daily CSV performance data from an external source into a Fabric Lakehouse using a scheduled pipeline, ensuring timely, reliable, and error-free data availability for downstream analytics. This automation reduced operational risk, improved data freshness, and enabled marketing teams to generate insights without manual intervention.
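
As a rough illustration of the load step, a minimal PySpark sketch follows; the landing path, table name, and notebook-based load are assumptions (the production pipeline could equally use a Copy activity):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Read the day's CSV drops; the landing path, header row, and schema
    # inference are assumptions for illustration.
    daily = (spark.read
             .option("header", "true")
             .option("inferSchema", "true")
             .csv("Files/landing/summer_splash/"))

    # Append into the Lakehouse Delta table queried by downstream reports.
    daily.write.format("delta").mode("append").saveAsTable("summer_splash_performance")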

Global Freight Forwarders: Incremental Ingestion of Logistics Data

This project delivers an automated, incremental data ingestion solution for Global Freight Forwarders (GFF) using Microsoft Fabric. The solution replaces a manual, inefficient workflow of processing raw JSON shipping logs with a robust pipeline that intelligently identifies and loads only new files into a Delta Lake table. The outcome enables the operations team to access consolidated, near real-time shipment data without manual intervention, ensuring data integrity and accelerating critical operational analysis.
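
The incremental pattern can be sketched in PySpark as below; the folder layout, table names, and processed-files log are illustrative assumptions rather than GFF's actual design:

    from notebookutils import mssparkutils  # Fabric notebook utilities
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Names of files already ingested, tracked in a small Delta log table.
    processed = {r.file_name for r in spark.table("gff_ingestion_log").collect()}

    # List raw JSON shipping logs and keep only the ones not seen before.
    new_files = [f.name for f in mssparkutils.fs.ls("Files/raw/shipping_logs/")
                 if f.name not in processed]

    for name in new_files:
        spark.read.json(f"Files/raw/shipping_logs/{name}") \
             .write.format("delta").mode("append").saveAsTable("shipments")
        # Record the file so the next run skips it.
        spark.createDataFrame([(name,)], ["file_name"]) \
             .write.format("delta").mode("append").saveAsTable("gff_ingestion_log")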

Telecom Customer 360: Dataflow Gen2 Transformation

Explores how Dataflow Gen2 was used at TelcoPrime to rescue a strategic "Customer 360" initiative by transforming chaotic, unvalidated Bronze data into a trusted Silver layer. The solution implements a robust, low-code pipeline that standardizes complex address inconsistencies, resolves state-code variations through reference lookups, and sanitizes contaminated phone records. The resulting Customer_Silver table provides a validated, high-quality foundation that significantly reduces undeliverable mail for marketing and ensures accurate data features for predictive churn analytics.
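
The case study builds this pipeline in Dataflow Gen2 (low-code Power Query); purely to illustrate the same three cleansing rules, here is an equivalent PySpark sketch with assumed table and column names:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()

    bronze = spark.table("customer_bronze")     # assumed Bronze table
    states = spark.table("state_code_lookup")   # assumed lookup: variant -> canonical_code

    silver = (bronze
        # Standardize addresses: trim, collapse repeated whitespace, upper-case.
        .withColumn("address", F.upper(F.regexp_replace(F.trim("address"), r"\s+", " ")))
        # Resolve state-code variations through the reference lookup.
        .join(states, bronze["state"] == states["variant"], "left")
        .withColumn("state", F.coalesce(states["canonical_code"], bronze["state"]))
        .drop("variant", "canonical_code")
        # Sanitize phone records: digits only; null out anything not 10 digits.
        .withColumn("phone", F.regexp_replace("phone", r"\D", ""))
        .withColumn("phone", F.when(F.length("phone") == 10, F.col("phone"))))

    silver.write.format("delta").mode("overwrite").saveAsTable("Customer_Silver")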

The Workforce Intelligence Engine - People Analytics

This project demonstrates the use of PySpark to rescue Innovate Solutions Inc. from "hyper-growth" data paralysis after its global workforce tripled in two years. The solution replaces manual, error-prone spreadsheets with a centralized analytics engine that unifies fragmented employee and department data into a single source of truth. By implementing production-level code, the engine automates the analysis of salary distributions, hiring trends, and tenure across multiple continents. The resulting framework provides leadership with high-fidelity insights, transforming raw HR data into a scalable asset for real-time financial planning and talent management.
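
A short PySpark sketch of the kind of analysis the engine automates; the table names, columns, and join key are assumptions for illustration:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()

    # Unify fragmented sources into one view; names and keys are assumptions.
    emp = spark.table("employees").join(spark.table("departments"), "dept_id")

    # Salary distribution by region and department.
    salary_stats = emp.groupBy("region", "dept_name").agg(
        F.percentile_approx("salary", 0.5).alias("median_salary"),
        F.avg("salary").alias("avg_salary"))

    # Hiring trend: hires per quarter.
    hiring = emp.groupBy(F.date_trunc("quarter", "hire_date").alias("quarter")).count()

    # Tenure in years from hire date to today.
    tenure = emp.withColumn("tenure_yrs",
                            F.datediff(F.current_date(), "hire_date") / 365.25)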


GreenGrid Solutions - The "Smart Pricing" Pilot

This project utilizes Azure Databricks and PySpark notebooks to rescue GreenGrid Solutions from "smart grid" data paralysis during a critical 100-home pilot program. By processing raw files directly from Databricks Volumes, the solution replaces manual, error-prone Excel billing with an automated transformation engine. The engine cleans messy sensor data and masks sensitive customer information, consolidating everything into a single source of truth. This framework provides leadership with high-fidelity billing insights while ensuring the pilot meets strict legal standards.
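
A minimal sketch of such a transformation pass, assuming a hypothetical Volume path, schema, and column names:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()

    # Volume path, schema, and column names are assumptions.
    raw = (spark.read.option("header", "true")
           .csv("/Volumes/greengrid/pilot/meter_readings/"))

    clean = (raw
        # Cast and drop unreadable or negative readings.
        .withColumn("kwh", F.col("kwh").cast("double"))
        .filter(F.col("kwh").isNotNull() & (F.col("kwh") >= 0))
        # Mask the customer identifier before it reaches billing outputs.
        .withColumn("customer_id", F.sha2(F.col("customer_id"), 256))
        .dropDuplicates(["customer_id", "reading_ts"]))

    clean.write.format("delta").mode("overwrite").saveAsTable("pilot_billing_readings")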

SantéFlux: The GDPR Privacy Crisis

SantéFlux, a European health-tech company, faced a critical GDPR compliance crisis while delivering real-time health analytics from millions of smartwatch readings. The business needed to generate a city-level health trends report to support executive decision-making, while ensuring that all personally identifiable information (PII) was properly masked or anonymized. At the same time, the solution had to scale efficiently under heavy data volumes and strict performance constraints using PySpark and Azure Databricks.
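
One common way to satisfy both constraints is to pseudonymize identifiers and publish only aggregates; below is a PySpark sketch under assumed table and column names:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()

    readings = spark.table("smartwatch_readings")  # table and columns are assumptions

    report = (readings
        # Pseudonymize the user identifier and drop direct PII columns outright.
        .withColumn("user_key", F.sha2(F.col("user_id").cast("string"), 256))
        .drop("user_id", "name", "email")
        # Aggregate to city level so no individual is identifiable in the output.
        .groupBy("city")
        .agg(F.avg("heart_rate").alias("avg_heart_rate"),
             F.countDistinct("user_key").alias("contributing_users")))

    report.write.format("delta").mode("overwrite").saveAsTable("city_health_trends")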

Global Retail Inc. - Building the End-to-End Lakehouse

Demonstrates how Microsoft Fabric was used to build a unified Lakehouse analytics platform, replacing traditional transactional-database reporting with a Medallion architecture (Bronze, Silver, Gold). Fabric pipelines handle data ingestion; PySpark notebooks perform joins, de-normalization, surrogate-key generation, and aggregations; and a Gold semantic model with Power BI dashboards supports reporting. Finally, a master pipeline orchestrates the end-to-end workflow, ensuring the entire Lakehouse process runs in a controlled and automated manner.
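
As one illustration of the Silver-to-Gold step, a PySpark sketch of surrogate-key generation and de-normalization, with assumed table names and keys:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.window import Window

    spark = SparkSession.builder.getOrCreate()

    # Silver tables and join keys are assumptions for illustration.
    orders = spark.table("silver_orders")
    customers = spark.table("silver_customers")

    # Surrogate key for the Gold customer dimension (a global window is
    # acceptable at dimension scale, though it runs on a single partition).
    dim_customer = customers.withColumn(
        "customer_sk", F.row_number().over(Window.orderBy("customer_id")))

    # De-normalize: one wide Gold fact row per order, keyed on the surrogate.
    fact = orders.join(dim_customer.select("customer_id", "customer_sk"),
                       "customer_id", "left")

    dim_customer.write.format("delta").mode("overwrite").saveAsTable("gold_dim_customer")
    fact.write.format("delta").mode("overwrite").saveAsTable("gold_fact_sales")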

Global Credit Corp - Warehouse API Ingestion with T-SQL

This case study focuses on building an automated data ingestion pipeline using Microsoft Fabric Data Warehouse. The goal is to fetch daily currency exchange rates from an external API and load them directly into a warehouse table using T-SQL. The solution calls the API in a loop for multiple dates, parses the JSON responses with OPENJSON, and stores the data in a structured table that can be joined with transaction data. The pipeline is designed to be automated, efficient, and reliable, with validation logic to avoid duplicate records. The project demonstrates practical skills in API integration, T-SQL stored procedures, Fabric pipelines, and warehouse-based data ingestion.
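
The case study implements this entirely in T-SQL (a stored procedure looping over dates and shredding the payload with OPENJSON); the sketch below is only a rough PySpark analogue of the same fetch, parse, and deduplicate flow, with a hypothetical API URL and assumed table names:

    import requests
    from datetime import date, timedelta
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Hypothetical API and response shape; the real procedure loops in T-SQL
    # and shreds the JSON with OPENJSON instead.
    rows, day = [], date.today() - timedelta(days=7)
    while day <= date.today():
        payload = requests.get(f"https://api.example.com/rates?date={day}").json()
        for code, rate in payload["rates"].items():
            rows.append((str(day), code, float(rate)))
        day += timedelta(days=1)

    incoming = spark.createDataFrame(rows, ["rate_date", "currency", "rate"])
    # Mirror the procedure's duplicate check: skip date/currency pairs already loaded.
    existing = spark.table("exchange_rates").select("rate_date", "currency")
    (incoming.join(existing, ["rate_date", "currency"], "left_anti")
             .write.format("delta").mode("append").saveAsTable("exchange_rates"))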

Global Corp - PySpark Optimization & Delta Lake

This case study focuses on optimizing a critical finance data pipeline for a global organization using PySpark and Delta Lake in Microsoft Fabric. The pipeline joins tens of millions of accounts-receivable transaction records with customer master data to generate daily aging reports for leadership. Performance and data-reliability issues made the process slow and inconsistent. The solution involved diagnosing bottlenecks, optimizing PySpark joins, and using Delta Lake capabilities such as schema enforcement, time travel, restore, and table maintenance with OPTIMIZE and Z-ORDER to deliver a fast, reliable, production-ready data pipeline.
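
A condensed PySpark sketch of the two main levers, a broadcast join plus Delta table maintenance, with assumed table and column names:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()

    ar = spark.table("ar_transactions")    # large fact; names are assumptions
    cust = spark.table("customer_master")  # small dimension

    # Broadcast the small dimension so the large fact table is never shuffled.
    aged = (ar.join(F.broadcast(cust), "customer_id")
              .withColumn("days_past_due", F.datediff(F.current_date(), "due_date")))

    aged.write.format("delta").mode("overwrite").saveAsTable("ar_aging")

    # Delta maintenance: compact small files and co-locate rows on the filter key.
    spark.sql("OPTIMIZE ar_aging ZORDER BY (customer_id)")
    # Time travel / restore remain available if a load goes wrong, e.g.:
    # spark.sql("RESTORE TABLE ar_aging TO VERSION AS OF 12")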


Veritas Global - Compliance Investigation: T-SQL Analytics

This case study focuses on modernizing compliance reporting for a global legal and compliance organization using Microsoft Fabric Warehouse and advanced T-SQL analytics. The solution addressed challenges such as messy legacy data, deeply nested organizational hierarchies, and the need for accurate historical accountability in compliance investigations. A strong data foundation was built using T-SQL, window functions, and fixed-depth hierarchy creation to support workload analysis and executive reporting. SCD Type 2 logic was implemented to track analyst roster changes over time, delivering reliable, audit-ready analytics in a scalable, production-ready framework.
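
The case study implements SCD Type 2 in T-SQL; the sketch below shows the same close-and-insert pattern as a PySpark/Delta analogue, with assumed table, key, and attribute names:

    from delta.tables import DeltaTable
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()

    updates = spark.table("analyst_roster_staging")  # assumed staging table
    dim = DeltaTable.forName(spark, "dim_analyst")   # assumed SCD2 dimension

    # Step 1: close the current row when a tracked attribute (here, team) changes.
    (dim.alias("d")
        .merge(updates.alias("u"),
               "d.analyst_id = u.analyst_id AND d.is_current = true")
        .whenMatchedUpdate(condition="d.team <> u.team",
                           set={"is_current": "false", "valid_to": "current_date()"})
        .execute())

    # Step 2: insert a fresh current row for changed and brand-new analysts
    # (anything in staging with no remaining current row in the dimension).
    current = spark.table("dim_analyst").filter("is_current = true")
    (updates.join(current, "analyst_id", "left_anti")
            .withColumn("is_current", F.lit(True))
            .withColumn("valid_from", F.current_date())
            .withColumn("valid_to", F.lit(None).cast("date"))
            .write.format("delta").mode("append").saveAsTable("dim_analyst"))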
