Proven Results. Delivered.
Real projects. Measurable outcomes. From legacy modernization to cost reduction, here's how I help data teams move faster and spend smarter.
Lakehouse Migration for SaaS ERP
SaaS ERP Provider
Challenge
A growing SaaS ERP company was struggling with its legacy data lake. Pipelines ran slowly, data was hard to discover, and the team spent more time troubleshooting than building new features.
Solution
I migrated their analytical data platform from a legacy data lake to a modern Lakehouse built on PySpark, Apache Iceberg, AWS Glue, and Python, and implemented the Medallion pattern (Bronze, Silver, Gold) with clear separation between raw ingestion, refinement, and curated analytical datasets.
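A minimal sketch of what one such refinement job looked like, assuming an Iceberg catalog named `lake` is already configured on the SparkSession; the table and column names are illustrative, not the client's actual schema:

```python
# Bronze -> Silver refinement job (sketch). Assumes an Iceberg catalog
# named "lake" is registered on the SparkSession; names are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("bronze-to-silver").getOrCreate()

# Bronze: raw ingested records, kept as-delivered for replayability.
bronze = spark.read.table("lake.bronze.orders")

# Silver: deduplicated, typed, and conformed records.
silver = (
    bronze
    .dropDuplicates(["order_id"])
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .filter(F.col("order_id").isNotNull())
)

# Iceberg tables accept standard DataFrameWriterV2 writes once the
# catalog is configured.
silver.writeTo("lake.silver.orders").createOrReplace()
```

In practice these jobs were configuration-driven, so the same template served many source tables.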
My Role
Data Platform Engineer – responsible for architecture design, pipeline development, orchestration, and infrastructure automation.
Key Deliverables
- Lakehouse architecture with Medallion pattern using Apache Iceberg
- Modular PySpark pipelines with configuration-driven jobs
- Apache Airflow orchestration on AWS ECS with scheduling and retries (DAG skeleton sketched below)
- Terraform-based infrastructure for reproducible deployments
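The orchestration layer is easiest to show as a DAG skeleton. This is a hedged sketch of the scheduling and retry setup; the real tasks ran as containers on AWS ECS via the Amazon provider's ECS operator, and the DAG id, schedule, and commands here are placeholders:

```python
# Airflow DAG skeleton showing scheduling and retry behavior (sketch).
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    "retries": 3,                          # retry transient failures
    "retry_delay": timedelta(minutes=5),   # back off between attempts
}

with DAG(
    dag_id="lakehouse_bronze_to_silver",
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",   # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
    default_args=default_args,
) as dag:
    ingest = BashOperator(task_id="ingest_bronze", bash_command="echo ingest")
    refine = BashOperator(task_id="refine_silver", bash_command="echo refine")
    ingest >> refine
```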
Cloud Lakehouse Platform for Energy Sector
Energy Tech Company
Challenge
An energy company needed a robust cloud-based data platform supporting both batch and streaming workloads. Existing pipelines were unstable, lacked proper monitoring, and had no clear data governance.
Solution
I designed and operated a cloud-based Lakehouse-style data platform supporting batch and streaming ingestion, transformation, and analytical serving, and implemented the Medallion architecture, distributed processing pipelines, and comprehensive CI/CD workflows.
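To make the batch/streaming split concrete, here is a minimal sketch of the streaming half: a Structured Streaming job landing raw events into a Bronze Delta table. The paths and schema are placeholders, not the client's actual layout:

```python
# Streaming ingestion into a Bronze Delta table (sketch; paths are made up).
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("stream-ingest").getOrCreate()

schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_ts", TimestampType()),
    StructField("payload", StringType()),
])

events = (
    spark.readStream
    .schema(schema)
    .json("s3://landing-bucket/events/")   # hypothetical landing zone
)

query = (
    events.writeStream
    .format("delta")
    .option("checkpointLocation", "s3://lake/checkpoints/events/")
    .outputMode("append")
    .start("s3://lake/bronze/events/")
)
# query.awaitTermination()  # block here in a long-running job
```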
My Role
Senior Data Platform Engineer – owned platform architecture, pipeline development, infrastructure-as-code, and monitoring setup.
Key Deliverables
- Lakehouse platform with Delta Lake and Medallion architecture
- PySpark and Golang-based data pipelines with deterministic processing
- GitHub Actions CI/CD for automated testing and deployment
- Monitoring and alerting with structured logging and failure notifications (sketched below)
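The monitoring deliverable boils down to a small pattern: one-line JSON logs plus a failure notification. A hedged sketch in plain Python, where the webhook URL and job names are hypothetical:

```python
# Structured logging plus failure alerting (sketch; URL and names are fake).
import json
import logging
import urllib.request

logger = logging.getLogger("pipeline")
logging.basicConfig(level=logging.INFO, format="%(message)s")

def log_event(level: int, **fields) -> None:
    """Emit a one-line JSON record so log tooling can parse it."""
    logger.log(level, json.dumps(fields))

def notify_failure(job: str, error: str) -> None:
    """Post a failure alert to a chat webhook (URL is a placeholder)."""
    body = json.dumps({"text": f"Job {job} failed: {error}"}).encode()
    req = urllib.request.Request(
        "https://hooks.example.com/alerts",   # hypothetical webhook
        data=body,
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

def run_pipeline() -> None:
    """Stand-in for the real job entry point."""

try:
    run_pipeline()
    log_event(logging.INFO, job="bronze_ingest", status="success")
except Exception as exc:
    log_event(logging.ERROR, job="bronze_ingest", error=str(exc))
    notify_failure("bronze_ingest", str(exc))
    raise
```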
Azure Databricks Platform Optimization
Enterprise Consulting Client
Challenge
A consulting client had adopted Azure Databricks but faced inconsistent job performance, frequent pipeline failures, and no proper governance. Teams worked in silos with duplicate data and unpredictable costs.
Solution
I implemented production-grade Medallion Lakehouse architectures on Azure Databricks using Delta Lake and PySpark, optimized cluster configurations, established data access controls, and created Git-based CI/CD workflows.
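Much of the performance work reduced to routine Delta maintenance. A hedged sketch of the two commands involved; on Databricks the `spark` session is provided by the runtime, and the table and column names are made up:

```python
# Delta table maintenance on Databricks (sketch; `spark` is runtime-provided).
# Compact small files and cluster them by a frequent filter column.
spark.sql("OPTIMIZE gold.sales ZORDER BY (customer_id)")

# Trim old snapshots once they are no longer needed for time travel.
spark.sql("VACUUM gold.sales RETAIN 168 HOURS")
```

Z-ordering lets queries skip files unrelated to the filtered column, and regular VACUUM keeps storage costs predictable.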
My Role
Data Engineer & Consultant – led architecture implementation and performance optimization, and conducted Spark workshops for client teams.
Key Deliverables
- Medallion Lakehouse architecture with Delta Lake
- Optimized Databricks cluster configurations and autoscaling policies
- Databricks Jobs with retry logic and dependency management
- Table ACLs and data masking for enterprise data access control (pattern sketched below)
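The masking pattern is worth showing: a dynamic view that redacts PII for anyone outside a privileged group, then a grant on that view. Group, table, and column names are illustrative; `is_member()` is a Databricks SQL function:

```python
# Dynamic-view masking plus a table ACL grant (sketch; names are made up).
spark.sql("""
    CREATE OR REPLACE VIEW analytics.customers_masked AS
    SELECT
        customer_id,
        CASE WHEN is_member('pii_readers') THEN email
             ELSE 'REDACTED' END AS email
    FROM analytics.customers
""")

# Views are granted through the table namespace on Databricks.
spark.sql("GRANT SELECT ON TABLE analytics.customers_masked TO `analysts`")
```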
RAG-Based AI Agent for Customer Support
B2B SaaS Company
Challenge
A SaaS company's support team was overwhelmed with repetitive inquiries. Manual ticket handling was slow, inconsistent, and prevented the team from focusing on complex customer issues.
Solution
I built a serverless RAG-based AI agent using OpenAI, LangChain, Qdrant, Airflow, and AWS Lambda. The system automated routine inquiries while maintaining answer quality through vector-based retrieval and context-grounded responses.
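At its core, the agent is a retrieve-then-answer loop. A minimal sketch of that core using the OpenAI and Qdrant clients directly; the production system wired this through LangChain, and the model names, collection, and payload field here are assumptions:

```python
# Retrieve-then-answer core of a RAG agent (sketch; names are placeholders).
from openai import OpenAI
from qdrant_client import QdrantClient

oai = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
qdrant = QdrantClient(url="http://localhost:6333")  # hypothetical endpoint

def answer(question: str) -> str:
    # Embed the question and retrieve the closest support-doc chunks.
    emb = oai.embeddings.create(model="text-embedding-3-small", input=question)
    hits = qdrant.search(
        collection_name="support_docs",      # hypothetical collection
        query_vector=emb.data[0].embedding,
        limit=4,
    )
    # Assumes each point carries its source text under a "text" payload key.
    context = "\n\n".join(h.payload["text"] for h in hits)

    # Ground the model's reply in the retrieved context.
    reply = oai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Answer using only the provided context."},
            {"role": "user",
             "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return reply.choices[0].message.content
```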
My Role
AI/ML Engineer – designed the RAG architecture, built the vector pipeline, and integrated it with the existing support infrastructure.
Key Deliverables
- RAG-based AI agent using LangChain and OpenAI
- Qdrant vector database for semantic search
- Airflow-orchestrated document ingestion pipeline
- AWS Lambda serverless deployment for cost efficiency (handler sketched below)
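Finally, a sketch of the serverless entry point: a Lambda handler wrapping the `answer()` function from the previous sketch. The event shape, a JSON body with a `question` field, is an assumption about the API Gateway integration:

```python
# Lambda entry point around the RAG answer function (sketch).
import json

def lambda_handler(event, context):
    # Assumes an API Gateway proxy event with a JSON body.
    body = json.loads(event.get("body") or "{}")
    question = body.get("question", "")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"answer": answer(question)}),
    }
```

Running the retrieval path on Lambda meant the client paid per inquiry rather than for idle servers.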