Azure Data Engineer at KamerAI (2025-06 – Present)
Azure Data Engineer at Colan Infotech (2022-03 – 2025-05)
L3 Data Engineer at PepsiCo (2025-06 – Present)
- Converted a high-frequency CSV ingestion ADF pipeline into a Spark Structured Streaming workflow using Auto Loader, cutting Databricks and VM costs by $10,593.60 per month.
- Replaced the traditional watermark-based ingestion logic with Change Data Feed (CDF) incremental loading, so the pipeline processes only recently changed data instead of scanning terabyte-scale source tables in full, saving $4,401.16 per month.
- Reduced the runtime of a legacy Databricks job from 4.5 hours to 2.2 hours without increasing infrastructure costs, allowing the pipeline to refresh 10-11 times a day.
- Handled high-priority production incidents and rewrote/optimized over 6 legacy applications, delivering cumulative monthly cost savings of $11,649.19; worked within a 16-member team, supporting L1 and L2 teams in resolving production issues.
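The CDF-based incremental load above can be sketched framework-free: a checkpointed reader that processes only rows whose commit version exceeds the last processed version. This is an illustrative simulation of the pattern, not the Databricks API itself (on Databricks this would read the Delta table's change feed via the `readChangeFeed` option); the name `load_increment` and the tuple-based feed are assumptions for the example.

```python
# Framework-free sketch of Change-Data-Feed-style incremental loading.
# A list of (commit_version, row) tuples stands in for the change feed,
# and an integer checkpoint stands in for the stored watermark.

def load_increment(change_feed, last_version):
    """Return rows newer than last_version and the advanced checkpoint."""
    new_rows = [(v, row) for v, row in change_feed if v > last_version]
    # Advance the checkpoint only if something new arrived.
    new_checkpoint = max((v for v, _ in new_rows), default=last_version)
    return [row for _, row in new_rows], new_checkpoint

feed = [(1, "row-a"), (2, "row-b"), (3, "row-c")]
rows, ckpt = load_increment(feed, last_version=1)
# Only the changed rows ("row-b", "row-c") are processed, not the full table,
# which is where the cost saving over full-table scans comes from.
```

Each run touches only the delta since the last checkpoint, which is why this pattern beats rescanning terabyte-scale sources.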
Senior Data Engineer at SES Satellites (2022-03 – 2025-05)
- Designed and implemented an end-to-end project lifecycle for Spark Structured Streaming jobs across multiple environments.
- Reduced Databricks DBU and Azure VM costs by reconfiguring and reorganizing job clusters and streaming jobs, saving more than $93,000 per year without impacting job performance.
- Migrated the entire project from Scala to Python/PySpark to unify the codebase, and redesigned it with Databricks Asset Bundles (DABs) following best practices.
- Developed a custom Spark streaming query listener to log streaming metrics to a Log Analytics workspace.
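The custom listener in the last bullet follows the shape of PySpark's `StreamingQueryListener`, whose `onQueryProgress` callback receives per-batch progress. Below is a framework-free sketch of the logging side only: `MetricsListener`, the `sink` callable, and the field names in `progress` are illustrative assumptions; a real job would subclass `StreamingQueryListener` and the sink would POST to the Log Analytics workspace ingestion endpoint.

```python
import json

# Framework-free sketch of a streaming-metrics listener in the style of
# PySpark's StreamingQueryListener. The `sink` callable stands in for a
# Log Analytics client (an assumption for illustration).

class MetricsListener:
    def __init__(self, sink):
        self.sink = sink  # e.g. a function that ships JSON to Log Analytics

    def on_query_progress(self, progress: dict) -> None:
        # Forward only the fields operations teams typically chart.
        record = {
            "batchId": progress["batchId"],
            "inputRowsPerSecond": progress["inputRowsPerSecond"],
            "batchDurationMs": progress["batchDurationMs"],
        }
        self.sink(json.dumps(record))

captured = []
listener = MetricsListener(sink=captured.append)
listener.on_query_progress(
    {"batchId": 7, "inputRowsPerSecond": 1200.0, "batchDurationMs": 850}
)
# captured[0] now holds a JSON string with the three metric fields.
```

Keeping the sink injectable is what makes the listener easy to unit-test locally before wiring it to the workspace.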