Data Engineer at LTIMINDTREE LTD (2023-05 – Present)
Tech Stack: Python, SQL, PySpark, Databricks, Delta Lake, BigQuery, GCP, Git, ServiceNow
- Designed and implemented a centralized monitoring and subscription-based alerting system using Python, SQL, BigQuery, Cloud Run, Cloud Functions, and Cloud Scheduler, reducing manual checks by 80% and enabling real-time notifications on ingestion success, failure, inactivity, and data quality issues.
- Built and deployed a fault-tolerant monitoring, alerting, and incident management framework for email and SharePoint data ingestion pipelines using Azure Logic Apps, APIs, and Blob Storage, improving observability and reducing manual monitoring effort.
- Modernized ingestion frameworks by upgrading the Python runtime from 3.8 to 3.12 and refactoring code and dependencies, improving performance, stability, and long-term maintainability while reducing technical debt.
- Automated BigQuery table deployment and promotion from lower to higher environments (Dev → QA → Prod) and developed a generic, unified table structure, simplifying downstream analytics and improving data standardization.
- Led large-scale data transformation and performance tuning using Python (Pandas, NumPy), PySpark, and BigQuery, processing datasets larger than 1 TB and reducing query execution time.
- Owned production data pipelines, ensuring high availability, data quality, and timely incident resolution through structured troubleshooting and monitoring.
- Created Python-based incident management utilities that improved troubleshooting clarity for non-technical users by 20% and automated ServiceNow incident watch-list updates, reducing manual effort by 5%.
- Worked closely with product, analytics, and business teams to translate requirements into scalable and maintainable data solutions.