Senior Data Engineer, Allianz Life April 2018 – Dec 2021
Tech Stack: Python, PySpark, Scikit-learn, SQL, Mage, AWS Cloud
- Built a scalable recommendation system data pipeline using Python, AWS services (S3, EMR, Glue, Redshift), and Tableau and Power BI dashboards for data analysis and recommendations.
- Designed a claims automation system that handles large volumes of daily and weekly data and provides custom data marts to multiple teams, including fraud prevention, underwriting, marketing, and agency.
- Managed and orchestrated data pipelines for real-time marketing campaigns using Apache Airflow, Kafka, and SQL procedures.
- Developed a Random Forest fraud detection model for the Claims team that identified and helped prevent fraudulent claims worth $1.5 million in a single quarter.
- Trained 36 associates on core data engineering concepts through hands-on programming in Python, SQL, and Airflow, and provided technical guidance to junior team members to support their professional growth and ensure high-quality deliverables.
Machine Learning Engineer, Tata Consultancy Services July 2015 – March 2019
Tech Stack: Python, Airflow, Splunk, Ansible, Linux
- Developed automation scripts for model deployment (Docker and Kubernetes), monitoring, and maintenance, reducing deployment time by 70% and enabling continuous tracking of model performance.
- Conducted POCs at multiple client sites on performance tuning and optimization of existing machine learning models, achieving a 5% improvement in accuracy and reducing resource consumption.
- Implemented named entity recognition (NER) models using spaCy and BiLSTM-CRF (Bidirectional LSTM with C