Data Analyst at ZS Associates (2025-01 – Present)
US Pharma Project
- Gathered detailed business requirements from the client, coordinated with multiple vendor data sources, prepared requirement documents and FIA, and facilitated necessary access requests including S3 buckets for inbound, outbound, and archival dataflows.
- Designed and developed end-to-end ETL pipelines, configured ingestion frameworks, and built fact and dimension tables aligned with business rules, ensuring stable and timely pipeline execution.
- Implemented performance optimization techniques, maintained historical data using SCD Type 2, and applied time bucketing and geography mapping to create final reporting-ready datasets.
- Built an API data pull in Python to reduce client costs and provide daily updated data.
- Implemented an NLP model to categorize patient data into the respective buckets.
- Improved project performance from 65% to 85% by optimizing flow runtimes.
- Technologies: Spark, Spark SQL, Presto, PySpark, Python
Data Analyst at ZS Associates (2024-08 – 2025-02)
US Pharma Project
- Developed a Master Data Management (MDM) process to master customer universe records and deliver them to the data warehousing team.
- Mastered data using delta loads from multiple data vendors and implemented different SCD types according to historical data management needs.
- Led client calls on MDM design and rules discussions.
- Technologies: Spark, Spark SQL, Presto, PySpark, Python
Data Analyst at ZS Associates (2024-03 – 2025-01)
US Pharma Project
- Fetched data from various sources and streamlined data processing using Azure Data Factory pipelines, improving data flow efficiency and reliability.
- Followed Medallion architecture principles, organizing data into bronze, silver, and gold layers to ensure data quality, and handled delta loads with SCDs.
- Created and managed fact and dimension tables, along with historical data tables, in Databricks notebooks to support a structured and meaningful data model.
- Developed Databricks notebooks to apply complex business logic and transformations.
- Customized data processing workflows to align with specific use cases, ensuring that the final datasets met the analytical needs of the business.
- Technologies: Spark, Spark SQL, PySpark, Azure Data Factory, Databricks
Data Analyst at Celebal Technology (2023-05 – 2024-02)
Manufacturing-Based Project
- Migrated data from Oracle Cloud to a modern Data Lakehouse environment, significantly streamlining ETL operations and improving processing efficiency.
- Built a Medallion architecture (Bronze–Silver–Gold) to optimize data transformation layers, enhancing data quality, reliability, and reusability.
- Developed dynamic functions for history maintenance and designed fact and dimension tables to support advanced analytics, reporting, and business intelligence use cases.
- Technologies: PySpark, ADLS, SQL, ADF, DLT, and Databricks