Quantzig - Data Engineer
Job description
Job Role : Data Engineer
YoE : 2 to 7 years
Location : Bangalore
Date : Ongoing (from 29th April 2026)
About the Role :
We are seeking a highly skilled Data Engineer with expertise in Databricks and AWS Cloud to design, build, and optimize enterprise-scale ETL pipelines and reporting solutions. This role is ideal for someone who thrives at the intersection of data engineering and business intelligence, with a strong focus on transforming complex datasets into inputs for actionable insights. You will play a critical role in enabling spend analytics, patient access, and commercial reporting initiatives.
Key Responsibilities :
Databricks ETL Development :
- Design, develop, and maintain scalable ETL pipelines in Databricks using PySpark and SQL.
- Implement robust data ingestion, transformation, and validation processes to ensure high-quality datasets for analytics.
- Optimize workflows for performance, scalability, and reliability across large healthcare datasets.
Data Product Build :
- Migrate AWS assets (on EC2 and Redshift) to Databricks by recreating the existing ETL jobs and orchestrating them via Airflow or Databricks Workflows.
- Hands-on experience with catalog management using Unity Catalog.
- Conversant with AI capabilities within AWS and Databricks.
Requirements :
- 2 to 7 years of experience in Databricks data engineering roles.
- Strong hands-on proficiency in Databricks (PySpark, SQL) and AWS Cloud.
- Proven track record in ETL pipeline development and understanding of BI dashboards.
- Experience with US healthcare datasets.
- Strong SQL skills for data extraction, aggregation, and reporting.
- Excellent problem-solving abilities, with the capacity to work independently and in cross-functional teams.
- Detail-oriented with a commitment to delivering high-quality, reliable data solutions.
Good to Have :
Domain & Data Knowledge :
- Knowledge of IQVIA claims (standard, eLaaD, Remit, Rejection, NBRx, TRx, IGG4 Claims) and MMIT datasets to derive actionable insights.
End-to-End Production Deployment & Orchestration :
Candidates with exposure to deploying and orchestrating data pipelines in production environments will be preferred. While not a core requirement, the following experience is a strong differentiator :
- Experience deploying ETL or ML pipelines end-to-end in a production environment, including environment promotion across dev, staging, and production.
- Familiarity with CI/CD tooling (GitHub Actions, Azure DevOps, or Jenkins) for automating pipeline deployment and release management.
- Exposure to Databricks Asset Bundles (DABs) or equivalent frameworks for version-controlled, repeatable job deployments.
- Working knowledge of Apache Airflow for DAG authoring, scheduling, dependency management, and monitoring of production workflows.
- Awareness of infrastructure-as-code practices (Terraform or AWS CloudFormation) for managing cloud resources supporting data pipelines.
- Basic understanding of containerization concepts (Docker) for packaging and deploying data pipeline components.
- Experience with pipeline monitoring and alerting: tracking job health, SLA adherence, failure notifications, and data freshness in production.
- Familiarity with secrets and configuration management across environments (AWS Secrets Manager, Databricks Secrets, or equivalent).