Sr. Machine Learning Engineer - Agentic Voice (Healthcare)

Company: Attune

Location: West Loop, Chicago, IL (Hybrid — 3 days/week in office)

Type: Full‑time

Job Summary

We’re looking for a hands-on, entrepreneurial Senior Machine Learning Engineer who has already taken voice-centric AI systems (TTS, STT, LLM-driven dialog) from prototype to planet-scale production. You will own the full lifecycle of our ML stack (research, data pipelines, training, evaluation, deployment, and relentless optimization) so that millions of patients can have natural, sub-second conversations with our Agentic Voice platform. You’ll collaborate tightly with product, infra, and compliance teams and set a high technical bar for ML excellence.

What sets this role apart: You'll specialize in creating highly optimized, domain-specific conversational AI models by fine-tuning and compressing existing LLMs and specialized conversational architectures. We need someone who can rapidly research, prototype, and deploy smaller, faster, cheaper models that outperform general-purpose solutions in conversational settings, achieving 10x speed improvements and 90% cost reductions. You'll also build efficient pipelines for intent classification, dialogue management, and text-based optimization that improve the conversational quality of our dialogue systems.

Key Responsibilities

Advanced Model Optimization & Fine-Tuning

Apply LoRA, QLoRA, DPO, RLHF, and other parameter-efficient methods to create smaller, faster models optimized for conversational contexts.

Implement quantization, pruning, and knowledge distillation to significantly reduce model size while preserving quality.

Work with modern conversational architectures (DeBERTa, SetFit, sentence transformers, and lightweight decoder models) for domain-specific use cases.

Rapidly evaluate and adapt the latest research for conversational applications.

End-to-End ML Engineering

Design, build, and maintain high-performing STT, TTS, and LLM pipelines that operate at under 800 ms end-to-end latency across thousands of concurrent calls.

Train and fine-tune smaller, task-specific LLMs optimized for accuracy, latency, and cost efficiency in real-time applications.

Inference at Scale

Optimize GPU- and CPU-based serving on EKS / Kubernetes using techniques such as dynamic batching, quantization, speculative decoding, and streaming gRPC / WebSockets.

Dialogue Management

Implement and extend LangGraph / LangChain flows and Model Context Protocol (MCP) schemas to orchestrate complex multi-turn healthcare conversations safely and compliantly.

Data & Evaluation

Build robust data pipelines (Kafka → Snowflake / S3) for conversation logs; design offline and online evaluation frameworks for ASR WER, TTS MOS, and task-completion metrics.

Technical Leadership

Establish ML best practices (versioning, monitoring, A/B gating, CI/CD for models) and mentor engineers on MLOps, audio processing, and prompt engineering.

Cross-Functional Collaboration

Work daily with product managers, designers, compliance leads, and customer teams to translate business goals into scalable voice experiences.

Innovation & Research

Stay on the cutting edge of open-source speech and LLM research; run rapid POCs (e.g., Whisper-v3, Bark), explore efficient fine-tuning techniques (e.g., LoRA, DPO), and continuously improve model performance in production environments.

Reliability & Compliance

Ensure HIPAA-grade security, auditable PHI handling, guardrails, and fallback strategies to keep conversations safe and reliable 24 × 7.

Qualifications

Education

B.S. or M.S. in Computer Science, Machine Learning, or related field.

Experience

7+ years building production ML systems, including 2+ years specifically in speech / conversational AI.
Proven track record shipping voice AI or large-scale LLM products to tens of millions of users or thousands of concurrent sessions.

Technical Expertise

Advanced Fine-tuning & Model Compression:
Proven experience with parameter-efficient fine-tuning techniques (LoRA, QLoRA, adapters) for conversational applications
Knowledge of few-shot learning frameworks for conversational tasks with limited data
Experience with model compression techniques (quantization such as GPTQ/AWQ, pruning, knowledge distillation) for real-time inference
Speech: Deep understanding of ASR (e.g., Whisper, NeMo, Kaldi) and TTS (e.g., Tacotron, FastSpeech, VITS) model internals and evaluation.
LLMs & Dialogue: Experience with GPT-class models, fine-tuning, RAG, LangGraph, LangChain, MCP, prompt engineering, and safety guardrails.
Languages: Expert in Python; proficiency in TypeScript / Node and/or Java is a plus.
MLOps & Infra: Kubernetes (AWS EKS), Helm, Terraform, MLflow / SageMaker, model-aware CI/CD, feature stores, GPU scheduling, autoscaling.
Data: Kafka, Redis, Postgres, Snowflake; designing real-time and batch pipelines for audio and text.
Protocols: gRPC, WebSockets, HTTP/2 streaming, RTP/WebRTC.
Security & Compliance: Experience securing PHI/PII, HIPAA/HITRUST controls, and SOC 2 processes.

Soft Skills:

Product Mindset: Proven ability to make strategic product decisions with a focus on user needs and business impact.
Entrepreneurial: Experience taking ideas all the way from ideation to execution. Instead of waiting for tasks, you’re proactively identifying areas of opportunity and building them out.
Leadership Skills: Demonstrated experience in leading technical teams and mentoring engineers.
Problem-Solving: Excellent analytical and problem-solving abilities.
Communication: Strong verbal and written communication skills; ability to articulate complex technical concepts to non-technical stakeholders.

How we work:

Small, cross‑functional pods with a tech lead (Data Science/Engineering owns sprint tickets; product owns the what/why and outcomes).
Bias to prototype → validate → build; instrument everything; learn fast.
High autonomy, high bar, candid feedback, low politics.
Hybrid: 3 days/week in our West Loop office; occasional travel for customer meetings and team onsites.

Compensation & benefits:

Competitive base + equity; comprehensive health benefits; flexible PTO.

Attune is committed to a culture of teamwork, where everyone works together to plan, do, learn, and continuously improve. We accomplish that by staying true to our core values.

Lead With Empathy - Attune designs technology that listens first and responds with compassion and precision. Every interaction reflects genuine understanding and care for patients, providers, and partners.

Trust Is Earned - Trust is built through openness, clarity, and reliability. Attune upholds the highest standards of privacy, security, and communication, ensuring confidence in every exchange.

Work in Harmony - Collaboration drives progress. Attune aligns patients, care organizations, and technology partners to create seamless, unified systems that work together toward better outcomes.

Prioritize Outcomes - Success is measured by impact, not activity. Attune focuses on closing care gaps, improving experiences, and advancing meaningful health outcomes.

Innovate With Integrity - Attune advances AI responsibly, creating solutions that amplify human expertise without losing the human touch. Innovation always serves people first.

Expand Access - Care should be easy to reach and equitable for all. Attune’s technology removes barriers, expands access, and ensures every patient can connect with care when it matters most.

Attune is an Equal Employment Opportunity Employer and all employment decisions shall be made without regard to age, race, creed, color, religion, sex, national origin, ancestry, disability status, veteran status, sexual orientation, gender identity or expression, genetic information, marital status, citizenship status or any other basis as protected by federal, state, or local law.

We are committed to the full inclusion of all qualified individuals. As part of this commitment, we will ensure that persons with disabilities are provided reasonable accommodations. If reasonable accommodation is needed to participate in the job application or interview process, perform essential job functions, and/or receive other benefits and privileges of employment, please contact us.
