Research Engineer, Evaluations

Technology

AssemblyAI

Fully, Switzerland | Posted 1 month ago | Open until 25 May 2026

Internship

Job Description

Own end-to-end and integration-level model evaluation across accuracy, latency, and feature-specific metrics

Build and maintain competitive benchmarking pipelines

Design and run systematic experiments to measure the impact of model changes

Onboard, curate, and maintain evaluation datasets

Create evaluation subsets to stress-test specific capabilities and edge cases

Define evaluation metrics for real-world performance

Translate qualitative customer feedback into quantifiable evaluation criteria

Work with customer-facing teams to understand pain points and convert them into research priorities

Maintain clean evaluation pipelines and clear documentation

Identify evaluation gaps proactively and propose solutions

Experience

ML fundamentals: Interpret results and debug issues without training from scratch

Strong Python skills: Write clean evaluation scripts, work with data pipelines, and be comfortable with SQL and cloud infrastructure

Metric intuition: Understand what makes a good evaluation metric and how to ensure statistical rigor

Voice agent stack familiarity: Understand how VAD, ASR, turn detection, LLM, and TTS systems interact

Tinkerer mentality: Preference for shipping and iterating quickly

Communication skills: Explain technical results, summarize findings, and translate customer feedback

Ownership mindset: Proactively fill evaluation gaps

Availability: Work hours that overlap with the US Eastern time zone by at least 3-4 hours

Salary and Perks

Pay range: $210K - $260K

About AssemblyAI

AssemblyAI builds industry-leading Speech AI models that automatically recognize and understand speech.

Keywords

OCaml, Cloud computing, Python, SQL, Stress Testing, Debugger, Debugging
