🚀 Hiring: SwarmBench Task Engineer — SWE / Code

📍 Location: Remote

💼 Job Type: Freelance / Contract

💰 Payment: Hourly Basis

🕒 Shift Timing: 7:30 PM – 12:30 AM IST + Flexible 4 Hours (PST Overlap Required)

⏳ Availability: Full-time (8 Hours Daily) with 4 Hours PST Overlap

We are looking for experienced SwarmBench Task Engineers (Code / SWE) to design and build high-quality multi-agent benchmark tasks based on real-world software engineering workflows.

🔹 Experience Required: 5+ Years

🔧 Key Skills:

Strong experience in Python & JavaScript development
Hands-on experience with AI coding benchmarks such as SWE-bench and Terminal-Bench
Ability to navigate large open-source codebases (Django, Flask, FastAPI, Node.js, etc.)
Strong understanding of Git workflows, PRs, diffs, cherry-picking & commits
Comfortable with Docker (Dockerfiles, image building, debugging containers)
Experience writing test scripts using pytest, unittest, or custom assertions
Excellent technical documentation and specification writing skills

📌 Role Responsibilities:

Build multi-agent benchmark tasks using real-world open-source code changes
Work with the Harbor evaluation framework inside Docker environments
Write detailed task instructions with expected behavior and constraints
Create Python-based verification scripts for validating AI-generated code changes
Design decomposition strategies for multi-agent workflows
Debug and refine tasks for reproducibility and deterministic execution
Improve benchmark quality, clarity, and evaluation signals

🎯 Ideal Candidate:

Someone who enjoys deep codebase analysis, software engineering workflows, debugging complex systems, and working at the intersection of AI + Software Engineering.
