We're working with a company that empowers the world's most innovative organizations to enhance their AI agents through expertly curated human feedback. They collaborate with leading AI research teams to train advanced Large Language Models (LLMs) to operate as proactive, multi-step agents, focusing on complex, real-world architectural workflows.
The Role
- Develop objective, verifiable criteria to evaluate system performance and ensure outputs meet strict functional requirements
- Review system logs and "trajectories" to refactor code, improve execution paths, and achieve optimal reliability
- Test systems for vulnerabilities, including improper data exposure, unauthorized access, and edge-case failures
- Contribute expertise to training generative AI models as a freelance expert
What You'll Need
- 2+ years of experience in backend engineering, AI automation, or complex systems integration
- Proven ability to build and maintain production-grade software with modular separation
- Strong command of at least two major programming languages (e.g., Python, JavaScript, Go, or Java)
- Experience working with SQL databases and building for live, non-mocked environments
- Outstanding attention to detail and ability to provide clear, high-density technical feedback
What's On Offer
- Opportunity to shape the future of autonomous agents and generative AI systems
- Fully remote freelance role with flexible hours
- Competitive hourly rates of up to USD 50 for core project work
- Additional incentives averaging 7.5% of earnings through "Missions"
Apply via Haystack today!