We're working with a company that empowers the world's most innovative organizations to enhance their AI agents through expertly curated human feedback. They collaborate with leading AI research teams to train advanced Large Language Models (LLMs) to operate as proactive, multi-step agents, focusing on complex, real-world architectural workflows.
The Role
- Develop objective, verifiable criteria to evaluate system performance and ensure outputs meet strict functional requirements
- Review system logs and "trajectories" to refactor code, improve execution paths, and achieve optimal reliability
- Test systems for vulnerabilities, including improper data exposure, unauthorized access, and edge-case failures
- Contribute expertise to training generative AI models as a freelance expert
What You'll Need
- 2+ years of experience in backend engineering, AI automation, or complex systems integration
- Proven ability to build and maintain production-grade software with modular separation
- Strong command of at least two major programming languages (e.g., Python, JavaScript, Go, or Java)
- Experience working with SQL databases and building for live, non-mocked environments
- Outstanding attention to detail and ability to provide clear, high-density technical feedback
What's On Offer
- Opportunity to shape the future of autonomous agents and generative AI systems
- Fully remote freelance role with flexible hours
- Competitive hourly rates of up to USD 50 for core project work
- Additional incentives averaging 7.5% of earnings through "Missions"
Apply via Haystack today!