A technology evaluation firm is looking for individuals to assess outputs from large language models and autonomous agents. This remote role involves evaluating workflows and providing actionable feedback for model refinement. Candidates should have strong experience in LLM evaluation, proficiency i