Python GenAI Model Evaluator

Technology

ClifyX

Austin, United States | 1 month ago | Until 4/19/2026

Job description

Please share two onsite profiles for Model Evaluators. The location can be either SCV or Austin.

Billing: $78/hr

Technical Skills

Strong understanding of LLMs, generative AI, and transformer-based architectures.
Experience with Python, data analysis, and model evaluation frameworks.
Familiarity with prompt engineering, embeddings, RLHF/RLAIF, and LLM-based scoring methods.
Experience building evaluation datasets and working with annotation platforms.
Understanding of safety alignment, bias detection, and adversarial testing.

Tools & Platforms

ML/AI frameworks: PyTorch, TensorFlow, Hugging Face, LangChain.
Evaluation/annotation tools: Scale AI, GroundTruth, Labelbox, Prodigy.
Prompt testing tools: Weights & Biases, MLflow, OpenAI Evals, LLM-as-a-judge pipelines.
