Machine Learning Evaluator

**Job description**

hackajob is collaborating with **Moody's Corporation**
At Moody's, we unite the brightest minds to turn today’s risks into tomorrow’s opportunities. We do this by striving to create an inclusive environment where everyone feels welcome to be who they are—with the freedom to exchange ideas, think innovatively, and listen to each other and customers in meaningful ways. Moody’s is transforming how the world sees risk.
As a global leader in ratings and integrated risk assessment, we’re advancing AI to move from insight to action—enabling intelligence that not only understands complexity but responds to it. We decode risk to unlock opportunity, helping our clients navigate uncertainty with clarity, speed, and confidence.
If you are excited about this opportunity but do not meet every single requirement, please apply! You may still be a great fit for this role or other open roles. We are seeking candidates who model our values: invest in every relationship, lead with curiosity, champion diverse perspectives, turn inputs into actions, and uphold trust through integrity.
**Skills And Competencies**
- Ph.D. in Computer Science, Machine Learning, Natural Language Processing, Statistics, or a related quantitative field; or Master’s degree with 2-3 years of experience in machine learning evaluation or a related area
- Strong foundations in statistical methods, experimental design, and hypothesis testing
- Experience evaluating machine learning or NLP models, including designing experiments and interpreting results
- Familiarity with LLM evaluation benchmarks and methodologies
- Strong programming skills in Python or R
- Excellent communication skills in English (both written and verbal)
**Preferred**
- Experience evaluating LLMs or generative AI systems
- Experience with production machine learning systems
- Exposure to cloud platforms such as AWS, GCP, or Azure
- Publications or demonstrated work in model evaluation, benchmarking, or related areas
**Responsibilities**
- Evaluate and validate large language models for production-grade analytical and decision-support systems
- Design and implement evaluation frameworks for assessing LLM performance in credit analytics and decision-support contexts
- Develop metrics and benchmarks to measure model robustness, reliability, consistency, and output quality
- Analyze model behavior across diverse inputs, identifying failure modes, edge cases, and areas for improvement
- Collaborate with model development and deployment teams to integrate validation processes into the model lifecycle
- Conduct systematic assessments of model stability over time and across updates
- Evaluate model outputs for bias, fairness, and economic relevance to credit risk applications
- Develop and maintain documentation for evaluation methodologies, findings, and recommendations
- Contribute to the advancement of best practices for LLM evaluation within the Credit COE