Lead Machine Learning Engineer (MLOps, PyTorch, AWS)

Technology
Capital One
Richmond, United States
€179,400 - €245,600 /year
2 weeks ago
Until 6/8/2026
Full time
On-site

Job description

Requirements

Must have:

- Bachelor's degree
- Minimum of 6 years of experience in designing and developing data-intensive solutions using distributed computing (internship experience is not applicable)
- At least 4 years of programming experience with Python, Scala, or Java
- A minimum of 2 years of experience building, scaling, and optimizing machine learning systems
- Master's or doctoral degree in computer science, electrical engineering, mathematics, or a related field (preferred)
- 3+ years of experience in creating production-ready data pipelines for machine learning models (preferred)
- 3+ years of practical experience with established machine learning frameworks such as scikit-learn, PyTorch, Dask, Spark, or TensorFlow (preferred)
- 2+ years of experience in developing high-performance, resilient, and maintainable code (preferred)
- 2+ years of experience in data collection and preparation for machine learning models (preferred)
- 2+ years of leadership experience (preferred)
- 1+ years of experience guiding teams in developing machine learning solutions using industry best practices (preferred)
- Exposure to developing and deploying machine learning solutions in public cloud environments such as AWS, Azure, or Google Cloud Platform (preferred)
- Experience in designing, implementing, and scaling complex data pipelines for machine learning models and assessing their performance (preferred)
- Demonstrated impact in the machine learning field through conference presentations, publications, blog posts, open source contributions, or patents (preferred)

Responsibilities:

- Design, construct, and deliver machine learning models and components that address real-world business challenges while collaborating with Product and Data Science teams.
- Utilize your understanding of machine learning modeling techniques to inform infrastructure decisions, including model selection, data and feature selection, training, hyperparameter tuning, and validation.
- Tackle complex issues by writing and testing application code, developing and validating machine learning models, and automating testing and deployment processes.
- Work within a cross-functional Agile team to create and enhance software that supports advanced big data and machine learning applications.
- Retrain, maintain, and oversee production models.
- Utilize or develop cloud-based architectures, technologies, or platforms to deliver optimized machine learning models at scale.
- Create efficient data pipelines to feed machine learning models.
- Apply continuous integration and continuous deployment practices, including test automation and monitoring, to ensure successful deployment of machine learning models and application code.
- Ensure code is well-managed to minimize vulnerabilities, maintain proper governance of models from a risk standpoint, and adhere to best practices in responsible and explainable AI.
- Use programming languages such as Python, Scala, or Java.

Company:

We are Capital One, passionate about transforming our vision for AI into reality.

Our Intelligent Foundations and Experiences (IFX) team collaborates closely with partners throughout the organization to push the boundaries in AI science and engineering. We build and implement proprietary solutions central to our business, delivering substantial value to millions of customers. Our AI models and platforms empower teams across Capital One, allowing them to elevate their products with AI's transformative potential in a responsible and scalable manner.

We offer a comprehensive and competitive benefits package that supports the holistic well-being of our team members. Learn more about our culture and career opportunities on the Capital One Careers website.

Keywords
TensorFlow, PyTorch, Scala, Scikit-learn, Apache Spark, Dask, Python, Java, Big data
