AWS MLOps Engineer
Job description
Company Description
Version 1 has celebrated 30 years in business and continues to be trusted by global brands to deliver technology and transformation solutions that drive customer success. Our deep expertise enables our customers to navigate the rapidly evolving technology landscape. We foster strong partnerships with global technology leaders, including Microsoft, AWS, Oracle, Red Hat, OutSystems, and Snowflake, ensuring that our customers receive the highest-quality solutions and services.
We’re an award-winning employer, reflecting how our employees are at the very heart of what we do:
- UK & Ireland's premier AWS, Microsoft & Oracle partner
- Recognised in the UK & Ireland by GPTW
We’re a company driven by our core values. We hire people who share those values, and we reward those who display and foster them; this is deeply embedded in our DNA. Invest in us and we’ll invest in you.
Job Description
What You Will Do
- Architect and implement scalable infrastructure for AI and ML workloads (training, evaluation, inference).
- Design and operate Kubernetes-based platforms for multi-tenant, production AI systems.
- Build and refine MLOps pipelines covering model versioning, experiment tracking, CI/CD, deployment, monitoring, and rollback.
- Establish DevOps best practices across infrastructure, application, and ML layers.
- Lead security-first infrastructure design (access control, secrets management, isolation, observability, auditability).
- Deploy and operate enterprise-grade production systems with strong uptime and reliability standards.
- Leverage modern AI coding agents and developer copilots to accelerate engineering workflows.
- Partner with ML engineers and application teams to translate research and product requirements into scalable infrastructure capabilities.
Qualifications
- 8-12 years of experience in infrastructure, platform engineering, or distributed systems.
- Proven experience building and operating enterprise-grade production systems.
- Deep hands-on expertise with Kubernetes in production (autoscaling, networking, upgrades, reliability patterns).
- Strong background in MLOps and ML platform lifecycle management.
- Experience with cloud platforms (AWS, GCP, or Azure) and Infrastructure-as-Code (Terraform, Pulumi, etc.).
- Practical, hands-on use of AI coding agents / AI-assisted development tools.
- Strong programming ability in Go, Python, or a similar infrastructure-oriented language.
- Experience supporting GPU workloads and large-scale training/inference.
- Familiarity with enterprise security standards (SOC2, ISO, zero-trust architectures).
- Experience building internal developer platforms serving multiple teams.
- Background supporting AI systems in regulated or high-reliability environments.
Additional Information
- Share in our success with our Quarterly Performance-Related Profit Share Scheme, where employees collectively benefit from a share of our company's profits.
- Strong career progression and mentorship coaching through our Strength in Balance & Leadership schemes, with a dedicated quarterly Pathways Career Development programme.
- Pension, Private Healthcare Cover, Life Assurance, financial advice, and an Employee Discount scheme.
And many more exciting benefits… drop us a note to find out more.