Senior DevOps Engineer

Technology
IREN
Vancouver, Canada
Posted 1 month ago
Until 2026-05-14
Full time

Job Description

IREN is a leading AI Cloud Service Provider, delivering large-scale GPU clusters for AI training and inference. IREN’s vertically integrated platform is underpinned by its expansive portfolio of grid-connected land and data centers in renewable-rich regions across the U.S. and Canada.

With 100% renewable energy, we build, own and operate our data centers and take pride in being at the forefront of sustainable solutions for the ever-evolving applications of high-performance compute. We believe that human progress is invaluable, but it should be done in the right way – responsibly, sustainably and with a positive impact on the communities we operate in.

As a Senior DevOps Engineer, you will help build and operate the foundational systems that power our GPU-enabled multi-tenant environment.

You will design and maintain the infrastructure, automation, and platform abstractions that enable product teams to deliver reliable, scalable, and high-performance services across our cloud and on-prem footprint.

Job Requirements

Technical Skills

  • 5+ years of experience in Platform Engineering, DevOps, Cloud Infrastructure, or SRE roles.
  • Proficient in Python or Go for platform tooling, automation, and API integration.
  • Hands-on experience with Kubernetes (operations, networking, storage, RBAC, and multi-tenant considerations).
  • Strong background in AWS services including EC2, EKS/ECS, S3, IAM, VPC, RDS.
  • Expertise with Infrastructure-as-Code (Terraform or CloudFormation) and GitOps workflows (ArgoCD).
  • Familiarity with service mesh, ingress controllers, and CNI network stacks.
  • Strong understanding of observability: Prometheus (metrics/alerting), Grafana (dashboards), and logs.
  • Familiarity with event-driven systems (Kafka basics: topics, partitions, consumer groups).
  • Experience with secure configuration of cloud/K8s systems (RBAC, SSO integrations, NetworkPolicies).
Soft Skills & Competencies
  • Clear and structured communicator with cross-functional teams.
  • Strong problem-solving skills — able to debug complex distributed systems.
  • Pragmatic engineer who balances long-term quality with short-term delivery.
  • Collaborative team member comfortable working across Platform, Infrastructure, and Product domains.
Bonus Qualifications
  • Experience running on-prem Kubernetes or hybrid cloud infrastructure.
  • Familiarity with GPU workloads, node management, or multi-tenant compute isolation.
  • Hands-on experience with load testing, chaos engineering, or advanced performance tuning.
  • Exposure to internal developer platforms (IDP) or platform-as-a-product mindset.
  • Experience with Helm, Kustomize, Pulumi, or Temporal workflow engine.

Job Responsibilities

Platform & Infrastructure Engineering

  • Design, develop, and operate core platform services that support compute, networking, and storage across cloud and on-prem environments.
  • Build and maintain Infrastructure-as-Code using tools like Terraform or CloudFormation, following GitOps best practices with ArgoCD.
  • Implement and operate continuous delivery pipelines and deployment automation for platform services.
  • Improve platform scalability, reliability, and performance across Kubernetes and cloud systems.
  • Collaborate with HPC, Networking, and Security teams to ensure resilient, multi-tenant platform architectures.
Kubernetes & Cloud Operations
  • Operate and optimize Kubernetes clusters (EKS and on-prem K8s distributions).
  • Manage networking components such as CNI plugins, Ingress controllers, and Service Mesh (Envoy/Istio/NGINX).
  • Configure and maintain RBAC, Network Policies, and identity integration for secure, least-privilege access.
  • Work with cloud resources (AWS EC2, EKS, S3, RDS) to deploy, scale, and secure platform services.
Observability & Operational Excellence
  • Implement and maintain platform observability tooling using Prometheus, Grafana, Loki, and alerting pipelines.
  • Establish and maintain SLOs/SLAs, responding to incidents and continuously improving reliability.
  • Debug distributed systems issues across compute, network, storage, and application surfaces.
  • Contribute to on-call rotations and production readiness reviews.

Job Benefits

At IREN, we offer a comprehensive Total Rewards package designed to support your health, well-being, and long-term success. Our Canada package includes:

Compensation

  • The expected base salary range for this role is CAD $135,000 to $150,000 per annum.
    • Actual compensation will be determined based on factors such as experience, qualifications, and market data for the region.
  • The total compensation package may include an annual incentive bonus and equity (long-term incentive).
Health & Wellness

  • Medical, dental, and vision insurance coverage – 100% company paid for employees and dependents
  • Company-paid life and disability insurance
  • Voluntary life and critical illness coverage available
  • Employee Assistance Program and virtual health care platform
Financial Well-Being

  • RRSP with company match
  • Voluntary TFSA
Time Off & Flexibility

  • 3 weeks annually for vacation and paid holidays
Growth & Development

  • Opportunities for advancement and internal mobility
  • Training and personal development opportunities
Lifestyle & Culture

  • Company events and team-building activities
We value diverse perspectives and believe that skills can be developed. If you’re passionate about this role, we want to hear from you, whether you meet every criterion or not. Your unique experiences might be exactly what we need!

Podtech Data Centers Inc., the employing entity and proud member of the IREN Group, is an equal opportunity employer that is committed to creating an inclusive workplace. We evaluate qualified applicants without regard to race, colour, religion, age, sex, sexual orientation, gender identity, genetic information, national origin, disability, veteran status, and other legally protected characteristics.

This job will remain posted until filled. While we appreciate all applications we receive, we are only able to contact candidates under consideration.

By applying for this position and submitting your resume and application materials, you consent to the processing of your personal information in accordance with our Job Applicant Privacy Statement available on our website at www.iren.com.

