Network & Server Deployment Engineer (GPU Data Centre)
Nscale
Job description
**About Nscale**
As an Nscaler, you’ll build trust through openness and transparency, where everyone is inspired to do their best work. Collaboration is key, and we work together swiftly and respectfully, embracing adaptability and resilience in all we do.
**Network & Server Deployment Engineer**
**What You'll Be Doing**
- Designing, deploying, and operating large-scale HPC clusters and GPU-based compute environments
- Creating and maintaining hardware architectures, including BOMs, rack elevations, and reference designs
- Implementing and maintaining HPC scheduling and workload management systems (e.g., Slurm)
- Designing and optimising InfiniBand and Ethernet network topologies (Fat Tree, Dragonfly, rail-optimised configurations)
- Working with deployment teams to ensure cluster builds align with architectural specifications
- Automating provisioning, configuration, and operations of multi-vendor HPC hardware and software stacks
- Collaborating with software, infrastructure, and datacenter teams to ensure seamless integration of HPC environments
- Troubleshooting and tuning cluster performance across compute, storage, and interconnect layers
**About You**
- Proven experience in designing, deploying, and operating HPC or large-scale compute clusters
- Strong knowledge of Slurm or similar workload management systems (e.g., PBS, LSF)
- Proven experience in InfiniBand networking design and operations, including subnet management, QoS, RDMA, and performance tuning
- Experience with high-speed Ethernet networks and associated protocols (e.g., VLAN, LACP, BGP, OSPF, EVPN, VXLAN)
- Familiarity with HPC network topologies such as Fat Tree or Dragonfly
- Experience creating hardware BOMs, rack layouts, and reference architectures for compute deployments
- Strong scripting skills in Python and/or Bash for automation and orchestration
- Solid understanding of optics, cabling, and physical layer design considerations for HPC and GPU cluster environments
- Strong analytical, troubleshooting, and documentation skills
- A collaborative mindset and passion for building high-performance, scalable infrastructure
**In All We Do, Our Core Values Guide Us**
**Relentless Innovation**
At Nscale, we constantly push the boundaries of innovation, embracing creative risks to shape the future. Our aim is to deliver products that not only meet but exceed today’s expectations, setting new standards for tomorrow.
**Ownership and Accountability**
**Openness and Transparency**
**Customer-Centric Focus**
**Sustainability**
**Full-Speed Collaboration**
**Equal Opportunities Statement**
If there’s anything we can do to accommodate your specific situation, please let us know.
The responsibilities outlined in this job description are not exhaustive and are intended to provide a general overview of the position. The employee may be required to perform additional duties, tasks, and responsibilities as assigned by management, consistent with the skills and qualifications required for the role.
For information on how Nscale handles candidate personal data, please see our Employee & Candidate Privacy Notice: Here.