
HPC Systems Engineer

Technology
SAIC
Charlottesville, United States
1 week ago
Until 6/9/2026
Full time
On-site

Job description

Requirements

Must have:

- Bachelor's degree in a science or technology field; 10 years of experience can substitute for the degree
- Minimum of 8 years of experience in relevant positions
- At least 6 years of experience managing Linux systems in enterprise, research computing, or distributed computing settings
- Active Top Secret clearance required; must obtain TS/SCI clearance before starting work
- Ability to work 100% onsite in Charlottesville, VA
- Background in supporting distributed computing environments or HPC cluster systems
- Familiarity with workload schedulers such as Slurm, PBS, Torque, or equivalent
- Proficiency in Linux systems administration using command-line interfaces
- Experience with scripting or automation tools (Bash, Python, etc.)
- Ability to achieve the required DoD 8140 (8570) IAT Level II certification
- Direct knowledge of HPC or distributed computing environments is mandatory

Preferred qualifications include:

- Administration experience in multi-node HPC cluster setups
- Proficiency with parallel or distributed file systems like Lustre, BeeGFS, or GPFS
- Experience with GPU-enabled computing environments and CUDA workloads
- Familiarity with configuration management tools such as Ansible or Puppet
- Background in supporting research, laboratory, or mission-critical computing environments
- Prior work within DoD/DoW or Intelligence Community settings

Responsibilities:

- Assist in the development, configuration, and maintenance of HPC cluster platforms
- Oversee cluster platform configuration and administration of workload scheduling
- Troubleshoot issues related to distributed computing and cluster environments
- Conduct performance assessments across compute, storage, and network layers
- Provide support for GPU computing workloads
- Implement automation and develop operational tools to streamline processes

Company:

At SAIC, we are a leading technology integrator, delivering comprehensive life cycle services and solutions across technical, engineering, intelligence, and enterprise information technology sectors. Our commitment to redefining ingenuity stems from our extensive customer and domain expertise, enabling us to provide robust systems engineering and integration services for large, complex projects. With approximately 15,000 dedicated employees, we prioritize integrity and mission success in serving our clients within the U.S. federal government. Our headquarters is located in Reston, Virginia, and we generate annual revenues of about $4.5 billion. For more details on what we offer, please visit our website.

Keywords
CUDA, Slurm Workload Manager, GNU parallel, Lustre, Dirac, Linux, Node.js, Cluster analysis, Puppet, Python, Data cluster, Node
