Linux HPC/NVIDIA Admin
Location: US-VA-Ashburn
Jobcode: 34028

Linux Administrator

Ashburn, VA, Onsite

Contract 12+ months

About Our Client

Our client is a leading technology company specializing in high-performance computing solutions.

With a focus on cutting-edge GPU-based architectures and containerized applications, they are at the forefront of innovation in the HPC industry. The company values technical excellence, continuous learning, and collaborative problem-solving. Their mission is to provide state-of-the-art computing solutions that empower businesses and researchers to tackle the most complex computational challenges.

Job Description

We are seeking a highly skilled and experienced Linux Administrator to join our client's IT team.

In this role, you will be responsible for managing and maintaining Linux server infrastructure, with a focus on GPU-based HPC systems and NVIDIA architectures.

You will ensure optimal performance, security, and reliability of these advanced computing environments. This position offers an exciting opportunity to work with cutting-edge technologies and contribute to groundbreaking projects in the field of high-performance computing.

The ideal candidate is passionate about Linux systems administration, has hands-on experience with GPU-based HPC environments, and is comfortable working in a dynamic, on-site environment.

You will play a crucial role in supporting and optimizing the client's advanced computing infrastructure.

Duties and Responsibilities

- Manage and maintain Linux-based servers and HPC systems, particularly those based on NVIDIA DGX architectures

- Install, configure, and optimize GPU-based HPC environments

- Troubleshoot complex issues related to Linux systems, including hardware failures, network problems, and software conflicts

- Implement and manage security policies for HPC environments

- Develop and maintain scripts and automation tools for system monitoring, backups, and deployments

- Monitor system performance and conduct performance tuning for optimal efficiency

- Design and implement disaster recovery plans and backup strategies

- Create and maintain comprehensive documentation for system configurations and procedures

- Collaborate with other IT team members, developers, and stakeholders on projects

- Manage system upgrades and patches, ensuring minimal disruption to operations

- Work with containerized applications and optimize their performance in HPC environments

Required Experience/Skills

- Minimum of 5 years of experience as a Linux Administrator or in a similar role

- Extensive hands-on experience managing large-scale Linux-based environments

- Strong knowledge of GPU-based HPC architectures, particularly NVIDIA technologies

- Experience working with containerized applications in HPC environments

- Recent hands-on experience as a system administrator

- Deep understanding of Linux distributions (e.g., Red Hat, CentOS, Ubuntu)

- Proficiency in scripting and automation tools (e.g., Bash, Python, Ansible)

- Knowledge of networking concepts and protocols relevant to HPC environments

- Familiarity with configuration management tools (e.g., Puppet, Chef)

- Excellent problem-solving and communication skills

- Ability to work effectively in a team environment and handle multiple priorities

Nice-to-Haves

- NVIDIA training or certifications

- Experience with cloud platforms (e.g., AWS, Azure) for HPC workloads

- Knowledge of InfiniBand networking

- Familiarity with job scheduling systems for HPC (e.g., Slurm, PBS)

- Experience with parallel file systems (e.g., Lustre, BeeGFS)

Education

- Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent work experience

- Relevant certifications such as RHCE (Red Hat Certified Engineer), CompTIA Linux+, or LPIC (Linux Professional Institute Certification) are highly desirable

Additional Requirements

- Must be comfortable working on-site at the client's location in Ashburn, VA

- May be required to participate in on-call rotations

- Willingness to continuously learn and adapt to new technologies in the HPC space

Ready to push the boundaries of high-performance computing? Join our team of skilled professionals and help shape the future of GPU-based HPC solutions! Apply now to become part of a company that's revolutionizing the world of advanced computing.

Eric Krisher
Catapult Solutions Group
Phone: 000
