NVIDIA is looking for a Linux system administrator to join the E2E software verification HPC/AI Infrastructure team. We are focused on building supercomputers and HPC clusters based on groundbreaking technologies. In this role you will be a key player working with the most exciting computing hardware and software, contributing to the latest breakthroughs in artificial intelligence and GPU computing. You will take part in building large-scale compute and Deep Learning software and hardware platforms, and work together with and support many scientific researchers, developers, and customers to craft improved workflows and develop new, leading, differentiated solutions.

What you will be doing:
- Install a variety of infrastructure and solutions: cloud, VMs, storage, network, HPC, and AI
- Handle daily operation and support of data centers and labs
- Plan and build complex clusters and supercomputers
- Deploy monitoring solutions for servers, network, and storage
- Troubleshoot from the bottom up: bare metal, operating system, software stack, and application level
- Support Research & Development activities

What we need to see:
- MCSE or MCITP/CCNA certification
- Over 3 years of experience in Linux system administration
- Experience supporting large and complex data centers
- Proven hands-on experience in Linux troubleshooting, with good problem identification and resolution skills
- In-depth knowledge of Linux and Windows core services: DHCP, DNS, NIS, AD, etc.
- Team player, service-oriented, and organized

Ways to stand out from the crowd:
- Scripting experience in Bash and/or Python
- Experience with widely used configuration management tools (e.g. Ansible, Puppet)
- Experience with CI and job scheduler tools (e.g. Jenkins, SLURM)
- Virtualization: KVM / VMware / Hyper-V
- Experience with L2 and L3 network protocols