Senior Site Reliability Engineer

Company: NVIDIA

Location: US, WA, Seattle

Commitment: Full time

Posted on: 2023-09-08 06:01

NVIDIA has continuously reinvented itself over the past several decades. It’s a unique legacy of innovation that’s motivated by outstanding technology and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. NVIDIA is at the forefront of generative AI models, from language to images. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, encouraging environment where everyone is inspired to do their best work.

What you’ll be doing:

As a Senior Site Reliability Engineer, you will be responsible for provisioning and operating our high-availability systems that provide automated control, monitoring, and alerting at our production data centers. Your duties will include:

- Ensuring high levels of systems reliability and availability in a global enterprise data center setting.
- Identifying, resolving, and analyzing system issues.
- Optimizing system performance and ensuring security.
- Providing systems support and administration in TCP/IP network environments, including both IPv4 and IPv6.
- Implementing remote access capabilities for out-of-band administration.
- Automating provisioning systems for rapid and consistent deployments.
- Collaborating with development teams to improve efficiencies.

What we need to see:

- Strong understanding of and hands-on experience with server administration in production data center environments.
- Proven track record of technical leadership in site reliability.
- Excellent problem-solving skills and an ability to thrive under pressure.
- Strong written and verbal communication skills.
- Bachelor’s degree in Computer Science, Information Systems, or a related field, or equivalent experience.
- 5+ years of experience in systems administration, particularly with Linux systems.

Ways to stand out from the crowd:

- Certifications such as the Red Hat Certified System Administrator (RHCSA), CompTIA Linux+, or Linux Professional Institute Certification (LPIC).
- Experience with cloud services like AWS, Google Cloud, or Microsoft Azure.
- Knowledge of or experience with containerization technologies like Docker and Kubernetes, and VM technologies like KVM.
- Familiarity with infrastructure as code (IaC) tools such as Ansible, Terraform, or Puppet.
- Understanding of and experience with security standards for networking environments.
- Experience in automating routine system administration tasks.
- Active participation in relevant professional or open-source communities.

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you’re creative and autonomous, we want to hear from you.

The base salary range is $118,400 - $224,250. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
