Senior Systems Engineer, Site Reliability

Company: NVIDIA

Location: US, CA, Santa Clara

Commitment: Full time

Posted on: 2023-11-08 05:01

NVIDIA is seeking a highly motivated Site Reliability Engineer to join our Tegra Solution Engineering team, which develops the hardware and software systems that power the DRIVE Sim and DRIVE Constellation platforms. NVIDIA DRIVE Sim and DRIVE Constellation together provide autonomous vehicle developers with a digital testing environment to craft, develop, deploy, and validate autonomous vehicle applications. Our Site Reliability team focuses on improving the reliability of our platform. We do this by diligently measuring the customer experience, tracing it to the health of our platform, actively responding to outages, and collaborating with our internal partners and external customers for continuous improvement.

As a DRIVE Sim and DRIVE Constellation SRE, you will work with the DRIVE Constellation platform team to architect deployment plans for customers' digital testing platforms in the cloud. You will set the best practices, and choose and build the tools and automation that improve the reliability of the platform. You will drive roadmaps. On the NVIDIA Tegra Solution Engineering team, we expect everyone to be highly autonomous, a phenomenal teammate, and uniquely passionate about the mission. We self-organize and swarm as needed, and everyone is here to do their life’s work.

What you'll be doing:

- Collaborate with customers to define digital test architecture based on the NVIDIA DRIVE Constellation platform
- Partner with the Constellation Platform team and SRE leadership to understand customer requirements and translate those into deployment plans
- Lead and maintain physical servers, switches, and storage devices in lab and data-center environments
- Install and provision new hardware and software for Linux-based systems (Ubuntu)
- Automate configuration management, software updates, and maintenance of system availability using modern DevOps tools (Ansible, GitLab, etc.)
- Guide and provide technical support for production system deployments
- Plan and maintain new systems that support the NVIDIA DRIVE Sim and Automated Driving Software stacks
- Work directly with software engineers and hardware architects to debug issues, identify new requirements, and improve workflows
- Build, deploy, and provide production support for any services you work on
- Diagnose and tackle hardware, network, and software issues

What we need to see:

- BS+ in a computer-related field or equivalent experience
- 3+ years of experience
- Demonstrated ability to script in bash and at least one high-level language (Python preferred)
- Experience working with Linux servers and technologies such as Ansible, Git, and Docker
- Deep understanding of operating systems, computer networks, and high-performance applications
- Proficient verbal and written communication skills

Ways to stand out from the crowd:

- Passionate dedication to providing quality support for your users
- Experience maintaining cloud infrastructure applications
- Outstanding teamwork skills across interpersonal boundaries
- Experience with computer algorithms and the ability to choose the best possible algorithms to meet the scaling challenge

The base salary range is 144,000 USD - 224,250 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits.

NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
