Senior System Reliability Engineer

Company: NVIDIA
Location: US, CA, Santa Clara
Commitment: Full time
Posted on: 2023-09-08 06:00
NVIDIA has continuously reinvented itself over two decades. Our invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing, with the GPU acting as the brains of computers, robots, and self-driving cars that can perceive and understand the world. Today, we are increasingly known as “the AI computing company.” We're looking to grow our company and build our teams with the most thoughtful people in the world. Join us at the forefront of technological advancement.

GPU servers are one of the fastest-growing segments for NVIDIA and the artificial intelligence industry. As computational power increases with every GPU generation, developing efficient and reliable systems is imperative. We are looking for a System Reliability Engineer to join NVIDIA's existing Reliability Engineering team, which is involved in NVIDIA's diverse system product range, specifically graphics and high-performance computing printed circuit boards and data center servers.

What you'll be doing:
- Establish, deliver, and maintain product reliability standards and metrics for NVIDIA's new system technologies, using existing tools and processes or developing new ones as required.
- Participate in product and engineering design reviews, assess the reliability budget of products and designs, and inspire changes that enhance product reliability.
- Participate in the definition of test and diagnostic tools, and apply stress loadings in test automation.
- Conduct system- and board-level test, debug, and validation of NVIDIA products such as GPU cards and data center servers.
- Collect and analyze test data to improve product reliability.
- Work with all pertinent engineering groups, suppliers, and partners to ensure the desired reliability is achieved using Design for Reliability (DfR) methods, including FMEA and DoE approaches.
- Perform and lead appropriate testing, with associated failure analysis and recommendations for improving designs and manufacturing.
- Develop and present methods of correlating reliability test results with actual field performance.

What we need to see:
- BS in EE/Computer Engineering, or equivalent experience.
- 5+ years in a hardware validation/reliability environment related to PCIe peripherals, graphics cards, and servers.
- Proficiency with programming and scripting languages and environments (Python, SQL, Linux, Ubuntu) and data analysis tools (JMP, MS Access, Excel, Tableau).
- Understanding of power supplies, memory, high-speed I/O, PCI Express, Ethernet, and I2C.
- Hands-on experience with theoretical and practical reliability concepts as they relate to high-tech electronic enterprise and consumer products.
- A strong command and understanding of statistical concepts, models, and analysis, and how they relate to product reliability and life analysis.
- Good verbal and written communication skills, and the ability to communicate at a high level.
- Self-motivated, independent, and committed to getting things done.
- Good project management skills and the ability to balance multiple simultaneous projects during development and production stages.

With competitive salaries and a generous benefits package, we are widely considered to be one of the technology world’s most desirable employers. If you're a creative and autonomous engineer with a real passion for technology, we want to hear from you. Come build the future with us!

The base salary range is $96,000 - $166,750. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.