System Software Manager, Platform Tools

Company: NVIDIA
Location: US, CA, Santa Clara
Commitment: Full time
Posted on: 2024-03-06 05:05
NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI, the next era of computing, with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, we are increasingly known as “the AI computing company.” We're looking to grow our company and establish teams with the most thoughtful people in the world.

NVIDIA DGX, HGX, and MGX servers deliver the world's leading solutions for enterprise AI infrastructure at scale. Enterprises need a computing infrastructure that can be easily managed in a data center. We are the Server Platform Software Tools team at NVIDIA. We deliver infrastructure and tools for server readiness for data center deployment, Reliability, Availability, and Serviceability (RAS), firmware package deployment, and server manageability.

We are looking for a talented and experienced manager with a background in platform software, tools, and infrastructure. In this role, you will make an impact by releasing the world’s best resilient GPU- and Grace CPU-based servers, ensuring high uptime in data centers. This is a highly visible role at NVIDIA, responsible for guaranteeing high-quality RAS and firmware management features for NVIDIA scale-out solutions built around NVLink products. The role requires you to work across the server ecosystem, including ODMs and OEMs, to ensure they are building servers the right way, and to work closely with internal cross-functional teams, including hardware engineers, system architects, and software developers. Join us at the forefront of technological advancement.

What you’ll be doing:
Lead, mentor, and grow the Server Platform Software Tools engineering team, and be responsible for the planning, execution, performance, and quality of its projects.
This is a technical leadership role, so you will participate in the feature design and implementation of tools for server manageability.
Interact with internal and external partners, including ODMs and OEMs, to understand their use cases and requirements.
Collaborate with engineering teams, program and product management, and partners to define the product roadmap around server manageability.
Continuously review established processes, infrastructure, and practices and identify improvement opportunities, so that the teams execute in the most efficient and visible manner.
Own the quality of server firmware for NVIDIA products across the whole ecosystem, and strive to make it the best for data center deployment.

What we need to see:
8+ years of overall experience in the software industry, with specialization in system software and/or firmware development.
2+ years of management experience.
BS, MS, or Ph.D. in CS, CE, EE, or a related technical field, or equivalent experience.
Prior systems software or firmware development experience, with a successful track record of taking several complex software features or products through the full product life cycle.
Strong understanding of server architecture, systems software fundamentals, HW-SW interactions, and performance analysis and optimization.
Excellent Python programming and debugging skills in Linux.
Experience balancing multiple projects with competing priorities.
Flexibility to work and communicate effectively across different teams and time zones.

Ways to stand out from the crowd:
Familiarity with the architecture of data center server software and experience with in-band and out-of-band management of firmware and hardware components.
Understanding of the REST architectural style, especially JSON over HTTPS with OAuth.

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people on the planet working for us.
If you're creative and autonomous, we want to hear from you!

The base salary range is 216,000 USD - 333,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and is proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.