Senior DL Performance Infrastructure and MLOps Engineer

Company: NVIDIA
Location: US, CA, Santa Clara
Commitment: Full time
Posted on: 2023-10-28 18:40

We are now looking for a Senior DL Performance Infrastructure & MLOps Engineer:

NVIDIA is seeking engineers who love building world-class infrastructure, from automated command-line scripting to full-blown CI/CD systems running on some of the world's largest clusters, to support our work to accelerate training of deep neural networks like Stable Diffusion or ChatGPT via hardware and software innovations. If you have that itch whenever the mechanical aspects of code development, performance analysis, and data processing consume any more human time than necessary, we'd like to hear from you. If you are passionate about accelerating all existing workflows in a diverse team while also envisioning next-gen opportunities to enable new forms of hardware/software analysis and development we haven't even thought of, this is the place for you.

What you'll be doing:
- Improve all tooling and automation in use in the team, from simple data collection scripts to datacenter-scale ML CI/CD systems.
- Understand and internalize workflows for GPU performance analysis and optimization so you can help us re-invent them.
- Build Python-based machinery hooking into common Deep Learning software like PyTorch or JAX to support performance analysis work.
- Ruthlessly discover and chase down workflow- and tool-related inefficiencies in the team's daily work, and dream up and implement ways to eliminate them.

What we need to see:
- MS degree in CS or adjacent fields, or equivalent experience
- 3+ years of relevant work experience
- Background in deep learning fundamentals and common deep learning software, especially PyTorch/JAX
- Experience in GPU computing, i.e. fundamental understanding of heterogeneous multi-node accelerated computing systems
- Background in analyzing and optimizing application performance
- Familiarity with containerized CI/CD flows, e.g. GitLab + Docker
- Programming skills in C++, Python, and CUDA
- Deep passion for tools, scripts, and automation

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. Are you creative and autonomous? Do you love a challenge? If so, we want to hear from you! Come, join our DL Architecture team and help build the real-time, cost-effective AI computing platform driving our success in this exciting and quickly growing field.

The base salary range is $144,000 - $270,250. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.