Product Manager - Inference

Company: NVIDIA

Location: US, CA, Santa Clara

Commitment: Full time

Posted on: 2025-06-13 05:03

Inference is the fastest-growing and most competitive area in Generative AI today. It is where AI models impact our daily lives, and where every bit of accuracy and performance matters for quality, safety, and cost. Inference is also constantly evolving, with new acceleration algorithms, use cases, and deployment techniques. As a Product Manager for AI Platform Inference, you will be responsible for building the tools, SDKs, and libraries that enable developers' Inference deployments to thrive on NVIDIA GPUs.

As NVIDIA Product Managers, our goal is to enable developers to be successful on the NVIDIA Platform and to push the boundaries of what is possible in AI deployments! As Product Managers, we are the champions inside NVIDIA for developers looking to accelerate their deployments on GPUs. We work directly with developers inside and outside of the company to identify key improvements, create roadmaps, and stay current on the inference landscape. We also work with NVIDIA leaders to define clear product strategy, and with marketing teams to build go-to-market plans.

The Product Management organization at NVIDIA is a small, strong, and impactful group. We focus on enabling deep learning across all GPU use cases and providing great solutions for developers. We are seeking a rare blend of product skills, technical depth, and passion to make NVIDIA great for developers. Does that sound familiar? If so, we would love to hear from you!

What you'll be doing:

- Create products to help developers build better Inference deployments
- Develop product strategy, roadmaps, and go-to-market plans
- Collaborate with internal and external developers to build product-based roadmaps for model optimization software
- Work with leadership to align with and drive company strategy

What we need to see:

- Experience with Inference deployment and optimization software (e.g., vLLM, SGLang, FlashInfer, TensorRT-LLM, Triton, Dynamo, TorchAO)
- Demonstrable knowledge of GenAI or machine learning concepts, particularly around performance optimization, and of software development and delivery
- BS or MS degree in Computer Science, Computer Engineering, or a similar field (or equivalent experience)
- 5+ years of technical product management, or similar, experience at a technology company
- Strong communication and interpersonal skills

Ways to stand out from the crowd:

- Experience leading optimization products for Inference
- Working on open-source and GitHub-first developer products with deep customer interactions
- Knowledge of GPU architecture, HW/SW co-design, and performance profiling

The base salary range is 144,000 USD - 258,750 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and is proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.
