We are now looking for a Senior Distributed Systems Software Engineer to work on Triton Inference Server! NVIDIA is hiring an expert distributed software engineer for its GPU-accelerated deep learning software team. Academic and commercial groups around the world are using GPUs to power a revolution in deep learning, enabling breakthroughs in problems from image classification to speech recognition to natural language processing. We are a fast-paced team building tools and software to make the design and deployment of new deep learning models easier and accessible to more data scientists.

What you'll be doing:

In this role, you will design and build features, APIs, and services for the next generation of Triton Inference Server. You will be a team leader and an active member of the open source deep learning software engineering community. You will balance a variety of objectives: design scalable software that can be deployed in production servers or cloud settings; implement multi-GPU solutions; understand new customer use cases; work with product teams to define new capabilities; load-balance asynchronous requests across available resources; and integrate the latest open source and NVIDIA technology. As a team lead, you will set project priorities, mentor junior engineers, and work with customers, partners, and management to align key features with monthly releases.

What we need to see:

- Master's or PhD, or equivalent experience, in Computer Science, computer architecture, or a related field
- 8+ years of software engineering experience, including 3+ years of distributed systems programming
- Expertise in distributed systems programming, including controller-worker and client-server architectures, distributed memory management, and tightly coupled distributed systems
- Ability to work in a fast-paced, agile team environment
- Excellent C++ programming and software design skills, including debugging, performance analysis, and test design; Python experience is also helpful

Ways to stand out from the crowd:

- Excellent troubleshooting abilities spanning multiple software layers (storage systems, kernels, and containers)
- Background in deep learning algorithms and frameworks, especially experience with Large Language Models and frameworks such as PyTorch, TensorFlow, TensorRT, and ONNX Runtime
- Experience contributing to a large open source project: use of GitHub, bug tracking, branching and merging code, handling OSS licensing issues and patches, etc.
- Experience building and deploying cloud services using HTTP REST, gRPC, protobuf, JSON, and related technologies
- Experience with container technologies such as Docker and container orchestrators such as Kubernetes

NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most experienced and passionate people in the world working for us. Are you creative and autonomous? Do you love a challenge? If so, we want to hear from you. Come help us build the real-time, efficient computing platform driving our success in the dynamic and quickly growing field of Deep Learning and Artificial Intelligence.

The base salary range is $176,000 - $333,500. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits.

NVIDIA is committed to fostering a diverse work environment and is proud to be an equal opportunity employer.
As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.