We are looking for a Senior Software Engineer to build a scalable messaging infrastructure for NVIDIA DGX Cloud. You will be a member of a team that builds the DGX Cloud control plane serving AI/ML workloads, and you will work closely with a variety of teams and architects, including the Data, Storage, and Service platform teams. You will define the software architecture and implementation of the scalable messaging platform for cloud services. We have crafted a team of extraordinary people stretching around the globe, whose mission is to push the frontiers of what is possible today and define the platform of tomorrow.

At NVIDIA, we work, think, and learn as a team. We thrive in a strong team environment, and we're passionate about a culture that demands innovation and the highest standards. The rewards are sweet and include collaborating with some of the smartest people in the industry, an aggressive compensation plan that rewards top performers, and the opportunity to work on products that transform the way people work and play.

What you’ll be doing:
Platform Architecture: Lead the design and architecture of the cloud-scale messaging platform, drawing inspiration from Kafka and similar technologies, ensuring it meets our scalability, reliability, and performance requirements.
Development: Develop and maintain core components of the messaging platform, including message brokers, stream processors, and related infrastructure. Write high-quality, maintainable code in collaboration with the engineering team.
Scalability: Design, implement, and optimize the platform to handle high-throughput, low-latency message processing across a distributed and highly available environment.
Monitoring and Optimization: Implement robust monitoring and alerting systems to proactively identify and address performance bottlenecks, system failures, and other issues. Continuously optimize the platform for improved efficiency.
Security: Ensure the platform adheres to security best practices, including data encryption, access control, and authentication mechanisms.
Integration: Collaborate with other engineering teams to integrate the messaging platform with various applications, services, and microservices within the organization.
Documentation: Create comprehensive documentation for the platform, including architecture, configurations, and usage guidelines.
Testing: Develop and maintain a comprehensive testing strategy, including unit testing, integration testing, and load testing, to ensure the reliability and stability of the platform.
Troubleshooting: Lead efforts in diagnosing and resolving complex system issues, including performance tuning, debugging, and root cause analysis.
Innovation: Stay up to date with emerging technologies and industry trends, and propose innovative solutions to improve the messaging platform.

What we need to see:
B.Sc., M.Sc., or Ph.D. in Computer Science or a related discipline, or equivalent experience.
12+ years of proven experience in streaming services, messaging platforms, or the Kafka ecosystem.
Excellent programming skills in Java, Golang, C/C++, or a related language.
Previous experience with Kafka, data pipelines, and cloud-based data streaming platforms.
Strong knowledge of cloud platforms (e.g., AWS, Azure, or GCP) and containerization technologies (Docker, Kubernetes).
Experience with monitoring and alerting tools, such as Prometheus and Grafana.
Knowledge of security best practices in the context of distributed systems.
Ability to quickly adapt to new technology and go deep into new areas.
Ability to work independently.
Strong communication and collaboration skills.
Ability to work with customers and partners.
Ability to drive new solutions based on any issues that arise.

NVIDIA has continuously reinvented itself over two decades.
NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI, the next era of computing, with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. This is our life’s work: to amplify human imagination and intelligence.

With highly competitive salaries and a comprehensive benefits package, NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hard-working people in the world working for us and, due to extraordinary growth, our elite engineering teams are growing fast. If you're a creative and autonomous engineer with a sincere passion for technology, we want to hear from you.

The base salary range is 216,000 USD - 414,000 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits.

NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and is proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.