Senior Manager, AI for IT Operations Management - AIOps

Company: NVIDIA
Location: US, CA, Santa Clara
Commitment: Full time
Posted on: 2023-09-08 05:59

NVIDIA has been innovating computer graphics, PC gaming, and accelerated computing for more than 25 years. Today, we’re tapping into the unlimited potential of generative AI to define the next era of computing. An era in which accelerated computing is powered by our GPUs, and generative AI foundational models for the enterprise. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

We are looking for a strong and experienced IT infrastructure engineering manager to lead the development of IT operations management (ITOM) solutions, including IT asset management, configuration management, application performance monitoring (APM), and IT incident and change management, and to drive AI/machine learning (ML) and generative-AI-based innovations into ITOM (AIOps) at NVIDIA IT. You will combine your strong IT infrastructure software product development experience with Large Language Models (LLMs) and traditional ML to rethink ITOM and build next-generation AIOps (AI for IT Operations Management) solutions.

What you'll be doing:
- Manage the asset management, configuration management, IT infrastructure monitoring, and incident and change management projects and team at NVIDIA IT.
- Set the technical vision and roadmap, and drive AI/ML and generative-AI-based innovation into IT operations management (AIOps).
- Work with business partners to identify IT operations process inefficiencies, find opportunities for AI-driven and generative-AI-driven automation, define the vision, and drive the development of solutions that improve IT operations management business metrics such as mean time to detect (MTTD) and mean time to resolve (MTTR).
- Perform build vs. buy tradeoffs and accelerate AI-driven transformation for the NVIDIA enterprise in the ITOM space.
- Develop AI-driven and generative-AI-driven analytical solutions for monitoring and managing the entire IT estate, including IT asset management, configuration management, application performance monitoring (APM), alert noise reduction, and incident detection, diagnosis, resolution, and avoidance.
- Lead the team effectively to improve metrics such as mean time to detect, diagnose, and repair IT incidents.
- Lead a mix of infrastructure, full-stack, and AI/ML engineers. Inspire the team with a sound technical vision and an urgency to execute by creating an inclusive team culture with a growth mindset.
- Exercise technical judgment, anticipate bottlenecks, bring up issues early, make architectural trade-offs, and balance business needs versus technical constraints.

What we need to see:
- Advanced graduate degree in Computer Science, Machine Learning/Data Science, Information Systems, other equivalent fields, or equivalent experience.
- Minimum of 5 years of experience developing solutions in the IT Operations Management space as an engineering manager.
- 5+ years of demonstrated ability leading AI/ML and software development teams as an engineering manager.
- Minimum of 10 years overall as a hands-on engineer developing software products using Agile methodology, including at least 6 years building production solutions with AI/ML.
- Deep hands-on experience in machine learning and artificial intelligence. Is up to speed on generative-AI advancements and can leverage generative-AI technologies in building solutions for ITOM problems.
- Can set teams up to focus and deliver with speed and quality. Ability to craft good metrics and hypotheses, and proven ability to deliver on business metrics.
- Strong people management and team-building skills. Can mentor and grow talent, cultivate a healthy engineering environment, and attract and retain talent. Ability to build a diverse, broad, and impactful team.
- Ability to foster collaboration among teams composed of both technical and non-technical members.
- Effective communication skills, proven problem-solving skills, and strong leadership.

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative, eager to tackle hard business problems, and enjoy having fun, then what are you waiting for? Apply today!

The base salary range is $196,000 - $402,500. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits.

NVIDIA is committed to fostering a diverse work environment and is proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.