Senior DevOps Engineer

Company: NVIDIA
Location: India, Bengaluru
Commitment: Full time
Posted on: 2023-10-28 18:37
NVIDIA is searching for a highly motivated software engineer for the NVIDIA NetQ team, which is building a next-generation network management and telemetry system in the cloud and on-prem using modern design principles at internet scale. NVIDIA NetQ is a highly scalable, modern network operations toolset that provides visibility, troubleshooting, validation and telemetry of NVLink/NVSwitch and Ethernet fabrics in real time. NetQ utilizes telemetry and delivers actionable insights about the health of a data center network, integrating the fabric into the DevOps ecosystem.

What you'll be doing:

- Joining the NVIDIA NetQ team that is building the SaaS platform and the on-premise solution for network management and telemetry.
- Owning the DevOps, infrastructure and Site Reliability Engineering requirements for NetQ.
- Focusing on efficiency by automating repetitive workflows.
- Working on a microservices-based architecture.
- Deploying and troubleshooting non-disruptive cloud operations with an emphasis on secure production infrastructure.
- Continuously evaluating existing systems and driving improvements.
- Managing deployments and upgrades for operating systems and Kubernetes clusters.
- Providing day-to-day support for engineering activities with CI/CD tools like Git and Jenkins.
- Multi-tasking across different tracks to efficiently address evolving priorities.

What we need to see:

- 5+ years of experience in complex microservices-based architectures.
- Highly skilled in Kubernetes and Docker.
- A good programming background in one high-level language like Groovy or Python.
- Strong knowledge of NoSQL databases (preferably Cassandra), Kafka/Kafka Streams and Nginx.
- Experience with modern deployment architecture for non-disruptive cloud operations, including blue-green and canary rollouts.
- Automation expert with hands-on skills in frameworks like Ansible and Terraform.
- Expert in AWS.
- Knows the best practices and discipline of managing a highly available and secure production infrastructure.

Ways to stand out from the crowd:

- Skills in Linux/Unix administration.
- Experience with Prometheus/Grafana.
- Experience with APM tools like Dynatrace, Datadog, AppDynamics, New Relic, etc.
- Implemented highly scalable log aggregation systems in the past using the ELK stack or similar.
- Implemented robust metrics collection and alerting infrastructure.

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people on the planet working for us. If you're creative, passionate and self-motivated, we want to hear from you!

NVIDIA is leading the way in ground-breaking developments in Artificial Intelligence, High-Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services.