Senior DevOps Engineer

Company: NVIDIA
Location: China, Shanghai
Commitment: Full time
Posted on: 2024-03-12 05:26

We are now looking for a Senior DevOps Engineer. You will work on open-source technologies and their enterprise adoption, such as:
- GPU acceleration for Apache Spark (Spark-RAPIDS) to dramatically speed up data processing and machine learning
- A medical deep learning framework (project MONAI) that revolutionizes healthcare AI solutions worldwide
- Federated learning technology (NVFlare) that builds generalizable AI models from diverse data sources while ensuring data security and privacy

What you'll be doing:
- Serve as a technical leader in defining, designing, developing, and maintaining DevOps tools, frameworks, and platforms
- Implement, advocate, and carry out CI/CD conventions, and write tools to automate the steps involved in this process
- Develop and maintain build, deployment, and continuous integration infrastructure
- Enable the development team by providing automated build and test solutions using Docker, Kubernetes/YARN, and on-prem/CSP environments
- Work with open-source communities, including RAPIDS, Spark, MONAI, and NVFlare, on CI/CD
- Work closely with Development and QA teams to help ensure end-to-end quality
- Full-stack development opportunities, depending on the candidate's capabilities

What we need to see:
- BS or MS in Computer Science, Computer Engineering, or a closely related field
- 10+ years of working experience in software development
- 2+ years of experience with CI/CD systems
- Strong programming and debugging skills in Python/Java/C++, with extensive bash scripting experience
- Strong hands-on skills
- Excellent knowledge of GitLab/GitHub or other source version control systems
- Experience configuring, maintaining, and building upon deployments of industry-standard tools (e.g. Jenkins, Kubernetes, Docker)
- Strong experience with build tools such as Maven, setuptools, and CMake, and with unit testing and code-coverage tools
- Strong skills in the software release process (Maven repository, PyPI, Conda)
- Familiarity with various Linux systems such as Ubuntu, CentOS, and Rocky, and with cloud services such as AWS, Azure, and GCP
- Good knowledge of open-source big-data technologies (Spark, Hadoop) and/or ML/DL frameworks (TensorFlow, PyTorch)

Ways to stand out from the crowd:
- Good open-source project management skills
- Kubernetes, YARN, Spark, or Ray experience
- Experience with configuration management tools such as Ansible and Terraform
- Knowledge of monitoring systems (Prometheus, Grafana)
- Experience with CUDA would be a huge plus

NVIDIA is an AA/EEO/Disabled employer. With highly competitive salaries and a comprehensive benefits package, NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most brilliant and talented people on the planet working for us. Are you creative and autonomous? Do you love a challenge? If so, we want to hear from you.