Senior Infra SRE – Build & Deployments

Company: NVIDIA
Location: Taiwan, Hsinchu
Commitment: Full time
Posted on: 2023-10-28 18:35
NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing, and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens new universes to explore, enables unique creativity and discovery, and powers what were once science fiction inventions, from artificial intelligence to autonomous cars. NVIDIA is looking for phenomenal people like you to help us accelerate the next wave of artificial intelligence.

We are looking for an Infrastructure Deployment Engineer for our Global Compute Team, working from Taiwan. As part of this team, you will help with new site bring-ups and maintenance, and run the corporate servers and virtual infrastructure for an incredible customer experience. We are looking for creative minds, authorities in the field, with a passion for providing outstanding services to our employees and evolving our infrastructure into a cloud-centric IaaS.

The position requires on-site work up to 25% of the year.

What you’ll be doing:
- Build infrastructure using Infrastructure as Code (IaC) by leveraging Terraform/vSphere, working in an Agile environment.
- Implement CI/CD pipelines, set up the monitoring environment, and configure databases and antivirus solutions.
- Participate in DevOps activities, including the orchestration/automation of system configuration, security auditing, application deployment, application configuration, and cloud migration/management.
- Support multi-site, high-performance compute infrastructure and services for the global engineering product development organizations.
- Maintain the infrastructure by measuring and monitoring availability, latency, and overall system health.
- Support automation within the vSphere and Unix environments.
- Deploy new tools/wrappers that make infrastructure services easier for multiple teams to use.
- Scale systems sustainably through mechanisms like automation, and evolve systems by driving and implementing changes that improve overall reliability and velocity.
- Travel up to 25% for projects related to new site bring-ups.

What we need to see:
- BS in Computer Science (or equivalent experience) with 8+ years of relevant experience, MS with 5+ years of experience, or Ph.D. with 3 years of experience.
- Extensive experience handling patch management (applications, OS, and Office updates).
- Ability to communicate business needs to the technical team and to simplify complex technical information for non-technical people.
- Experience in system administration and technical support (e.g. installation, configuration, maintenance, upgrade, problem resolution).
- Experience with cloud infrastructure: AWS, Azure, or Google Cloud.

Ways to stand out from the crowd:
- Experience with hypervisors (in order of preference): VMware, Hyper-V, and KVM.
- Hands-on Linux experience (e.g. RHEL, CentOS) and production infrastructure support (e.g. networking, storage, monitoring, compute).
- Solid grasp of data center operations fundamentals in networking, cooling, and power.