Compute Architect

Company: NVIDIA

Location: China, Shanghai

Commitment: Full time

Posted on: 2025-07-15 05:48

Are you passionate about compiler technology and computer architectures for deep learning? Do you thrive at the intersection of hardware and software? NVIDIA is seeking world-class compiler engineers and performance architects who are excited to push the boundaries of machine learning infrastructure. In this role, you will develop and optimize MLIR-based compiler infrastructure that powers our deep learning libraries and influences the direction of future GPU architectures. This position offers the opportunity to make a significant impact in a fast-moving, technology-focused company.

What You'll Be Doing:

- Design, implement, and optimize MLIR-based compiler passes for deep learning and data analytics workloads.
- Use compiler technologies to analyze and improve the performance of machine learning and deep learning algorithms on current and next-generation architectures.
- Identify performance bottlenecks in compiler-generated code and propose creative solutions.
- Collaborate with hardware architects and software teams to co-design features that maximize performance and efficiency.
- Contribute to the evolution of NVIDIA’s deep learning compiler stack and libraries.

What We Need to See:

- MS or PhD in Computer Science, Electrical Engineering, Mathematics, or a related field, or equivalent experience.
- 5+ years of work experience.
- Proven experience developing compilers or compiler infrastructure, preferably with MLIR, LLVM, or similar frameworks.
- Strong programming skills in C++ and Python.
- Solid understanding of computer architecture, especially as it relates to performance optimization.
- Experience optimizing code for CPUs or GPUs, including low-level programming (assembly, SIMD, or vectorization).
- Experience with deep learning algorithms, especially matrix multiplication and convolution.

Ways to Stand Out from the Crowd:

- Hands-on experience with MLIR, LLVM, or other modern compiler frameworks.
- Deep understanding of parallel programming models and GPU architectures.
- Strong communication and organizational skills.
- Demonstrated ability to work collaboratively in a fast-paced, cross-functional environment.

If you are excited about building the next generation of machine learning compilers and want to work with world-class teams at the forefront of AI and hardware innovation, we want to hear from you!

#deeplearning
