Deep Learning Performance Architect

Company: NVIDIA
Location: China, Shanghai
Commitment: Full time
Posted on: 2023-10-28 18:39

Are you passionate about exploring computer architectures for deep learning? Do you like to work at the intersection of hardware and software? NVIDIA is seeking world-class programmers and performance architects who love to squeeze out every cycle of performance from deep learning codes. In this role, you will write code that ships in our deep learning libraries, as well as guide the direction of our future GPU architectures. This position offers the opportunity to have real impact in a fast-moving, technology-focused company.

What you'll be doing:
- Analyze the performance of various machine learning/DL algorithms on existing/new architectures
- Identify bottlenecks and propose creative solutions to improve them
- Develop high performance operations for cuDNN/cuBLAS/TRT/etc. libraries
- Understand and analyze the interplay of hardware and software architectures on future algorithms and applications
- Add new capabilities to GPU architectures

What we need to see:
- MS or PhD in relevant discipline (CS, EE, Math) and 3+ years' work experience
- Track record of optimizing code for performance on CPUs or GPUs, including assembly or SIMD programming
- Strong programming skills in C, C++, Perl, or Python
- Familiarity with GPU computing (CUDA, OpenCL, OpenACC)
- Strong background in computer architecture
- Experience with matrix multiply and convolution algorithms

Ways to stand out from the crowd:
- MLIR development experience
- Experience with parallel programming and CUDA/OpenCL
- You are familiar with DL frameworks/fundamentals
- Good communication and organizational skills

#deeplearning