Deep Learning Kernel Software Performance Architect - New College Grad 2026
NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology—and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.
We are looking for a Performance Architect for Deep Learning Software! NVIDIA is seeking extraordinary architects to develop processor and system architectures that accelerate machine learning, data analytics, and high-performance computing applications. This position offers the chance to make a meaningful impact in a dynamic, technology-focused company.
What you'll be doing:
As a member of our Deep Learning Architecture team, you will:
Validate and analyze performance of GPU-accelerated system and software architectures that advance the frontier of deep learning performance.
Debug deep learning and data analytics software to identify root causes of performance bottlenecks.
Develop scripts and tools to analyze, visualize, and debug software using analytical models, simulators, and test suites.
Collaborate across NVIDIA teams:
Work with the CUDA and AI Compiler teams to pinpoint and resolve performance issues.
Engage AI/ML training and inference performance teams to identify and optimize critical deep learning layers.
Collaborate with hardware architecture performance teams to define expectations for emerging deep learning hardware features.
What we need to see:
Master's or PhD in Computer Science, Electrical Engineering or Computer Engineering, or equivalent experience.
Proven expertise in software design, including debugging, performance analysis, and test development.
Hands-on experience with practical parallel programming, even if it’s not on GPUs.
Strong understanding of computer architecture, with practical experience in performance debugging.
Ability to identify bottlenecks, optimize resource utilization, and enhance system throughput.
Fluency in programming languages such as Python, C, and C++.
Ways to stand out from the crowd:
Strong foundation in machine learning and deep learning fundamentals to complement your expertise in computer architecture.
A strong background in high-performance, power-efficient design, energy-efficient high-performance computing, and performance analysis and profiling to identify bottlenecks.
Experience and familiarity with GPU computing and parallel programming models.
Work experience with analytical performance modeling, profiling, and analysis.
Widely considered to be one of the technology world’s most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package. As you plan your future, see what we can offer to you and your family at www.nvidiabenefits.com/
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 124,000 USD - 195,500 USD for Level 2, and 152,000 USD - 241,500 USD for Level 3. You will also be eligible for equity and benefits.
This posting is for an existing vacancy.
NVIDIA uses AI tools in its recruiting processes.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.