Senior Software Engineer, DL Libraries Infrastructure
Excelero Storage
NVIDIA's Deep Learning Libraries Group is seeking excellent software engineers to enable the next wave of NVIDIA’s highest-performing deep learning libraries. The role spans multiple products, including cuDNN and FlashInfer. The mission is to design and develop scalable, modular infrastructure that streamlines development, build, and test across NVIDIA’s diverse set of platforms, from datacenter to autonomous vehicles. Join our technically diverse team of software engineers and infrastructure experts to design the systems that enable NVIDIA to stay ahead of the competition as we deliver the world's fastest deep learning platforms.
What you'll be doing:
Designing and developing software for testing and analysis of our codebases
Building scalable automation for build, test, integration, and release processes for publicly distributed deep learning libraries
Developing throughout the software stack, from the user experience down to the cluster and database layers
Configuring, maintaining, and building upon deployments of industry-standard tools (e.g., Kubernetes, Jenkins, Docker, CMake, GitLab, Jira)
Advancing innovation in those industry-standard tools and upstreaming contributions to the open source community
What we need to see:
BS or higher degree in Computer Science or Computer Engineering (or equivalent experience), with 5+ years of relevant experience
Strong programming skills in Python (or similar) and familiarity with C/C++ development
Experience setting up, maintaining, and automating continuous integration systems
Proficiency in SCM (e.g., Git, Perforce) and build systems (e.g., Make, CMake, Bazel)
A pragmatic approach to solving problems collaboratively with a passion for “it just works” automation to enable team members
Ways to stand out from the crowd:
Experience designing and developing automation in Jenkins, GitLab CI/CD, or GitHub Actions, and a background in distributed systems and cluster/cloud computing (e.g., Slurm, containers, Kubernetes)
Experience designing and developing unit and integration test frameworks with hands-on experience using code coverage and static code analysis tools
Success leading a team of engineers and/or experience as an active contributor to a software project involving many developers
Knowledge of GPU computing systems and experience with mobile/embedded platforms and multiple operating systems (Ubuntu, CentOS, Windows, L4T, or similar)
Track record of identifying useful new technologies and incorporating them into software development flows
You will also be eligible for equity and benefits.
This posting is for an existing vacancy.
NVIDIA uses AI tools in its recruiting processes.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.