Senior Infrastructure Software Engineer, Deep Learning Libraries
Excelero Storage
We are now looking for a Senior Infrastructure Software Engineer for Deep Learning Libraries!
NVIDIA's Deep Learning Libraries Group is seeking excellent software engineers to enable the next wave of NVIDIA’s highest performing deep learning libraries. The role spans multiple products, including TensorRT, TensorRT-LLM, and cuDNN. The mission is to design and develop scalable, modular infrastructure that streamlines development, builds, and testing across NVIDIA’s diverse set of platforms, from Drive AGX for autonomous vehicles to DGX servers for datacenters and large language models. Join our technically diverse team of software engineers and infrastructure experts to design the systems that enable NVIDIA to stay ahead of the competition as we deliver the world's fastest deep learning platforms.
What you'll be doing:
Designing and developing software for testing and analysis of our codebases
Building scalable automation for build, test, integration, and release processes for publicly distributed deep learning libraries
Developing throughout the software stack, from the user experience down to the cluster and database layers
Configuring, maintaining, and building upon deployments of industry-standard tools (e.g., Kubernetes, Jenkins, Docker, CMake, GitLab)
Enabling new platforms, which may include preparing hardware for testing and bringing those platforms into automated testing
What we need to see:
BS or higher degree in Computer Science or Computer Engineering, or equivalent experience
5+ years of relevant experience
Strong familiarity with Python (or similar) and experience with building C/C++ codebases
System administration experience maintaining both Linux and Windows systems
Experience setting up, maintaining, and automating continuous integration systems
A pragmatic approach to problem solving and collaboration
Ways to stand out from the crowd:
Experience designing and developing automation in Jenkins with Groovy (or similar)
Background with distributed systems and cluster/cloud computing, especially with Kubernetes
Knowledge of GPU computing systems
Experience with mobile/embedded platforms and multiple operating systems (Ubuntu, Red Hat, Windows, QNX, or similar)
Track record of identifying useful new technologies and incorporating them into SW development flows
This is an opportunity to have a wide impact at NVIDIA by improving development velocity across our many compute software projects. Are you creative, driven, and autonomous? Do you love a challenge? If so, we want to hear from you!
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 148,000 USD - 235,750 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4. You will also be eligible for equity and benefits.