Full-time

Senior Software Engineer, AI Inference Systems

Posted by NVIDIA Gruppe • June 08, 2026

📍 Remote, Switzerland

Description

Overview

We are seeking highly skilled and motivated software engineers to join us and build AI inference systems that serve large-scale models with extreme efficiency. You’ll architect and implement high-performance inference stacks, optimize GPU kernels and compilers, drive industry benchmarks, and scale workloads across multi-GPU, multi-node, and multi-cloud environments. You’ll collaborate across inference, compiler, scheduling, and performance teams to push the frontier of accelerated computing for AI.

Responsibilities

Contribute features to vLLM that let the newest models take advantage of the latest NVIDIA GPU hardware features; profile and optimize the inference framework (vLLM) with methods such as speculative decoding, data/tensor/expert/pipeline parallelism, and prefill-decode disaggregation.

Develop, optimize, and benchmark GPU kernels (hand-tuned and compiler-generated) using techniques such as fusion, autotuning, and memory/layout optimization...

Job Details

Location: Remote
Job Type: Full-time
Category: Other-General
Posted: June 08, 2026
Deadline: July 18, 2026
