Shanghai, CHN
1 day ago
Software Architect, Enterprise AI Software
NVIDIA is the platform upon which every new AI-powered application is built. We are seeking a Software Architect to define and lead the technical vision for the NVIDIA Inference Microservices (NIM) Factory. You will set the architectural direction for how we build, deploy, and scale enterprise-grade AI services to delight customers, while staying hands-on to guide our most critical implementations. The scope spans day-0 launches and the follow-through to harden them into enterprise-grade software, ensuring reliability, performance, and security across thousands of GPUs. You will shape our strategy for emerging challenges like disaggregated LLM inference and safeguard the long-term technical health of the platform.

What you'll be doing:
+ Define the end-to-end technical architecture for the NIM Factory, from container build systems and CI/CD to Kubernetes deployment patterns and runtime optimization.
+ Drive technical strategy and roadmap, making high-impact decisions on frameworks, technologies, and standards that empower dozens of engineering teams.
+ Architect and influence the design of workflow orchestration systems that underpin the NIM Factory.
+ Coach and mentor senior engineers across the organization, fostering a culture of technical excellence, innovation, and knowledge sharing.
+ Champion best practices in software development, including API design, automation, observability, and secure supply chain management.
+ Collaborate with leadership across research, backend, SRE, and product to align technical vision with product goals and influence technical roadmaps.

What we need to see:
+ 12+ years of experience designing and building large-scale, production distributed systems.
+ Proven track record in a technical leadership or architect role, setting technical direction while staying hands-on with implementation.
+ Deep architectural expertise in cloud-native technologies, including Kubernetes, containers, and microservices.
+ Exceptional ability to coach, teach, and influence senior engineers; a passion for raising the technical bar of the entire organization.
+ Strong foundation in modern software development practices, with proficiency in languages like Python for building tooling and services.
+ Experience architecting solutions for GPU-accelerated or other high-performance computing workloads.
+ Excellent communication and collaboration skills, with the ability to articulate complex technical concepts to diverse audiences and drive consensus.
+ A degree in Computer Science, Computer Engineering, or a related field (BS or MS) or equivalent experience.

Ways to stand out from the crowd:
+ Hands-on with LLM inference stacks (Triton Inference Server, TensorRT-LLM, vLLM, FasterTransformer, KServe).
+ Experience optimizing large-model serving (KV cache sharding/paging, tensor/sequence parallelism, speculative decoding, dynamic batching).
+ Experience architecting next-generation container build systems or CI/CD platforms at scale.
+ Background with workflow orchestration engines (e.g., Temporal, Airflow) for complex, distributed processes.
+ Expertise in designing multi-tenant, multi-cluster, or edge/air-gapped deployment architectures.

We are widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and creative people in the world working for us. If you're creative and autonomous with a real passion for technology, we want to hear from you.