
GPU Senior Software Engineer

Technology
Company in the general sector
Tel Aviv-Yafo, Israel
Posted 1 week ago
Open until 14.7.2026
Full-time

Job Description

We are seeking a skilled software engineer to join our NPU software stack development team. This role involves developing high-performance GPU programming frameworks, runtime systems, and libraries for AI/ML workloads. You will be responsible for implementing, optimizing, and maintaining GPU software stack components to support distributed AI training and inference.

Key Responsibilities

- Identify, analyze, and optimize bottlenecks in the distributed NPU ecosystem
- Design and develop the NPU memory management system
- Design and develop an optimized NPU development framework, execution path, and debugging tools
- Develop compatibility with AI frameworks (Triton, PyTorch, JAX)
- Write high-quality, well-tested code with comprehensive documentation
- Collaborate with other teams (Hardware, Network, QA, AI Framework Integration)
- Participate in code reviews and technical design discussions

Requirements

Required Qualifications

- 5+ years of experience in distributed system programming
- 3+ years of experience with NPU programming (Triton, CUDA, HIP, OpenCL)
- Expert-level C/C++ programming with a focus on performance optimization
- Expert-level Python programming with a focus on DL/ML frameworks (PyTorch, JAX, etc.)
- Deep understanding of NPU architecture, memory tiering, and programming models
- Knowledge of NPU runtime systems
- Experience with performance profiling and optimization tools
- Strong problem-solving and debugging skills
- Experience with version control systems, ticketing systems, and collaborative development
- Team player with excellent communication skills
- Fast learner who is highly organized, detail-oriented, and highly motivated

Preferred Qualifications

- Experience with NPU software stack development
- Experience with large-scale NPU systems (100+ NPUs)
- Experience with AI-oriented DL/ML workloads and distributed training and inference
- Familiarity with containerization and orchestration

This position is open to all candidates.

Keywords
Orchestration, CUDA, OCaml, PyTorch, OpenCL, Botan, Python, Memory management, Debugger, Tritium, Debugging
