Senior HPC Cluster Engineer - Systems Integrator
- Competitive and based on experience
- Amsterdam, Netherlands
- Permanent
- Enterprise
Looking for a role with plenty of growth opportunities? Join a pioneering company at the forefront of AI infrastructure development, powered by the world’s leading GPU manufacturer, NVIDIA. The AI revolution demands advanced infrastructure (compute, storage, platforms, tools, and services for developers), and that is precisely what this company is building. With an R&D core of approximately 850 top-tier AI engineers, many with extensive experience in big tech, the company specializes in creating world-class AI infrastructure. Headquartered in Amsterdam, it is Nasdaq-listed and operates worldwide, with R&D hubs across Europe, North America, and Israel.
The Nasdaq-listed AI infrastructure provider is searching for a skilled Senior HPC Cluster Engineer to join the team. If you would like to learn more about this opportunity, feel free to reach out and apply today!
Responsibilities:
- Improve infrastructure supporting GPU-accelerated computing.
- Analyze root causes of performance and reliability issues across various scales and suggest effective solutions.
- Add support for new hardware across the infrastructure software stack.
- Proactively detect and resolve issues to ensure platform stability and efficiency.
Skills/Must have:
- 5+ years of professional software development experience.
- 3+ years working with Linux systems.
- Strong system-level understanding of server architecture, PCIe devices, NICs, and kernel drivers.
- Proficiency in performance-oriented programming languages (e.g., C, C++, Go, Java, Python).
Nice to have:
- Experience tuning performance for HPC workloads.
- Familiarity with RDMA, RoCE, and InfiniBand networking.
- Knowledge of Software Defined Networking and HPC cluster networking.
- Understanding of the QEMU/KVM virtualization stack.
- Experience with deep learning frameworks (e.g., PyTorch, TensorFlow).
- Familiarity with collective communication libraries (e.g., MPI, NCCL).
- Willingness to complete a coding interview as part of the hiring process.
Benefits:
- Competitive salary and full benefits package.
- Opportunities for professional growth and internal mobility.
- Hybrid work environment with flexibility.
- Collaborative and forward-thinking engineering culture.
- Contribute to the infrastructure that powers next-generation AI computing.
- Collaborate with experts in virtualization, hardware acceleration, and high-performance clusters.
- Gain exposure to advanced technologies like RDMA, RoCE, InfiniBand, and QEMU/KVM.
Salary:
- Competitive and based on experience.
