Montréal, Quebec
6 days ago
Senior AI Infrastructure Operations Manager – Data Centers
The Senior Infrastructure Operations Manager – Data Centers oversees the design, deployment, and lifecycle management of GPU-based AI infrastructure across on-premises and cloud environments. This role ensures reliability, scalability, and efficiency for compute-intensive workloads while leading a high-performing technical team in a fast-paced, data center–driven environment.
Job Benefits:
Permanent position
Competitive salary between $160,000 and $180,000 + Bonuses
Comprehensive benefits package
Dynamic and collaborative work environment
Exposure to cutting-edge AI and data center technologies
International collaboration opportunities

Responsibilities of the Senior Infrastructure Operations Manager – Data Centers:
Lead, mentor, and grow a high-performing infrastructure and DevOps team.
Define objectives, measure performance, and foster a culture of accountability and innovation.
Oversee deployment, scaling, and lifecycle management of GPU-based AI clusters across data centers and hybrid cloud environments.
Ensure optimal performance, resilience, and cost efficiency for large-scale compute workloads.
Collaborate with Networking and System Architecture teams to ensure high-bandwidth, low-latency infrastructure design and execution.
Drive continuous integration and deployment (CI/CD) pipelines for infrastructure and AI workloads.
Ensure system availability, security, and performance through proactive monitoring and incident management.
Manage capacity planning and long-term infrastructure roadmaps to support growing AI demands.
Oversee budgets for hardware procurement, cloud services, and licensing, ensuring financial efficiency.
Collaborate with Security and Networking teams to enforce access controls, monitoring, and compliance standards.
Qualifications for the Senior Infrastructure Operations Manager – Data Centers:
Minimum 7 years of experience in infrastructure or IT operations, including 3+ years in leadership
Mandatory experience in data centers or mission-critical environments
Proven expertise managing high-performance computing or GPU/TPU infrastructures
Strong knowledge of Linux systems, distributed architectures, and automation frameworks
Proficiency with Terraform, Ansible, Kubernetes, Docker, and CI/CD pipelines
Experience with cloud and hybrid infrastructures (AWS, GCP, Azure)
Collaboration with Networking and Architecture teams to ensure scalability and reliability
Strong leadership, communication, and organizational abilities
Experience ensuring performance, compliance, and cost efficiency in complex environments
Interested in the Senior Infrastructure Operations Manager – Data Centers position in Montreal?
Apply now to start the conversation!

PTWMTL

 