Manager, Platform Engineering, AI & Datascience
Cargill
Job Purpose and Impact
The Manager for AI Platform Engineering leads the team that designs, builds, and operates Cargill’s enterprise AI-Ops platform, covering MLOps, LLMOps/GenAIOps, HPC scheduling, and optimization services (e.g., Gurobi, RStudio Workbench). You will own the platform roadmap, allocate people and budget, drive project delivery, and embed best-in-class reliability, security, and compliance practices. Success is measured by platform uptime, model-to-production velocity, cost-to-serve trends, and team engagement.
Key Accountabilities

Platform Ownership & Road-mapping
Define and maintain the technical roadmap for MLOps, LLMOps, HPC, and optimization tooling.
Oversee the portfolio of AI-Ops projects; align scope, schedule, and budget to business objectives.

Technical Guidance & Governance
Champion infrastructure-as-code, GitOps, and CI/CD pipelines.
Chair design reviews to enforce architecture standards, security controls, and cost-efficiency patterns.

Quality, Reliability & Compliance
Set and monitor SLIs/SLOs for training, inference, and optimization services; lead post-incident reviews.
Ensure Responsible-AI guardrails, data-privacy, and license-management policies are implemented.

Process Improvement & Automation
Drive continuous-improvement initiatives (test-driven development, auto-scaling policies, cost dashboards).
Introduce self-service tooling that reduces manual ops toil and speeds developer onboarding.

Stakeholder & Customer Engagement
Partner with product managers, data-science leads, and security/compliance teams to capture requirements and set priorities.
Provide transparent status updates, KPI dashboards, and quarterly roadmap demos.

Team Management & Talent Development
Set performance objectives, conduct regular feedback and coaching sessions, and create growth plans.
Foster an inclusive culture that values experimentation, blameless post-mortems, and knowledge sharing.

Qualifications
Minimum requirement: 6 years relevant experience.
Typical requirement: 7–10 years total experience, with 3+ years running production MLOps/LLMOps or HPC environments and 2+ years managing engineers.