North Bethesda, MD
122 days ago
Principal ML Engineer, ML Platform Engineering

We are looking for a principal machine learning engineer to join our core machine learning platform engineering team. In this role, you will partner closely with the AI/MLE leadership team to deliver the vision and technical implementation for the foundational infrastructure leveraged by Xometry’s AI/ML solutions, including the Instant Quoting Engine® and other AI/ML products powering the Xometry marketplace.

This will be a high-visibility, hands-on role delivering a core aspect of the Xometry ecosystem. You will be given the opportunity to continually challenge yourself, drive innovation, take ownership of your work, and play a crucial role in the Xometry platform.

Responsibilities:

Hands-On Technical Leadership: Adopt a 'lead by example' approach by actively coding and troubleshooting, as well as creating documentation and technical diagrams.
Teaching & Mentorship: You will serve as a mentor and guide to engineers across the organization, teaching and mentoring them to grow their skills.
Code Review: You will do code review and mentor others within the organization regarding best practices in ML Engineering.
Operational Excellence: Guarantee the delivery of superior infrastructure and software that not only meets but exceeds customer expectations, while aligning with the strategic business timelines.
Collaborative Strategy: Forge strong partnerships with product managers, data scientists, and company leadership to promote a culture of open communication and integrated team dynamics.
Guide Innovation: Champion the adoption of cutting-edge technologies, methodologies, and practices to enhance problem-solving efficiency and effectiveness across the AI/ML organization.

Qualifications:

At least 7 years of experience in machine learning engineering, software engineering, data science, or a similar technical role.
A bachelor’s degree is required, but an advanced degree (M.S. or PhD) in computer science, machine learning, AI, or a related field is preferred and may substitute for some years of experience.
Demonstrated experience designing and deploying cloud infrastructure (AWS preferred) to support machine learning and machine learning models, with considerations for scale, reliability, and security.
Deep understanding of the machine learning lifecycle and related infrastructure needs: feature stores, A/B testing, model registration, drift detection, automated retraining, etc.
Strong technical expertise. You will need to either have, or demonstrate the ability to quickly build, technical expertise in the following:
    Software engineering principles, including parallel and distributed computing, version control, reproducibility, and continuous integration.
    Machine learning techniques and algorithms, with emphasis on their impact on infrastructure implementation, including large-scale language and vision models (Transformers, GPT, VLMs, LLMs) and deep learning (PyTorch, TensorFlow).
    Infrastructure as Code (IaC), especially Terraform.
    REST API design and implementation.
    Object-oriented and functional programming in Python.
    Multimodal data processing (e.g., combining text, image, and 3D data).
    Experience with AWS microservices including SageMaker, Service Catalog, IAM, Lambda, CloudWatch, ECR, EKS, and Kinesis.
    Containerization technologies (Docker and Kubernetes).
Demonstrated ability to interact and communicate effectively at all levels of the organization, from executives to product managers and a wide variety of stakeholders and contributors.
Experience in the manufacturing, supply chain, or similar industries is a plus.
Must be a US Citizen or Green Card holder (ITAR).

#LI-Hybrid
