Redmond, WA, 98073, USA
4 days ago
Infra SWE V - ML Compute
Job Description

We’re hiring a Senior Infrastructure Software Engineer to join a high-impact ML infrastructure team at a top research and tech company. This role is ideal for engineers who thrive in high-autonomy environments and have deep experience building infrastructure at scale. You’ll spend 80–90% of your time coding in Python, contributing directly to the development and improvement of core ML compute infrastructure. You’ll help maintain and expand a bespoke GPU Kubernetes cluster, working on systems that support a wide range of research customers.

• Write clean, scalable Python code to enhance internal ML infrastructure systems
• Own and operate a custom GPU Kubernetes cluster, including data catalog and caching storage components
• Build features that support onboarding and performance needs of ML and research users
• Improve performance, resolve production issues, and optimize resource usage
• Contribute to and improve pipelines (data ingestion, compute scheduling, etc.)
• Work with tools like Docker, Kubernetes, and automated testing frameworks

We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances.
If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to HR@insightglobal.com. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/.

Skills and Requirements

• 5+ years of experience as a Software Engineer, ideally with infrastructure and/or platform teams
• Strong proficiency in Python (this is a Python-only role with a mixture of writing code from scratch as well as debugging and performance improvements)
• Hands-on experience with Kubernetes clusters, ideally at scale
• Familiarity with Docker, automated testing, and ML infrastructure components
• Ability to operate independently and deliver end-to-end projects with minimal oversight
• Experience in performance tuning and supporting internal research or ML teams