Saint Paul, MN, 55145, USA
5 days ago
SRE
Job Description

We are seeking two Site Reliability Engineers (SREs) to join our team supporting a new Azure-based product. This role focuses on system reliability, observability, and monitoring for a data-driven application that provides KPIs and insights to end users daily. The product leverages Azure services, APIs, Databricks, and AI/ML models to process customer data and populate dashboards refreshed once per day. The SREs will ensure the reliability of the entire pipeline, provide hypercare support, and collaborate with engineering teams to streamline monitoring and alerting processes.

We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances.

If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to HR@insightglobal.com. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/.

Skills and Requirements

Experience in SRE or similar reliability-focused roles.
Strong knowledge of Azure services and cloud-based architectures.
Hands-on experience with observability, monitoring, and alerting tools (App Insights, Elastic).
Ability to work with REST APIs and understand event-driven architectures (e.g., Service Bus).
Proficiency in C# for troubleshooting and minor coding tasks.
Excellent communication and ownership mindset; able to manage issues end-to-end.
Experience with Terraform and infrastructure-as-code.
Familiarity with Databricks, AI/ML pipelines, and data engineering concepts.
Knowledge of React for front-end troubleshooting.
Exposure to event-driven distributed systems.
Ability to streamline monitoring processes across multiple teams.