DESCRIPTION:
Duties: Engage and improve the lifecycle of Cloud Services from inception through design, deployment, and operations. Automate repeated manual tasks. Develop tools and automation to improve the efficiency of the platform and infrastructure. Analyze defects, propose improvements, and drive efficiencies in systems and processes. Contribute to the development of new Cloud Engineering strategies and implementations for the firm as part of the Site Reliability team. Ensure the reliability, availability, and performance of the Cloud Infrastructure and the Terraform Enterprise Infrastructure Automation Platform. Develop Observability and Telemetry tools. Author and improve the quality of Technical Engineering documentation. Debug and solve issues in a Production environment. Participate in SRE on-call rotations and escalation workflows.
QUALIFICATIONS:
Minimum education and experience required: Master's degree in Information Systems Technologies - Information Assurance, Computer Science, or related field of study plus 3 years of experience in the job offered or as Site Reliability Engineer, DevOps Engineer, or related occupation. The employer will alternatively accept a Bachelor's degree in Information Systems Technologies - Information Assurance, Computer Science, or related field of study plus 5 years of experience in the job offered or as Site Reliability Engineer, DevOps Engineer, or related occupation.
Skills Required: This position requires experience with the following: Designing, implementing, and maintaining Infrastructure as Code (IaC) using HashiCorp Terraform Enterprise, ensuring scalability, reliability, and security of cloud environments; developing and managing Terraform modules, enforcing best practices, and optimizing resource provisioning across AWS; performing scripting, development, and automation leveraging at least one of the following programming languages and frameworks: Python or Bash; automating infrastructure deployment and configuration using CI/CD pipelines and automation tools to support continuous delivery, utilizing at least one of the following: GitHub, Jenkins, or Spinnaker; and monitoring the Terraform Enterprise platform by leveraging different monitoring, logging, and observability tools with at least one of the following: Prometheus, Grafana, or Datadog.
Job Location: 8181 Communications Parkway, Plano, TX 75024.