Open

SRE Engineer

Full-time | Remote | Engineering

Overview

As an SRE Engineer at AvionPure, you will drive reliability across our clients' complex, distributed infrastructure. You will define and enforce SLOs, design automated incident response runbooks, and build resilient systems that operate at scale. You'll work closely with engineering and security teams to embed reliability practices from the ground up.

Responsibilities

  • Design and maintain SLOs, SLAs, and error budgets across client environments
  • Build and manage infrastructure-as-code using Terraform and Ansible
  • Automate incident detection, triage, and resolution workflows
  • Implement and manage CI/CD pipelines for continuous delivery
  • Conduct capacity planning, performance benchmarking, and chaos engineering
  • Collaborate with developers to embed reliability into the software delivery lifecycle
  • Produce and maintain runbooks, post-mortems, and operational documentation

Requirements

  • 3+ years of experience in SRE, DevOps, or platform engineering roles
  • Strong proficiency with Kubernetes, Helm, and container orchestration
  • Hands-on experience with Terraform and infrastructure-as-code
  • Solid understanding of Linux internals, networking, and distributed systems
  • Experience designing CI/CD pipelines (GitHub Actions, Jenkins, ArgoCD)
  • Familiarity with observability tooling (Datadog, Prometheus, Grafana)
  • Excellent written and verbal communication skills

Skills

Linux, Kubernetes, Terraform, CI/CD, Datadog, Python, Ansible

Ready to join AvionPure?

Submit your application and we'll get back to you within 3 business days.
