REMOTE

Kubernetes Site Reliability Engineer

Are you an SRE who loves solving complex infrastructure problems but hates the repetitive toil? At Flanksource, we are redefining Managed Services. We are looking for a Kubernetes SRE to manage and optimize diverse Kubernetes environments for our clients—but you won’t be doing it the old way.

You will leverage Mission Control, our internal platform, to gain a consolidated view of configuration, changes, and health across client clusters. Instead of manual firefighting, you will build GitOps-driven Playbooks and teach developers how to use the Mission Control MCP server, which feeds real-time data to AI agents, to troubleshoot and fix issues faster.

Responsibilities

  • Design and maintain Kubernetes clusters across multiple environments (development, staging, production)
  • Build automation for cluster deployment, configuration, and management
  • Monitor and troubleshoot clusters to ensure high availability and optimal performance
  • Implement security best practices for Kubernetes and underlying infrastructure
  • Participate in incident response and work to reduce Mean Time To Recovery (MTTR)
  • Enhance the reliability and scalability of our Kubernetes infrastructure
  • Manage CI/CD pipelines and DevOps tooling
  • Collaborate with development teams on deployment strategies and best practices

Requirements

  • Deep Kubernetes expertise - CKA certification preferred
  • GitOps - Strong experience with GitOps tools (Flux, ArgoCD)
  • Infrastructure as Code - Experience with 2+ IaC tools (Terraform, Pulumi, etc.)
  • Monitoring & Observability - Proficiency with Prometheus, Grafana, and related tools
  • Cloud Platforms - Hands-on experience with AWS, Azure, or GCP
  • CI/CD - Knowledge of GitHub Actions, GitLab CI, or Azure DevOps
  • Networking & Security - Understanding of network fundamentals and security best practices
  • Problem-solving - Strong analytical and troubleshooting abilities
  • Communication - Fluent English for remote asynchronous work
  • Self-motivated - Ability to work independently with an agile approach

Nice-to-haves

  • Go programming knowledge or willingness to learn
  • Active open-source contributions
  • Experience developing Kubernetes operators or controllers

Benefits

  • 100% remote work with flexible hours
  • Work with cutting-edge cloud-native technologies
  • Contribute to open-source projects
  • Collaborative, distributed team environment
  • Opportunity to shape the future of Kubernetes tooling

Our Tech Stack

Kubernetes
ArgoCD
Flux
Helm
Crossplane
Karpenter
vCluster
Go
PowerShell
Prometheus
Grafana
Jaeger
Terraform
PostgreSQL
ClickHouse
AWS
Azure
GCP
GitHub
GitLab