REMOTE

Kubernetes Site Reliability Engineer

Are you an SRE who loves solving complex infrastructure problems but hates the repetitive toil? At Flanksource, we are redefining Managed Services. We are looking for a Kubernetes SRE to manage and optimize diverse Kubernetes environments for our clients—but you won’t be doing it the old way.

You will leverage Mission Control, our internal platform, to gain a consolidated view of configuration, changes, and health across client clusters. Instead of manual firefighting, you will build GitOps-driven Playbooks and teach developers how to use the Mission Control MCP server, which feeds real-time data to AI agents, to troubleshoot and fix issues faster.

Responsibilities

Design and maintain Kubernetes clusters across multiple environments (development, staging, production)
Build automation for cluster deployment, configuration, and management
Monitor and troubleshoot clusters to ensure high availability and optimal performance
Implement security best practices for Kubernetes and underlying infrastructure
Participate in incident response and work to reduce Mean Time To Recovery (MTTR)
Enhance the reliability and scalability of our Kubernetes infrastructure
Manage CI/CD pipelines and DevOps tooling
Collaborate with development teams on deployment strategies and best practices

Requirements

Deep Kubernetes expertise - CKA certification preferred
GitOps - Strong experience with GitOps tools (Flux, ArgoCD)
Infrastructure as Code - Experience with 2+ IaC tools (Terraform, Pulumi, etc.)
Monitoring & Observability - Proficiency with Prometheus, Grafana, and related tools
Cloud Platforms - Hands-on experience with AWS, Azure, or GCP
CI/CD - Knowledge of GitHub Actions, GitLab CI, or Azure DevOps
Networking & Security - Understanding of network fundamentals and security best practices
Problem-solving - Strong analytical and troubleshooting abilities
Communication - Fluent English for remote asynchronous work
Self-motivated - Ability to work independently with an agile approach

Nice-to-haves

Go programming knowledge or willingness to learn
Active open-source contributions
Experience developing Kubernetes operators or controllers

Benefits

100% remote work with flexible hours
Work with cutting-edge cloud-native technologies
Contribute to open-source projects
Collaborative, distributed team environment
Opportunity to shape the future of Kubernetes tooling

Our Tech Stack

Kubernetes
ArgoCD
Flux
Helm
Crossplane
Karpenter
vCluster
PowerShell
Prometheus
Grafana
Jaeger
Terraform
PostgreSQL
ClickHouse
AWS
Azure
GCP
GitHub
GitLab
