Kubernetes
The Kubernetes check queries Kubernetes resources, such as Pods, and evaluates their state to return the desired information.
```yaml title="kubernetes.yaml"
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: kube-system-checks
spec:
  schedule: "@every 5m"
  kubernetes:
    - name: kube-system
      kind: Pod
      healthy: true
      # resource:
      #   search: labels.app=test
      #   # OR
      #   labelSelector: k8s-app=kube-dns
      namespaceSelector:
        name: kube-*,!*lease
        # name: "*"
      display:
        expr: |
          dyn(results).
            map(i, i.Object).
            filter(i, !k8s.isHealthy(i)).
            map(i, "%s/%s -> %s".format([i.metadata.namespace, i.metadata.name, k8s.getHealth(i).message])).join('\n')
      test:
        expr: dyn(results).all(x, k8s.isHealthy(x))
```
| Field | Description | Scheme |
|---|---|---|
| kind* | Kubernetes object kind | string |
| name* | Name of the check, must be unique within the canary | string |
| cnrm | CNRM connection details | |
| connection | The connection URL to use, mutually exclusive with `kubeconfig` | |
| eks | EKS connection details | |
| gke | GKE connection details | |
| healthy | Fail the check if any resources are unhealthy | bool |
| ignore | Ignore the specified resources from the fetched resources. Can be a glob pattern. | []glob |
| kubeconfig | Source for kubeconfig | |
| namespace | Failing checks are placed in this namespace, useful if you have shared namespaces. Note: this does not change the namespace of the resources being queried | string |
| namespaceSelector | Filters namespaces by name or labels | |
| ready | Fail the check if any resources are not ready | bool |
| resource | Filters resources by name, namespace, or labels | |
| description | Description for the check | string |
| display | Expression to change the formatting of the display | |
| icon | Icon for overwriting the default icon on the dashboard | string |
| labels | Labels for the check | map[string]string |
| markFailOnEmpty | Fail the check if a transformation or datasource returns empty results | bool |
| metrics | Metrics to export from check results | |
| test | Evaluate whether a check is healthy | |
| transform | Transform data from a check into multiple individual checks | |
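As a sketch of how these fields combine, the following canary checks Pod readiness while skipping short-lived pods by glob pattern (the check name, namespace, and glob patterns here are illustrative, not taken from the reference above):

```yaml
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: worker-pod-checks # hypothetical name
spec:
  schedule: "@every 10m"
  kubernetes:
    - name: worker-pods
      kind: Pod
      # equivalent to test.expr: dyn(results).all(x, k8s.isReady(x))
      ready: true
      # glob patterns; matching resources are dropped before evaluation
      ignore:
        - junit-*
        - hello-world*
      namespaceSelector:
        name: workers # hypothetical namespace
```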
Resource Selector
Resource Selectors are used throughout Mission Control for:
- Creating relationships between configs and configs/components
- Filtering resources in playbook triggers and actions
- Selecting targets for health checks
- Building dynamic views and dashboards
| Field | Description | Scheme | Required |
|---|---|---|---|
| id | Select resource by ID. Supports comma-separated values and wildcards (`id=abc*,def*`) | string | |
| name | Select resource by name. Supports comma-separated values and wildcards (`name=*-prod,*-staging`) | string | |
| namespace | Select resources in this namespace only. If empty, selects from all namespaces | string | |
| types | Select resources matching any of the specified types (e.g. `Kubernetes::Pod`, `AWS::EC2::Instance`) | []string | |
| statuses | Select resources matching any of the specified statuses | []string | |
| health | Select resources matching the specified health status. Supports multiple comma-separated values (`healthy,warning`) and negation (`!unhealthy`) | string | |
| scope | Limit selection to resources belonging to a specific parent. For configs this is the scraper ID, for checks it's the canary, and for components it's the topology. Can be a UUID or `namespace/name` | string | |
| labelSelector | Kubernetes-style label selector. Supports `=`, `==`, `!=` operators and set-based selectors (`key in (v1,v2)`, `key notin (v1,v2)`, `key`, `!key`) | LabelSelector | |
| fieldSelector | Select resources by property fields using Kubernetes field selector syntax. Supports fields like `owner`, `topology_id`, `parent_id` for components | FieldSelector | |
| tagSelector | Select resources by tags using the same syntax as `labelSelector`. Tags are key-value pairs assigned during scraping | string | |
| agent | Select resources created on a specific agent. Accepts an agent UUID, agent name, or special values: `local` (resources without an agent), `self` (alias for `local`), `all` (resources from any agent). Defaults to `local` | string | |
| cache | Cache settings for selector results. Useful for expensive or frequently-used selectors. Values: `no-cache` (bypass but allow caching), `no-store` (bypass and don't cache), `max-age=<duration>` (cache for the given duration) | string | |
| limit | Maximum number of resources to return | int | |
| includeDeleted | Include soft-deleted resources in results. Defaults to `false` | bool | |
| search | Full-text search across resource name, tags, and labels using a parsing expression grammar. See Search below | string | |
Wildcards and Negation
The `name`, `id`, `types`, `statuses`, and `health` fields support:
- Prefix matching: `name=prod-*` matches names starting with `prod-`
- Suffix matching: `name=*-backend` matches names ending with `-backend`
- Negation: `health=!unhealthy` excludes unhealthy resources
- Multiple values: `types=Kubernetes::Pod,Kubernetes::Deployment` matches either type
Search
The search field provides a powerful query language for filtering resources.
Syntax
```
field1=value1 field2>value2 field3=value3* field4=*value4
```
Multiple conditions are combined with AND logic.
Operators
| Operator | Example | Description | Types |
|---|---|---|---|
| `=` | `status=healthy` | Equals (exact match or wildcard) | string, int, json |
| `!=` | `health!=unhealthy` | Not equals | string, int, json |
| `=*` | `name=*-prod` or `name=api-*` | Prefix or suffix match | string, int |
| `>` | `created_at>now-24h` | Greater than | datetime, int |
| `<` | `updated_at<2025-01-01` | Less than | datetime, int |
Date Queries
- Absolute dates: `2025-01-15`, `2025-01-15T10:30:00Z`
- Relative dates: `now-24h`, `now-7d`, `now+1w`
- Supported units: `s` (seconds), `m` (minutes), `h` (hours), `d` (days), `w` (weeks), `y` (years)
JSON Field Access
Access nested fields in labels, tags, and config using dot notation:
```
labels.app=nginx
tags.env=production
config.spec.replicas>3
```
Searchable Fields
Catalog Items (Configs)
| Field | Type | Description |
|---|---|---|
| name | string | Resource name |
| namespace | string | Kubernetes namespace or equivalent |
| type | string | Resource type (e.g. `Kubernetes::Pod`) |
| status | string | Current status |
| health | string | Health status |
| source | string | Source identifier |
| agent | string | Agent that scraped this resource |
| labels | json | Kubernetes-style labels |
| tags | json | Scraper-assigned tags |
| config | json | Full configuration data |
| created_at | datetime | Creation timestamp |
| updated_at | datetime | Last update timestamp |
| deleted_at | datetime | Soft deletion timestamp |
Config Changes
| Field | Type | Description |
|---|---|---|
| id | string | Change ID |
| config_id | string | Parent config ID |
| name | string | Change name |
| type | string | Config type |
| change_type | string | Type of change (e.g. `diff`, `event`) |
| severity | string | Change severity |
| summary | string | Change summary |
| count | int | Occurrence count |
| agent_id | string | Agent ID |
| tags | json | Change tags |
| details | json | Additional details |
| created_at | datetime | Change timestamp |
| first_observed | datetime | First observation time |
Examples
Basic Selection
```yaml
# Select by exact name
name: my-deployment

# Select by ID
id: 3b1a2c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d

# Select all pods in a namespace
types:
  - Kubernetes::Pod
namespace: production
```
Using Wildcards
```yaml
# Select all resources with names starting with "prod-"
name: prod-*

# Select all AWS resources
types:
  - AWS::*

# Select resources ending with "-backend"
name: "*-backend"
```
Label and Tag Selectors
```yaml
# Select by labels (Kubernetes-style)
labelSelector: app=nginx,env in (prod,staging)

# Select by tags
tagSelector: team=platform,cost-center!=shared

# Combine both
labelSelector: app=api
tagSelector: environment=production
```
Health and Status Filtering
```yaml
# Select only healthy resources
health: healthy

# Exclude unhealthy resources
health: "!unhealthy"

# Select resources with specific statuses
statuses:
  - Running
  - Pending
```
Search Queries
```yaml
# Find all Kubernetes namespaces starting with "kube"
search: type=Kubernetes::Namespace name=kube*

# Find unhealthy AWS EC2 instances
search: type=AWS::EC2::Instance health=unhealthy

# Find configs created in the last 24 hours
search: created_at>now-24h

# Find nginx pods with specific tags
search: type=Kubernetes::Pod labels.app=nginx tags.cluster=prod

# Complex query with a date range
search: updated_at>2025-01-01 updated_at<2025-01-31 type=Kubernetes::Deployment
```
Multi-Agent Selection
```yaml
# Select from a specific agent
agent: production-cluster

# Select from all agents
agent: all

# Select only local (agentless) resources
agent: local
```
Scoped Selection
```yaml
# Select configs from a specific scraper
scope: namespace/my-scraper

# Select checks from a specific canary
scope: canary-uuid-here
```
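Selector fields can also be combined, in which case a resource must match all of them. A sketch of such a combined selector (the namespace, label, and limit values are illustrative):

```yaml
# Recent, healthy production deployments from any agent
types:
  - Kubernetes::Deployment
namespace: production # hypothetical namespace
health: healthy
agent: all
labelSelector: app=api # hypothetical label
search: created_at>now-7d
limit: 50
```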
```yaml title="catalog-pod-check.yaml"
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: k8s-checks
spec:
  schedule: '@every 30s'
  kubernetes:
    - name: notification-pod-health-check
      selector:
        - labelSelector: 'kubernetes.io/app=notification-listener'
          types:
            - Kubernetes::Pod
      test:
        expr: dyn(results).all(x, k8s.isHealthy(x))
```
Healthy
Using `healthy: true` is functionally equivalent to:

```yaml
test:
  expr: dyn(results).all(x, k8s.isHealthy(x))
```
```yaml title="kubernetes-healthy.yaml"
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: kube-system-checks
spec:
  interval: 30
  kubernetes:
    - namespace: kube-system
      name: kube-system
      kind: Pod
      healthy: true
      resource:
        labelSelector: k8s-app=kube-dns
      namespaceSelector:
        name: kube-system
```
See the CEL function `k8s.isHealthy` for more details.
Ready
Similar to the `healthy` flag, there's also a `ready` flag, which is functionally equivalent to the following test expression:

```yaml
test:
  expr: dyn(results).all(x, k8s.isReady(x))
```
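For example, a minimal canary using the `ready` flag might look like this (the Deployment kind and the names used here are illustrative):

```yaml
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: deployment-ready-check # hypothetical name
spec:
  schedule: "@every 5m"
  kubernetes:
    - name: app-deployments
      kind: Deployment
      ready: true # fails if any fetched Deployment is not ready
      namespaceSelector:
        name: default
```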
Checking for certificate readiness
```yaml title="cert-manager.yaml"
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: cert-manager
spec:
  schedule: "@every 15m"
  kubernetes:
    - name: cert-manager-check
      kind: Certificate
      test:
        expr: |
          dyn(results).
            map(i, i.Object).
            filter(i, i.status.conditions[0].status != "True").size() == 0
      display:
        expr: |
          dyn(results).
            map(i, i.Object).
            filter(i, i.status.conditions[0].status != "True").
            map(i, "%s/%s -> %s".format([i.metadata.namespace, i.metadata.name, i.status.conditions[0].message])).join('\n')
```
Remote clusters
A single canary-checker instance can connect to any number of remote clusters via a custom kubeconfig. Either the kubeconfig itself or the path to a kubeconfig file can be provided.
Kubeconfig from a Kubernetes secret
```yaml title="remote-cluster.yaml"
---
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: pod-access-check
spec:
  schedule: "@every 5m"
  kubernetes:
    - name: pod access on aws cluster
      namespace: default
      description: "deploy httpbin"
      kubeconfig:
        valueFrom:
          secretKeyRef:
            name: aws-kubeconfig
            key: kubeconfig
      kind: Pod
      ready: true
      namespaceSelector:
        name: default
```
Kubeconfig inline
```yaml title="remote-cluster.yaml"
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: pod-access-check
spec:
  schedule: "@every 5m"
  kubernetes:
    - name: pod access on aws cluster
      namespace: default
      kubeconfig:
        value: |
          apiVersion: v1
          clusters:
            - cluster:
                certificate-authority-data: xxxxx
                server: https://xxxxx.sk1.eu-west-1.eks.amazonaws.com
              name: arn:aws:eks:eu-west-1:765618022540:cluster/aws-cluster
          contexts:
            - context:
                cluster: arn:aws:eks:eu-west-1:765618022540:cluster/aws-cluster
                namespace: mission-control
                user: arn:aws:eks:eu-west-1:765618022540:cluster/aws-cluster
              name: arn:aws:eks:eu-west-1:765618022540:cluster/aws-cluster
          current-context: arn:aws:eks:eu-west-1:765618022540:cluster/aws-cluster
          kind: Config
          preferences: {}
          users:
            - name: arn:aws:eks:eu-west-1:765618022540:cluster/aws-cluster
              user:
                exec:
                  ....
      kind: Pod
      ready: true
      namespaceSelector:
        name: default
```
Kubeconfig from local filesystem
```yaml title="remote-cluster.yaml"
---
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: pod-access-check
spec:
  schedule: "@every 5m"
  kubernetes:
    - name: pod access on aws cluster
      namespace: default
      kubeconfig:
        value: /root/.kube/aws-kubeconfig
      kind: Pod
      ready: true
      namespaceSelector:
        name: default
```