Inhibition

When something breaks in your infrastructure, it rarely breaks alone. A crashing pod makes its ReplicaSet unhealthy, which makes its Deployment unhealthy — and one root cause turns into three notifications.

Inhibition lets you keep the notification that points closest to the root cause and automatically suppress the related notifications that follow it.

How it works

An inhibition rule has two sides:

from — the config type whose notification you want to keep (the inhibitor)
to — the related config types whose notifications you want to suppress

Once a notification is sent for a from resource, that resource becomes an inhibitor. For the duration of the notification's repeatInterval, any new event for a related to resource is recorded as inhibited instead of being delivered.

Walking through the pod example:

A pod crashes and a config.unhealthy notification for it is sent. The rule below lists Kubernetes::Pod in from, so this notification becomes an inhibitor.
Moments later, the pod's ReplicaSet and Deployment also turn unhealthy. Their types are listed in to, so Mission Control walks the relationship graph from each of them, finds the pod that already notified, and suppresses both.
You receive one notification — the pod alert — instead of three.

inhibitions:
  - direction: incoming
    from: Kubernetes::Pod
    to:
      - Kubernetes::ReplicaSet
      - Kubernetes::Deployment
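The walkthrough above can be sketched in a few lines of Python. This is a simplified model for illustration, not Mission Control's implementation: the `Rule` and `Notifier` classes, the in-memory `sent_at` record, and the resource ids are all hypothetical, and the relationship-graph walk is replaced by a `related_from` list supplied by the caller. Treating the window as half-open (an event exactly at the end of repeatInterval is delivered) is also an assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Rule:
    # Mirrors the `from` and `to` fields of the rule above.
    from_type: str
    to_types: list[str]


@dataclass
class Notifier:
    rule: Rule
    repeat_interval: timedelta
    # Hypothetical record: resource id -> time its notification was sent.
    sent_at: dict[str, datetime] = field(default_factory=dict)

    def handle(self, resource, config_type, related_from, now):
        """Return "sent" or "inhibited" for a single event.

        related_from holds the ids of `from`-type resources related to
        this resource (standing in for the relationship-graph walk).
        """
        if config_type in self.rule.to_types:
            for inhibitor in related_from:
                sent = self.sent_at.get(inhibitor)
                # Assumption: the window is [sent, sent + repeat_interval).
                if sent is not None and now - sent < self.repeat_interval:
                    return "inhibited"
        self.sent_at[resource] = now
        return "sent"


rule = Rule("Kubernetes::Pod", ["Kubernetes::ReplicaSet", "Kubernetes::Deployment"])
notifier = Notifier(rule, repeat_interval=timedelta(hours=4))
t0 = datetime(2024, 1, 1, 10, 0)  # arbitrary start time

results = [
    notifier.handle("pod/api", "Kubernetes::Pod", [], t0),
    notifier.handle("rs/api", "Kubernetes::ReplicaSet", ["pod/api"], t0 + timedelta(minutes=1)),
    notifier.handle("deploy/api", "Kubernetes::Deployment", ["pod/api"], t0 + timedelta(minutes=2)),
]
print(results)  # ['sent', 'inhibited', 'inhibited']
```

Only the pod event records a send time, so the two parent events find an inhibitor inside the window and are suppressed.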

Things to keep in mind

Inhibition requires repeatInterval on the notification — it doubles as the inhibition window. Without it, inhibition rules are ignored.
Both the kept and the suppressed alerts must come from the same Notification resource, so the notification's events and filter must match all the resource types involved.
Inhibition works on catalog (config) events such as config.unhealthy — not on check or component events.
Order matters: only an already-sent from notification can inhibit. If the Deployment's alert happens to arrive before the Pod's, both are sent.
Inhibited notifications aren't lost — they appear in the notification send history with the status inhibited.

Writing your own rule

Pick the alert to keep. Choose the resource type that gives the clearest signal about the root cause — that's your from. For Kubernetes roll-up health, that's usually the Pod.
List the noise. The related types whose alerts repeat the same information go in to.
Choose a direction. Ask where the to resources sit relative to from in the relationship graph:
- They're parents or owners (Pod → its ReplicaSet/Deployment): use incoming.
- They're children or dependents (Node → its Pods): use outgoing.
- Could be either: use all.
Count the hops and set depth. Each relationship level is one hop: Pod → ReplicaSet is 1, Pod → ReplicaSet → Deployment is 2. If you omit depth, it defaults to 5.
Set soft: true for soft relationships. Ownership links like Deployment → Pod are hard relationships and match by default. Placement links like Node → Pod are soft, and are only followed when soft: true.
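To see how direction, depth, and soft interact, the sketch below runs a breadth-first walk over a tiny hand-written relationship graph. It is an illustrative model only: the `EDGES` list, the resource names, and the `related` function are hypothetical. For simplicity it walks outward from the from resource, whereas Mission Control's actual traversal may start from the to resource.

```python
from collections import deque

# Hypothetical edges as (parent, child, kind).
# "hard" models ownership links, "soft" models placement links.
EDGES = [
    ("Deployment/api", "ReplicaSet/api-7d9f", "hard"),
    ("ReplicaSet/api-7d9f", "Pod/api-7d9f-x", "hard"),
    ("Node/worker-1", "Pod/api-7d9f-x", "soft"),
]


def related(start, direction, depth=5, soft=False):
    """Return the resources reachable from `start` within `depth` hops."""
    seen = {start}
    found = set()
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == depth:
            continue  # hop budget used up on this path
        for parent, child, kind in EDGES:
            if kind == "soft" and not soft:
                continue  # soft links are followed only when soft=True
            neighbours = []
            if direction in ("outgoing", "all") and parent == node:
                neighbours.append(child)  # walk down to children
            if direction in ("incoming", "all") and child == node:
                neighbours.append(parent)  # walk up to parents
            for nxt in neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    found.add(nxt)
                    queue.append((nxt, hops + 1))
    return found


print(related("Pod/api-7d9f-x", "incoming", depth=1))
print(related("Pod/api-7d9f-x", "incoming", depth=2))
print(related("Node/worker-1", "outgoing", depth=1))
print(related("Node/worker-1", "outgoing", depth=1, soft=True))
```

With depth=1 the walk from the pod stops at the ReplicaSet; depth=2 also reaches the Deployment. The node reaches its pod only when soft=True, because that link is a placement (soft) relationship.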

Examples

Keep the Pod alert, suppress its ReplicaSet and Deployment

A pod's failure usually explains why its parents are unhealthy, so this notification keeps the pod alert and inhibits the parent alerts that follow within the 4-hour window. The direction is incoming because ReplicaSets and Deployments are parents of the pod, and depth: 2 covers the two hops from Pod up to Deployment.

deployment-with-inhibition.yaml
apiVersion: mission-control.flanksource.com/v1
kind: Notification
metadata:
  name: pod-with-incoming-inhibition
spec:
  events:
    - config.unhealthy
    - config.warning
  repeatInterval: 4h
  to:
    connection: connection://mission-control/slack
  inhibitions:
    - direction: incoming
      from: Kubernetes::Pod
      to:
        - Kubernetes::Deployment
        - Kubernetes::ReplicaSet
      depth: 2

How this plays out:

Time	Resource	Event	Action
10:00	Pod `api-7d9f`	`config.unhealthy`	Notification sent (becomes the inhibitor)
10:01	ReplicaSet `api-7d9f`	`config.unhealthy`	Inhibited (related pod already notified)
10:02	Deployment `api`	`config.unhealthy`	Inhibited (related pod already notified)
15:30	Deployment `api`	`config.unhealthy`	Notification sent (4h window expired)
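The timing in this table is plain interval arithmetic, which the short Python check below reproduces. The date is arbitrary and the `in_window` helper is hypothetical. Treating the window as ending exactly 4 hours after the pod notification, with the end instant excluded, is an assumption about boundary behaviour.

```python
from datetime import datetime, timedelta

REPEAT_INTERVAL = timedelta(hours=4)
pod_sent = datetime(2024, 1, 1, 10, 0)   # arbitrary date; the pod alert at 10:00
window_end = pod_sent + REPEAT_INTERVAL  # 14:00 on the same day


def in_window(hhmm):
    """True if an event at HH:MM on the same day falls inside the window."""
    hour, minute = map(int, hhmm.split(":"))
    event = pod_sent.replace(hour=hour, minute=minute)
    return pod_sent <= event < window_end


outcomes = {t: ("inhibited" if in_window(t) else "sent") for t in ("10:01", "10:02", "15:30")}
print(outcomes)  # {'10:01': 'inhibited', '10:02': 'inhibited', '15:30': 'sent'}
```

The 15:30 Deployment event falls after the 14:00 cutoff, so it is delivered as a fresh notification.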

Keep the Node alert, suppress its Pods

When a node goes down, every pod scheduled on it raises an alert. This notification keeps the node alert and inhibits the pod alerts. The direction is outgoing because the pods sit below the node, and soft: true is required because Node-to-Pod is a soft relationship.

node-with-inhibition.yaml
apiVersion: mission-control.flanksource.com/v1
kind: Notification
metadata:
  name: node-with-pod-inhibition
spec:
  events:
    - config.unhealthy
    - config.warning
  repeatInterval: 4h
  to:
    connection: connection://mission-control/slack
  inhibitions:
    - direction: outgoing
      from: Kubernetes::Node
      to:
        - Kubernetes::Pod
      soft: true
      depth: 1

Fields

Field	Description	Scheme
`direction*`	Relationship direction from `from` to `to`. Use `outgoing` when `to` resources are downstream or child resources, `incoming` when `to` resources are upstream or parent resources, and `all` to check both directions.	`incoming` | `outgoing` | `all`
`from*`	Config type whose sent notification can inhibit notifications for related `to` resources. For example, `Kubernetes::Deployment`.	`string`
`to*`	Config types that can be inhibited when they are related to a `from` resource that already sent this notification within the `repeatInterval` window.	`[]string`
`depth`	Maximum number of relationship levels to traverse. Defaults to 5 when omitted.	`integer`
`soft`	When false, only hard relationships are considered. When true, both hard and soft relationships are considered. For example, Deployment to Pod is a hard relationship, but Node to Pod is a soft relationship.	`boolean`
