Alert rules

Alert rules define the conditions under which the monitoring system sends notifications. Each rule is evaluated against a single metric once every 30 seconds.

The four rule types

Threshold

Fires when a metric value crosses a fixed boundary — for example, heap usage above 85%, queue depth above 100, or error rate above 5%. You choose the operator (>, <, >=, <=, ==) and the value.

Use this for steady-state limits: things that should never exceed a known safe level.
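As an illustration only, a threshold check can be thought of as a lookup from the operator symbol to a comparison function. The sketch below is not the product's implementation; the names OPERATORS and threshold_fires are hypothetical.

```python
import operator

# Hypothetical mapping from the operator symbols a rule accepts
# to the standard-library comparison functions.
OPERATORS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
}


def threshold_fires(value: float, op: str, limit: float) -> bool:
    """Return True when `value <op> limit` holds, meaning the rule fires."""
    return OPERATORS[op](value, limit)


# Heap usage of 91% against a "> 85" rule fires; 80% does not.
print(threshold_fires(91.0, ">", 85.0))  # True
print(threshold_fires(80.0, ">", 85.0))  # False
```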

Rate of change

Fires when a metric is changing faster than expected — for example, an error count that is rising three times faster than its recent baseline. You define the evaluation window over which the rate is computed.

Use this to catch sudden spikes early, before a threshold rule would trigger.
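The documentation does not spell out how the baseline is computed. One plausible reading, sketched below under that assumption, compares the average rate in the current window with the average rate in the window before it. The function names, the (timestamp, cumulative count) sample format, and the default factor of 3 are all hypothetical.

```python
from typing import List, Tuple

# (timestamp in seconds, cumulative error count) -- an assumed sample format.
Sample = Tuple[float, float]


def rate(samples: List[Sample]) -> float:
    """Average change per second between the first and last sample.

    Needs at least two samples with different timestamps.
    """
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


def rate_of_change_fires(baseline: List[Sample],
                         current: List[Sample],
                         factor: float = 3.0) -> bool:
    """Fire when the current rate exceeds `factor` times the baseline rate."""
    return rate(current) > factor * rate(baseline)


# Baseline window: 10 new errors in 300 s. Current window: 40 new errors in 300 s.
baseline = [(0, 100), (300, 110)]
spike = [(300, 110), (600, 150)]
print(rate_of_change_fires(baseline, spike))  # True
```

With this definition a flat baseline (rate 0) fires on any increase at all, which may or may not match the product's behavior.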

Absence

Fires when no data has arrived for a metric within the evaluation window — for example, a channel that has received zero messages in 30 minutes when it normally processes continuously.

Use this to detect stalled channels or broken upstream feeds.
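A minimal sketch of an absence check, assuming the system tracks the time of the most recent data point per metric. The function name and the choice to treat "never received anything" as absent are assumptions, not documented behavior.

```python
from datetime import datetime, timedelta
from typing import Optional


def absence_fires(last_seen: Optional[datetime],
                  now: datetime,
                  window: timedelta = timedelta(minutes=30)) -> bool:
    """Fire when no data point arrived within `window` before `now`.

    `last_seen` is None when the metric has never reported; this sketch
    treats that case as absent (an assumption).
    """
    return last_seen is None or now - last_seen > window


now = datetime(2024, 1, 1, 12, 0)
print(absence_fires(datetime(2024, 1, 1, 11, 45), now))  # False: data 15 min ago
print(absence_fires(datetime(2024, 1, 1, 11, 15), now))  # True: silent for 45 min
```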

State change

Fires when a channel transitions between operational states — for example, a channel moving from Started to Stopped. It does not require a numeric threshold; the state transition itself is the condition.

Use this to get immediate notification when a channel unexpectedly stops or fails to deploy.
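To make the difference from a threshold rule concrete, the sketch below fires only on the transition into a target state, not on every cycle in which the channel remains in that state. The function name and the string state values are illustrative assumptions.

```python
def state_change_fires(previous: str, current: str, target: str = "Stopped") -> bool:
    """Fire only when the state has just changed and the new state is `target`."""
    return previous != current and current == target


print(state_change_fires("Started", "Stopped"))  # True: the transition itself
print(state_change_fires("Stopped", "Stopped"))  # False: no transition this cycle
```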

Scope: server-wide vs. per-channel

When you create a rule, you can leave the channel scope empty (applies to all channels) or target specific channels or channel groups. Group membership resolves dynamically — add a channel to a group and it is covered by any group-scoped rule on the next evaluation cycle, without re-saving the rule.
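The dynamic behavior can be sketched by resolving group membership at evaluation time rather than storing a channel list on the rule. Everything here (the function name, the data shapes, and the channel and group names) is hypothetical.

```python
from typing import Dict, Set


def resolve_scope(scope_channels: Set[str],
                  scope_groups: Set[str],
                  all_channels: Set[str],
                  groups: Dict[str, Set[str]]) -> Set[str]:
    """Return the channels a rule covers on this evaluation cycle."""
    # An empty scope means the rule applies to every channel.
    if not scope_channels and not scope_groups:
        return set(all_channels)
    covered = set(scope_channels)
    # Group membership is looked up now, so later changes are picked up
    # on the next call without touching the rule.
    for name in scope_groups:
        covered |= groups.get(name, set())
    return covered


all_channels = {"ADT-In", "ORU-Out", "Billing"}
groups = {"lab": {"ADT-In"}}

print(resolve_scope(set(), {"lab"}, all_channels, groups))  # {'ADT-In'}
groups["lab"].add("ORU-Out")  # channel joins the group; the rule is unchanged
print(sorted(resolve_scope(set(), {"lab"}, all_channels, groups)))  # ['ADT-In', 'ORU-Out']
```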

Server-level metrics (heap, disk, load, threads) cannot be scoped to individual channels; a rule that tries to do so is rejected.

Severity and cooldown

Every rule has a severity (Info, Warning, or Critical) that determines how it is displayed and which destinations it routes to. The cooldown setting (default 15 minutes) suppresses repeat firings of the same rule so you are not flooded during a sustained condition.

You can also silence a rule for a fixed duration, which is useful during planned maintenance. A silenced rule continues to evaluate, but its notifications are suppressed.
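The interaction between cooldown and silencing can be sketched as a small gate consulted each time a rule's condition is met. This is an assumed model, not the product's code; the class and method names are hypothetical, and the sketch assumes a suppressed firing does not restart the cooldown.

```python
from datetime import datetime, timedelta
from typing import Optional


class NotificationGate:
    """Decides whether a firing rule should actually send a notification."""

    def __init__(self, cooldown: timedelta = timedelta(minutes=15)) -> None:
        self.cooldown = cooldown
        self.last_sent: Optional[datetime] = None
        self.silenced_until: Optional[datetime] = None

    def silence(self, now: datetime, duration: timedelta) -> None:
        """Suppress notifications until `now + duration`."""
        self.silenced_until = now + duration

    def should_notify(self, now: datetime) -> bool:
        """Call each time the rule's condition is met at time `now`."""
        if self.silenced_until is not None and now < self.silenced_until:
            return False  # silenced: the rule evaluated, but nothing is sent
        if self.last_sent is not None and now - self.last_sent < self.cooldown:
            return False  # still inside the cooldown after the last notification
        self.last_sent = now
        return True


gate = NotificationGate()
t0 = datetime(2024, 1, 1, 12, 0)
print(gate.should_notify(t0))                          # True: first firing
print(gate.should_notify(t0 + timedelta(minutes=5)))   # False: within cooldown
print(gate.should_notify(t0 + timedelta(minutes=15)))  # True: cooldown elapsed
```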

Built-in templates

The rule drawer includes pre-built templates as starting points:

Template	Type	Condition
High Error Rate	Threshold	Error rate > 5%
Channel Stopped	State change	State → Stopped
Queue Depth Warning	Threshold	Queued > 100
No Messages (30 min)	Absence	No data in 30 min
JVM Heap High	Threshold	Heap > 85%
Disk Usage High	Threshold	Disk > 85%
Error Spike	Rate of change	Errors rising 3×

Select a template, adjust the values, assign destinations, and save.
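Conceptually, a template is a rule with preset values that you copy and adjust. The sketch below models three of the templates from the table that way. The Rule class, its field names, the parameter keys, and the destination name are all hypothetical and do not reflect the product's actual data model.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    rule_type: str
    params: dict
    destinations: Tuple[str, ...] = ()


# Three of the built-in templates, with values taken from the table above.
TEMPLATES = {
    "High Error Rate": Rule("High Error Rate", "threshold",
                            {"metric": "error_rate", "op": ">", "value": 5}),
    "JVM Heap High": Rule("JVM Heap High", "threshold",
                          {"metric": "heap", "op": ">", "value": 85}),
    "No Messages (30 min)": Rule("No Messages (30 min)", "absence",
                                 {"window_minutes": 30}),
}


def from_template(template_name: str, destinations, **overrides) -> Rule:
    """Copy a template, apply adjusted values, and attach destinations."""
    template = TEMPLATES[template_name]
    return replace(template,
                   params={**template.params, **overrides},
                   destinations=tuple(destinations))


rule = from_template("JVM Heap High", ["ops-email"], value=90)
print(rule.params["value"])                        # 90
print(TEMPLATES["JVM Heap High"].params["value"])  # 85: the template is unchanged
```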