Kee documentation
Keep an Eye on Everything!
Introduction
Kee is a CLI tool that helps operate backend systems and investigate issues. You customise it to your infrastructure, and it reports on the health of its components, across layers and providers.
Unlike centralised monitoring platforms, it is designed to run locally on your computer. While it doesn't replace monitoring and alerting services, it gives you the opportunity to aggregate all of them into one single convenient view. Kee supports a range of systems, providers and services through probe providers.
Kee is for DevOps and SREs who love TUIs configured with plaintext files. It is heavily inspired by k9s and Terraform.
Setup and usage
Setup
Use Go >= 1.24 to install Kee: go install gitlab.com/mwwaa/kee@v1.4.0
Configure
Kee looks for a configuration file at these paths, in order of preference:
- The path provided as command line argument
- $HOME/.config/kee/config.hcl
- ./kee.hcl
The configuration file must exist but can be empty. However, Kee is more useful if probes are configured:
probe "http" "m.w.fr" {
  interval      = "30s"
  url           = "https://maxime.walzberg.fr/"
  search_string = "Maxime Walzberg"
}
Run
Provided that $GOPATH/bin is in your shell's $PATH, you can just run kee.
Kee accepts the following optional arguments, in any order:
- The path to a configuration file.
- -v followed by a variable name and value (the value is parsed as a boolean for true and false, as a base-10 integer if it contains only plus, minus and digits, or else as a string).
- -f followed by the path to a variable file (HCL file that only contains attributes).
- -t to run in one-shot test mode: this prints a JSON-formatted summary of all probes' test results.
Full example: kee tools/kee.hcl -v env prod -f tools/kee.tfvars -t
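To illustrate the -f option, here is a minimal sketch of what a variable file such as tools/kee.tfvars could contain. The variable names env and alert_prefix are hypothetical and chosen only for this example; the one constraint taken from the text above is that the file holds attributes and no blocks.

```hcl
# Hypothetical variable file, e.g. tools/kee.tfvars.
# Attributes only: a variable file does not contain blocks.
env          = "prod"
alert_prefix = "prod-"
```

With such a file, the configuration would presumably refer to these values as var.env and var.alert_prefix.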
Once kee is up and running, the interface is self-explanatory. Keyboard shortcuts are denoted with angle brackets and do not require the use of modifiers.
Configuration
File structure
A Kee configuration file is written using HashiCorp's HCL language and must conform to the following schema.
Variables
Variables passed to the command line can be used in the configuration file with var.variable_name. They can be used in string interpolations like so: alarm_name = "prefix-${var.variable_name}Suffix", or in any HCL expression.
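As a sketch of how a variable might drive a probe, the block below assumes Kee was started with -v env prod. The probe label and the naming convention of the alarms are assumptions made for illustration.

```hcl
# Assumes the command line included: -v env prod
probe "aws_cloudwatch" "AWS Alarms" {
  interval = "1m"

  # Interpolates to "prod-" when var.env is "prod".
  alarm_name_prefix = "${var.env}-"
}
```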
Constants
Constants make configuration files easier to read. They represent status severity levels, fields and layers. You can use them like so: default_filter = severity.notice
Functions
The following HCL functions are available to make configuration more dynamic:
- today: Returns the current, local day, in a YYYY-MM-DD formatted string.
- toset: Turns an HCL list into a set. Returns an error if the list contains values of different types or duplicate values.
Special formats
Some attribute types expect a string in a specific format or an integer in a predefined range/enum. They are described under Special formats in the reference.
Probes
Kee packs several probe providers, identified by a provider type string such as aws_cloudwatch. Providers are put to use by declaring probe blocks:
probe "provider_type_identifier" "unique_label" {
  # This is a common attribute, available for all probe types
  interval = "10s"
  # This would be an attribute specific to the type of probe.
  url = ""
}
There can be multiple probe blocks using the same probe provider. However, each probe block must have a unique label. Probe blocks have common attributes and specific attributes (see table below).
Some probe providers generate a single status row per probe block, while others yield a dynamic number of statuses based on what's found in the target system.
| Probe type identifier | Description | Attributes |
|---|---|---|
| aws_budgets | Reports the status of AWS Budgets. | Reference |
| aws_cloudwatch | Reports the status of AWS CloudWatch alarms based on name/type/prefix filters. | Reference |
| dns | Queries a DNS server for records and matches the reply against expected answers. | Reference |
| http | Sends HTTP requests and asserts the response's status code and contents. | Reference |
| k8s_daemonset | Reports on the status of DaemonSet resources. | Reference |
| k8s_deployment | Reports on the status of Deployment resources. | Reference |
| k8s_node | Reports on the status of Node resources. | Reference |
| k8s_pod | Reports on the status of Pod resources. | Reference |
| k8s_service | Reports on the status of Service resources. | Reference |
| ping | Pings a host and asserts on packet loss and response times. | Reference |
| slack_status | Monitors Slack's system status. | Reference |
| statuspage_components | Reports the condition of components from a service provider that exposes its system health through Atlassian's StatusPage public API. | Reference |
| statuspage_status | Reports the status of a service provider that exposes its system health through Atlassian's StatusPage public API. | Reference |
Reference
slack_status
Clock
- label (string, Label): Human-readable reminder of what the clock is, for example its timezone. Example: UTC
- location: Timezone location for the clock. Example: Europe/Paris
- format (datetimelayout, Optional): Display format for the clock. Example: 15:04:05
- status_format (datetimelayout, Optional): Display format for the update and change columns of the status table. Example: 15:04:05
- status_day_format (datetimelayout, Optional): Display format for the update and change columns of the status table when the day is not today. Example: 2006-01-02 15:04:05
Kee configuration file
- preferences (Preferences Block, [0, 1]): Set general preferences and defaults.
- theme (Theme Block, [0, 1]): Customize Kee's appearance.
- probe (Probe Block, [0, ∞)): Define a probe with the specified provider, label and configuration.
HeaderTheme
- bg_color (color, Optional): Background color.
- fg_color (color, Optional): Text color.
- status_count_rule (ThemeRule Block, [0, ∞)): Color rules for the status count. Context for the rule's expression: ctx.statuses, ctx.errors
- error_count_rule (ThemeRule Block, [0, ∞)): Color rules for the error count. Context for the rule's expression:
- border_color (color, Optional): Border color.
- layer_menu (ThemeMenu Block, [0, 1]): Colors for the layer filter menu.
- severity_menu (ThemeMenu Block, [0, 1]): Colors for the severity filter menu.
- sort_menu (ThemeMenu Block, [0, 1]): Colors for the sort menu.
Preferences
- refresh_interval (duration, Optional): Time interval at which the interface is refreshed. Example: 1s
- refresh_on_update (bool, Optional): Set to true to always refresh the interface when receiving updates from probes. Example: false
- expiration (duration, Optional): When a given status row hasn't been updated for this duration, it's removed from the interface. Example: 5m
- default_filter (severity, Optional): Severity filter set on startup. Example: severity.notice
- default_sort (field, Optional): Sort ordering set on startup. Example: field.change
- minimum_error_severity (severity, Optional): Minimum severity for a status to count as an error. Example: severity.warning
- display_time_for_statuses (bool, Optional): When set to true, the status rows will display a date/time rather than a time interval for change and update. Example: false
- log_file (string, Optional): Path to a JSON log file that will contain a record of all status changes. Example: /path/to/${var.env}/logs/${today()}.json
- clock (Clock Block, [0, ∞)): Timezone and clock format preferences. Multiple clocks can be configured and cycled through.
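The attributes above could be combined as in the following sketch. It assumes that a variable named env was passed on the command line and that the clock label is written as a block label, as the reference suggests. The values are illustrative, not recommended defaults.

```hcl
preferences {
  refresh_interval       = "1s"
  expiration             = "5m"
  default_filter         = severity.notice
  default_sort           = field.change
  minimum_error_severity = severity.warning

  # One log file per environment and per day, using the today() function.
  # var.env is assumed to be set with: -v env <name>
  log_file = "/path/to/${var.env}/logs/${today()}.json"

  # Several clocks can be declared and cycled through in the interface.
  clock "UTC" {
    location = "UTC"
    format   = "15:04:05"
  }

  clock "Paris" {
    location = "Europe/Paris"
  }
}
```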
Probe
- type (string, Label): Probe provider type identifier. Example: aws_cloudwatch
- label (string, Label): Human-readable, unique label for the probe. Example: AWS Alarms
- minimum_severity (severity, Optional): Minimum severity reported when the probe returns a non-OK status row. Example: severity.critical
- interval (duration, Optional): Time interval between probe checks. Example: 1m
- layer (layer, Optional): Assigns the probe to a layer for improved filtering/sorting. Example: layer.platform
- probe_configuration (hcl.Body, Remain): Provider-specific attributes and blocks. See the probe provider configuration.
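A minimal sketch of how the common attributes sit next to provider-specific ones, using the http provider. The probe label and the chosen layer and severity are assumptions for illustration.

```hcl
probe "http" "Website" {
  # Common attributes, available for every probe type.
  interval         = "1m"
  layer            = layer.application
  minimum_severity = severity.warning

  # Provider-specific attribute (http provider).
  url = "https://maxime.walzberg.fr/"
}
```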
StatusTableTheme
- bg_color (color, Optional): Background color.
- border_color (color, Optional): Border color.
- header_bg_color (color, Optional): Header background color.
- header_fg_color (color, Optional): Header text color.
- columns (field, Optional): List of columns to display.
- rule (ThemeRule Block, [0, ∞)): Color rules for the status rows. Context for the rule's expression: ctx.id, ctx.severity, ctx.layer, ctx.label, ctx.description, ctx.update_sec, ctx.change_sec
Theme
- title (string, Optional): Title displayed on the top left corner of the interface. Example: Kee for ${var.env}
- header (HeaderTheme Block, [0, 1]): Customize the top part of the interface.
- status_table (StatusTableTheme Block, [0, 1]): Customize the status table.
ThemeMenu
ThemeRule
- condition (hcl.Expression, Optional): HCL expression that determines if this rule applies. Available context depends on the parent block. Example: ctx.severity == severity.critical
- bg_color (color, Optional): Background color used when this rule applies.
- fg_color (color, Optional): Text color used when this rule applies.
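The theme blocks described above nest as sketched below. This example assumes that a variable named env is defined and that columns accepts an HCL list of field constants, which the reference implies but does not spell out. The color names are illustrative.

```hcl
theme {
  # var.env is assumed to be set with: -v env <name>
  title = "Kee for ${var.env}"

  header {
    bg_color = "black"
    fg_color = "white"
  }

  status_table {
    # Assumption: columns is written as a list of field constants.
    columns = [field.severity, field.label, field.description, field.change]

    # Highlight critical rows. ctx.severity is part of the context
    # documented for status table rules.
    rule {
      condition = ctx.severity == severity.critical
      bg_color  = "red"
      fg_color  = "white"
    }
  }
}
```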
aws_budgets
- region (string, Optional): Overrides the AWS region. If this field is not set and the AWS profile does not set one, it defaults to us-east-1. Example: eu-north-1
- profile (string, Optional): Use a specific AWS profile. Example: Accounting
- account_id (string, Optional): AWS account ID. If not set, it is resolved using GetCallerIdentity. Example: 123456789012
- budget_name (string, Optional): The name of a single budget. If not specified, all budgets of that account are listed. Example: prod
- notice_threshold_percentage (int, Optional): When the actual or forecasted spend reaches this percentage of the limit, severity will be increased to notice. Example: 90
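A sketch of an aws_budgets probe built from the attributes above. The probe label and the interval are assumptions; the profile and budget names reuse the reference examples.

```hcl
probe "aws_budgets" "Budgets" {
  interval = "1h"

  profile     = "Accounting"
  region      = "eu-north-1"
  budget_name = "prod"

  # Raise severity to notice once spend reaches 90% of the limit.
  notice_threshold_percentage = 90
}
```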
aws_cloudwatch
- action_prefix (string, Optional): Filter alarms based on action prefix. Example: arn:sns:...
- alarm_name_prefix (string, Optional): Filter alarms based on name prefix. Example: prod-
- alarm_names ([]string, Optional): Filter alarms based on exact match to this list of names. Example: prod-api-healthcheck
- alarm_types ([]string, Optional): Filter alarms based on their types. Example: MetricAlarm
- region (string, Optional): Overrides the AWS region. Example: eu-north-1
- profile (string, Optional): Use a specific AWS profile. Example: ProdMonitoring
- insufficient_data_severity (severity, Optional): Overrides the severity for alarms with insufficient data. Example: severity.notice
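A sketch of an aws_cloudwatch probe that filters alarms by name prefix and type, using the example values from the reference. The probe label and interval are assumptions.

```hcl
probe "aws_cloudwatch" "AWS Alarms" {
  interval = "1m"

  profile = "ProdMonitoring"
  region  = "eu-north-1"

  # Only alarms whose name starts with "prod-" and whose type is MetricAlarm.
  alarm_name_prefix = "prod-"
  alarm_types       = ["MetricAlarm"]

  # Report alarms lacking data as notice.
  insufficient_data_severity = severity.notice
}
```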
dns
- nameserver (string): Host:port for the nameserver to query. Example: 1.1.1.1:53
- name (string): FQDN for the record set to query, including the terminal dot. Example: walzberg.fr.
- record_type (string): Type of record set to query. Must be one of the supported types: A, AAAA, TXT, MX, CNAME, CAA. Example: A
- expected_values ([]string): Expected members of the queried record set. Example: 123.45.67.89
- expect_all_values (bool): When true, require that all expected records are returned by the nameserver. Example: false
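A sketch of a dns probe checking an A record, assembled from the reference examples. The probe label and interval are assumptions, and the expected address is the placeholder used in the reference.

```hcl
probe "dns" "walzberg.fr A record" {
  interval = "5m"

  nameserver  = "1.1.1.1:53"
  name        = "walzberg.fr." # note the terminal dot
  record_type = "A"

  expected_values   = ["123.45.67.89"]
  expect_all_values = true
}
```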
http
- method (string, Optional): The request's HTTP method. Example: POST
- url (string): Request URL. Example: https://maxime.walzberg.fr/
- headers (map[string][]string, Optional): Request headers.
- body (string, Optional): Request body. Example: {"title":"Loreum Ipsum"}
- expected_status (int, Optional): Expected HTTP status code of the response. Example: 201
- search_string (string, Optional): Expects the response body to contain this string. Example: OK
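A sketch of an http probe that sends a POST request with headers and a body. The URL and probe label are hypothetical. The headers value follows the map[string][]string type from the reference, so each header maps to a list of strings, and the JSON body is written as an escaped HCL string.

```hcl
probe "http" "API create" {
  interval = "1m"

  method = "POST"
  url    = "https://api.example.com/articles" # hypothetical endpoint

  # Each header name maps to a list of values.
  headers = {
    "Content-Type" = ["application/json"]
  }

  # Quotes inside an HCL string literal must be escaped.
  body = "{\"title\":\"Loreum Ipsum\"}"

  expected_status = 201
  search_string   = "OK"
}
```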
k8s_daemonset
- kubeconfig (string, Optional): File path to a kubectl config file. Example: .kube/config
- context (string, Optional): Kubectl context name. Example: prod
- namespaces ([]string, Optional): Kubernetes namespaces. Example: api
This configuration schema is used by the following probes:
- k8s_daemonset
- k8s_node
- k8s_service
- k8s_deployment
- k8s_pod
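Since the five Kubernetes probes share this schema, one sketch covers them all; only the provider type changes. The probe label, interval and layer below are assumptions.

```hcl
probe "k8s_deployment" "API deployments" {
  interval = "30s"
  layer    = layer.platform

  kubeconfig = ".kube/config"
  context    = "prod"
  namespaces = ["api"]
}
```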
Condition
- statistic (string): Which statistic to base this condition on: packet_loss, avg_rtt, min_rtt, max_rtt or stddev_rtt. Example: packet_loss
- severity: Severity to report if the condition is reached. Example: severity.notice
- threshold_gt_percentage (float64, Optional): Threshold value for the packet_loss statistic. Example: 50
- threshold_gt_ms (int, Optional): Threshold value for the *_rtt statistics. Example: 250
ping
- address (string): IP address or hostname to ping. Example: 123.45.67.89
- ping_interval (duration, Optional): Interval at which ping datagrams are sent. Example: 250ms
- size_bytes (int, Optional): Size of the ping packet in bytes. Example: 64
- condition (Condition Block, [0, ∞)): Conditions that trigger a non-OK severity. Setting one or more overrides the default 3% packet loss alert.
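A sketch of a ping probe with two condition blocks. The pairing of threshold_gt_percentage with packet_loss and threshold_gt_ms with the *_rtt statistics follows the Condition reference; the probe label, interval and severities are assumptions.

```hcl
probe "ping" "Gateway" {
  interval = "30s"

  address       = "123.45.67.89"
  ping_interval = "250ms"
  size_bytes    = 64

  # Declaring any condition replaces the default packet loss alert.
  condition {
    statistic               = "packet_loss"
    severity                = severity.warning
    threshold_gt_percentage = 50
  }

  condition {
    statistic       = "avg_rtt"
    severity        = severity.notice
    threshold_gt_ms = 250
  }
}
```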
statuspage_components
- base_url (string): Base URL of the StatusPage API for the service provider. Example: https://www.cloudflarestatus.com
statuspage_status
- base_url (string): Base URL of the StatusPage API for the service provider. Example: https://www.cloudflarestatus.com
- curious (bool, Optional): When true, display the names of the first 3 components with the worst, non-OK severity. Example: true
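A sketch of a statuspage_status probe using the base URL from the reference example. The probe label and interval are assumptions.

```hcl
probe "statuspage_status" "Cloudflare" {
  interval = "5m"

  base_url = "https://www.cloudflarestatus.com"

  # Also show the names of the worst-affected components.
  curious = true
}
```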
today()
Returns the current, local day, in a YYYY-MM-DD formatted string.
toset()
Turns an HCL list into a set. Returns an error if the list contains values of different types or duplicate values.
- List (cty.DynamicPseudoType): The HCL list to turn into a set.
Constants
- field
  - change
  - description
  - id
  - label
  - layer
  - severity
  - update
- layer
  - application
  - infrastructure
  - none
  - platform
- severity
  - critical
  - notice
  - ok
  - warning
Special formats
- color: Color for the interface's looks. Use a named color available in tcell's palette. Example: "yellow".
- duration: Duration (time interval) expressed using Go's duration string format. Example: "3m" means 3 minutes.
- field: Field (column) of status rows. Use a field constant to define such attributes. Example: field.severity.
- layer: Layer of status rows. Use a layer constant to define such attributes. Example: layer.platform.
- severity: Severity of status rows. Use a severity constant to define such attributes. Example: severity.warning.
- datetimelocation: Timezone location, see LoadLocation in Go's time package. Example: Europe/Paris.
- datetimelayout: Date and time format, see Layout in Go's time package. Example: 2006-01-02T15:04:05.