Kee documentation

Keep an Eye on Everything!

Introduction

Kee is a CLI tool that helps operate backend systems and investigate issues. You customise it to your infrastructure, and it reports on the health of its components, across layers and providers.

Unlike centralised monitoring platforms, it is designed to run locally on your computer. While it doesn't replace monitoring and alerting services, it gives you the opportunity to aggregate all of them into one single convenient view. Kee supports a range of systems, providers and services through probe providers.

Kee is for DevOps and SREs who love TUIs configured with plaintext files. It is heavily inspired by k9s and Terraform.

Setup and usage

Setup

Use Go >= 1.24 to install Kee: go install gitlab.com/mwwaa/kee@v1.4.0

Configure

Kee looks for a configuration file at these paths, in order of preference:

  1. The path provided as command line argument
  2. $HOME/.config/kee/config.hcl
  3. ./kee.hcl

The configuration file must exist but can be empty. However, Kee is more useful once probes are configured:

probe "http" "m.w.fr" {
  interval      = "30s"
  url           = "https://maxime.walzberg.fr/"
  search_string = "Maxime Walzberg"
}

Run

Provided that $GOPATH/bin is in your shell's $PATH, you can just run kee.

Kee accepts the following optional arguments, in any order:

Full example: kee tools/kee.hcl -v env prod -f tools/kee.tfvars -t

Once kee is up and running, the interface is self-explanatory. Keyboard shortcuts are denoted with angle brackets and do not require the use of modifiers.

Configuration

File structure

A Kee configuration file is written using HashiCorp's HCL language and must conform to the following schema.

Variables

Variables passed on the command line can be used in the configuration file as var.variable_name. They can appear in string interpolations, for example alarm_name = "prefix-${var.variable_name}Suffix", or in any HCL expression.
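As a minimal sketch, assuming a variable named env is defined by passing -v env prod on the command line (as in the full example in the Run section), a probe could interpolate it as follows. The probe label and the alarm naming scheme are hypothetical.

```hcl
# Assumes Kee was started with something like: kee -v env prod
probe "aws_cloudwatch" "alarms" {
  # Resolves to "prod-" when var.env is "prod".
  alarm_name_prefix = "${var.env}-"
}
```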

Constants

Constants make configuration files easier to read. They represent status severity levels, fields and layers. You can use them like so: default_filter = severity.notice

Functions

The following HCL functions are available to make configuration more dynamic:

Special formats

Some attribute types expect a string in a specific format, or an integer within a predefined range or enum. They are described in the Special formats section at the end of this document.

Probes

Kee packs several probe providers, identified by a provider type string such as aws_cloudwatch. Providers are put to use by declaring probe blocks:

probe "provider_type_identifier" "unique_label" {
  # This is a common attribute, available for all probe types.
  interval = "10s"

  # This would be an attribute specific to the type of probe.
  url = ""
}

There can be multiple probe blocks using the same probe provider. However, each probe block must have a unique label. Probe blocks have common attributes and specific attributes (see table below).

Some probe providers generate a single status row per probe block, while others yield a dynamic number of statuses based on what's found in the target system.

Probe type identifier  Description                                                   Attributes
aws_budgets            Reports the status of AWS Budgets.                            Reference
aws_cloudwatch         Reports the status of AWS CloudWatch alarms based on          Reference
                       name, type and prefix filters.
dns                    Queries a DNS server for records and matches the reply        Reference
                       against expected answers.
http                   Sends HTTP requests and asserts the response's status code    Reference
                       and contents.
k8s_daemonset          Reports on the status of DaemonSet resources.                 Reference
k8s_deployment         Reports on the status of Deployment resources.                Reference
k8s_node               Reports on the status of Node resources.                      Reference
k8s_pod                Reports on the status of Pod resources.                       Reference
k8s_service            Reports on the status of Service resources.                   Reference
ping                   Pings a host and asserts on packet loss and response times.   Reference
slack_status           Monitors Slack's system status.                               Reference
statuspage_components  Reports the condition of components from a service provider   Reference
                       that exposes its system health through Atlassian's
                       StatusPage public API.
statuspage_status      Reports the status of a service provider that exposes its     Reference
                       system health through Atlassian's StatusPage public API.
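To illustrate the rule that several probe blocks may share one provider as long as each label is unique, here is a short sketch. The labels and URLs are hypothetical placeholders.

```hcl
# Two probes using the same "http" provider, told apart by their labels.
probe "http" "website" {
  interval = "30s"
  url      = "https://example.com/"
}

probe "http" "api-health" {
  interval = "10s"
  url      = "https://api.example.com/health"
}
```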

Reference

slack_status

Clock

label

string Label

Human-readable reminder of what the clock is, for example its timezone.

Example: UTC

location

Timezone location for the clock.

Example: Europe/Paris

format

datetimelayout Optional

Display format for the clock.

Example: 15:04:05

status_format

datetimelayout Optional

Display format for the update and change columns of the status table.

Example: 15:04:05

status_day_format

datetimelayout Optional

Display format for the update and change columns of the status table when the day is not today.

Example: 2006-01-02 15:04:05
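The clock attributes above could be combined as in the following sketch. It assumes that clock blocks are nested inside the preferences block (clock is listed among the preferences attributes) and that the clock's label is written as the block label. The chosen locations and layouts are only examples.

```hcl
preferences {
  # First clock: UTC, time only.
  clock "UTC" {
    location = "UTC"
    format   = "15:04:05"
  }

  # Second clock: Paris time, with a longer layout for rows not updated today.
  clock "Paris" {
    location          = "Europe/Paris"
    format            = "15:04:05"
    status_format     = "15:04:05"
    status_day_format = "2006-01-02 15:04:05"
  }
}
```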

Kee configuration file

preferences

Preferences Block [0, 1]

Set general preferences and defaults.

theme

Theme Block [0, 1]

Customize Kee's appearance.

probe

Probe Block [0, ∞)

Define a probe with the specified provider, label and configuration.

HeaderTheme

bg_color

color Optional

Background color.

fg_color

color Optional

Text color.

status_count_rule

ThemeRule Block [0, ∞)

Color rules for the status count. Context for the rule's expression: ctx.statuses, ctx.errors

error_count_rule

ThemeRule Block [0, ∞)

Color rules for the error count. Context for the rule's expression:

border_color

color Optional

Border color.

layer_menu

ThemeMenu Block [0, 1]

Colors for the layer filter menu.

severity_menu

ThemeMenu Block [0, 1]

Colors for the severity filter menu.

sort_menu

ThemeMenu Block [0, 1]

Colors for the sort menu.

Preferences

refresh_interval

duration Optional

Time interval at which the interface is refreshed.

Example: 1s

refresh_on_update

bool Optional

Set to true to always refresh the interface when receiving updates from probes.

Example: false

expiration

duration Optional

When a given status row hasn't been updated for this duration, it's removed from the interface.

Example: 5m

default_filter

severity Optional

Severity filter set on startup.

Example: severity.notice

default_sort

field Optional

Sort ordering set on startup.

Example: field.change

minimum_error_severity

severity Optional

Minimum severity for a status to count as an error.

Example: severity.warning

display_time_for_statuses

bool Optional

When set to true, the status rows will display a date/time rather than a time interval for change and update.

Example: false

log_file

string Optional

Path to a JSON log file that will contain a record of all status changes.

Example: /path/to/${var.env}/logs/${today()}.json

clock

Clock Block [0, ∞)

Timezone and clock format preferences. Multiple clocks can be configured and cycled through.
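A preferences block using several of these attributes might look like the sketch below. The values are taken from the examples above, except the log path, which is a hypothetical location.

```hcl
preferences {
  refresh_interval       = "1s"
  refresh_on_update      = false
  expiration             = "5m"
  default_filter         = severity.notice
  default_sort           = field.change
  minimum_error_severity = severity.warning

  # Hypothetical path: one JSON log file per day.
  log_file = "/var/log/kee/${today()}.json"
}
```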

Probe

type

string Label

Probe provider type identifier.

Example: aws_cloudwatch

label

string Label

Human-readable, unique label for the probe.

Example: AWS Alarms

minimum_severity

severity Optional

Minimum severity reported when the probe returns a non-OK status row.

Example: severity.critical

interval

duration Optional

Time interval between probe checks.

Example: 1m

layer

layer Optional

Assigns the probe to a layer for improved filtering/sorting.

Example: layer.platform

probe_configuration

hcl.Body Remain

Provider-specific attributes and blocks. See the probe provider configuration.
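As a sketch of how common and provider-specific attributes sit side by side in one probe block, consider the following. The label and URL are hypothetical, and the layer chosen is only an example.

```hcl
probe "http" "homepage" {
  # Common attributes, available for every provider.
  interval         = "1m"
  minimum_severity = severity.critical
  layer            = layer.application

  # Attribute specific to the http provider.
  url = "https://example.com/"
}
```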

StatusTableTheme

bg_color

color Optional

Background color.

border_color

color Optional

Border color.

header_bg_color

color Optional

Header background color.

header_fg_color

color Optional

Header text color.

columns

field Optional

List of columns to display.

rule

ThemeRule Block [0, ∞)

Color rules for the status rows. Context for the rule's expression: ctx.id, ctx.severity, ctx.layer, ctx.label, ctx.description, ctx.update_sec, ctx.change_sec

Theme

title

string Optional

Title displayed on the top left corner of the interface.

Example: Kee for ${var.env}

header

HeaderTheme Block [0, 1]

Customize the top part of the interface.

status_table

StatusTableTheme Block [0, 1]

Customize the status table.

ThemeMenu

fg_color

color Optional

Text color.

selected_fg_color

color Optional

Text color for the selected menu entry.

title_fg_color

color Optional

Text color for the menu's title.

ThemeRule

condition

hcl.Expression Optional

HCL expression that determines if this rule applies. Available context depends on the parent block.

Example: ctx.severity == severity.critical

bg_color

color Optional

Background color used when this rule applies.

fg_color

color Optional

Text color used when this rule applies.
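The theme blocks described above could be assembled as in this sketch. It assumes that ctx.errors is a numeric count and that the color names used exist in tcell's palette. The nesting follows the block types listed above (header and status_table inside theme, rules inside them).

```hcl
theme {
  title = "Kee"

  header {
    bg_color = "black"
    fg_color = "white"

    # Assumption: ctx.errors is a number, so it can be compared with 0.
    status_count_rule {
      condition = ctx.errors > 0
      fg_color  = "red"
    }
  }

  status_table {
    header_fg_color = "yellow"

    # Highlight critical rows.
    rule {
      condition = ctx.severity == severity.critical
      fg_color  = "red"
    }
  }
}
```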

aws_budgets

region

string Optional

Overrides the AWS region. If this field is not set and the AWS profile does not define a region, it defaults to us-east-1.

Example: eu-north-1

profile

string Optional

Use a specific AWS profile.

Example: Accounting

account_id

string Optional

AWS account ID. If not set, it is resolved using GetCallerIdentity.

Example: 123456789012

budget_name

string Optional

The name of a single budget. If not specified, all budgets of the account are listed.

Example: prod

notice_threshold_percentage

int Optional

When the actual or forecasted spend reaches this percentage of the limit, the severity is increased to notice.

Example: 90
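An aws_budgets probe using these attributes might be declared as follows. This is a sketch: the label is hypothetical and the values are the examples given above.

```hcl
probe "aws_budgets" "prod budget" {
  interval = "1m"

  region      = "eu-north-1"
  profile     = "Accounting"
  budget_name = "prod"

  # Raise the severity to notice once 90% of the limit is reached.
  notice_threshold_percentage = 90
}
```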

aws_cloudwatch

action_prefix

string Optional

Filter alarms based on action prefix.

Example: arn:sns:...

alarm_name_prefix

string Optional

Filter alarms based on name prefix.

Example: prod-

alarm_names

[]string Optional

Filter alarms based on exact match to this list of names.

Example: prod-api-healthcheck

alarm_types

[]string Optional

Filter alarms based on their types.

Example: MetricAlarm

region

string Optional

Overrides the AWS region.

Example: eu-north-1

profile

string Optional

Use a specific AWS profile.

Example: ProdMonitoring

insufficient_data_severity

severity Optional

Overrides the severity for alarms with insufficient data.

Example: severity.notice
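For instance, an aws_cloudwatch probe restricted to a known list of alarms could look like the sketch below. The label is hypothetical, and how the different filters combine when several are set is not covered here.

```hcl
probe "aws_cloudwatch" "AWS Alarms" {
  region  = "eu-north-1"
  profile = "ProdMonitoring"

  # Only report these alarms, and only metric alarms.
  alarm_names = ["prod-api-healthcheck"]
  alarm_types = ["MetricAlarm"]

  insufficient_data_severity = severity.notice
}
```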

dns

nameserver

string

Host:port for the nameserver to query.

Example: 1.1.1.1:53

name

string

FQDN of the record set to query, including the trailing dot.

Example: walzberg.fr.

record_type

string

Type of record set to query. Must be one of the supported types: A, AAAA, TXT, MX, CNAME, CAA.

Example: A

expected_values

[]string

Expected members of the queried record set.

Example: 123.45.67.89

expect_all_values

bool

When true, require that all expected records are returned by the nameserver.

Example: false
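Put together, a dns probe could be written as in this sketch. The domain and IP addresses are placeholders from the ranges reserved for documentation, and the label is hypothetical.

```hcl
probe "dns" "example.com A records" {
  nameserver  = "1.1.1.1:53"
  name        = "example.com." # note the trailing dot
  record_type = "A"

  expected_values   = ["192.0.2.10", "192.0.2.11"]
  expect_all_values = true # both addresses must be returned
}
```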

http

method

string Optional

The request's HTTP method.

Example: POST

url

string

Request URL.

Example: https://maxime.walzberg.fr/

headers

map[string][]string Optional

Request headers.

body

string Optional

Request body.

Example: {"title":"Lorem Ipsum"}

expected_status

int Optional

Expected HTTP status code of the response.

Example: 201

search_string

string Optional

Expects the response body to contain this string.

Example: OK
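A fuller http probe, sending a POST request and checking both the status code and the body, might look like this sketch. The URL and label are hypothetical, and writing headers as an HCL map of string lists is an assumption based on the map[string][]string type.

```hcl
probe "http" "create-post" {
  method = "POST"
  url    = "https://example.com/api/posts"

  # Assumption: each header name maps to a list of values.
  headers = {
    "Content-Type" = ["application/json"]
  }

  # Double quotes inside an HCL string are escaped with a backslash.
  body = "{\"title\":\"Lorem Ipsum\"}"

  expected_status = 201
  search_string   = "Lorem Ipsum"
}
```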

k8s_daemonset

kubeconfig

string Optional

File path to a kubectl config file.

Example: .kube/config

context

string Optional

Kubectl context name.

Example: prod

namespaces

[]string Optional

Kubernetes namespaces.

Example: api

This configuration schema is used by the following probes:

  • k8s_daemonset
  • k8s_node
  • k8s_service
  • k8s_deployment
  • k8s_pod
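Since the Kubernetes probes share this schema, the same three attributes can be reused across them. A sketch with hypothetical labels:

```hcl
probe "k8s_deployment" "API deployments" {
  kubeconfig = ".kube/config"
  context    = "prod"
  namespaces = ["api"]
}

probe "k8s_pod" "API pods" {
  kubeconfig = ".kube/config"
  context    = "prod"
  namespaces = ["api"]
}
```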

Condition

statistic

string

Which statistic to base this condition on: packet_loss, avg_rtt, min_rtt, max_rtt or stddev_rtt.

Example: packet_loss

severity

Severity to report if the condition is reached.

Example: severity.notice

threshold_gt_percentage

float64 Optional

Threshold value for the packet_loss statistic.

Example: 50

threshold_gt_ms

int Optional

Threshold value for the *_rtt statistics.

Example: 250

ping

address

string

IP address or hostname to ping.

Example: 123.45.67.89

ping_interval

duration Optional

Interval at which ping datagrams are sent.

Example: 250ms

size_bytes

int Optional

Size of the ping packet in bytes.

Example: 64

condition

Condition Block [0, ∞)

Conditions that trigger a non-OK severity. Setting one or more overrides the default alert on 3% packet loss.
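A ping probe with two custom conditions could be sketched as follows. The address is a placeholder from a documentation range, the label is hypothetical, and reading threshold_gt_* as "strictly greater than" is an assumption based on the attribute names.

```hcl
probe "ping" "gateway" {
  address       = "192.0.2.1"
  ping_interval = "250ms"
  size_bytes    = 64

  # Warn when packet loss exceeds 10 percent.
  condition {
    statistic               = "packet_loss"
    severity                = severity.warning
    threshold_gt_percentage = 10
  }

  # Notice when the average round-trip time exceeds 250 ms.
  condition {
    statistic       = "avg_rtt"
    severity        = severity.notice
    threshold_gt_ms = 250
  }
}
```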

statuspage_components

base_url

string

Base URL of the StatusPage API for the service provider.

Example: https://www.cloudflarestatus.com

statuspage_status

base_url

string

Base URL of the StatusPage API for the service provider.

Example: https://www.cloudflarestatus.com

curious

bool Optional

When true, displays the names of the first 3 components with the worst non-OK severity.

Example: true
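Both StatusPage providers take the same base_url, so they can be pointed at the same service. A sketch using the example URL above, with hypothetical labels:

```hcl
# Overall status of the provider, with the worst components named.
probe "statuspage_status" "Cloudflare" {
  base_url = "https://www.cloudflarestatus.com"
  curious  = true
}

# One status row per component.
probe "statuspage_components" "Cloudflare components" {
  base_url = "https://www.cloudflarestatus.com"
}
```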

Functions

today

Returns the current local day as a YYYY-MM-DD formatted string.

Turns an HCL list into a set. Returns an error if the list contains values of different types or duplicate values.

List
cty.DynamicPseudoType
The HCL list to turn into a set.

Constants

field
  • change
  • description
  • id
  • label
  • layer
  • severity
  • update
layer
  • application
  • infrastructure
  • none
  • platform
severity
  • critical
  • notice
  • ok
  • warning

Special formats

color
Color for the interface's looks. Use a named color available in tcell's palette. Example: "yellow".
duration
Duration (time interval) expressed using Go's duration string format. Example: "3m" means 3 minutes.
field
Field (column) of status rows. Use a field constant to define such an attribute. Example: field.severity.
layer
Layer of status rows. Use a layer constant to define such an attribute. Example: layer.platform.
severity
Severity of status rows. Use a severity constant to define such an attribute. Example: severity.warning.
datetimelocation
Timezone location, as accepted by Go's time.LoadLocation. Example: Europe/Paris.
datetimelayout
Date and time format, expressed as a Go time layout (see time.Layout). Example: 2006-01-02T15:04:05.