This draft documentation may be incomplete or inaccurate, and is subject to change until this release is generally available (GA).

Alerts code reference

Learn about the alerts ThoughtSpot may generate.

This reference identifies the messages that can appear in the TS Stats: System Information and Usage > Critical Alerts panel and in the Alerts dashboard.

Informational alerts

APPLICATION_INVALID_STATE

Raised when the application raises an invalid state alert.

Msg

{{.Service}}.{{.Task}} on {{.Machine}} at location {{.Location}}

Type

INFO
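
The Msg values in this reference contain placeholders such as {{.Machine}}. As an illustration only, the sketch below fills in such a message using Go's text/template package. That the placeholders follow Go template syntax is an assumption based on their appearance; the renderAlert helper and every field value are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// renderAlert fills an alert message template with the given field values.
// Assumption: the {{.Field}} placeholders follow Go text/template syntax.
// This helper is hypothetical and is not part of ThoughtSpot.
func renderAlert(msg string, fields map[string]string) (string, error) {
	// missingkey=error makes a missing field fail instead of rendering empty.
	tmpl, err := template.New("alert").Option("missingkey=error").Parse(msg)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, fields); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	msg := "{{.Service}}.{{.Task}} on {{.Machine}} at location {{.Location}}"
	// All field values below are made up for illustration.
	text, err := renderAlert(msg, map[string]string{
		"Service":  "example_service",
		"Task":     "example_task",
		"Machine":  "192.0.2.10",
		"Location": "example_location",
	})
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Println(text)
}
```

Rejecting missing fields is a design choice for the sketch: it surfaces an incomplete alert rather than printing a message with blank values.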

DISK_ERROR

Raised when a machine has disk errors.

Msg

Machine {{.Machine}} has disk errors

Type

INFO

HDFS_CORRUPTION

Raised when the HDFS root directory is corrupted.

Msg

HDFS root directory is in a corrupted state.

Type

INFO

MASTER_ELECTION

Raised when a new Orion Master is elected.

Msg

{{.Machine}} elected as Orion Master

Type

INFO

PERIODIC_BACKUP

Raised when a periodic backup fails.

Msg

{{.Process}} periodic backup for policy {{.Name}} failed.

Type

INFO

PERIODIC_SNAPSHOT

Raised when a periodic snapshot fails.

Msg

{{.Process}} periodic snapshot {{.Name}} failed.

Type

INFO

TASK_TERMINATED

Raised when a task terminates.

Msg

Task {{.Service}}.{{.Task}} terminated on machine {{.Machine}}

Type

INFO

UPDATE_END

Raised when an update completes.

Msg

Finished update of ThoughtSpot cluster {{.Cluster}} to release {{.Release}}

Type

INFO

UPDATE_START

Raised when an update starts.

Msg

Starting update of ThoughtSpot cluster {{.Cluster}}

Type

INFO

ZK_AVG_LATENCY

Raised when average Zookeeper latency is above a threshold.

Msg

Average Zookeeper latency is more than {{.Num}} msec

Type

INFO

ZK_MAX_LATENCY

Raised when max Zookeeper latency is above a threshold.

Msg

Max Zookeeper latency is more than {{.Num}} msec

Type

INFO

ZK_MIN_LATENCY

Raised when min Zookeeper latency is above a threshold.

Msg

Min Zookeeper latency is more than {{.Num}} msec

Type

INFO

ZK_NUM_WATCHERS

Raised when there are too many Zookeeper watchers.

Msg

Number of Zookeeper watchers exceeds {{.Num}}

Type

INFO

ZK_OUTSTANDING_REQUESTS

Raised when there are too many outstanding Zookeeper requests.

Msg

Number of outstanding Zookeeper requests exceeds {{.Num}}

Type

INFO

Errors

TIMELY_ERROR

Raised when a job manager runs into an inconsistent state.

Msg

Job manager {{.Message}}

Type

ERROR

TIMELY_JOB_RUN_ERROR

Raised when a job run fails.

Msg

Job run {{.Message}}

Type

ERROR

Warnings

BOOT_DISK_SPACE

Raised when a machine is low on available disk space on the boot partition.

Msg

Machine {{.Machine}} has less than {{.Perc}}% disk space free on boot partition

Type

WARNING
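
The disk space warnings in this section report that free space has fallen below {{.Perc}} percent. The sketch below shows one way such a check could be computed. The 10 percent threshold, the byte counts, the machine name, and the function names are all hypothetical; the actual thresholds are not stated in this reference.

```go
package main

import "fmt"

// percentFree returns the free share of a partition as a percentage.
// A partition with zero total size is reported as 0% free.
func percentFree(freeBytes, totalBytes uint64) float64 {
	if totalBytes == 0 {
		return 0
	}
	return float64(freeBytes) / float64(totalBytes) * 100
}

// lowOnSpace reports whether the free percentage is below thresholdPerc.
func lowOnSpace(freeBytes, totalBytes uint64, thresholdPerc float64) bool {
	return percentFree(freeBytes, totalBytes) < thresholdPerc
}

func main() {
	const thresholdPerc = 10 // hypothetical threshold, in percent
	// Made-up sample: 4 GiB free on a 50 GiB partition.
	free, total := uint64(4<<30), uint64(50<<30)
	if lowOnSpace(free, total, thresholdPerc) {
		fmt.Printf("Machine %s has less than %v%% disk space free on boot partition\n",
			"node1", thresholdPerc)
	}
}
```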

DISK_ERROR_EXTERNAL

Raised when more than two disk errors occur in a day.

Msg

Machine {{.Machine}} has disk errors

Type

WARNING

DISK_SPACE

Raised when a disk is low on available disk space. Applies only to ThoughtSpot version 3.2.

Msg

Machine {{.Machine}} has less than {{.Perc}}% disk space free

Type

WARNING

EXPORT_DISK_SPACE

Raised when a machine is low on available disk space on the export partition.

Msg

Machine {{.Machine}} has less than {{.Perc}}% disk space free on export partition

Type

WARNING

HDFS_NAMENODE_DISK_SPACE

Raised when a machine is low on available disk space on the HDFS namenode drive.

Msg

Machine {{.Machine}} has less than {{.Perc}}% disk space free on HDFS namenode drive

Type

WARNING

HOST_DOWN

Raised when a host is down.

Msg

{{.Machine}} is down

Type

WARNING

MEMORY

Raised when a machine is low on free memory.

Msg

Machine {{.Machine}} has less than {{.Perc}}% memory free

Type

WARNING

OS_PROCS

Raised when a machine has too many processes.

Msg

Machine {{.Machine}} has more than {{.Num}} processes

Type

WARNING

OS_USERS

Raised when a machine has too many users logged in.

Msg

Machine {{.Machine}} has more than {{.Num}} logged in users

Type

WARNING

ROOT_DISK_SPACE

Raised when a machine is low on available disk space on the root partition.

Msg

Machine {{.Machine}} has less than {{.Perc}}% disk space free on root partition

Type

WARNING

SSH

Raised when a machine does not have an active SSH server.

Msg

Machine {{.Machine}} doesn’t have an active SSH server

Type

WARNING

TASK_NOT_RUNNING

Raised when a service task is not running on any machine in the cluster.

Msg

{{.ServiceDesc}} is not running

Type

WARNING

TASK_UNREACHABLE

Raised when a task is unreachable over HTTP.

Msg

{{.ServiceDesc}} on {{.Machine}} is unreachable over HTTP

Type

WARNING

UPDATE_DISK_SPACE

Raised when a machine is low on available disk space on the update partition.

Msg

Machine {{.Machine}} has less than {{.Perc}}% disk space free on update partition

Type

WARNING

ZK_EPHEMERAL_COUNT

Raised when there are too many Zookeeper ephemeral files.

Msg

Zookeeper has more than {{.Num}} ephemeral files

Type

WARNING

ZK_FD_COUNT

Raised when Zookeeper has too many open file descriptors.

Msg

Zookeeper has more than {{.Num}} open file descriptors

Type

WARNING

Critical alerts

APPLICATION_INVALID_STATE_EXTERNAL

Raised when the application raises an invalid state alert.

Msg

{{.Service}}.{{.Task}} on {{.Machine}} at location {{.Location}}

Type

CRITICAL

HDFS_DISK_SPACE

Raised when an HDFS cluster is low on total available disk space.

Msg

HDFS has less than {{.Perc}}% space free

Type

CRITICAL

OREO_TERMINATED

Raised when the Oreo daemon on a machine terminates due to an error. This typically happens due to an error accessing Zookeeper, HDFS, or a hardware issue.

Msg

Oreo terminated on machine {{.Machine}}

Type

CRITICAL

PERIODIC_BACKUP_FLAPPING

Raised when a periodic backup fails repeatedly.

Msg

Periodic backup failed {{._actual_num_occurrences}} times in last {{._earliest_duration_str}}

Type

CRITICAL

PERIODIC_SNAPSHOT_FLAPPING

Raised when a periodic snapshot fails repeatedly.

Msg

Periodic snapshot failed {{._actual_num_occurrences}} times in last {{._earliest_duration_str}}

Type

CRITICAL

TASK_FLAPPING

Raised when a task crashes repeatedly. The service is evaluated across the whole cluster, so if a service crashes 5 times in a day across all nodes in the cluster, this alert is generated.

Msg

Task {{.Service}}.{{.Task}} terminated {{._actual_num_occurrences}} times in last {{._earliest_duration_str}}

Type

CRITICAL

ZK_INACCESSIBLE

Raised when Zookeeper is inaccessible.

Msg

Zookeeper is not accessible

Type

CRITICAL

