title: TDengine Monitoring
description: This document describes how to monitor your TDengine cluster.

After TDengine is started, it automatically writes monitoring data, including CPU, memory, and disk usage, bandwidth, number of requests, disk I/O speed, and slow queries, into a designated database at a predefined interval through taosKeeper. Important system operations such as login, create user, and drop database, as well as alerts and warnings generated by TDengine, are also written to the log database. A system operator can view the data in the log database from the TDengine CLI or from a web console.

Collection of monitoring information is enabled by default and can be disabled with the monitor parameter in the configuration file.
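
To check how the monitor parameter is currently set, a small helper can read it from the configuration file. This is a minimal sketch: the function name monitor_setting is hypothetical, the path /etc/taos/taos.cfg is an assumed typical Linux location, and the parser assumes the file uses whitespace-separated "key value" lines with # comments and a lowercase key.

```bash
# Hypothetical helper: print the value of the "monitor" key in a
# taos.cfg-style file, or "unset" if the key does not appear.
# Assumes "key value" lines; commented-out lines are ignored because
# their first field is not exactly "monitor".
monitor_setting() {
  cfg="$1"
  awk '$1 == "monitor" { v = $2 } END { print (v == "" ? "unset" : v) }' "$cfg"
}

# Usage (the path is an assumption, adjust to your installation):
#   monitor_setting /etc/taos/taos.cfg
```

If the helper prints unset, the default described above applies.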

TDinsight

TDinsight is a complete solution that uses the monitoring database log described above, together with Grafana, to monitor a TDengine cluster.

A script TDinsight.sh is provided to deploy TDinsight automatically.

Download TDinsight.sh with the following commands:

wget https://github.com/taosdata/grafanaplugin/raw/master/dashboards/TDinsight.sh
chmod +x TDinsight.sh

Prepare the following:

  1. TDengine Server

    • The URL of REST service: for example http://localhost:6041 if TDengine is deployed locally
    • User name and password
  2. Grafana Alert Notification

You can use the command below to set up Grafana alert notification.

An existing Grafana notification channel can be specified with the -E parameter. The notifier uid of the channel can be obtained with curl -u admin:admin localhost:3000/api/alert-notifications | jq

 ```bash
 ./TDinsight.sh -a http://localhost:6041 -u root -p taosdata -E <notifier uid>
 ```

Run TDinsight.sh with the command above and restart Grafana, then open the dashboard at http://localhost:3000/d/tdinsight.

log database

The data for the TDinsight dashboard is stored in the log database. This is the default name; you can change it in the taosKeeper configuration file (see the taosKeeper documentation for more information). taosKeeper creates the log database on startup.

taosd_cluster_basic table

taosd_cluster_basic table contains basic cluster information.

| field | type | is_tag | comment |
| --- | --- | --- | --- |
| ts | TIMESTAMP | | timestamp |
| first_ep | VARCHAR | | first ep of cluster |
| first_ep_dnode_id | INT | | dnode id of first_ep |
| cluster_version | VARCHAR | | TDengine version, such as 3.0.4.0 |
| cluster_id | VARCHAR | TAG | cluster id |

taosd_cluster_info table

taosd_cluster_info table contains cluster information records.

| field | type | is_tag | comment |
| --- | --- | --- | --- |
| ts | TIMESTAMP | | timestamp |
| cluster_uptime | DOUBLE | | seconds of master's uptime |
| dbs_total | DOUBLE | | total number of databases in cluster |
| tbs_total | DOUBLE | | total number of tables in cluster |
| stbs_total | DOUBLE | | total number of stables in cluster |
| dnodes_total | DOUBLE | | total number of dnodes in cluster |
| dnodes_alive | DOUBLE | | total number of dnodes in ready state |
| mnodes_total | DOUBLE | | total number of mnodes in cluster |
| mnodes_alive | DOUBLE | | total number of mnodes in ready state |
| vgroups_total | DOUBLE | | total number of vgroups in cluster |
| vgroups_alive | DOUBLE | | total number of vgroups in ready state |
| vnodes_total | DOUBLE | | total number of vnodes in cluster |
| vnodes_alive | DOUBLE | | total number of vnodes in ready state |
| connections_total | DOUBLE | | total number of connections to cluster |
| topics_total | DOUBLE | | total number of topics in cluster |
| streams_total | DOUBLE | | total number of streams in cluster |
| grants_expire_time | DOUBLE | | time until grants expire in seconds |
| grants_timeseries_used | DOUBLE | | timeseries used |
| grants_timeseries_total | DOUBLE | | total timeseries |
| cluster_id | VARCHAR | TAG | cluster id |
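
As an illustration of reading the taosd_cluster_info table from the TDengine CLI, the sketch below builds a query for the most recent sample so that the alive and total counts can be compared. The helper name cluster_health_sql is hypothetical, and the database name defaults to log, the default named in this document. The query has not been verified against a specific TDengine version.

```bash
# Hypothetical helper: print a query for the newest cluster-level sample.
# $1 is the monitoring database name (defaults to "log").
cluster_health_sql() {
  db="${1:-log}"
  printf 'SELECT ts, dnodes_alive, dnodes_total, vgroups_alive, vgroups_total FROM %s.taosd_cluster_info ORDER BY ts DESC LIMIT 1;' "$db"
}

# Usage (requires the TDengine CLI and a reachable cluster):
#   taos -s "$(cluster_health_sql)"
```

If dnodes_alive is lower than dnodes_total in the returned row, at least one dnode was not in the ready state at that sample time.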

taosd_vgroups_info table

taosd_vgroups_info table contains vgroups information records.

| field | type | is_tag | comment |
| --- | --- | --- | --- |
| ts | TIMESTAMP | | timestamp |
| tables_num | DOUBLE | | number of tables per vgroup |
| status | DOUBLE | | status, value range: unsynced = 0, ready = 1 |
| vgroup_id | VARCHAR | TAG | vgroup id |
| database_name | VARCHAR | TAG | database for the vgroup |
| cluster_id | VARCHAR | TAG | cluster id |

taosd_dnodes_info table

taosd_dnodes_info table contains dnodes information records.

| field | type | is_tag | comment |
| --- | --- | --- | --- |
| ts | TIMESTAMP | | timestamp |
| uptime | DOUBLE | | dnode uptime in seconds |
| cpu_engine | DOUBLE | | cpu usage of tdengine. read from /proc/<taosd_pid>/stat |
| cpu_system | DOUBLE | | cpu usage of server. read from /proc/stat |
| cpu_cores | DOUBLE | | cpu cores of server |
| mem_engine | DOUBLE | | memory usage of tdengine. read from /proc/<taosd_pid>/status |
| mem_free | DOUBLE | | available memory on the server in KB |
| mem_total | DOUBLE | | total memory of server in KB |
| disk_used | DOUBLE | | usage of data dir in bytes |
| disk_total | DOUBLE | | the capacity of data dir in bytes |
| system_net_in | DOUBLE | | network throughput rate in byte/s. read from /proc/net/dev |
| system_net_out | DOUBLE | | network throughput rate in byte/s. read from /proc/net/dev |
| io_read | DOUBLE | | io throughput rate in byte/s. read from /proc/<taosd_pid>/io |
| io_write | DOUBLE | | io throughput rate in byte/s. read from /proc/<taosd_pid>/io |
| io_read_disk | DOUBLE | | io throughput rate of disk in byte/s. read from /proc/<taosd_pid>/io |
| io_write_disk | DOUBLE | | io throughput rate of disk in byte/s. read from /proc/<taosd_pid>/io |
| vnodes_num | DOUBLE | | number of vnodes per dnode |
| masters | DOUBLE | | number of master vnodes |
| has_mnode | DOUBLE | | if the dnode has mnode, value range: include = 1, not_include = 0 |
| has_qnode | DOUBLE | | if the dnode has qnode, value range: include = 1, not_include = 0 |
| has_snode | DOUBLE | | if the dnode has snode, value range: include = 1, not_include = 0 |
| has_bnode | DOUBLE | | if the dnode has bnode, value range: include = 1, not_include = 0 |
| error_log_count | DOUBLE | | error count |
| info_log_count | DOUBLE | | info count |
| debug_log_count | DOUBLE | | debug count |
| trace_log_count | DOUBLE | | trace count |
| dnode_id | VARCHAR | TAG | dnode id |
| dnode_ep | VARCHAR | TAG | dnode endpoint |
| cluster_id | VARCHAR | TAG | cluster id |
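
Several columns in taosd_dnodes_info come in used/total pairs with matching units, for example disk_used and disk_total in bytes. The sketch below turns such a pair into a percentage. The helper name usage_pct is hypothetical, and the input values are assumed to have been fetched from the table beforehand.

```bash
# Hypothetical helper: percentage of a capacity that is in use.
# $1 = used amount, $2 = total amount (same unit for both).
# Prints "n/a" when the total is zero or missing, to avoid dividing by zero.
usage_pct() {
  awk -v used="$1" -v total="$2" 'BEGIN {
    if (total + 0 <= 0) { print "n/a"; exit }
    printf "%.1f\n", used / total * 100
  }'
}

# Usage with values read from disk_used and disk_total:
#   usage_pct 50 200
```

Note that mem_free is described as available memory rather than used memory, so passing mem_free and mem_total yields the free percentage, not the used percentage.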

taosd_dnodes_status table

taosd_dnodes_status table contains dnode status records.

| field | type | is_tag | comment |
| --- | --- | --- | --- |
| ts | TIMESTAMP | | timestamp |
| status | DOUBLE | | dnode status, value range: ready = 1, offline = 0 |
| dnode_id | VARCHAR | TAG | dnode id |
| dnode_ep | VARCHAR | TAG | dnode endpoint |
| cluster_id | VARCHAR | TAG | cluster id |

taosd_dnodes_log_dir table

taosd_dnodes_log_dir table contains log directory information records.

| field | type | is_tag | comment |
| --- | --- | --- | --- |
| ts | TIMESTAMP | | timestamp |
| avail | DOUBLE | | available space for log directory in bytes |
| used | DOUBLE | | used space for log directory in bytes |
| total | DOUBLE | | total space for log directory in bytes |
| name | VARCHAR | TAG | log directory. default is /var/log/taos/ |
| dnode_id | VARCHAR | TAG | dnode id |
| dnode_ep | VARCHAR | TAG | dnode endpoint |
| cluster_id | VARCHAR | TAG | cluster id |

taosd_dnodes_data_dir table

taosd_dnodes_data_dir table contains data directory information records.

| field | type | is_tag | comment |
| --- | --- | --- | --- |
| ts | TIMESTAMP | | timestamp |
| avail | DOUBLE | | available space for data directory in bytes |
| used | DOUBLE | | used space for data directory in bytes |
| total | DOUBLE | | total space for data directory in bytes |
| level | VARCHAR | TAG | level for multi-level storage |
| name | VARCHAR | TAG | data directory. default is /var/lib/taos |
| dnode_id | VARCHAR | TAG | dnode id |
| dnode_ep | VARCHAR | TAG | dnode endpoint |
| cluster_id | VARCHAR | TAG | cluster id |

taosd_mnodes_info table

taosd_mnodes_info table contains mnode information records.

| field | type | is_tag | comment |
| --- | --- | --- | --- |
| ts | TIMESTAMP | | timestamp |
| role | DOUBLE | | the role of mnode. value range: offline = 0, follower = 100, candidate = 101, leader = 102, error = 103, learner = 104 |
| mnode_id | VARCHAR | TAG | mnode id |
| mnode_ep | VARCHAR | TAG | mnode endpoint |
| cluster_id | VARCHAR | TAG | cluster id |

taosd_vnodes_role table

taosd_vnodes_role table contains vnode role information records.

| field | type | is_tag | comment |
| --- | --- | --- | --- |
| ts | TIMESTAMP | | timestamp |
| role | DOUBLE | | role. value range: offline = 0, follower = 100, candidate = 101, leader = 102, error = 103, learner = 104 |
| vgroup_id | VARCHAR | TAG | vgroup id |
| database_name | VARCHAR | TAG | database for the vgroup |
| dnode_id | VARCHAR | TAG | dnode id |
| cluster_id | VARCHAR | TAG | cluster id |
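
The role column in taosd_mnodes_info and taosd_vnodes_role stores a numeric code. When reading query output in scripts, a small lookup can translate the code back into the role names listed in the tables. This is a minimal sketch: the function name role_name is hypothetical, and stripping the fractional part is an assumption about how a DOUBLE value may be printed (for example 102.000000).

```bash
# Hypothetical helper: translate a role code into its name.
# The mapping is taken from the value ranges documented above.
role_name() {
  code="${1%%.*}"   # drop any fractional part, e.g. 102.000000 -> 102
  case "$code" in
    0)   echo offline ;;
    100) echo follower ;;
    101) echo candidate ;;
    102) echo leader ;;
    103) echo error ;;
    104) echo learner ;;
    *)   echo unknown ;;
  esac
}

# Usage:
#   role_name 102
```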

taosd_sql_req table

taosd_sql_req table contains taosd sql records.

| field | type | is_tag | comment |
| --- | --- | --- | --- |
| ts | TIMESTAMP | | timestamp |
| count | DOUBLE | | sql count |
| result | VARCHAR | TAG | sql execution result, value range: Success, Failed |
| username | VARCHAR | TAG | user name who executed the sql |
| sql_type | VARCHAR | TAG | sql type, value range: inserted_rows |
| dnode_id | VARCHAR | TAG | dnode id |
| dnode_ep | VARCHAR | TAG | dnode endpoint |
| vgroup_id | VARCHAR | TAG | vgroup id |
| cluster_id | VARCHAR | TAG | cluster id |

taos_sql_req table

taos_sql_req table contains taos sql records.

| field | type | is_tag | comment |
| --- | --- | --- | --- |
| ts | TIMESTAMP | | timestamp |
| count | DOUBLE | | sql count |
| result | VARCHAR | TAG | sql execution result, value range: Success, Failed |
| username | VARCHAR | TAG | user name who executed the sql |
| sql_type | VARCHAR | TAG | sql type, value range: select, insert, delete |
| cluster_id | VARCHAR | TAG | cluster id |

taos_slow_sql table

taos_slow_sql table contains taos slow sql records.

| field | type | is_tag | comment |
| --- | --- | --- | --- |
| ts | TIMESTAMP | | timestamp |
| count | DOUBLE | | sql count |
| result | VARCHAR | TAG | sql execution result, value range: Success, Failed |
| username | VARCHAR | TAG | user name who executed the sql |
| duration | VARCHAR | TAG | sql execution duration, value range: 3-10s, 10-100s, 100-1000s, 1000s- |
| cluster_id | VARCHAR | TAG | cluster id |

keeper_monitor table

keeper_monitor table contains keeper monitor information records.

| field | type | is_tag | comment |
| --- | --- | --- | --- |
| ts | TIMESTAMP | | timestamp |
| cpu | FLOAT | | cpu usage |
| mem | FLOAT | | memory usage |
| identify | NCHAR | TAG | |

taosadapter_restful_http_request_total table

taosadapter_restful_http_request_total table contains taosadapter REST request records. The timestamp column of this table is _ts.

| field | type | is_tag | comment |
| --- | --- | --- | --- |
| _ts | TIMESTAMP | | timestamp |
| gauge | DOUBLE | | metric value |
| client_ip | NCHAR | TAG | client ip |
| endpoint | NCHAR | TAG | taosadapter endpoint |
| request_method | NCHAR | TAG | request method |
| request_uri | NCHAR | TAG | request uri |
| status_code | NCHAR | TAG | status code |

taosadapter_restful_http_request_fail table

taosadapter_restful_http_request_fail table contains taosadapter failed REST request records. The timestamp column of this table is _ts.

| field | type | is_tag | comment |
| --- | --- | --- | --- |
| _ts | TIMESTAMP | | timestamp |
| gauge | DOUBLE | | metric value |
| client_ip | NCHAR | TAG | client ip |
| endpoint | NCHAR | TAG | taosadapter endpoint |
| request_method | NCHAR | TAG | request method |
| request_uri | NCHAR | TAG | request uri |
| status_code | NCHAR | TAG | status code |

taosadapter_restful_http_request_in_flight table

taosadapter_restful_http_request_in_flight table contains real-time records of in-flight taosadapter REST requests. The timestamp column of this table is _ts.

| field | type | is_tag | comment |
| --- | --- | --- | --- |
| _ts | TIMESTAMP | | timestamp |
| gauge | DOUBLE | | metric value |
| endpoint | NCHAR | TAG | taosadapter endpoint |

taosadapter_restful_http_request_summary_milliseconds table

taosadapter_restful_http_request_summary_milliseconds table contains summary records of REST requests. The timestamp column of this table is _ts.

| field | type | is_tag | comment |
| --- | --- | --- | --- |
| _ts | TIMESTAMP | | timestamp |
| count | DOUBLE | | |
| sum | DOUBLE | | |
| 0.5 | DOUBLE | | |
| 0.9 | DOUBLE | | |
| 0.99 | DOUBLE | | |
| 0.1 | DOUBLE | | |
| 0.2 | DOUBLE | | |
| endpoint | NCHAR | TAG | taosadapter endpoint |
| request_method | NCHAR | TAG | request method |
| request_uri | NCHAR | TAG | request uri |

taosadapter_system_mem_percent table

taosadapter_system_mem_percent table contains taosadapter memory usage information. The timestamp column of this table is _ts.

| field | type | is_tag | comment |
| --- | --- | --- | --- |
| _ts | TIMESTAMP | | timestamp |
| gauge | DOUBLE | | metric value |
| endpoint | NCHAR | TAG | taosadapter endpoint |

taosadapter_system_cpu_percent table

taosadapter_system_cpu_percent table contains taosadapter CPU usage information. The timestamp column of this table is _ts.

| field | type | is_tag | comment |
| --- | --- | --- | --- |
| _ts | TIMESTAMP | | timestamp |
| gauge | DOUBLE | | metric value |
| endpoint | NCHAR | TAG | taosadapter endpoint |