| title | description |
|---|---|
| TDengine Monitoring | This document describes how to monitor your TDengine cluster. | 
After TDengine is started, it automatically writes monitoring data, including CPU, memory, and disk usage, bandwidth, number of requests, disk I/O speed, and slow queries, into a designated database at a predefined interval through taosKeeper. In addition, important system operations such as login, create user, and drop database, as well as alerts and warnings generated in TDengine, are written into the log database. A system operator can view the data in the log database from the TDengine CLI or from a web console.
Collection of monitoring information is enabled by default and can be disabled with the monitor parameter in the configuration file.
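
To check how the monitor parameter is currently set, a small helper can read it from the configuration file. The sketch below assumes the usual "key value" layout of taos.cfg with "#" comments, and assumes that 1 means enabled and 0 means disabled; the function names cfg_value and monitor_enabled are hypothetical.

```bash
# Print the value of a key from a taos.cfg-style file ("key value" per line,
# "#" starts a comment). Prints nothing if the key is not set.
cfg_value() {
  awk -v key="$2" '
    { sub(/#.*/, "") }             # strip comments
    $1 == key { value = $2 }       # remember the last occurrence
    END { if (value != "") print value }
  ' "$1"
}

# Report the monitoring state, treating an unset key as the default (enabled).
# Assumption: 1 means enabled and 0 means disabled.
monitor_enabled() {
  value="$(cfg_value "$1" monitor)"
  if [ "${value:-1}" = "1" ]; then
    echo "enabled"
  else
    echo "disabled"
  fi
}
```

For example, monitor_enabled /etc/taos/taos.cfg prints the state for a default Linux installation, assuming the configuration file lives at that path.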
TDinsight
TDinsight is a complete solution that uses the monitoring database log described above, together with Grafana, to monitor a TDengine cluster.
For details on using TDinsight to monitor TDengine, refer to TDinsight Grafana Dashboard.
A script, TDinsight.sh, is provided to deploy TDinsight automatically.
Download TDinsight.sh with the following commands:
wget https://github.com/taosdata/grafanaplugin/raw/master/dashboards/TDinsight.sh
chmod +x TDinsight.sh
Prepare:
- TDengine Server
  - The URL of REST service: for example, http://localhost:6041 if TDengine is deployed locally
  - User name and password
- Grafana Alert Notification
You can use the following command to set up Grafana alert notification.
An existing Grafana Notification Channel can be specified with the -E parameter. The notifier uid of the channel can be obtained with curl -u admin:admin localhost:3000/api/alert-notifications | jq
 ```bash
 ./TDinsight.sh -a http://localhost:6041 -u root -p taosdata -E <notifier uid>
 ```
Launch TDinsight.sh with the command above and restart Grafana, then open the dashboard at http://localhost:3000/d/tdinsight.
For more use cases and restrictions, refer to TDinsight.
log database
The data for the TDinsight dashboard is stored in the log database by default. You can change the database in the taosKeeper configuration file; for more information, refer to the taosKeeper documentation. taosKeeper creates the log database when it starts.
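
The contents of the log database can also be inspected over the REST service mentioned earlier. The following is a minimal sketch, assuming the REST endpoint /rest/sql with basic authentication, the example URL and credentials used in this document, and that the tables below are created as supertables (their TAG columns suggest this). The helper names show_tables_sql and rest_sql are hypothetical.

```bash
# Assumed defaults, taken from the examples earlier in this document.
TAOS_REST_URL="${TAOS_REST_URL:-http://localhost:6041}"
TAOS_USER="${TAOS_USER:-root}"
TAOS_PASS="${TAOS_PASS:-taosdata}"

# Build the SQL that lists the supertables of the monitoring database.
# The database name defaults to "log" and can be overridden.
show_tables_sql() {
  db="${1:-log}"
  printf 'SHOW %s.STABLES' "$db"
}

# Send one SQL statement to the REST service and print the JSON reply.
rest_sql() {
  curl -sS -u "$TAOS_USER:$TAOS_PASS" -d "$1" "$TAOS_REST_URL/rest/sql"
}
```

With a running cluster, rest_sql "$(show_tables_sql)" should return the table names described in the rest of this section. If taosKeeper is configured with a different database, pass its name to show_tables_sql.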
cluster_info table
The cluster_info table contains cluster information records.
| field | type | is_tag | comment | 
|---|---|---|---|
| ts | TIMESTAMP | | timestamp |
| first_ep | VARCHAR | | first ep of cluster |
| first_ep_dnode_id | INT | | dnode id of first_ep |
| version | VARCHAR | | TDengine version, such as 3.0.4.0 |
| master_uptime | FLOAT | | days of master's uptime |
| monitor_interval | INT | | monitor interval in seconds |
| dbs_total | INT | | total number of databases in cluster |
| tbs_total | BIGINT | | total number of tables in cluster |
| stbs_total | INT | | total number of stables in cluster |
| dnodes_total | INT | | total number of dnodes in cluster |
| dnodes_alive | INT | | total number of dnodes in ready state |
| mnodes_total | INT | | total number of mnodes in cluster |
| mnodes_alive | INT | | total number of mnodes in ready state |
| vgroups_total | INT | | total number of vgroups in cluster |
| vgroups_alive | INT | | total number of vgroups in ready state |
| vnodes_total | INT | | total number of vnodes in cluster |
| vnodes_alive | INT | | total number of vnodes in ready state |
| connections_total | INT | | total number of connections to cluster |
| topics_total | INT | | total number of topics in cluster |
| streams_total | INT | | total number of streams in cluster |
| protocol | INT | | protocol version |
| cluster_id | NCHAR | TAG | cluster id | 
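
The *_total and *_alive column pairs can be compared to spot nodes that are not ready. The sketch below builds a query for the most recent cluster_info row and describes the gap for one pair; the function names are hypothetical, and the database name log is the default described above.

```bash
# SQL for the most recent cluster_info row; "log" is the default database name.
cluster_liveness_sql() {
  db="${1:-log}"
  printf 'SELECT ts, dnodes_total, dnodes_alive, mnodes_total, mnodes_alive, vnodes_total, vnodes_alive FROM %s.cluster_info ORDER BY ts DESC LIMIT 1' "$db"
}

# Compare a total with its alive count and describe the gap.
# Arguments: label, total, alive.
liveness_status() {
  name="$1"; total="$2"; alive="$3"
  down=$((total - alive))
  if [ "$down" -gt 0 ]; then
    echo "$name: $down of $total not ready"
  else
    echo "$name: all $total ready"
  fi
}
```

The query can be run from the TDengine CLI or the REST service, and the returned values passed to liveness_status, for example liveness_status dnodes 3 2. If several clusters write to the same database, the query would also need to filter on cluster_id.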
d_info table
d_info table contains dnodes information records.
| field | type | is_tag | comment | 
|---|---|---|---|
| ts | TIMESTAMP | | timestamp |
| status | VARCHAR | | dnode status |
| dnode_ep | NCHAR | TAG | dnode endpoint | 
| cluster_id | NCHAR | TAG | cluster id | 
m_info table
m_info table contains mnode information records.
| field | type | is_tag | comment | 
|---|---|---|---|
| ts | TIMESTAMP | | timestamp |
| role | VARCHAR | | the role of mnode: leader or follower |
| mnode_id | INT | TAG | master node id | 
| mnode_ep | NCHAR | TAG | master node endpoint | 
| cluster_id | NCHAR | TAG | cluster id | 
dnodes_info table
dnodes_info table contains dnodes information records.
| field | type | is_tag | comment | 
|---|---|---|---|
| ts | TIMESTAMP | | timestamp |
| uptime | FLOAT | | dnode uptime |
| cpu_engine | FLOAT | | cpu usage of tdengine. read from /proc/<taosd_pid>/stat |
| cpu_system | FLOAT | | cpu usage of server. read from /proc/stat |
| cpu_cores | FLOAT | | cpu cores of server |
| mem_engine | INT | | memory usage of tdengine. read from /proc/<taosd_pid>/status |
| mem_system | INT | | available memory on the server |
| mem_total | INT | | total memory of server in KB |
| disk_engine | INT | | |
| disk_used | BIGINT | | usage of data dir in bytes |
| disk_total | BIGINT | | the capacity of data dir in bytes |
| net_in | FLOAT | | inbound network throughput rate in kb/s. read from /proc/net/dev |
| net_out | FLOAT | | outbound network throughput rate in kb/s. read from /proc/net/dev |
| io_read | FLOAT | | io read throughput rate in kb/s. read from /proc/<taosd_pid>/io |
| io_write | FLOAT | | io write throughput rate in kb/s. read from /proc/<taosd_pid>/io |
| io_read_disk | FLOAT | | io read throughput rate of disk in kb/s. read from /proc/<taosd_pid>/io |
| io_write_disk | FLOAT | | io write throughput rate of disk in kb/s. read from /proc/<taosd_pid>/io |
| req_select | INT | | number of select queries received per dnode |
| req_select_rate | FLOAT | | number of select queries received per dnode divided by monitor interval |
| req_insert | INT | | number of insert queries received per dnode |
| req_insert_success | INT | | number of successful insert queries received per dnode |
| req_insert_rate | FLOAT | | number of insert queries received per dnode divided by monitor interval |
| req_insert_batch | INT | | number of batch insertions |
| req_insert_batch_success | INT | | number of successful batch insertions |
| req_insert_batch_rate | FLOAT | | number of batch insertions divided by monitor interval |
| errors | INT | | dnode errors |
| vnodes_num | INT | | number of vnodes per dnode |
| masters | INT | | number of master vnodes |
| has_mnode | INT | | whether the dnode has an mnode |
| has_qnode | INT | | whether the dnode has a qnode |
| has_snode | INT | | whether the dnode has an snode |
| has_bnode | INT | | whether the dnode has a bnode |
| dnode_id | INT | TAG | dnode id | 
| dnode_ep | NCHAR | TAG | dnode endpoint | 
| cluster_id | NCHAR | TAG | cluster id | 
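
Because disk_used and disk_total are both reported in bytes, a usage percentage can be derived directly from one row. The following sketch uses integer arithmetic only; the function names and the default warning threshold of 80 percent are assumptions made for illustration.

```bash
# Integer percentage of the data dir in use, from disk_used and disk_total
# (both in bytes). Prints "n/a" when no capacity is reported.
disk_used_percent() {
  used="$1"; total="$2"
  if [ "$total" -le 0 ]; then
    echo "n/a"
    return 0
  fi
  echo $((used * 100 / total))
}

# Describe one dnode. Arguments: dnode_ep, disk_used, disk_total, and an
# optional warning threshold in percent (hypothetical default: 80).
disk_check() {
  ep="$1"; limit="${4:-80}"
  pct="$(disk_used_percent "$2" "$3")"
  if [ "$pct" = "n/a" ]; then
    echo "$ep: no capacity reported"
  elif [ "$pct" -ge "$limit" ]; then
    echo "$ep: WARN ${pct}% used"
  else
    echo "$ep: ok ${pct}% used"
  fi
}
```

The percentage is truncated toward zero, so a value just under the threshold does not trigger the warning.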
data_dir table
data_dir table contains data directory information records.
| field | type | is_tag | comment | 
|---|---|---|---|
| ts | TIMESTAMP | | timestamp |
| name | NCHAR | | data directory. default is /var/lib/taos |
| level | INT | | level for multi-level storage |
| avail | BIGINT | | available space for data directory |
| used | BIGINT | | used space for data directory |
| total | BIGINT | | total space for data directory |
| dnode_id | INT | TAG | dnode id | 
| dnode_ep | NCHAR | TAG | dnode endpoint | 
| cluster_id | NCHAR | TAG | cluster id | 
log_dir table
log_dir table contains log directory information records.
| field | type | is_tag | comment | 
|---|---|---|---|
| ts | TIMESTAMP | | timestamp |
| name | NCHAR | | log directory. default is /var/log/taos/ |
| avail | BIGINT | | available space for log directory |
| used | BIGINT | | used space for log directory |
| total | BIGINT | | total space for log directory |
| dnode_id | INT | TAG | dnode id | 
| dnode_ep | NCHAR | TAG | dnode endpoint | 
| cluster_id | NCHAR | TAG | cluster id | 
temp_dir table
temp_dir table contains temp dir information records.
| field | type | is_tag | comment | 
|---|---|---|---|
| ts | TIMESTAMP | | timestamp |
| name | NCHAR | | temp directory. default is /tmp/ |
| avail | BIGINT | | available space for temp directory |
| used | BIGINT | | used space for temp directory |
| total | BIGINT | | total space for temp directory |
| dnode_id | INT | TAG | dnode id | 
| dnode_ep | NCHAR | TAG | dnode endpoint | 
| cluster_id | NCHAR | TAG | cluster id | 
vgroups_info table
vgroups_info table contains vgroups information records.
| field | type | is_tag | comment | 
|---|---|---|---|
| ts | TIMESTAMP | | timestamp |
| vgroup_id | INT | | vgroup id |
| database_name | VARCHAR | | database for the vgroup |
| tables_num | BIGINT | | number of tables per vgroup |
| status | VARCHAR | | status |
| dnode_id | INT | TAG | dnode id | 
| dnode_ep | NCHAR | TAG | dnode endpoint | 
| cluster_id | NCHAR | TAG | cluster id | 
vnodes_role table
vnodes_role table contains vnode role information records.
| field | type | is_tag | comment | 
|---|---|---|---|
| ts | TIMESTAMP | | timestamp |
| vnode_role | VARCHAR | | role: leader or follower |
| dnode_id | INT | TAG | dnode id | 
| dnode_ep | NCHAR | TAG | dnode endpoint | 
| cluster_id | NCHAR | TAG | cluster id | 
logs table
logs table contains login information records.
| field | type | is_tag | comment | 
|---|---|---|---|
| ts | TIMESTAMP | | timestamp |
| level | VARCHAR | | log level |
| content | NCHAR | | log content |
| dnode_id | INT | TAG | dnode id | 
| dnode_ep | NCHAR | TAG | dnode endpoint | 
| cluster_id | NCHAR | TAG | cluster id | 
log_summary table
log_summary table contains log summary information records.
| field | type | is_tag | comment | 
|---|---|---|---|
| ts | TIMESTAMP | | timestamp |
| error | INT | | error count |
| info | INT | | info count |
| debug | INT | | debug count |
| trace | INT | | trace count |
| dnode_id | INT | TAG | dnode id | 
| dnode_ep | NCHAR | TAG | dnode endpoint | 
| cluster_id | NCHAR | TAG | cluster id | 
grants_info table
grants_info table contains grants information records.
| field | type | is_tag | comment | 
|---|---|---|---|
| ts | TIMESTAMP | | timestamp |
| expire_time | BIGINT | | time until grants expire in seconds |
| timeseries_used | BIGINT | | timeseries used |
| timeseries_total | BIGINT | | total timeseries |
| dnode_id | INT | TAG | dnode id | 
| dnode_ep | NCHAR | TAG | dnode endpoint | 
| cluster_id | NCHAR | TAG | cluster id | 
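
Since expire_time is expressed in seconds, it is often easier to read as days. The sketch below converts the value and flags grants that are close to expiring; the function names and the default warning window of 30 days are assumptions made for illustration.

```bash
# Convert expire_time (seconds until the grants expire) to whole days.
grant_days_left() {
  echo $(($1 / 86400))
}

# Describe the grant state. Arguments: expire_time in seconds and an
# optional warning window in days (hypothetical default: 30).
grant_status() {
  seconds="$1"; warn_days="${2:-30}"
  days="$(grant_days_left "$seconds")"
  if [ "$seconds" -le 0 ]; then
    echo "expired"
  elif [ "$days" -lt "$warn_days" ]; then
    echo "expires in $days days"
  else
    echo "ok ($days days left)"
  fi
}
```

For example, grant_status 864000 reports 10 days remaining, which falls inside the default warning window.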
keeper_monitor table
keeper_monitor table contains keeper monitor information records.
| field | type | is_tag | comment | 
|---|---|---|---|
| ts | TIMESTAMP | | timestamp |
| cpu | FLOAT | | cpu usage |
| mem | FLOAT | | memory usage |
| identify | NCHAR | TAG | |
taosadapter_restful_http_request_total table
taosadapter_restful_http_request_total table contains taosadapter REST request information records. The timestamp column of this table is _ts.
| field | type | is_tag | comment |
|---|---|---|---|
| _ts | TIMESTAMP | | timestamp |
| gauge | DOUBLE | | metric value |
| client_ip | NCHAR | TAG | client ip |
| endpoint | NCHAR | TAG | taosadapter endpoint |
| request_method | NCHAR | TAG | request method | 
| request_uri | NCHAR | TAG | request uri | 
| status_code | NCHAR | TAG | status code | 
taosadapter_restful_http_request_fail table
taosadapter_restful_http_request_fail table contains taosadapter failed REST request information records. The timestamp column of this table is _ts.
| field | type | is_tag | comment |
|---|---|---|---|
| _ts | TIMESTAMP | | timestamp |
| gauge | DOUBLE | | metric value |
| client_ip | NCHAR | TAG | client ip |
| endpoint | NCHAR | TAG | taosadapter endpoint |
| request_method | NCHAR | TAG | request method | 
| request_uri | NCHAR | TAG | request uri | 
| status_code | NCHAR | TAG | status code | 
taosadapter_restful_http_request_in_flight table
taosadapter_restful_http_request_in_flight table contains real-time taosadapter REST request information records. The timestamp column of this table is _ts.
| field | type | is_tag | comment |
|---|---|---|---|
| _ts | TIMESTAMP | | timestamp |
| gauge | DOUBLE | | metric value |
| endpoint | NCHAR | TAG | taosadapter endpoint |
taosadapter_restful_http_request_summary_milliseconds table
taosadapter_restful_http_request_summary_milliseconds table contains summary records of REST requests. The timestamp column of this table is _ts.
| field | type | is_tag | comment |
|---|---|---|---|
| _ts | TIMESTAMP | | timestamp |
| count | DOUBLE | | |
| sum | DOUBLE | | |
| 0.5 | DOUBLE | | |
| 0.9 | DOUBLE | | |
| 0.99 | DOUBLE | | |
| 0.1 | DOUBLE | | |
| 0.2 | DOUBLE | | |
| endpoint | NCHAR | TAG | taosadapter endpoint |
| request_method | NCHAR | TAG | request method | 
| request_uri | NCHAR | TAG | request uri | 
taosadapter_system_mem_percent table
taosadapter_system_mem_percent table contains taosadapter memory usage information. The timestamp column of this table is _ts.
| field | type | is_tag | comment |
|---|---|---|---|
| _ts | TIMESTAMP | | timestamp |
| gauge | DOUBLE | | metric value |
| endpoint | NCHAR | TAG | taosadapter endpoint |
taosadapter_system_cpu_percent table
taosadapter_system_cpu_percent table contains taosadapter CPU usage information. The timestamp column of this table is _ts.
| field | type | is_tag | comment |
|---|---|---|---|
| _ts | TIMESTAMP | | timestamp |
| gauge | DOUBLE | | metric value |
| endpoint | NCHAR | TAG | taosadapter endpoint |