Update index.mdx

Added TDengine Alert function description to the TDinsight manual
Introduced the 14 newly added alert rules in a table format
Yibo Liu 2024-12-04 11:27:16 +08:00 committed by GitHub
parent 3c4e0bd475
commit 4cf63805f5
1 changed files with 38 additions and 0 deletions

@@ -145,6 +145,44 @@ The TDinsight dashboard is designed to provide the usage and status of TDengine-related resources,
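There are also line charts that break the categories above down into finer dimensions.
### Automatic Import of Preconfigured Alert Rules
Drawing on user experience, Taosdata has compiled 14 commonly used alert rules that monitor key cluster metrics and promptly report alerts such as metric anomalies and exceeded limits.
Starting with TDengine-server 3.3.4.3 (tdengine-datasource 3.6.3), TDengine Datasource supports automatic import of preconfigured alert rules, so users can import all 14 alert rules into Grafana with one click and use them directly.
To import the preconfigured alert rules, turn on the Load TDengine Alert switch on the tdengine-datasource setting page, as shown in the figure below; this imports all preconfigured alert rules. If they are no longer needed, click the Clear TDengine Alert button to delete all preconfigured alert rules.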
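The version requirement above can be checked in a script before relying on the import feature. The following is a minimal sketch: the function names `parse_version` and `supports_alert_import` are hypothetical helpers, not part of TDengine or the Grafana plugin, and the sketch assumes both version strings are purely numeric and dot-separated.

```python
from typing import Tuple

def parse_version(text: str) -> Tuple[int, ...]:
    """Split a dotted version string such as '3.3.4.3' into integers."""
    return tuple(int(part) for part in text.split("."))

# Minimum versions stated in this section.
MIN_SERVER = parse_version("3.3.4.3")   # TDengine-server
MIN_PLUGIN = parse_version("3.6.3")     # tdengine-datasource

def supports_alert_import(server_version: str, plugin_version: str) -> bool:
    """Hypothetical helper: True if both versions meet the stated minimums."""
    return (parse_version(server_version) >= MIN_SERVER
            and parse_version(plugin_version) >= MIN_PLUGIN)
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as treating `3.10.0` as older than `3.6.3`.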
![TDengine Alert](./assets/TDengine-Alert.webp)
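After the import, click Alert rules in the left-hand Grafana menu to view all current alert rules.
Users only need to configure contact points to receive alert notifications. For how to configure contact points, see [Alert Configuration](https://docs.taosdata.com/third-party/visual/grafana/#%E5%91%8A%E8%AD%A6%E9%85%8D%E7%BD%AE).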
![Alert-rules](./assets/Alert-rules.webp)
The 14 alert rules are configured as follows:
| Rule name | Threshold | Behavior when there is no monitoring data | Data scan interval | Duration | SQL executed |
| ------ | --------- | ---------------- | ----------- |------- |----------------------|
|dnode CPU load|Average > 80%|Trigger alert|5 minutes|5 minutes |select now(), dnode_id, last(cpu_system) as cup_use from log.taosd_dnodes_info where _ts >= (now- 5m) and _ts < now partition by dnode_id having first(_ts) > 0 |
|dnode memory usage |Average > 60%|Trigger alert|5 minutes|5 minutes|select now(), dnode_id, last(mem_engine) / last(mem_total) * 100 as taosd from log.taosd_dnodes_info where _ts >= (now- 5m) and _ts <now partition by dnode_id |
|dnode disk capacity usage | > 80%|Trigger alert|5 minutes|5 minutes|select now(), dnode_id, last(mem_engine) / last(mem_total) * 100 as taosd from log.taosd_dnodes_info where _ts >= (now- 5m) and _ts <now partition by dnode_id |
|Cluster license expiration |< 60 days|Trigger alert|1 day|0 seconds|select now(), dnode_id, last(mem_engine) / last(mem_total) * 100 as taosd from log.taosd_dnodes_info where _ts >= (now- 5m) and _ts <now partition by dnode_id |
|Measurement points relative to the licensed count|>= 90%|Trigger alert|1 day|0 seconds|select now(), dnode_id, last(mem_engine) / last(mem_total) * 100 as taosd from log.taosd_dnodes_info where _ts >= (now- 5m) and _ts <now partition by dnode_id |
|Concurrent query requests | > 100|Do not trigger alert|1 minute|0 seconds|select now(), dnode_id, last(mem_engine) / last(mem_total) * 100 as taosd from log.taosd_dnodes_info where _ts >= (now- 5m) and _ts <now partition by dnode_id |
|Longest slow query execution time (no time window) |> 300 seconds|Do not trigger alert|1 minute|0 seconds|select now(), dnode_id, last(mem_engine) / last(mem_total) * 100 as taosd from log.taosd_dnodes_info where _ts >= (now- 5m) and _ts <now partition by dnode_id |
|dnode offline |total != alive|Trigger alert|30 seconds|0 seconds|select now(), dnode_id, last(mem_engine) / last(mem_total) * 100 as taosd from log.taosd_dnodes_info where _ts >= (now- 5m) and _ts <now partition by dnode_id |
|vnode offline |total != alive|Trigger alert|30 seconds|0 seconds|select now(), dnode_id, last(mem_engine) / last(mem_total) * 100 as taosd from log.taosd_dnodes_info where _ts >= (now- 5m) and _ts <now partition by dnode_id |
|Data deletion requests |> 0|Do not trigger alert|30 seconds|0 seconds|select now(), dnode_id, last(mem_engine) / last(mem_total) * 100 as taosd from log.taosd_dnodes_info where _ts >= (now- 5m) and _ts <now partition by dnode_id |
|Adapter RESTful request failures |> 5|Do not trigger alert|30 seconds|0 seconds|select now(), dnode_id, last(mem_engine) / last(mem_total) * 100 as taosd from log.taosd_dnodes_info where _ts >= (now- 5m) and _ts <now partition by dnode_id |
|Adapter WebSocket request failures |> 5|Do not trigger alert|30 seconds|0 seconds|select now(), dnode_id, last(mem_engine) / last(mem_total) * 100 as taosd from log.taosd_dnodes_info where _ts >= (now- 5m) and _ts <now partition by dnode_id |
|Missing dnode data reports |< 3|Trigger alert|180 seconds|0 seconds|select now(), dnode_id, last(mem_engine) / last(mem_total) * 100 as taosd from log.taosd_dnodes_info where _ts >= (now- 5m) and _ts <now partition by dnode_id |
|dnode restart |max(update_time) > last(update_time)|Trigger alert|90 seconds|0 seconds|select now(), dnode_id, last(mem_engine) / last(mem_total) * 100 as taosd from log.taosd_dnodes_info where _ts >= (now- 5m) and _ts <now partition by dnode_id |
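Users can refer to the alert rules above and modify or extend them to suit their own business needs.
In Grafana 7.5 and earlier, Dashboards and Alert rules are combined into a single feature, whereas in later versions the two are separate. For compatibility with Grafana 7.5 and earlier, an Alert Used Only panel has been added to the TDinsight dashboard; it is needed only on Grafana 7.5 and earlier.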
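When adjusting a rule, it can help to reason about how the threshold and the no-data behavior interact. The following is a simplified sketch of a single evaluation of a "value > threshold" rule. It is an illustrative model only, not Grafana's actual evaluation engine: the function name `evaluate_rule` and its parameters are hypothetical, the input stands in for the per-dnode values a rule's SQL would return, and the duration column is deliberately ignored.

```python
from typing import Dict, List, Tuple

def evaluate_rule(values: Dict[int, float],
                  threshold: float,
                  alert_on_no_data: bool) -> Tuple[bool, List[int]]:
    """Evaluate one scan of a hypothetical 'value > threshold' rule.

    values maps dnode_id to the metric returned for that dnode.
    Returns (firing, breaching_dnode_ids).
    """
    if not values:
        # No monitoring data: fire only if the rule is configured to do so.
        return (alert_on_no_data, [])
    breaching = sorted(dnode_id for dnode_id, value in values.items()
                       if value > threshold)
    return (bool(breaching), breaching)
```

For example, with a threshold of 80 and the no-data behavior set to trigger an alert, an empty result still fires, while a rule configured not to trigger stays silent when no rows come back.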
![Alert Used Only](./assets/Alert-Used-Only.webp)
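## Upgrade
Any of the following three methods can be used to upgrade:
- Using the graphical interface: if a new version is available, click update on the "TDengine Datasource" plugin page to upgrade.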