Merge branch 'main' into doc/analysis
README-CN.md
@@ -41,21 +41,23 @@

# 1. 简介
TDengine 是一款开源、高性能、云原生的时序数据库 (Time-Series Database, TSDB)。TDengine 能被广泛运用于物联网、工业互联网、车联网、IT 运维、金融等领域。除核心的时序数据库功能外,TDengine 还提供缓存、数据订阅、流式计算等功能,是一款极简的时序数据处理平台,最大程度地减小系统设计的复杂度,降低研发和运营成本。与其他时序数据库相比,TDengine 的主要优势如下:

TDengine 是一款开源、高性能、云原生、AI 驱动的时序数据库 (Time-Series Database, TSDB)。TDengine 能被广泛运用于物联网、工业互联网、车联网、IT 运维、金融等领域。除核心的时序数据库功能外,TDengine 还提供缓存、数据订阅、流式计算、AI 智能体等功能,是一款极简的时序数据处理平台,最大程度地减小系统设计的复杂度,降低研发和运营成本。与其他时序数据库相比,TDengine 的主要优势如下:
|
||||
|
||||
- **高性能**:通过创新的存储引擎设计,无论是数据写入还是查询,TDengine 的性能比通用数据库快 10 倍以上,也远超其他时序数据库,存储空间不及通用数据库的 1/10。
|
||||
|
||||
- **云原生**:通过原生分布式的设计,充分利用云平台的优势,TDengine 提供了水平扩展能力,具备弹性、韧性和可观测性,支持 k8s 部署,可运行在公有云、私有云和混合云上。
|
||||
|
||||
- **极简时序数据平台**:TDengine 内建消息队列、缓存、流式计算等功能,应用无需再集成 Kafka/Redis/HBase/Spark 等软件,大幅降低系统的复杂度,降低应用开发和运营成本。
|
||||
- **极简时序数据平台**:TDengine 内建消息队列、缓存、流式计算、AI 智能体等功能,应用无需再集成 Kafka/Redis/HBase/Spark 等软件,大幅降低系统的复杂度,降低应用开发和运营成本。
|
||||
|
||||
- **分析能力**:支持 SQL,同时为时序数据特有的分析提供SQL扩展。通过超级表、存储计算分离、分区分片、预计算、自定义函数等技术,TDengine 具备强大的分析能力。
|
||||
- **分析能力**:支持 SQL,同时为时序数据特有的分析提供 SQL 扩展。通过超级表、存储计算分离、分区分片、预计算、自定义函数以及 AI Agent 等技术,TDengine 具备强大的分析能力。
|
||||
|
||||
- **简单易用**:无任何依赖,安装、集群几秒搞定;提供REST以及各种语言连接器,与众多第三方工具无缝集成;提供命令行程序,便于管理和即席查询;提供各种运维工具。
|
||||
- **AI智能体**:内置时序数据智能体 TDgpt, 无缝连接时序数据基础模型、大语言模型、机器学习、传统统计算法等,提供时序数据预测、异常检测、数据补全和数据分类的功能。
|
||||
|
||||
- **简单易用**:无任何依赖,安装、集群几秒搞定;提供 REST 以及各种语言连接器,与众多第三方工具无缝集成;提供命令行程序,便于管理和即席查询;提供各种运维工具。
|
||||
|
||||
- **核心开源**:TDengine 的核心代码包括集群功能全部开源,截止到 2022 年 8 月 1 日,全球超过 135.9k 个运行实例,GitHub Star 18.7k,Fork 4.4k,社区活跃。
|
||||
|
||||
了解TDengine高级功能的完整列表,请 [点击](https://tdengine.com/tdengine/)。体验 TDengine 最简单的方式是通过 [TDengine云平台](https://cloud.tdengine.com)。
|
||||
了解TDengine高级功能的完整列表,请 [点击](https://tdengine.com/tdengine/)。体验 TDengine 最简单的方式是通过 [TDengine云平台](https://cloud.tdengine.com)。对最新发布的 TDengine 组件 TDgpt,请访问 [TDgpt README](./tools/tdgpt/README.md) 了解细节。
|
||||
|
||||
# 2. 文档
|
||||
|
||||
|
@ -67,7 +69,7 @@ TDengine 是一款开源、高性能、云原生的时序数据库 (Time-Series
|
|||
|
||||
# 3. 前置条件
|
||||
|
||||
TDengine 目前可以在 Linux、 Windows、macOS 等平台上安装和运行。任何 OS 的应用也可以选择 taosAdapter 的 RESTful 接口连接服务端 taosd。CPU 支持 X64、ARM64,后续会支持 MIPS64、Alpha64、ARM32、RISC-V 等 CPU 架构。目前不支持使用交叉编译器构建。
|
||||
TDengine 目前可以在 Linux 和 macOS 平台上安装和运行 (企业版支持 Windows)。任何 OS 的应用也可以选择 taosAdapter 的 RESTful 接口连接服务端 taosd。CPU 支持 X64、ARM64,后续会支持 MIPS64、Alpha64、ARM32、RISC-V 等 CPU 架构。目前不支持使用交叉编译器构建。
|
||||
|
||||
如果你想要编译 taosAdapter 或者 taosKeeper,需要安装 Go 1.18 及以上版本。
|
||||
|
||||
|
README.md
@@ -54,23 +54,23 @@ English | [简体中文](README-CN.md) | [TDengine Cloud](https://cloud.tdengine

# 1. Introduction
|
||||
|
||||
TDengine is an open source, high-performance, cloud native [time-series database](https://tdengine.com/tsdb/) optimized for Internet of Things (IoT), Connected Cars, and Industrial IoT. It enables efficient, real-time data ingestion, processing, and monitoring of TB and even PB scale data per day, generated by billions of sensors and data collectors. TDengine differentiates itself from other time-series databases with the following advantages:
|
||||
TDengine is an open source, high-performance, cloud native and AI powered [time-series database](https://tdengine.com/tsdb/) designed for Internet of Things (IoT), Connected Cars, and Industrial IoT. It enables efficient, real-time data ingestion, processing, and analysis of TB and even PB scale data per day, generated by billions of sensors and data collectors. TDengine differentiates itself from other time-series databases with the following advantages:
|
||||
|
||||
- **[High Performance](https://tdengine.com/tdengine/high-performance-time-series-database/)**: TDengine is the only time-series database to solve the high cardinality issue to support billions of data collection points while outperforming other time-series databases for data ingestion, querying and data compression.
|
||||
|
||||
- **[Simplified Solution](https://tdengine.com/tdengine/simplified-time-series-data-solution/)**: Through built-in caching, stream processing and data subscription features, TDengine provides a simplified solution for time-series data processing. It reduces system design complexity and operation costs significantly.
|
||||
- **[Simplified Solution](https://tdengine.com/tdengine/simplified-time-series-data-solution/)**: Through built-in caching, stream processing, data subscription and AI agent features, TDengine provides a simplified solution for time-series data processing. It reduces system design complexity and operation costs significantly.
|
||||
|
||||
- **[Cloud Native](https://tdengine.com/tdengine/cloud-native-time-series-database/)**: Through native distributed design, sharding and partitioning, separation of compute and storage, RAFT, support for kubernetes deployment and full observability, TDengine is a cloud native Time-Series Database and can be deployed on public, private or hybrid clouds.
|
||||
|
||||
- **[AI Powered](https://tdengine.com/tdengine/tdgpt/)**: Through the built-in AI agent TDgpt, TDengine can connect to a variety of time-series foundation models, large language models, machine learning models, and traditional algorithms to provide time-series data forecasting, anomaly detection, imputation, and classification.
|
||||
|
||||
- **[Ease of Use](https://tdengine.com/tdengine/easy-time-series-data-platform/)**: For administrators, TDengine significantly reduces the effort to deploy and maintain. For developers, it provides a simple interface, simplified solution and seamless integrations for third party tools. For data users, it gives easy data access.
|
||||
|
||||
- **[Easy Data Analytics](https://tdengine.com/tdengine/time-series-data-analytics-made-easy/)**: Through super tables, storage and compute separation, data partitioning by time interval, pre-computation and other means, TDengine makes it easy to explore, format, and get access to data in a highly efficient way.
|
||||
- **[Easy Data Analytics](https://tdengine.com/tdengine/time-series-data-analytics-made-easy/)**: Through super tables, storage and compute separation, data partitioning by time interval, pre-computation and AI agent, TDengine makes it easy to explore, format, and get access to data in a highly efficient way.
|
||||
|
||||
- **[Open Source](https://tdengine.com/tdengine/open-source-time-series-database/)**: TDengine’s core modules, including cluster feature, are all available under open source licenses. It has gathered 19.9k stars on GitHub. There is an active developer community, and over 139k running instances worldwide.
|
||||
- **[Open Source](https://tdengine.com/tdengine/open-source-time-series-database/)**: TDengine’s core modules, including cluster feature and AI agent, are all available under open source licenses. It has gathered 23.7k stars on GitHub. There is an active developer community, and over 730k running instances worldwide.
|
||||
|
||||
For a full list of TDengine competitive advantages, please [check here](https://tdengine.com/tdengine/). The easiest way to experience TDengine is through [TDengine Cloud](https://cloud.tdengine.com).
|
||||
|
||||
For the latest TDengine component TDgpt, please refer to [TDgpt README](./tools/tdgpt/README.md) for details.
|
||||
For a full list of TDengine competitive advantages, please [check here](https://tdengine.com/tdengine/). The easiest way to experience TDengine is through [TDengine Cloud](https://cloud.tdengine.com). For the latest TDengine component TDgpt, please refer to [TDgpt README](./tools/tdgpt/README.md) for details.
|
||||
|
||||
# 2. Documentation
|
||||
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
# taosadapter
|
||||
ExternalProject_Add(taosadapter
|
||||
GIT_REPOSITORY https://github.com/taosdata/taosadapter.git
|
||||
GIT_TAG 3.3.6
|
||||
GIT_TAG main
|
||||
SOURCE_DIR "${TD_SOURCE_DIR}/tools/taosadapter"
|
||||
BINARY_DIR ""
|
||||
#BUILD_IN_SOURCE TRUE
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
# taosws-rs
|
||||
ExternalProject_Add(taosws-rs
|
||||
GIT_REPOSITORY https://github.com/taosdata/taos-connector-rust.git
|
||||
GIT_TAG 3.0
|
||||
GIT_TAG main
|
||||
SOURCE_DIR "${TD_SOURCE_DIR}/tools/taosws-rs"
|
||||
BINARY_DIR ""
|
||||
#BUILD_IN_SOURCE TRUE
|
||||
|
|
|
@ -6,6 +6,8 @@ slug: /basic-features/data-model
|
|||
|
||||
import Image from '@theme/IdealImage';
|
||||
import dataModel from '../assets/data-model-01.png';
|
||||
import origintable from '../assets/data-model-origin-table.png';
|
||||
import origintable2 from '../assets/data-model-origin-table-2.png';
|
||||
|
||||
To clearly explain the concepts of time-series data and facilitate the writing of example programs, the TDengine documentation uses smart meters as an example. These example smart meters can collect three metrics: current, voltage, and phase. In addition, each smart meter also has two static attributes: location and group ID. The data collected by these smart meters is shown in the table below.
|
||||
|
||||
|
@ -79,6 +81,19 @@ To better understand the relationship between metrics, tags, supertables, and su
|
|||
<figcaption>Figure 1. The TDengine data model</figcaption>
|
||||
</figure>
|
||||
|
||||
### Virtual Tables
|
||||
|
||||
The design of "one table per data collection point" and "supertables" addresses most challenges in time-series data management and analysis for industrial and IoT scenarios. However, in real-world scenarios, a single device often has multiple sensors with varying collection frequencies. For example, a wind turbine may have electrical parameters, environmental parameters, and mechanical parameters, each collected by different sensors at different intervals. This makes it difficult to describe a device with a single table, often requiring multiple tables. When analyzing data across multiple sensors, multi-level join queries become necessary, which can lead to usability and performance issues. From a user perspective, "one table per device" is more intuitive. However, directly implementing this model would result in excessive NULL values at each timestamp due to varying collection frequencies, reducing storage and query efficiency.
|
||||
|
||||
To resolve this, TDengine introduces **Virtual Tables** (VTables). A virtual table is a logical entity that does not store physical data but enables analytical computations by dynamically combining columns from multiple source tables (subtables or regular tables). Like physical tables, virtual tables can be categorized into **virtual supertables**, **virtual subtables**, and **virtual regular tables**. A virtual supertable can represent a complete dataset for a device or group of devices, while each virtual subtable can flexibly reference columns from different sources. This allows users to define custom data views tailored to specific analytical needs, achieving a "personalized schema per user" effect. Virtual tables cannot be written to or deleted from but are queried like physical tables. The key distinction is that virtual table data is dynamically generated during queries—only columns referenced in a query are merged into the virtual table. Thus, the same virtual table may present entirely different datasets across different queries.
|
||||
|
||||
**Key Features of Virtual Supertables:**
|
||||
1. **Column Selection & Merging**: Users can select specific columns from multiple source tables and combine them into a unified view.
|
||||
2. **Timestamp-Based Alignment**: Data is aligned by timestamp. If multiple tables have data at the same timestamp, their column values are merged into a single row. Missing values are filled with NULL.
|
||||
3. **Dynamic Updates**: Virtual tables automatically reflect changes in source tables, ensuring real-time data without physical storage.
|
||||
|
||||
By introducing virtual tables, TDengine simplifies the management of complex device data. Regardless of how individual collection points are modeled (single-column or multi-column) or distributed across databases/tables, users can freely define data sources through virtual supertables. This enables cross-collection-point aggregation and analysis, making "one table per device" a practical reality.
|
||||
|
||||
### Database
|
||||
|
||||
A database in TDengine is used to manage a collection of tables. TDengine allows a running instance to contain multiple databases, and each database can be configured with different storage strategies. Since different types of data collection points usually have different data characteristics, such as data collection frequency, data retention period, number of replicas, data block size, etc., it is recommended to create supertables with different data characteristics in different databases.
|
||||
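For instance, a minimal sketch of this recommendation is shown below; the database names and the retention/partition options are illustrative assumptions rather than values taken from this documentation.

```sql
-- Illustrative only: one database keeping metering data for a year,
-- and a separate database for short-lived diagnostic data kept 30 days.
CREATE DATABASE power_metrics KEEP 365 DURATION 10;
CREATE DATABASE power_diag KEEP 30 DURATION 10;
```

Supertables with similar retention and ingestion characteristics would then be created inside the matching database.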
|
@ -226,3 +241,174 @@ TDengine supports flexible data model designs, including multi-column and single
|
|||
Although TDengine recommends using the multi-column model because it generally offers better writing and storage efficiency, the single-column model might be more suitable in certain specific scenarios. For example, if the types of quantities collected at a data collection point frequently change, using a multi-column model would require frequent modifications to the supertable's structural definition, increasing the complexity of the application. In such cases, using a single-column model can simplify the design and management of the application, as it allows independent management and expansion of each physical quantity's supertable.
|
||||
|
||||
Overall, TDengine offers flexible data model options, allowing users to choose the most suitable model based on actual needs and scenarios to optimize performance and manage complexity.
|
||||
|
||||
### Creating Virtual Tables
|
||||
|
||||
Whether using single-column or multi-column models, TDengine enables cross-table operations through virtual tables. Using smart meters as an example, here we introduce two typical use cases for virtual tables:
|
||||
|
||||
1. Single-Source Multi-Dimensional Time-Series Aggregation
|
||||
2. Cross-Source Metric Comparative Analysis
|
||||
|
||||
---
|
||||
|
||||
#### 1. Single-Source Multi-Dimensional Time-Series Aggregation
|
||||
In this scenario, "single-source" refers to multiple **single-column time-series tables** from the **same data collection point**. While these tables are physically split due to business requirements or constraints, they maintain logical consistency through device tags and timestamps. Virtual tables restore "vertically" split data into a complete "horizontal" view of the collection point.
|
||||
For example, suppose three supertables are created for current, voltage, and phase measurements using a single-column model. Virtual tables can aggregate these three measurements into one unified view.
|
||||
|
||||
The SQL statement for creating a supertable in the single-column model is as follows:
|
||||
|
||||
```sql
|
||||
|
||||
CREATE STABLE current_stb (
|
||||
ts timestamp,
|
||||
current float
|
||||
) TAGS (
|
||||
device_id varchar(64),
|
||||
location varchar(64),
|
||||
group_id int
|
||||
);
|
||||
|
||||
CREATE STABLE voltage_stb (
|
||||
ts timestamp,
|
||||
voltage int
|
||||
) TAGS (
|
||||
device_id varchar(64),
|
||||
location varchar(64),
|
||||
group_id int
|
||||
);
|
||||
|
||||
CREATE STABLE phase_stb (
|
||||
ts timestamp,
|
||||
phase float
|
||||
) TAGS (
|
||||
device_id varchar(64),
|
||||
location varchar(64),
|
||||
group_id int
|
||||
);
|
||||
```
|
||||
|
||||
Assume there are four devices: d1001, d1002, d1003, and d1004. To create subtables for their current, voltage, and phase measurements, use the following SQL statements:
|
||||
|
||||
```sql
CREATE TABLE current_d1001 USING current_stb (device_id, location, group_id) TAGS ("d1001", "California.SanFrancisco", 2);
CREATE TABLE current_d1002 USING current_stb (device_id, location, group_id) TAGS ("d1002", "California.SanFrancisco", 3);
CREATE TABLE current_d1003 USING current_stb (device_id, location, group_id) TAGS ("d1003", "California.LosAngeles", 3);
CREATE TABLE current_d1004 USING current_stb (device_id, location, group_id) TAGS ("d1004", "California.LosAngeles", 2);

CREATE TABLE voltage_d1001 USING voltage_stb (device_id, location, group_id) TAGS ("d1001", "California.SanFrancisco", 2);
CREATE TABLE voltage_d1002 USING voltage_stb (device_id, location, group_id) TAGS ("d1002", "California.SanFrancisco", 3);
CREATE TABLE voltage_d1003 USING voltage_stb (device_id, location, group_id) TAGS ("d1003", "California.LosAngeles", 3);
CREATE TABLE voltage_d1004 USING voltage_stb (device_id, location, group_id) TAGS ("d1004", "California.LosAngeles", 2);

CREATE TABLE phase_d1001 USING phase_stb (device_id, location, group_id) TAGS ("d1001", "California.SanFrancisco", 2);
CREATE TABLE phase_d1002 USING phase_stb (device_id, location, group_id) TAGS ("d1002", "California.SanFrancisco", 3);
CREATE TABLE phase_d1003 USING phase_stb (device_id, location, group_id) TAGS ("d1003", "California.LosAngeles", 3);
CREATE TABLE phase_d1004 USING phase_stb (device_id, location, group_id) TAGS ("d1004", "California.LosAngeles", 2);
```
|
||||
|
||||
A virtual supertable can be used to aggregate these three types of measurements into a single table. The SQL statement to create the virtual supertable is as follows:
|
||||
|
||||
```sql
|
||||
CREATE STABLE meters_v (
|
||||
ts timestamp,
|
||||
current float,
|
||||
voltage int,
|
||||
phase float
|
||||
) TAGS (
|
||||
location varchar(64),
|
||||
group_id int
|
||||
) VIRTUAL 1;
|
||||
```
|
||||
|
||||
For the four devices d1001, d1002, d1003, and d1004, create virtual subtables with the following SQL statements:
|
||||
|
||||
```sql
|
||||
CREATE VTABLE d1001_v (
|
||||
current from current_d1001.current,
|
||||
voltage from voltage_d1001.voltage,
|
||||
phase from phase_d1001.phase
|
||||
)
|
||||
USING meters_v
|
||||
TAGS (
|
||||
"California.SanFrancisco",
|
||||
2
|
||||
);
|
||||
|
||||
CREATE VTABLE d1002_v (
|
||||
current from current_d1002.current,
|
||||
voltage from voltage_d1002.voltage,
|
||||
phase from phase_d1002.phase
|
||||
)
|
||||
USING meters_v
|
||||
TAGS (
|
||||
"California.SanFrancisco",
|
||||
3
|
||||
);
|
||||
|
||||
CREATE VTABLE d1003_v (
|
||||
current from current_d1003.current,
|
||||
voltage from voltage_d1003.voltage,
|
||||
phase from phase_d1003.phase
|
||||
)
|
||||
USING meters_v
|
||||
TAGS (
|
||||
"California.LosAngeles",
|
||||
3
|
||||
);
|
||||
|
||||
CREATE VTABLE d1004_v (
|
||||
current from current_d1004.current,
|
||||
voltage from voltage_d1004.voltage,
|
||||
phase from phase_d1004.phase
|
||||
)
|
||||
USING meters_v
|
||||
TAGS (
|
||||
"California.LosAngeles",
|
||||
2
|
||||
);
|
||||
```
|
||||
|
||||
Taking device d1001 as an example, assume that the current, voltage, and phase data of device d1001 are as follows:
|
||||
|
||||
<img src={origintable} width="500" alt="data-model-origin-table" />
The virtual table `d1001_v` aligns these measurements by timestamp, filling missing values with NULL:

| Timestamp | Current | Voltage | Phase |
|
||||
|-------------------|---------|---------|-------|
|
||||
| 1538548685000 | 10.3 | 219 | 0.31 |
|
||||
| 1538548695000 | 12.6 | 218 | 0.33 |
|
||||
| 1538548696800 | 12.3 | 221 | 0.31 |
|
||||
| 1538548697100 | 12.1 | 220 | NULL |
|
||||
| 1538548697200 | NULL | NULL | 0.32 |
|
||||
| 1538548697700 | 11.8 | NULL | NULL |
|
||||
| 1538548697800 | NULL | 222 | 0.33 |
|
||||
|
||||
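As a usage sketch, the virtual subtable defined above can then be queried like any physical table; only the columns referenced in the query are merged from the source tables. The statement below assumes the `d1001_v` table and sample data from this section.

```sql
-- Read the merged current and voltage series of device d1001 through its virtual subtable
SELECT ts, current, voltage
FROM d1001_v;
```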
---
|
||||
|
||||
#### 2. Cross-Source Metric Comparative Analysis
|
||||
In this scenario, "cross-source" refers to data from **different data collection points**. Virtual tables align and merge semantically comparable measurements from multiple devices for comparative analysis.
|
||||
For example, compare current measurements across devices `d1001`, `d1002`, `d1003`, and `d1004`. The SQL statement to create the virtual table is as follows:
|
||||
|
||||
```sql
|
||||
CREATE VTABLE current_v (
|
||||
ts TIMESTAMP,
|
||||
d1001_current FLOAT FROM current_d1001.current,
|
||||
d1002_current FLOAT FROM current_d1002.current,
|
||||
d1003_current FLOAT FROM current_d1003.current,
|
||||
d1004_current FLOAT FROM current_d1004.current
|
||||
);
|
||||
```
|
||||
|
||||
Assume that the current data of devices d1001, d1002, d1003, and d1004 are as follows:
|
||||
<img src={origintable2} width="500" alt="data-model-origin-table-2" />
|
||||
|
||||
The virtual table `current_v` aligns current data by timestamp:
|
||||
|
||||
| Timestamp | d1001_current | d1002_current | d1003_current | d1004_current |
|
||||
|-------------------|---------------|---------------|---------------|---------------|
|
||||
| 1538548685000 | 10.3 | 11.7 | 11.2 | 12.4 |
|
||||
| 1538548695000 | 12.6 | 11.9 | 10.8 | 11.3 |
|
||||
| 1538548696800 | 12.3 | 12.4 | 12.3 | 10.1 |
|
||||
| 1538548697100 | 12.1 | NULL | 11.1 | NULL |
|
||||
| 1538548697200 | NULL | 12.2 | NULL | 11.7 |
|
||||
| 1538548697700 | 11.8 | 11.4 | NULL | NULL |
|
||||
| 1538548697800 | NULL | NULL | 12.1 | 12.6 |
|
||||
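As a usage sketch (assuming the `current_v` table and sample data above), the aligned view can be queried directly to compare devices side by side:

```sql
-- Compare the current readings of d1001 and d1003 on a shared timestamp axis
SELECT ts, d1001_current, d1003_current
FROM current_v;
```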
|
|
|
@ -0,0 +1,193 @@
|
|||
---
|
||||
title: Installation
|
||||
sidebar_label: Installation
|
||||
---
|
||||
|
||||
## Preparing Your Environment
|
||||
|
||||
To use the analytics capabilities offered by TDgpt, you deploy an AI node (anode) in your TDengine cluster. Anodes run on Linux and require Python 3.10 or later.
|
||||
|
||||
TDgpt is supported in TDengine 3.3.6 and later. You must upgrade your cluster to version 3.3.6 or later before deploying any anodes.
|
||||
|
||||
You can run the following commands to install Python 3.10 in Ubuntu.
|
||||
|
||||
### Install Python
|
||||
|
||||
```shell
|
||||
sudo apt-get install software-properties-common
|
||||
sudo add-apt-repository ppa:deadsnakes/ppa
|
||||
sudo apt update
|
||||
sudo apt install python3.10
|
||||
sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 2
|
||||
sudo update-alternatives --config python3
|
||||
sudo apt install python3.10-venv
|
||||
sudo apt install python3.10-dev
|
||||
```
|
||||
|
||||
### Install pip
|
||||
|
||||
```shell
|
||||
curl -sS https://bootstrap.pypa.io/get-pip.py | python3.10
|
||||
```
|
||||
|
||||
### Configure Environment Variables
|
||||
|
||||
Add `~/.local/bin` to the `PATH` environment variable in `~/.bashrc` or `~/.bash_profile`.
|
||||
```shell
|
||||
export PATH=$PATH:~/.local/bin
|
||||
```
|
||||
Once the Python environment has been installed, you can proceed to install TDgpt.
|
||||
|
||||
### Install TDgpt
|
||||
|
||||
Obtain the installation package `TDengine-anode-3.3.x.x-Linux-x64.tar.gz` and install it on your machine:
|
||||
|
||||
```bash
|
||||
tar -xzvf TDengine-anode-3.3.6.0-Linux-x64.tar.gz
|
||||
cd TDengine-anode-3.3.6.0
|
||||
sudo ./install.sh
|
||||
```
|
||||
|
||||
You can run the `rmtaosanode` command to uninstall TDgpt.
|
||||
To prevent TDgpt from affecting Python environments that may exist on your machine, anodes are installed in a virtual environment. When you install an anode, a virtual Python environment is deployed in the `/var/lib/taos/taosanode/venv/` directory. All libraries required by the anode are installed in this directory. Note that this virtual environment is not uninstalled automatically by the `rmtaosanode` command. If you are sure that you do not want to use TDgpt on a machine, you can remove the directory manually.
|
||||
|
||||
### Start the TDgpt Service
|
||||
|
||||
The `taosanoded` service is created when you install an anode. You can use systemd to manage this service:
|
||||
|
||||
```bash
|
||||
systemctl start taosanoded
|
||||
systemctl stop taosanoded
|
||||
systemctl status taosanoded
|
||||
```
|
||||
|
||||
## Directory and Configuration Information
|
||||
|
||||
The directory structure of an anode is described in the following table:
|
||||
|
||||
|Directory or File|Description|
|
||||
|---------------|------|
|
||||
|/usr/local/taos/taosanode/bin|Directory containing executable files|
|
||||
|/usr/local/taos/taosanode/resource|Directory containing resource files, linked to `/var/lib/taos/taosanode/resource/`|
|
||||
|/usr/local/taos/taosanode/lib|Directory containing libraries|
|
||||
|/usr/local/taos/taosanode/model|Directory containing models, linked to `/var/lib/taos/taosanode/model`|
|
||||
|/var/log/taos/taosanode/|Log directory|
|
||||
|/etc/taos/taosanode.ini|Configuration file|
|
||||
|
||||
### Configuration
|
||||
|
||||
The anode provides services through an uWSGI driver. The configuration for the anode and for uWSGI are both found in the `taosanode.ini` file, located by default in the `/etc/taos/` directory.
|
||||
|
||||
The configuration options are described as follows:
|
||||
|
||||
```ini
|
||||
[uwsgi]
|
||||
|
||||
# Anode RESTful service ip:port
|
||||
http = 127.0.0.1:6090
|
||||
|
||||
# base directory for Anode python files, do NOT modify this
|
||||
chdir = /usr/local/taos/taosanode/lib
|
||||
|
||||
# initialize Anode python file
|
||||
wsgi-file = /usr/local/taos/taosanode/lib/taos/app.py
|
||||
|
||||
# pid file
|
||||
pidfile = /usr/local/taos/taosanode/taosanode.pid
|
||||
|
||||
# conflicts with systemctl, so do NOT uncomment this
|
||||
# daemonize = /var/log/taos/taosanode/taosanode.log
|
||||
|
||||
# uWSGI log files
|
||||
logto = /var/log/taos/taosanode/taosanode.log
|
||||
|
||||
# uWSGI monitor port
|
||||
stats = 127.0.0.1:8387
|
||||
|
||||
# python virtual environment directory, used by Anode
|
||||
virtualenv = /usr/local/taos/taosanode/venv/
|
||||
|
||||
[taosanode]
|
||||
# default taosanode log file
|
||||
app-log = /var/log/taos/taosanode/taosanode.app.log
|
||||
|
||||
# model storage directory
|
||||
model-dir = /usr/local/taos/taosanode/model/
|
||||
|
||||
# default log level
|
||||
log-level = INFO
|
||||
|
||||
```
|
||||
|
||||
:::note
|
||||
Do not specify a value for the `daemonize` parameter. This parameter causes a conflict between uWSGI and systemctl. If you enable the `daemonize` parameter, your anode will fail to start.
|
||||
:::
|
||||
|
||||
The configuration file above includes only the basic configuration needed for an anode to provide services. For more information about configuring uWSGI, see the [official documentation](https://uwsgi-docs.readthedocs.io/en/latest/).
|
||||
|
||||
The main configuration options for an anode are described as follows:
|
||||
|
||||
- app-log: Specify the directory in which anode log files are stored.
|
||||
- model-dir: Specify the directory in which models are stored. Models are generated by algorithms based on existing datasets.
|
||||
- log-level: Specify the log level for anode logs.
|
||||
|
||||
## Managing Anodes
|
||||
|
||||
You manage anodes through the TDengine CLI. The following actions must be performed within the CLI on a client that is connected to your TDengine cluster.
|
||||
|
||||
### Create an Anode
|
||||
|
||||
```sql
|
||||
CREATE ANODE {node_url}
|
||||
```
|
||||
|
||||
The `node_url` parameter determines the IP address and port of the anode. This information will be registered to your TDengine cluster. Do not register a single anode to multiple TDengine clusters.
|
||||
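For example, registering the anode shown in the `SHOW ANODES` output below might look like the following sketch; the address is illustrative and the quoting of `node_url` is an assumption, so adjust both to your environment (6090 is the default RESTful port from the uWSGI configuration above).

```sql
-- Illustrative: register an anode whose RESTful service listens on 192.168.0.1:6090
CREATE ANODE "192.168.0.1:6090";
```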
|
||||
### View Anodes
|
||||
|
||||
You can run the following command to display the FQDN and status of the anodes in your cluster:
|
||||
|
||||
```sql
|
||||
SHOW ANODES;
|
||||
|
||||
taos> show anodes;
|
||||
id | url | status | create_time | update_time |
|
||||
==================================================================================================================
|
||||
1 | 192.168.0.1:6090 | ready | 2024-11-28 18:44:27.089 | 2024-11-28 18:44:27.089 |
|
||||
Query OK, 1 row(s) in set (0.037205s)
|
||||
|
||||
```
|
||||
|
||||
### View Advanced Analytics Services
|
||||
|
||||
```SQL
|
||||
SHOW ANODES FULL;
|
||||
|
||||
taos> show anodes full;
|
||||
id | type | algo |
|
||||
============================================================================
|
||||
1 | anomaly-detection | shesd |
|
||||
1 | anomaly-detection | iqr |
|
||||
1 | anomaly-detection | ksigma |
|
||||
1 | anomaly-detection | lof |
|
||||
1 | anomaly-detection | grubbs |
|
||||
1 | anomaly-detection | ad_encoder |
|
||||
1 | forecast | holtwinters |
|
||||
1 | forecast | arima |
|
||||
Query OK, 8 row(s) in set (0.008796s)
|
||||
|
||||
```
|
||||
|
||||
### Refresh the Algorithm Cache
|
||||
|
||||
```SQL
|
||||
UPDATE ANODE {anode_id}
|
||||
UPDATE ALL ANODES
|
||||
```
|
||||
|
||||
### Delete an Anode
|
||||
|
||||
```sql
|
||||
DROP ANODE {anode_id}
|
||||
```
|
||||
Deleting an anode only removes it from your TDengine cluster. To stop an anode, use systemctl on the machine where the anode is located. To remove an anode, run the `rmtaosanode` command on the machine where the anode is located.
|
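As a sketch tying these statements together, the commands below drop the anode with ID 1 (the ID shown in the earlier `SHOW ANODES` output) and, separately, refresh the algorithm cache of all registered anodes; both statements come directly from the sections above.

```sql
-- Remove anode 1 from the cluster metadata (stop or uninstall the anode itself on its own machine)
DROP ANODE 1;

-- Refresh the algorithm cache of every registered anode
UPDATE ALL ANODES;
```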
|
@ -0,0 +1,65 @@
|
|||
---
|
||||
title: Data Preprocessing
|
||||
sidebar_label: Data Preprocessing
|
||||
---
|
||||
|
||||
import Image from '@theme/IdealImage';
|
||||
import preprocFlow from '../../assets/tdgpt-02.png';
|
||||
import wnData from '../../assets/tdgpt-03.png'
|
||||
|
||||
## Analysis Workflow
|
||||
|
||||
Data must be preprocessed before it can be analyzed by TDgpt. This process is described in the following figure:
|
||||
|
||||
<figure>
|
||||
<Image img={preprocFlow} alt="Preprocessing workflow" />
|
||||
<figcaption>Preprocessing workflow</figcaption>
|
||||
</figure>
|
||||
|
||||
|
||||
TDgpt first performs a white noise data check on the dataset that you input. Data that passes this check and is intended for use in forecasting is then resampled and its timestamps are aligned. Note that resampling and alignment are not performed for datasets used in anomaly detection.
|
||||
|
||||
After the data has been preprocessed, forecasting or anomaly detection is performed. Preprocessing is not part of the business logic for forecasting and anomaly detection.
|
||||
|
||||
## White Noise Data Check
|
||||
|
||||
<figure>
|
||||
<Image img={wnData} alt="White noise data"/>
|
||||
<figcaption>White noise data</figcaption>
|
||||
</figure>
|
||||
|
||||
The white noise data check determines whether the input data consists of random numbers. The figure above shows an example of a regular distribution of random numbers. Random numbers cannot be analyzed meaningfully, and this data is rejected by the system. The white noise data check is performed using the classic Ljung-Box test. The test is performed over an entire time series. If you are certain that your data is not random, you can specify the `wncheck=0` parameter to force TDgpt to skip this check.
|
||||
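For instance, a forecast call that skips the check might look like the following sketch; it reuses the `foo` table and `i32` column used in the forecasting examples elsewhere in these docs.

```sql
-- Force TDgpt to skip the Ljung-Box white noise check for this forecast call
SELECT _frowts, FORECAST(i32, "algo=holtwinters,wncheck=0") FROM foo;
```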
|
||||
TDgpt does not provide white noise checking as an independent feature. It is performed only as part of data preprocessing.
|
||||
|
||||
## Resampling and Timestamp Alignment
|
||||
|
||||
Time-series data must be preprocessed before forecasting can be performed. Preprocessing is intended to resolve the following two issues:
|
||||
|
||||
The timestamps of real time-series datasets are not aligned. It is impossible to guarantee that devices generating data or network gateways create timestamps at strict intervals. For this reason, it cannot be guaranteed that the timestamps of time-series data are in strict alignment with the sampling rate of the data. For example, a time series sampled at 1 Hz may have the following timestamps:
|
||||
|
||||
```text
|
||||
['20:12:21.143', '20:12:22.187', '20:12:23.032', '20:12:24.384', '20:12:25.033']
|
||||
```
|
||||
|
||||
The data returned by the forecasting algorithm is strictly aligned by timestamp. For example, the next two data points in the set must be `['20:12:26.000', '20:12:27.000']`. For this reason, data such as the preceding set must be aligned as follows:
|
||||
|
||||
```
|
||||
['20:12:21.000', '20:12:22.000', '20:12:23.000', '20:12:24.000', '20:12:25.000']
|
||||
```
|
||||
|
||||
The sampling rate input by the user can exceed the output rate of the results. For example, the following data was sampled at 5 second intervals, but the user could request forecasting in 10 second intervals:
|
||||
|
||||
```
|
||||
['20:12:20.000', '20:12:25.000', '20:12:30.000', '20:12:35.000', '20:12:40.000']
|
||||
```
|
||||
|
||||
The data is then resampled to 10 second intervals as follows:
|
||||
|
||||
```
|
||||
['20:12:20.000', '20:12:30.000', '20:12:40.000']
|
||||
```
|
||||
|
||||
This resampled data is then input into the forecasting algorithm. In this case, the data points `['20:12:25.000', '20:12:35.000']` are discarded.
|
||||
|
||||
It is important to note that TDgpt does not fill in missing data during preprocessing. If you input the dataset `['20:12:10.113', '20:12:21.393', '20:12:29.143', '20:12:51.330']` and specify an interval of 10 seconds, the aligned dataset will be `['20:12:10.000', '20:12:20.000', '20:12:30.000', '20:12:50.000']`. This will cause the forecasting algorithm to return an error.
|
|
@ -0,0 +1,63 @@
|
|||
---
|
||||
title: ARIMA
|
||||
sidebar_label: ARIMA
|
||||
---
|
||||
|
||||
This document describes how to generate autoregressive integrated moving average (ARIMA) models.
|
||||
|
||||
## Description
|
||||
|
||||
The ARIMA(*p*, *d*, *q*) model is one of the most common in time-series forecasting. It is an autoregressive model that can predict future data from an independent variable. ARIMA requires that time-series data be stationary. Accurate results cannot be obtained from non-stationary data.
|
||||
|
||||
A stationary time series is one whose characteristics do not change based on the time at which it is observed. Time series that experience trends or seasonality are not stationary because they exhibit different characteristics at different times.
|
||||
|
||||
The following variables can be dynamically input to generate appropriate ARIMA models:
|
||||
|
||||
- *p* is the order of the autoregressive model
|
||||
- *d* is the order of differencing
|
||||
- *q* is the order of the moving-average model
|
||||
|
||||
## Parameters
|
||||
|
||||
Automated ARIMA modeling is performed in TDgpt. For this reason, the results for each input are automatically fitted to the most appropriate model. Forecasting is then performed based on the specified model.
|
||||
|
||||
|Parameter|Description|Required?|
|
||||
|---|---|-----|
|
||||
|period|The number of data points included in each period. If not specified or set to 0, non-seasonal ARIMA models are used.|No|
|
||||
|start_p|The starting order of the autoregressive model. Enter an integer greater than or equal to 0. Values greater than 10 are not recommended.|No|
|
||||
|max_p|The ending order of the autoregressive model. Enter an integer greater than or equal to 0. Values greater than 10 are not recommended.|No|
|
||||
|start_q|The starting order of the moving-average model. Enter an integer greater than or equal to 0. Values greater than 10 are not recommended.|No|
|
||||
|max_q|The ending order of the moving-average model. Enter an integer greater than or equal to 0. Values greater than 10 are not recommended.|No|
|
||||
|d|The order of differencing.|No|
|
||||
|
||||
The `start_p`, `max_p`, `start_q`, and `max_q` parameters cause the model to find the optimal solution within the specified restrictions. Given the same input data, a larger range will result in higher resource consumption and slower response time.
|
||||
|
||||
## Example
|
||||
|
||||
In this example, forecasting is performed on the `i32` column. Each 10 data points in the column form a period. The values of `start_p` and `start_q` are both 1, and the corresponding ending values are both 5. The forecasting results are within a 95% confidence interval.
|
||||
|
||||
```
|
||||
FORECAST(i32, "algo=arima,alpha=95,period=10,start_p=1,max_p=5,start_q=1,max_q=5")
|
||||
```
|
||||
|
||||
The complete SQL statement is shown as follows:
|
||||
|
||||
```SQL
|
||||
SELECT _frowts, FORECAST(i32, "algo=arima,alpha=95,period=10,start_p=1,max_p=5,start_q=1,max_q=5") from foo
|
||||
```
|
||||
|
||||
```json5
|
||||
{
|
||||
"rows": fc_rows, // Rows returned
|
||||
"period": period, // Period of results (equivalent to input period)
|
||||
"alpha": alpha, // Confidence interval of results (equivalent to input confidence interval)
|
||||
"algo": "arima", // Algorithm
|
||||
"mse": mse, // Mean square error (MSE) of model generated for input time series
|
||||
"res": res // Results in column format
|
||||
}
|
||||
```
|
||||
|
||||
## References
|
||||
|
||||
- https://en.wikipedia.org/wiki/Autoregressive_moving-average_model
|
||||
- [https://baike.baidu.com/item/自回归滑动平均模型/5023931](https://baike.baidu.com/item/%E8%87%AA%E5%9B%9E%E5%BD%92%E6%BB%91%E5%8A%A8%E5%B9%B3%E5%9D%87%E6%A8%A1%E5%9E%8B/5023931)
|
|
@ -0,0 +1,53 @@
|
|||
---
|
||||
title: Holt-Winters
|
||||
sidebar_label: Holt-Winters
|
||||
---
|
||||
|
||||
This document describes the usage of the Holt-Winters method for forecasting.
|
||||
|
||||
## Description
|
||||
|
||||
Holt-Winters, or exponential moving average (EMA), is used to forecast non-stationary time series that have linear trends or periodic fluctuations. This method uses exponential smoothing to constantly adapt the model parameters to the changes in the time series and perform short-term forecasting.
|
||||
|
||||
If seasonal variation remains mostly consistent within a time series, the additive Holt-Winters model is used, whereas if seasonal variation is proportional to the level of the time series, the multiplicative Holt-Winters model is used.
|
||||
|
||||
Holt-Winters does not provide results within a confidence interval. The forecast results are the same as those on the upper and lower thresholds of the confidence interval.
|
||||
|
||||
## Parameters
|
||||
|
||||
Automated Holt-Winters modeling is performed in TDgpt. For this reason, the results for each input are automatically fitted to the most appropriate model. Forecasting is then performed based on the specified model.
|
||||
|
||||
|Parameter|Description|Required?|
|
||||
|---|---|---|
|
||||
|period|The number of data points included in each period. If not specified or set to 0, exponential smoothing is applied for data fitting, and then future data is forecast.|No|
|
||||
|trend|Use additive (`add`) or multiplicative (`mul`) Holt-Winters for the trend model.|No|
|
||||
|seasonal|Use additive (`add`) or multiplicative (`mul`) Holt-Winters for seasonality.|No|
|
||||
|
||||
## Example
|
||||
|
||||
In this example, forecasting is performed on the `i32` column. Each 10 data points in the column form a period. Multiplicative Holt-Winters is used for trends and for seasonality.
|
||||
|
||||
```
|
||||
FORECAST(i32, "algo=holtwinters,period=10,trend=mul,seasonal=mul")
|
||||
```
|
||||
|
||||
The complete SQL statement is shown as follows:
|
||||
|
||||
```SQL
|
||||
SELECT _frowts, FORECAST(i32, "algo=holtwinters,period=10,trend=mul,seasonal=mul") from foo
|
||||
```
|
||||
|
||||
```json5
|
||||
{
|
||||
"rows": fc_rows, // Rows returned
|
||||
"period": period, // Period of results (equivalent to input period; set to 0 if no periodicity)
|
||||
"algo": 'holtwinters' // Algorithm
|
||||
"mse": mse, // Mean square error (MSE)
|
||||
"res": res // Results in column format (typically returned as two columns, `timestamp` and `fc_results`.)
|
||||
}
|
||||
```
|
||||
|
||||
## References
|
||||
|
||||
- https://en.wikipedia.org/wiki/Exponential_smoothing
|
||||
- https://orangematter.solarwinds.com/2019/12/15/holt-winters-forecasting-simplified/
|
|
@ -0,0 +1,31 @@
|
|||
---
|
||||
title: LSTM
|
||||
sidebar_label: LSTM
|
||||
---
|
||||
|
||||
This document describes how to use LSTM in TDgpt.
|
||||
|
||||
## Description
|
||||
|
||||
Long short-term memory (LSTM) is a special type of recurrent neural network (RNN) well-suited for tasks such as time-series data processing and natural language processing. Its unique gating mechanism allows it to effectively capture long-term dependencies and address the vanishing gradient problem found in traditional RNNs, enabling more accurate predictions on sequential data. However, it does not directly provide confidence interval results for its computations.
|
||||
|
||||
The complete SQL statement is shown as follows:
|
||||
|
||||
```SQL
|
||||
SELECT _frowts, FORECAST(i32, "algo=lstm,alpha=95,period=10,start_p=1,max_p=5,start_q=1,max_q=5") from foo
|
||||
```
|
||||
|
||||
```json5
|
||||
{
|
||||
"rows": fc_rows, // Rows returned
|
||||
"period": period, // Period of results (equivalent to input period)
|
||||
"alpha": alpha, // Confidence interval of results (equivalent to input confidence interval)
|
||||
"algo": "lstm", // Algorithm
|
||||
"mse": mse, // Mean square error (MSE) of model generated for input time series
|
||||
"res": res // Results in column format
|
||||
}
|
||||
```
|
||||
|
||||
## References
|
||||
|
||||
- [1] Hochreiter S. Long Short-term Memory[J]. Neural Computation MIT-Press, 1997.
|
|
@ -0,0 +1,33 @@
|
|||
---
|
||||
title: MLP
|
||||
sidebar_label: MLP
|
||||
---
|
||||
|
||||
This document describes how to use MLP in TDgpt.
|
||||
|
||||
## Description
|
||||
|
||||
MLP (Multilayer Perceptron) is a classic neural network model that can learn nonlinear relationships from historical data, capture patterns in time-series data, and make future value predictions. It performs feature extraction and mapping through multiple fully connected layers, generating prediction results based on the input historical data. Since it does not directly account for trends or seasonal variations, it typically requires data preprocessing to improve performance. It is well-suited for handling nonlinear and complex time-series problems.
|
||||
|
||||
The complete SQL statement is shown as follows:
|
||||
|
||||
```SQL
|
||||
SELECT _frowts, FORECAST(i32, "algo=mlp") from foo
|
||||
```
|
||||
|
||||
```json5
|
||||
{
|
||||
"rows": fc_rows, // Rows returned
|
||||
"period": period, // Period of results (equivalent to input period)
|
||||
"alpha": alpha, // Confidence interval of results (equivalent to input confidence interval)
|
||||
"algo": "mlp", // Algorithm
|
||||
"mse": mse, // Mean square error (MSE) of model generated for input time series
|
||||
"res": res // Results in column format
|
||||
}
|
||||
```
|
||||
|
||||
## References
|
||||
|
||||
- [1]Rumelhart D E, Hinton G E, Williams R J. Learning representations by back-propagating errors[J]. nature, 1986, 323(6088): 533-536.
|
||||
- [2]Rosenblatt F. The perceptron: a probabilistic model for information storage and organization in the brain[J]. Psychological review, 1958, 65(6): 386.
|
||||
- [3]LeCun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
|
|
@ -0,0 +1,198 @@
|
|||
---
|
||||
title: Forecasting Algorithms
|
||||
description: Forecasting Algorithms
|
||||
---
|
||||
|
||||
import Image from '@theme/IdealImage';
|
||||
import fcResult from '../../../assets/tdgpt-04.png';
|
||||
|
||||
Time-series forecasting takes a continuous period of time-series data as its input and forecasts how the data will trend in the next continuous period. The number of data points in the forecast results is not fixed, but can be specified by the user. TDgpt uses the `FORECAST` function to provide forecasting. The input for this function is the historical time-series data used as a basis for forecasting, and the output is forecast data. You can use the `FORECAST` function to invoke a forecasting algorithm on an anode to provide service. Forecasting is typically performed on a subtable or on the same time series across tables.
|
||||
|
||||
In this section, the table `foo` is used as an example to describe how to perform forecasting and anomaly detection in TDgpt. This table is described as follows:
|
||||
|
||||
| Column | Type | Description |
|
||||
| ------ | --------- | ---------------------------- |
|
||||
|ts|timestamp|Primary timestamp|
|
||||
|i32|int32|Metric generated by a device as a 4-byte integer|
|
||||
|
||||
```sql
|
||||
taos> select * from foo;
|
||||
ts | i32 |
|
||||
========================================
|
||||
2020-01-01 00:00:12.681 | 13 |
|
||||
2020-01-01 00:00:13.727 | 14 |
|
||||
2020-01-01 00:00:14.378 | 8 |
|
||||
2020-01-01 00:00:15.774 | 10 |
|
||||
2020-01-01 00:00:16.170 | 16 |
|
||||
2020-01-01 00:00:17.558 | 26 |
|
||||
2020-01-01 00:00:18.938 | 32 |
|
||||
2020-01-01 00:00:19.308 | 27 |
|
||||
```
|
||||
|
||||
## Syntax
|
||||
|
||||
```SQL
|
||||
FORECAST(column_expr, option_expr)
|
||||
|
||||
option_expr: {"
|
||||
algo=expr1
|
||||
[,wncheck=1|0]
|
||||
[,conf=conf_val]
|
||||
[,every=every_val]
|
||||
[,rows=rows_val]
|
||||
[,start=start_ts_val]
|
||||
[,expr2]
|
||||
"}
|
||||
```
|
||||
|
||||
1. `column_expr`: The time-series data column to forecast. Enter a column whose data type is numerical.
|
||||
2. `options`: The parameters for forecasting. Enter parameters in key=value format, separating multiple parameters with a comma (,). It is not necessary to use quotation marks or escape characters. Only ASCII characters are supported. The supported parameters are described as follows:
|
||||
|
||||
## Parameter Description
|
||||
|
||||
|Parameter|Definition|Default|
|
||||
| ------- | ------------------------------------------ | ---------------------------------------------- |
|
||||
|algo|Forecasting algorithm.|holtwinters|
|
||||
|wncheck|White noise data check. Enter 1 to enable or 0 to disable.|1|
|
||||
|conf|Confidence interval for forecast data. Enter an integer between 0 and 100, inclusive.|95|
|
||||
|every|Sampling period.|The sampling period of the input data|
|
||||
|start|Starting timestamp for forecast data.|One sampling period after the final timestamp in the input data|
|
||||
|rows|Number of forecast rows to return.|10|
|
||||
|
||||
1. Three pseudocolumns are used in forecasting: `_FROWTS`, the timestamp of the forecast data; `_FLOW`, the lower threshold of the confidence interval; and `_FHIGH`, the upper threshold of the confidence interval. For algorithms that do not include a confidence interval, the `_FLOW` and `_FHIGH` pseudocolumns contain the forecast results.
|
||||
2. You can specify the `START` parameter to modify the starting time of forecast results. This does not affect the forecast values, only the time range.
|
||||
3. The `EVERY` parameter must be less than or equal to the sampling period of the input data; it cannot be greater than the sampling period.
|
||||
4. If you specify a confidence interval for an algorithm that does not use it, the upper and lower thresholds of the confidence interval regress to a single point.
|
||||
5. The maximum value of rows is 1024. If you specify a higher value, only 1024 rows are returned.
|
||||
6. The maximum size of the input historical data is 40,000 rows. Note that some models may have stricter limitations.
|
||||
|
||||
## Example
|
||||
|
||||
```SQL
|
||||
--- ARIMA forecast, return 10 rows of results (default), perform white noise data check, with 95% confidence interval
|
||||
SELECT _flow, _fhigh, _frowts, FORECAST(i32, "algo=arima")
|
||||
FROM foo;
|
||||
|
||||
--- ARIMA forecast, periodic input data, 10 samples per period, disable white noise data check, with 95% confidence interval
|
||||
SELECT _flow, _fhigh, _frowts, FORECAST(i32, "algo=arima,alpha=95,period=10,wncheck=0")
|
||||
FROM foo;
|
||||
```
|
||||
|
||||
```sql
|
||||
taos> select _flow, _fhigh, _frowts, forecast(i32) from foo;
|
||||
_flow | _fhigh | _frowts | forecast(i32) |
|
||||
========================================================================================
|
||||
10.5286684 | 41.8038254 | 2020-01-01 00:01:35.000 | 26 |
|
||||
-21.9861946 | 83.3938904 | 2020-01-01 00:01:36.000 | 30 |
|
||||
-78.5686035 | 144.6729126 | 2020-01-01 00:01:37.000 | 33 |
|
||||
-154.9797363 | 230.3057709 | 2020-01-01 00:01:38.000 | 37 |
|
||||
-253.9852905 | 337.6083984 | 2020-01-01 00:01:39.000 | 41 |
|
||||
-375.7857971 | 466.4594727 | 2020-01-01 00:01:40.000 | 45 |
|
||||
-514.8043823 | 622.4426270 | 2020-01-01 00:01:41.000 | 53 |
|
||||
-680.6343994 | 796.2861328 | 2020-01-01 00:01:42.000 | 57 |
|
||||
-868.4956665 | 992.8603516 | 2020-01-01 00:01:43.000 | 62 |
|
||||
-1076.1566162 | 1214.4498291 | 2020-01-01 00:01:44.000 | 69 |
|
||||
```
|
||||
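The `rows`, `every`, and `start` options described above can be combined in the same way. The sketch below requests 20 forecast rows beginning at an explicitly chosen start timestamp; the timestamp value and its epoch-millisecond format are illustrative assumptions.

```SQL
--- Holt-Winters forecast, return 20 rows starting from a caller-specified timestamp (value is illustrative)
SELECT _flow, _fhigh, _frowts, FORECAST(i32, "algo=holtwinters,rows=20,start=1700000000000")
FROM foo;
```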
|
||||
## Built-In Forecasting Algorithms
|
||||
|
||||
- [ARIMA](./arima/)
|
||||
- [Holt-Winters](./holtwinters/)
|
||||
- Complex exponential smoothing (CES)
|
||||
- Theta
|
||||
- Prophet
|
||||
- XGBoost
|
||||
- LightGBM
|
||||
- Multiple Seasonal-Trend decomposition using LOESS (MSTL)
|
||||
- ETS (Error, Trend, Seasonal)
|
||||
- Long Short-Term Memory (LSTM)
|
||||
- Multilayer Perceptron (MLP)
|
||||
- DeepAR
|
||||
- N-BEATS
|
||||
- N-HiTS
|
||||
- Patch Time Series Transformer (PatchTST)
|
||||
- Temporal Fusion Transformer
|
||||
- TimesNet
|
||||
|
||||
## Evaluating Algorithm Effectiveness
|
||||
|
||||
TDengine Enterprise includes `analytics_compare`, a tool that evaluates the effectiveness of time-series forecasting algorithms in TDgpt. You can configure this tool to perform backtesting on data stored in TDengine and determine which algorithms and models are most effective for your data. The evaluation is based on mean squared error (MSE). MAE and MAPE are in development.
|
||||
|
||||
The configuration of the evaluation tool is described as follows:
|
||||
|
||||
```ini
|
||||
[forecast]
|
||||
# number of data points per training period
|
||||
period = 10
|
||||
|
||||
# consider final 10 rows of in-scope data as forecasting results
|
||||
rows = 10
|
||||
|
||||
# start time of training data
|
||||
start_time = 1949-01-01T00:00:00
|
||||
|
||||
# end time of training data
|
||||
end_time = 1960-12-01T00:00:00
|
||||
|
||||
# start time of results
|
||||
res_start_time = 1730000000000
|
||||
|
||||
# specify whether to create a graphical chart
|
||||
gen_figure = true
|
||||
```
|
||||
|
||||
To use the tool, run `analytics_compare` in TDgpt's `misc` directory. Ensure that you run the tool on a machine with a Python environment installed. You can test the tool as follows:
|
||||
|
||||
1. Configure your TDengine cluster information in the `analytics.ini` file:
|
||||
|
||||
```ini
|
||||
[taosd]
|
||||
# taosd hostname
|
||||
host = 127.0.0.1
|
||||
|
||||
# username
|
||||
user = root
|
||||
|
||||
# password
|
||||
password = taosdata
|
||||
|
||||
# tdengine configuration file
|
||||
conf = /etc/taos/taos.cfg
|
||||
|
||||
[input_data]
|
||||
# database for testing forecasting algorithms
|
||||
db_name = test
|
||||
|
||||
# table with test data
|
||||
table_name = passengers
|
||||
|
||||
# columns with test data
|
||||
column_name = val, _c0
|
||||
```
|
||||
|
||||
2. Prepare your data. A sample data file `sample-fc.sql` is included in the `resource` directory. Run the following command to ingest the sample data into TDengine:
|
||||
|
||||
```shell
|
||||
taos -f sample-fc.sql
|
||||
```
|
||||
|
||||
You can now begin the evaluation.
|
||||
|
||||
3. Ensure that the Python environment on the local machine is operational. Then run the following command:
|
||||
|
||||
```shell
|
||||
python3.10 ./analytics_compare.py forecast
|
||||
```
|
||||
|
||||
4. The evaluation results are written to `fc_result.xlsx`. The first sheet shows the results, as follows, including the algorithm name, parameters, mean square error, and elapsed time.
|
||||
|
||||
| algorithm | params | MSE | elapsed_time(ms.) |
|
||||
| ----------- | ------------------------------------------------------------------------- | ------- | ----------------- |
|
||||
| holtwinters | `{"trend":"add", "seasonal":"add"}` | 351.622 | 125.1721 |
|
||||
| arima | `{"time_step":3600000, "start_p":0, "max_p":10, "start_q":0, "max_q":10}` | 433.709 | 45577.9187 |
|
||||
|
||||
If you set `gen_figure` to `true`, a chart is also generated, as displayed in the following figure.
|
||||
|
||||
<figure>
|
||||
<Image img={fcResult} alt="Forecasting comparison"/>
|
||||
</figure>
|
|
@ -0,0 +1,67 @@
|
|||
---
|
||||
title: Statistical Algorithms
|
||||
sidebar_label: Statistical Algorithms
|
||||
---
|
||||
|
||||
- k-sigma<sup>[1]</sup>, or ***68–95–99.7 rule***: The *k* value defines how many standard deviations indicate an anomaly. The default value is 3. The k-sigma algorithm requires the data to follow a normal distribution. Data points that lie more than *k* standard deviations from the mean are considered anomalous.
|
||||
|
||||
|Parameter|Description|Required?|Default|
|
||||
|---|---|---|---|
|
||||
|k|Number of standard deviations|No|3|
|
||||
|
||||
```SQL
|
||||
--- Use the k-sigma algorithm with a k value of 2
|
||||
SELECT _WSTART, COUNT(*)
|
||||
FROM foo
|
||||
ANOMALY_WINDOW(foo.i32, "algo=ksigma,k=2")
|
||||
```
|
||||
|
||||
- Interquartile range (IQR)<sup>[2]</sup>: IQR divides a rank-ordered dataset into four even parts whose boundaries are the quartiles Q1, Q2, and Q3, with IQR = Q3 - Q1. A value *v* is considered normal if Q1 - (1.5 x IQR) \<= v \<= Q3 + (1.5 x IQR); data points outside this range are considered anomalous. This algorithm does not take any parameters.
|
||||
|
||||
```SQL
|
||||
--- Use the IQR algorithm.
|
||||
SELECT _WSTART, COUNT(*)
|
||||
FROM foo
|
||||
ANOMALY_WINDOW(foo.i32, "algo=iqr")
|
||||
```
|
||||
|
||||
- Grubbs's test<sup>[3]</sup>, or maximum normalized residual test: Grubbs's test checks whether the maximum and minimum values deviate anomalously from the mean. It requires a univariate dataset that follows an approximately normal distribution; Grubbs's test cannot be used for datasets that are not normally distributed. This algorithm does not take any parameters.
|
||||
|
||||
```SQL
|
||||
--- Use Grubbs's test.
|
||||
SELECT _WSTART, COUNT(*)
|
||||
FROM foo
|
||||
ANOMALY_WINDOW(foo.i32, "algo=grubbs")
|
||||
```
|
||||
|
||||
- Seasonal Hybrid ESD (S-H-ESD)<sup>[4]</sup>: Extreme Studentized Deviate (ESD) can identify multiple anomalies in time-series data. You define whether to detect positive anomalies (`pos`), negative anomalies (`neg`), or both (`both`). The maximum proportion of data that can be anomalous (`max_anoms`) is at most 49.9%. Typically, the proportion of anomalies in a dataset does not exceed 5%.
|
||||
|
||||
|Parameter|Description|Required?|Default|
|
||||
|---|---|---|---|
|
||||
|direction|Specify the direction of anomalies ('pos', 'neg', or 'both').|No|"both"|
|
||||
|max_anoms|Specify maximum proportion of data that can be anomalous *k*, where 0 \< *k* \<= 49.9|No|0.05|
|
||||
|period|The number of data points included in each period|No|0|
|
||||
|
||||
|
||||
```SQL
|
||||
--- Use the SHESD algorithm in both directions with a maximum of 5% of the data being anomalous
|
||||
SELECT _WSTART, COUNT(*)
|
||||
FROM foo
|
||||
ANOMALY_WINDOW(foo.i32, "algo=shesd,direction=both,anoms=0.05")
|
||||
```
|
||||
|
||||
The following algorithms are in development:
|
||||
|
||||
- Gaussian Process Regression
|
||||
|
||||
Change point detection--based algorithms:
|
||||
|
||||
- CUSUM (Cumulative Sum Control Chart)
|
||||
- PELT (Pruned Exact Linear Time)
|
||||
|
||||
## References
|
||||
|
||||
1. [https://en.wikipedia.org/wiki/68–95–99.7 rule](https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule)
|
||||
2. https://en.wikipedia.org/wiki/Interquartile_range
|
||||
3. Adikaram, K. K. L. B.; Hussein, M. A.; Effenberger, M.; Becker, T. (2015-01-14). "Data Transformation Technique to Improve the Outlier Detection Power of Grubbs's Test for Data Expected to Follow Linear Relation". Journal of Applied Mathematics. 2015: 1–9. doi:10.1155/2015/708948.
|
||||
4. Hochenbaum, O. S. Vallis, and A. Kejariwal. 2017. Automatic Anomaly Detection in the Cloud Via Statistical Learning. arXiv preprint arXiv:1704.07706 (2017).
|
|
@ -0,0 +1,33 @@
|
|||
---
|
||||
title: Data Density Algorithms
|
||||
sidebar_label: Data Density Algorithms
|
||||
---
|
||||
|
||||
## Data Density/Mining Algorithms
|
||||
|
||||
Local outlier factor (LOF)<sup>[1]</sup>:
|
||||
|
||||
LOF is a density-based algorithm for determining local outliers proposed by Breunig et al. in 2000. It is suitable for data with varying cluster densities and diverse dispersion. First, the local reachability density of each data point is calculated based on the density of its neighborhood. The local reachability density is then used to assign an outlier factor to each data point.
|
||||
|
||||
This outlier factor indicates how anomalous a data point is. A higher factor indicates more anomalous data. Finally, the top *k* outliers are output.
|
||||
|
||||
```SQL
|
||||
--- Use LOF.
|
||||
SELECT count(*)
|
||||
FROM foo
|
||||
ANOMALY_WINDOW(foo.i32, "algo=lof")
|
||||
```
|
||||
|
||||
The following algorithms are in development:
|
||||
|
||||
- DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
|
||||
- K-Nearest Neighbors (KNN)
|
||||
- Principal Component Analysis (PCA)
|
||||
|
||||
Third-party anomaly detection algorithms:
|
||||
|
||||
- PyOD
|
||||
|
||||
## References
|
||||
|
||||
1. Breunig, M. M.; Kriegel, H.-P.; Ng, R. T.; Sander, J. (2000). LOF: Identifying Density-based Local Outliers (PDF). Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data. SIGMOD. pp. 93–104. doi:10.1145/335191.335388. ISBN 1-58113-217-4.
|
|
@ -0,0 +1,27 @@
|
|||
---
|
||||
title: Machine Learning Algorithms
|
||||
sidebar_label: Machine Learning Algorithms
|
||||
---
|
||||
|
||||
TDgpt includes a built-in autoencoder for anomaly detection.
|
||||
|
||||
This algorithm is suitable for detecting anomalies in periodic time-series data. It must be pre-trained on your time-series data.
|
||||
|
||||
The trained model is saved to the `ad_autoencoder` directory. You then specify the model in your SQL statement.
|
||||
|
||||
```SQL
|
||||
--- Add the name of the model `ad_autoencoder_foo` in the options of the anomaly window and detect anomalies in the dataset `foo` using the autoencoder algorithm.
|
||||
SELECT COUNT(*), _WSTART
|
||||
FROM foo
|
||||
ANOMALY_WINDOW(col1, 'algo=encoder, model=ad_autoencoder_foo');
|
||||
```
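
Conceptually, a trained autoencoder flags the segments of data that it cannot reconstruct well. The toy sketch below illustrates reconstruction-error thresholding with Keras on a made-up periodic series; it is not TDgpt's built-in model or its training pipeline.

```python
import numpy as np
from tensorflow import keras

# Toy periodic series with one injected spike, split into windows of 10 points.
series = np.sin(np.linspace(0, 20 * np.pi, 1000)).astype("float32")
series[500] += 5.0
windows = series.reshape(-1, 10)

# A tiny dense autoencoder: compress each window to 3 values and reconstruct it.
model = keras.Sequential([
    keras.Input(shape=(10,)),
    keras.layers.Dense(3, activation="relu"),
    keras.layers.Dense(10),
])
model.compile(optimizer="adam", loss="mse")
model.fit(windows, windows, epochs=20, verbose=0)

# Windows whose reconstruction error is far above the rest are anomaly candidates.
errors = np.mean((model.predict(windows, verbose=0) - windows) ** 2, axis=1)
threshold = errors.mean() + 3 * errors.std()
print(np.where(errors > threshold)[0])  # the window containing the spike should stand out
```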
|
||||
|
||||
The following algorithms are in development:
|
||||
|
||||
- Isolation Forest
|
||||
- One-Class Support Vector Machines (SVM)
|
||||
- Prophet
|
||||
|
||||
## References
|
||||
|
||||
1. https://en.wikipedia.org/wiki/Autoencoder
|
|
@ -0,0 +1,119 @@
|
|||
---
|
||||
title: Anomaly Detection Algorithms
|
||||
description: Anomaly Detection Algorithms
|
||||
---
|
||||
|
||||
import Image from '@theme/IdealImage';
|
||||
import anomDetect from '../../../assets/tdgpt-05.png';
|
||||
import adResult from '../../../assets/tdgpt-06.png';
|
||||
|
||||
Anomaly detection is provided via an anomaly window that has been introduced into TDengine. An anomaly window is a special type of event window, defined by the anomaly detection algorithm as a time window during which an anomaly is occurring. It differs from an event window in that the algorithm, rather than expressions input by the user, determines when the window opens and closes. You can include the `ANOMALY_WINDOW` clause in your queries to invoke the anomaly detection service. The window pseudocolumns `_WSTART`, `_WEND`, and `_WDURATION` record the start, end, and duration of the window. For example:
|
||||
|
||||
```SQL
|
||||
--- Use the IQR algorithm to detect anomalies in the `col_val` column. Also return the start and end time of the anomaly window as well as the sum of the `col` column within the window.
|
||||
SELECT _wstart, _wend, SUM(col)
|
||||
FROM foo
|
||||
ANOMALY_WINDOW(col_val, "algo=iqr");
|
||||
```
|
||||
|
||||
As shown in the following figure, the anode returns the anomaly window [10:51:30, 10:53:40].
|
||||
|
||||
<figure>
|
||||
<Image img={anomDetect} alt="Anomaly detection" />
|
||||
</figure>
|
||||
|
||||
|
||||
You can then query, aggregate, or perform other operations on the data in the window.
|
||||
|
||||
## Syntax
|
||||
|
||||
```SQL
|
||||
ANOMALY_WINDOW(column_name, option_expr)
|
||||
|
||||
option_expr: {"
|
||||
algo=expr1
|
||||
[,wncheck=1|0]
|
||||
[,expr2]
|
||||
"}
|
||||
```
|
||||
|
||||
1. `column_name`: The data column in which to detect anomalies. Specify only one column per query. The data type of the column must be numerical; string types such as NCHAR are not supported. Functions are not supported.
|
||||
2. `options`: The parameters for anomaly detection. Enter parameters in key=value format, separating multiple parameters with a comma (,). It is not necessary to use quotation marks or escape characters. Only ASCII characters are supported. For example: `algo=ksigma,k=2` indicates that the anomaly detection algorithm is k-sigma and the k value is 2.
|
||||
3. You can use the results of anomaly detection as the inner part of a nested query. The same functions are supported as in other windowed queries.
|
||||
4. White noise checking is performed on the input data by default. If the input data is white noise, no results are returned. A sketch of this kind of check is shown below.
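
The white noise check can be illustrated with the Ljung-Box test from statsmodels; this sketch shows the concept rather than the exact test that TDgpt applies.

```python
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(1)
noise = rng.normal(size=200)                     # pure white noise
signal = np.sin(np.linspace(0, 8 * np.pi, 200))  # clearly autocorrelated data

for name, x in (("noise", noise), ("signal", signal)):
    p = acorr_ljungbox(x, lags=[10])["lb_pvalue"].iloc[0]
    print(name, "looks like white noise" if p > 0.05 else "has structure", f"(p={p:.3f})")
```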
|
||||
|
||||
## Parameter Description
|
||||
|
||||
|Parameter|Definition|Default|
|
||||
| ------- | ------------------------------------------ | ------ |
|
||||
|algo|Specify the anomaly detection algorithm.|iqr|
|
||||
|wncheck|Enter 1 to perform the white noise data check or 0 to disable the white noise data check.|1|
|
||||
|
||||
## Example
|
||||
|
||||
```SQL
|
||||
--- Use the IQR algorithm to detect anomalies in the `i32` column.
|
||||
SELECT _wstart, _wend, SUM(i32)
|
||||
FROM foo
|
||||
ANOMALY_WINDOW(i32, "algo=iqr");
|
||||
|
||||
--- Use the k-sigma algorithm with a k value of 2 to detect anomalies in the `i32` column
|
||||
SELECT _wstart, _wend, SUM(i32)
|
||||
FROM foo
|
||||
ANOMALY_WINDOW(i32, "algo=ksigma,k=2");
|
||||
|
||||
taos> SELECT _wstart, _wend, count(*) FROM foo ANOMALY_WINDOW(i32);
|
||||
_wstart | _wend | count(*) |
|
||||
====================================================================
|
||||
2020-01-01 00:00:16.000 | 2020-01-01 00:00:17.000 | 2 |
|
||||
Query OK, 1 row(s) in set (0.028946s)
|
||||
```
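
From an application, the same window queries can be issued through the TDengine Python connector (taospy). A sketch; the connection parameters and database name below are assumptions.

```python
import taos  # TDengine Python connector (taospy)

conn = taos.connect(host="localhost", user="root", password="taosdata", database="mydb")
result = conn.query(
    'SELECT _wstart, _wend, COUNT(*) FROM foo ANOMALY_WINDOW(i32, "algo=ksigma,k=2")'
)
for wstart, wend, cnt in result.fetch_all():
    print(wstart, wend, cnt)  # one row per detected anomaly window
conn.close()
```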
|
||||
|
||||
## Built-In Anomaly Detection Algorithms
|
||||
|
||||
TDgpt comes with six anomaly detection algorithms, divided among the following three categories: [Statistical Algorithms](./02-statistics-approach.md), [Data Density Algorithms](./03-data-density.md), and [Machine Learning Algorithms](./04-machine-learning.md). If you do not specify an algorithm, the IQR algorithm is used by default.
|
||||
|
||||
## Evaluating Algorithm Effectiveness
|
||||
|
||||
TDgpt provides an automated tool to compare the effectiveness of different algorithms across various datasets. For anomaly detection algorithms, it uses the recall and precision metrics to evaluate their performance.
|
||||
|
||||
By setting the following options in the configuration file `analysis.ini`, you can specify the time range of the test data, whether to generate annotated result images, and the anomaly detection algorithms to be evaluated along with their corresponding parameters.
|
||||
|
||||
Before comparing anomaly detection algorithms, you must manually label the results of the anomaly detection dataset. This is done by setting the value of the `anno_res` option. Each number in the array represents the index of an anomaly. For example, in the test dataset below, if the 9th point is an anomaly, the labeled result would be `[9]`.
|
||||
|
||||
```bash
|
||||
[ad]
|
||||
# training data start time
|
||||
start_time = 2021-01-01T01:01:01
|
||||
|
||||
# training data end time
|
||||
end_time = 2021-01-01T01:01:11
|
||||
|
||||
# draw the results or not
|
||||
gen_figure = true
|
||||
|
||||
# annotate the anomaly_detection result
|
||||
anno_res = [9]
|
||||
|
||||
# algorithms involved in the comparison
|
||||
[ad.algos]
|
||||
ksigma={"k": 2}
|
||||
iqr={}
|
||||
grubbs={}
|
||||
lof={"algorithm":"auto", "n_neighbor": 3}
|
||||
```
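
For reference, the recall and precision reported by the tool reduce to simple set arithmetic over the detected and labeled anomaly indices (such as `anno_res` above). A minimal sketch, not the tool's implementation:

```python
def precision_recall(detected, labeled):
    """Compute precision and recall from two lists of anomaly indices."""
    detected, labeled = set(detected), set(labeled)
    tp = len(detected & labeled)
    precision = tp / len(detected) if detected else 0.0
    recall = tp / len(labeled) if labeled else 0.0
    return precision, recall

print(precision_recall(detected=[9], labeled=[9]))  # (1.0, 1.0)
```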
|
||||
|
||||
After the comparison program finishes running, it automatically generates a file named `ad_result.xlsx`. The first sheet contains the algorithm execution results (as shown in the table below), including five metrics: algorithm name, execution parameters, recall, precision, and execution time.
|
||||
|
||||
| algorithm | params | precision(%) | recall(%) | elapsed_time(ms.) |
|
||||
| --------- | -------------------------------------- | ------------ | --------- | ----------------- |
|
||||
| ksigma | `{"k":2}` | 100 | 100 | 0.453 |
|
||||
| iqr | `{}` | 100 | 100 | 2.727 |
|
||||
| grubbs | `{}` | 100 | 100 | 2.811 |
|
||||
| lof | `{"algorithm":"auto", "n_neighbor":3}` | 0 | 0 | 4.660 |
|
||||
|
||||
If `gen_figure` is set to true, the tool automatically generates a visual representation of the analysis results for each algorithm being compared. The k-sigma algorithm is shown here as an example.
|
||||
|
||||
<figure>
|
||||
<Image img={adResult} alt="Anomaly detection results"/>
|
||||
</figure>
|
|
@ -0,0 +1,112 @@
|
|||
---
|
||||
title: Forecasting Algorithms
|
||||
sidebar_label: Forecasting Algorithms
|
||||
---
|
||||
|
||||
## Input Limitations
|
||||
|
||||
`execute` is the core method of forecasting algorithms. Before calling this method, the framework configures the historical time-series data used for forecasting in the `self.list` object attribute.
|
||||
|
||||
## Output Limitations and Parent Class Attributes
|
||||
|
||||
Running the `execute` method generates the following dictionary objects:
|
||||
|
||||
```python
|
||||
return {
|
||||
"mse": mse, # Mean squared error of the fit data
|
||||
"res": res # Result groups [timestamp, forecast results, lower boundary of confidence interval, upper boundary of confidence interval]
|
||||
}
|
||||
```
|
||||
|
||||
The parent class `AbstractForecastService` of forecasting algorithms includes the following object attributes.
|
||||
|
||||
|Attribute|Description|Default|
|
||||
|---|---|---|
|
||||
|period|Specify the periodicity of the data, i.e. the number of data points included in each period. If the data is not periodic, enter 0.|0|
|
||||
|start_ts|Specify the start time of forecasting results.|0|
|
||||
|time_step|Specify the interval between consecutive data points in the forecast results.|0|
|
||||
|fc_rows|Specify the number of forecast rows to return.|0|
|
||||
|return_conf|Specify 1 to include a confidence interval in the forecast results or 0 to not include a confidence interval in the results. If you specify 0, the mean is returned as the upper and lower boundaries.|1|
|
||||
|conf|Specify a confidence interval quantile.|95|
|
||||
|
||||
## Sample Code
|
||||
|
||||
The following code is a sample algorithm that always returns 1 as the forecast result.
|
||||
|
||||
```python
|
||||
import numpy as np
|
||||
from taosanalytics.service import AbstractForecastService
|
||||
|
||||
|
||||
# Algorithm class names must start with an underscore ("_") and end with "Service".
|
||||
class _MyForecastService(AbstractForecastService):
|
||||
""" Define a class inheriting from AbstractAnomalyDetectionService and implementing the `execute` method. """
|
||||
|
||||
# Name the algorithm using only lowercase ASCII characters.
|
||||
name = 'myfc'
|
||||
|
||||
# Include a description of the algorithm (recommended)
|
||||
desc = """return the forecast time series data"""
|
||||
|
||||
def __init__(self):
|
||||
"""Method to initialize the class"""
|
||||
super().__init__()
|
||||
|
||||
def execute(self):
|
||||
""" Implementation of algorithm logic"""
|
||||
res = []
|
||||
|
||||
"""This algorithm always returns 1 as the forecast result. The number of results returned is determined by the self.fc_rows value input by the user."""
|
||||
ts_list = [self.start_ts + i * self.time_step for i in range(self.fc_rows)]
|
||||
res.append(ts_list) # set timestamp column for forecast results
|
||||
|
||||
"""Generate forecast results whose value is 1. """
|
||||
res_list = [1] * self.fc_rows
|
||||
res.append(res_list)
|
||||
|
||||
"""Check whether user has requested the upper and lower boundaries of the confidence interval."""
|
||||
if self.return_conf:
|
||||
"""If the algorithm does not calculate these values, return the forecast results."""
|
||||
bound_list = [1] * self.fc_rows
|
||||
res.append(bound_list) # lower confidence limit
|
||||
res.append(bound_list) # upper confidence limit
|
||||
|
||||
"""Return results"""
|
||||
return {"res": res, "mse": 0}
|
||||
|
||||
def set_params(self, params):
|
||||
"""This algorithm does not take any parameters, only calling a parent function, so this logic is not included."""
|
||||
return super().set_params(params)
|
||||
|
||||
```
|
||||
|
||||
Save this file to the `./lib/taosanalytics/algo/fc/` directory and restart the `taosanode` service. In the TDengine CLI, run `SHOW ANODES FULL` to see your new algorithm. Your applications can now use this algorithm via SQL.
|
||||
|
||||
```SQL
|
||||
--- Forecast values in the `col_name` column using the newly added `myfc` algorithm
|
||||
SELECT _flow, _fhigh, _frowts, FORECAST(col_name, "algo=myfc")
|
||||
FROM foo;
|
||||
```
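
Applications can run this statement through the TDengine Python connector as well; a sketch with assumed connection parameters and database name:

```python
import taos  # TDengine Python connector (taospy)

conn = taos.connect(host="localhost", user="root", password="taosdata", database="mydb")
rows = conn.query(
    'SELECT _flow, _fhigh, _frowts, FORECAST(col_name, "algo=myfc") FROM foo'
).fetch_all()
for low, high, ts, value in rows:
    print(ts, value, (low, high))  # forecast value with its confidence bounds
conn.close()
```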
|
||||
|
||||
If you have never started the anode, see [Installation](../../management/) to add the anode to your TDengine cluster.
|
||||
|
||||
## Unit Testing
|
||||
|
||||
You can add unit test cases to the `forecast_test.py` file in the `taosanalytics/test` directory or create a file for unit tests. Unit tests have a dependency on the Python unittest module.
|
||||
|
||||
```python
|
||||
def test_myfc(self):
|
||||
""" Test the myfc class """
|
||||
s = loader.get_service("myfc")
|
||||
|
||||
# Configure data for forecasting
|
||||
s.set_input_list(self.get_input_list(), None)
|
||||
# Check whether all results are 1
|
||||
r = s.set_params(
|
||||
{"fc_rows": 10, "start_ts": 171000000, "time_step": 86400 * 30, "start_p": 0}
|
||||
)
|
||||
r = s.execute()
|
||||
|
||||
expected_list = [1] * 10
|
||||
self.assertEqual(r["res"][0], expected_list)
|
||||
```
|
|
@ -0,0 +1,79 @@
|
|||
---
|
||||
title: Anomaly Detection Algorithms
|
||||
sidebar_label: Anomaly Detection Algorithms
|
||||
---
|
||||
|
||||
## Input Limitations
|
||||
|
||||
`execute` is the core method of anomaly detection algorithms. Before calling this method, the framework configures the historical time-series data used for anomaly detection in the `self.list` object attribute.
|
||||
|
||||
## Output Limitations
|
||||
|
||||
The `execute` method returns an array of the same length as `self.list`. A value of `-1` in the array indicates an anomaly.
|
||||
|
||||
For example, in the series `[2, 2, 2, 2, 100]`, assuming that `100` is an anomaly, the method returns `[1, 1, 1, 1, -1]`.
|
||||
|
||||
## Sample Code
|
||||
|
||||
This section describes an example anomaly detection algorithm that returns the final data point in a time series as an anomaly.
|
||||
|
||||
```python
|
||||
from taosanalytics.service import AbstractAnomalyDetectionService
|
||||
|
||||
# Algorithm class names must start with an underscore ("_") and end with "Service".
|
||||
class _MyAnomalyDetectionService(AbstractAnomalyDetectionService):
|
||||
""" Define a class inheriting from AbstractAnomalyDetectionService and implementing the abstract method of that class. """
|
||||
|
||||
# Name the algorithm using only lowercase ASCII characters.
|
||||
name = 'myad'
|
||||
|
||||
# Include a description of the algorithm (recommended)
|
||||
desc = """return the last value as the anomaly data"""
|
||||
|
||||
def __init__(self):
|
||||
"""Method to initialize the class"""
|
||||
super().__init__()
|
||||
|
||||
def execute(self):
|
||||
""" Implementation of algorithm logic"""
|
||||
|
||||
"""Create an array with length len(self.list) whose results are all 1, then set the final value in the array to -1 to indicate an anomaly"""
|
||||
res = [1] * len(self.list)
|
||||
res[-1] = -1
|
||||
|
||||
"""Return results"""
|
||||
return res
|
||||
|
||||
|
||||
def set_params(self, params):
|
||||
"""This algorithm does not take any parameters, so this logic is not included."""
|
||||
|
||||
```
|
||||
|
||||
Save this file to the `./lib/taosanalytics/algo/ad/` directory and restart the `taosanode` service. In the TDengine CLI, run `SHOW ANODES FULL` to see your new algorithm. Your applications can now invoke this algorithm via SQL.
|
||||
|
||||
```SQL
|
||||
--- Detect anomalies in the `col` column using the newly added `myad` algorithm
|
||||
SELECT COUNT(*) FROM foo ANOMALY_WINDOW(col, 'algo=myad')
|
||||
```
|
||||
|
||||
If you have never started the anode, see [Installation](../../management/) to add the anode to your TDengine cluster.
|
||||
|
||||
### Unit Testing
|
||||
|
||||
You can add unit test cases to the `anomaly_test.py` file in the `taosanalytics/test` directory or create a file for unit tests. The framework uses the Python unittest module.
|
||||
|
||||
```python
|
||||
def test_myad(self):
|
||||
""" Test the _IqrService class """
|
||||
s = loader.get_service("myad")
|
||||
|
||||
# Configure the data to test
|
||||
s.set_input_list(AnomalyDetectionTest.input_list, None)
|
||||
|
||||
r = s.execute()
|
||||
|
||||
# The final value is an anomaly
|
||||
self.assertEqual(r[-1], -1)
|
||||
self.assertEqual(len(r), len(AnomalyDetectionTest.input_list))
|
||||
```
|
|
@ -0,0 +1,100 @@
|
|||
---
|
||||
title: Algorithm Developer's Guide
|
||||
sidebar_label: Algorithm Developer's Guide
|
||||
---
|
||||
|
||||
TDgpt is an extensible platform for advanced time-series data analytics. You can follow the steps described in this document to develop your own analytics algorithms and add them to the platform. Your applications can then use SQL statements to invoke these algorithms. Custom algorithms must be developed in Python.
|
||||
|
||||
The anode adds algorithms semi-dynamically. When the anode is started, it scans specified directories for files that meet its requirements and adds those files to the platform. To add an algorithm to your TDgpt, perform the following steps:
|
||||
|
||||
1. Develop an analytics algorithm according to the TDgpt requirements.
|
||||
2. Place the source code files in the appropriate directory and restart the anode.
|
||||
3. Run the `CREATE ANODE` statement to add the anode to your TDengine cluster.
|
||||
|
||||
Your algorithm has been added to TDgpt and can be used by your applications. Because TDgpt is decoupled from TDengine, adding or upgrading algorithms on the anode does not affect the TDengine server (taosd). On the application side, it is necessary only to update your SQL statements to start using new or upgraded algorithms.
|
||||
|
||||
This extensibility makes TDgpt suitable for a wide range of use cases. You can add any algorithms needed by your use cases on demand and invoke them via SQL. You can also update algorithms without making significant changes to your applications.
|
||||
|
||||
This document describes how to add algorithms to an anode and invoke them with SQL statements.
|
||||
|
||||
## Directory Structure
|
||||
|
||||
The directory structure of an anode is described as follows:
|
||||
|
||||
```bash
|
||||
.
|
||||
├── bin
|
||||
├── cfg
|
||||
├── lib
|
||||
│ └── taosanalytics
|
||||
│ ├── algo
|
||||
│ │ ├── ad
|
||||
│ │ └── fc
|
||||
│ ├── misc
|
||||
│ └── test
|
||||
├── log -> /var/log/taos/taosanode
|
||||
├── model -> /var/lib/taos/taosanode/model
|
||||
└── venv -> /var/lib/taos/taosanode/venv
|
||||
|
||||
```
|
||||
|
||||
|Directory|Description|
|
||||
|---|---|
|
||||
|taosanalytics| Source code, including the `algo` subdirectory for algorithms, the `test` subdirectory for unit and integration tests, and the `misc` subdirectory for other files. Within the `algo` subdirectory, the `ad` subdirectory includes anomaly detection algorithms, and the `fc` subdirectory includes forecasting algorithms.|
|
||||
|venv| Virtual Python environment |
|
||||
|model|Trained models for datasets|
|
||||
|cfg|Configuration files|
|
||||
|
||||
:::note
|
||||
- Place Python source code for anomaly detection in the `./lib/taosanalytics/algo/ad` directory.
|
||||
- Place Python source code for forecasting in the `./lib/taosanalytics/algo/fc` directory.
|
||||
:::
|
||||
|
||||
## Class Naming Rules
|
||||
|
||||
The anode loads algorithms automatically and recognizes only appropriately named Python classes. Algorithm class names must start with an underscore (`_`) and end with `Service`. For example, `_KsigmaService` is the class name of the k-sigma anomaly detection algorithm.
|
||||
|
||||
## Class Inheritance Rules
|
||||
|
||||
- All anomaly detection algorithms must inherit `AbstractAnomalyDetectionService` and implement the `execute` method.
|
||||
- All forecasting algorithms must inherit `AbstractForecastService` and implement the `execute` method.
|
||||
|
||||
## Class Property Initialization
|
||||
|
||||
Your classes must initialize the following properties:
|
||||
|
||||
- `name`: identifier of the algorithm. Use lowercase letters only. This identifier is displayed when you use the `SHOW` statement to display available algorithms.
|
||||
- `desc`: basic description of the algorithm.
|
||||
|
||||
```SQL
|
||||
--- The `algo` key takes the defined `name` value.
|
||||
SELECT COUNT(*)
|
||||
FROM foo ANOMALY_WINDOW(col_name, 'algo=name')
|
||||
```
|
||||
|
||||
## Adding Algorithms with Models
|
||||
|
||||
Certain machine learning algorithms must be trained on your data and generate a model. The same algorithm may use different models for different datasets.
|
||||
When you add an algorithm that uses models to your anode, first create subdirectories for your models in the `model` directory, and save the trained model for each algorithm and dataset to the corresponding subdirectory. You can specify custom names for these subdirectories in your algorithms. Use the `joblib` library to serialize trained models to ensure that they can be read and loaded.
|
||||
|
||||
The following section describes how to add an anomaly detection algorithm that requires trained models. The autoencoder algorithm is used as an example.
|
||||
First, create the `ad_autoencoder` subdirectory in the `model` directory. This subdirectory is used to store models for the autoencoder algorithm. Next, train the algorithm on the `foo` table and obtain a trained model named `ad_autoencoder_foo`. Use the `joblib` library to serialize the model and save it to the `ad_autoencoder` subdirectory. As shown in the following listing, the `ad_autoencoder_foo` model comprises two files: the model file `ad_autoencoder_foo.dat` and the model description `ad_autoencoder_foo.info`.
|
||||
|
||||
```bash
|
||||
.
|
||||
└── model
|
||||
└── ad_autoencoder
|
||||
├── ad_autoencoder_foo.dat
|
||||
└── ad_autoencoder_foo.info
|
||||
|
||||
```
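
The `.dat` file can be produced with `joblib` as noted above. A minimal sketch; the stand-in model and training step are placeholders, and the target path follows the directory layout shown earlier (the accompanying `.info` description file is not covered here).

```python
import joblib
from sklearn.neighbors import LocalOutlierFactor

# Placeholder for a real training step on the `foo` dataset; any serializable model object works.
trained_model = LocalOutlierFactor(n_neighbors=3, novelty=True).fit([[2.0], [2.1], [2.0], [2.2]])

# Serialize the model into the anode's model directory so that it can be loaded by name.
joblib.dump(trained_model, "/var/lib/taos/taosanode/model/ad_autoencoder/ad_autoencoder_foo.dat")
```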
|
||||
|
||||
The following section describes how to invoke this model with a SQL statement.
|
||||
Set the `algo` parameter to `ad_encoder` to instruct TDgpt to use the autoencoder algorithm. This algorithm is in the available algorithms list and can be used directly. Set the `model` parameter to `ad_autoencoder_foo` to instruct TDgpt to use the trained model generated in the previous section.
|
||||
|
||||
```SQL
|
||||
--- Add the name of the model `ad_autoencoder_foo` in the options of the anomaly window and detect anomalies in the dataset `foo` using the autoencoder algorithm.
|
||||
SELECT COUNT(*), _WSTART
|
||||
FROM foo
|
||||
ANOMALY_WINDOW(col1, 'algo=ad_encoder, model=ad_autoencoder_foo');
|
||||
```
|
|
@ -0,0 +1,6 @@
|
|||
---
|
||||
title: Data Imputation
|
||||
sidebar_label: Data Imputation
|
||||
---
|
||||
|
||||
Coming soon
|
|
@ -0,0 +1,6 @@
|
|||
---
|
||||
title: Time-Series Classification
|
||||
sidebar_label: Time-Series Classification
|
||||
---
|
||||
|
||||
Coming soon
|
|
@ -0,0 +1,75 @@
|
|||
---
|
||||
title: Quick Start Guide
|
||||
sidebar_label: Quick Start Guide
|
||||
---
|
||||
|
||||
## Get Started with Docker
|
||||
|
||||
This document describes how to get started with TDgpt in Docker.
|
||||
|
||||
### Start TDgpt
|
||||
|
||||
If you have installed Docker, pull the latest TDengine container:
|
||||
|
||||
```shell
|
||||
docker pull tdengine/tdengine:latest
|
||||
```
|
||||
|
||||
You can specify a version if desired:
|
||||
|
||||
```shell
|
||||
docker pull tdengine/tdengine:3.3.3.0
|
||||
```
|
||||
|
||||
Then run the following command:
|
||||
|
||||
```shell
|
||||
docker run -d -p 6030:6030 -p 6041:6041 -p 6043:6043 -p 6044-6049:6044-6049 -p 6044-6045:6044-6045/udp -p 6060:6060 tdengine/tdengine
|
||||
```
|
||||
|
||||
Note: TDgpt runs on TCP port 6090. TDgpt is a stateless analytics agent and does not persist data. It only saves log files to the local disk.
|
||||
|
||||
Confirm that your Docker container is running:
|
||||
|
||||
```shell
|
||||
docker ps
|
||||
```
|
||||
|
||||
Enter the container and run the bash shell:
|
||||
|
||||
```shell
|
||||
docker exec -it <container name> bash
|
||||
```
|
||||
|
||||
You can now run Linux commands and access TDengine.
|
||||
|
||||
## Get Started with an Installation Package
|
||||
|
||||
### Obtain the Package
|
||||
|
||||
1. Download the tar.gz package from the list:
|
||||
2. Open the directory containing the downloaded package and decompress it.
|
||||
3. Open the directory containing the decompressed package and run the `install.sh` script.
|
||||
|
||||
Note: Replace `<version>` with the version that you downloaded.
|
||||
|
||||
```bash
|
||||
tar -zxvf TDengine-anode-<version>-Linux-x64.tar.gz
|
||||
```
|
||||
|
||||
Then open the directory created by decompression and run the `install.sh` script:
|
||||
|
||||
```bash
|
||||
sudo ./install.sh
|
||||
```
|
||||
|
||||
### Deploy TDgpt
|
||||
|
||||
See [Installing TDgpt](../management/) to prepare your environment and deploy TDgpt.
|
||||
|
||||
## Get Started in TDengine Cloud
|
||||
|
||||
You can use TDgpt with your TDengine Cloud deployment. Register for a TDengine Cloud account, ensure that you have at least one instance, and register TDgpt to your TDengine Cloud instance as described in the documentation. See the TDengine Cloud documentation for more information.
|
||||
|
||||
Create a TDgpt instance, and then refer to [Installing TDgpt](../management/) to manage your anode.
|
||||
|
|
@ -0,0 +1,48 @@
|
|||
---
|
||||
title: Frequently Asked Questions
|
||||
sidebar_label: Frequently Asked Questions
|
||||
---
|
||||
|
||||
## 1. During the installation process, uWSGI fails to compile
|
||||
|
||||
The TDgpt installation process compiles uWSGI on your local machine. In certain Python distributions, such as Anaconda, conflicts may occur during compilation. In this case, you can choose not to install uWSGI.
|
||||
|
||||
However, this means that you must manually run the `python3.10 /usr/local/taos/taosanode/lib/taosanalytics/app.py` command when starting the taosanode service. Use a virtual Python environment when running this command to ensure that dependencies can be loaded.
|
||||
|
||||
## 2. Anodes fail to be created because the service cannot be accessed.
|
||||
|
||||
```bash
|
||||
taos> create anode '127.0.0.1:6090';
|
||||
|
||||
DB error: Analysis service can't access[0x80000441] (0.117446s)
|
||||
```
|
||||
|
||||
First, use curl to check whether the anode is providing services. If it is, the output of `curl '127.0.0.1:6090'` is as follows:
|
||||
|
||||
```bash
|
||||
TDengine© Time Series Data Analytics Platform (ver 1.0.x)
|
||||
```
|
||||
|
||||
The following output indicates that the anode is not providing services:
|
||||
|
||||
```bash
|
||||
curl: (7) Failed to connect to 127.0.0.1 port 6090: Connection refused
|
||||
```
|
||||
|
||||
If the anode has not started or is not running, check the uWSGI logs in the `/var/log/taos/taosanode/taosanode.log` file to find and resolve any errors.
|
||||
|
||||
Note: Do not use systemctl to check the status of the taosanode service.
|
||||
|
||||
## 3. The service is operational, but queries return that the service is not available.
|
||||
|
||||
```bash
|
||||
taos> select _frowts,forecast(current, 'algo=arima, alpha=95, wncheck=0, rows=20') from d1 where ts<='2017-07-14 10:40:09.999';
|
||||
|
||||
DB error: Analysis service can't access[0x80000441] (60.195613s)
|
||||
```
|
||||
|
||||
The timeout period for the analysis service is 60 seconds. If the analysis process cannot be completed within this period, this error will occur. You can reduce the scope of data being analyzed or try another algorithm to avoid the error.
|
||||
|
||||
## 4. Illegal json format error is returned.
|
||||
|
||||
This indicates that the analysis results contain an error. Check the anode operation logs in the `/var/log/taos/taosanode/taosanode.app.log` file to find and resolve any issues.
|
|
@ -0,0 +1,116 @@
|
|||
---
|
||||
sidebar_label: TDgpt
|
||||
title: TDgpt
|
||||
---
|
||||
|
||||
import Image from '@theme/IdealImage';
|
||||
import tdgptArch from '../../assets/tdgpt-01.png';
|
||||
|
||||
## Introduction
|
||||
|
||||
Numerous algorithms have been proposed to perform time-series forecasting, anomaly detection, imputation, and classification, with varying technical characteristics suited for different scenarios.
|
||||
|
||||
Typically, these analysis algorithms are packaged as toolkits in high-level programming languages (such as Python or R) and are widely distributed and used through open-source channels. This model helps software developers integrate complex analysis algorithms into their systems and greatly lowers the barrier to using advanced algorithms.
|
||||
|
||||
Database system developers have also attempted to integrate data analysis algorithm models directly into database systems. By building machine learning libraries (e.g., Spark’s MLlib), they aim to leverage mature analytical techniques to enhance the advanced data analysis capabilities of databases or analytical computing engines.
|
||||
|
||||
The rapid development of artificial intelligence (AI) has brought new opportunities to time-series data analysis. Efficiently applying AI capabilities to this field also presents new possibilities for databases. To this end, TDengine has introduced TDgpt, an intelligent agent for time-series analytics. With TDgpt, you can use statistical analysis algorithms, machine learning models, deep learning models, foundational models for time-series data, and large language models via SQL statements. TDgpt exposes the analytical capabilities of these algorithms and models through SQL and applies them to your time-series data using new windows and functions.
|
||||
|
||||
## Technical Features
|
||||
|
||||
TDgpt is an external agent that integrates seamlessly with TDengine’s main process taosd. It allows time-series analysis services to be embedded directly into TDengine’s query execution flow.
|
||||
|
||||
TDgpt is a stateless platform that includes the classic statsmodels library of statistical analysis models as well as embedded frameworks such as Torch and Keras for machine and deep learning. In addition, it can directly invoke TDengine’s proprietary foundation model TDtsfm through request forwarding and adaptation.
|
||||
|
||||
As an analytics agent, TDgpt will also support integration with third-party time-series model-as-a-service (MaaS) platforms in the future. By modifying just a single parameter (algo), you will be able to access cutting-edge time-series model services.
|
||||
|
||||
TDgpt is an open system to which you can easily add your own algorithms for forecasting, anomaly detection, imputation, and classification. Once added, the new algorithms can be used simply by changing the corresponding parameters in the SQL statement, with no need to modify a single line of application code.
|
||||
|
||||
## System Architecture
|
||||
|
||||
TDgpt is composed of one or more stateless analysis nodes, called AI nodes (anodes). These anodes can be deployed as needed across the TDengine cluster in appropriate hardware environments (for example, on compute nodes equipped with GPUs) depending on the requirements of the algorithms being used.
|
||||
|
||||
TDgpt provides a unified interface and invocation method for different types of analysis algorithms. Based on user-specified parameters, it calls advanced algorithm packages and other analytical tools, then returns the results to TDengine’s main process (taosd) in a predefined format.
|
||||
|
||||
TDgpt consists of four main components:
|
||||
|
||||
- Built-in analytics libraries: Includes libraries such as statsmodels, pyculiarity, and pmdarima, offering ready-to-use models for forecasting and anomaly detection.
|
||||
- Built-in machine learning libraries: Includes libraries like Torch, Keras, and Scikit-learn to run pre-trained machine and deep learning models within TDgpt’s process space. The training process can be managed using end-to-end open-source ML frameworks such as Merlion or Kats, and trained models can be deployed by uploading them to a designated TDgpt directory.
|
||||
- Request adapter for general-purpose LLMs: Converts time-series forecasting requests into prompts for general-purpose LLMs such as Llama in a MaaS manner. (Note: This functionality is not open source.)
|
||||
- Adapter for locally deployed time-series models: Sends requests directly to models like Time-MoE and TDtsfm that are specifically designed for time-series data. Compared to general-purpose LLMs, these models do not require prompt engineering, are lighter-weight, and are easier to deploy locally with lower hardware requirements. In addition, the adapter can also connect to cloud-based time-series MaaS systems such as TimeGPT, enabling localized analysis powered by cloud-hosted models.
|
||||
|
||||
<figure>
|
||||
<Image img={tdgptArch} alt="TDgpt Architecture"/>
|
||||
<figcaption>TDgpt architecture</figcaption>
|
||||
</figure>
|
||||
|
||||
During query execution, the vnode in TDengine forwards any elements involving advanced time-series data analytics directly to the anode. Once the analysis is completed, the results are assembled and embedded back into the query execution process.
|
||||
|
||||
## Advanced Analytics
|
||||
|
||||
The analytics services provided by TDgpt are described as follows:
|
||||
|
||||
- Anomaly detection: This service is provided via a new anomaly window that has been introduced into TDengine. An anomaly window is a special type of event window, defined by the anomaly detection algorithm as a time window during which an anomaly is occurring. This window differs from an event window in that the algorithm determines when it opens and closes instead of expressions input by the user. The query operations supported by other windows are also supported for anomaly windows.
|
||||
- Time-series forecasting: The FORECAST function invokes a specified (or default) forecasting algorithm to predict future time-series data based on input historical data.
|
||||
- Data imputation: To be released in July 2025
|
||||
- Time-series classification: To be released in July 2025
|
||||
|
||||
## Custom Algorithms
|
||||
|
||||
TDgpt is an extensible platform to which you can add your own algorithms and models using the process described in [Algorithm Developer's Guide](./dev/). After adding an algorithm, you can access it through SQL statements just like the built-in algorithms. It is not necessary to make updates to your applications.
|
||||
|
||||
Custom algorithms must be developed in Python. The anode adds algorithms dynamically. When the anode is started, it scans specified directories for files that meet its requirements and adds those files to the platform. To add an algorithm to your TDgpt, perform the following steps:
|
||||
|
||||
1. Develop an analytics algorithm according to the TDgpt requirements.
|
||||
2. Place the source code files in the appropriate directory and restart the anode.
|
||||
3. Refresh the algorithm cache table.
|
||||
|
||||
You can then use your new algorithm in SQL statements.
|
||||
|
||||
## Algorithm Evaluation
|
||||
|
||||
TDengine Enterprise includes a tool that evaluates the effectiveness of different algorithms and models. You can use this tool on any algorithm or model in TDgpt, including built-in and custom forecasting and anomaly detection algorithms and models. The tool uses quantitative metrics to evaluate the accuracy and performance of each algorithm with a given dataset in TDengine.
|
||||
|
||||
## Model Management
|
||||
|
||||
Trained models for machine learning frameworks such as Torch, TensorFlow, and Keras must be placed in the designated directory on the anode. The anode automatically detects and loads models from this directory.
|
||||
|
||||
TDengine Enterprise includes a model manager that integrates seamlessly with open-source end-to-end ML frameworks for time-series data such as Merlion and Kats.
|
||||
|
||||
## Processing Performance
|
||||
|
||||
Time-series analytics is a CPU-intensive workflow. Using a more powerful CPU or GPU can improve performance.
|
||||
|
||||
Machine and deep learning models in TDgpt are run through Torch, and you can use standard methods to improve their performance, for example, deploying TDgpt on a machine with more RAM and using a Torch model that can take advantage of GPUs.
|
||||
|
||||
You can add different algorithms and models to different anodes to enable concurrent processing.
|
||||
|
||||
## Operations and Maintenance
|
||||
|
||||
With TDengine OSS, permissions and resource management are not provided for TDgpt.
|
||||
|
||||
TDgpt is deployed as a Flask service through uWSGI. You can monitor its status by opening the port in uWSGI.
|
||||
|
||||
## References
|
||||
|
||||
[1] Merlion: https://opensource.salesforce.com/Merlion/latest/index.html
|
||||
|
||||
[2] Kats: https://facebookresearch.github.io/Kats/
|
||||
|
||||
[3] StatsModels: https://www.statsmodels.org/stable/index.html
|
||||
|
||||
[4] Keras: https://keras.io/guides/
|
||||
|
||||
[5] Torch: https://pytorch.org/
|
||||
|
||||
[6] Scikit-learn: https://scikit-learn.org/stable/index.html
|
||||
|
||||
[7] Time-MoE: https://github.com/Time-MoE/Time-MoE
|
||||
|
||||
[8] TimeGPT: https://docs.nixtla.io/docs/getting-started-about_timegpt
|
||||
|
||||
[9] DeepSeek: https://www.deepseek.com/
|
||||
|
||||
[10] Llama: https://www.llama.com/docs/overview/
|
||||
|
||||
[11] Spark MLlib: https://spark.apache.org/docs/latest/ml-guide.html
|
|
@ -0,0 +1,272 @@
|
|||
---
|
||||
sidebar_label: Virtual Tables
|
||||
title: Virtual Tables
|
||||
description: Various management operations for virtual tables
|
||||
---
|
||||
|
||||
import origintable from './assets/virtual-table-origin-table.png';
|
||||
import queryres from './assets/virtual-table-query-res.png';
|
||||
import partres from './assets/virtual-table-query-res-part.png';
|
||||
|
||||
## Create Virtual Table
|
||||
|
||||
The `CREATE VTABLE` statement is used to create virtual basic tables and virtual subtables using virtual supertables as templates.
|
||||
|
||||
### Create Virtual Supertables
|
||||
|
||||
Refer to the `VIRTUAL` parameter in [Create Supertable](./04-stable.md#create-supertable).
|
||||
|
||||
### Create Virtual Basic Table
|
||||
|
||||
```sql
|
||||
CREATE VTABLE [IF NOT EXISTS] [db_name].vtb_name
|
||||
ts_col_name timestamp,
|
||||
(create_definition[ ,create_definition] ...)
|
||||
|
||||
create_definition:
|
||||
vtb_col_name column_definition
|
||||
|
||||
column_definition:
|
||||
type_name [FROM [db_name.]table_name.col_name]
|
||||
```
|
||||
|
||||
### Create Virtual Subtable
|
||||
|
||||
```sql
|
||||
CREATE VTABLE [IF NOT EXISTS] [db_name].vtb_name
|
||||
(create_definition[ ,create_definition] ...)
|
||||
USING [db_name.]stb_name
|
||||
[(tag_name [, tag_name] ...)]
|
||||
TAGS (tag_value [, tag_value] ...)
|
||||
|
||||
create_definition:
|
||||
[stb_col_name FROM] [db_name.]table_name.col_name
|
||||
tag_value:
|
||||
const_value
|
||||
```
|
||||
|
||||
**Usage Notes**
|
||||
|
||||
1. Naming rules for virtual tables/columns follow [Name Rules](./19-limit.md#naming-rules).
|
||||
2. Maximum table name length: 192 characters.
|
||||
3. The first column must be TIMESTAMP and is automatically set as primary key.
|
||||
4. Row length cannot exceed 64KB (Note: VARCHAR/NCHAR/GEOMETRY columns consume 2 extra bytes each).
|
||||
5. Specify maximum length for VARCHAR/NCHAR/GEOMETRY types (e.g., VARCHAR(20)).
|
||||
6. Use `FROM` to specify column data sources. Cross-database sources are supported via `db_name`.
|
||||
7. During queries, the timestamp column (ts) values of a virtual table are the merged results of the timestamp primary keys of all involved source tables.
|
||||
8. Virtual supertables can only be used to create virtual subtables, and virtual subtables can only use virtual supertables as templates.
|
||||
9. Ensure virtual tables' column/tag data types match their source columns/tags.
|
||||
10. Virtual table names must be unique within a database and cannot conflict with table names. It is recommended, but not enforced, that view names do not duplicate virtual table names; when a view and a virtual table share the same name, operations such as writing, querying, granting, and revoking permissions are applied to the virtual table.
|
||||
11. When creating virtual subtables/basic tables, `FROM` columns must originate from basic tables/subtables (not supertables, views, or other virtual tables).
|
||||
|
||||
## Query Virtual Tables
|
||||
|
||||
Virtual tables use the same query syntax as regular tables, but their dataset may vary between queries based on data alignment rules.
|
||||
|
||||
### Data Alignment Rules
|
||||
|
||||
1. Align data from multiple source tables by timestamp.
|
||||
2. Combine columns with same timestamp into one row; missing values fill with NULL.
|
||||
3. Virtual table timestamps are the union of all involved columns' origin tables' timestamps. Therefore, the number of rows in the result set may vary when different queries select different columns.
|
||||
4. Users can combine any columns from multiple tables; unselected columns are excluded.
|
||||
|
||||
**Example**
|
||||
|
||||
Given tables t1, t2, t3 with data:
|
||||
|
||||
<img src={origintable} width="500" alt="Original Table Structure and Data" />
|
||||
|
||||
Create a virtual table v1:
|
||||
|
||||
```sql
|
||||
CREATE VTABLE v1 (
|
||||
ts timestamp,
|
||||
c1 int FROM t1.value,
|
||||
c2 int FROM t2.value,
|
||||
c3 int FROM t3.value1,
|
||||
c4 int FROM t3.value2);
|
||||
```
|
||||
|
||||
Querying all columns:
|
||||
|
||||
```sql
|
||||
SELECT * FROM v1;
|
||||
```
|
||||
|
||||
Result:
|
||||
<img src={queryres} width="200" alt="Full Query Result" />
|
||||
|
||||
Partial column query:
|
||||
|
||||
```sql
|
||||
SELECT c1, c2 FROM v1;
|
||||
```
|
||||
|
||||
Result:
|
||||
<img src={partres} width="200" alt="Partial Query Result" />
|
||||
|
||||
Since the original tables t1 and t2 (corresponding to columns c1 and c2) lack the timestamp 0:00:03, this timestamp will not appear in the final result.
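
The alignment rules can be reproduced outside TDengine with an outer join on the timestamp, which may help when reasoning about result sets. A pandas sketch; the values below are hypothetical stand-ins for the figures above.

```python
import pandas as pd

t1 = pd.DataFrame({"ts": ["0:00:01", "0:00:02"], "c1": [1, 2]})
t2 = pd.DataFrame({"ts": ["0:00:01", "0:00:02"], "c2": [10, 20]})
t3 = pd.DataFrame({"ts": ["0:00:01", "0:00:03"], "c3": [100, 300]})

# SELECT * FROM v1: outer-join on ts; missing values become NULL (NaN).
v1 = t1.merge(t2, on="ts", how="outer").merge(t3, on="ts", how="outer").sort_values("ts")
print(v1)  # includes 0:00:03 because c3 selects from t3

# SELECT c1, c2 FROM v1: only t1 and t2 contribute timestamps, so 0:00:03 disappears.
print(t1.merge(t2, on="ts", how="outer").sort_values("ts"))
```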
|
||||
|
||||
**Limitations**
|
||||
1. Querying virtual supertables does not support subtables from different databases.
|
||||
|
||||
## Modify Virtual Basic Tables
|
||||
|
||||
```sql
|
||||
ALTER VTABLE [db_name.]vtb_name alter_table_clause
|
||||
|
||||
alter_table_clause: {
|
||||
ADD COLUMN vtb_col_name vtb_column_type [FROM table_name.col_name]
|
||||
| DROP COLUMN vtb_col_name
|
||||
| ALTER COLUMN vtb_col_name SET {table_name.col_name | NULL }
|
||||
| MODIFY COLUMN col_name column_type
|
||||
| RENAME COLUMN old_col_name new_col_name
|
||||
}
|
||||
```
|
||||
|
||||
### Add Column
|
||||
|
||||
```sql
|
||||
ALTER VTABLE vtb_name ADD COLUMN vtb_col_name vtb_col_type [FROM [db_name].table_name.col_name]
|
||||
```
|
||||
|
||||
### Drop Column
|
||||
|
||||
```sql
|
||||
ALTER VTABLE vtb_name DROP COLUMN vtb_col_name
|
||||
```
|
||||
|
||||
### Modify Column Width
|
||||
|
||||
```sql
|
||||
ALTER VTABLE vtb_name MODIFY COLUMN vtb_col_name data_type(length);
|
||||
```
|
||||
|
||||
### Rename Column
|
||||
|
||||
```sql
|
||||
ALTER VTABLE vtb_name RENAME COLUMN old_col_name new_col_name
|
||||
```
|
||||
|
||||
### Change Column Source
|
||||
|
||||
```sql
|
||||
ALTER VTABLE vtb_name ALTER COLUMN vtb_col_name SET {[db_name.]table_name.col_name | NULL}
|
||||
```
|
||||
|
||||
## Modify Virtual Subtables
|
||||
|
||||
```sql
|
||||
ALTER VTABLE [db_name.]vtb_name alter_table_clause
|
||||
|
||||
alter_table_clause: {
|
||||
ALTER COLUMN vtb_col_name SET table_name.col_name
|
||||
| SET TAG tag_name = new_tag_value
|
||||
}
|
||||
```
|
||||
|
||||
### Modify Subtable Tag Value
|
||||
|
||||
```sql
|
||||
ALTER VTABLE tb_name SET TAG tag_name1=new_tag_value1, tag_name2=new_tag_value2 ...;
|
||||
```
|
||||
|
||||
### Change Column Source
|
||||
|
||||
```sql
|
||||
ALTER VTABLE vtb_name ALTER COLUMN vtb_col_name SET {[db_name.]table_name.col_name | NULL}
|
||||
```
|
||||
|
||||
## Drop Virtual Tables
|
||||
|
||||
```sql
|
||||
DROP VTABLE [IF EXISTS] [dbname].vtb_name;
|
||||
```
|
||||
|
||||
## View Virtual Table Information
|
||||
|
||||
### List Virtual Tables
|
||||
|
||||
```sql
|
||||
SHOW [NORMAL | CHILD] [db_name.]VTABLES [LIKE 'pattern'];
|
||||
```
|
||||
|
||||
### Show Creation Statement
|
||||
|
||||
```sql
|
||||
SHOW CREATE VTABLE [db_name.]vtable_name;
|
||||
```
|
||||
|
||||
### Describe Structure
|
||||
|
||||
```sql
|
||||
DESCRIBE [db_name.]vtb_name;
|
||||
```
|
||||
|
||||
### Query All Virtual Tables' Information
|
||||
|
||||
```sql
|
||||
SELECT ... FROM information_schema.ins_tables WHERE type = 'VIRTUAL_NORMAL_TABLE' OR type = 'VIRTUAL_CHILD_TABLE';
|
||||
```
|
||||
|
||||
## Write to Virtual Tables
|
||||
|
||||
Writing or deleting data in virtual tables is **not supported**. Virtual tables are logical views computed from source tables.
|
||||
|
||||
## Virtual Tables vs. Views
|
||||
|
||||
| Property | Virtual Table | View |
|
||||
|-----------------------|-----------------------------------|-------------------------------|
|
||||
| **Definition** | Dynamic structure combining multiple tables by timestamp. | Saved SQL query definition. |
|
||||
| **Data Source** | Multiple tables with timestamp alignment. | Single/multiple table query results. |
|
||||
| **Storage** | No physical storage; dynamic generation. | No storage; query logic only. |
|
||||
| **Timestamp Handling**| Aligns timestamps across tables. | Follows query logic. |
|
||||
| **Update Mechanism** | Real-time reflection of source changes. | Depends on query execution. |
|
||||
| **Special Features** | Supports NULL filling and interpolation (prev/next/linear). | No built-in interpolation. |
|
||||
| **Use Case** | Time series alignment, cross-table analysis. | Simplify complex queries, access control. |
|
||||
| **Performance** | Potentially higher complexity. | Similar to underlying queries. |
|
||||
|
||||
Mutual conversion between virtual tables and views is not supported. For example, you cannot create a view based on a virtual table or create a virtual table from a view.
|
||||
|
||||
## Permissions
|
||||
|
||||
Virtual table permissions are categorized into READ and WRITE. Query operations require READ permission, while operations to delete or modify the virtual table itself require WRITE permission.
|
||||
|
||||
### Syntax
|
||||
|
||||
#### Grant
|
||||
```sql
|
||||
GRANT privileges ON [db_name.]vtable_name TO user_name
|
||||
privileges: { ALL | READ | WRITE }
|
||||
```
|
||||
|
||||
#### Revoke
|
||||
```sql
|
||||
REVOKE privileges ON [db_name.]vtable_name FROM user_name
|
||||
privileges: { ALL | READ | WRITE }
|
||||
```
|
||||
|
||||
### Permission Rules
|
||||
|
||||
1. The creator of a virtual table and the root user have all permissions by default.
|
||||
2. Users can grant or revoke read/write permissions for specific virtual tables (including virtual supertables and virtual regular tables) via `dbname.vtbname`. Direct permission operations on virtual subtables are not supported.
|
||||
3. Virtual subtables and virtual supertables do not support tag-based authorization (table-level authorization). Virtual subtables inherit permissions from their virtual supertables.
|
||||
4. Granting and revoking permissions for other users must be performed through `GRANT` and `REVOKE` statements, and only the root user can execute these operations.
|
||||
5. The detailed permission control rules are summarized below:
|
||||
|
||||
| No. | Operation | Permission Requirements |
|
||||
|-----|--------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
||||
| 1 | CREATE VTABLE | The user has **WRITE** permission on the database to which the virtual table belongs, and <br />the user has **READ** permission on the source tables corresponding to the virtual table's data sources. |
|
||||
| 2 | DROP/ALTER VTABLE | The user has **WRITE** permission on the virtual table. If specifying a column's data source, the user must also have **READ** permission on the source table corresponding to that column. |
|
||||
| 3 | SHOW VTABLES | None |
|
||||
| 4 | SHOW CREATE VTABLE | None |
|
||||
| 5 | DESCRIBE VTABLE | None |
|
||||
| 6 | Query System Tables | None |
|
||||
| 7 | SELECT FROM VTABLE | The user has **READ** permission on the virtual table. |
|
||||
| 8 | GRANT/REVOKE | Only the **root user** has permission. |
|
||||
|
||||
## Use Cases
|
||||
|
||||
| SQL Query | SQL Write | STMT Query | STMT Write | Subscribe | Stream Compute |
|
||||
|----------|-----------|------------|------------|-----------|----------------|
|
||||
| Supported | Not Supported | Not Supported | Not Supported | Not Supported | Supported |
|
|
@ -99,6 +99,7 @@ The list of keywords is as follows:
|
|||
| CONSUMER | |
|
||||
| CONSUMERS | |
|
||||
| CONTAINS | |
|
||||
| CONTINUOUS_WINDOW_CLOSE | 3.3.6.0+ |
|
||||
| COPY | |
|
||||
| COUNT | |
|
||||
| COUNT_WINDOW | |
|
||||
|
@ -113,7 +114,7 @@ The list of keywords is as follows:
|
|||
| DATABASE | |
|
||||
| DATABASES | |
|
||||
| DBS | |
|
||||
| DECIMAL | |
|
||||
| DECIMAL | 3.3.6.0+ |
|
||||
| DEFERRED | |
|
||||
| DELETE | |
|
||||
| DELETE_MARK | |
|
||||
|
|
After Width: | Height: | Size: 7.8 KiB |
After Width: | Height: | Size: 3.1 KiB |
After Width: | Height: | Size: 4.9 KiB |
After Width: | Height: | Size: 55 KiB |
After Width: | Height: | Size: 53 KiB |
After Width: | Height: | Size: 280 KiB |
After Width: | Height: | Size: 89 KiB |
After Width: | Height: | Size: 87 KiB |
After Width: | Height: | Size: 61 KiB |
After Width: | Height: | Size: 324 KiB |
After Width: | Height: | Size: 20 KiB |
|
@ -357,7 +357,7 @@ TAGS (
|
|||
|
||||
以设备 d1001 为例,假设 d1001 设备的电流、电压、相位数据如下:
|
||||
|
||||
<img src={origintable} width="70%" alt="data-model-origin-table" />
|
||||
<img src={origintable} width="500" alt="data-model-origin-table" />
|
||||
|
||||
虚拟表 d1001_v 中的数据如下:
|
||||
|
||||
|
@ -390,7 +390,7 @@ CREATE VTABLE current_v (
|
|||
|
||||
假设 d1001, d1002, d1003, d1004 四个设备的电流数据如下:
|
||||
|
||||
<img src={origintable2} width="70%" alt="data-model-origin-table-2" />
|
||||
<img src={origintable2} width="500" alt="data-model-origin-table-2" />
|
||||
|
||||
虚拟表 current_v 中的数据如下:
|
||||
|
||||
|
|
|
@ -136,7 +136,7 @@ create stream if not exists count_history_s fill_history 1 into count_history as
|
|||
```sql
|
||||
create stream if not exists continuous_query_s trigger force_window_close into continuous_query as select count(*) from power.meters interval(10s) sliding(1s)
|
||||
```
|
||||
5. CONTINUOUS_WINDOW_CLOSE:窗口关闭时输出结果。修改、删除数据,并不会立即触发重算,每等待 rec_time_val 时长,会进行周期性重算。如果不指定 rec_time_val,那么重算周期是60分钟。如果重算的时间长度超过 rec_time_val,在本次重算后,自动开启下一次重算。该模式当前只支持 INTERVAL 窗口。如果使用 FILL,需要配置 adapter的相关信息:adapterFqdn、adapterPort、adapterToken。adapterToken 为 `{username}:{password}` 经过 Base64 编码之后的字符串,例如 `root:taosdata` 编码后为 `cm9vdDp0YW9zZGF0YQ==`。
|
||||
5. CONTINUOUS_WINDOW_CLOSE:窗口关闭时输出结果。修改、删除数据,并不会立即触发重算,每等待 rec_time_val 时长,会进行周期性重算。如果不指定 rec_time_val,那么重算周期是 60 分钟。如果重算的时间长度超过 rec_time_val,在本次重算后,自动开启下一次重算。该模式当前只支持 INTERVAL 窗口。如果使用 FILL,需要配置 adapter的相关信息:adapterFqdn、adapterPort、adapterToken。adapterToken 为 `{username}:{password}` 经过 Base64 编码之后的字符串,例如 `root:taosdata` 编码后为 `cm9vdDp0YW9zZGF0YQ==`。
|
||||
|
||||
窗口关闭是由事件时间决定的,如事件流中断、或持续延迟,此时事件时间无法更新,可能导致无法得到最新的计算结果。
|
||||
|
||||
|
|
|
@ -8,11 +8,11 @@ import TDgpt from './pic/data-analysis.png';
|
|||
# 背景介绍
|
||||
针对时间序列数据预测分析、异常检测、数据补全和数据分类的应用领域,相关领域的研究人员提出并开发出了众多不同技术特点、适用于不同场景的时序数据分析算法,广泛应用在时间序列数据预测、异常检测等领域。
|
||||
|
||||
分析算法通常以高级编程语言(Python语言或R语言)工具包的形式存在,并通过开源的方式广泛分发和使用,这种应用模式极大地便利了软件开发人员在应用系统中调用复杂的分析算法,极大地降低了使用高级算法的门槛。
|
||||
分析算法通常以高级编程语言(Python 语言或 R 语言)工具包的形式存在,并通过开源的方式广泛分发和使用,这种应用模式极大地便利了软件开发人员在应用系统中调用复杂的分析算法,极大地降低了使用高级算法的门槛。
|
||||
|
||||
另一方面,数据库系统研发人员也尝试将数据分析算法模型整合到数据库系统中,通过建立Machine Learning 库(例如 Spark 的机器学习库)充分利用成熟分析技术增强数据库或分析计算引擎的高级数据分析能力。
|
||||
另一方面,数据库系统研发人员也尝试将数据分析算法模型整合到数据库系统中,通过建立 Machine Learning 库(例如 Spark 的机器学习库)充分利用成熟分析技术增强数据库或分析计算引擎的高级数据分析能力。
|
||||
|
||||
飞速发展的人工智能(AI)为时序数据分析应用带来的新机遇,快速有效地将 AI 能力应用在时间序列数据分析领域也为数据库。为此,涛思数据创新性地提出了时序数据分析智能体 TDgpt,使用 TDgpt,将您能够通过SQL 语句,直接调用适配和整合驱动统计分析算法、机器学习算法模型、深度学习模型,时序数据基础模型以及大语言模型,并将这些分析能力转化为 SQL 语句的调用,通过异常检测窗口和预测函数的方式应用在时序数据上。
|
||||
飞速发展的人工智能(AI)为时序数据分析应用带来的新机遇,快速有效地将 AI 能力应用在时间序列数据分析领域也为数据库。为此,涛思数据创新性地提出了时序数据分析智能体 TDgpt,使用 TDgpt,将您能够通过 SQL 语句,直接调用适配和整合驱动统计分析算法、机器学习算法模型、深度学习模型,时序数据基础模型以及大语言模型,并将这些分析能力转化为 SQL 语句的调用,通过异常检测窗口和预测函数的方式应用在时序数据上。
|
||||
|
||||
|
||||
# 技术特点
|
||||
|
@ -27,9 +27,9 @@ TDgpt 是一个开放的系统,用户能够根据自己的需要,添加预
|
|||
TDgpt 由若干个无状态的分析节点 anode 构成,可以按需在系统集群中部署 Anode 节点,也可以根据分析模型算法的特点,将 Anode 部署在合适的硬件环境中,例如带有 GPU 的计算节点。
|
||||
TDgpt 针对不同的分析算法,提供统一的调用接口和调用方式,根据用户请求的参数,调用高级分析算法包及其他的分析工具,并将分析获得的结果按照约定的方式返回给 TDengine 的主进程 taosd。
|
||||
TDgpt 的主要包含四个部分的内容。
|
||||
- 第一部分是内置分析库,包括statsmodels, pyculiarity, pmdarima 等,提供可以直接调用的预测分析和异常检测算法模型。
|
||||
- 第一部分是内置分析库,包括 statsmodels, pyculiarity, pmdarima 等,提供可以直接调用的预测分析和异常检测算法模型。
|
||||
- 第二部分是内置的机器学习库(包括:torch,keras,scikit-learn等),用于驱动预训练完成的机器(深度)学习模型在 TDgpt 的进程空间内运行。预训练的流程可以使用 Merlion/Kats 等 开源的端到端机器学习框架进行管理,并将完成训练的模型上传到 TDgpt 指定目录即可;
|
||||
- 第三部分是通用大语言模型的请求适配模块。将时序数据预测请求转换后,基于 Prompt 向 DeepSeek,LlaMa 等通用大语言模型 MaaS 请求服务(这部分功能暂未开源);
|
||||
- 第三部分是通用大语言模型的请求适配模块。将时序数据预测请求转换后,基于 Prompt 向 DeepSeek、LlaMa 等通用大语言模型 MaaS 请求服务(这部分功能暂未开源);
|
||||
- 第四部分是通过 Adapter 直接向本地部署的 Time-MoE、TDtsfm 等时序数据模型请求服务。时序数据专用模型相对于通用语言大模型,无需 Prompt,更加便捷轻量,本地应用部署对硬件资源要求也较低;除此之外,Adapter 还可以直接请求 TimeGPT 这种类型的时序数据分析 MaaS 服务,调用云端的时序模型服务提供本地化时序数据分析能力。
|
||||
|
||||
<img src={TDgpt} alt="TDgpt架构图" />
|
||||
|
@ -41,14 +41,14 @@ TDgpt 的主要包含四个部分的内容。
|
|||
使用TDgpt 提供的时序数据分析服务,包括:
|
||||
- 时序数据异常检测:TDengine 中定义了新的时间窗口——异常(状态)窗口——来提供异常检测服务。异常窗口可以视为一种特殊的事件窗口(Event Window),即异常检测算法确定的连续异常时间序列数据所在的时间窗口。与普通事件窗口区别在于——时间窗口的起始时间和结束时间均是分析算法确定,不是用户指定的表达式判定。异常窗口使用方式与其他类型的时间窗口(例如状态窗口、会话窗口等)类似。因此时间窗口内可使用的查询操作均可应用在异常窗口上。
|
||||
- 时序数据分析预测:TDengine 中提供了一个新的函数FORECAST提供时序数据预测服务,基于输入的(历史)时间序列数据调用指定(或默认)预测算法给出输入时序数据后续时间序列的预测数据。
|
||||
- 时序数据补全:研发测试中,2025年7月发布
|
||||
- 时序数据分类:研发测试中,2025年7月发布
|
||||
- 时序数据补全:研发测试中,2025 年 7 月发布
|
||||
- 时序数据分类:研发测试中,2025 年 7 月发布
|
||||
|
||||
# 自定义分析算法
|
||||
|
||||
TDgpt 是一个可扩展的时序数据高级分析智能体,用户遵循[算法开发者指南](./dev)中的简易步骤就能将自己开发的分析算法添加到系统中。之后应用可以通过 SQL语句直接调用, 让高级分析算法的使用门槛降到几乎为零。对于新引入的算法或模型,应用不用做任何调整。
|
||||
TDgpt 是一个可扩展的时序数据高级分析智能体,用户遵循 [算法开发者指南](./dev)中的简易步骤就能将自己开发的分析算法添加到系统中。之后应用可以通过 SQL 语句直接调用, 让高级分析算法的使用门槛降到几乎为零。对于新引入的算法或模型,应用不用做任何调整。
|
||||
|
||||
TDpgt 只支持使用 Python 语言开发的分析算法。 Anode 采用 Python 类动态加载模式,在启动的时候扫描特定目录内满足约定条件的所有代码文件,并将其加载到系统中。因此,开发者只需要遵循以下几步就能完成新算法的添加工作:
|
||||
TDpgt 只支持使用 Python 语言开发的分析算法。Anode 采用 Python 类动态加载模式,在启动的时候扫描特定目录内满足约定条件的所有代码文件,并将其加载到系统中。因此,开发者只需要遵循以下几步就能完成新算法的添加工作:
|
||||
1. 开发完成符合要求的分析算法类
|
||||
2. 将代码文件放入对应目录,然后重启 Anode
|
||||
3. 使用SQL命令更新算法缓存列表即可。
|
||||
|
@ -60,8 +60,8 @@ TDpgt 只支持使用 Python 语言开发的分析算法。 Anode 采用 Python
|
|||
|
||||
# 模型管理
|
||||
|
||||
对于Torch/Tensorflow/Keras 等机器学习库框架驱动的预训练模型,需要首先将训练完成的数据模型添加到 Anode 的指定目录中,Anode 可以自动调用该目录内的模型,驱动其运行并提供服务。
|
||||
企业版本的 TDgpt 具备模型的管理能力,能够与开源的端到端时序数据机器学习框架(例如:Merlion、Kats等)无缝集成。
|
||||
对于 Torch/Tensorflow/Keras 等机器学习库框架驱动的预训练模型,需要首先将训练完成的数据模型添加到 Anode 的指定目录中,Anode 可以自动调用该目录内的模型,驱动其运行并提供服务。
|
||||
企业版本的 TDgpt 具备模型的管理能力,能够与开源的端到端时序数据机器学习框架(例如:Merlion、Kats 等)无缝集成。
|
||||
处理能力
|
||||
|
||||
通常意义上,时间序列数据分析主要是计算密集型任务。这种计算密集型任务,可以使用更高性能的 CPU 或 GPU 来提升处理性能。
|
||||
|
|
|
@ -3,6 +3,8 @@ title: "安装部署"
|
|||
sidebar_label: "安装部署"
|
||||
---
|
||||
|
||||
import PkgListV3 from "/components/PkgListV3";
|
||||
|
||||
### 环境准备
|
||||
使用 TDgpt 的高级时序数据分析功能需要在 TDengine 集群中安装部署 AI node(Anode)。Anode 运行在 Linux 平台上,并需要 3.10 或以上版本的 Python 环境支持。
|
||||
> 部署 Anode 需要 TDengine 3.3.6.0 及以后版本,请首先确认搭配 Anode 使用的 TDengine 能够支持 Anode。
|
||||
|
@ -35,11 +37,20 @@ export PATH=$PATH:~/.local/bin
|
|||
至此 Python 环境准备完成,可以进行 taosanode 的安装和部署。
|
||||
|
||||
### 安装及卸载
|
||||
使用 Linux 环境下的安装包 TDengine-anode-3.3.x.x-Linux-x64.tar.gz 可进行 Anode 的安装部署工作,命令如下:
|
||||
1. 从列表中下载获得 tar.gz 安装包:
|
||||
<PkgListV3 type={9}/>
|
||||
|
||||
2. 进入到安装包所在目录,使用 tar 解压安装包;
|
||||
> 请将 `<version>` 替换为下载的安装包版本
|
||||
```bash
|
||||
tar -zxvf TDengine-anode-<version>-Linux-x64.tar.gz
|
||||
```
|
||||
|
||||
3. 解压文件后,进入相应子目录,执行其中的 `install.sh` 安装脚本:
|
||||
请将 `<version>` 替换为下载的安装包版本
|
||||
|
||||
```bash
|
||||
tar -xzvf TDengine-anode-3.3.6.0-Linux-x64.tar.gz
|
||||
cd TDengine-anode-3.3.6.0
|
||||
cd TDengine-anode-<version>
|
||||
sudo ./install.sh
|
||||
```
|
||||
|
||||
|
|
|
@ -98,9 +98,7 @@ grubbs={}
|
|||
lof={"algorithm":"auto", "n_neighbor": 3}
|
||||
```
|
||||
|
||||
对比程序执行完成以后,会自动生成名称为`ad_result.xlsx` 的文件,第一个卡片是算法运行结果(如下表所示),分别包含了算法名称、执行调用参数、查全率、查准率、执行时间 5 个指标。
|
||||
|
||||
|
||||
对比程序执行完成以后,会自动生成名称为 `ad_result.xlsx` 的文件,第一个卡片是算法运行结果(如下表所示),分别包含了算法名称、执行调用参数、查全率、查准率、执行时间 5 个指标。
|
||||
|
||||
| algorithm | params | precision(%) | recall(%) | elapsed_time(ms.) |
|
||||
| --------- | -------------------------------------- | ------------ | --------- | ----------------- |
|
||||
|
|
|
@ -1096,14 +1096,14 @@ charset 的有效值是 UTF-8。
|
|||
- 支持版本:v3.3.6.0 引入
|
||||
|
||||
#### adapterFqdn
|
||||
- 说明:taosadapter服务的地址 `内部参数`
|
||||
- 说明:taosAdapter 服务的地址 `内部参数`
|
||||
- 类型:fqdn
|
||||
- 默认值:localhost
|
||||
- 动态修改:不支持
|
||||
- 支持版本:v3.3.6.0 引入
|
||||
|
||||
#### adapterPort
|
||||
- 说明:taosadapter服务的端口号 `内部参数`
|
||||
- 说明:taosAdapter 服务的端口号 `内部参数`
|
||||
- 类型:整数
|
||||
- 默认值:6041
|
||||
- 最小值:1
|
||||
|
|
|
@ -64,16 +64,16 @@ CREATE DATABASE db_name PRECISION 'ns';
|
|||
|
||||
:::
|
||||
|
||||
### DECIMAL数据类型
|
||||
`DECIMAL`数据类型用于高精度数值存储, 自版本3.3.6开始支持, 定义语法: DECIMAL(18, 2), DECIMAL(38, 10), 其中需要指定两个参数, 分别为`precision`和`scale`. `precision`是指最大支持的有效数字个数, `scale`是指最大支持的小数位数. 如DECIMAL(8, 4), 可表示范围即[-9999.9999, 9999.9999]. 定义DECIMAL数据类型时, `precision`范围为: [1,38], scale的范围为: [0,precision], scale为0时, 仅表示整数. 也可以不指定scale, 默认为0, 如DECIMAL(18), 与DECIMAL(18,0)相同。
|
||||
### DECIMAL 数据类型
|
||||
`DECIMAL` 数据类型用于高精度数值存储,自 v3.3.6.0 开始支持, 定义语法:`DECIMAL(18, 2)`,`DECIMAL(38, 10)`, 其中需要指定两个参数, 分别为 `precision` 和 `scale`。`precision` 是指最大支持的有效数字个数,`scale` 是指最大支持的小数位数。如 `DECIMAL(8, 4)`,可表示范围即 `[-9999.9999, 9999.9999]`。定义 DECIMAL 数据类型时,`precision` 范围为:`[1, 38]`, scale 的范围为:`[0, precision]`,scale 为 0 时,仅表示整数。也可以不指定 scale,默认为 0,例如 `DECIMAL(18)`,与 `DECIMAL(18, 0)` 相同。
|
||||
|
||||
当`precision`值不大于18时, 内部使用8字节存储(DECIMAL64), 当precision范围为(18, 38]时, 使用16字节存储(DECIMAL). SQL中写入DECIMAL类型数据时, 可直接使用数值写入, 当写入值大于类型可表示的最大值时会报DECIMAL_OVERFLOW错误, 当未大于类型表示的最大值, 但小数位数超过SCALE时, 会自动四舍五入处理, 如定义类型DECIMAL(10, 2), 写入10.987, 则实际存储值为10.99。
|
||||
当 `precision` 值不大于 18 时,内部使用 8 字节存储(DECIMAL64);当 `precision` 范围为 `(18, 38]` 时,使用 16 字节存储(DECIMAL)。SQL 中写入 DECIMAL 类型数据时,可直接使用数值写入,当写入值大于类型可表示的最大值时会报 DECIMAL_OVERFLOW 错误;当未大于类型可表示的最大值、但小数位数超过 scale 时,会自动四舍五入处理。如定义类型 DECIMAL(10, 2),写入 10.987,则实际存储值为 10.99。
|
||||
|
||||
DECIMAL类型仅支持普通列, 暂不支持tag列. DECIMAL类型只支持SQL写入, 暂不支持stmt写入和schemeless写入。
|
||||
DECIMAL 类型仅支持普通列,暂不支持 tag 列。DECIMAL 类型只支持 SQL 写入,暂不支持 stmt 写入和 schemeless 写入。
|
||||
|
||||
整数类型和DECIMAL类型操作时, 会将整数类型转换为DECIMAL类型再进行计算. DECIMAL类型与DOUBLE/FLOAT/VARCHAR/NCHAR等类型计算时, 转换为DOUBLE类型进行计算.
|
||||
整数类型和 DECIMAL 类型操作时, 会将整数类型转换为 DECIMAL 类型再进行计算。DECIMAL 类型与 DOUBLE/FLOAT/VARCHAR/NCHAR 等类型计算时, 转换为 DOUBLE 类型进行计算。
|
||||
|
||||
查询DECIMAL类型表达式时, 若计算的中间结果超出当前类型可表示的最大值时, 报DECIMAL OVERFLOW错误.
|
||||
查询 DECIMAL 类型表达式时,若计算的中间结果超出当前类型可表示的最大值,报 DECIMAL OVERFLOW 错误。
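下面是一个 DECIMAL 列的最小使用示意(库名 `testdec`、表名 `trades` 均为假设):

```sql
CREATE DATABASE IF NOT EXISTS testdec;
CREATE TABLE testdec.trades (ts TIMESTAMP, amount DECIMAL(10, 2));

-- 小数位超过 scale 时自动四舍五入:写入 10.987,实际存储为 10.99
INSERT INTO testdec.trades VALUES (NOW, 10.987);

SELECT ts, amount FROM testdec.trades;
```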
|
||||
|
||||
|
||||
## 常量
|
||||
|
|
|
@ -73,7 +73,7 @@ CREATE VTABLE [IF NOT EXISTS] [db_name].vtb_name
|
|||
|
||||
假设有表 t1、t2、t3 结构和数据如下:
|
||||
|
||||
<img src={origintable} width="100%" alt="origintable" />
|
||||
<img src={origintable} width="500" alt="origintable" />
|
||||
|
||||
并且有虚拟普通表 v1 ,创建方式如下:
|
||||
|
||||
|
@ -94,7 +94,7 @@ select * from v1;
|
|||
|
||||
结果如下:
|
||||
|
||||
<img src={queryres} width="100%" alt="queryres" />
|
||||
<img src={queryres} width="200" alt="queryres" />
|
||||
|
||||
如果没有选择全部列,只是选择了部分列,查询的结果只会包含选择的列的原始表的时间戳,例如执行如下查询:
|
||||
|
||||
|
@ -104,7 +104,7 @@ select c1, c2 from v1;
|
|||
|
||||
得到的结果如下图所示:
|
||||
|
||||
<img src={partres} width="100%" alt="partres" />
|
||||
<img src={partres} width="200" alt="partres" />
|
||||
|
||||
因为 c1、c2 列对应的原始表 t1、t2 中没有 0:00:03 这个时间戳,所以最后的结果也不会包含这个时间戳。
|
||||
|
||||
|
|
|
@ -1137,7 +1137,7 @@ CAST(expr AS type_name)
|
|||
- 字符串类型转换数值类型时可能出现的无效字符情况,例如 "a" 可能转为 0,但不会报错。
|
||||
- 转换到数值类型时,数值大于 type_name 可表示的范围时,则会溢出,但不会报错。
|
||||
- 转换到字符串类型时,如果转换后长度超过 type_name 中指定的长度,则会截断,但不会报错。
|
||||
- DECIMAL类型不支持与JSON,VARBINARY,GEOMETRY类型的互转.
|
||||
- DECIMAL 类型不支持与 JSON、VARBINARY、GEOMETRY 类型的互转(转换行为示例见下)。
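以下 SQL 片段示意上述几种转换行为(结果仅为说明用途,具体以实际执行结果为准):

```sql
-- 无效字符:字符串转数值不报错,通常得到 0
SELECT CAST('a' AS INT);

-- 溢出:1000 超出 TINYINT 可表示范围,发生溢出但不报错
SELECT CAST(1000 AS TINYINT);

-- 截断:超出目标长度的部分被截断,结果为 'abc'
SELECT CAST('abcdef' AS VARCHAR(3));
```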
|
||||
|
||||
#### TO_CHAR
|
||||
|
||||
|
@ -1619,13 +1619,13 @@ AVG(expr)
|
|||
|
||||
**功能说明**:统计指定字段的平均值。
|
||||
|
||||
**返回数据类型**:DOUBLE, DECIMAL。
|
||||
**返回数据类型**:DOUBLE、DECIMAL。
|
||||
|
||||
**适用数据类型**:数值类型。
|
||||
|
||||
**适用于**:表和超级表。
|
||||
|
||||
**说明**: 当输入类型为DECIMAL类型时, 输出类型也为DECIMAL类型, 输出的precision和scale大小符合数据类型章节中的描述规则, 通过计算SUM类型和UINT64的除法得到结果类型, 若SUM的结果导致DECIMAL类型溢出, 则报DECIMAL OVERFLOW错误。
|
||||
**说明**:当输入类型为 DECIMAL 类型时,输出类型也为 DECIMAL 类型,输出的 precision 和 scale 大小符合数据类型章节中的描述规则,通过计算 SUM 类型和 UINT64 的除法得到结果类型,若 SUM 的结果导致 DECIMAL 类型溢出,则报 DECIMAL OVERFLOW 错误。
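例如,假设表 `trades` 中 `amount` 列类型为 DECIMAL(10, 2)(表名、列名均为假设),则下面查询的返回类型仍为 DECIMAL:

```sql
SELECT AVG(amount) FROM trades;
```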
|
||||
|
||||
### COUNT
|
||||
|
||||
|
@ -1808,13 +1808,13 @@ SUM(expr)
|
|||
|
||||
**功能说明**:统计表/超级表中某列的和。
|
||||
|
||||
**返回数据类型**:DOUBLE、BIGINT,DECIMAL。
|
||||
**返回数据类型**:DOUBLE、BIGINT、DECIMAL。
|
||||
|
||||
**适用数据类型**:数值类型。
|
||||
|
||||
**适用于**:表和超级表。
|
||||
|
||||
**说明**: 输入类型为DECIMAL类型时, 输出类型为DECIMAL(38, scale), precision为当前支持的最大值, scale为输入类型的scale, 若SUM的结果溢出时, 报DECIMAL OVERFLOW错误.
|
||||
**说明**:输入类型为 DECIMAL 类型时,输出类型为 DECIMAL(38, scale),precision 为当前支持的最大值,scale 为输入类型的 scale,若 SUM 的结果溢出,报 DECIMAL OVERFLOW 错误。
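例如,假设表 `trades` 中 `amount` 列类型为 DECIMAL(10, 2)(表名、列名均为假设),则下面查询的返回类型为 DECIMAL(38, 2):

```sql
SELECT SUM(amount) FROM trades;
```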
|
||||
|
||||
### VAR_POP
|
||||
|
||||
|
|
|
@ -168,7 +168,7 @@ SELECT * from information_schema.`ins_streams`;
|
|||
|
||||
4. FORCE_WINDOW_CLOSE:以操作系统当前时间为准,只计算当前关闭窗口的结果,并推送出去。窗口只会在被关闭的时刻计算一次,后续不会再重复计算。该模式当前只支持 INTERVAL 窗口(不支持滑动);FILL_HISTORY 必须为 0,IGNORE EXPIRED 必须为 1,IGNORE UPDATE 必须为 1;FILL 只支持 PREV、NULL、NONE、VALUE。
|
||||
|
||||
5. CONTINUOUS_WINDOW_CLOSE:窗口关闭时输出结果。修改、删除数据,并不会立即触发重算,每等待 rec_time_val 时长,会进行周期性重算。如果不指定 rec_time_val,那么重算周期是60分钟。如果重算的时间长度超过 rec_time_val,在本次重算后,自动开启下一次重算。该模式当前只支持 INTERVAL 窗口。如果使用 FILL,需要配置 adapter的相关信息:adapterFqdn、adapterPort、adapterToken。adapterToken 为 `{username}:{password}` 经过 Base64 编码之后的字符串,例如 `root:taosdata` 编码后为 `cm9vdDp0YW9zZGF0YQ==`
|
||||
5. CONTINUOUS_WINDOW_CLOSE:窗口关闭时输出结果。修改、删除数据,并不会立即触发重算,每等待 rec_time_val 时长,会进行周期性重算。如果不指定 rec_time_val,那么重算周期是 60 分钟。如果重算的时间长度超过 rec_time_val,在本次重算后,自动开启下一次重算。该模式当前只支持 INTERVAL 窗口。如果使用 FILL,需要配置 taosAdapter 的相关信息:adapterFqdn、adapterPort、adapterToken。adapterToken 为 `{username}:{password}` 经过 Base64 编码之后的字符串,例如 `root:taosdata` 编码后为 `cm9vdDp0YW9zZGF0YQ==`(建流示例见下)。
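以下是一个使用 FORCE_WINDOW_CLOSE 触发模式的建流示意。库名 `power_db`、超级表 `meters`、列 `current` 以及目标表名均为假设,选项取值满足上文对该模式的约束:

```sql
CREATE STREAM power_stream
  TRIGGER FORCE_WINDOW_CLOSE
  IGNORE EXPIRED 1
  FILL_HISTORY 0
  IGNORE UPDATE 1
  INTO power_db.avg_current AS
    SELECT _wstart, AVG(current) AS avg_val
    FROM power_db.meters
    PARTITION BY tbname
    INTERVAL(1m);
```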
|
||||
|
||||
由于窗口关闭是由事件时间决定的,如事件流中断、或持续延迟,则事件时间无法更新,可能导致无法得到最新的计算结果。
|
||||
|
||||
|
|
|
@ -35,6 +35,7 @@ description: TDengine 保留关键字的详细列表
|
|||
| AS | |
|
||||
| ASC | |
|
||||
| ASOF | |
|
||||
| ASYNC | 3.3.6.0+ |
|
||||
| AT_ONCE | |
|
||||
| ATTACH | |
|
||||
| AUTO | 3.3.5.0+ |
|
||||
|
@ -96,6 +97,7 @@ description: TDengine 保留关键字的详细列表
|
|||
| CONSUMER | |
|
||||
| CONSUMERS | |
|
||||
| CONTAINS | |
|
||||
| CONTINUOUS_WINDOW_CLOSE | 3.3.6.0+ |
|
||||
| COPY | |
|
||||
| COUNT | |
|
||||
| COUNT_WINDOW | |
|
||||
|
@ -109,7 +111,7 @@ description: TDengine 保留关键字的详细列表
|
|||
| DATABASE | |
|
||||
| DATABASES | |
|
||||
| DBS | |
|
||||
| DECIMAL | |
|
||||
| DECIMAL | 3.3.6.0+ |
|
||||
| DEFERRED | |
|
||||
| DELETE | |
|
||||
| DELETE_MARK | |
|
||||
|
@ -239,7 +241,7 @@ description: TDengine 保留关键字的详细列表
|
|||
| LEADER | |
|
||||
| LEADING | |
|
||||
| LEFT | |
|
||||
| LEVEL | 3.3.0.0 到 3.3.2.11 的所有版本 |
|
||||
| LEVEL | 3.3.0.0 - 3.3.2.11 |
|
||||
| LICENCES | |
|
||||
| LIKE | |
|
||||
| LIMIT | |
|
||||
|
|
|
@ -827,7 +827,7 @@ TDengine 客户端驱动的版本号与 TDengine 服务端的版本号是一一
|
|||
- **返回值**:非 `NULL`:成功,返回一个指向 TAOS_FIELD 结构体的指针,每个元素代表一列的元数据。`NULL`:失败。
|
||||
|
||||
- `TAOS_FIELD_E *taos_fetch_fields_e(TAOS_RES *res)`
|
||||
- **接口说明**:获取查询结果集每列数据的属性(列的名称、列的数据类型、列的长度),与 `taos_num_fields()` 配合使用,可用来解析 `taos_fetch_row()` 返回的一个元组(一行)的数据。TAOS_FIELD_E中除了TAOS_FIELD的基本信息外, 还包括了类型的`precision`和`scale`信息。
|
||||
- **接口说明**:获取查询结果集每列数据的属性(列的名称、列的数据类型、列的长度),与 `taos_num_fields()` 配合使用,可用来解析 `taos_fetch_row()` 返回的一个元组(一行)的数据。TAOS_FIELD_E 中除了 TAOS_FIELD 的基本信息外,还包括了类型的 `precision` 和 `scale` 信息。
|
||||
- **参数说明**:
|
||||
- res:[入参] 结果集。
|
||||
- **返回值**:非 `NULL`:成功,返回一个指向 TAOS_FIELD_E 结构体的指针,每个元素代表一列的元数据。`NULL`:失败。
|
||||
|
|
|
@ -79,10 +79,12 @@ typedef struct SStreamTaskNodeUpdateMsg {
|
|||
int64_t streamId;
|
||||
int32_t taskId;
|
||||
SArray* pNodeList; // SArray<SNodeUpdateInfo>
|
||||
SArray* pTaskList; // SArray<int32_t>, taskId list
|
||||
} SStreamTaskNodeUpdateMsg;
|
||||
|
||||
int32_t tEncodeStreamTaskUpdateMsg(SEncoder* pEncoder, const SStreamTaskNodeUpdateMsg* pMsg);
|
||||
int32_t tDecodeStreamTaskUpdateMsg(SDecoder* pDecoder, SStreamTaskNodeUpdateMsg* pMsg);
|
||||
void tDestroyNodeUpdateMsg(SStreamTaskNodeUpdateMsg* pMsg);
|
||||
|
||||
typedef struct {
|
||||
int64_t reqId;
|
||||
|
|
|
@ -1257,18 +1257,18 @@ int32_t tDeserializeRetrieveIpWhite(void* buf, int32_t bufLen, SRetrieveIpWhiteR
|
|||
typedef struct {
|
||||
int32_t dnodeId;
|
||||
int64_t analVer;
|
||||
} SRetrieveAnalAlgoReq;
|
||||
} SRetrieveAnalyticsAlgoReq;
|
||||
|
||||
typedef struct {
|
||||
int64_t ver;
|
||||
SHashObj* hash; // algoname:algotype -> SAnalUrl
|
||||
} SRetrieveAnalAlgoRsp;
|
||||
} SRetrieveAnalyticAlgoRsp;
|
||||
|
||||
int32_t tSerializeRetrieveAnalAlgoReq(void* buf, int32_t bufLen, SRetrieveAnalAlgoReq* pReq);
|
||||
int32_t tDeserializeRetrieveAnalAlgoReq(void* buf, int32_t bufLen, SRetrieveAnalAlgoReq* pReq);
|
||||
int32_t tSerializeRetrieveAnalAlgoRsp(void* buf, int32_t bufLen, SRetrieveAnalAlgoRsp* pRsp);
|
||||
int32_t tDeserializeRetrieveAnalAlgoRsp(void* buf, int32_t bufLen, SRetrieveAnalAlgoRsp* pRsp);
|
||||
void tFreeRetrieveAnalAlgoRsp(SRetrieveAnalAlgoRsp* pRsp);
|
||||
int32_t tSerializeRetrieveAnalyticAlgoReq(void* buf, int32_t bufLen, SRetrieveAnalyticsAlgoReq* pReq);
|
||||
int32_t tDeserializeRetrieveAnalyticAlgoReq(void* buf, int32_t bufLen, SRetrieveAnalyticsAlgoReq* pReq);
|
||||
int32_t tSerializeRetrieveAnalyticAlgoRsp(void* buf, int32_t bufLen, SRetrieveAnalyticAlgoRsp* pRsp);
|
||||
int32_t tDeserializeRetrieveAnalyticAlgoRsp(void* buf, int32_t bufLen, SRetrieveAnalyticAlgoRsp* pRsp);
|
||||
void tFreeRetrieveAnalyticAlgoRsp(SRetrieveAnalyticAlgoRsp* pRsp);
|
||||
|
||||
typedef struct {
|
||||
int8_t alterType;
|
||||
|
|
|
@ -361,6 +361,7 @@
|
|||
TD_DEF_MSG_TYPE(TDMT_STREAM_RETRIEVE_TRIGGER, "stream-retri-trigger", NULL, NULL)
|
||||
TD_DEF_MSG_TYPE(TDMT_STREAM_CONSEN_CHKPT, "stream-consen-chkpt", NULL, NULL)
|
||||
TD_DEF_MSG_TYPE(TDMT_STREAM_CHKPT_EXEC, "stream-exec-chkpt", NULL, NULL)
|
||||
TD_DEF_MSG_TYPE(TDMT_STREAM_TASK_START, "stream-task-start", NULL, NULL)
|
||||
TD_CLOSE_MSG_SEG(TDMT_STREAM_MSG)
|
||||
|
||||
TD_NEW_MSG_SEG(TDMT_MON_MSG) //5 << 8
|
||||
|
@ -429,6 +430,8 @@
|
|||
TD_DEF_MSG_TYPE(TDMT_MND_ARB_UPDATE_GROUP, "mnd-arb-update-group", NULL, NULL) // no longer used
|
||||
TD_DEF_MSG_TYPE(TDMT_MND_ARB_UPDATE_GROUP_BATCH, "mnd-arb-update-group-batch", NULL, NULL)
|
||||
TD_DEF_MSG_TYPE(TDMT_MND_ARB_ASSIGN_LEADER, "mnd-arb-assign-leader", NULL, NULL)
|
||||
TD_DEF_MSG_TYPE(TDMT_MND_START_STREAM, "start-stream", NULL, NULL)
|
||||
TD_DEF_MSG_TYPE(TDMT_MND_STOP_STREAM, "stop-stream", NULL, NULL)
|
||||
TD_CLOSE_MSG_SEG(TDMT_MND_ARB_MSG)
|
||||
|
||||
TD_NEW_MSG_SEG(TDMT_MAX_MSG) // msg end mark
|
||||
|
|
|
@ -252,6 +252,8 @@ typedef struct STableColsData {
|
|||
char tbName[TSDB_TABLE_NAME_LEN];
|
||||
SArray* aCol;
|
||||
bool getFromHash;
|
||||
bool isOrdered;
|
||||
bool isDuplicateTs;
|
||||
} STableColsData;
|
||||
|
||||
typedef struct STableVgUid {
|
||||
|
|
|
@ -522,6 +522,7 @@ typedef struct STaskStartInfo {
|
|||
|
||||
typedef struct STaskUpdateInfo {
|
||||
SHashObj* pTasks;
|
||||
SArray* pTaskList;
|
||||
int32_t activeTransId;
|
||||
int32_t completeTransId;
|
||||
int64_t completeTs;
|
||||
|
|
|
@ -1853,6 +1853,9 @@ int32_t doProcessMsgFromServer(void* param) {
|
|||
void processMsgFromServer(void* parent, SRpcMsg* pMsg, SEpSet* pEpSet) {
|
||||
int32_t code = 0;
|
||||
SEpSet* tEpSet = NULL;
|
||||
|
||||
tscDebug("msg callback, ahandle %p", pMsg->info.ahandle);
|
||||
|
||||
if (pEpSet != NULL) {
|
||||
tEpSet = taosMemoryCalloc(1, sizeof(SEpSet));
|
||||
if (NULL == tEpSet) {
|
||||
|
@ -1894,6 +1897,7 @@ void processMsgFromServer(void* parent, SRpcMsg* pMsg, SEpSet* pEpSet) {
|
|||
goto _exit;
|
||||
}
|
||||
return;
|
||||
|
||||
_exit:
|
||||
tscError("failed to sched msg to tsc since %s", tstrerror(code));
|
||||
code = doProcessMsgFromServerImpl(pMsg, tEpSet);
|
||||
|
|
|
@ -2239,7 +2239,7 @@ int taos_stmt2_bind_param(TAOS_STMT2 *stmt, TAOS_STMT2_BINDV *bindv, int32_t col
|
|||
SVCreateTbReq *pCreateTbReq = NULL;
|
||||
if (bindv->tags && bindv->tags[i]) {
|
||||
code = stmtSetTbTags2(stmt, bindv->tags[i], &pCreateTbReq);
|
||||
} else if (pStmt->sql.autoCreateTbl || pStmt->bInfo.needParse) {
|
||||
} else if (pStmt->bInfo.tbNameFlag & IS_FIXED_TAG) {
|
||||
code = stmtCheckTags2(stmt, &pCreateTbReq);
|
||||
} else {
|
||||
pStmt->sql.autoCreateTbl = false;
|
||||
|
|
|
@ -86,6 +86,7 @@ static int32_t stmtCreateRequest(STscStmt2* pStmt) {
|
|||
if (pStmt->reqid != 0) {
|
||||
pStmt->reqid++;
|
||||
}
|
||||
pStmt->exec.pRequest->type = RES_TYPE__QUERY;
|
||||
if (pStmt->db != NULL) {
|
||||
taosMemoryFreeClear(pStmt->exec.pRequest->pDb);
|
||||
pStmt->exec.pRequest->pDb = taosStrdup(pStmt->db);
|
||||
|
@ -1236,13 +1237,16 @@ static int stmtFetchStbColFields2(STscStmt2* pStmt, int32_t* fieldNum, TAOS_FIEL
|
|||
qBuildStmtStbColFields(*pDataBlock, pStmt->bInfo.boundTags, pStmt->bInfo.tbNameFlag, fieldNum, fields));
|
||||
|
||||
if (pStmt->bInfo.tbType == TSDB_SUPER_TABLE && cleanStb) {
|
||||
pStmt->bInfo.needParse = true;
|
||||
qDestroyStmtDataBlock(*pDataBlock);
|
||||
*pDataBlock = NULL;
|
||||
if (taosHashRemove(pStmt->exec.pBlockHash, pStmt->bInfo.tbFName, strlen(pStmt->bInfo.tbFName)) != 0) {
|
||||
tscError("get fileds %s remove exec blockHash fail", pStmt->bInfo.tbFName);
|
||||
STMT_ERRI_JRET(TSDB_CODE_APP_ERROR);
|
||||
}
|
||||
pStmt->sql.autoCreateTbl = false;
|
||||
pStmt->bInfo.tagsCached = false;
|
||||
pStmt->bInfo.sname = (SName){0};
|
||||
stmtCleanBindInfo(pStmt);
|
||||
}
|
||||
|
||||
_return:
|
||||
|
@ -1560,6 +1564,8 @@ int stmtBindBatch2(TAOS_STMT2* stmt, TAOS_STMT2_BIND* bind, int32_t colIdx, SVCr
|
|||
// param->tblData.aCol = taosArrayInit(20, POINTER_BYTES);
|
||||
|
||||
param->restoreTbCols = false;
|
||||
param->tblData.isOrdered = true;
|
||||
param->tblData.isDuplicateTs = false;
|
||||
tstrncpy(param->tblData.tbName, pStmt->bInfo.tbName, TSDB_TABLE_NAME_LEN);
|
||||
|
||||
param->pCreateTbReq = pCreateTbReq;
|
||||
|
@ -1577,6 +1583,8 @@ int stmtBindBatch2(TAOS_STMT2* stmt, TAOS_STMT2_BIND* bind, int32_t colIdx, SVCr
|
|||
code = qBindStmtStbColsValue2(*pDataBlock, pCols, bind, pStmt->exec.pRequest->msgBuf,
|
||||
pStmt->exec.pRequest->msgBufLen, &pStmt->sql.siInfo.pTSchema, pStmt->sql.pBindInfo,
|
||||
pStmt->taos->optionInfo.charsetCxt);
|
||||
param->tblData.isOrdered = (*pDataBlock)->ordered;
|
||||
param->tblData.isDuplicateTs = (*pDataBlock)->duplicateTs;
|
||||
} else {
|
||||
if (colIdx == -1) {
|
||||
if (pStmt->sql.bindRowFormat) {
|
||||
|
|
|
@ -958,6 +958,11 @@ TEST(stmt2Case, stmt2_stb_insert) {
|
|||
"insert into stmt2_testdb_1.? using stmt2_testdb_1.stb (t1,t2)tags(?,?) (ts,b)values(?,?)", 3, 3, 3, true,
|
||||
true);
|
||||
}
|
||||
// TD-34123 : interlace=0 with fixed tags
|
||||
{
|
||||
do_stmt("no-interlcace", taos, &option, "insert into `stmt2_testdb_1`.`stb` (tbname,ts,b,t1,t2) values(?,?,?,?,?)",
|
||||
3, 3, 3, false, true);
|
||||
}
|
||||
|
||||
// interlace = 0 & use db
|
||||
do_query(taos, "use stmt2_testdb_1");
|
||||
|
@ -1212,6 +1217,45 @@ TEST(stmt2Case, stmt2_insert_non_statndard) {
|
|||
|
||||
taos_stmt2_close(stmt);
|
||||
}
|
||||
// TD-34123 disorder pk ts
|
||||
{
|
||||
do_query(taos, "create stable stmt2_testdb_6.stb2 (ts timestamp, int_col int PRIMARY KEY) tags(int_tag int);");
|
||||
TAOS_STMT2* stmt = taos_stmt2_init(taos, &option);
|
||||
ASSERT_NE(stmt, nullptr);
|
||||
const char* sql =
|
||||
"INSERT INTO stmt2_testdb_6.? using stmt2_testdb_6.stb2 (int_tag)tags(1) (ts,int_col)VALUES (?,?)";
|
||||
printf("stmt2 [%s] : %s\n", "disorder pk ts", sql);
|
||||
int code = taos_stmt2_prepare(stmt, sql, 0);
|
||||
checkError(stmt, code);
|
||||
|
||||
int tag_i = 0;
|
||||
int tag_l = sizeof(int);
|
||||
int tag_bl = 3;
|
||||
int64_t ts[5] = {1591060628003, 1591060628002, 1591060628002, 1591060628002, 1591060628001};
|
||||
int t64_len[5] = {sizeof(int64_t), sizeof(int64_t), sizeof(int64_t), sizeof(int64_t), sizeof(int64_t)};
|
||||
int coli[5] = {1, 4, 4, 3, 2};
|
||||
int ilen[5] = {sizeof(int), sizeof(int), sizeof(int), sizeof(int), sizeof(int)};
|
||||
int total_affect_rows = 0;
|
||||
char is_null[2] = {1, 1};
|
||||
|
||||
TAOS_STMT2_BIND params1[2] = {
|
||||
{TSDB_DATA_TYPE_TIMESTAMP, &ts, &t64_len[0], NULL, 5},
|
||||
{TSDB_DATA_TYPE_INT, &coli, &ilen[0], NULL, 5},
|
||||
};
|
||||
|
||||
TAOS_STMT2_BIND* paramv = ¶ms1[0];
|
||||
char* tbname = "tb3";
|
||||
TAOS_STMT2_BINDV bindv = {1, &tbname, NULL, ¶mv};
|
||||
code = taos_stmt2_bind_param(stmt, &bindv, -1);
|
||||
checkError(stmt, code);
|
||||
|
||||
int affected_rows;
|
||||
code = taos_stmt2_exec(stmt, &affected_rows);
|
||||
checkError(stmt, code);
|
||||
ASSERT_EQ(affected_rows, 4);
|
||||
|
||||
taos_stmt2_close(stmt);
|
||||
}
|
||||
|
||||
// get fields insert into ? valuse
|
||||
{
|
||||
|
@ -1241,6 +1285,31 @@ TEST(stmt2Case, stmt2_insert_non_statndard) {
|
|||
taos_stmt2_close(stmt);
|
||||
}
|
||||
|
||||
// get fields cache error
|
||||
{
|
||||
TAOS_STMT2* stmt = taos_stmt2_init(taos, &option);
|
||||
ASSERT_NE(stmt, nullptr);
|
||||
const char* sql = " INSERT INTO ? using stmt2_testdb_6.stb1(int_tag) tags(1)(ts) VALUES(?) ";
|
||||
printf("stmt2 [%s] : %s\n", "get fields", sql);
|
||||
int code = taos_stmt2_prepare(stmt, sql, 0);
|
||||
checkError(stmt, code);
|
||||
|
||||
int fieldNum = 0;
|
||||
TAOS_FIELD_ALL* pFields = NULL;
|
||||
code = taos_stmt2_get_fields(stmt, &fieldNum, &pFields);
|
||||
checkError(stmt, code);
|
||||
ASSERT_EQ(fieldNum, 2);
|
||||
ASSERT_STREQ(pFields[0].name, "tbname");
|
||||
ASSERT_STREQ(pFields[1].name, "ts");
|
||||
|
||||
char* tbname = "stmt2_testdb_6.中文表名";
|
||||
TAOS_STMT2_BINDV bindv = {1, &tbname, NULL, NULL};
|
||||
code = taos_stmt2_bind_param(stmt, &bindv, -1);
|
||||
ASSERT_EQ(code, TSDB_CODE_INVALID_PARA);
|
||||
|
||||
taos_stmt2_close(stmt);
|
||||
}
|
||||
|
||||
do_query(taos, "drop database if exists stmt2_testdb_6");
|
||||
taos_close(taos);
|
||||
}
|
||||
|
|
|
@ -170,6 +170,7 @@ void taos_close(TAOS *taos) {
|
|||
}
|
||||
|
||||
const char *taos_data_type(int type) {
|
||||
(void)taos_init();
|
||||
CHECK_PTR(fp_taos_data_type);
|
||||
return (*fp_taos_data_type)(type);
|
||||
}
|
||||
|
@ -504,6 +505,7 @@ int taos_get_current_db(TAOS *taos, char *database, int len, int *required) {
|
|||
}
|
||||
|
||||
const char *taos_errstr(TAOS_RES *res) {
|
||||
(void)taos_init();
|
||||
if (fp_taos_errstr == NULL) {
|
||||
return tstrerror(terrno);
|
||||
}
|
||||
|
@ -511,6 +513,7 @@ const char *taos_errstr(TAOS_RES *res) {
|
|||
}
|
||||
|
||||
int taos_errno(TAOS_RES *res) {
|
||||
(void)taos_init();
|
||||
if (fp_taos_errno == NULL) {
|
||||
return terrno;
|
||||
}
|
||||
|
@ -649,7 +652,7 @@ TAOS_RES *taos_schemaless_insert_ttl_with_reqid_tbname_key(TAOS *taos, char *lin
|
|||
}
|
||||
|
||||
tmq_conf_t *tmq_conf_new() {
|
||||
taos_init();
|
||||
(void)taos_init();
|
||||
CHECK_PTR(fp_tmq_conf_new);
|
||||
return (*fp_tmq_conf_new)();
|
||||
}
|
||||
|
@ -670,7 +673,7 @@ void tmq_conf_set_auto_commit_cb(tmq_conf_t *conf, tmq_commit_cb *cb, void *para
|
|||
}
|
||||
|
||||
tmq_list_t *tmq_list_new() {
|
||||
taos_init();
|
||||
(void)taos_init();
|
||||
CHECK_PTR(fp_tmq_list_new);
|
||||
return (*fp_tmq_list_new)();
|
||||
}
|
||||
|
@ -696,7 +699,7 @@ char **tmq_list_to_c_array(const tmq_list_t *tlist) {
|
|||
}
|
||||
|
||||
tmq_t *tmq_consumer_new(tmq_conf_t *conf, char *errstr, int32_t errstrLen) {
|
||||
taos_init();
|
||||
(void)taos_init();
|
||||
CHECK_PTR(fp_tmq_consumer_new);
|
||||
return (*fp_tmq_consumer_new)(conf, errstr, errstrLen);
|
||||
}
|
||||
|
@ -866,13 +869,13 @@ TSDB_SERVER_STATUS taos_check_server_status(const char *fqdn, int port, char *de
|
|||
}
|
||||
|
||||
void taos_write_crashinfo(int signum, void *sigInfo, void *context) {
|
||||
taos_init();
|
||||
(void)taos_init();
|
||||
CHECK_VOID(fp_taos_write_crashinfo);
|
||||
(*fp_taos_write_crashinfo)(signum, sigInfo, context);
|
||||
}
|
||||
|
||||
char *getBuildInfo() {
|
||||
taos_init();
|
||||
(void)taos_init();
|
||||
CHECK_PTR(fp_getBuildInfo);
|
||||
return (*fp_getBuildInfo)();
|
||||
}
|
||||
|
|
|
@ -143,6 +143,18 @@ int32_t tEncodeStreamTaskUpdateMsg(SEncoder* pEncoder, const SStreamTaskNodeUpda
|
|||
|
||||
// todo: this new attribute will result in incompatibility with previous versions
|
||||
TAOS_CHECK_EXIT(tEncodeI32(pEncoder, pMsg->transId));
|
||||
|
||||
int32_t numOfTasks = taosArrayGetSize(pMsg->pTaskList);
|
||||
TAOS_CHECK_EXIT(tEncodeI32(pEncoder, numOfTasks));
|
||||
|
||||
for (int32_t i = 0; i < numOfTasks; ++i) {
|
||||
int32_t* pId = taosArrayGet(pMsg->pTaskList, i);
|
||||
if (pId == NULL) {
|
||||
TAOS_CHECK_EXIT(terrno);
|
||||
}
|
||||
TAOS_CHECK_EXIT(tEncodeI32(pEncoder, *(int32_t*)pId));
|
||||
}
|
||||
|
||||
tEndEncode(pEncoder);
|
||||
_exit:
|
||||
if (code) {
|
||||
|
@ -162,10 +174,10 @@ int32_t tDecodeStreamTaskUpdateMsg(SDecoder* pDecoder, SStreamTaskNodeUpdateMsg*
|
|||
|
||||
int32_t size = 0;
|
||||
TAOS_CHECK_EXIT(tDecodeI32(pDecoder, &size));
|
||||
|
||||
pMsg->pNodeList = taosArrayInit(size, sizeof(SNodeUpdateInfo));
|
||||
if (pMsg->pNodeList == NULL) {
|
||||
TAOS_CHECK_EXIT(terrno);
|
||||
}
|
||||
TSDB_CHECK_NULL(pMsg->pNodeList, code, lino, _exit, terrno);
|
||||
|
||||
for (int32_t i = 0; i < size; ++i) {
|
||||
SNodeUpdateInfo info = {0};
|
||||
TAOS_CHECK_EXIT(tDecodeI32(pDecoder, &info.nodeId));
|
||||
|
@ -179,11 +191,33 @@ int32_t tDecodeStreamTaskUpdateMsg(SDecoder* pDecoder, SStreamTaskNodeUpdateMsg*
|
|||
|
||||
TAOS_CHECK_EXIT(tDecodeI32(pDecoder, &pMsg->transId));
|
||||
|
||||
// number of tasks
|
||||
TAOS_CHECK_EXIT(tDecodeI32(pDecoder, &size));
|
||||
pMsg->pTaskList = taosArrayInit(size, sizeof(int32_t));
|
||||
if (pMsg->pTaskList == NULL) {
|
||||
TAOS_CHECK_EXIT(terrno);
|
||||
}
|
||||
|
||||
for (int32_t i = 0; i < size; ++i) {
|
||||
int32_t id = 0;
|
||||
TAOS_CHECK_EXIT(tDecodeI32(pDecoder, &id));
|
||||
if (taosArrayPush(pMsg->pTaskList, &id) == NULL) {
|
||||
TAOS_CHECK_EXIT(terrno);
|
||||
}
|
||||
}
|
||||
|
||||
tEndDecode(pDecoder);
|
||||
_exit:
|
||||
return code;
|
||||
}
|
||||
|
||||
void tDestroyNodeUpdateMsg(SStreamTaskNodeUpdateMsg* pMsg) {
|
||||
taosArrayDestroy(pMsg->pNodeList);
|
||||
taosArrayDestroy(pMsg->pTaskList);
|
||||
pMsg->pNodeList = NULL;
|
||||
pMsg->pTaskList = NULL;
|
||||
}
|
||||
|
||||
int32_t tEncodeStreamTaskCheckReq(SEncoder* pEncoder, const SStreamTaskCheckReq* pReq) {
|
||||
int32_t code = 0;
|
||||
int32_t lino;
|
||||
|
|
|
@ -2297,7 +2297,7 @@ _exit:
|
|||
return code;
|
||||
}
|
||||
|
||||
int32_t tSerializeRetrieveAnalAlgoReq(void *buf, int32_t bufLen, SRetrieveAnalAlgoReq *pReq) {
|
||||
int32_t tSerializeRetrieveAnalyticAlgoReq(void *buf, int32_t bufLen, SRetrieveAnalyticsAlgoReq *pReq) {
|
||||
SEncoder encoder = {0};
|
||||
int32_t code = 0;
|
||||
int32_t lino;
|
||||
|
@ -2319,7 +2319,7 @@ _exit:
|
|||
return tlen;
|
||||
}
|
||||
|
||||
int32_t tDeserializeRetrieveAnalAlgoReq(void *buf, int32_t bufLen, SRetrieveAnalAlgoReq *pReq) {
|
||||
int32_t tDeserializeRetrieveAnalyticAlgoReq(void *buf, int32_t bufLen, SRetrieveAnalyticsAlgoReq *pReq) {
|
||||
SDecoder decoder = {0};
|
||||
int32_t code = 0;
|
||||
int32_t lino;
|
||||
|
@ -2336,7 +2336,7 @@ _exit:
|
|||
return code;
|
||||
}
|
||||
|
||||
int32_t tSerializeRetrieveAnalAlgoRsp(void *buf, int32_t bufLen, SRetrieveAnalAlgoRsp *pRsp) {
|
||||
int32_t tSerializeRetrieveAnalyticAlgoRsp(void *buf, int32_t bufLen, SRetrieveAnalyticAlgoRsp *pRsp) {
|
||||
SEncoder encoder = {0};
|
||||
int32_t code = 0;
|
||||
int32_t lino;
|
||||
|
@ -2387,7 +2387,7 @@ _exit:
|
|||
return tlen;
|
||||
}
|
||||
|
||||
int32_t tDeserializeRetrieveAnalAlgoRsp(void *buf, int32_t bufLen, SRetrieveAnalAlgoRsp *pRsp) {
|
||||
int32_t tDeserializeRetrieveAnalyticAlgoRsp(void *buf, int32_t bufLen, SRetrieveAnalyticAlgoRsp *pRsp) {
|
||||
if (pRsp->hash == NULL) {
|
||||
pRsp->hash = taosHashInit(64, MurmurHash3_32, true, HASH_ENTRY_LOCK);
|
||||
if (pRsp->hash == NULL) {
|
||||
|
@ -2425,7 +2425,10 @@ int32_t tDeserializeRetrieveAnalAlgoRsp(void *buf, int32_t bufLen, SRetrieveAnal
|
|||
TAOS_CHECK_EXIT(tDecodeBinaryAlloc(&decoder, (void **)&url.url, NULL) < 0);
|
||||
}
|
||||
|
||||
TAOS_CHECK_EXIT(taosHashPut(pRsp->hash, name, nameLen, &url, sizeof(SAnalyticsUrl)));
|
||||
char dstName[TSDB_ANALYTIC_ALGO_NAME_LEN] = {0};
|
||||
strntolower(dstName, name, nameLen);
|
||||
|
||||
TAOS_CHECK_EXIT(taosHashPut(pRsp->hash, dstName, nameLen, &url, sizeof(SAnalyticsUrl)));
|
||||
}
|
||||
|
||||
tEndDecode(&decoder);
|
||||
|
@ -2435,7 +2438,7 @@ _exit:
|
|||
return code;
|
||||
}
|
||||
|
||||
void tFreeRetrieveAnalAlgoRsp(SRetrieveAnalAlgoRsp *pRsp) {
|
||||
void tFreeRetrieveAnalyticAlgoRsp(SRetrieveAnalyticAlgoRsp *pRsp) {
|
||||
void *pIter = taosHashIterate(pRsp->hash, NULL);
|
||||
while (pIter != NULL) {
|
||||
SAnalyticsUrl *pUrl = (SAnalyticsUrl *)pIter;
|
||||
|
|
|
@ -520,11 +520,12 @@ int32_t tRowBuildFromBind(SBindInfo *infos, int32_t numOfInfos, bool infoSorted,
|
|||
*pOrdered = true;
|
||||
*pDupTs = false;
|
||||
} else {
|
||||
// no more compare if we already get disordered or duplicate rows
|
||||
if (*pOrdered && !*pDupTs) {
|
||||
int32_t code = tRowKeyCompare(&rowKey, &lastRowKey);
|
||||
*pOrdered = (code >= 0);
|
||||
*pDupTs = (code == 0);
|
||||
if (*pOrdered) {
|
||||
int32_t res = tRowKeyCompare(&rowKey, &lastRowKey);
|
||||
*pOrdered = (res >= 0);
|
||||
if (!*pDupTs) {
|
||||
*pDupTs = (res == 0);
|
||||
}
|
||||
}
|
||||
}
|
||||
lastRowKey = rowKey;
|
||||
|
@ -3349,17 +3350,17 @@ int32_t tRowBuildFromBind2(SBindInfo2 *infos, int32_t numOfInfos, bool infoSorte
|
|||
*pOrdered = true;
|
||||
*pDupTs = false;
|
||||
} else {
|
||||
// no more compare if we already get disordered or duplicate rows
|
||||
if (*pOrdered && !*pDupTs) {
|
||||
int32_t code = tRowKeyCompare(&rowKey, &lastRowKey);
|
||||
*pOrdered = (code >= 0);
|
||||
*pDupTs = (code == 0);
|
||||
if (*pOrdered) {
|
||||
int32_t res = tRowKeyCompare(&rowKey, &lastRowKey);
|
||||
*pOrdered = (res >= 0);
|
||||
if (!*pDupTs) {
|
||||
*pDupTs = (res == 0);
|
||||
}
|
||||
}
|
||||
lastRowKey = rowKey;
|
||||
}
|
||||
}
|
||||
|
||||
}
|
||||
_exit:
|
||||
taosArrayDestroy(colValArray);
|
||||
taosArrayDestroy(bufArray);
|
||||
|
|
|
@ -543,7 +543,11 @@ static int32_t taosLoadCfg(SConfig *pCfg, const char **envCmd, const char *input
|
|||
char cfgFile[PATH_MAX + 100] = {0};
|
||||
|
||||
TAOS_CHECK_RETURN(taosExpandDir(inputCfgDir, cfgDir, PATH_MAX));
|
||||
char lastC = cfgDir[strlen(cfgDir) - 1];
|
||||
int32_t pos = strlen(cfgDir);
|
||||
if (pos > 0) {
|
||||
pos -= 1;
|
||||
}
|
||||
char lastC = cfgDir[pos];
|
||||
char *tdDirsep = TD_DIRSEP;
|
||||
if (lastC == '\\' || lastC == '/') {
|
||||
tdDirsep = "";
|
||||
|
|
|
@ -98,15 +98,15 @@ static void dmMayShouldUpdateAnalFunc(SDnodeMgmt *pMgmt, int64_t newVer) {
|
|||
if (oldVer == newVer) return;
|
||||
dDebug("analysis on dnode ver:%" PRId64 ", status ver:%" PRId64, oldVer, newVer);
|
||||
|
||||
SRetrieveAnalAlgoReq req = {.dnodeId = pMgmt->pData->dnodeId, .analVer = oldVer};
|
||||
int32_t contLen = tSerializeRetrieveAnalAlgoReq(NULL, 0, &req);
|
||||
SRetrieveAnalyticsAlgoReq req = {.dnodeId = pMgmt->pData->dnodeId, .analVer = oldVer};
|
||||
int32_t contLen = tSerializeRetrieveAnalyticAlgoReq(NULL, 0, &req);
|
||||
if (contLen < 0) {
|
||||
dError("failed to serialize analysis function ver request since %s", tstrerror(contLen));
|
||||
return;
|
||||
}
|
||||
|
||||
void *pHead = rpcMallocCont(contLen);
|
||||
contLen = tSerializeRetrieveAnalAlgoReq(pHead, contLen, &req);
|
||||
contLen = tSerializeRetrieveAnalyticAlgoReq(pHead, contLen, &req);
|
||||
if (contLen < 0) {
|
||||
rpcFreeCont(pHead);
|
||||
dError("failed to serialize analysis function ver request since %s", tstrerror(contLen));
|
||||
|
|
|
@ -251,6 +251,7 @@ SArray *mmGetMsgHandles() {
|
|||
if (dmSetMgmtHandle(pArray, TDMT_STREAM_TASK_PAUSE_RSP, mmPutMsgToWriteQueue, 0) == NULL) goto _OVER;
|
||||
if (dmSetMgmtHandle(pArray, TDMT_STREAM_TASK_RESUME_RSP, mmPutMsgToWriteQueue, 0) == NULL) goto _OVER;
|
||||
if (dmSetMgmtHandle(pArray, TDMT_STREAM_TASK_STOP_RSP, mmPutMsgToWriteQueue, 0) == NULL) goto _OVER;
|
||||
if (dmSetMgmtHandle(pArray, TDMT_STREAM_TASK_START_RSP, mmPutMsgToWriteQueue, 0) == NULL) goto _OVER;
|
||||
if (dmSetMgmtHandle(pArray, TDMT_STREAM_TASK_UPDATE_CHKPT_RSP, mmPutMsgToWriteQueue, 0) == NULL) goto _OVER;
|
||||
if (dmSetMgmtHandle(pArray, TDMT_STREAM_CONSEN_CHKPT_RSP, mmPutMsgToWriteQueue, 0) == NULL) goto _OVER;
|
||||
if (dmSetMgmtHandle(pArray, TDMT_STREAM_CREATE, mmPutMsgToWriteQueue, 0) == NULL) goto _OVER;
|
||||
|
|
|
@ -89,6 +89,7 @@ SArray *smGetMsgHandles() {
|
|||
if (dmSetMgmtHandle(pArray, TDMT_STREAM_TASK_PAUSE, smPutNodeMsgToMgmtQueue, 1) == NULL) goto _OVER;
|
||||
if (dmSetMgmtHandle(pArray, TDMT_STREAM_TASK_RESUME, smPutNodeMsgToMgmtQueue, 1) == NULL) goto _OVER;
|
||||
if (dmSetMgmtHandle(pArray, TDMT_STREAM_TASK_STOP, smPutNodeMsgToMgmtQueue, 1) == NULL) goto _OVER;
|
||||
if (dmSetMgmtHandle(pArray, TDMT_STREAM_TASK_START, smPutNodeMsgToMgmtQueue, 1) == NULL) goto _OVER;
|
||||
if (dmSetMgmtHandle(pArray, TDMT_STREAM_TASK_CHECKPOINT_READY, smPutNodeMsgToStreamQueue, 1) == NULL) goto _OVER;
|
||||
if (dmSetMgmtHandle(pArray, TDMT_STREAM_TASK_CHECKPOINT_READY_RSP, smPutNodeMsgToStreamQueue, 1) == NULL) goto _OVER;
|
||||
if (dmSetMgmtHandle(pArray, TDMT_STREAM_RETRIEVE_TRIGGER, smPutNodeMsgToStreamQueue, 1) == NULL) goto _OVER;
|
||||
|
|
|
@ -1033,6 +1033,7 @@ SArray *vmGetMsgHandles() {
|
|||
if (dmSetMgmtHandle(pArray, TDMT_STREAM_TASK_PAUSE, vmPutMsgToWriteQueue, 0) == NULL) goto _OVER;
|
||||
if (dmSetMgmtHandle(pArray, TDMT_STREAM_TASK_RESUME, vmPutMsgToWriteQueue, 0) == NULL) goto _OVER;
|
||||
if (dmSetMgmtHandle(pArray, TDMT_STREAM_TASK_STOP, vmPutMsgToWriteQueue, 0) == NULL) goto _OVER;
|
||||
if (dmSetMgmtHandle(pArray, TDMT_STREAM_TASK_START, vmPutMsgToWriteQueue, 0) == NULL) goto _OVER;
|
||||
if (dmSetMgmtHandle(pArray, TDMT_VND_STREAM_CHECK_POINT_SOURCE, vmPutMsgToWriteQueue, 0) == NULL) goto _OVER;
|
||||
if (dmSetMgmtHandle(pArray, TDMT_VND_STREAM_TASK_UPDATE, vmPutMsgToWriteQueue, 0) == NULL) goto _OVER;
|
||||
if (dmSetMgmtHandle(pArray, TDMT_VND_STREAM_TASK_RESET, vmPutMsgToWriteQueue, 0) == NULL) goto _OVER;
|
||||
|
|
|
@ -116,13 +116,13 @@ static bool dmIsForbiddenIp(int8_t forbidden, char *user, uint32_t clientIp) {
|
|||
}
|
||||
}
|
||||
|
||||
static void dmUpdateAnalFunc(SDnodeData *pData, void *pTrans, SRpcMsg *pRpc) {
|
||||
SRetrieveAnalAlgoRsp rsp = {0};
|
||||
if (tDeserializeRetrieveAnalAlgoRsp(pRpc->pCont, pRpc->contLen, &rsp) == 0) {
|
||||
static void dmUpdateAnalyticFunc(SDnodeData *pData, void *pTrans, SRpcMsg *pRpc) {
|
||||
SRetrieveAnalyticAlgoRsp rsp = {0};
|
||||
if (tDeserializeRetrieveAnalyticAlgoRsp(pRpc->pCont, pRpc->contLen, &rsp) == 0) {
|
||||
taosAnalyUpdate(rsp.ver, rsp.hash);
|
||||
rsp.hash = NULL;
|
||||
}
|
||||
tFreeRetrieveAnalAlgoRsp(&rsp);
|
||||
tFreeRetrieveAnalyticAlgoRsp(&rsp);
|
||||
rpcFreeCont(pRpc->pCont);
|
||||
}
|
||||
|
||||
|
@ -176,7 +176,7 @@ static void dmProcessRpcMsg(SDnode *pDnode, SRpcMsg *pRpc, SEpSet *pEpSet) {
|
|||
dmUpdateRpcIpWhite(&pDnode->data, pTrans->serverRpc, pRpc);
|
||||
return;
|
||||
case TDMT_MND_RETRIEVE_ANAL_ALGO_RSP:
|
||||
dmUpdateAnalFunc(&pDnode->data, pTrans->serverRpc, pRpc);
|
||||
dmUpdateAnalyticFunc(&pDnode->data, pTrans->serverRpc, pRpc);
|
||||
return;
|
||||
default:
|
||||
break;
|
||||
|
|
|
@ -36,7 +36,7 @@ extern "C" {
|
|||
#define MND_STREAM_TASK_UPDATE_NAME "stream-task-update"
|
||||
#define MND_STREAM_CHKPT_UPDATE_NAME "stream-chkpt-update"
|
||||
#define MND_STREAM_CHKPT_CONSEN_NAME "stream-chkpt-consen"
|
||||
#define MND_STREAM_RESTART_NAME "stream-restart"
|
||||
#define MND_STREAM_STOP_NAME "stream-stop"
|
||||
|
||||
typedef struct SStreamTransInfo {
|
||||
int64_t startTime;
|
||||
|
@ -148,7 +148,7 @@ int32_t mndStreamSetResetTaskAction(SMnode *pMnode, STrans *pTrans, SStreamObj *
|
|||
int32_t mndStreamSetUpdateChkptAction(SMnode *pMnode, STrans *pTrans, SStreamObj *pStream);
|
||||
int32_t mndCreateStreamResetStatusTrans(SMnode *pMnode, SStreamObj *pStream, int64_t chkptId);
|
||||
int32_t mndStreamSetChkptIdAction(SMnode* pMnode, STrans* pTrans, SStreamObj* pStream, int64_t checkpointId, SArray *pList);
|
||||
int32_t mndStreamSetRestartAction(SMnode *pMnode, STrans *pTrans, SStreamObj *pStream);
|
||||
int32_t mndStreamSetStopAction(SMnode *pMnode, STrans *pTrans, SStreamObj *pStream);
|
||||
int32_t mndStreamSetCheckpointAction(SMnode *pMnode, STrans *pTrans, SStreamTask *pTask, int64_t checkpointId,
|
||||
int8_t mndTrigger);
|
||||
int32_t mndStreamSetStopStreamTasksActions(SMnode* pMnode, STrans *pTrans, uint64_t dbUid);
|
||||
|
|
|
@ -847,10 +847,10 @@ static int32_t mndProcessAnalAlgoReq(SRpcMsg *pReq) {
|
|||
SAnalyticsUrl url;
|
||||
int32_t nameLen;
|
||||
char name[TSDB_ANALYTIC_ALGO_KEY_LEN];
|
||||
SRetrieveAnalAlgoReq req = {0};
|
||||
SRetrieveAnalAlgoRsp rsp = {0};
|
||||
SRetrieveAnalyticsAlgoReq req = {0};
|
||||
SRetrieveAnalyticAlgoRsp rsp = {0};
|
||||
|
||||
TAOS_CHECK_GOTO(tDeserializeRetrieveAnalAlgoReq(pReq->pCont, pReq->contLen, &req), NULL, _OVER);
|
||||
TAOS_CHECK_GOTO(tDeserializeRetrieveAnalyticAlgoReq(pReq->pCont, pReq->contLen, &req), NULL, _OVER);
|
||||
|
||||
rsp.ver = sdbGetTableVer(pSdb, SDB_ANODE);
|
||||
if (req.analVer != rsp.ver) {
|
||||
|
@ -906,15 +906,15 @@ static int32_t mndProcessAnalAlgoReq(SRpcMsg *pReq) {
|
|||
}
|
||||
}
|
||||
|
||||
int32_t contLen = tSerializeRetrieveAnalAlgoRsp(NULL, 0, &rsp);
|
||||
int32_t contLen = tSerializeRetrieveAnalyticAlgoRsp(NULL, 0, &rsp);
|
||||
void *pHead = rpcMallocCont(contLen);
|
||||
(void)tSerializeRetrieveAnalAlgoRsp(pHead, contLen, &rsp);
|
||||
(void)tSerializeRetrieveAnalyticAlgoRsp(pHead, contLen, &rsp);
|
||||
|
||||
pReq->info.rspLen = contLen;
|
||||
pReq->info.rsp = pHead;
|
||||
|
||||
_OVER:
|
||||
tFreeRetrieveAnalAlgoRsp(&rsp);
|
||||
tFreeRetrieveAnalyticAlgoRsp(&rsp);
|
||||
TAOS_RETURN(code);
|
||||
}
|
||||
|
||||
|
|
|
@ -107,6 +107,7 @@ int32_t mndInitStream(SMnode *pMnode) {
|
|||
mndSetMsgHandle(pMnode, TDMT_STREAM_TASK_PAUSE_RSP, mndTransProcessRsp);
|
||||
mndSetMsgHandle(pMnode, TDMT_STREAM_TASK_RESUME_RSP, mndTransProcessRsp);
|
||||
mndSetMsgHandle(pMnode, TDMT_STREAM_TASK_STOP_RSP, mndTransProcessRsp);
|
||||
mndSetMsgHandle(pMnode, TDMT_STREAM_TASK_START_RSP, mndTransProcessRsp);
|
||||
mndSetMsgHandle(pMnode, TDMT_VND_STREAM_TASK_UPDATE_RSP, mndTransProcessRsp);
|
||||
mndSetMsgHandle(pMnode, TDMT_VND_STREAM_TASK_RESET_RSP, mndTransProcessRsp);
|
||||
mndSetMsgHandle(pMnode, TDMT_STREAM_TASK_UPDATE_CHKPT_RSP, mndTransProcessRsp);
|
||||
|
@ -133,6 +134,8 @@ int32_t mndInitStream(SMnode *pMnode) {
|
|||
mndSetMsgHandle(pMnode, TDMT_MND_STREAM_CONSEN_TIMER, mndProcessConsensusInTmr);
|
||||
|
||||
mndSetMsgHandle(pMnode, TDMT_MND_PAUSE_STREAM, mndProcessPauseStreamReq);
|
||||
mndSetMsgHandle(pMnode, TDMT_MND_STOP_STREAM, mndProcessPauseStreamReq);
|
||||
mndSetMsgHandle(pMnode, TDMT_MND_START_STREAM, mndProcessPauseStreamReq);
|
||||
mndSetMsgHandle(pMnode, TDMT_MND_RESUME_STREAM, mndProcessResumeStreamReq);
|
||||
mndSetMsgHandle(pMnode, TDMT_MND_RESET_STREAM, mndProcessResetStreamReq);
|
||||
|
||||
|
@ -1092,7 +1095,7 @@ _OVER:
|
|||
return code;
|
||||
}
|
||||
|
||||
static int32_t mndProcessRestartStreamReq(SRpcMsg *pReq) {
|
||||
static int32_t mndProcessStopStreamReq(SRpcMsg *pReq) {
|
||||
SMnode *pMnode = pReq->info.node;
|
||||
SStreamObj *pStream = NULL;
|
||||
int32_t code = 0;
|
||||
|
@ -1120,7 +1123,7 @@ static int32_t mndProcessRestartStreamReq(SRpcMsg *pReq) {
|
|||
}
|
||||
|
||||
// check if it is conflict with other trans in both sourceDb and targetDb.
|
||||
code = mndStreamTransConflictCheck(pMnode, pStream->uid, MND_STREAM_RESTART_NAME, true);
|
||||
code = mndStreamTransConflictCheck(pMnode, pStream->uid, MND_STREAM_STOP_NAME, true);
|
||||
if (code) {
|
||||
sdbRelease(pMnode->pSdb, pStream);
|
||||
return code;
|
||||
|
@ -1134,15 +1137,15 @@ static int32_t mndProcessRestartStreamReq(SRpcMsg *pReq) {
|
|||
}
|
||||
|
||||
STrans *pTrans = NULL;
|
||||
code = doCreateTrans(pMnode, pStream, pReq, TRN_CONFLICT_NOTHING, MND_STREAM_RESTART_NAME, "restart the stream",
|
||||
code = doCreateTrans(pMnode, pStream, pReq, TRN_CONFLICT_NOTHING, MND_STREAM_STOP_NAME, "stop the stream",
|
||||
&pTrans);
|
||||
if (pTrans == NULL || code) {
|
||||
mError("stream:%s failed to pause stream since %s", pauseReq.name, tstrerror(code));
|
||||
mError("stream:%s failed to stop stream since %s", pauseReq.name, tstrerror(code));
|
||||
sdbRelease(pMnode->pSdb, pStream);
|
||||
return code;
|
||||
}
|
||||
|
||||
code = mndStreamRegisterTrans(pTrans, MND_STREAM_RESTART_NAME, pStream->uid);
|
||||
code = mndStreamRegisterTrans(pTrans, MND_STREAM_STOP_NAME, pStream->uid);
|
||||
if (code) {
|
||||
sdbRelease(pMnode->pSdb, pStream);
|
||||
mndTransDrop(pTrans);
|
||||
|
@ -1150,7 +1153,7 @@ static int32_t mndProcessRestartStreamReq(SRpcMsg *pReq) {
|
|||
}
|
||||
|
||||
// if nodeUpdate happened, not send pause trans
|
||||
code = mndStreamSetRestartAction(pMnode, pTrans, pStream);
|
||||
code = mndStreamSetStopAction(pMnode, pTrans, pStream);
|
||||
if (code) {
|
||||
mError("stream:%s, failed to restart task since %s", pauseReq.name, tstrerror(code));
|
||||
sdbRelease(pMnode->pSdb, pStream);
|
||||
|
|
|
@ -127,7 +127,7 @@ static int32_t doStreamTransConflictCheck(SMnode *pMnode, int64_t streamId, cons
|
|||
|
||||
if (strcmp(tInfo.name, MND_STREAM_CHECKPOINT_NAME) == 0) {
|
||||
if ((strcmp(pTransName, MND_STREAM_DROP_NAME) != 0) && (strcmp(pTransName, MND_STREAM_TASK_RESET_NAME) != 0) &&
|
||||
(strcmp(pTransName, MND_STREAM_RESTART_NAME) != 0)) {
|
||||
(strcmp(pTransName, MND_STREAM_STOP_NAME) != 0)) {
|
||||
mWarn("conflict with other transId:%d streamUid:0x%" PRIx64 ", trans:%s", tInfo.transId, tInfo.streamId,
|
||||
tInfo.name);
|
||||
return TSDB_CODE_MND_TRANS_CONFLICT;
|
||||
|
@ -138,7 +138,7 @@ static int32_t doStreamTransConflictCheck(SMnode *pMnode, int64_t streamId, cons
|
|||
(strcmp(tInfo.name, MND_STREAM_TASK_RESET_NAME) == 0) ||
|
||||
(strcmp(tInfo.name, MND_STREAM_TASK_UPDATE_NAME) == 0) ||
|
||||
(strcmp(tInfo.name, MND_STREAM_CHKPT_CONSEN_NAME) == 0) ||
|
||||
strcmp(tInfo.name, MND_STREAM_RESTART_NAME) == 0) {
|
||||
strcmp(tInfo.name, MND_STREAM_STOP_NAME) == 0) {
|
||||
mWarn("conflict with other transId:%d streamUid:0x%" PRIx64 ", trans:%s", tInfo.transId, tInfo.streamId,
|
||||
tInfo.name);
|
||||
return TSDB_CODE_MND_TRANS_CONFLICT;
|
||||
|
|
|
@ -145,7 +145,7 @@ static int32_t doSetDropActionFromId(SMnode *pMnode, STrans *pTrans, SOrphanTask
|
|||
return 0;
|
||||
}
|
||||
|
||||
static void initNodeUpdateMsg(SStreamTaskNodeUpdateMsg *pMsg, const SVgroupChangeInfo *pInfo, SStreamTaskId *pId,
|
||||
static void initNodeUpdateMsg(SStreamTaskNodeUpdateMsg *pMsg, const SVgroupChangeInfo *pInfo, SArray* pTaskList, SStreamTaskId *pId,
|
||||
int32_t transId) {
|
||||
int32_t code = 0;
|
||||
|
||||
|
@ -158,6 +158,8 @@ static void initNodeUpdateMsg(SStreamTaskNodeUpdateMsg *pMsg, const SVgroupChang
|
|||
code = terrno;
|
||||
}
|
||||
|
||||
pMsg->pTaskList = pTaskList;
|
||||
|
||||
if (code == 0) {
|
||||
void *p = taosArrayAddAll(pMsg->pNodeList, pInfo->pUpdateNodeList);
|
||||
if (p == NULL) {
|
||||
|
@ -166,10 +168,10 @@ static void initNodeUpdateMsg(SStreamTaskNodeUpdateMsg *pMsg, const SVgroupChang
|
|||
}
|
||||
}
|
||||
|
||||
static int32_t doBuildStreamTaskUpdateMsg(void **pBuf, int32_t *pLen, SVgroupChangeInfo *pInfo, int32_t nodeId,
|
||||
static int32_t doBuildStreamTaskUpdateMsg(void **pBuf, int32_t *pLen, SVgroupChangeInfo *pInfo, SArray* pList, int32_t nodeId,
|
||||
SStreamTaskId *pId, int32_t transId) {
|
||||
SStreamTaskNodeUpdateMsg req = {0};
|
||||
initNodeUpdateMsg(&req, pInfo, pId, transId);
|
||||
initNodeUpdateMsg(&req, pInfo, pList, pId, transId);
|
||||
|
||||
int32_t code = 0;
|
||||
int32_t blen;
|
||||
|
@ -177,7 +179,7 @@ static int32_t doBuildStreamTaskUpdateMsg(void **pBuf, int32_t *pLen, SVgroupCha
|
|||
tEncodeSize(tEncodeStreamTaskUpdateMsg, &req, blen, code);
|
||||
if (code < 0) {
|
||||
terrno = TSDB_CODE_OUT_OF_MEMORY;
|
||||
taosArrayDestroy(req.pNodeList);
|
||||
tDestroyNodeUpdateMsg(&req);
|
||||
return terrno;
|
||||
}
|
||||
|
||||
|
@ -185,7 +187,7 @@ static int32_t doBuildStreamTaskUpdateMsg(void **pBuf, int32_t *pLen, SVgroupCha
|
|||
|
||||
void *buf = taosMemoryMalloc(tlen);
|
||||
if (buf == NULL) {
|
||||
taosArrayDestroy(req.pNodeList);
|
||||
tDestroyNodeUpdateMsg(&req);
|
||||
return terrno;
|
||||
}
|
||||
|
||||
|
@ -196,7 +198,7 @@ static int32_t doBuildStreamTaskUpdateMsg(void **pBuf, int32_t *pLen, SVgroupCha
|
|||
if (code == -1) {
|
||||
tEncoderClear(&encoder);
|
||||
taosMemoryFree(buf);
|
||||
taosArrayDestroy(req.pNodeList);
|
||||
tDestroyNodeUpdateMsg(&req);
|
||||
return code;
|
||||
}
|
||||
|
||||
|
@ -209,7 +211,27 @@ static int32_t doBuildStreamTaskUpdateMsg(void **pBuf, int32_t *pLen, SVgroupCha
|
|||
*pBuf = buf;
|
||||
*pLen = tlen;
|
||||
|
||||
taosArrayDestroy(req.pNodeList);
|
||||
tDestroyNodeUpdateMsg(&req);
|
||||
return TSDB_CODE_SUCCESS;
|
||||
}
|
||||
|
||||
// todo: set the task id list for a given nodeId
|
||||
static int32_t createUpdateTaskList(int32_t vgId, SArray* pList) {
|
||||
for (int32_t i = 0; i < taosArrayGetSize(execInfo.pTaskList); ++i) {
|
||||
STaskId *p = taosArrayGet(execInfo.pTaskList, i);
|
||||
if (p == NULL) {
|
||||
continue;
|
||||
}
|
||||
|
||||
STaskStatusEntry *pe = taosHashGet(execInfo.pTaskMap, p, sizeof(*p));
|
||||
if (pe != NULL && pe->nodeId == vgId) {  // skip entries missing from the task map
|
||||
void *pRet = taosArrayPush(pList, &pe->id.taskId);
|
||||
if (pRet == NULL) {
|
||||
return terrno;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return TSDB_CODE_SUCCESS;
|
||||
}
|
||||
|
||||
|
@ -218,11 +240,23 @@ static int32_t doSetUpdateTaskAction(SMnode *pMnode, STrans *pTrans, SStreamTask
|
|||
int32_t len = 0;
|
||||
SEpSet epset = {0};
|
||||
bool hasEpset = false;
|
||||
SArray* pTaskList = taosArrayInit(4, sizeof(int32_t));
|
||||
if (pTaskList == NULL) {
|
||||
return terrno;
|
||||
}
|
||||
|
||||
bool unusedRet = streamTaskUpdateEpsetInfo(pTask, pInfo->pUpdateNodeList);
|
||||
int32_t code = doBuildStreamTaskUpdateMsg(&pBuf, &len, pInfo, pTask->info.nodeId, &pTask->id, pTrans->id);
|
||||
int32_t code = createUpdateTaskList(pTask->info.nodeId, pTaskList);
|
||||
if (code != 0) {
|
||||
taosArrayDestroy(pTaskList);
|
||||
return code;
|
||||
}
|
||||
|
||||
// ownership of pTaskList is transferred to doBuildStreamTaskUpdateMsg, which frees it via tDestroyNodeUpdateMsg on all paths
|
||||
code = doBuildStreamTaskUpdateMsg(&pBuf, &len, pInfo, pTaskList, pTask->info.nodeId, &pTask->id, pTrans->id);
|
||||
if (code) {
|
||||
mError("failed to build stream task epset update msg, code:%s", tstrerror(code));
|
||||
taosMemoryFree(pBuf);
|
||||
return code;
|
||||
}
|
||||
|
||||
|
@ -706,7 +740,7 @@ int32_t mndStreamSetCheckpointAction(SMnode *pMnode, STrans *pTrans, SStreamTask
|
|||
return code;
|
||||
}
|
||||
|
||||
int32_t mndStreamSetRestartAction(SMnode* pMnode, STrans *pTrans, SStreamObj* pStream) {
|
||||
int32_t mndStreamSetStopAction(SMnode* pMnode, STrans *pTrans, SStreamObj* pStream) {
|
||||
return 0;
|
||||
}
|
||||
|
||||
|
|
|
@ -168,6 +168,10 @@ int32_t sndProcessWriteMsg(SSnode *pSnode, SRpcMsg *pMsg, SRpcMsg *pRsp) {
|
|||
return tqStreamTaskProcessTaskPauseReq(pSnode->pMeta, pMsg->pCont);
|
||||
case TDMT_STREAM_TASK_RESUME:
|
||||
return tqStreamTaskProcessTaskResumeReq(pSnode->pMeta, pMsg->info.conn.applyIndex, pMsg->pCont, false);
|
||||
case TDMT_STREAM_TASK_STOP:
|
||||
return tqStreamTaskProcessTaskPauseReq(pSnode->pMeta, pMsg->pCont);
|
||||
case TDMT_STREAM_TASK_START:
|
||||
return tqStreamTaskProcessTaskPauseReq(pSnode->pMeta, pMsg->pCont);
|
||||
case TDMT_STREAM_TASK_UPDATE_CHKPT:
|
||||
return tqStreamTaskProcessUpdateCheckpointReq(pSnode->pMeta, true, pMsg->pCont);
|
||||
case TDMT_STREAM_CONSEN_CHKPT:
|
||||
|
|
|
@ -179,20 +179,20 @@ int32_t tqStreamTaskProcessUpdateReq(SStreamMeta* pMeta, SMsgCb* cb, SRpcMsg* pM
|
|||
int32_t code = 0;
|
||||
SStreamTask* pTask = NULL;
|
||||
SStreamTask* pHTask = NULL;
|
||||
|
||||
SStreamTaskNodeUpdateMsg req = {0};
|
||||
|
||||
SDecoder decoder;
|
||||
|
||||
tDecoderInit(&decoder, (uint8_t*)msg, len);
|
||||
if (tDecodeStreamTaskUpdateMsg(&decoder, &req) < 0) {
|
||||
code = tDecodeStreamTaskUpdateMsg(&decoder, &req);
|
||||
tDecoderClear(&decoder);
|
||||
|
||||
if (code < 0) {
|
||||
rsp.code = TSDB_CODE_MSG_DECODE_ERROR;
|
||||
tqError("vgId:%d failed to decode task update msg, code:%s", vgId, tstrerror(rsp.code));
|
||||
tDecoderClear(&decoder);
|
||||
tDestroyNodeUpdateMsg(&req);
|
||||
return rsp.code;
|
||||
}
|
||||
|
||||
tDecoderClear(&decoder);
|
||||
|
||||
int32_t gError = streamGetFatalError(pMeta);
|
||||
if (gError != 0) {
|
||||
tqError("vgId:%d global fatal occurs, code:%s, ts:%" PRId64 " func:%s", pMeta->vgId, tstrerror(gError),
|
||||
|
|
|
@ -538,7 +538,7 @@ void ctgdShowDBCache(SCatalog *pCtg, SHashObj *dbHash) {
|
|||
"] %s: cfgVersion:%d, numOfVgroups:%d, numOfStables:%d, buffer:%d, cacheSize:%d, pageSize:%d, pages:%d"
|
||||
", daysPerFile:%d, daysToKeep0:%d, daysToKeep1:%d, daysToKeep2:%d, minRows:%d, maxRows:%d, walFsyncPeriod:%d"
|
||||
", hashPrefix:%d, hashSuffix:%d, walLevel:%d, precision:%d, compression:%d, replications:%d, strict:%d"
|
||||
", cacheLast:%d, tsdbPageSize:%d, walRetentionPeriod:%d, walRollPeriod:%d, walRetentionSize:%" PRId64 ""
|
||||
", cacheLast:%d, tsdbPageSize:%d, walRetentionPeriod:%d, walRollPeriod:%d, walRetentionSize:%" PRId64
|
||||
", walSegmentSize:%" PRId64 ", numOfRetensions:%d, schemaless:%d, sstTrigger:%d",
|
||||
i, (int32_t)len, dbFName, dbCache->dbId, dbCache->deleted ? "deleted" : "",
|
||||
pCfg->cfgVersion, pCfg->numOfVgroups, pCfg->numOfStables, pCfg->buffer,
|
||||
|
|
|
@ -327,7 +327,7 @@ static int32_t anomalyParseJson(SJson* pJson, SArray* pWindows, const char* pId)
|
|||
qError("%s failed to exec forecast, msg:%s", pId, pMsg);
|
||||
}
|
||||
|
||||
return TSDB_CODE_ANA_INTERNAL_ERROR;
|
||||
return TSDB_CODE_ANA_ANODE_RETURN_ERROR;
|
||||
} else if (rows == 0) {
|
||||
return TSDB_CODE_SUCCESS;
|
||||
}
|
||||
|
@ -593,7 +593,7 @@ static int32_t anomalyAggregateBlocks(SOperatorInfo* pOperator) {
|
|||
|
||||
for (int32_t r = 0; r < pBlock->info.rows; ++r) {
|
||||
TSKEY key = tsList[r];
|
||||
bool keyInWin = (key >= pSupp->curWin.skey && key < pSupp->curWin.ekey);
|
||||
bool keyInWin = (key >= pSupp->curWin.skey && key <= pSupp->curWin.ekey);
|
||||
bool lastRow = (r == pBlock->info.rows - 1);
|
||||
|
||||
if (keyInWin) {
|
||||
|
|
|
@ -145,6 +145,7 @@ static int32_t forecastCloseBuf(SForecastSupp* pSupp, const char* id) {
|
|||
if (!hasWncheck) {
|
||||
qDebug("%s forecast wncheck not found from %s, use default:%" PRId64, id, pSupp->algoOpt, wncheck);
|
||||
}
|
||||
|
||||
code = taosAnalyBufWriteOptInt(pBuf, "wncheck", wncheck);
|
||||
if (code != 0) return code;
|
||||
|
||||
|
@ -235,7 +236,7 @@ static int32_t forecastAnalysis(SForecastSupp* pSupp, SSDataBlock* pBlock, const
|
|||
}
|
||||
|
||||
tjsonDelete(pJson);
|
||||
return TSDB_CODE_ANA_INTERNAL_ERROR;
|
||||
return TSDB_CODE_ANA_ANODE_RETURN_ERROR;
|
||||
}
|
||||
|
||||
if (code < 0) {
|
||||
|
|
|
@ -639,10 +639,10 @@ int32_t insAppendStmtTableDataCxt(SHashObj* pAllVgHash, STableColsData* pTbData,
|
|||
}
|
||||
}
|
||||
|
||||
if (!pTbCtx->ordered) {
|
||||
if (!pTbData->isOrdered) {
|
||||
code = tRowSort(pTbCtx->pData->aRowP);
|
||||
}
|
||||
if (code == TSDB_CODE_SUCCESS && (!pTbCtx->ordered || pTbCtx->duplicateTs)) {
|
||||
if (code == TSDB_CODE_SUCCESS && (!pTbData->isOrdered || pTbData->isDuplicateTs)) {
|
||||
code = tRowMerge(pTbCtx->pData->aRowP, pTbCtx->pSchema, 0);
|
||||
}
|
||||
|
||||
|
|
|
@ -219,6 +219,9 @@ void destroySendMsgInfo(SMsgSendInfo* pMsgBody) {
|
|||
return;
|
||||
}
|
||||
|
||||
|
||||
qDebug("ahandle %p freed, QID:0x%" PRIx64, pMsgBody, pMsgBody->requestId);
|
||||
|
||||
taosMemoryFreeClear(pMsgBody->target.dbFName);
|
||||
taosMemoryFreeClear(pMsgBody->msgInfo.pData);
|
||||
if (pMsgBody->paramFreeFp) {
|
||||
|
|
|
@ -660,7 +660,7 @@ int32_t schGenerateCallBackInfo(SSchJob *pJob, SSchTask *pTask, void *msg, uint3
|
|||
int32_t code = 0;
|
||||
SMsgSendInfo *msgSendInfo = taosMemoryCalloc(1, sizeof(SMsgSendInfo));
|
||||
if (NULL == msgSendInfo) {
|
||||
SCH_TASK_ELOG("calloc %d failed", (int32_t)sizeof(SMsgSendInfo));
|
||||
qError("calloc SMsgSendInfo size %d failed", (int32_t)sizeof(SMsgSendInfo));
|
||||
SCH_ERR_JRET(terrno);
|
||||
}
|
||||
|
||||
|
@ -672,8 +672,12 @@ int32_t schGenerateCallBackInfo(SSchJob *pJob, SSchTask *pTask, void *msg, uint3
|
|||
if (pJob) {
|
||||
msgSendInfo->requestId = pJob->conn.requestId;
|
||||
msgSendInfo->requestObjRefId = pJob->conn.requestObjRefId;
|
||||
} else {
|
||||
SCH_ERR_JRET(taosGetSystemUUIDU64(&msgSendInfo->requestId));
|
||||
}
|
||||
|
||||
qDebug("ahandle %p alloced, QID:0x%" PRIx64, msgSendInfo, msgSendInfo->requestId);
|
||||
|
||||
if (TDMT_SCH_LINK_BROKEN != msgType) {
|
||||
msgSendInfo->msgInfo.pData = msg;
|
||||
msgSendInfo->msgInfo.len = msgSize;
|
||||
|
|
|
@ -382,6 +382,31 @@ void streamMetaRemoveDB(void* arg, char* key) {
|
|||
streamMutexUnlock(&pMeta->backendMutex);
|
||||
}
|
||||
|
||||
int32_t streamMetaUpdateInfoInit(STaskUpdateInfo* pInfo) {
|
||||
_hash_fn_t fp = taosGetDefaultHashFunction(TSDB_DATA_TYPE_VARCHAR);
|
||||
|
||||
pInfo->pTasks = taosHashInit(64, fp, false, HASH_NO_LOCK);
|
||||
if (pInfo->pTasks == NULL) {
|
||||
return terrno;
|
||||
}
|
||||
|
||||
pInfo->pTaskList = taosArrayInit(4, sizeof(int32_t));
|
||||
if (pInfo->pTaskList == NULL) {
|
||||
return terrno;
|
||||
}
|
||||
|
||||
return TSDB_CODE_SUCCESS;
|
||||
}
|
||||
|
||||
void streamMetaUpdateInfoCleanup(STaskUpdateInfo* pInfo) {
|
||||
taosHashCleanup(pInfo->pTasks);
|
||||
taosArrayDestroy(pInfo->pTaskList);
|
||||
pInfo->pTasks = NULL;
|
||||
pInfo->pTaskList = NULL;
|
||||
}
|
||||
|
||||
|
||||
|
||||
int32_t streamMetaOpen(const char* path, void* ahandle, FTaskBuild buildTaskFn, FTaskExpand expandTaskFn, int32_t vgId,
|
||||
int64_t stage, startComplete_fn_t fn, SStreamMeta** p) {
|
||||
QRY_PARAM_CHECK(p);
|
||||
|
@ -435,8 +460,8 @@ int32_t streamMetaOpen(const char* path, void* ahandle, FTaskBuild buildTaskFn,
|
|||
pMeta->pTasksMap = taosHashInit(64, fp, true, HASH_NO_LOCK);
|
||||
TSDB_CHECK_NULL(pMeta->pTasksMap, code, lino, _err, terrno);
|
||||
|
||||
pMeta->updateInfo.pTasks = taosHashInit(64, fp, false, HASH_NO_LOCK);
|
||||
TSDB_CHECK_NULL(pMeta->updateInfo.pTasks, code, lino, _err, terrno);
|
||||
code = streamMetaUpdateInfoInit(&pMeta->updateInfo);
|
||||
TSDB_CHECK_CODE(code, lino, _err);
|
||||
|
||||
code = streamMetaInitStartInfo(&pMeta->startInfo);
|
||||
TSDB_CHECK_CODE(code, lino, _err);
|
||||
|
@ -641,8 +666,8 @@ void streamMetaCloseImpl(void* arg) {
|
|||
|
||||
taosHashCleanup(pMeta->pTasksMap);
|
||||
taosHashCleanup(pMeta->pTaskDbUnique);
|
||||
taosHashCleanup(pMeta->updateInfo.pTasks);
|
||||
|
||||
streamMetaUpdateInfoCleanup(&pMeta->updateInfo);
|
||||
streamMetaClearStartInfo(&pMeta->startInfo);
|
||||
|
||||
destroyMetaHbInfo(pMeta->pHbInfo);
|
||||
|
@ -1492,16 +1517,19 @@ void streamMetaAddIntoUpdateTaskList(SStreamMeta* pMeta, SStreamTask* pTask, SSt
|
|||
|
||||
void streamMetaClearSetUpdateTaskListComplete(SStreamMeta* pMeta) {
|
||||
STaskUpdateInfo* pInfo = &pMeta->updateInfo;
|
||||
int32_t num = taosArrayGetSize(pInfo->pTaskList);
|
||||
|
||||
taosHashClear(pInfo->pTasks);
|
||||
taosArrayClear(pInfo->pTaskList);
|
||||
|
||||
int32_t prev = pInfo->completeTransId;
|
||||
pInfo->completeTransId = pInfo->activeTransId;
|
||||
pInfo->activeTransId = -1;
|
||||
pInfo->completeTs = taosGetTimestampMs();
|
||||
|
||||
stDebug("vgId:%d set the nodeEp update complete, ts:%" PRId64 ", complete transId:%d->%d, reset active transId",
|
||||
pMeta->vgId, pInfo->completeTs, prev, pInfo->completeTransId);
|
||||
stDebug("vgId:%d set the nodeEp update complete, ts:%" PRId64
|
||||
", complete transId:%d->%d, update Tasks:%d reset active transId",
|
||||
pMeta->vgId, pInfo->completeTs, prev, pInfo->completeTransId, num);
|
||||
}
|
||||
|
||||
bool streamMetaInitUpdateTaskList(SStreamMeta* pMeta, int32_t transId) {
|
||||
|
|
|
@ -158,9 +158,17 @@ int32_t streamMetaStartAllTasks(SStreamMeta* pMeta) {
|
|||
pMeta->startInfo.curStage = START_MARK_REQ_CHKPID;
|
||||
SStartTaskStageInfo info = {.stage = pMeta->startInfo.curStage, .ts = now};
|
||||
|
||||
taosArrayPush(pMeta->startInfo.pStagesList, &info);
|
||||
stDebug("vgId:%d %d task(s) 0 stage -> mark_req stage, reqTs:%" PRId64 " numOfStageHist:%d", pMeta->vgId, numOfConsensusChkptIdTasks,
|
||||
info.ts, (int32_t)taosArrayGetSize(pMeta->startInfo.pStagesList));
|
||||
void* p = taosArrayPush(pMeta->startInfo.pStagesList, &info);
|
||||
int32_t num = (int32_t)taosArrayGetSize(pMeta->startInfo.pStagesList);
|
||||
|
||||
if (p != NULL) {
|
||||
stDebug("vgId:%d %d task(s) 0 stage -> mark_req stage, reqTs:%" PRId64 " numOfStageHist:%d", pMeta->vgId,
|
||||
numOfConsensusChkptIdTasks, info.ts, num);
|
||||
} else {
|
||||
stError("vgId:%d %d task(s) 0 stage -> mark_req stage, reqTs:%" PRId64
|
||||
" numOfStageHist:%d, FAILED, out of memory",
|
||||
pMeta->vgId, numOfConsensusChkptIdTasks, info.ts, num);
|
||||
}
|
||||
}
|
||||
|
||||
// prepare the fill-history task before starting all stream tasks, to avoid fill-history tasks are started without
|
||||
|
@ -230,8 +238,8 @@ static void streamMetaLogLaunchTasksInfo(SStreamMeta* pMeta, int32_t numOfTotal,
|
|||
displayStatusInfo(pMeta, pStartInfo->pFailedTaskSet, false);
|
||||
}
|
||||
|
||||
int32_t streamMetaAddTaskLaunchResultNoLock(SStreamMeta* pMeta, int64_t streamId, int32_t taskId,
|
||||
int64_t startTs, int64_t endTs, bool ready) {
|
||||
int32_t streamMetaAddTaskLaunchResultNoLock(SStreamMeta* pMeta, int64_t streamId, int32_t taskId, int64_t startTs,
|
||||
int64_t endTs, bool ready) {
|
||||
STaskStartInfo* pStartInfo = &pMeta->startInfo;
|
||||
STaskId id = {.streamId = streamId, .taskId = taskId};
|
||||
int32_t vgId = pMeta->vgId;
|
||||
|
@ -312,7 +320,7 @@ bool allCheckDownstreamRsp(SStreamMeta* pMeta, STaskStartInfo* pStartInfo, int32
|
|||
if (px == NULL) {
|
||||
px = taosHashGet(pStartInfo->pFailedTaskSet, &idx, sizeof(idx));
|
||||
if (px == NULL) {
|
||||
stDebug("vgId:%d s-task:0x%x start result not rsp yet", pMeta->vgId, (int32_t) idx.taskId);
|
||||
stDebug("vgId:%d s-task:0x%x start result not rsp yet", pMeta->vgId, (int32_t)idx.taskId);
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
|
|
@ -426,11 +426,15 @@ int32_t walReadVer(SWalReader *pReader, int64_t ver) {
|
|||
} else {
|
||||
if (contLen < 0) {
|
||||
code = terrno;
|
||||
} else {
|
||||
code = TSDB_CODE_WAL_FILE_CORRUPTED;
|
||||
}
|
||||
wError("vgId:%d, failed to read WAL record head, index:%" PRId64 ", from log file since %s",
|
||||
pReader->pWal->cfg.vgId, ver, tstrerror(code));
|
||||
} else {
|
||||
code = TSDB_CODE_WAL_FILE_CORRUPTED;
|
||||
wError("vgId:%d, failed to read WAL record head, index:%" PRId64 ", not enough bytes read, readLen:%" PRId64
|
||||
", "
|
||||
"expectedLen:%d",
|
||||
pReader->pWal->cfg.vgId, ver, contLen, (int32_t)sizeof(SWalCkHead));
|
||||
}
|
||||
TAOS_UNUSED(taosThreadMutexUnlock(&pReader->mutex));
|
||||
TAOS_RETURN(code);
|
||||
}
|
||||
|
|
|
@ -378,7 +378,7 @@ TAOS_DEFINE_ERROR(TSDB_CODE_ANA_BUF_INVALID_TYPE, "Analysis invalid buffe
|
|||
TAOS_DEFINE_ERROR(TSDB_CODE_ANA_ANODE_RETURN_ERROR, "Analysis failed since anode return error")
|
||||
TAOS_DEFINE_ERROR(TSDB_CODE_ANA_ANODE_TOO_MANY_ROWS, "Analysis failed since too many input rows for anode")
|
||||
TAOS_DEFINE_ERROR(TSDB_CODE_ANA_WN_DATA, "white-noise data not processed")
|
||||
TAOS_DEFINE_ERROR(TSDB_CODE_ANA_INTERNAL_ERROR, "tdgpt internal error, not processed")
|
||||
TAOS_DEFINE_ERROR(TSDB_CODE_ANA_INTERNAL_ERROR, "Analysis internal error, not processed")
|
||||
|
||||
// mnode-sma
|
||||
TAOS_DEFINE_ERROR(TSDB_CODE_MND_SMA_ALREADY_EXIST, "SMA already exists")
|
||||
|
|
|
@ -23,8 +23,8 @@ endi
|
|||
|
||||
print =============== show info
|
||||
sql show anodes full
|
||||
if $rows != 8 then
|
||||
print expect 8 , actual $rows
|
||||
if $rows != 10 then
|
||||
print expect 10 , actual $rows
|
||||
return -1
|
||||
endi
|
||||
|
||||
|
|
|
@ -0,0 +1,75 @@
import os
import subprocess
import time

def get_time_seconds():
    current_time = time.strftime("%H:%M:%S", time.localtime())
    hh, mm, ss = map(int, current_time.split(':'))
    return (hh * 60 + mm) * 60 + ss

def color_echo(color, message, time1, log_file=None):
    current_time = time.strftime("%H:%M:%S", time.localtime())
    time2 = get_time_seconds()
    inter_time = time2 - time1
    print(f"End at {current_time} , cast {inter_time}s")
    print(message)
    if log_file:
        with open(log_file, 'r') as file:
            print(file.read())

def check_skip_case(line):
    skip_case = False
    if line == "python3 ./test.py -f 1-insert/insertWithMoreVgroup.py":
        skip_case = False
    elif line == "python3 ./test.py -f 2-query/queryQnode.py":
        skip_case = False
    elif "-R" in line:
        skip_case = True
    return skip_case

def run_tests(case_file):
    exit_num = 0
    a = 0
    failed_tests = []

    with open(case_file, 'r') as file:
        for line in file:
            line = line.strip()
            if check_skip_case(line):
                continue

            if line.startswith("python3"):
                a += 1
                print(f"{a} Processing {line}")
                time1 = get_time_seconds()
                print(f"Start at {time.strftime('%H:%M:%S', time.localtime())}")

                log_file = f"log_{a}.txt"
                with open(log_file, 'w') as log:
                    process = subprocess.run(line.split(), stdout=log, stderr=log)
                if process.returncode != 0:
                    color_echo("0c", "failed", time1, log_file)
                    exit_num = 8
                    failed_tests.append(line)
                else:
                    color_echo("0a", "Success", time1)

    if failed_tests:
        with open("failed.txt", 'w') as file:
            for test in failed_tests:
                file.write(test + '\n')

    return exit_num

if __name__ == "__main__":
    case_file = "simpletest.bat"
    if len(os.sys.argv) > 1 and os.sys.argv[1] == "full":
        case_file = "fulltest.sh"
    if len(os.sys.argv) > 2:
        case_file = os.sys.argv[2]

    exit_code = run_tests(case_file)
    print(f"Final exit code: {exit_code}")
    if exit_code != 0:
        print("One or more tests failed.")
    os.sys.exit(exit_code)

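For quick reference, the helper above can also be driven directly from Python rather than through its `__main__` block; a minimal sketch follows (the module name `run_cases` is hypothetical, since this hunk does not show the new file's path):

```python
# Hypothetical module name; the diff does not reveal the script's filename.
from run_cases import run_tests

# Runs every "python3 ./test.py -f ..." line in fulltest.sh, writing one
# log_<n>.txt per case and collecting failing command lines in failed.txt.
exit_code = run_tests("fulltest.sh")
print(f"test run finished, exit code: {exit_code}")
```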
@ -145,5 +145,6 @@ void shellTestNetWork();

// shellMain.c
extern SShellObj shell;
extern char configDirShell[PATH_MAX];

#endif /*_TD_SHELL_INT_H_*/

@ -15,7 +15,7 @@

#include "shellInt.h"
#include "../../inc/pub.h"

char configDirShell[PATH_MAX] = {0};

#define TAOS_CONSOLE_PROMPT_CONTINUE " -> "

@ -381,9 +381,16 @@ static int32_t shellCheckArgs() {
printf("Invalid cfgdir:%s\r\n", pArgs->cfgdir);
return -1;
} else {
if (taosExpandDir(pArgs->cfgdir, configDir, PATH_MAX) != 0) {
tstrncpy(configDir, pArgs->cfgdir, PATH_MAX);
if (taosExpandDir(pArgs->cfgdir, configDirShell, PATH_MAX) != 0) {
tstrncpy(configDirShell, pArgs->cfgdir, PATH_MAX);
}
// check cfg dir exist
/*
if(!taosIsDir(configDirShell)) {
printf("folder not exist. cfgdir:%s expand:%s\r\n", pArgs->cfgdir, configDirShell);
configDirShell[0] = 0;
return -1;
}*/
}
}

@ -55,6 +55,7 @@ void initArgument(SShellArgs *pArgs) {
}

int main(int argc, char *argv[]) {
int code = 0;
#if !defined(WINDOWS)
taosSetSignal(SIGBUS, shellCrashHandler);
#endif

@ -92,19 +93,34 @@ int main(int argc, char *argv[]) {
return 0;
}

if (shell.args.is_dump_config) {
shellDumpConfig();
return 0;
}

if (getDsnEnv() != 0) {
return -1;
}

// first taos_option(TSDB_OPTION_DRIVER ...) no load driver
if (setConnMode(shell.args.connMode, shell.args.dsn, false)) {
return -1;
}

// second taos_option(TSDB_OPTION_CONFIGDIR ...) set configDir global
if (configDirShell[0] != 0) {
code = taos_options(TSDB_OPTION_CONFIGDIR, configDirShell);
if (code) {
fprintf(stderr, "failed to set config dir:%s code:[0x%08X]\r\n", configDirShell, code);
return -1;
}
//printf("Load with input config dir:%s\n", configDirShell);
}

#ifndef TD_ASTRA
// dump config
if (shell.args.is_dump_config) {
shellDumpConfig();
return 0;
}
#endif

// taos_init
if (taos_init() != 0) {
fprintf(stderr, "failed to init shell since %s [0x%08X]\r\n", taos_errstr(NULL), taos_errno(NULL));
return -1;

@ -114,12 +130,6 @@ int main(int argc, char *argv[]) {
taos_set_hb_quit(1);

#ifndef TD_ASTRA
if (shell.args.is_dump_config) {
shellDumpConfig();
taos_cleanup();
return 0;
}

if (shell.args.is_startup || shell.args.is_check) {
shellCheckServerStatus();
taos_cleanup();

@ -80,7 +80,7 @@ void shellGenerateAuth() {
void shellDumpConfig() {
(void)osDefaultInit();

if (taosInitCfg(configDir, NULL, NULL, NULL, NULL, 1) != 0) {
if (taosInitCfg(configDirShell, NULL, NULL, NULL, NULL, 1) != 0) {
fprintf(stderr, "failed to load cfg since %s [0x%08X]\n", terrstr(), terrno);
return;
}

@ -10,8 +10,7 @@
1. [Testing](#8-testing)
1. [Releasing](#9-releasing)
1. [CI/CD](#10-cicd)
1. [Coverage](#11-coverage)
1. [Contributing](#12-contributing)
1. [Contributing](#11-contributing)

# 1. Introduction

@ -93,7 +92,6 @@ The taosanode will be installed as an system service, but will not automatic sta
systemctl start taosanoded
```

## 6.2 Configure the Service
taosanode provides the RESTFul service powered by `uWSGI`. You can config the options to tune the
performance by changing the default configuration file `taosanode.ini` located in `/etc/taos`, which is also the configuration directory for `taosd` service.

@ -123,10 +121,7 @@ For the complete list of taosanode Releases, please see Releases.

We use Github Actions for CI/CD workflow configuration. Please refer to the workflow definition yaml file in [.github/workflows](../../.github/workflows/) for details.

# 11 Coverage

# 12 Contributing
# 11 Contributing

Guidelines for contributing to the project:

@ -78,4 +78,4 @@ model-dir = /usr/local/taos/taosanode/model/
log-level = DEBUG

# draw the query results
draw-result = 1
draw-result = 0

@ -0,0 +1,23 @@
create database if not exists test keep 36500;
use test;

drop table if exists ad_sample;
create table if not exists ad_sample(ts timestamp, val int);

insert into ad_sample values(1577808000000, 5);
insert into ad_sample values(1577808001000, 14);
insert into ad_sample values(1577808002000, 15);
insert into ad_sample values(1577808003000, 15);
insert into ad_sample values(1577808004000, 14);
insert into ad_sample values(1577808005000, 19);
insert into ad_sample values(1577808006000, 17);
insert into ad_sample values(1577808007000, 16);
insert into ad_sample values(1577808008000, 20);
insert into ad_sample values(1577808009000, 22);
insert into ad_sample values(1577808010000, 8);
insert into ad_sample values(1577808011000, 21);
insert into ad_sample values(1577808012000, 28);
insert into ad_sample values(1577808013000, 11);
insert into ad_sample values(1577808014000, 9);
insert into ad_sample values(1577808015000, 29);
insert into ad_sample values(1577808016000, 40);

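The install script shown later in this diff copies these `resource/*.sql` samples into the anode install directory and adds `taospy` to the anode venv. A minimal sketch of replaying such a sample file against a local TDengine instance with taospy follows; the connection parameters and the `resource/ad_sample.sql` path are assumptions, not something stated in the diff:

```python
# Minimal sketch: replay the ad_sample resource SQL against a local TDengine.
# Assumptions: taospy is installed and taosd is reachable with default credentials.
import taos

conn = taos.connect(host="localhost", user="root", password="taosdata")
with open("resource/ad_sample.sql") as f:      # assumed file name/path
    for stmt in f.read().split(";"):
        stmt = stmt.strip()
        if stmt:                               # skip blank fragments between statements
            conn.execute(stmt)

rows = conn.query("select count(*) from test.ad_sample").fetch_all()
print("loaded rows:", rows[0][0])
conn.close()
```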
@ -0,0 +1,150 @@
create database if not exists test keep 36500;
use test;

drop table if exists passengers;
create table if not exists passengers(ts timestamp, val int);

insert into passengers values("1949-01-01 00:00:00",112);
insert into passengers values("1949-02-01 00:00:00",118);
insert into passengers values("1949-03-01 00:00:00",132);
insert into passengers values("1949-04-01 00:00:00",129);
insert into passengers values("1949-05-01 00:00:00",121);
insert into passengers values("1949-06-01 00:00:00",135);
insert into passengers values("1949-07-01 00:00:00",148);
insert into passengers values("1949-08-01 00:00:00",148);
insert into passengers values("1949-09-01 00:00:00",136);
insert into passengers values("1949-10-01 00:00:00",119);
insert into passengers values("1949-11-01 00:00:00",104);
insert into passengers values("1949-12-01 00:00:00",118);
insert into passengers values("1950-01-01 00:00:00",115);
insert into passengers values("1950-02-01 00:00:00",126);
insert into passengers values("1950-03-01 00:00:00",141);
insert into passengers values("1950-04-01 00:00:00",135);
insert into passengers values("1950-05-01 00:00:00",125);
insert into passengers values("1950-06-01 00:00:00",149);
insert into passengers values("1950-07-01 00:00:00",170);
insert into passengers values("1950-08-01 00:00:00",170);
insert into passengers values("1950-09-01 00:00:00",158);
insert into passengers values("1950-10-01 00:00:00",133);
insert into passengers values("1950-11-01 00:00:00",114);
insert into passengers values("1950-12-01 00:00:00",140);
insert into passengers values("1951-01-01 00:00:00",145);
insert into passengers values("1951-02-01 00:00:00",150);
insert into passengers values("1951-03-01 00:00:00",178);
insert into passengers values("1951-04-01 00:00:00",163);
insert into passengers values("1951-05-01 00:00:00",172);
insert into passengers values("1951-06-01 00:00:00",178);
insert into passengers values("1951-07-01 00:00:00",199);
insert into passengers values("1951-08-01 00:00:00",199);
insert into passengers values("1951-09-01 00:00:00",184);
insert into passengers values("1951-10-01 00:00:00",162);
insert into passengers values("1951-11-01 00:00:00",146);
insert into passengers values("1951-12-01 00:00:00",166);
insert into passengers values("1952-01-01 00:00:00",171);
insert into passengers values("1952-02-01 00:00:00",180);
insert into passengers values("1952-03-01 00:00:00",193);
insert into passengers values("1952-04-01 00:00:00",181);
insert into passengers values("1952-05-01 00:00:00",183);
insert into passengers values("1952-06-01 00:00:00",218);
insert into passengers values("1952-07-01 00:00:00",230);
insert into passengers values("1952-08-01 00:00:00",242);
insert into passengers values("1952-09-01 00:00:00",209);
insert into passengers values("1952-10-01 00:00:00",191);
insert into passengers values("1952-11-01 00:00:00",172);
insert into passengers values("1952-12-01 00:00:00",194);
insert into passengers values("1953-01-01 00:00:00",196);
insert into passengers values("1953-02-01 00:00:00",196);
insert into passengers values("1953-03-01 00:00:00",236);
insert into passengers values("1953-04-01 00:00:00",235);
insert into passengers values("1953-05-01 00:00:00",229);
insert into passengers values("1953-06-01 00:00:00",243);
insert into passengers values("1953-07-01 00:00:00",264);
insert into passengers values("1953-08-01 00:00:00",272);
insert into passengers values("1953-09-01 00:00:00",237);
insert into passengers values("1953-10-01 00:00:00",211);
insert into passengers values("1953-11-01 00:00:00",180);
insert into passengers values("1953-12-01 00:00:00",201);
insert into passengers values("1954-01-01 00:00:00",204);
insert into passengers values("1954-02-01 00:00:00",188);
insert into passengers values("1954-03-01 00:00:00",235);
insert into passengers values("1954-04-01 00:00:00",227);
insert into passengers values("1954-05-01 00:00:00",234);
insert into passengers values("1954-06-01 00:00:00",264);
insert into passengers values("1954-07-01 00:00:00",302);
insert into passengers values("1954-08-01 00:00:00",293);
insert into passengers values("1954-09-01 00:00:00",259);
insert into passengers values("1954-10-01 00:00:00",229);
insert into passengers values("1954-11-01 00:00:00",203);
insert into passengers values("1954-12-01 00:00:00",229);
insert into passengers values("1955-01-01 00:00:00",242);
insert into passengers values("1955-02-01 00:00:00",233);
insert into passengers values("1955-03-01 00:00:00",267);
insert into passengers values("1955-04-01 00:00:00",269);
insert into passengers values("1955-05-01 00:00:00",270);
insert into passengers values("1955-06-01 00:00:00",315);
insert into passengers values("1955-07-01 00:00:00",364);
insert into passengers values("1955-08-01 00:00:00",347);
insert into passengers values("1955-09-01 00:00:00",312);
insert into passengers values("1955-10-01 00:00:00",274);
insert into passengers values("1955-11-01 00:00:00",237);
insert into passengers values("1955-12-01 00:00:00",278);
insert into passengers values("1956-01-01 00:00:00",284);
insert into passengers values("1956-02-01 00:00:00",277);
insert into passengers values("1956-03-01 00:00:00",317);
insert into passengers values("1956-04-01 00:00:00",313);
insert into passengers values("1956-05-01 00:00:00",318);
insert into passengers values("1956-06-01 00:00:00",374);
insert into passengers values("1956-07-01 00:00:00",413);
insert into passengers values("1956-08-01 00:00:00",405);
insert into passengers values("1956-09-01 00:00:00",355);
insert into passengers values("1956-10-01 00:00:00",306);
insert into passengers values("1956-11-01 00:00:00",271);
insert into passengers values("1956-12-01 00:00:00",306);
insert into passengers values("1957-01-01 00:00:00",315);
insert into passengers values("1957-02-01 00:00:00",301);
insert into passengers values("1957-03-01 00:00:00",356);
insert into passengers values("1957-04-01 00:00:00",348);
insert into passengers values("1957-05-01 00:00:00",355);
insert into passengers values("1957-06-01 00:00:00",422);
insert into passengers values("1957-07-01 00:00:00",465);
insert into passengers values("1957-08-01 00:00:00",467);
insert into passengers values("1957-09-01 00:00:00",404);
insert into passengers values("1957-10-01 00:00:00",347);
insert into passengers values("1957-11-01 00:00:00",305);
insert into passengers values("1957-12-01 00:00:00",336);
insert into passengers values("1958-01-01 00:00:00",340);
insert into passengers values("1958-02-01 00:00:00",318);
insert into passengers values("1958-03-01 00:00:00",362);
insert into passengers values("1958-04-01 00:00:00",348);
insert into passengers values("1958-05-01 00:00:00",363);
insert into passengers values("1958-06-01 00:00:00",435);
insert into passengers values("1958-07-01 00:00:00",491);
insert into passengers values("1958-08-01 00:00:00",505);
insert into passengers values("1958-09-01 00:00:00",404);
insert into passengers values("1958-10-01 00:00:00",359);
insert into passengers values("1958-11-01 00:00:00",310);
insert into passengers values("1958-12-01 00:00:00",337);
insert into passengers values("1959-01-01 00:00:00",360);
insert into passengers values("1959-02-01 00:00:00",342);
insert into passengers values("1959-03-01 00:00:00",406);
insert into passengers values("1959-04-01 00:00:00",396);
insert into passengers values("1959-05-01 00:00:00",420);
insert into passengers values("1959-06-01 00:00:00",472);
insert into passengers values("1959-07-01 00:00:00",548);
insert into passengers values("1959-08-01 00:00:00",559);
insert into passengers values("1959-09-01 00:00:00",463);
insert into passengers values("1959-10-01 00:00:00",407);
insert into passengers values("1959-11-01 00:00:00",362);
insert into passengers values("1959-12-01 00:00:00",405);
insert into passengers values("1960-01-01 00:00:00",417);
insert into passengers values("1960-02-01 00:00:00",391);
insert into passengers values("1960-03-01 00:00:00",419);
insert into passengers values("1960-04-01 00:00:00",461);
insert into passengers values("1960-05-01 00:00:00",472);
insert into passengers values("1960-06-01 00:00:00",535);
insert into passengers values("1960-07-01 00:00:00",622);
insert into passengers values("1960-08-01 00:00:00",606);
insert into passengers values("1960-09-01 00:00:00",508);
insert into passengers values("1960-10-01 00:00:00",461);
insert into passengers values("1960-11-01 00:00:00",390);
insert into passengers values("1960-12-01 00:00:00",432);

@ -22,6 +22,7 @@ emailName="taosdata.com"
tarName="package.tar.gz"
logDir="/var/log/${PREFIX}/${PRODUCTPREFIX}"
moduleDir="/var/lib/${PREFIX}/${PRODUCTPREFIX}/model"
resourceDir="/var/lib/${PREFIX}/${PRODUCTPREFIX}/resource"
venvDir="/var/lib/${PREFIX}/${PRODUCTPREFIX}/venv"
global_conf_dir="/etc/${PREFIX}"
installDir="/usr/local/${PREFIX}/${PRODUCTPREFIX}"

@ -376,6 +377,13 @@ function install_module() {
${csudo}ln -sf ${moduleDir} ${install_main_dir}/model
}

function install_resource() {
${csudo}mkdir -p ${resourceDir} && ${csudo}chmod 777 ${resourceDir}
${csudo}ln -sf ${resourceDir} ${install_main_dir}/resource

${csudo}cp ${script_dir}/resource/*.sql ${install_main_dir}/resource/
}

function install_anode_venv() {
${csudo}mkdir -p ${venvDir} && ${csudo}chmod 777 ${venvDir}
${csudo}ln -sf ${venvDir} ${install_main_dir}/venv

@ -401,6 +409,7 @@ function install_anode_venv() {
${csudo}${venvDir}/bin/pip3 install torch --index-url https://download.pytorch.org/whl/cpu
${csudo}${venvDir}/bin/pip3 install --upgrade keras
${csudo}${venvDir}/bin/pip3 install requests
${csudo}${venvDir}/bin/pip3 install taospy

echo -e "Install python library for venv completed!"
}

@ -620,6 +629,7 @@ function updateProduct() {
install_main_path
install_log
install_module
install_resource
install_config

if [ -z $1 ]; then

@ -668,6 +678,7 @@ function installProduct() {
install_log
install_anode_config
install_module
install_resource

install_bin_and_lib
if ! is_container; then

@ -51,6 +51,7 @@ install_files="${script_dir}/install.sh"
# make directories.
mkdir -p ${install_dir}
mkdir -p ${install_dir}/cfg && cp ${cfg_dir}/${configFile} ${install_dir}/cfg/${configFile}
mkdir -p ${install_dir}/resource && cp ${top_dir}/resource/* ${install_dir}/resource/

if [ -f "${cfg_dir}/${serverName}.service" ]; then
cp ${cfg_dir}/${serverName}.service ${install_dir}/cfg || :

@ -5,6 +5,7 @@
from matplotlib import pyplot as plt
from taosanalytics.conf import app_logger, conf
from taosanalytics.servicemgmt import loader
from taosanalytics.util import convert_results_to_windows

def do_ad_check(input_list, ts_list, algo_name, params):

@ -22,17 +23,19 @@ def do_ad_check(input_list, ts_list, algo_name, params):

res = s.execute()

n_error = abs(sum(filter(lambda x: x == -1, res)))
n_error = abs(sum(filter(lambda x: x != s.valid_code, res)))
app_logger.log_inst.debug("There are %d in input, and %d anomaly points found: %s",
len(input_list),
n_error,
res)

draw_ad_results(input_list, res, algo_name)
return res
# draw_ad_results(input_list, res, algo_name, s.valid_code)

ano_window = convert_results_to_windows(res, ts_list, s.valid_code)
return res, ano_window

def draw_ad_results(input_list, res, fig_name):
def draw_ad_results(input_list, res, fig_name, valid_code):
""" draw the detected anomaly points """

# not in debug, do not visualize the anomaly detection result

@ -41,8 +44,7 @@ def draw_ad_results(input_list, res, fig_name):

plt.clf()
for index, val in enumerate(res):
if val != -1:
continue
if val != valid_code:
plt.scatter(index, input_list[index], marker='o', color='r', alpha=0.5, s=100, zorder=3)

plt.plot(input_list, label='sample')

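The switch from a hard-coded `-1` to `s.valid_code` in the hunk above means every label other than the algorithm's declared "normal" code is now treated as an anomaly. A tiny self-contained illustration of that counting expression follows; the example labels and the `valid_code` value are made up for demonstration and are not taken from the diff:

```python
# Toy illustration of the counting expression used in do_ad_check above.
# valid_code = 1 is an assumed "normal" label; real algorithms define their own.
valid_code = 1
res = [1, 1, -1, 1, -1, 1, 1]   # made-up detector output, two flagged points

# Same expression as in the diff: count every label that is not the valid code.
n_error = abs(sum(filter(lambda x: x != valid_code, res)))
print(n_error)  # prints 2
```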
@ -86,11 +86,11 @@ class _ArimaService(AbstractForecastService):
if len(self.list) > 3000:
raise ValueError("number of input data is too large")

if self.fc_rows <= 0:
if self.rows <= 0:
raise ValueError("fc rows is not specified yet")

res, mse, model_info = self.__do_forecast_helper(self.fc_rows)
insert_ts_list(res, self.start_ts, self.time_step, self.fc_rows)
res, mse, model_info = self.__do_forecast_helper(self.rows)
insert_ts_list(res, self.start_ts, self.time_step, self.rows)

return {
"mse": mse,

@ -10,31 +10,26 @@ from taosanalytics.service import AbstractForecastService

class _GPTService(AbstractForecastService):
name = 'td_gpt_fc'
name = 'tdtsfm_1'
desc = "internal gpt forecast model based on transformer"

def __init__(self):
super().__init__()

self.table_name = None
self.service_host = 'http://127.0.0.1:5000/ds_predict'
self.service_host = 'http://127.0.0.1:5000/tdtsfm'
self.headers = {'Content-Type': 'application/json'}

self.std = None
self.threshold = None
self.time_interval = None
self.dir = 'internal-gpt'

def execute(self):
if self.list is None or len(self.list) < self.period:
raise ValueError("number of input data is less than the periods")

if self.fc_rows <= 0:
if self.rows <= 0:
raise ValueError("fc rows is not specified yet")

# let's request the gpt service
data = {"input": self.list, 'next_len': self.fc_rows}
data = {"input": self.list, 'next_len': self.rows}
try:
response = requests.post(self.service_host, data=json.dumps(data), headers=self.headers)
except Exception as e:

@ -54,7 +49,7 @@ class _GPTService(AbstractForecastService):
"res": [pred_y]
}

insert_ts_list(res["res"], self.start_ts, self.time_step, self.fc_rows)
insert_ts_list(res["res"], self.start_ts, self.time_step, self.rows)
return res

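Pulling the pieces of this hunk together: the forecast request is a plain JSON POST carrying the historical values and the number of rows to predict. A standalone sketch of that exchange is shown below; the endpoint, headers, and payload keys come from the diff, while the shape of the response body is an assumption, since the hunk only shows that the predictions end up in `res["res"]`:

```python
# Standalone sketch of the request made by _GPTService.execute() above.
# Endpoint, headers and payload keys are taken from the diff; the response
# format is assumed (only the request side is visible in the hunk).
import json
import requests

service_host = "http://127.0.0.1:5000/tdtsfm"
headers = {"Content-Type": "application/json"}

history = [112, 118, 132, 129, 121, 135]      # e.g. a slice of the passengers sample
payload = {"input": history, "next_len": 10}  # next_len: number of rows to forecast

resp = requests.post(service_host, data=json.dumps(payload), headers=headers)
resp.raise_for_status()
print(resp.json())  # the service is expected to return the predicted values as JSON
```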