Merge branch '3.0' of https://github.com/taosdata/TDengine into 3.0

2024-04-25 19:26:41 +08:00 · 2024-04-25 19:26:41 +08:00 · b8e373f6a9
parent 0551c1189d e8ef5c570c
commit b8e373f6a9
19 changed files with 463 additions and 171 deletions
--- a/docs/en/12-taos-sql/07-tag-index.md
+++ b/docs/en/12-taos-sql/07-tag-index.md
@@ -34,6 +34,13 @@ SELECT * FROM information_schema.INS_INDEXES

 You can also add filter conditions to limit the results.

+
+````sql
+SHOW INDEXES FROM tbl_name [FROM db_name];
+SHOW INDEXES FROM [db_name.]tbl_name ;
+````
+Use the `SHOW INDEXES` command to list the indexes that have been created on the specified database or table.
+
 ## Detailed Specification

 1. Used properly, indexes can improve query performance significantly. The tag index supports the operators `=`, `>`, `>=`, `<`, and `<=`; filters on tags that use these operators benefit from the index, while filters that use other operators do not. More operators will be supported in the future.
--- a/docs/en/12-taos-sql/27-indexing.md
+++ b/docs/en/12-taos-sql/27-indexing.md
@@ -1,65 +1,134 @@
 ---
-title: Indexing
-sidebar_label: Indexing
-description: This document describes the SQL statements related to indexing in TDengine.
+sidebar_label: Window Pre-Aggregation
+title: Window Pre-Aggregation
+description: Instructions for using Window Pre-Aggregation
 ---

-TDengine supports SMA and tag indexing.
+To improve the performance of aggregate function queries on large datasets, you can create Time-Range Small Materialized Aggregates (TSMA) objects. These objects perform pre-computation on specified aggregate functions using fixed time windows and store the computed results. When querying, you can retrieve the pre-computed results to enhance query performance.

-## Create an Index
+## Creating TSMA

 ```sql
-CREATE INDEX index_name ON tb_name (col_name [, col_name] ...)
+-- Create TSMA based on a super table or regular table
+CREATE TSMA tsma_name ON [dbname.]table_name FUNCTION (func_name(func_param) [, ...] ) INTERVAL(time_duration);
+-- Create a large window TSMA based on a small window TSMA
+CREATE RECURSIVE TSMA tsma_name ON [db_name.]tsma_name1 INTERVAL(time_duration);

-CREATE SMA INDEX index_name ON tb_name index_option
-
-index_option:
-    FUNCTION(functions) INTERVAL(interval_val [, interval_offset]) [SLIDING(sliding_val)] [WATERMARK(watermark_val)] [MAX_DELAY(max_delay_val)]
-
-functions:
-    function [, function] ...
+time_duration:
+    number unit
 ```
-### tag Indexing

-  [tag index](../tag-index)
+To create a TSMA, specify the TSMA name, the table name, the function list, and the window size. When you create a TSMA on top of an existing TSMA with the `RECURSIVE` keyword, you do not need to specify `FUNCTION()`: the new TSMA uses the same function list as the existing one, and its `INTERVAL` must be an integer multiple of the base TSMA's window.

-### SMA Indexing
+TSMA names follow rules similar to table names. The maximum length is the table name length limit minus the length of the output table suffix: the table name length limit is 193, the output table suffix is `_tsma_res_stb_`, and the maximum TSMA name length is 178.

-Performs pre-aggregation on the specified column over the time window defined by the INTERVAL clause. The type is specified in functions_string. SMA indexing improves aggregate query performance for the specified time period. One supertable can only contain one SMA index.
+TSMA can only be created based on super tables and regular tables, not on subtables.

- The max, min, and sum functions are supported.
- WATERMARK: Enter a value between 0ms and 900000ms. The most precise unit supported is milliseconds. The default value is 5 seconds. This option can be used only on supertables.
- MAX_DELAY: Enter a value between 1ms and 900000ms. The most precise unit supported is milliseconds. The default value is the value of interval provided that it does not exceed 900000ms. This option can be used only on supertables. Note: Retain the default value if possible. Configuring a small MAX_DELAY may cause results to be frequently pushed, affecting storage and query performance.
+The function list can contain only supported aggregate functions (see below), and each function must take exactly one parameter, even if the function itself supports more. The parameter must be an ordinary column name, not a tag column. Identical function and column pairs in the list are deduplicated. When a TSMA is calculated, the intermediate results of all functions are written to another super table, which also includes all tag columns of the original table. The maximum number of functions is the maximum number of columns allowed in the output table (including tag columns), minus the four columns added for TSMA calculation (`_wstart`, `_wend`, `_wduration`, and a new tag column `tbname`), minus the number of tag columns in the original table. If the number of columns exceeds the limit, the error `Too many columns` is reported.
+
+Because the output of a TSMA is a super table, the row length of the output table is subject to the maximum row length limit. The intermediate results of different functions vary in size but are generally larger than the original data. If the row length of the output table exceeds the limit, the error `Row length exceeds max length` is reported. In that case, reduce the number of functions, or group the commonly used functions and split them across multiple TSMAs.
+
+The window size must be between 1ms and 1h. `INTERVAL` uses the same units as the `INTERVAL` clause in queries, such as a (milliseconds), b (nanoseconds), h (hours), m (minutes), s (seconds), and u (microseconds).
+
+A TSMA is a database-level object, but its name must be globally unique. The number of TSMAs that can be created in a cluster is limited by the parameter `maxTsmaNum`, which defaults to 8 and ranges from 0 to 12. Because TSMA background calculation uses stream computing, creating a TSMA also creates a stream, so the number of TSMAs that can be created is further limited by the number of existing streams and the maximum number of streams that can be created.
+
+## Supported Functions
+| function |  comments |
+|---|---|
+|min||
+|max||
+|sum||
+|first||
+|last||
+|avg||
+|count| To use `count(*)`, define `count(ts)` in the TSMA|
+|spread||
+|stddev||
+|hyperloglog||
+|||
+
+## Drop TSMA
+```sql
+DROP TSMA [db_name.]tsma_name;
+```
+
+If other TSMAs were created based on the TSMA being dropped, the operation fails with the error `Invalid drop base tsma, drop recursive tsma first`. All recursive TSMAs built on it must therefore be dropped first.
+
+## TSMA Calculation
+The calculation result of a TSMA is a super table in the same database as the original table. This table is not visible to users and cannot be dropped directly; it is removed automatically when `DROP TSMA` is executed. TSMA calculation is performed by stream computing as an asynchronous background process, so the results are not guaranteed to be real-time, but they are guaranteed to be eventually correct.
+
+When there is a large amount of historical data, stream computing first processes the historical data after a TSMA is created, and the new TSMA is not used during this period. When data is updated or deleted, or when expired data arrives, the affected part is recalculated automatically. During recalculation, TSMA query results are not guaranteed to be real-time. To query real-time data, add the hint `/*+ skip_tsma() */` to the SQL statement or disable the `querySmaOptimize` parameter so that the query reads the original data.
+
+## Usage and Limitations of TSMA
+
+Client configuration parameter: `querySmaOptimize`, used to control whether to use TSMA during queries. Set it to `True` to use TSMA, and `False` to query from the original data.
+
+Client configuration parameter: `maxTsmaCalcDelay`, in seconds, is used to control the acceptable TSMA calculation delay for users. If the calculation progress of a TSMA is within this range from the latest time, the TSMA will be used. If it exceeds this range, it will not be used. The default value is 600 (10 minutes), with a minimum value of 600 (10 minutes) and a maximum value of 86400 (1 day).
+
+### Using TSMA During Queries
+
+The aggregate functions defined in TSMA can be directly used in most query scenarios. If multiple TSMA are available, the one with the larger window size is preferred. For unclosed windows, the calculation can be done using smaller window TSMA or the original data. However, there are certain scenarios where TSMA cannot be used (see below). In such cases, the entire query will be calculated using the original data.
+
+The default behavior for queries without specified window sizes is to prioritize the use of the largest window TSMA that includes all the aggregate functions used in the query. For example, `SELECT COUNT(*) FROM stable GROUP BY tbname` will use the TSMA with the largest window that includes the `count(ts)` function. Therefore, when using aggregate queries frequently, it is recommended to create TSMA objects with larger window size.
+
+When a window size is specified with the `INTERVAL` clause, the largest TSMA whose window evenly divides the query window is used. In window queries, the `INTERVAL` window size, `OFFSET`, and `SLIDING` all affect which TSMA window sizes can be used: a usable TSMA is one whose window size evenly divides the `INTERVAL`, `OFFSET`, and `SLIDING` of the query. If window queries are frequent, take the commonly queried window sizes, offsets, and sliding sizes into account when creating TSMAs.
+
+Example 1. Suppose one TSMA with a `5m` window and one with a `10m` window exist. A query with `INTERVAL(30m)` uses the `10m` TSMA. A query with `INTERVAL(30m, 10m) SLIDING(5m)` can use only the `5m` TSMA.
+
+### Limitations of Query
+
+When the parameter `querySmaOptimize` is enabled and there is no `skip_tsma()` hint, the following query scenarios cannot use TSMA:
+
+- The aggregate functions defined in the TSMA do not cover the function list of the current query.
+- The query uses a non-`INTERVAL` window, or the query window (including `INTERVAL`, `SLIDING`, and `OFFSET`) is not an integer multiple of the TSMA window. For example, a TSMA with a 2m window cannot serve a query with a 5m window, but a TSMA with a 1m window, if one exists, can.
+- The `WHERE` condition filters on any regular column (that is, any column other than the primary key timestamp column).
+- `PARTITION` or `GROUP BY` includes any regular column or an expression on one.
+- A faster optimization applies, such as the last cache optimization. If the query meets the conditions for the last optimization, that is used first; TSMA optimization is considered only when it cannot be used.
+- The current calculation delay of the TSMA is greater than the configuration parameter `maxTsmaCalcDelay`.
+
+Some examples:

 ```sql
-DROP DATABASE IF EXISTS d0;
-CREATE DATABASE d0;
-USE d0;
-CREATE TABLE IF NOT EXISTS st1 (ts timestamp, c1 int, c2 float, c3 double) TAGS (t1 int unsigned);
-CREATE TABLE ct1 USING st1 TAGS(1000);
-CREATE TABLE ct2 USING st1 TAGS(2000);
-INSERT INTO ct1 VALUES(now+0s, 10, 2.0, 3.0);
-INSERT INTO ct1 VALUES(now+1s, 11, 2.1, 3.1)(now+2s, 12, 2.2, 3.2)(now+3s, 13, 2.3, 3.3);
-CREATE SMA INDEX sma_index_name1 ON st1 FUNCTION(max(c1),max(c2),min(c1)) INTERVAL(5m,10s) SLIDING(5m) WATERMARK 5s MAX_DELAY 1m;
-- query from SMA Index
-ALTER LOCAL 'querySmaOptimize' '1';
-SELECT max(c2),min(c1) FROM st1 INTERVAL(5m,10s) SLIDING(5m);
-SELECT _wstart,_wend,_wduration,max(c2),min(c1) FROM st1 INTERVAL(5m,10s) SLIDING(5m);
-- query from raw data
-ALTER LOCAL 'querySmaOptimize' '0';
+SELECT agg_func_list [, pseudo_col_list] FROM stable WHERE exprs [GROUP/PARTITION BY [tbname] [, tag_list]] [HAVING ...] [INTERVAL(time_duration, offset) SLIDING(duration)]...;
+
+-- create
+CREATE TSMA tsma1 ON stable FUNCTION(COUNT(ts), SUM(c1), SUM(c3), MIN(c1), MIN(c3), AVG(c1)) INTERVAL(1m);
+-- query
+SELECT COUNT(*), SUM(c1) + SUM(c3) FROM stable; ---- use tsma1
+SELECT COUNT(*), AVG(c1) FROM stable GROUP/PARTITION BY tbname, tag1, tag2;  --- use tsma1
+SELECT COUNT(*), MIN(c1) FROM stable INTERVAL(1h);  ---use tsma1
+SELECT COUNT(*), MIN(c1), SPREAD(c1) FROM stable INTERVAL(1h); ----- can't use, spread func not defined, although SPREAD can be calculated by MIN and MAX which are defined.
+SELECT COUNT(*), MIN(c1) FROM stable INTERVAL(30s); ----- can't use tsma1, time_duration not fit. Normally, query_time_duration should be multiple of create_duration.
+SELECT COUNT(*), MIN(c1) FROM stable where c2 > 0; ---- can't use tsma1, can't do c2 filtering
+SELECT COUNT(*) FROM stable GROUP BY c2; ---- can't use any tsma
+SELECT MIN(c3), MIN(c2) FROM stable INTERVAL(1m); ---- can't use tsma1, c2 is not defined in tsma1.
+
+-- Another tsma2 created with INTERVAL(1h) based on tsma1
+CREATE RECURSIVE TSMA tsma2 on tsma1 INTERVAL(1h);
+SELECT COUNT(*), SUM(c1) FROM stable; ---- use tsma2
+SELECT COUNT(*), AVG(c1) FROM stable GROUP/PARTITION BY tbname, tag1, tag2;  --- use tsma2
+SELECT COUNT(*), MIN(c1) FROM stable INTERVAL(2h);  ---use tsma2
+SELECT COUNT(*), MIN(c1) FROM stable WHERE ts < '2023-01-01 10:10:10' INTERVAL(30m); --use tsma1
+SELECT COUNT(*), MIN(c1) + MIN(c3) FROM stable INTERVAL(30m);  ---use tsma1
+SELECT COUNT(*), MIN(c1) FROM stable INTERVAL(1h) SLIDING(30m);  ---use tsma1
+SELECT COUNT(*), MIN(c1), SPREAD(c1) FROM stable INTERVAL(1h); ----- can't use tsma1 or tsma2, spread func not defined
+SELECT COUNT(*), MIN(c1) FROM stable INTERVAL(30s); ----- can't use tsma1 or tsma2, time_duration not fit. Normally, query_time_duration should be multiple of create_duration.
+SELECT COUNT(*), MIN(c1) FROM stable where c2 > 0; ---- can't use tsma1 or tsma2, can't do c2 filtering
 ```

-## Delete an Index
+### Limitations of Usage
+
+After a TSMA is created, the following restrictions apply to operations on the original table:
+
+- You must delete all TSMAs on the table before you can delete the table itself.
+- None of the tag columns of the original table can be deleted, and tag column names and subtable tag values cannot be modified. To delete a tag column, delete the TSMA first.
+- Columns used by a TSMA cannot be deleted; delete the TSMA first. Adding new columns to the table is not affected, but new columns are not included in any existing TSMA, so you must create a new TSMA to pre-aggregate them.
+
+## Show TSMA

 ```sql
-DROP INDEX index_name;
+SHOW [db_name.]TSMAS;
+SELECT * FROM information_schema.ins_tsma;
 ```

-## View Indices
-
-````sql
-SHOW INDEXES FROM tbl_name [FROM db_name];
-SHOW INDEXES FROM [db_name.]tbl_name ;
-````
-
-Shows indices that have been created for the specified database or table.
+If many functions were specified at creation and the column names are long, the function list may be truncated when displayed (the output currently supports a maximum of 256KB).
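
Note on the `27-indexing.md` change above: the window-matching rule it describes (a TSMA is usable when its window evenly divides the query's `INTERVAL`, `OFFSET`, and `SLIDING`, and the largest usable window is preferred) can be illustrated with a minimal Python sketch. The function name `pick_tsma_window`, the use of seconds as the unit, and the treatment of a missing `SLIDING` as equal to the interval are assumptions made for illustration. This is not TDengine's planner code, and it ignores the other restrictions (function coverage, `WHERE` filters, calculation delay).

```python
from typing import Iterable, Optional


def pick_tsma_window(
    tsma_windows_s: Iterable[int],
    interval_s: int,
    offset_s: int = 0,
    sliding_s: Optional[int] = None,
) -> Optional[int]:
    """Return the largest TSMA window (in seconds) that evenly divides the
    query's INTERVAL, OFFSET, and SLIDING, or None if no window fits."""
    # Assumption: a query without SLIDING slides by its full interval.
    sliding = interval_s if sliding_s is None else sliding_s
    usable = [
        w
        for w in tsma_windows_s
        if w > 0
        and interval_s % w == 0
        and offset_s % w == 0
        and sliding % w == 0
    ]
    return max(usable, default=None)


MINUTE = 60
# Example 1 from the documentation: TSMAs with 5m and 10m windows.
windows = [5 * MINUTE, 10 * MINUTE]
# INTERVAL(30m)
print(pick_tsma_window(windows, 30 * MINUTE))
# INTERVAL(30m, 10m) SLIDING(5m)
print(pick_tsma_window(windows, 30 * MINUTE, 10 * MINUTE, 5 * MINUTE))
```

Under these assumptions the first call selects the 10m window and the second falls back to the 5m window, which matches Example 1 in the documentation.
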
--- a/docs/en/14-reference/12-config/index.md
+++ b/docs/en/14-reference/12-config/index.md
@@ -241,6 +241,16 @@ Please note the `taoskeeper` needs to be installed and running to create the `lo
 | Default Value | 0          |
 | Notes         |  When this parameter is set to 0, last(\*)/last_row(\*)/first(\*) only returns the columns of the super table; When it is 1, return the columns and tags of the super table.               |

+### maxTsmaCalcDelay
+
+| Attribute     | Description                                                                                                                                  |
+| --------      | -------------------------------------------------------------------------------------------------------------------------------------------- |
+| Applicable    | Client only                                                                                                                                  |
+| Meaning       | The TSMA calculation delay that queries tolerate. If the calculation delay of a TSMA is greater than the configured value, that TSMA is not used. |
+| Value Range   | 600s - 86400s, that is, 10 minutes to 1 day                                                                                                  |
+| Default value | 600s                                                                                                                                         |
+
+
 ## Locale Parameters

 ### timezone
@@ -760,6 +770,15 @@ The charset that takes effect is UTF-8.
 | Value Range | 1-10000|
 | Default Value   | 20                  |

+### maxTsmaNum
+
+| Attribute | Description                   |
+| --------- | ----------------------------- |
+| Applicable | Server Only                  |
+| Meaning   | Maximum number of TSMAs that can be created in the cluster |
+| Value Range | 0-12                        |
+| Default Value | 8                         |
+
 ## 3.0 Parameters

 | #   |     **Parameter**      | **Applicable to 2.x ** | **Applicable to  3.0 **      | Current behavior in 3.0 |
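
Note on the two parameters added in this file: their documented ranges and defaults, and the rule that a TSMA is skipped once its calculation delay is greater than `maxTsmaCalcDelay`, can be summarized in a short Python sketch. The helper names `is_valid_tsma_param` and `tsma_is_fresh_enough` are hypothetical, and treating both range bounds as inclusive is an assumption; none of this is TDengine code.

```python
# Ranges and defaults as documented for the two parameters.
# Assumption: both bounds of each range are inclusive.
TSMA_PARAMS = {
    # seconds, client side
    "maxTsmaCalcDelay": {"min": 600, "max": 86400, "default": 600},
    # number of TSMAs, server side
    "maxTsmaNum": {"min": 0, "max": 12, "default": 8},
}


def is_valid_tsma_param(name: str, value: int) -> bool:
    """Check a value against the documented range of a TSMA parameter."""
    spec = TSMA_PARAMS[name]
    return spec["min"] <= value <= spec["max"]


def tsma_is_fresh_enough(calc_delay_s: int, max_tsma_calc_delay_s: int = 600) -> bool:
    """A TSMA is not used when its calculation delay exceeds the limit."""
    return calc_delay_s <= max_tsma_calc_delay_s


print(is_valid_tsma_param("maxTsmaCalcDelay", 3600))
print(is_valid_tsma_param("maxTsmaNum", 13))
print(tsma_is_fresh_enough(calc_delay_s=900))
```
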
--- a/docs/zh/12-taos-sql/07-tag-index.md
+++ b/docs/zh/12-taos-sql/07-tag-index.md
@@ -34,6 +34,13 @@ SELECT * FROM information_schema.INS_INDEXES

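 You can also add filter conditions to the query above to narrow the results.

+Alternatively, use the SHOW command to view the indexes on a specified table: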
+
+```sql
+SHOW INDEXES FROM tbl_name [FROM db_name];
+SHOW INDEXES FROM [db_name.]tbl_name;
+```
+
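 ## Usage Notes

 1. Used properly, indexes improve the efficiency of data filtering. The filter operators currently supported are `=`, `>`, `>=`, `<`, `<=`. If a query's filter conditions use these operators, the index can significantly improve query efficiency. If the filter conditions use other operators, the index has no effect and query efficiency is unchanged. More operators will be added over time.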
--- a/docs/zh/12-taos-sql/27-indexing.md
+++ b/docs/zh/12-taos-sql/27-indexing.md
@@ -1,66 +1,132 @@
 ---
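-sidebar_label: Indexing
-title: Indexing
-description: Details on using the indexing feature
+sidebar_label: Window Pre-Aggregation
+title: Window Pre-Aggregation
+description: Instructions for using Window Pre-Aggregation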
 ---

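-TDengine introduced indexing in version 3.0.0.0 and supports SMA indexes and tag indexes.
+To improve the performance of aggregate function queries on large datasets, you can create window pre-aggregation objects (TSMA, Time-Range Small Materialized Aggregates). A TSMA pre-computes the specified aggregate functions over fixed time windows and stores the results; queries then read the pre-computed results to improve performance.

-## Create an Index
+## Create a TSMA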

 ```sql
+-- Create a TSMA based on a super table or regular table
+CREATE TSMA tsma_name ON [dbname.]table_name FUNCTION (func_name(func_param) [, ...] ) INTERVAL(time_duration);
+-- Create a large-window TSMA based on a small-window TSMA
+CREATE RECURSIVE TSMA tsma_name ON [db_name.]tsma_name1 INTERVAL(time_duration);

-CREATE INDEX index_name ON tb_name index_option
-
-CREATE SMA INDEX index_name ON tb_name index_option
-
-index_option:
-    FUNCTION(functions) INTERVAL(interval_val [, interval_offset]) [SLIDING(sliding_val)] [WATERMARK(watermark_val)] [MAX_DELAY(max_delay_val)]
-
-functions:
-    function [, function] ...
+time_duration:
+    number unit
 ```
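-### Tag Index

- [tag index](../tag-index)
+To create a TSMA, specify the TSMA name, the table name, the function list, and the window size. When you create a TSMA based on an existing TSMA, that is, with the `RECURSIVE` keyword, you do not need to specify `FUNCTION()`; the new TSMA uses the same function list as the existing one, and its INTERVAL must be an integer multiple of the base TSMA's window.

-### SMA Index
+TSMA names follow rules similar to table names. The maximum length is the table name length limit minus the length of the output table suffix: the table name length limit is 193, the output table suffix is `_tsma_res_stb_`, and the maximum TSMA name length is 178.

-Performs pre-aggregation on the specified columns over the time window defined by the INTERVAL clause; the pre-aggregation type is specified by functions_string. An SMA index improves the performance of aggregate queries over the specified time period. Currently, only one SMA INDEX can be created per super table.
+A TSMA can be created only on a super table or a regular table, not on a subtable.

- The supported functions are MAX, MIN, and SUM.
- WATERMARK: The smallest unit is milliseconds. The value range is [0ms, 900000ms] and the default is 5 seconds. This option can be used only on super tables.
- MAX_DELAY: The smallest unit is milliseconds. The value range is [1ms, 900000ms] and the default is the value of interval (but not exceeding the maximum). This option can be used only on super tables. Note: Setting MAX_DELAY too small is not recommended, because results would be pushed too frequently, affecting storage and query performance; keep the default unless you have special requirements.
+The function list can contain only supported aggregate functions (see below), and each function must take exactly one parameter, even if the function itself supports more. The parameter must be an ordinary column name, not a tag column. Identical function and column pairs in the list are deduplicated; for example, if avg(c1) is specified twice, only one output is computed. When a TSMA is calculated, all `intermediate function results` are written to another super table, which also includes all tag columns of the original table. The maximum number of functions is the maximum number of columns allowed when creating a table (including tag columns), minus the four columns added for TSMA calculation, namely `_wstart`, `_wend`, `_wduration`, and a new tag column `tbname`, minus the number of tag columns in the original table. If the number of columns exceeds the limit, the error `Too many columns` is reported.
+
+Because the output of a TSMA is a super table, the row length of the output table is subject to the maximum row length limit. The `intermediate results` of different functions vary in size and are generally larger than the original data. If the row length of the output table exceeds the limit, the error `Row length exceeds max length` is reported. In that case, reduce the number of functions, or group the commonly used functions and split them across multiple TSMAs.
+
+The window size is limited to [1ms ~ 1h]. INTERVAL uses the same units as the INTERVAL clause in queries, such as a (milliseconds), b (nanoseconds), h (hours), m (minutes), s (seconds), and u (microseconds).
+
+A TSMA is a database-level object, but its name is globally unique. The total number of TSMAs that can be created in a cluster is limited by the parameter `maxTsmaNum`, which defaults to 8 and has the range [0-12]. Note that because TSMA background calculation uses stream computing, each TSMA created also creates a stream, so the number of TSMAs that can be created is further limited by the number of existing streams and the maximum number of streams that can be created.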
+
+## Supported Functions
+| Function | Notes |
+|---|---|
+|min||
+|max||
+|sum||
+|first||
+|last||
+|avg||
+|count| To use count(*), define count(ts) in the TSMA|
+|spread||
+|stddev||
+|hyperloglog||
+|||
+
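+## Drop TSMA
+```sql
+DROP TSMA [db_name.]tsma_name;
+```
+If other TSMAs were created based on the TSMA being dropped, the operation fails with the error `Invalid drop base tsma, drop recursive tsma first`. All recursive TSMAs must therefore be dropped first.
+
+## TSMA Calculation
+The calculation result of a TSMA is a super table in the same database as the original table. This table is not visible to users and cannot be dropped directly; it is removed automatically on `DROP TSMA`. TSMA calculation is performed by stream computing as an asynchronous background process, so the results are not guaranteed to be real-time, but they are guaranteed to be eventually correct.
+
+When there is a large amount of historical data, stream computing first processes the historical data after a TSMA is created, and the new TSMA is not used during this period. When data is updated or deleted, or when expired data arrives, the affected part is recalculated automatically. During recalculation, TSMA query results are not guaranteed to be real-time. To query real-time data, add the hint `/*+ skip_tsma() */` to the SQL statement or disable the parameter `querySmaOptimize` so that the query reads the original data.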
+
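+## Usage and Limitations of TSMA
+
+Client configuration parameter `querySmaOptimize` controls whether TSMA is used during queries: `True` uses TSMA, and `False` does not, so queries read the original data.
+
+Client configuration parameter `maxTsmaCalcDelay`, in seconds, controls the TSMA calculation delay that users can accept. If the calculation progress of a TSMA is within this range of the latest time, the TSMA is used; if it exceeds the range, the TSMA is not used. The default value is 600 (10 minutes), the minimum is 600 (10 minutes), and the maximum is 86400 (1 day).
+
+### Using TSMA During Queries
+
+Aggregate functions defined in a TSMA can be used directly in most query scenarios. If multiple TSMAs are available, the one with the larger window is preferred; unclosed windows are calculated from a smaller-window TSMA or from the original data. There are also scenarios in which TSMA cannot be used (see below); in those cases the entire query is calculated from the original data.
+
+A query that does not specify a window size uses, by default, the largest-window TSMA that contains all aggregate functions in the query. For example, `SELECT COUNT(*) FROM stable GROUP BY tbname` uses the TSMA with the largest window that contains count(ts). If aggregate queries are frequent, create TSMAs with windows as large as possible.
+
+When a window size is specified, that is, with the `INTERVAL` clause, the largest TSMA whose window evenly divides the query window is used. In window queries, the `INTERVAL` window size, `OFFSET`, and `SLIDING` all affect which TSMA window sizes can be used: a usable TSMA is one whose window size evenly divides the `INTERVAL`, `OFFSET`, and `SLIDING` of the query. If window queries are frequent, take the commonly queried window sizes, offsets, and sliding sizes into account when creating TSMAs.
+
+Example 1. Suppose one TSMA with a `5m` window and one with a `10m` window are created. A query with `INTERVAL(30m)` prefers the `10m` TSMA; a query with `INTERVAL(30m, 10m) SLIDING(5m)` can use only the `5m` TSMA.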
+
+
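+### Query Limitations
+
+When the parameter `querySmaOptimize` is enabled and there is no `skip_tsma()` hint, the following query scenarios cannot use TSMA:
+
+- The aggregate functions defined in a TSMA do not cover the function list of the current query.
+- The query uses a window other than `INTERVAL`, or the `INTERVAL` query window (including `INTERVAL`, `SLIDING`, and `OFFSET`) is not an integer multiple of the defined window. For example, a TSMA defined with a 2m window cannot serve a query with a 5-minute window, but a 1m window, if one exists, can.
+- The `WHERE` condition filters on any regular column (any column other than the primary key timestamp column).
+- `PARTITION` or `GROUP BY` includes any regular column or an expression on one.
+- A faster optimization applies, such as the last cache optimization. If the query meets the conditions for the last optimization, that is used first; only when it cannot be used is TSMA optimization considered.
+- The current calculation delay of the TSMA is greater than the configuration parameter `maxTsmaCalcDelay`.
+
+Some examples: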

 ```sql
-DROP DATABASE IF EXISTS d0;
-CREATE DATABASE d0;
-USE d0;
-CREATE TABLE IF NOT EXISTS st1 (ts timestamp, c1 int, c2 float, c3 double) TAGS (t1 int unsigned);
-CREATE TABLE ct1 USING st1 TAGS(1000);
-CREATE TABLE ct2 USING st1 TAGS(2000);
-INSERT INTO ct1 VALUES(now+0s, 10, 2.0, 3.0);
-INSERT INTO ct1 VALUES(now+1s, 11, 2.1, 3.1)(now+2s, 12, 2.2, 3.2)(now+3s, 13, 2.3, 3.3);
-CREATE SMA INDEX sma_index_name1 ON st1 FUNCTION(max(c1),max(c2),min(c1)) INTERVAL(5m,10s) SLIDING(5m) WATERMARK 5s MAX_DELAY 1m;
- query from SMA Index
-ALTER LOCAL 'querySmaOptimize' '1';
-SELECT max(c2),min(c1) FROM st1 INTERVAL(5m,10s) SLIDING(5m);
-SELECT _wstart,_wend,_wduration,max(c2),min(c1) FROM st1 INTERVAL(5m,10s) SLIDING(5m);
- query from raw data
-ALTER LOCAL 'querySmaOptimize' '0';
+SELECT agg_func_list [, pseudo_col_list] FROM stable WHERE exprs [GROUP/PARTITION BY [tbname] [, tag_list]] [HAVING ...] [INTERVAL(time_duration, offset) SLIDING(duration)]...;
+
+-- create
+CREATE TSMA tsma1 ON stable FUNCTION(COUNT(ts), SUM(c1), SUM(c3), MIN(c1), MIN(c3), AVG(c1)) INTERVAL(1m);
+-- query
+SELECT COUNT(*), SUM(c1) + SUM(c3) FROM stable; ---- use tsma1
+SELECT COUNT(*), AVG(c1) FROM stable GROUP/PARTITION BY tbname, tag1, tag2;  --- use tsma1
+SELECT COUNT(*), MIN(c1) FROM stable INTERVAL(1h);  ---use tsma1
+SELECT COUNT(*), MIN(c1), SPREAD(c1) FROM stable INTERVAL(1h); ----- can't use, spread func not defined, although SPREAD can be calculated by MIN and MAX which are defined.
+SELECT COUNT(*), MIN(c1) FROM stable INTERVAL(30s); ----- can't use tsma1, time_duration not fit. Normally, query_time_duration should be multiple of create_duration.
+SELECT COUNT(*), MIN(c1) FROM stable where c2 > 0; ---- can't use tsma1, can't do c2 filtering
+SELECT COUNT(*) FROM stable GROUP BY c2; ---- can't use any tsma
+SELECT MIN(c3), MIN(c2) FROM stable INTERVAL(1m); ---- can't use tsma1, c2 is not defined in tsma1.
+
+-- Another tsma2 created with INTERVAL(1h) based on tsma1
+CREATE RECURSIVE TSMA tsma2 on tsma1 INTERVAL(1h);
+SELECT COUNT(*), SUM(c1) FROM stable; ---- use tsma2
+SELECT COUNT(*), AVG(c1) FROM stable GROUP/PARTITION BY tbname, tag1, tag2;  --- use tsma2
+SELECT COUNT(*), MIN(c1) FROM stable INTERVAL(2h);  ---use tsma2
+SELECT COUNT(*), MIN(c1) FROM stable WHERE ts < '2023-01-01 10:10:10' INTERVAL(30m); --use tsma1
+SELECT COUNT(*), MIN(c1) + MIN(c3) FROM stable INTERVAL(30m);  ---use tsma1
+SELECT COUNT(*), MIN(c1) FROM stable INTERVAL(1h) SLIDING(30m);  ---use tsma1
+SELECT COUNT(*), MIN(c1), SPREAD(c1) FROM stable INTERVAL(1h); ----- can't use tsma1 or tsma2, spread func not defined
+SELECT COUNT(*), MIN(c1) FROM stable INTERVAL(30s); ----- can't use tsma1 or tsma2, time_duration not fit. Normally, query_time_duration should be multiple of create_duration.
+SELECT COUNT(*), MIN(c1) FROM stable where c2 > 0; ---- can't use tsma1 or tsma2, can't do c2 filtering
 ```

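-## Delete an Index
+### Usage Limitations

+After a TSMA is created, the following restrictions apply to operations on the original super table:
+
+- All TSMAs on the table must be deleted before the table itself can be deleted.
+- None of the tag columns of the original table can be deleted, and tag column names and subtable tag values cannot be modified. To delete a tag column, delete the TSMA first.
+- Columns used by a TSMA cannot be deleted; delete the TSMA first. Adding columns is not affected, but newly added columns are not part of any TSMA, so to pre-aggregate them you must create another TSMA.
+
+## Show TSMA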
 ```sql
-DROP INDEX index_name;
+SHOW [db_name.]TSMAS;
+SELECT * FROM information_schema.ins_tsma;
 ```
-
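-## View Indices
-
-````sql
-SHOW INDEXES FROM tbl_name [FROM db_name];
-SHOW INDEXES FROM [db_name.]tbl_name;
-````
-
-Shows the indexes that have been created on the specified database or table.
+If many functions were specified at creation and the column names are long, the function list may be truncated when displayed (the maximum supported output is currently 256KB).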
--- a/docs/zh/14-reference/12-config/index.md
+++ b/docs/zh/14-reference/12-config/index.md
@@ -240,6 +240,16 @@ taos -C
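 | Default Value | 0 |
 | Notes | When this parameter is set to 0, last(\*)/last_row(\*)/first(\*) return only the regular columns of the super table; when it is 1, they return the regular columns and tag columns of the super table. |

+### maxTsmaCalcDelay
+
+| Attribute | Description |
+| -------- | ---------------------------------------------------------------------------------------------------------------------------------------------- |
+| Applicable | Client only |
+| Meaning | The TSMA calculation delay that the client tolerates during queries. If the calculation delay of a TSMA is greater than the configured value, that TSMA is not used. |
+| Value Range | 600s - 86400s, that is, 10 minutes to 1 day |
+| Default Value | 600s |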
+
+
 ## Locale Parameters

 ### timezone
@@ -745,6 +755,15 @@ The valid value of charset is UTF-8.
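 | Value Range | 1-10000 |
 | Default Value | 20 |

+### maxTsmaNum
+
+| Attribute | Description |
+| -------- | --------------------------- |
+| Applicable | Server only |
+| Meaning | Number of TSMAs that can be created in the cluster |
+| Value Range | 0-12 |
+| Default Value | 8 |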
+
 ## Compression Parameters

 ### compressMsgSize
--- a/source/client/src/clientRawBlockWrite.c
+++ b/source/client/src/clientRawBlockWrite.c
@@ -736,8 +736,8 @@ static int32_t taosCreateStb(TAOS* taos, void* meta, int32_t metaLen) {
    SSchema*          pSchema = req.schemaRow.pSchema + i;
    SFieldWithOptions field = {.type = pSchema->type, .flags = pSchema->flags, .bytes = pSchema->bytes};
    strcpy(field.name, pSchema->name);
-    // todo get active compress param
-    setDefaultOptionsForField(&field);
+    SColCmpr *p = &req.colCmpr.pColCmpr[i];
+    field.compress = p->alg;
    taosArrayPush(pReq.pColumns, &field);
  }
  pReq.pTags = taosArrayInit(req.schemaTag.nCols, sizeof(SField));
--- a/source/client/src/clientSml.c
+++ b/source/client/src/clientSml.c
@@ -1290,17 +1290,24 @@ end:
  return code;
 }

-static void smlInsertMeta(SHashObj *metaHash, SArray *metaArray, SArray *cols) {
+static int32_t smlInsertMeta(SHashObj *metaHash, SArray *metaArray, SArray *cols, SHashObj *checkDuplicate) {
+  terrno = 0;
  for (int16_t i = 0; i < taosArrayGetSize(cols); ++i) {
    SSmlKv *kv = (SSmlKv *)taosArrayGet(cols, i);
    int     ret = taosHashPut(metaHash, kv->key, kv->keyLen, &i, SHORT_BYTES);
    if (ret == 0) {
      taosArrayPush(metaArray, kv);
+      if(taosHashGet(checkDuplicate, kv->key, kv->keyLen) != NULL) {
+        return TSDB_CODE_PAR_DUPLICATED_COLUMN;
+      }
+    }else if(terrno == TSDB_CODE_DUP_KEY){
+      return TSDB_CODE_PAR_DUPLICATED_COLUMN;
    }
  }
+  return TSDB_CODE_SUCCESS;
 }

-static int32_t smlUpdateMeta(SHashObj *metaHash, SArray *metaArray, SArray *cols, bool isTag, SSmlMsgBuf *msg) {
+static int32_t smlUpdateMeta(SHashObj *metaHash, SArray *metaArray, SArray *cols, bool isTag, SSmlMsgBuf *msg, SHashObj* checkDuplicate) {
  for (int i = 0; i < taosArrayGetSize(cols); ++i) {
    SSmlKv *kv = (SSmlKv *)taosArrayGet(cols, i);

@@ -1332,6 +1339,11 @@ static int32_t smlUpdateMeta(SHashObj *metaHash, SArray *metaArray, SArray *cols
      int     ret = taosHashPut(metaHash, kv->key, kv->keyLen, &size, SHORT_BYTES);
      if (ret == 0) {
        taosArrayPush(metaArray, kv);
+        if(taosHashGet(checkDuplicate, kv->key, kv->keyLen) != NULL) {
+          return TSDB_CODE_PAR_DUPLICATED_COLUMN;
+        }
+      }else{
+        return ret;
      }
    }
  }
@@ -1456,7 +1468,7 @@ static int32_t smlPushCols(SArray *colsArray, SArray *cols) {
    taosHashPut(kvHash, kv->key, kv->keyLen, &kv, POINTER_BYTES);
    if (terrno == TSDB_CODE_DUP_KEY) {
      taosHashCleanup(kvHash);
-      return terrno;
+      return TSDB_CODE_PAR_DUPLICATED_COLUMN;
    }
  }

@@ -1512,12 +1524,12 @@ static int32_t smlParseLineBottom(SSmlHandle *info) {
    if (tableMeta) {  // update meta
      uDebug("SML:0x%" PRIx64 " smlParseLineBottom update meta, format:%d, linenum:%d", info->id, info->dataFormat,
             info->lineNum);
-      ret = smlUpdateMeta((*tableMeta)->colHash, (*tableMeta)->cols, elements->colArray, false, &info->msgBuf);
+      ret = smlUpdateMeta((*tableMeta)->colHash, (*tableMeta)->cols, elements->colArray, false, &info->msgBuf, (*tableMeta)->tagHash);
      if (ret == TSDB_CODE_SUCCESS) {
-        ret = smlUpdateMeta((*tableMeta)->tagHash, (*tableMeta)->tags, tinfo->tags, true, &info->msgBuf);
+        ret = smlUpdateMeta((*tableMeta)->tagHash, (*tableMeta)->tags, tinfo->tags, true, &info->msgBuf, (*tableMeta)->colHash);
      }
      if (ret != TSDB_CODE_SUCCESS) {
-        uError("SML:0x%" PRIx64 " smlUpdateMeta failed", info->id);
+        uError("SML:0x%" PRIx64 " smlUpdateMeta failed, ret:%d", info->id, ret);
        return ret;
      }
    } else {
@@ -1527,13 +1539,19 @@ static int32_t smlParseLineBottom(SSmlHandle *info) {
      if (meta == NULL) {
        return TSDB_CODE_OUT_OF_MEMORY;
      }
-      taosHashPut(info->superTables, elements->measure, elements->measureLen, &meta, POINTER_BYTES);
-      terrno = 0;
-      smlInsertMeta(meta->tagHash, meta->tags, tinfo->tags);
-      if (terrno == TSDB_CODE_DUP_KEY) {
-        return terrno;
+      ret = taosHashPut(info->superTables, elements->measure, elements->measureLen, &meta, POINTER_BYTES);
+      if (ret != TSDB_CODE_SUCCESS) {
+        uError("SML:0x%" PRIx64 " put measure to hash failed", info->id);
+        return ret;
+      }
+      ret = smlInsertMeta(meta->tagHash, meta->tags, tinfo->tags, NULL);
+      if (ret == TSDB_CODE_SUCCESS) {
+        ret = smlInsertMeta(meta->colHash, meta->cols, elements->colArray, meta->tagHash);
+      }
+      if (ret != TSDB_CODE_SUCCESS) {
+        uError("SML:0x%" PRIx64 " insert meta failed:%s", info->id, tstrerror(ret));
+        return ret;
      }
-      smlInsertMeta(meta->colHash, meta->cols, elements->colArray);
    }
  }
  uDebug("SML:0x%" PRIx64 " smlParseLineBottom end, format:%d, linenum:%d", info->id, info->dataFormat, info->lineNum);
--- a/source/dnode/vnode/src/tsdb/tsdbReaderWriter.c
+++ b/source/dnode/vnode/src/tsdb/tsdbReaderWriter.c
@ -14,8 +14,8 @@
 */

 #include "cos.h"
-#include "tsdb.h"
 #include "crypt.h"
+#include "tsdb.h"
 #include "vnd.h"

 static int32_t tsdbOpenFileImpl(STsdbFD *pFD) {
@ -61,6 +61,7 @@ static int32_t tsdbOpenFileImpl(STsdbFD *pFD) {
      // taosMemoryFree(pFD);
      goto _exit;
    }
+    pFD->s3File = 1;
    /*
    const char *object_name = taosDirEntryBaseName((char *)path);
    long        s3_size = 0;
@ -86,7 +87,6 @@ static int32_t tsdbOpenFileImpl(STsdbFD *pFD) {
        goto _exit;
      }
 #else
-      pFD->s3File = 1;
      pFD->pFD = (TdFilePtr)&pFD->s3File;
      int32_t vid = 0;
      sscanf(object_name, "v%df%dver%" PRId64 ".data", &vid, &pFD->fid, &pFD->cid);
@ -170,7 +170,7 @@ void tsdbCloseFile(STsdbFD **ppFD) {
  }
 }

-static int32_t tsdbWriteFilePage(STsdbFD *pFD, int32_t encryptAlgorithm, char* encryptKey) {
+static int32_t tsdbWriteFilePage(STsdbFD *pFD, int32_t encryptAlgorithm, char *encryptKey) {
  int32_t code = 0;

  if (!pFD->pFD) {
@ -182,7 +182,7 @@ static int32_t tsdbWriteFilePage(STsdbFD *pFD, int32_t encryptAlgorithm, char* e

  if (pFD->pgno > 0) {
    int64_t offset = PAGE_OFFSET(pFD->pgno, pFD->szPage);
-    if (pFD->lcn > 1) {
+    if (pFD->s3File && pFD->lcn > 1) {
      SVnodeCfg *pCfg = &pFD->pTsdb->pVnode->config;
      int64_t    chunksize = (int64_t)pCfg->tsdbPageSize * pCfg->s3ChunkSize;
      int64_t    chunkoffset = chunksize * (pFD->lcn - 1);
@ -198,27 +198,27 @@ static int32_t tsdbWriteFilePage(STsdbFD *pFD, int32_t encryptAlgorithm, char* e
    }

    taosCalcChecksumAppend(0, pFD->pBuf, pFD->szPage);
-    
-    if(encryptAlgorithm == DND_CA_SM4){
-    //if(tsiEncryptAlgorithm == DND_CA_SM4 && (tsiEncryptScope & DND_CS_TSDB) == DND_CS_TSDB){
-      unsigned char		PacketData[128];
-      int		NewLen;
-      int32_t count = 0;
+
+    if (encryptAlgorithm == DND_CA_SM4) {
+      // if(tsiEncryptAlgorithm == DND_CA_SM4 && (tsiEncryptScope & DND_CS_TSDB) == DND_CS_TSDB){
+      unsigned char PacketData[128];
+      int           NewLen;
+      int32_t       count = 0;
      while (count < pFD->szPage) {
        SCryptOpts opts = {0};
        opts.len = 128;
        opts.source = pFD->pBuf + count;
        opts.result = PacketData;
        opts.unitLen = 128;
-        //strncpy(opts.key, tsEncryptKey, 16);
+        // strncpy(opts.key, tsEncryptKey, 16);
        strncpy(opts.key, encryptKey, ENCRYPT_KEY_LEN);

        NewLen = CBC_Encrypt(&opts);

        memcpy(pFD->pBuf + count, PacketData, NewLen);
-        count += NewLen; 
+        count += NewLen;
      }
-      //tsdbDebug("CBC_Encrypt count:%d %s", count, __FUNCTION__);
+      // tsdbDebug("CBC_Encrypt count:%d %s", count, __FUNCTION__);
    }

    n = taosWriteFile(pFD->pFD, pFD->pBuf, pFD->szPage);
@ -237,7 +237,7 @@ _exit:
  return code;
 }

-static int32_t tsdbReadFilePage(STsdbFD *pFD, int64_t pgno, int32_t encryptAlgorithm, char* encryptKey) {
+static int32_t tsdbReadFilePage(STsdbFD *pFD, int64_t pgno, int32_t encryptAlgorithm, char *encryptKey) {
  int32_t code = 0;

  // ASSERT(pgno <= pFD->szFile);
@ -297,20 +297,19 @@ static int32_t tsdbReadFilePage(STsdbFD *pFD, int64_t pgno, int32_t encryptAlgor
  }
  //}

-  if(encryptAlgorithm == DND_CA_SM4){
-  //if(tsiEncryptAlgorithm == DND_CA_SM4 && (tsiEncryptScope & DND_CS_TSDB) == DND_CS_TSDB){
-    unsigned char		PacketData[128];
-    int		NewLen;
+  if (encryptAlgorithm == DND_CA_SM4) {
+    // if(tsiEncryptAlgorithm == DND_CA_SM4 && (tsiEncryptScope & DND_CS_TSDB) == DND_CS_TSDB){
+    unsigned char PacketData[128];
+    int           NewLen;

    int32_t count = 0;
-    while(count < pFD->szPage)
-    {
+    while (count < pFD->szPage) {
      SCryptOpts opts = {0};
      opts.len = 128;
      opts.source = pFD->pBuf + count;
      opts.result = PacketData;
      opts.unitLen = 128;
-      //strncpy(opts.key, tsEncryptKey, 16);
+      // strncpy(opts.key, tsEncryptKey, 16);
      strncpy(opts.key, encryptKey, ENCRYPT_KEY_LEN);

      NewLen = CBC_Decrypt(&opts);
@ -318,7 +317,7 @@ static int32_t tsdbReadFilePage(STsdbFD *pFD, int64_t pgno, int32_t encryptAlgor
      memcpy(pFD->pBuf + count, PacketData, NewLen);
      count += NewLen;
    }
-    //tsdbDebug("CBC_Decrypt count:%d %s", count, __FUNCTION__);
+    // tsdbDebug("CBC_Decrypt count:%d %s", count, __FUNCTION__);
  }

  // check
@ -333,8 +332,8 @@ _exit:
  return code;
 }

-int32_t tsdbWriteFile(STsdbFD *pFD, int64_t offset, const uint8_t *pBuf, int64_t size, int32_t encryptAlgorithm, 
-                      char* encryptKey) {
+int32_t tsdbWriteFile(STsdbFD *pFD, int64_t offset, const uint8_t *pBuf, int64_t size, int32_t encryptAlgorithm,
+                      char *encryptKey) {
  int32_t code = 0;
  int64_t fOffset = LOGIC_TO_FILE_OFFSET(offset, pFD->szPage);
  int64_t pgno = OFFSET_PGNO(fOffset, pFD->szPage);
@ -366,8 +365,8 @@ _exit:
  return code;
 }

-static int32_t tsdbReadFileImp(STsdbFD *pFD, int64_t offset, uint8_t *pBuf, int64_t size, int32_t encryptAlgorithm, 
-                                char* encryptKey) {
+static int32_t tsdbReadFileImp(STsdbFD *pFD, int64_t offset, uint8_t *pBuf, int64_t size, int32_t encryptAlgorithm,
+                               char *encryptKey) {
  int32_t code = 0;
  int64_t n = 0;
  int64_t fOffset = LOGIC_TO_FILE_OFFSET(offset, pFD->szPage);
@ -572,8 +571,8 @@ _exit:
  return code;
 }

-int32_t tsdbReadFile(STsdbFD *pFD, int64_t offset, uint8_t *pBuf, int64_t size, int64_t szHint, 
-                    int32_t encryptAlgorithm, char* encryptKey) {
+int32_t tsdbReadFile(STsdbFD *pFD, int64_t offset, uint8_t *pBuf, int64_t size, int64_t szHint,
+                     int32_t encryptAlgorithm, char *encryptKey) {
  int32_t code = 0;
  if (!pFD->pFD) {
    code = tsdbOpenFileImpl(pFD);
@ -582,7 +581,7 @@ int32_t tsdbReadFile(STsdbFD *pFD, int64_t offset, uint8_t *pBuf, int64_t size,
    }
  }

-  if (pFD->lcn > 1 /*pFD->s3File && tsS3BlockSize < 0*/) {
+  if (pFD->s3File && pFD->lcn > 1 /* && tsS3BlockSize < 0*/) {
    return tsdbReadFileS3(pFD, offset, pBuf, size, szHint);
  } else {
    return tsdbReadFileImp(pFD, offset, pBuf, size, encryptAlgorithm, encryptKey);
@ -593,20 +592,19 @@ _exit:
 }

 int32_t tsdbReadFileToBuffer(STsdbFD *pFD, int64_t offset, int64_t size, SBuffer *buffer, int64_t szHint,
-                            int32_t encryptAlgorithm, char* encryptKey) {
+                             int32_t encryptAlgorithm, char *encryptKey) {
  int32_t code;

  code = tBufferEnsureCapacity(buffer, buffer->size + size);
  if (code) return code;
-  code = tsdbReadFile(pFD, offset, (uint8_t *)tBufferGetDataEnd(buffer), size, szHint,
-                      encryptAlgorithm, encryptKey);
+  code = tsdbReadFile(pFD, offset, (uint8_t *)tBufferGetDataEnd(buffer), size, szHint, encryptAlgorithm, encryptKey);
  if (code) return code;
  buffer->size += size;

  return code;
 }

-int32_t tsdbFsyncFile(STsdbFD *pFD, int32_t encryptAlgorithm, char* encryptKey) {
+int32_t tsdbFsyncFile(STsdbFD *pFD, int32_t encryptAlgorithm, char *encryptKey) {
  int32_t code = 0;
  /*
  if (pFD->s3File) {
@ -726,7 +724,7 @@ int32_t tsdbReadBlockIdx(SDataFReader *pReader, SArray *aBlockIdx) {

  // read
  int32_t encryptAlgorithm = pReader->pTsdb->pVnode->config.tsdbCfg.encryptAlgorithm;
-  char* encryptKey = pReader->pTsdb->pVnode->config.tsdbCfg.encryptKey;
+  char   *encryptKey = pReader->pTsdb->pVnode->config.tsdbCfg.encryptKey;
  code = tsdbReadFile(pReader->pHeadFD, offset, pReader->aBuf[0], size, 0, encryptAlgorithm, encryptKey);
  if (code) goto _err;

@ -765,7 +763,7 @@ int32_t tsdbReadSttBlk(SDataFReader *pReader, int32_t iStt, SArray *aSttBlk) {

  // read
  int32_t encryptAlgorithm = pReader->pTsdb->pVnode->config.tsdbCfg.encryptAlgorithm;
-  char* encryptKey = pReader->pTsdb->pVnode->config.tsdbCfg.encryptKey;
+  char   *encryptKey = pReader->pTsdb->pVnode->config.tsdbCfg.encryptKey;
  code = tsdbReadFile(pReader->aSttFD[iStt], offset, pReader->aBuf[0], size, 0, encryptAlgorithm, encryptKey);
  if (code) goto _err;

@ -800,7 +798,7 @@ int32_t tsdbReadDataBlk(SDataFReader *pReader, SBlockIdx *pBlockIdx, SMapData *m

  // read
  int32_t encryptAlgorithm = pReader->pTsdb->pVnode->config.tsdbCfg.encryptAlgorithm;
-  char* encryptKey = pReader->pTsdb->pVnode->config.tsdbCfg.encryptKey;
+  char   *encryptKey = pReader->pTsdb->pVnode->config.tsdbCfg.encryptKey;
  code = tsdbReadFile(pReader->pHeadFD, offset, pReader->aBuf[0], size, 0, encryptAlgorithm, encryptKey);
  if (code) goto _err;

@ -895,7 +893,7 @@ int32_t tsdbReadDelDatav1(SDelFReader *pReader, SDelIdx *pDelIdx, SArray *aDelDa

  // read
  int32_t encryptAlgorithm = pReader->pTsdb->pVnode->config.tsdbCfg.encryptAlgorithm;
-  char* encryptKey = pReader->pTsdb->pVnode->config.tsdbCfg.encryptKey;
+  char   *encryptKey = pReader->pTsdb->pVnode->config.tsdbCfg.encryptKey;
  code = tsdbReadFile(pReader->pReadH, offset, pReader->aBuf[0], size, 0, encryptAlgorithm, encryptKey);
  if (code) goto _err;

@ -937,7 +935,7 @@ int32_t tsdbReadDelIdx(SDelFReader *pReader, SArray *aDelIdx) {

  // read
  int32_t encryptAlgorithm = pReader->pTsdb->pVnode->config.tsdbCfg.encryptAlgorithm;
-  char* encryptKey = pReader->pTsdb->pVnode->config.tsdbCfg.encryptKey;
+  char   *encryptKey = pReader->pTsdb->pVnode->config.tsdbCfg.encryptKey;
  code = tsdbReadFile(pReader->pReadH, offset, pReader->aBuf[0], size, 0, encryptAlgorithm, encryptKey);
  if (code) goto _err;

--- a/source/dnode/vnode/src/tsdb/tsdbRetention.c
+++ b/source/dnode/vnode/src/tsdb/tsdbRetention.c
@ -528,7 +528,7 @@ static int32_t tsdbMigrateDataFileLCS3(SRTNer *rtner, const STFileObj *fobj, int
  if (fdFrom == NULL) code = terrno;
  TSDB_CHECK_CODE(code, lino, _exit);

-  tsdbInfo("vgId: %d, open lcfile: %s size: %" PRId64, TD_VID(rtner->tsdb->pVnode), fname, lc_size);
+  tsdbInfo("vgId:%d, open lcfile: %s size: %" PRId64, TD_VID(rtner->tsdb->pVnode), fname, lc_size);

  snprintf(dot2 + 1, TSDB_FQDN_LEN - (dot2 + 1 - object_name), "%d.data", lcn);
  fdTo = taosOpenFile(fname, TD_FILE_WRITE | TD_FILE_CREATE | TD_FILE_TRUNC);
@ -671,7 +671,7 @@ static int32_t tsdbDoS3MigrateOnFileSet(SRTNer *rtner, STFileSet *fset) {
  int64_t chunksize = (int64_t)pCfg->tsdbPageSize * pCfg->s3ChunkSize;
  int32_t lcn = fobj->f->lcn;

-  if (lcn < 1 && taosCheckExistFile(fobj->fname)) {
+  if (/*lcn < 1 && */ taosCheckExistFile(fobj->fname)) {
    int32_t mtime = 0;
    int64_t size = 0;
    taosStatFile(fobj->fname, &size, &mtime, NULL);
--- a/source/libs/catalog/src/ctgCache.c
+++ b/source/libs/catalog/src/ctgCache.c
@ -596,6 +596,7 @@ int32_t ctgCopyTbMeta(SCatalog *pCtg, SCtgTbMetaCtx *ctx, SCtgDBCache **pDb, SCt
  }

  memcpy(&(*pTableMeta)->sversion, &stbMeta->sversion, metaSize - sizeof(SCTableMeta));
+  (*pTableMeta)->schemaExt = NULL;

  return TSDB_CODE_SUCCESS;
 }
@ -2883,14 +2884,24 @@ int32_t ctgGetTbMetasFromCache(SCatalog *pCtg, SRequestConnInfo *pConn, SCtgTbMe
    SMetaRes    res = {0};
    STableMeta *pTableMeta = NULL;
    if (tbMeta->tableType != TSDB_CHILD_TABLE) {
+      int32_t schemaExtSize = 0;
      int32_t metaSize = CTG_META_SIZE(tbMeta);
-      pTableMeta = taosMemoryCalloc(1, metaSize);
+      if (tbMeta->schemaExt != NULL) {
+        schemaExtSize = tbMeta->tableInfo.numOfColumns * sizeof(SSchemaExt);
+      }
+      pTableMeta = taosMemoryCalloc(1, metaSize + schemaExtSize);
      if (NULL == pTableMeta) {
        ctgReleaseTbMetaToCache(pCtg, dbCache, pCache);
        CTG_ERR_RET(TSDB_CODE_OUT_OF_MEMORY);
      }

      memcpy(pTableMeta, tbMeta, metaSize);
+      if (tbMeta->schemaExt != NULL) {
+        pTableMeta->schemaExt = (SSchemaExt *)((char *)pTableMeta + metaSize);
+        memcpy(pTableMeta->schemaExt, tbMeta->schemaExt, schemaExtSize);
+      } else {
+        pTableMeta->schemaExt = NULL;
+      }

      CTG_UNLOCK(CTG_READ, &pCache->metaLock);
      taosHashRelease(dbCache->tbCache, pCache);
@ -2999,6 +3010,7 @@ int32_t ctgGetTbMetasFromCache(SCatalog *pCtg, SRequestConnInfo *pConn, SCtgTbMe
    }

    memcpy(&pTableMeta->sversion, &stbMeta->sversion, metaSize - sizeof(SCTableMeta));
+    pTableMeta->schemaExt = NULL;

    CTG_UNLOCK(CTG_READ, &pCache->metaLock);
    taosHashRelease(dbCache->tbCache, pCache);
--- a/source/libs/parser/src/parUtil.c
+++ b/source/libs/parser/src/parUtil.c
@ -315,19 +315,15 @@ STableMeta* tableMetaDup(const STableMeta* pTableMeta) {
  size_t schemaExtSize = hasSchemaExt ? pTableMeta->tableInfo.numOfColumns * sizeof(SSchemaExt) : 0;

  size_t      size = sizeof(STableMeta) + numOfFields * sizeof(SSchema);
-  int32_t     cpSize = sizeof(STableMeta) - sizeof(void*);
  STableMeta* p = taosMemoryMalloc(size + schemaExtSize);
-
  if (NULL == p) return NULL;

-  memcpy(p, pTableMeta, cpSize);
+  memcpy(p, pTableMeta, schemaExtSize+size);
  if (hasSchemaExt) {
    p->schemaExt = (SSchemaExt*)(((char*)p) + size);
-    memcpy(p->schemaExt, pTableMeta->schemaExt, schemaExtSize);
  } else {
    p->schemaExt = NULL;
  }
-  memcpy(p->schema, pTableMeta->schema, numOfFields * sizeof(SSchema));
  return p;
 }

--- a/tests/army/community/cluster/snapshot.py
+++ b/tests/army/community/cluster/snapshot.py
@ -30,7 +30,7 @@ from frame.srvCtl import *

 class TDTestCase(TBase):
    updatecfgDict = {
-        "countAlwaysReturnValue" : "0",
+        "countAlwaysReturnValue" : "1",
        "lossyColumns"           : "float,double",
        "fPrecision"             : "0.000000001",
        "dPrecision"             : "0.00000000000000001",
@ -106,7 +106,7 @@ class TDTestCase(TBase):
        # check count always return value
        sql = f"select count(*) from {self.db}.ta"
        tdSql.query(sql)
-        tdSql.checkRows(0) # countAlwaysReturnValue is false
+        tdSql.checkRows(1) # countAlwaysReturnValue is true

    # run
    def run(self):
--- a/tests/army/community/storage/compressBasic.py
+++ b/tests/army/community/storage/compressBasic.py
@ -118,18 +118,10 @@ class TDTestCase(TBase):
        sql = f"describe {self.db}.{self.stb}"
        tdSql.query(sql)

-        '''
        # see AutoGen.types 
        defEncodes = [ "delta-i","delta-i","simple8b","simple8b","simple8b","simple8b","simple8b","simple8b",
                       "simple8b","simple8b","delta-d","delta-d","bit-packing",
-                       "disabled","disabled","disabled","disabled","disabled"]
-        '''
-        
-        # pass-ci have error
-        defEncodes = [ "delta-i","delta-i","simple8b","simple8b","simple8b","simple8b","simple8b","simple8b",
-                       "simple8b","simple8b","delta-d","delta-d","bit-packing",
-                       "disabled","disabled","disabled","disabled","simple8b"]
-        
+                       "disabled","disabled","disabled","disabled"]        

        count = tdSql.getRows()
        for i in range(count):
--- a/tests/army/enterprise/s3/s3Basic.json
+++ b/tests/army/enterprise/s3/s3Basic.json
@ -32,7 +32,7 @@
                {
                    "name": "stb",
                    "child_table_exists": "no",
-                    "childtable_count": 10,
+                    "childtable_count": 6,
                    "insert_rows": 2000000,
                    "childtable_prefix": "d",
                    "insert_mode": "taosc",
--- a/tests/army/enterprise/s3/s3Basic.py
+++ b/tests/army/enterprise/s3/s3Basic.py
@ -38,6 +38,10 @@ s3EndPoint     http://192.168.1.52:9000
 s3AccessKey    'zOgllR6bSnw2Ah3mCNel:cdO7oXAu3Cqdb1rUdevFgJMi0LtRwCXdWKQx4bhX'
 s3BucketName   ci-bucket
 s3UploadDelaySec 60
+
+for test:
+"s3AccessKey" : "fGPPyYjzytw05nw44ViA:vK1VcwxgSOykicx6hk8fL1x15uEtyDSFU3w4hTaZ"
+"s3BucketName": "test-bucket"
 '''


@ -63,7 +67,7 @@ class TDTestCase(TBase):

        tdSql.execute(f"use {self.db}")
        # come from s3_basic.json
-        self.childtable_count = 10
+        self.childtable_count = 6
        self.insert_rows = 2000000
        self.timestamp_step = 1000

@ -85,7 +89,7 @@ class TDTestCase(TBase):
            fileName = cols[8]
            #print(f" filesize={fileSize} fileName={fileName}  line={line}")
            if fileSize > maxFileSize:
-                tdLog.info(f"error, {fileSize} over max size({maxFileSize})\n")
+                tdLog.info(f"error, {fileSize} over max size({maxFileSize}) {fileName}\n")
                overCnt += 1
            else:
                tdLog.info(f"{fileName}({fileSize}) check size passed.")
@ -99,7 +103,7 @@ class TDTestCase(TBase):
        loop = 0
        rets = []
        overCnt = 0
-        while loop < 180:
+        while loop < 100:
            time.sleep(3)

            # check upload to s3
@ -335,7 +339,7 @@ class TDTestCase(TBase):
            self.snapshotAgg()
            self.doAction()
            self.checkAggCorrect()
-            self.checkInsertCorrect(difCnt=self.childtable_count*999999)
+            self.checkInsertCorrect(difCnt=self.childtable_count*1499999)
            self.checkDelete()
            self.doAction()

--- a/tests/army/enterprise/s3/s3Basic1.json
+++ b/tests/army/enterprise/s3/s3Basic1.json
@ -32,7 +32,7 @@
                {
                    "name": "stb",
                    "child_table_exists": "yes",
-                    "childtable_count": 10,
+                    "childtable_count": 6,
                    "insert_rows": 1000000,
                    "childtable_prefix": "d",
                    "insert_mode": "taosc",
--- a/tests/army/frame/caseBase.py
+++ b/tests/army/frame/caseBase.py
@ -140,7 +140,7 @@ class TBase:

        # check step
        sql = f"select count(*) from (select diff(ts) as dif from {self.stb} partition by tbname order by ts desc) where dif != {self.timestamp_step}"
-        #tdSql.checkAgg(sql, difCnt)
+        tdSql.checkAgg(sql, difCnt)

    # save agg result
    def snapshotAgg(self):        
--- a/utils/test/c/sml_test.c
+++ b/utils/test/c/sml_test.c
@ -1789,6 +1789,89 @@ int sml_td24559_Test() {
  return code;
 }

+int sml_td29691_Test() {
+  TAOS *taos = taos_connect("localhost", "root", "taosdata", NULL, 0);
+
+  TAOS_RES *pRes = taos_query(taos, "drop database if exists td29691");
+  taos_free_result(pRes);
+
+  pRes = taos_query(taos, "create database if not exists td29691");
+  taos_free_result(pRes);
+
+  // check column name duplication
+  const char *sql[] = {
+      "vbin,t1=1 f1=283i32,f2=3,f2=b\"hello\" 1632299372000",
+  };
+  pRes = taos_query(taos, "use td29691");
+  taos_free_result(pRes);
+  pRes = taos_schemaless_insert(taos, (char **)sql, sizeof(sql) / sizeof(sql[0]), TSDB_SML_LINE_PROTOCOL,
+                                TSDB_SML_TIMESTAMP_MILLI_SECONDS);
+  int code = taos_errno(pRes);
+  printf("%s result0:%s\n", __FUNCTION__, taos_errstr(pRes));
+  ASSERT(code == TSDB_CODE_PAR_DUPLICATED_COLUMN);
+  taos_free_result(pRes);
+
+  // check tag name duplication
+  const char *sql1[] = {
+      "vbin,t1=1,t1=2 f2=b\"hello\" 1632299372000",
+  };
+  pRes = taos_schemaless_insert(taos, (char **)sql1, sizeof(sql1) / sizeof(sql1[0]), TSDB_SML_LINE_PROTOCOL,
+                                TSDB_SML_TIMESTAMP_MILLI_SECONDS);
+  code = taos_errno(pRes);
+  printf("%s result0:%s\n", __FUNCTION__, taos_errstr(pRes));
+  ASSERT(code == TSDB_CODE_PAR_DUPLICATED_COLUMN);
+  taos_free_result(pRes);
+
+
+  // check column tag name duplication
+  const char *sql2[] = {
+      "vbin,t1=1,t2=2 t2=L\"ewe\",f2=b\"hello\" 1632299372000",
+  };
+  pRes = taos_schemaless_insert(taos, (char **)sql2, sizeof(sql2) / sizeof(sql2[0]), TSDB_SML_LINE_PROTOCOL,
+                                TSDB_SML_TIMESTAMP_MILLI_SECONDS);
+  code = taos_errno(pRes);
+  printf("%s result0:%s\n", __FUNCTION__, taos_errstr(pRes));
+  ASSERT(code == TSDB_CODE_PAR_DUPLICATED_COLUMN);
+  taos_free_result(pRes);
+
+  // insert data
+  const char *sql3[] = {
+      "vbin,t1=1,t2=2 f1=1,f2=b\"hello\" 1632299372000",
+  };
+  pRes = taos_schemaless_insert(taos, (char **)sql3, sizeof(sql3) / sizeof(sql3[0]), TSDB_SML_LINE_PROTOCOL,
+                                TSDB_SML_TIMESTAMP_MILLI_SECONDS);
+  code = taos_errno(pRes);
+  printf("%s result0:%s\n", __FUNCTION__, taos_errstr(pRes));
+  ASSERT(code == 0);
+  taos_free_result(pRes);
+
+  // check tag name duplicating an existing column name when update
+  const char *sql4[] = {
+      "vbin,t1=1,t2=2,f1=ewe f1=1,f2=b\"hello\" 1632299372001",
+  };
+  pRes = taos_schemaless_insert(taos, (char **)sql4, sizeof(sql4) / sizeof(sql4[0]), TSDB_SML_LINE_PROTOCOL,
+                                TSDB_SML_TIMESTAMP_MILLI_SECONDS);
+  code = taos_errno(pRes);
+  printf("%s result0:%s\n", __FUNCTION__, taos_errstr(pRes));
+  ASSERT(code == TSDB_CODE_PAR_DUPLICATED_COLUMN);
+  taos_free_result(pRes);
+
+  // check column name duplicating an existing tag name when update
+  const char *sql5[] = {
+      "vbin,t1=1,t2=2 f1=1,f2=b\"hello\",t1=3 1632299372002",
+  };
+  pRes = taos_schemaless_insert(taos, (char **)sql5, sizeof(sql5) / sizeof(sql5[0]), TSDB_SML_LINE_PROTOCOL,
+                                TSDB_SML_TIMESTAMP_MILLI_SECONDS);
+  code = taos_errno(pRes);
+  printf("%s result0:%s\n", __FUNCTION__, taos_errstr(pRes));
+  ASSERT(code == TSDB_CODE_PAR_DUPLICATED_COLUMN);
+  taos_free_result(pRes);
+  taos_close(taos);
+
+  return code;
+}
+
+
 int sml_td18789_Test() {
  TAOS *taos = taos_connect("localhost", "root", "taosdata", NULL, 0);

@ -1808,11 +1891,11 @@ int sml_td18789_Test() {

  pRes = taos_schemaless_insert(taos, (char **)sql, sizeof(sql) / sizeof(sql[0]), TSDB_SML_LINE_PROTOCOL,
                                TSDB_SML_TIMESTAMP_MILLI_SECONDS);
-
  int code = taos_errno(pRes);
  printf("%s result0:%s\n", __FUNCTION__, taos_errstr(pRes));
  taos_free_result(pRes);

+
  TAOS_ROW row = NULL;
  pRes = taos_query(taos, "select *,tbname from vbin order by _ts");
  int rowIndex = 0;
@ -1952,6 +2035,8 @@ int main(int argc, char *argv[]) {
  }

  int ret = 0;
+  ret = sml_td29691_Test();
+  ASSERT(ret);
  ret = sml_td29373_Test();
  ASSERT(ret);
  ret = sml_td24559_Test();