Merge pull request #24008 from taosdata/docs/partition_by

fix: add more detail description about partition by clause
2023-12-11 17:51:54 +08:00 · 2023-12-11 17:51:54 +08:00 · 5282cf1585
parent b9c7b1924a e77ae68cbd
commit 5282cf1585
4 changed files with 29 additions and 6 deletions
--- a/docs/en/12-taos-sql/06-select.md
+++ b/docs/en/12-taos-sql/06-select.md
@@ -259,7 +259,11 @@ The GROUP BY clause does not guarantee that the results are ordered. If you want

 ## PARTITION BY

-The PARTITION BY clause is a TDengine-specific extension to standard SQL. This clause partitions data based on the part_list and performs computations per partition.
+The PARTITION BY clause is a TDengine-specific extension to standard SQL introduced in TDengine 3.0. This clause partitions data based on the part_list and performs computations per partition.
+
+PARTITION BY and GROUP BY have similar meanings: both group data according to a specified list and then perform calculations. The difference is that PARTITION BY is not subject to the restrictions that the GROUP BY clause places on the SELECT list, so any operation can be performed within a group (constants, aggregations, scalars, expressions, and so on). PARTITION BY is therefore fully compatible with GROUP BY in usage, and every place that uses a GROUP BY clause can use PARTITION BY instead.
+
+Because PARTITION BY does not require each group to return a single row of aggregated data, it also supports window operations on each partition. Any window operation that must be performed per group can only use the PARTITION BY clause.

 For more information, see TDengine Extensions.
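
As a sketch of the equivalence described above, assume the `meters` supertable from TDengine's documentation examples, with a `voltage` column and a `location` tag. The first two queries should return the same per-location averages, while the third combines grouping with a time window and so has to use PARTITION BY. The `_wstart` pseudocolumn (window start time) is an addition for illustration and does not appear in this change.

```sql
-- Aggregation per tag value, written with GROUP BY.
SELECT location, AVG(voltage) FROM meters GROUP BY location;

-- The same aggregation with PARTITION BY substituted for GROUP BY.
SELECT location, AVG(voltage) FROM meters PARTITION BY location;

-- Grouping combined with a 10-minute time window: one row per
-- location per window, which is only expressible with PARTITION BY.
SELECT _wstart, location, AVG(voltage)
FROM meters
PARTITION BY location
INTERVAL(10m);
```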

--- a/docs/en/12-taos-sql/12-distinguished.md
+++ b/docs/en/12-taos-sql/12-distinguished.md
@@ -16,7 +16,10 @@ When you query a supertable, you may need to partition the supertable by some di
 PARTITION BY part_list
 ```

-part_list can be any scalar expression, such as a column, constant, scalar function, or a combination of the preceding items.
+part_list can be any scalar expression, such as a column, constant, scalar function, or a combination of the preceding items. For example, the following query groups data by the location tag and takes the average voltage within each group:
+```sql
+select avg(voltage) from meters partition by location
+```

 A PARTITION BY clause is processed as follows:

@@ -28,7 +31,10 @@ A PARTITION BY clause is processed as follows:
 select max(current) from meters partition by location interval(10m)
 ```

-The most common usage of PARTITION BY is partitioning the data in subtables by tags then perform computation when querying data in a supertable. More specifically, `PARTITION BY TBNAME` partitions the data of each subtable into a single timeline, and this method facilitates the statistical analysis in many use cases of processing timeseries data.
+The most common usage of PARTITION BY is partitioning the data in subtables by tags then perform computation when querying data in a supertable. More specifically, `PARTITION BY TBNAME` partitions the data of each subtable into a single timeline, and this method facilitates the statistical analysis in many use cases of processing timeseries data. For example, the following query calculates the average voltage of each meter every 10 minutes:
+```sql
+select avg(voltage) from meters partition by tbname interval(10m)
+```

 ## Windowed Queries

--- a/docs/zh/12-taos-sql/06-select.md
+++ b/docs/zh/12-taos-sql/06-select.md
@@ -259,7 +259,12 @@ Expressions in the GROUP BY clause can contain any column of the table or view; these

 ## PARTITION BY

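-The PARTITION BY clause is a TDengine-specific syntax that partitions data by part_list and performs computations within each partition.
+The PARTITION BY clause is a TDengine-specific syntax introduced in TDengine 3.0. It partitions data by part_list, and various computations can be performed within each partition.
+
+PARTITION BY and GROUP BY have similar basic meanings: both group data according to a specified list and then perform computations. The difference is that PARTITION BY is not subject to the restrictions that the GROUP BY clause places on the SELECT list, so any operation can be performed within a group (constants, aggregations, scalars, expressions, and so on). PARTITION BY is therefore fully compatible with GROUP BY in usage, and every place that uses a GROUP BY clause can use PARTITION BY instead.
+
+Because PARTITION BY does not require returning a single row of aggregated data, it can also support various window operations on the partitions produced by grouping. All window operations that must be performed per group can only use the PARTITION BY clause.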
+

 For details, see [TDengine Extensions](../distinguished)

--- a/docs/zh/12-taos-sql/12-distinguished.md
+++ b/docs/zh/12-taos-sql/12-distinguished.md
@@ -16,7 +16,11 @@ The extensions provided by TDengine include data partitioning queries and time window partitioning
 PARTITION BY part_list
 ```

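-part_list can be any scalar expression, including columns, constants, scalar functions, and combinations of them.
+part_list can be any scalar expression, including columns, constants, scalar functions, and combinations of them. For example, the following query groups data by the location tag and takes the average voltage within each group: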
+```sql
+select avg(voltage) from meters partition by location
+```
+

 TDengine processes a PARTITION BY clause as follows:

@@ -27,7 +31,11 @@ TDengine processes a PARTITION BY clause as follows:
 ```sql
 select max(current) from meters partition by location interval(10m)
 ```
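-The most common use of the PARTITION BY clause is, when querying a supertable, to partition subtable data by tags and then compute on each partition separately. In particular, PARTITION BY TBNAME separates the data of each subtable into its own independent time series, which greatly facilitates statistical analysis in many time-series scenarios.
+The most common use of the PARTITION BY clause is, when querying a supertable, to partition subtable data by tags and then compute on each partition separately. In particular, PARTITION BY TBNAME separates the data of each subtable into its own independent time series, which greatly facilitates statistical analysis in many time-series scenarios. For example, the following query calculates the average voltage of each meter every 10 minutes: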
+```sql
+select avg(voltage) from meters partition by tbname interval(10m)
+```
+

 ## Windowed Queries