docs:opti format of schemaless & modify subtable rules in stream

This commit is contained in:
wangmm0220 2024-01-22 10:23:53 +08:00
parent 27b63b721f
commit 31f0fe8c4f
4 changed files with 32 additions and 27 deletions

View File

@ -67,7 +67,7 @@ If a stream is created with PARTITION BY clause and SUBTABLE clause, the name of
CREATE STREAM avg_vol_s INTO avg_vol SUBTABLE(CONCAT('new-', tname)) AS SELECT _wstart, count(*), avg(voltage) FROM meters PARTITION BY tbname tname INTERVAL(1m);
```
IN PARTITION clause, 'tbname', representing each subtable name of source supertable, is given alias 'tname'. And 'tname' is used in SUBTABLE clause. In SUBTABLE clause, each auto created subtable will concat 'new-' and source subtable name as their name. Other expressions are also allowed in SUBTABLE clause, but the output type must be varchar.
IN PARTITION clause, 'tbname', representing each subtable name of source supertable, is given alias 'tname'. And 'tname' is used in SUBTABLE clause. In SUBTABLE clause, each auto created subtable will concat 'new-' and source subtable name as their name(Starting from 3.2.3.0, in order to avoid the expression in subtable being unable to distinguish between different subtables, add '_groupId' to the end of subtable name).
If the output length exceeds the limitation of TDengine(192), the name will be truncated. If the generated name is occupied by some other table, the creation and writing of the new subtable will be failed.

View File

@ -84,18 +84,22 @@ Schemaless writes process row data according to the following principles.
1. You can use the following rules to generate the subtable names: first, combine the measurement name and the key and value of the label into the next string:
```json
"measurement,tag_key1=tag_value1,tag_key2=tag_value2"
```
```json
"measurement,tag_key1=tag_value1,tag_key2=tag_value2"
```
:::tip
Note that tag_key1, tag_key2 are not the original order of the tags entered by the user but the result of using the tag names in ascending order of the strings. Therefore, tag_key1 is not the first tag entered in the line protocol.
The string's MD5 hash value "md5_val" is calculated after the ranking is completed. The calculation result is then combined with the string to generate the table name: "t_md5_val". "t\_" is a fixed prefix that every table generated by this mapping relationship has.
:::
:::tip
Note that tag_key1, tag_key2 are not the original order of the tags entered by the user but the result of using the tag names in ascending order of the strings. Therefore, tag_key1 is not the first tag entered in the line protocol.
The string's MD5 hash value "md5_val" is calculated after the ranking is completed. The calculation result is then combined with the string to generate the table name: "t_md5_val". "t\_" is a fixed prefix that every table generated by this mapping relationship has.
:::
If you do not want to use an automatically generated table name, there are two ways to specify sub table names, the first one has a higher priority.
You can configure smlAutoChildTableNameDelimiter in taos.cfg(except for `@ # space \r \t \n`), for example, `smlAutoChildTableNameDelimiter=tname`. You can insert `st,t0=cpul,t1=4 c1=3 1626006833639000000` and the table name will be cpu1-4.
You can configure smlChildTableName in taos.cfg to specify table names, for example, `smlChildTableName=tname`. You can insert `st,tname=cpul,t1=4 c1=3 1626006833639000000` and the cpu1 table will be automatically created. Note that if multiple rows have the same tname but different tag_set values, the tag_set of the first row is used to create the table and the others are ignored.
If you do not want to use an automatically generated table name, there are two ways to specify sub table names(the first one has a higher priority).
1. You can configure smlAutoChildTableNameDelimiter in taos.cfg(except for `@ # space \r \t \n`).
1. For example, `smlAutoChildTableNameDelimiter=tname`. You can insert `st,t0=cpul,t1=4 c1=3 1626006833639000000` and the table name will be cpu1-4.
2. You can configure smlChildTableName in taos.cfg to specify table names.
2. For example, `smlChildTableName=tname`. You can insert `st,tname=cpul,t1=4 c1=3 1626006833639000000` and the cpu1 table will be automatically created. Note that if multiple rows have the same tname but different tag_set values, the tag_set of the first row is used to create the table and the others are ignored.
2. If the super table obtained by parsing the line protocol does not exist, this super table is created.
**Important:** Manually creating supertables for schemaless writing is not supported. Schemaless writing creates appropriate supertables automatically.

View File

@ -34,7 +34,7 @@ subquery: SELECT select_list
stb_name 是保存计算结果的超级表的表名如果该超级表不存在会自动创建如果已存在则检查列的schema信息。详见 写入已存在的超级表
TAGS 句定义了流计算中创建TAG的规则可以为每个partition对应的子表生成自定义的TAG值详见 自定义TAG
TAGS 句定义了流计算中创建TAG的规则可以为每个partition对应的子表生成自定义的TAG值详见 自定义TAG
```sql
create_definition:
col_name column_definition
@ -77,7 +77,7 @@ SELECT _wstart, count(*), avg(voltage) FROM meters PARTITION BY tbname INTERVAL(
CREATE STREAM avg_vol_s INTO avg_vol SUBTABLE(CONCAT('new-', tname)) AS SELECT _wstart, count(*), avg(voltage) FROM meters PARTITION BY tbname tname INTERVAL(1m);
```
PARTITION 子句中,为 tbname 定义了一个别名 tname, 在PARTITION 子句中的别名可以用于 SUBTABLE 子句中的表达式计算,在上述示例中,流新创建的子表将以前缀 'new-' 连接原表名作为表名。
PARTITION 子句中,为 tbname 定义了一个别名 tname, 在PARTITION 子句中的别名可以用于 SUBTABLE 子句中的表达式计算,在上述示例中,流新创建的子表将以前缀 'new-' 连接原表名作为表名(从3.2.3.0开始,为了避免 sutable 中的表达式无法区分各个子表,即误将多个相同时间线写入一个子表,在指定的子表名后面加上 _groupId)
注意,子表名的长度若超过 TDengine 的限制,将被截断。若要生成的子表名已经存在于另一超级表,由于 TDengine 的子表名是唯一的,因此对应新子表的创建以及数据的写入将会失败。
@ -212,8 +212,8 @@ CREATE STREAM streams2 trigger at_once INTO st1 TAGS(cc varchar(100)) as select
PARTITION 子句中,为 concat("tag-", tbname)定义了一个别名cc, 对应超级表st1的自定义TAG的名字。在上述示例中流新创建的子表的TAG将以前缀 'new-' 连接原表名作为TAG的值。
会对TAG信息进行如下检查
1.检查tag的schema信息是否匹配对于不匹配的则自动进行数据类型转换当前只有数据长度大于4096byte时才报错其余场景都能进行类型转换。
2.检查tag的个数是否相同如果不同需要显示的指定超级表与subquery的tag的对应关系否则报错如果相同可以指定对应关系也可以不指定不指定则按位置顺序对应。
1. 检查tag的schema信息是否匹配对于不匹配的则自动进行数据类型转换当前只有数据长度大于4096byte时才报错其余场景都能进行类型转换。
2. 检查tag的个数是否相同如果不同需要显示的指定超级表与subquery的tag的对应关系否则报错如果相同可以指定对应关系也可以不指定不指定则按位置顺序对应。
## 清理中间状态

View File

@ -87,19 +87,20 @@ st,t1=3,t2=4,t3=t3 c1=3i64,c3="passit",c2=false,c4=4f64 1626006833639000000
1. 将使用如下规则来生成子表名:首先将 measurement 的名称和标签的 key 和 value 组合成为如下的字符串
```json
"measurement,tag_key1=tag_value1,tag_key2=tag_value2"
```
```json
"measurement,tag_key1=tag_value1,tag_key2=tag_value2"
```
:::tip
需要注意的是,这里的 tag_key1, tag_key2 并不是用户输入的标签的原始顺序而是使用了标签名称按照字符串升序排列后的结果。所以tag_key1 并不是在行协议中输入的第一个标签。
排列完成以后计算该字符串的 MD5 散列值 "md5_val"。然后将计算的结果与字符串组合生成表名“t_md5_val”。其中的 “t_” 是固定的前缀,每个通过该映射关系自动生成的表都具有该前缀。
:::tip
如果不想用自动生成的表名,有两种指定子表名的方式,第一种优先级更高:
通过在taos.cfg里配置 smlAutoChildTableNameDelimiter 参数来指定(`@ # 空格 回车 换行 制表符`除外)。
举例如下:配置 smlAutoChildTableNameDelimiter=- 插入数据为 st,t0=cpu1,t1=4 c1=3 1626006833639000000 则创建的表名为 cpu1-4。
通过在taos.cfg里配置 smlChildTableName 参数来指定。
举例如下:配置 smlChildTableName=tname 插入数据为 st,tname=cpu1,t1=4 c1=3 1626006833639000000 则创建的表名为 cpu1注意如果多行数据 tname 相同,但是后面的 tag_set 不同,则使用第一行自动建表时指定的 tag_set其他的行会忽略
:::tip
需要注意的是,这里的 tag_key1, tag_key2 并不是用户输入的标签的原始顺序而是使用了标签名称按照字符串升序排列后的结果。所以tag_key1 并不是在行协议中输入的第一个标签。
排列完成以后计算该字符串的 MD5 散列值 "md5_val"。然后将计算的结果与字符串组合生成表名“t_md5_val”。其中的 “t_” 是固定的前缀,每个通过该映射关系自动生成的表都具有该前缀。
:::tip
如果不想用自动生成的表名,有两种指定子表名的方式(第一种优先级更高)。
1. 通过在taos.cfg里配置 smlAutoChildTableNameDelimiter 参数来指定(`@ # 空格 回车 换行 制表符`除外)。
1. 举例如下:配置 smlAutoChildTableNameDelimiter=- 插入数据为 st,t0=cpu1,t1=4 c1=3 1626006833639000000 则创建的表名为 cpu1-4。
2. 通过在taos.cfg里配置 smlChildTableName 参数来指定。
1. 举例如下:配置 smlChildTableName=tname 插入数据为 st,tname=cpu1,t1=4 c1=3 1626006833639000000 则创建的表名为 cpu1注意如果多行数据 tname 相同,但是后面的 tag_set 不同,则使用第一行自动建表时指定的 tag_set其他的行会忽略。
2. 如果解析行协议获得的超级表不存在,则会创建这个超级表(不建议手动创建超级表,不然插入数据可能异常)。
3. 如果解析行协议获得子表不存在,则 Schemaless 会按照步骤 1 或 2 确定的子表名来创建子表。