Merge pull request #30207 from taosdata/merge/mainto3.0

merge: from main to 3.0 branch
Simon Guan 2025-03-18 09:04:49 +08:00 committed by GitHub
commit a59170ccc3
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
60 changed files with 1689 additions and 994 deletions


@@ -81,6 +81,8 @@ jobs:
            -DBUILD_KEEPER=true \
            -DBUILD_HTTP=false \
            -DBUILD_TEST=true \
            -DWEBSOCKET=true \
            -DCMAKE_BUILD_TYPE=Release \
            -DBUILD_DEPENDENCY_TESTS=false
          make -j 4
          sudo make install


@@ -72,8 +72,16 @@ TDengine Enterprise implements incremental backup and recovery of data by using
7. **Directory:** Enter the full path of the directory in which you want to store backup files.
8. **Backup file max size:** Enter the maximum size of a single backup file. If the total size of your backup exceeds this number, the backup is split into multiple files.
9. **Compression level:** Select **fastest** for the fastest performance but lowest compression ratio, **best** for the highest compression ratio but slowest performance, or **balanced** for a combination of performance and compression.
4. Users can enable S3 dumping to upload backup files to the S3 storage service. To enable S3 dumping, the following information needs to be provided:
   1. **Endpoint**: The address of the S3 endpoint.
   2. **Access Key ID**: The access key ID for authentication.
   3. **Secret Access Key**: The secret access key for authentication.
   4. **Bucket**: The name of the target bucket.
   5. **Region**: The region where the bucket is located.
   6. **Object Prefix**: A prefix for backup file objects, similar to a directory path on S3.
   7. **Backup Retention Period**: The retention duration for local backups. All files older than `current time - backup_retention_period` must be uploaded to S3.
   8. **Backup Retention Count**: The number of local backups to retain. Only the latest `backup_retention_size` backup files are kept locally.
5. Click **Confirm** to create the backup plan.
You can view your backup plans and modify, clone, or delete them using the buttons in the **Operation** columns. Click **Refresh** to update the status of your plans. Note that you must stop a backup plan before you can delete it. You can also click **View** in the **Backup File** column to view the backup record points and files created by each plan.
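The two retention settings above interact: age decides what gets uploaded to S3, while count decides what stays on local disk. A minimal sketch of that policy, with a hypothetical file representation and helper name (not taosX's actual implementation):

```python
import time

def plan_local_backups(files, retention_period_s, retention_count):
    """Split local backup files into (upload to S3, keep locally).

    files: list of (name, created_ts) tuples -- a hypothetical representation.
    Files older than now - retention_period_s are due for S3 upload;
    only the newest retention_count files remain on local disk.
    """
    now = time.time()
    cutoff = now - retention_period_s
    to_upload = [f for f in files if f[1] < cutoff]
    newest_first = sorted(files, key=lambda f: f[1], reverse=True)
    keep_local = newest_first[:retention_count]
    return to_upload, keep_local
```

For example, with a one-hour retention period and a count of 2, a file created three hours ago is uploaded, and only the two newest files stay local.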


@@ -97,7 +97,6 @@ Each Row in the CSV file configures one OPC data point. The rules for Rows are as follows:
(1) Rows correspond to the columns in the Header as follows:
| No. | Column in Header | Value Type | Value Range | Required | Default |
|----|-------------------------|----------|----------------------|------|--------------|
| 1 | tag_name | String | A string such as `root.parent.temperature` that conforms to the OPC DA ID specification | Yes | |

File diff suppressed because it is too large


@@ -7,7 +7,7 @@ description: "Create and delete databases; view and modify database parameters"
## Create a Database
```sql
CREATE DATABASE [IF NOT EXISTS] db_name [database_options];
database_options:
    database_option ...
@@ -46,53 +46,80 @@ database_option: {
### Parameter Descriptions
- VGROUPS: the initial number of vgroups in the database.
- PRECISION: the timestamp precision of the database.
  - ms: milliseconds (default).
  - us: microseconds.
  - ns: nanoseconds.
- REPLICA: the number of database replicas. Valid values are 1, 2, or 3 (default 1); 2 is available only in Enterprise Edition 3.3.0.0 and later. In a cluster, the number of replicas must be less than or equal to the number of dnodes, and the following restrictions apply:
  - SPLIT VGROUP and REDISTRIBUTE VGROUP operations are not yet supported on vgroups of a two-replica database.
  - A single-replica database can be changed to a two-replica database, but changing from two replicas to any other replica count, or from three replicas to two, is not supported.
- BUFFER: the size of a vnode's write memory pool, in MB. Default 256, minimum 3, maximum 16384.
- PAGES: the number of cache pages for the metadata storage engine in a vnode. Default 256, minimum 64. A vnode's metadata storage occupies PAGESIZE \* PAGES, which is 1 MB of memory by default.
- PAGESIZE: the page size of the metadata storage engine in a vnode, in KB. Default 4 KB. Range 1 to 16384, that is, 1 KB to 16 MB.
- CACHEMODEL: whether to cache the most recent data of subtables in memory.
  - none: no caching (default).
  - last_row: cache the most recent row of each subtable. This significantly improves the performance of the LAST_ROW function.
  - last_value: cache the most recent non-NULL value of each column of each subtable. This significantly improves the performance of the LAST function in the absence of special clauses (WHERE, ORDER BY, GROUP BY, INTERVAL).
  - both: cache both the most recent row and the most recent column values.
  Note: switching the CACHEMODEL value back and forth may cause inaccurate last/last_row query results, so proceed with caution (keeping it enabled is recommended).
- CACHESIZE: the amount of memory in each vnode used to cache recent subtable data. Default 1, range [1, 65536], in MB.
- COMP: the database file compression flag. Default 2, range [0, 2].
  - 0: no compression.
  - 1: one-stage compression.
  - 2: two-stage compression.
- DURATION: the time span of the data stored in a data file.
  - A unit can be appended, as in DURATION 100h or DURATION 10d; the units m (minutes), h (hours), and d (days) are supported.
  - Without a unit, the default is days; for example, DURATION 50 means 50 days.
- MAXROWS: the maximum number of records in a file block. Default 4096.
- MINROWS: the minimum number of records in a file block. Default 100.
- KEEP: the number of days data files are retained. Default 3650, range [1, 365000]; it must be at least 3 times the DURATION value.
  - The database automatically deletes data kept longer than KEEP to free storage space;
  - KEEP can be specified with a unit, as in KEEP 100h or KEEP 10d; the units m (minutes), h (hours), and d (days) are supported;
  - the unit can also be omitted, as in KEEP 50, in which case the default unit is days;
  - only the Enterprise Edition supports the [multi-tier storage](https://docs.taosdata.com/operation/planning/#%E5%A4%9A%E7%BA%A7%E5%AD%98%E5%82%A8) feature, so multiple retention periods can be set (comma-separated, at most 3, satisfying keep 0 \<= keep 1 \<= keep 2), as in KEEP 100h,100d,3650d;
  - the Community Edition does not support multi-tier storage (even if multiple retention periods are configured, they do not take effect; KEEP uses the longest one);
  - to learn more, see [About the primary-key timestamp](https://docs.taosdata.com/reference/taos-sql/insert/).
- KEEP_TIME_OFFSET: the delay before deleting or migrating data kept longer than KEEP (effective since version 3.2.0.0). Default 0 (hours).
  - After a data file exceeds the KEEP retention time, deletion or migration is not executed immediately; it waits the additional interval specified by this parameter so that the operation is staggered away from business peak hours.
- STT_TRIGGER: the number of flushed files that triggers file merging.
  - For scenarios with few tables and high-frequency writes, the default is recommended;
  - for scenarios with many tables and low-frequency writes, a larger value is recommended.
- SINGLE_STABLE: whether only one supertable can be created in this database, intended for supertables with a very large number of columns.
  - 0: multiple supertables can be created.
  - 1: only one supertable can be created.
- TABLE_PREFIX: the length of the table-name prefix to ignore, or to use exclusively, when assigning a table to a vgroup.
  - When positive, a prefix of the specified length in the table name is ignored when deciding which vgroup the table is assigned to;
  - when negative, only a prefix of the specified length in the table name is used when deciding which vgroup the table is assigned to;
  - for example, for a table named "v30001": with TSDB_PREFIX = 2, "0001" is used to decide the vgroup; with TSDB_PREFIX = -2, "v3" is used.
- TABLE_SUFFIX: the length of the table-name suffix to ignore, or to use exclusively, when assigning a table to a vgroup.
  - When positive, a suffix of the specified length in the table name is ignored when deciding which vgroup the table is assigned to;
  - when negative, only a suffix of the specified length in the table name is used when deciding which vgroup the table is assigned to;
  - for example, for a table named "v30001": with TSDB_SUFFIX = 2, "v300" is used to decide the vgroup; with TSDB_SUFFIX = -2, "01" is used.
- TSDB_PAGESIZE: the page size of the time-series data storage engine in a vnode, in KB. Default 4 KB. Range 1 to 16384, that is, 1 KB to 16 MB.
- DNODES: the list of dnodes hosting the vnodes, such as '1,2,3', comma-separated with no spaces between characters (**Enterprise Edition only**).
- WAL_LEVEL: the WAL level. Default 1.
  - 1: write the WAL without executing fsync.
  - 2: write the WAL and execute fsync.
- WAL_FSYNC_PERIOD: when WAL_LEVEL is set to 2, the disk-flush period. Default 3000, in milliseconds. Minimum 0, meaning every write is flushed immediately; maximum 180000, that is, three minutes.
- WAL_RETENTION_PERIOD: the maximum additional time that WAL files are retained for data subscription. WAL cleanup is not affected by the consumption state of subscription clients. In seconds. Default 3600, meaning the most recent 3600 seconds of data are kept in the WAL; adjust this parameter as needed for data subscription.
- WAL_RETENTION_SIZE: the maximum additional cumulative size of WAL files retained for data subscription. In KB. Default 0, meaning no upper limit on cumulative size.
- COMPACT_INTERVAL: the automatic compact trigger period (time periods are cut starting from 1970-01-01T00:00:00Z) (**Enterprise Edition 3.3.5.0 and later only**).
  - Range: 0 or [10m, keep2]; units: m (minutes), h (hours), d (days);
  - without a unit, the default is days; the default value is 0, meaning automatic compact is not triggered;
  - if the database has an unfinished compact task, no duplicate compact task is issued.
- COMPACT_TIME_RANGE: the time range compacted by the automatic compact task (**Enterprise Edition 3.3.5.0 and later only**).
  - Range: [-keep2, -duration]; units: m (minutes), h (hours), d (days);
  - without a unit, the default is days; the default value is [0, 0];
  - with the default [0, 0], if COMPACT_INTERVAL is greater than 0, automatic compact is issued over [-keep2, -duration];
  - therefore, to disable automatic compact, set COMPACT_INTERVAL to 0.
- COMPACT_TIME_OFFSET: the offset of the automatic compact trigger time relative to local time (**Enterprise Edition 3.3.5.0 and later only**). Range: [0, 23]; unit: h (hours); default 0. Taking the UTC+0 time zone as an example:
  - if COMPACT_INTERVAL is 1d and COMPACT_TIME_OFFSET is 0, automatic compact is issued at 00:00 every day;
  - if COMPACT_TIME_OFFSET is 2, automatic compact is issued at 02:00 every day.
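Per the description above, trigger times are period boundaries measured from the Unix epoch, shifted by COMPACT_TIME_OFFSET. A sketch of that schedule, assuming UTC and second-resolution timestamps for simplicity (the helper name is hypothetical):

```python
def is_compact_trigger(ts_utc_s: int, interval_s: int, offset_h: int = 0) -> bool:
    """True if ts falls on an automatic-compact trigger boundary.

    Periods are cut starting from 1970-01-01T00:00:00Z; the offset shifts
    the trigger point by whole hours (e.g. 02:00 instead of 00:00).
    """
    if interval_s <= 0:  # COMPACT_INTERVAL 0 disables automatic compact
        return False
    return (ts_utc_s - offset_h * 3600) % interval_s == 0

DAY = 86400
# With a 1d interval and offset 2, compaction is issued at 02:00, not 00:00:
assert is_compact_trigger(2 * 3600, DAY, offset_h=2)
assert not is_compact_trigger(0, DAY, offset_h=2)
assert is_compact_trigger(0, DAY, offset_h=0)
```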
### Database Creation Example
```sql
create database if not exists db vgroups 10 buffer 10;
```
The example above creates a database named db with 10 vgroups, where each vnode is allocated 10 MB of write cache.
@@ -108,7 +135,7 @@ USE db_name;
## Drop a Database
```sql
DROP DATABASE [IF EXISTS] db_name;
```
Drops a database. All tables contained in the specified database are deleted and all of its vgroups are destroyed; use with caution!
@@ -146,15 +173,18 @@ alter_database_option: {
1. How do I view cachesize?
   Use select * from information_schema.ins_databases; to view the specific cachesize values (unit: MB).
2. How do I view cacheload?
   Use show \<db_name>.vgroups; to view cacheload (unit: bytes).
3. How do I judge whether cachesize is sufficient?
   - If cacheload is very close to cachesize, cachesize may be too small.
   - If cacheload is clearly smaller than cachesize, cachesize is sufficient.
   - Use this principle to decide whether cachesize needs to be modified.
   - The specific new value can be chosen based on available system memory, for example doubling it or increasing it severalfold.
:::note
Other parameters cannot be modified in 3.0.0.0.
@@ -204,7 +234,7 @@ FLUSH DATABASE db_name;
## Adjust the Distribution of VNODEs in a VGROUP
```sql
REDISTRIBUTE VGROUP vgroup_no DNODE dnode_id1 [DNODE dnode_id2] [DNODE dnode_id3];
```
Adjusts the distribution of vnodes in a vgroup according to the given dnode list. Because the maximum number of replicas is 3, at most 3 dnodes can be specified.
@@ -212,10 +242,10 @@ REDISTRIBUTE VGROUP vgroup_no DNODE dnode_id1 [DNODE dnode_id2] [DNODE dnode_id3
## Automatically Adjust the Distribution of LEADERs in VGROUPs
```sql
BALANCE VGROUP LEADER;
```
Triggers leader re-election for all vgroups in the cluster, load-balancing across the cluster nodes (**Enterprise Edition feature**).
## View the Working Status of a Database
@@ -223,19 +253,22 @@ BALANCE VGROUP LEADER
SHOW db_name.ALIVE;
```
Queries the availability status of database db_name (return values):
- 0: unavailable;
- 1: fully available;
- 2: partially available (some vnodes of the database are available while others are not).
## View the Disk Space Usage of a Database
```sql
select * from INFORMATION_SCHEMA.INS_DISK_USAGE where db_name = 'db_name';
```
Views the disk space occupied by each module of the database.
```sql
SHOW db_name.disk_info;
```
Views the data compression ratio and the on-disk size of database db_name.
This command is essentially equivalent to `select sum(data1 + data2 + data3)/sum(raw_data), sum(data1 + data2 + data3) from information_schema.ins_disk_usage where db_name="dbname";`


@@ -34,7 +34,7 @@ subquery: SELECT select_list
stb_name is the name of the supertable that stores the computation results. If the supertable does not exist, it is created automatically; if it already exists, its column schema is checked. See [Writing to an existing supertable](#写入已存在的超级表).
The TAGS clause defines the rules for creating tags in stream computation and can generate custom tag values for the subtable corresponding to each partition. See [Custom TAG](#自定义-TAG).
```sql
create_definition:
    col_name column_definition
@@ -42,7 +42,7 @@ column_definition:
    type_name [COMMENT 'string_value']
```
The subtable clause defines the naming rules for subtables created in stream computation. See [Partitioning in stream computation](#流式计算的-partition).
```sql
window_clause: {


@@ -31,7 +31,7 @@ description: "Platforms supported by the TDengine server, client, and connectors"
## Platforms Supported by TDengine Clients and Connectors
TDengine connectors support a wide range of platforms, including hardware platforms such as X64/X86/ARM64/ARM32/MIPS/LoongArch64 (or Loong64) and development environments such as Linux/Win64/Win32/macOS.
The compatibility matrix is as follows:


@@ -241,6 +241,7 @@ typedef struct SRestoreCheckpointInfo {
  int32_t transId;  // transaction id of the update-consensus-checkpointId transaction
  int32_t taskId;
  int32_t nodeId;
  int32_t term;
} SRestoreCheckpointInfo;
int32_t tEncodeRestoreCheckpointInfo(SEncoder* pEncoder, const SRestoreCheckpointInfo* pReq);


@@ -495,6 +495,17 @@ struct SStreamTask {
typedef int32_t (*startComplete_fn_t)(struct SStreamMeta*);
typedef enum {
  START_MARK_REQ_CHKPID = 0x1,
  START_WAIT_FOR_CHKPTID = 0x2,
  START_CHECK_DOWNSTREAM = 0x3,
} EStartStage;
typedef struct {
  EStartStage stage;
  int64_t     ts;
} SStartTaskStageInfo;
typedef struct STaskStartInfo {
  int64_t startTs;
  int64_t readyTs;
@@ -504,6 +515,8 @@ typedef struct STaskStartInfo {
  SHashObj* pFailedTaskSet;  // tasks that have finished the check-downstream process, whether successful or failed
  int64_t   elapsedTime;
  int32_t   restartCount;    // restart task counter
  EStartStage curStage;      // task start stage
  SArray*     pStagesList;   // history stage list with timestamps, SArray<SStartTaskStageInfo>
  startComplete_fn_t completeFn;  // complete callback function
} STaskStartInfo;
@@ -738,7 +751,7 @@ void streamTaskStopMonitorCheckRsp(STaskCheckInfo* pInfo, const char* id);
void streamTaskCleanupCheckInfo(STaskCheckInfo* pInfo);
// fill-history task
int32_t streamLaunchFillHistoryTask(SStreamTask* pTask, bool lock);
int32_t streamStartScanHistoryAsync(SStreamTask* pTask, int8_t igUntreated);
void streamExecScanHistoryInFuture(SStreamTask* pTask, int32_t idleDuration);
bool streamHistoryTaskSetVerRangeStep2(SStreamTask* pTask, int64_t latestVer);
@@ -810,12 +823,14 @@ void streamMetaNotifyClose(SStreamMeta* pMeta);
void streamMetaStartHb(SStreamMeta* pMeta);
int32_t streamMetaAddTaskLaunchResult(SStreamMeta* pMeta, int64_t streamId, int32_t taskId, int64_t startTs,
                                      int64_t endTs, bool ready);
int32_t streamMetaAddTaskLaunchResultNoLock(SStreamMeta* pMeta, int64_t streamId, int32_t taskId,
                                            int64_t startTs, int64_t endTs, bool ready);
int32_t streamMetaInitStartInfo(STaskStartInfo* pStartInfo);
void streamMetaClearStartInfo(STaskStartInfo* pStartInfo);
int32_t streamMetaResetTaskStatus(SStreamMeta* pMeta);
int32_t streamMetaAddFailedTask(SStreamMeta* pMeta, int64_t streamId, int32_t taskId, bool lock);
void streamMetaAddFailedTaskSelf(SStreamTask* pTask, int64_t failedTs, bool lock);
void streamMetaAddIntoUpdateTaskList(SStreamMeta* pMeta, SStreamTask* pTask, SStreamTask* pHTask, int32_t transId,
                                     int64_t startTs);
void streamMetaClearSetUpdateTaskListComplete(SStreamMeta* pMeta);


@@ -822,6 +822,7 @@ _exit:
  return code;
}
// todo: serialize the term attribute.
int32_t tDecodeRestoreCheckpointInfo(SDecoder* pDecoder, SRestoreCheckpointInfo* pReq) {
  int32_t code = 0;
  int32_t lino;


@@ -3045,31 +3045,18 @@ _end:
}
// Construct the child table name in the form of <ctbName>_<stbName>_<groupId> and store it in `ctbName`.
int32_t buildCtbNameAddGroupId(const char* stbName, char* ctbName, uint64_t groupId, size_t cap) {
  int32_t code = TSDB_CODE_SUCCESS;
  int32_t lino = 0;
  char    tmp[TSDB_TABLE_NAME_LEN] = {0};
  if (ctbName == NULL || cap < TSDB_TABLE_NAME_LEN) {
    code = TSDB_CODE_INTERNAL_ERROR;
    TSDB_CHECK_CODE(code, lino, _end);
  }
  if (stbName == NULL) {
    snprintf(tmp, TSDB_TABLE_NAME_LEN, "_%" PRIu64, groupId);
  } else {
    int32_t i = strlen(stbName) - 1;
    for (; i >= 0; i--) {
@@ -3077,52 +3064,12 @@ int32_t buildCtbNameAddGroupId(const char* stbName, char* ctbName, uint64_t grou
        break;
      }
    }
    snprintf(tmp, TSDB_TABLE_NAME_LEN, "_%s_%" PRIu64, stbName + i + 1, groupId);
  }
  ctbName[cap - strlen(tmp) - 1] = 0;  // put stbname + groupId to the end
  size_t prefixLen = strlen(ctbName);
  ctbName = strncat(ctbName, tmp, cap - prefixLen - 1);
  for (char* p = ctbName; *p; ++p) {
    if (*p == '.') *p = '_';
@@ -3132,9 +3079,6 @@ _end:
  if (code != TSDB_CODE_SUCCESS) {
    uError("%s failed at line %d since %s", __func__, lino, tstrerror(code));
  }
  return code;
}


@@ -16,7 +16,6 @@
#define _DEFAULT_SOURCE
#include "tname.h"
#include "tcommon.h"
#define VALID_NAME_TYPE(x) ((x) == TSDB_DB_NAME_T || (x) == TSDB_TABLE_NAME_T)
@@ -234,43 +233,38 @@ static int compareKv(const void* p1, const void* p2) {
 * use stable name and tags to generate child table name
 */
int32_t buildChildTableName(RandTableName* rName) {
  taosArraySort(rName->tags, compareKv);
  T_MD5_CTX context;
  tMD5Init(&context);
  // add stable name
  tMD5Update(&context, (uint8_t*)rName->stbFullName, rName->stbFullNameLen);
  // add tags
  for (int j = 0; j < taosArrayGetSize(rName->tags); ++j) {
    SSmlKv* tagKv = taosArrayGet(rName->tags, j);
    if (tagKv == NULL) {
      return TSDB_CODE_SML_INVALID_DATA;
    }
    tMD5Update(&context, (uint8_t*)",", 1);
    tMD5Update(&context, (uint8_t*)tagKv->key, tagKv->keyLen);
    tMD5Update(&context, (uint8_t*)"=", 1);
    if (IS_VAR_DATA_TYPE(tagKv->type)) {
      tMD5Update(&context, (uint8_t*)tagKv->value, tagKv->length);
    } else {
      tMD5Update(&context, (uint8_t*)(&(tagKv->value)), tagKv->length);
    }
  }
  tMD5Final(&context);
  rName->ctbShortName[0] = 't';
  rName->ctbShortName[1] = '_';
  taosByteArrayToHexStr(context.digest, 16, rName->ctbShortName + 2);
  rName->ctbShortName[34] = 0;
  return TSDB_CODE_SUCCESS;
}
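The refactor streams the same bytes into MD5 incrementally instead of first joining them with a string builder; incremental updates and a one-shot hash of the joined key produce the same digest, which can be checked with Python's hashlib (the helper names below are illustrative):

```python
import hashlib

def digest_incremental(stb_name: bytes, tags) -> str:
    """Feed the stable name and sorted tag key=value pairs into MD5 piece
    by piece, as the refactored buildChildTableName does."""
    ctx = hashlib.md5()
    ctx.update(stb_name)
    for key, value in tags:
        ctx.update(b",")
        ctx.update(key)
        ctx.update(b"=")
        ctx.update(value)
    return "t_" + ctx.hexdigest()  # 2 + 32 chars, like ctbShortName

def digest_joined(stb_name: bytes, tags) -> str:
    """Old approach: build the joined key string first, then hash once."""
    joined = stb_name + b"".join(b"," + k + b"=" + v for k, v in tags)
    return "t_" + hashlib.md5(joined).hexdigest()

tags = sorted([(b"loc", b"bj"), (b"dev", b"d1")])
assert digest_incremental(b"db.stb", tags) == digest_joined(b"db.stb", tags)
```

This is the key property that lets the change drop the `tstrbuild.h` string builder and its allocation without altering generated child-table names.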


@@ -531,7 +531,6 @@ TEST(testCase, StreamWithoutDotInStbName2) {
TEST(testCase, StreamWithLongStbName) {
  char     ctbName[TSDB_TABLE_NAME_LEN];
  char    *stbName = "a_simle_stb_name";
  uint64_t groupId = UINT64_MAX;
@@ -550,29 +549,13 @@ TEST(testCase, StreamWithLongStbName) {
  EXPECT_EQ(buildCtbNameAddGroupId(stbName, NULL, groupId, sizeof(ctbName)), TSDB_CODE_INTERNAL_ERROR);
  EXPECT_EQ(buildCtbNameAddGroupId(stbName, ctbName, groupId, sizeof(ctbName) - 1), TSDB_CODE_INTERNAL_ERROR);
  // test truncation of long ctbName
  for (int32_t i = 0; i < 159; ++i) ctbName[i] = 'A';
  ctbName[159] = '\0';
  stbName = taosStrdup(ctbName);
  std::string expectName = std::string(ctbName) + "_" + std::string(stbName) + "_" + std::to_string(groupId);
  EXPECT_EQ(buildCtbNameAddGroupId(stbName, ctbName, groupId, sizeof(ctbName)), TSDB_CODE_SUCCESS);
  EXPECT_STREQ(ctbName, expectName.c_str() + expectName.size() - TSDB_TABLE_NAME_LEN + 1);
  taosMemoryFree(stbName);
}


@@ -122,7 +122,7 @@ int32_t mndStreamGetRelTrans(SMnode *pMnode, int64_t streamId);
int32_t mndGetNumOfStreams(SMnode *pMnode, char *dbName, int32_t *pNumOfStreams);
int32_t mndGetNumOfStreamTasks(const SStreamObj *pStream);
int32_t mndTakeVgroupSnapshot(SMnode *pMnode, bool *allReady, SArray **pList, SHashObj* pTermMap);
void mndDestroyVgroupChangeInfo(SVgroupChangeInfo *pInfo);
void mndKillTransImpl(SMnode *pMnode, int32_t transId, const char *pDbName);
int32_t setTransAction(STrans *pTrans, void *pCont, int32_t contLen, int32_t msgType, const SEpSet *pEpset,
@@ -147,7 +147,7 @@ int32_t mndStreamSetDropActionFromList(SMnode *pMnode, STrans *pTrans, SArray *p
int32_t mndStreamSetResetTaskAction(SMnode *pMnode, STrans *pTrans, SStreamObj *pStream, int64_t chkptId);
int32_t mndStreamSetUpdateChkptAction(SMnode *pMnode, STrans *pTrans, SStreamObj *pStream);
int32_t mndCreateStreamResetStatusTrans(SMnode *pMnode, SStreamObj *pStream, int64_t chkptId);
int32_t mndStreamSetChkptIdAction(SMnode* pMnode, STrans* pTrans, SStreamObj* pStream, int64_t checkpointId, SArray *pList);
int32_t mndStreamSetRestartAction(SMnode *pMnode, STrans *pTrans, SStreamObj *pStream);
int32_t mndStreamSetCheckpointAction(SMnode *pMnode, STrans *pTrans, SStreamTask *pTask, int64_t checkpointId,
                                     int8_t mndTrigger);
@@ -155,8 +155,7 @@ int32_t mndStreamSetStopStreamTasksActions(SMnode* pMnode, STrans *pTrans, uint6
int32_t mndCreateStreamChkptInfoUpdateTrans(SMnode *pMnode, SStreamObj *pStream, SArray *pChkptInfoList);
int32_t mndScanCheckpointReportInfo(SRpcMsg *pReq);
int32_t mndCreateSetConsensusChkptIdTrans(SMnode *pMnode, SStreamObj *pStream, int64_t checkpointId, SArray* pList);
void removeTasksInBuf(SArray *pTaskIds, SStreamExecInfo *pExecInfo);
int32_t mndFindChangedNodeInfo(SMnode *pMnode, const SArray *pPrevNodeList, const SArray *pNodeList,
                               SVgroupChangeInfo *pInfo);

View File

@@ -429,9 +429,8 @@ void mndDoTimerPullupTask(SMnode *pMnode, int64_t sec) {
   if (sec % 5 == 0) {
     mndStreamConsensusChkpt(pMnode);
   }
-#endif
-#ifdef USE_REPORT
-  if (sec % tsTelemInterval == (TMIN(86400, (tsTelemInterval - 1)))) {
+  if (tsTelemInterval > 0 && sec % tsTelemInterval == 0) {
     mndPullupTelem(pMnode);
   }
 #endif
@@ -474,19 +473,18 @@ static void *mndThreadFp(void *param) {
   while (1) {
     lastTime++;
     taosMsleep(100);
     if (mndGetStop(pMnode)) break;
     if (lastTime % 10 != 0) continue;
-    if (mnodeIsNotLeader(pMnode)) {
-      mTrace("timer not process since mnode is not leader");
-      continue;
-    }
     int64_t sec = lastTime / 10;
     mndDoTimerCheckTask(pMnode, sec);
+    int64_t minCron = minCronTime();
+    if (sec % minCron == 0 && mnodeIsNotLeader(pMnode)) {
+      // not leader, do nothing
+      mTrace("timer not process since mnode is not leader, reason: %s", tstrerror(terrno));
+      terrno = 0;
+      continue;
+    }
     mndDoTimerPullupTask(pMnode, sec);
   }
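The timer loop above derives a whole-second counter from a 100 ms sleep and dispatches periodic jobs on interval boundaries. A minimal standalone sketch of that cadence logic (illustrative names, not TDengine code):

```c
#include <stdint.h>

/* The mnode timer thread ticks every 100 ms, so ten ticks make one second;
 * a periodic job fires when the derived second counter hits its interval. */
static int64_t tick_to_sec(int64_t tick) { return tick / 10; }

/* A job with period interval_sec is due on this second iff sec % interval_sec == 0.
 * A non-positive interval disables the job, mirroring the tsTelemInterval > 0 guard. */
static int job_due(int64_t sec, int64_t interval_sec) {
  return interval_sec > 0 && sec % interval_sec == 0;
}
```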

View File

@@ -13,13 +13,13 @@
  * along with this program. If not, see <http://www.gnu.org/licenses/>.
  */
+#include "mndStream.h"
 #include "audit.h"
 #include "mndDb.h"
 #include "mndPrivilege.h"
 #include "mndScheduler.h"
 #include "mndShow.h"
 #include "mndStb.h"
-#include "mndStream.h"
 #include "mndTrans.h"
 #include "osMemory.h"
 #include "parser.h"
@@ -1390,18 +1390,16 @@ static int32_t streamWaitComparFn(const void *p1, const void *p2) {
 }
 // all tasks of this stream should be ready, otherwise do nothing
-static bool isStreamReadyHelp(int64_t now, SStreamObj* pStream) {
+static bool isStreamReadyHelp(int64_t now, SStreamObj *pStream) {
   bool ready = false;
   streamMutexLock(&execInfo.lock);
   int64_t lastReadyTs = getStreamTaskLastReadyState(execInfo.pTaskList, pStream->uid);
   if ((lastReadyTs == -1) || ((lastReadyTs != -1) && ((now - lastReadyTs) < tsStreamCheckpointInterval * 1000))) {
     if (lastReadyTs != -1) {
-      mInfo("not start checkpoint, stream:0x%" PRIx64 " last ready ts:%" PRId64 " ready duration:%" PRId64
-            "ms less than threshold",
-            pStream->uid, lastReadyTs, (now - lastReadyTs));
+      mInfo("not start checkpoint, stream:0x%" PRIx64 " readyTs:%" PRId64 " ready duration:%.2fs less than threshold",
+            pStream->uid, lastReadyTs, (now - lastReadyTs) / 1000.0);
     }
     ready = false;
@@ -2094,11 +2092,12 @@ static int32_t mndProcessResetStreamReq(SRpcMsg *pReq) {
   return TSDB_CODE_ACTION_IN_PROGRESS;
 }
-static int32_t mndProcessVgroupChange(SMnode *pMnode, SVgroupChangeInfo *pChangeInfo, bool includeAllNodes, STrans** pUpdateTrans) {
+static int32_t mndProcessVgroupChange(SMnode *pMnode, SVgroupChangeInfo *pChangeInfo, bool includeAllNodes,
+                                      STrans **pUpdateTrans, SArray* pStreamList) {
   SSdb *pSdb = pMnode->pSdb;
   void *pIter = NULL;
   STrans *pTrans = NULL;
   int32_t code = 0;
   *pUpdateTrans = NULL;
   // conflict check for nodeUpdate trans, here we randomly chose one stream to add into the trans pool
@@ -2167,6 +2166,10 @@ static int32_t mndProcessVgroupChange(SMnode *pMnode, SVgroupChangeInfo *pChange
     }
     code = mndPersistTransLog(pStream, pTrans, SDB_STATUS_READY);
+    if (code == 0) {
+      taosArrayPush(pStreamList, &pStream->uid);
+    }
     sdbRelease(pSdb, pStream);
     if (code != TSDB_CODE_SUCCESS) {
@@ -2345,7 +2348,7 @@ static int32_t mndProcessNodeCheckReq(SRpcMsg *pMsg) {
     return 0;
   }
-  code = mndTakeVgroupSnapshot(pMnode, &allReady, &pNodeSnapshot);
+  code = mndTakeVgroupSnapshot(pMnode, &allReady, &pNodeSnapshot, NULL);
   if (code) {
     mError("failed to take the vgroup snapshot, ignore it and continue");
   }
@@ -2369,10 +2372,27 @@ static int32_t mndProcessNodeCheckReq(SRpcMsg *pMsg) {
   mDebug("vnode(s) change detected, build trans to update stream task epsets");
   STrans *pTrans = NULL;
+  SArray* pStreamIdList = taosArrayInit(4, sizeof(int64_t));
   streamMutexLock(&execInfo.lock);
-  code = mndProcessVgroupChange(pMnode, &changeInfo, updateAllVgroups, &pTrans);
+  code = mndProcessVgroupChange(pMnode, &changeInfo, updateAllVgroups, &pTrans, pStreamIdList);
+
+  // remove the consensus-checkpoint-id req of all related stream(s)
+  int32_t num = taosArrayGetSize(pStreamIdList);
+  if (num > 0) {
+    mDebug("start to clear %d related stream in consensus-checkpoint-id list due to nodeUpdate", num);
+    for (int32_t x = 0; x < num; ++x) {
+      int64_t uid = *(int64_t *)taosArrayGet(pStreamIdList, x);
+      int32_t ret = mndClearConsensusCheckpointId(execInfo.pStreamConsensus, uid);
+      if (ret != 0) {
+        mError("failed to remove stream:0x%" PRIx64 " from consensus-checkpoint-id list, code:%s", uid,
+               tstrerror(ret));
+      }
+    }
+  }
+
   streamMutexUnlock(&execInfo.lock);
+  taosArrayDestroy(pStreamIdList);
   // NOTE: sync trans out of lock
   if (code == 0 && pTrans != NULL) {
@@ -2595,8 +2615,9 @@ int32_t mndProcessStreamReqCheckpoint(SRpcMsg *pReq) {
     if (pStream != NULL) {  // TODO:handle error
       code = mndProcessStreamCheckpointTrans(pMnode, pStream, checkpointId, 0, false);
-      if (code) {
-        mError("failed to create checkpoint trans, code:%s", tstrerror(code));
+      if (code != 0 && code != TSDB_CODE_ACTION_IN_PROGRESS) {
+        mError("stream:0x%" PRIx64 " failed to create checkpoint trans, checkpointId:%" PRId64 ", code:%s",
+               req.streamId, checkpointId, tstrerror(code));
       }
     } else {
       // todo: wait for the create stream trans completed, and launch the checkpoint trans
@@ -2604,11 +2625,15 @@ int32_t mndProcessStreamReqCheckpoint(SRpcMsg *pReq) {
       // sleep(500ms)
     }
-    // remove this entry
-    (void) taosHashRemove(execInfo.pTransferStateStreams, &req.streamId, sizeof(int64_t));
+    // remove this entry, not overwriting the global error code
+    int32_t ret = taosHashRemove(execInfo.pTransferStateStreams, &req.streamId, sizeof(int64_t));
+    if (ret) {
+      mError("failed to remove transfer state stream, code:%s", tstrerror(ret));
+    }
     int32_t numOfStreams = taosHashGetSize(execInfo.pTransferStateStreams);
-    mDebug("stream:0x%" PRIx64 " removed, remain streams:%d fill-history not completed", req.streamId, numOfStreams);
+    mDebug("stream:0x%" PRIx64 " removed in transfer-state list, %d stream(s) not finish fill-history process",
+           req.streamId, numOfStreams);
   }
   if (pStream != NULL) {
@@ -2685,7 +2710,7 @@ static void doAddReportStreamTask(SArray *pList, int64_t reportedChkptId, const
             pReport->taskId, p->checkpointId, pReport->checkpointId);
     } else if (p->checkpointId < pReport->checkpointId) {  // expired checkpoint-report msg, update it
       mInfo("s-task:0x%x expired checkpoint-report info in checkpoint-report list update from %" PRId64 "->%" PRId64,
             pReport->taskId, p->checkpointId, pReport->checkpointId);
       // update the checkpoint report info
       p->checkpointId = pReport->checkpointId;
@@ -2822,6 +2847,8 @@ static int64_t getConsensusId(int64_t streamId, int32_t numOfTasks, int32_t *pEx
     if (chkId > pe->checkpointInfo.latestId) {
       if (chkId != INT64_MAX) {
         *pAllSame = false;
+        mDebug("checkpointIds not identical, prev:%" PRId64 " smaller:%" PRId64 " from task:0x%" PRIx64, chkId,
+               pe->checkpointInfo.latestId, pe->id.taskId);
       }
       chkId = pe->checkpointInfo.latestId;
     }
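The hunk above is part of a reduction that keeps the running minimum of the checkpoint ids reported by a stream's tasks and flags when the ids diverge. A standalone sketch of that reduction (illustrative names and a plain array instead of the task list; not TDengine code):

```c
#include <stdint.h>

/* Sketch of the consensus-id reduction: the consensus checkpoint id is the
 * smallest "latest checkpoint id" any task of the stream has reported, and
 * all_same records whether every task reported the identical id. */
static int64_t consensus_chkpt_id(const int64_t *latest, int32_t n, int *all_same) {
  int64_t min_id = INT64_MAX;
  *all_same = 1;
  for (int32_t i = 0; i < n; ++i) {
    if (latest[i] != latest[0]) *all_same = 0;  /* ids diverge across tasks */
    if (latest[i] < min_id) min_id = latest[i];
  }
  return min_id;
}
```

Taking the minimum is the conservative choice: every task can roll forward from the oldest checkpoint that all of them have durably completed.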
@@ -2847,7 +2874,7 @@ static void doSendQuickRsp(SRpcHandleInfo *pInfo, int32_t msgSize, int32_t vgId,
   }
 }
-static int32_t doCleanReqList(SArray* pList, SCheckpointConsensusInfo* pInfo) {
+static int32_t doCleanReqList(SArray *pList, SCheckpointConsensusInfo *pInfo) {
   int32_t alreadySend = taosArrayGetSize(pList);
   for (int32_t i = 0; i < alreadySend; ++i) {
@@ -2873,7 +2900,6 @@ int32_t mndProcessConsensusInTmr(SRpcMsg *pMsg) {
   int64_t now = taosGetTimestampMs();
   bool allReady = true;
   SArray *pNodeSnapshot = NULL;
-  int32_t maxAllowedTrans = 20;
   int32_t numOfTrans = 0;
   int32_t code = 0;
   void *pIter = NULL;
@@ -2889,9 +2915,16 @@ int32_t mndProcessConsensusInTmr(SRpcMsg *pMsg) {
     return terrno;
   }
+  SHashObj* pTermMap = taosHashInit(64, taosGetDefaultHashFunction(TSDB_DATA_TYPE_INT), true, HASH_NO_LOCK);
+  if (pTermMap == NULL) {
+    taosArrayDestroy(pList);
+    taosArrayDestroy(pStreamList);
+    return terrno;
+  }
+
   mDebug("start to process consensus-checkpointId in tmr");
-  code = mndTakeVgroupSnapshot(pMnode, &allReady, &pNodeSnapshot);
+  code = mndTakeVgroupSnapshot(pMnode, &allReady, &pNodeSnapshot, pTermMap);
   taosArrayDestroy(pNodeSnapshot);
   if (code) {
     mError("failed to get the vgroup snapshot, ignore it and continue");
@@ -2901,6 +2934,7 @@ int32_t mndProcessConsensusInTmr(SRpcMsg *pMsg) {
     mWarn("not all vnodes are ready, end to process the consensus-checkpointId in tmr process");
     taosArrayDestroy(pStreamList);
     taosArrayDestroy(pList);
+    taosHashCleanup(pTermMap);
     return 0;
   }
@@ -2927,31 +2961,62 @@ int32_t mndProcessConsensusInTmr(SRpcMsg *pMsg) {
       continue;
     }
+    if (pStream->uid != pInfo->streamId) {
+      // todo remove it
+    }
+
+    if ((num < pInfo->numOfTasks) || (pInfo->numOfTasks == 0)) {
+      mDebug("stream:0x%" PRIx64 " %s %d/%d tasks send checkpoint-consensus req(not all), ignore", pStream->uid,
+             pStream->name, num, pInfo->numOfTasks);
+      mndReleaseStream(pMnode, pStream);
+      continue;
+    }
+
+    streamId = pStream->uid;
+    int32_t existed = 0;
+    bool allSame = true;
+    int64_t chkId = getConsensusId(pInfo->streamId, pInfo->numOfTasks, &existed, &allSame);
+    if (chkId == -1) {
+      mDebug("not all(%d/%d) task(s) send hbMsg yet, wait for a while and check again", existed, pInfo->numOfTasks);
+      mndReleaseStream(pMnode, pStream);
+      continue;
+    }
+
+    bool allQualified = true;
     for (int32_t j = 0; j < num; ++j) {
       SCheckpointConsensusEntry *pe = taosArrayGet(pInfo->pTaskList, j);
       if (pe == NULL) {
         continue;
       }
-      if (streamId == -1) {
-        streamId = pe->req.streamId;
-      }
-      int32_t existed = 0;
-      bool allSame = true;
-      int64_t chkId = getConsensusId(pe->req.streamId, pInfo->numOfTasks, &existed, &allSame);
-      if (chkId == -1) {
-        mDebug("not all(%d/%d) task(s) send hbMsg yet, wait for a while and check again, s-task:0x%x", existed,
-               pInfo->numOfTasks, pe->req.taskId);
-        break;
-      }
+      if (pe->req.nodeId != -2) {
+        int32_t *pTerm = taosHashGet(pTermMap, &(pe->req.nodeId), sizeof(pe->req.nodeId));
+        if (pTerm == NULL) {
+          mError("stream:0x%" PRIx64 " s-task:0x%x req from vgId:%d not found in termMap", pe->req.streamId,
+                 pe->req.taskId, pe->req.nodeId);
+          allQualified = false;
+          continue;
+        } else {
+          if (*pTerm != pe->req.term) {
+            mWarn("stream:0x%" PRIx64 " s-task:0x%x req from vgId:%d is expired, term:%d, current term:%d",
+                  pe->req.streamId, pe->req.taskId, pe->req.nodeId, pe->req.term, *pTerm);
+            allQualified = false;
+            continue;
+          }
+        }
+      }
       if (((now - pe->ts) >= 10 * 1000) || allSame) {
-        mDebug("s-task:0x%x sendTs:%" PRId64 " wait %.2fs and all tasks have same checkpointId", pe->req.taskId,
-               pe->req.startTs, (now - pe->ts) / 1000.0);
+        mDebug("s-task:0x%x vgId:%d term:%d sendTs:%" PRId64 " wait %.2fs or all tasks have same checkpointId:%" PRId64,
+               pe->req.taskId, pe->req.nodeId, pe->req.term, pe->req.startTs, (now - pe->ts) / 1000.0, chkId);
         if (chkId > pe->req.checkpointId) {
           streamMutexUnlock(&execInfo.lock);
           taosArrayDestroy(pStreamList);
+          taosArrayDestroy(pList);
+          taosHashCleanup(pTermMap);
           mError("s-task:0x%x checkpointId:%" PRId64 " is updated to %" PRId64 ", update it", pe->req.taskId,
                  pe->req.checkpointId, chkId);
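The qualification check added in this hunk compares the leader term recorded in each consensus request against the current per-vgroup term snapshot; a request from an unknown vgroup, or one sent under an older term, is disqualified for this round. A minimal standalone sketch of that rule (a plain array stands in for the SHashObj; names are illustrative, not TDengine code):

```c
#include <stdint.h>

typedef struct {
  int32_t vg_id;
  int32_t term;  /* leader term captured in the latest vgroup snapshot */
} VgTermEntry;

/* A consensus request qualifies only if the term it was sent under still
 * matches the snapshot term of its vgroup; requests from vgroups missing
 * from the snapshot are treated as stale. vg_id == -2 mirrors the sentinel
 * in the hunk above and skips the check entirely. */
static int req_qualified(const VgTermEntry *snap, int32_t n, int32_t vg_id, int32_t req_term) {
  if (vg_id == -2) return 1;
  for (int32_t i = 0; i < n; ++i) {
    if (snap[i].vg_id == vg_id) return snap[i].term == req_term;
  }
  return 0;
}
```

Rejecting stale-term requests prevents a checkpoint-id agreement from being driven by a request that predates a leader change on that vnode.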
@@ -2960,42 +3025,38 @@ int32_t mndProcessConsensusInTmr(SRpcMsg *pMsg) {
           return TSDB_CODE_FAILED;
         }
-        // todo: check for redundant consensus-checkpoint trans, if this kinds of trans repeatly failed.
-        code = mndCreateSetConsensusChkptIdTrans(pMnode, pStream, pe->req.taskId, chkId, pe->req.startTs);
-        if (code != TSDB_CODE_SUCCESS && code != TSDB_CODE_ACTION_IN_PROGRESS) {
-          mError("failed to create consensus-checkpoint trans, stream:0x%" PRIx64, pStream->uid);
-        }
-
-        void *p = taosArrayPush(pList, &pe->req.taskId);
-        if (p == NULL) {
-          mError("failed to put into task list, taskId:0x%x", pe->req.taskId);
-        }
       } else {
         mDebug("s-task:0x%x sendTs:%" PRId64 " wait %.2fs already, wait for next round to check", pe->req.taskId,
                pe->req.startTs, (now - pe->ts) / 1000.0);
+        allQualified = false;
+      }
+    }
+
+    if (allQualified) {
+      code = mndStreamTransConflictCheck(pMnode, pStream->uid, MND_STREAM_CHKPT_CONSEN_NAME, false);
+      if (code == 0) {
+        code = mndCreateSetConsensusChkptIdTrans(pMnode, pStream, chkId, pInfo->pTaskList);
+        if (code != TSDB_CODE_SUCCESS && code != TSDB_CODE_ACTION_IN_PROGRESS) {
+          mError("failed to create consensus-checkpoint trans, stream:0x%" PRIx64, pStream->uid);
+        } else {
+          numOfTrans += 1;
+          mndClearConsensusRspEntry(pInfo);
+          void *p = taosArrayPush(pStreamList, &streamId);
+          if (p == NULL) {
+            mError("failed to put into stream list, stream:0x%" PRIx64 " not remove it in consensus-chkpt list",
+                   streamId);
+          }
+        }
+      } else {
+        mDebug("stream:0x%" PRIx64 "not create chktp-consensus, due to trans conflict", pStream->uid);
       }
     }
     mndReleaseStream(pMnode, pStream);
-    int32_t alreadySend = doCleanReqList(pList, pInfo);
-
-    // clear request stream item with empty task list
-    if (taosArrayGetSize(pInfo->pTaskList) == 0) {
-      mndClearConsensusRspEntry(pInfo);
-      if (streamId == -1) {
-        mError("streamId is -1, streamId:%" PRIx64 " in consensus-checkpointId hashMap, cont", pInfo->streamId);
-      }
-      void *p = taosArrayPush(pStreamList, &streamId);
-      if (p == NULL) {
-        mError("failed to put into stream list, stream:0x%" PRIx64 " not remove it in consensus-chkpt list", streamId);
-      }
-    }
-
-    numOfTrans += alreadySend;
-    if (numOfTrans > maxAllowedTrans) {
-      mInfo("already send consensus-checkpointId trans:%d, try next time", alreadySend);
+    // create one transaction each time
+    if (numOfTrans > 0) {
       taosHashCancelIterate(execInfo.pStreamConsensus, pIter);
       break;
     }
@@ -3014,6 +3075,7 @@ int32_t mndProcessConsensusInTmr(SRpcMsg *pMsg) {
   taosArrayDestroy(pStreamList);
   taosArrayDestroy(pList);
+  taosHashCleanup(pTermMap);
   mDebug("end to process consensus-checkpointId in tmr, send consensus-checkpoint trans:%d", numOfTrans);
   return code;

View File

@@ -427,6 +427,8 @@ int32_t mndProcessStreamHb(SRpcMsg *pReq) {
           .taskId = p->id.taskId,
           .checkpointId = p->checkpointInfo.latestId,
           .startTs = pChkInfo->consensusTs,
+          .nodeId = p->nodeId,
+          .term = p->stage,
       };
       SStreamObj *pStream = NULL;
@@ -486,7 +488,7 @@ int32_t mndProcessStreamHb(SRpcMsg *pReq) {
   if (pMnode != NULL) {
     SArray *p = NULL;
-    code = mndTakeVgroupSnapshot(pMnode, &allReady, &p);
+    code = mndTakeVgroupSnapshot(pMnode, &allReady, &p, NULL);
     taosArrayDestroy(p);
     if (code) {
       mError("failed to get the vgroup snapshot, ignore it and continue");

View File

@@ -16,7 +16,7 @@
 #include "mndStream.h"
 #include "mndTrans.h"
-#define MAX_CHKPT_EXEC_ELAPSED (600*1000)    // 600s
+#define MAX_CHKPT_EXEC_ELAPSED (600*1000*3)  // 600s
 typedef struct SKeyInfo {
   void *pKey;
@@ -137,6 +137,7 @@ static int32_t doStreamTransConflictCheck(SMnode *pMnode, int64_t streamId, cons
     } else if ((strcmp(tInfo.name, MND_STREAM_CREATE_NAME) == 0) || (strcmp(tInfo.name, MND_STREAM_DROP_NAME) == 0) ||
                (strcmp(tInfo.name, MND_STREAM_TASK_RESET_NAME) == 0) ||
                (strcmp(tInfo.name, MND_STREAM_TASK_UPDATE_NAME) == 0) ||
+               (strcmp(tInfo.name, MND_STREAM_CHKPT_CONSEN_NAME) == 0) ||
               strcmp(tInfo.name, MND_STREAM_RESTART_NAME) == 0) {
       mWarn("conflict with other transId:%d streamUid:0x%" PRIx64 ", trans:%s", tInfo.transId, tInfo.streamId,
             tInfo.name);
@@ -152,7 +153,7 @@ static int32_t doStreamTransConflictCheck(SMnode *pMnode, int64_t streamId, cons
 // * Transactions of different streams are not related. Here only check the conflict of transaction for a given stream.
 // For a given stream:
 // 1. checkpoint trans is conflict with any other trans except for the drop and reset trans.
-// 2. create/drop/reset/update trans are conflict with any other trans.
+// 2. create/drop/reset/update/chkpt-consensus trans are conflict with any other trans.
 int32_t mndStreamTransConflictCheck(SMnode *pMnode, int64_t streamId, const char *pTransName, bool lock) {
   if (lock) {
     streamMutexLock(&execInfo.lock);

View File

@@ -113,8 +113,6 @@ static int32_t doSetResumeAction(STrans *pTrans, SMnode *pMnode, SStreamTask *pT
     taosMemoryFree(pReq);
     return code;
   }
-
-  mDebug("set the resume action for trans:%d", pTrans->id);
   return code;
 }
@@ -438,6 +436,8 @@ int32_t mndStreamSetResumeAction(STrans *pTrans, SMnode *pMnode, SStreamObj *pSt
     return code;
   }
+  mDebug("transId:%d start to create resume actions", pTrans->id);
+
   while (streamTaskIterNextTask(pIter)) {
     SStreamTask *pTask = NULL;
     code = streamTaskIterGetCurrent(pIter, &pTask);
@@ -578,7 +578,7 @@ int32_t mndStreamSetResetTaskAction(SMnode *pMnode, STrans *pTrans, SStreamObj *
   return 0;
 }
-int32_t mndStreamSetChkptIdAction(SMnode *pMnode, STrans *pTrans, SStreamTask* pTask, int64_t checkpointId, int64_t ts) {
+int32_t doSetCheckpointIdAction(SMnode *pMnode, STrans *pTrans, SStreamTask* pTask, int64_t checkpointId, int64_t ts) {
   SRestoreCheckpointInfo req = {
       .taskId = pTask->id.taskId,
       .streamId = pTask->id.streamId,
@@ -624,7 +624,7 @@ int32_t mndStreamSetChkptIdAction(SMnode *pMnode, STrans *pTrans, SStreamTask* p
     return code;
   }
-  code = setTransAction(pTrans, pBuf, tlen, TDMT_STREAM_CONSEN_CHKPT, &epset, 0, TSDB_CODE_VND_INVALID_VGROUP_ID);
+  code = setTransAction(pTrans, pBuf, tlen, TDMT_STREAM_CONSEN_CHKPT, &epset, TSDB_CODE_STREAM_TASK_IVLD_STATUS, TSDB_CODE_VND_INVALID_VGROUP_ID);
   if (code != TSDB_CODE_SUCCESS) {
     taosMemoryFree(pBuf);
   }
@@ -632,6 +632,50 @@ int32_t mndStreamSetChkptIdAction(SMnode *pMnode, STrans *pTrans, SStreamTask* p
   return code;
 }
+int32_t mndStreamSetChkptIdAction(SMnode *pMnode, STrans *pTrans, SStreamObj *pStream, int64_t checkpointId,
+                                  SArray *pList) {
+  SStreamTaskIter *pIter = NULL;
+  int32_t num = taosArrayGetSize(pList);
+
+  taosWLockLatch(&pStream->lock);
+  int32_t code = createStreamTaskIter(pStream, &pIter);
+  if (code) {
+    taosWUnLockLatch(&pStream->lock);
+    mError("failed to create stream task iter:%s", pStream->name);
+    return code;
+  }
+
+  while (streamTaskIterNextTask(pIter)) {
+    SStreamTask *pTask = NULL;
+    code = streamTaskIterGetCurrent(pIter, &pTask);
+    if (code) {
+      destroyStreamTaskIter(pIter);
+      taosWUnLockLatch(&pStream->lock);
+      return code;
+    }
+
+    // find the required entry
+    int64_t startTs = 0;
+    for (int32_t i = 0; i < num; ++i) {
+      SCheckpointConsensusEntry *pEntry = taosArrayGet(pList, i);
+      if (pEntry->req.taskId == pTask->id.taskId) {
+        startTs = pEntry->req.startTs;
+        break;
+      }
+    }
+
+    code = doSetCheckpointIdAction(pMnode, pTrans, pTask, checkpointId, startTs);
+    if (code != TSDB_CODE_SUCCESS) {
+      destroyStreamTaskIter(pIter);
+      taosWUnLockLatch(&pStream->lock);
+      return code;
+    }
+  }
+
+  destroyStreamTaskIter(pIter);
+  taosWUnLockLatch(&pStream->lock);
+  return 0;
+}
 int32_t mndStreamSetCheckpointAction(SMnode *pMnode, STrans *pTrans, SStreamTask *pTask, int64_t checkpointId,
                                      int8_t mndTrigger) {

View File

@@ -181,7 +181,7 @@ static int32_t mndCheckMnodeStatus(SMnode* pMnode) {
   return TSDB_CODE_SUCCESS;
 }
-static int32_t mndCheckAndAddVgroupsInfo(SMnode *pMnode, SArray *pVgroupList, bool* allReady) {
+static int32_t mndCheckAndAddVgroupsInfo(SMnode *pMnode, SArray *pVgroupList, bool* allReady, SHashObj* pTermMap) {
   SSdb *pSdb = pMnode->pSdb;
   void *pIter = NULL;
   SVgObj *pVgroup = NULL;
@@ -243,6 +243,14 @@ static int32_t mndCheckAndAddVgroupsInfo(SMnode *pMnode, SArray *pVgroupList, bo
       mDebug("take node snapshot, nodeId:%d %s", entry.nodeId, buf);
     }
+    if (pTermMap != NULL) {
+      int64_t term = pVgroup->vnodeGid[0].syncTerm;
+      code = taosHashPut(pTermMap, &pVgroup->vgId, sizeof(pVgroup->vgId), &term, sizeof(term));
+      if (code) {
+        mError("failed to put vnode:%d term into hashMap, code:%s", pVgroup->vgId, tstrerror(code));
+      }
+    }
+
     sdbRelease(pSdb, pVgroup);
   }
@@ -251,7 +259,7 @@ _end:
   return code;
 }
-int32_t mndTakeVgroupSnapshot(SMnode *pMnode, bool *allReady, SArray **pList) {
+int32_t mndTakeVgroupSnapshot(SMnode *pMnode, bool *allReady, SArray **pList, SHashObj* pTermMap) {
   int32_t code = 0;
   SArray *pVgroupList = NULL;
@@ -266,7 +274,7 @@ int32_t mndTakeVgroupSnapshot(SMnode *pMnode, bool *allReady, SArray **pList) {
   }
   // 1. check for all vnodes status
-  code = mndCheckAndAddVgroupsInfo(pMnode, pVgroupList, allReady);
+  code = mndCheckAndAddVgroupsInfo(pMnode, pVgroupList, allReady, pTermMap);
   if (code) {
     goto _err;
   }
@@ -728,15 +736,21 @@ int32_t mndScanCheckpointReportInfo(SRpcMsg *pReq) {
   SMnode *pMnode = pReq->info.node;
   void *pIter = NULL;
   int32_t code = 0;
-  SArray *pDropped = taosArrayInit(4, sizeof(int64_t));
-  if (pDropped == NULL) {
-    return terrno;
-  }
+  int32_t lino = 0;
+  SArray *pDropped = NULL;
   mDebug("start to scan checkpoint report info");
   streamMutexLock(&execInfo.lock);
+
+  int32_t num = taosHashGetSize(execInfo.pChkptStreams);
+  if (num == 0) {
+    goto _end;
+  }
+
+  pDropped = taosArrayInit(4, sizeof(int64_t));
+  TSDB_CHECK_NULL(pDropped, code, lino, _end, terrno);
+
   while ((pIter = taosHashIterate(execInfo.pChkptStreams, pIter)) != NULL) {
     SChkptReportInfo *px = (SChkptReportInfo *)pIter;
     if (taosArrayGetSize(px->pTaskList) == 0) {
@@ -804,42 +818,35 @@ int32_t mndScanCheckpointReportInfo(SRpcMsg *pReq) {
     mDebug("drop %d stream(s) in checkpoint-report list, remain:%d", size, numOfStreams);
   }
+_end:
   streamMutexUnlock(&execInfo.lock);
-  taosArrayDestroy(pDropped);
+  if (pDropped != NULL) {
+    taosArrayDestroy(pDropped);
+  }
   mDebug("end to scan checkpoint report info")
-  return TSDB_CODE_SUCCESS;
+  return code;
 }
-int32_t mndCreateSetConsensusChkptIdTrans(SMnode *pMnode, SStreamObj *pStream, int32_t taskId, int64_t checkpointId,
-                                          int64_t ts) {
-  char msg[128] = {0};
-  STrans *pTrans = NULL;
-  SStreamTask *pTask = NULL;
-
-  snprintf(msg, tListLen(msg), "set consen-chkpt-id for task:0x%x", taskId);
+int32_t mndCreateSetConsensusChkptIdTrans(SMnode *pMnode, SStreamObj *pStream, int64_t checkpointId, SArray* pList) {
+  char msg[128] = {0};
+  STrans *pTrans = NULL;
+
+  snprintf(msg, tListLen(msg), "set consen-chkpt-id for stream:0x%" PRIx64, pStream->uid);
   int32_t code = doCreateTrans(pMnode, pStream, NULL, TRN_CONFLICT_NOTHING, MND_STREAM_CHKPT_CONSEN_NAME, msg, &pTrans);
   if (pTrans == NULL || code != 0) {
     return terrno;
   }
-
-  STaskId id = {.streamId = pStream->uid, .taskId = taskId};
-  code = mndGetStreamTask(&id, pStream, &pTask);
-  if (code) {
-    mError("failed to get task:0x%x in stream:%s, failed to create consensus-checkpointId", taskId, pStream->name);
-    sdbRelease(pMnode->pSdb, pStream);
-    return code;
-  }
-
   code = mndStreamRegisterTrans(pTrans, MND_STREAM_CHKPT_CONSEN_NAME, pStream->uid);
   if (code) {
     sdbRelease(pMnode->pSdb, pStream);
     return code;
   }
-  code = mndStreamSetChkptIdAction(pMnode, pTrans, pTask, checkpointId, ts);
+  code = mndStreamSetChkptIdAction(pMnode, pTrans, pStream, checkpointId, pList);
   if (code != 0) {
     sdbRelease(pMnode->pSdb, pStream);
     mndTransDrop(pTrans);
@@ -854,8 +861,10 @@ int32_t mndCreateSetConsensusChkptIdTrans(SMnode *pMnode, SStreamObj *pStream, i
   }
   code = mndTransPrepare(pMnode, pTrans);
   if (code != TSDB_CODE_SUCCESS && code != TSDB_CODE_ACTION_IN_PROGRESS) {
-    mError("trans:%d, failed to prepare set consensus-chkptId trans since %s", pTrans->id, terrstr());
+    mError("trans:%d, failed to prepare set consensus-chkptId trans for stream:0x%" PRId64 " since %s", pTrans->id,
+           pStream->uid, tstrerror(code));
     sdbRelease(pMnode->pSdb, pStream);
     mndTransDrop(pTrans);
     return code;
@@ -911,13 +920,15 @@ void mndAddConsensusTasks(SCheckpointConsensusInfo *pInfo, const SRestoreCheckpo
     }
     if (p->req.taskId == info.req.taskId) {
-      mDebug("s-task:0x%x already in consensus-checkpointId list for stream:0x%" PRIx64 ", update ts %" PRId64
-             "->%" PRId64 " checkpointId:%" PRId64 " -> %" PRId64 " total existed:%d",
-             pRestoreInfo->taskId, pRestoreInfo->streamId, p->req.startTs, info.req.startTs, p->req.checkpointId,
-             info.req.checkpointId, num);
+      mDebug("s-task:0x%x already in consensus-checkpointId list for stream:0x%" PRIx64 ", update send reqTs %" PRId64
+             "->%" PRId64 " checkpointId:%" PRId64 " -> %" PRId64 " term:%d->%d total existed:%d",
+             pRestoreInfo->taskId, pRestoreInfo->streamId, p->req.startTs, info.req.startTs, p->req.checkpointId,
+             info.req.checkpointId, p->req.term, info.req.term, num);
       p->req.startTs = info.req.startTs;
       p->req.checkpointId = info.req.checkpointId;
       p->req.transId = info.req.transId;
+      p->req.nodeId = info.req.nodeId;
+      p->req.term = info.req.term;
       return;
     }
   }
@@ -927,9 +938,10 @@ void mndAddConsensusTasks(SCheckpointConsensusInfo *pInfo, const SRestoreCheckpo
     mError("s-task:0x%x failed to put task into consensus-checkpointId list, code: out of memory", info.req.taskId);
   } else {
     num = taosArrayGetSize(pInfo->pTaskList);
-    mDebug("s-task:0x%x checkpointId:%" PRId64 " added into consensus-checkpointId list, stream:0x%" PRIx64
-           " waiting tasks:%d",
-           pRestoreInfo->taskId, pRestoreInfo->checkpointId, pRestoreInfo->streamId, num);
+    mDebug("s-task:0x%x (vgId:%d) checkpointId:%" PRId64 " term:%d, reqTs:%" PRId64
+           " added into consensus-checkpointId list, stream:0x%" PRIx64 " waiting tasks:%d",
+           pRestoreInfo->taskId, pRestoreInfo->nodeId, pRestoreInfo->checkpointId, info.req.term,
+           info.req.startTs, pRestoreInfo->streamId, num);
   }
 }
@@ -947,6 +959,7 @@ int32_t mndClearConsensusCheckpointId(SHashObj *pHash, int64_t streamId) {
   code = taosHashRemove(pHash, &streamId, sizeof(streamId));
   if (code == 0) {
+    numOfStreams = taosHashGetSize(pHash);
     mDebug("drop stream:0x%" PRIx64 " in consensus-checkpointId list, remain:%d", streamId, numOfStreams);
   } else {
     mError("failed to remove stream:0x%" PRIx64 " in consensus-checkpointId list, remain:%d", streamId, numOfStreams);
@@ -1686,7 +1699,7 @@ static int32_t doCheckForUpdated(SMnode *pMnode, SArray **ppNodeSnapshot) {
     }
   }
-  int32_t code = mndTakeVgroupSnapshot(pMnode, &allReady, ppNodeSnapshot);
+  int32_t code = mndTakeVgroupSnapshot(pMnode, &allReady, ppNodeSnapshot, NULL);
   if (code) {
     mError("failed to get the vgroup snapshot, ignore it and continue");
   }


@@ -157,7 +157,7 @@ int32_t tqSetDstTableDataPayload(uint64_t suid, const STSchema* pTSchema, int32_
 int32_t doMergeExistedRows(SSubmitTbData* pExisted, const SSubmitTbData* pNew, const char* id);
 int32_t buildAutoCreateTableReq(const char* stbFullName, int64_t suid, int32_t numOfCols, SSDataBlock* pDataBlock,
-                                SArray* pTagArray, bool newSubTableRule, SVCreateTbReq** pReq);
+                                SArray* pTagArray, bool newSubTableRule, SVCreateTbReq** pReq, const char* id);
 int32_t tqExtractDropCtbDataBlock(const void* data, int32_t len, int64_t ver, void** pRefBlock, int32_t type);
 // tq send notifications


@@ -962,7 +962,7 @@ int32_t tsdbCacheDeleteLastrow(SLRUCache *pCache, tb_uid_t uid, TSKEY eKey);
 int32_t tsdbCacheDeleteLast(SLRUCache *pCache, tb_uid_t uid, TSKEY eKey);
 int32_t tsdbCacheDelete(SLRUCache *pCache, tb_uid_t uid, TSKEY eKey);
-int32_t tsdbGetS3Size(STsdb *tsdb, int64_t *size);
+int32_t tsdbGetFsSize(STsdb *tsdb, SDbSizeStatisInfo *pInfo);
 // ========== inline functions ==========
 static FORCE_INLINE int32_t tsdbKeyCmprFn(const void *p1, const void *p2) {


@@ -196,7 +196,8 @@ int32_t smaBlockToSubmit(SVnode *pVnode, const SArray *pBlocks, const STSchema *
   int32_t cid = taosArrayGetSize(pDataBlock->pDataBlock) + 1;
-  TAOS_CHECK_EXIT(buildAutoCreateTableReq(stbFullName, suid, cid, pDataBlock, tagArray, true, &tbData.pCreateTbReq));
+  TAOS_CHECK_EXIT(buildAutoCreateTableReq(stbFullName, suid, cid, pDataBlock, tagArray, true, &tbData.pCreateTbReq,
+                                          ""));
   {
     uint64_t groupId = pDataBlock->info.id.groupId;


@@ -1078,11 +1078,19 @@ int32_t tqProcessTaskScanHistory(STQ* pTq, SRpcMsg* pMsg) {
   // 1. get the related stream task
   code = streamMetaAcquireTask(pMeta, pTask->streamTaskId.streamId, pTask->streamTaskId.taskId, &pStreamTask);
   if (pStreamTask == NULL) {
-    tqError("failed to find s-task:0x%" PRIx64 ", it may have been destroyed, drop related fill-history task:%s",
-            pTask->streamTaskId.taskId, pTask->id.idStr);
-    tqDebug("s-task:%s fill-history task set status to be dropping", id);
-    code = streamBuildAndSendDropTaskMsg(pTask->pMsgCb, pMeta->vgId, &pTask->id, 0);
+    int32_t ret = streamMetaAcquireTaskUnsafe(pMeta, &pTask->streamTaskId, &pStreamTask);
+    if (ret == 0 && pStreamTask != NULL) {
+      tqWarn("s-task:0x%" PRIx64 " stopped, not ready for related task:%s scan-history work, do nothing",
+             pTask->streamTaskId.taskId, pTask->id.idStr);
+      streamMetaReleaseTask(pMeta, pStreamTask);
+    } else {
+      tqError("failed to find s-task:0x%" PRIx64 ", it may have been destroyed, drop related fill-history task:%s",
+              pTask->streamTaskId.taskId, pTask->id.idStr);
+      tqDebug("s-task:%s fill-history task set status to be dropping", id);
+      code = streamBuildAndSendDropTaskMsg(pTask->pMsgCb, pMeta->vgId, &pTask->id, 0);
+    }
     atomic_store_32(&pTask->status.inScanHistorySentinel, 0);
     streamMetaReleaseTask(pMeta, pTask);
@@ -1371,10 +1379,24 @@ int32_t tqProcessTaskCheckPointSourceReq(STQ* pTq, SRpcMsg* pMsg, SRpcMsg* pRsp)
 int32_t tqProcessTaskCheckpointReadyMsg(STQ* pTq, SRpcMsg* pMsg) {
   int32_t vgId = TD_VID(pTq->pVnode);
-  SStreamCheckpointReadyMsg* pReq = (SStreamCheckpointReadyMsg*)pMsg->pCont;
   if (!vnodeIsRoleLeader(pTq->pVnode)) {
-    tqError("vgId:%d not leader, ignore the retrieve checkpoint-trigger msg from 0x%x", vgId,
-            (int32_t)pReq->downstreamTaskId);
+    char*                     msg = POINTER_SHIFT(pMsg->pCont, sizeof(SMsgHead));
+    int32_t                   len = pMsg->contLen - sizeof(SMsgHead);
+    int32_t                   code = 0;
+    SDecoder                  decoder;
+    SStreamCheckpointReadyMsg req = {0};
+    tDecoderInit(&decoder, (uint8_t*)msg, len);
+    if (tDecodeStreamCheckpointReadyMsg(&decoder, &req) < 0) {
+      code = TSDB_CODE_MSG_DECODE_ERROR;
+      tDecoderClear(&decoder);
+      return code;
+    }
+    tDecoderClear(&decoder);
+    tqError("vgId:%d not leader, s-task:0x%x ignore the retrieve checkpoint-trigger msg from s-task:0x%x vgId:%d", vgId,
+            req.upstreamTaskId, req.downstreamTaskId, req.downstreamNodeId);
     return TSDB_CODE_STREAM_NOT_LEADER;
   }


@@ -39,7 +39,7 @@ static int32_t initCreateTableMsg(SVCreateTbReq* pCreateTableReq, uint64_t suid,
                                   int32_t numOfTags);
 static int32_t createDefaultTagColName(SArray** pColNameList);
 static int32_t setCreateTableMsgTableName(SVCreateTbReq* pCreateTableReq, SSDataBlock* pDataBlock,
-                                          const char* stbFullName, int64_t gid, bool newSubTableRule);
+                                          const char* stbFullName, int64_t gid, bool newSubTableRule, const char* id);
 static int32_t doCreateSinkTableInfo(const char* pDstTableName, STableSinkInfo** pInfo);
 static int32_t doPutSinkTableInfoIntoCache(SSHashObj* pSinkTableMap, STableSinkInfo* pTableSinkInfo, uint64_t groupId,
                                            const char* id);
@@ -260,7 +260,7 @@ int32_t createDefaultTagColName(SArray** pColNameList) {
 }
 int32_t setCreateTableMsgTableName(SVCreateTbReq* pCreateTableReq, SSDataBlock* pDataBlock, const char* stbFullName,
-                                   int64_t gid, bool newSubTableRule) {
+                                   int64_t gid, bool newSubTableRule, const char* id) {
   if (pDataBlock->info.parTbName[0]) {
     if (newSubTableRule && !isAutoTableName(pDataBlock->info.parTbName) &&
         !alreadyAddGroupId(pDataBlock->info.parTbName, gid) && gid != 0 && stbFullName) {
@@ -274,16 +274,17 @@ int32_t setCreateTableMsgTableName(SVCreateTbReq* pCreateTableReq, SSDataBlock*
       if (code != TSDB_CODE_SUCCESS) {
         return code;
       }
-      // tqDebug("gen name from:%s", pDataBlock->info.parTbName);
+      tqDebug("s-task:%s gen name from:%s blockdata", id, pDataBlock->info.parTbName);
     } else {
       pCreateTableReq->name = taosStrdup(pDataBlock->info.parTbName);
       if (pCreateTableReq->name == NULL) {
         return terrno;
       }
-      // tqDebug("copy name:%s", pDataBlock->info.parTbName);
+      tqDebug("s-task:%s copy name:%s from blockdata", id, pDataBlock->info.parTbName);
     }
   } else {
     int32_t code = buildCtbNameByGroupId(stbFullName, gid, &pCreateTableReq->name);
+    tqDebug("s-task:%s no name in blockdata, auto-created table name:%s", id, pCreateTableReq->name);
     return code;
   }
@@ -389,7 +390,8 @@ static int32_t doBuildAndSendCreateTableMsg(SVnode* pVnode, char* stbFullName, S
       }
     }
-    code = setCreateTableMsgTableName(pCreateTbReq, pDataBlock, stbFullName, gid, IS_NEW_SUBTB_RULE(pTask));
+    code = setCreateTableMsgTableName(pCreateTbReq, pDataBlock, stbFullName, gid, IS_NEW_SUBTB_RULE(pTask),
+                                      pTask->id.idStr);
     if (code) {
       goto _end;
     }
@@ -641,7 +643,7 @@ bool isValidDstChildTable(SMetaReader* pReader, int32_t vgId, const char* ctbNam
 }
 int32_t buildAutoCreateTableReq(const char* stbFullName, int64_t suid, int32_t numOfCols, SSDataBlock* pDataBlock,
-                                SArray* pTagArray, bool newSubTableRule, SVCreateTbReq** pReq) {
+                                SArray* pTagArray, bool newSubTableRule, SVCreateTbReq** pReq, const char* id) {
   *pReq = NULL;
   SVCreateTbReq* pCreateTbReq = taosMemoryCalloc(1, sizeof(SVCreateTbReq));
@@ -674,7 +676,8 @@ int32_t buildAutoCreateTableReq(const char* stbFullName, int64_t suid, int32_t n
   }
   // set table name
-  code = setCreateTableMsgTableName(pCreateTbReq, pDataBlock, stbFullName, pDataBlock->info.id.groupId, newSubTableRule);
+  code = setCreateTableMsgTableName(pCreateTbReq, pDataBlock, stbFullName, pDataBlock->info.id.groupId, newSubTableRule,
+                                    id);
   if (code) {
     return code;
   }
@@ -1041,7 +1044,7 @@ int32_t setDstTableDataUid(SVnode* pVnode, SStreamTask* pTask, SSDataBlock* pDat
     pTableData->flags = SUBMIT_REQ_AUTO_CREATE_TABLE;
     code = buildAutoCreateTableReq(stbFullName, suid, pTSchema->numOfCols + 1, pDataBlock, pTagArray,
-                                   IS_NEW_SUBTB_RULE(pTask), &pTableData->pCreateTbReq);
+                                   IS_NEW_SUBTB_RULE(pTask), &pTableData->pCreateTbReq, id);
     taosArrayDestroy(pTagArray);
     if (code) {
@@ -1164,8 +1167,8 @@ void tqSinkDataIntoDstTable(SStreamTask* pTask, void* vnode, void* data) {
   bool onlySubmitData = hasOnlySubmitData(pBlocks, numOfBlocks);
   if (!onlySubmitData || pTask->subtableWithoutMd5 == 1) {
-    tqDebug("vgId:%d, s-task:%s write %d stream resBlock(s) into table, has delete block, submit one-by-one", vgId, id,
-            numOfBlocks);
+    tqDebug("vgId:%d, s-task:%s write %d stream resBlock(s) into table, has other type block, submit one-by-one", vgId,
+            id, numOfBlocks);
     for (int32_t i = 0; i < numOfBlocks; ++i) {
       if (streamTaskShouldStop(pTask)) {


@@ -87,7 +87,7 @@ static void doStartScanWal(void* param, void* tmrId) {
   tmr_h                  pTimer = NULL;
   SBuildScanWalMsgParam* pParam = (SBuildScanWalMsgParam*)param;
-  tqDebug("start to do scan wal in tmr, metaRid:%" PRId64, pParam->metaId);
+  tqTrace("start to do scan wal in tmr, metaRid:%" PRId64, pParam->metaId);
   SStreamMeta* pMeta = taosAcquireRef(streamMetaRefPool, pParam->metaId);
   if (pMeta == NULL) {
@@ -173,7 +173,7 @@ static void doStartScanWal(void* param, void* tmrId) {
 _end:
   streamTmrStart(doStartScanWal, SCAN_WAL_IDLE_DURATION, pParam, pTimer, &pMeta->scanInfo.scanTimer, vgId, "scan-wal");
-  tqDebug("vgId:%d try scan-wal will start in %dms", vgId, SCAN_WAL_IDLE_DURATION*SCAN_WAL_WAIT_COUNT);
+  tqTrace("vgId:%d try scan-wal will start in %dms", vgId, SCAN_WAL_IDLE_DURATION*SCAN_WAL_WAIT_COUNT);
   code = taosReleaseRef(streamMetaRefPool, pParam->metaId);
   if (code) {


@@ -567,13 +567,13 @@ int32_t tqStreamTaskProcessCheckRsp(SStreamMeta* pMeta, SRpcMsg* pMsg, bool isLe
   if (!isLeader) {
     tqError("vgId:%d not leader, task:0x%x not handle the check rsp, downstream:0x%x (vgId:%d)", vgId,
             rsp.upstreamTaskId, rsp.downstreamTaskId, rsp.downstreamNodeId);
-    return streamMetaAddFailedTask(pMeta, rsp.streamId, rsp.upstreamTaskId);
+    return streamMetaAddFailedTask(pMeta, rsp.streamId, rsp.upstreamTaskId, true);
   }
   SStreamTask* pTask = NULL;
   code = streamMetaAcquireTask(pMeta, rsp.streamId, rsp.upstreamTaskId, &pTask);
   if ((pTask == NULL) || (code != 0)) {
-    return streamMetaAddFailedTask(pMeta, rsp.streamId, rsp.upstreamTaskId);
+    return streamMetaAddFailedTask(pMeta, rsp.streamId, rsp.upstreamTaskId, true);
   }
   code = streamTaskProcessCheckRsp(pTask, &rsp);
@@ -773,6 +773,9 @@ int32_t tqStreamTaskProcessDropReq(SStreamMeta* pMeta, char* msg, int32_t msgLen
   // commit the update
   int32_t numOfTasks = streamMetaGetNumOfTasks(pMeta);
   tqDebug("vgId:%d task:0x%x dropped, remain tasks:%d", vgId, pReq->taskId, numOfTasks);
+  if (numOfTasks == 0) {
+    streamMetaResetStartInfo(&pMeta->startInfo, vgId);
+  }
   if (streamMetaCommit(pMeta) < 0) {
     // persist to disk
@@ -813,21 +816,47 @@ int32_t tqStreamTaskProcessUpdateCheckpointReq(SStreamMeta* pMeta, bool restored
 }
 static int32_t restartStreamTasks(SStreamMeta* pMeta, bool isLeader) {
   int32_t vgId = pMeta->vgId;
   int32_t code = 0;
   int64_t st = taosGetTimestampMs();
+  STaskStartInfo* pStartInfo = &pMeta->startInfo;
-  streamMetaWLock(pMeta);
-  if (pMeta->startInfo.startAllTasks == 1) {
-    pMeta->startInfo.restartCount += 1;
-    tqDebug("vgId:%d in start tasks procedure, inc restartCounter by 1, remaining restart:%d", vgId,
-            pMeta->startInfo.restartCount);
-    streamMetaWUnLock(pMeta);
+  if (pStartInfo->startAllTasks == 1) {
+    // wait for the checkpoint id rsp, this rsp will be expired
+    if (pStartInfo->curStage == START_MARK_REQ_CHKPID) {
+      SStartTaskStageInfo* pCurStageInfo = taosArrayGetLast(pStartInfo->pStagesList);
+      tqInfo("vgId:%d only mark the req consensus checkpointId flag, reqTs:%" PRId64 " ignore and continue", vgId,
+             pCurStageInfo->ts);
+      taosArrayClear(pStartInfo->pStagesList);
+      pStartInfo->curStage = 0;
+      goto _start;
+    } else if (pStartInfo->curStage == START_WAIT_FOR_CHKPTID) {
+      SStartTaskStageInfo* pCurStageInfo = taosArrayGetLast(pStartInfo->pStagesList);
+      tqInfo("vgId:%d already sent consensus-checkpoint msg(waiting for chkptid) expired, reqTs:%" PRId64
+             " rsp will be discarded",
+             vgId, pCurStageInfo->ts);
+      taosArrayClear(pStartInfo->pStagesList);
+      pStartInfo->curStage = 0;
+      goto _start;
+    } else if (pStartInfo->curStage == START_CHECK_DOWNSTREAM) {
+      pStartInfo->restartCount += 1;
+      tqDebug(
+          "vgId:%d in start tasks procedure (check downstream), inc restartCounter by 1 and wait for it completes, "
+          "remaining restart:%d",
+          vgId, pStartInfo->restartCount);
+    } else {
+      tqInfo("vgId:%d in start procedure, but not start to do anything yet, do nothing", vgId);
+    }
     return TSDB_CODE_SUCCESS;
   }
-  pMeta->startInfo.startAllTasks = 1;
+_start:
+  pStartInfo->startAllTasks = 1;
   terrno = 0;
   tqInfo("vgId:%d tasks are all updated and stopped, restart all tasks, triggered by transId:%d, ts:%" PRId64, vgId,
          pMeta->updateInfo.completeTransId, pMeta->updateInfo.completeTs);
@@ -835,24 +864,15 @@ static int32_t restartStreamTasks(SStreamMeta* pMeta, bool isLeader) {
   streamMetaClear(pMeta);
   int64_t el = taosGetTimestampMs() - st;
-  tqInfo("vgId:%d close&reload state elapsed time:%.3fs", vgId, el / 1000.);
+  tqInfo("vgId:%d clear&close stream meta completed, elapsed time:%.3fs", vgId, el / 1000.);
   streamMetaLoadAllTasks(pMeta);
-  {
-    STaskStartInfo* pStartInfo = &pMeta->startInfo;
-    taosHashClear(pStartInfo->pReadyTaskSet);
-    taosHashClear(pStartInfo->pFailedTaskSet);
-    pStartInfo->readyTs = 0;
-  }
   if (isLeader && !tsDisableStream) {
-    streamMetaWUnLock(pMeta);
     code = streamMetaStartAllTasks(pMeta);
   } else {
     streamMetaResetStartInfo(&pMeta->startInfo, pMeta->vgId);
-    pMeta->startInfo.restartCount = 0;
-    streamMetaWUnLock(pMeta);
+    pStartInfo->restartCount = 0;
     tqInfo("vgId:%d, follower node not start stream tasks or stream is disabled", vgId);
   }
@@ -882,16 +902,20 @@ int32_t tqStreamTaskProcessRunReq(SStreamMeta* pMeta, SRpcMsg* pMsg, bool isLead
     code = streamMetaStartOneTask(pMeta, req.streamId, req.taskId);
     return 0;
   } else if (type == STREAM_EXEC_T_START_ALL_TASKS) {
+    streamMetaWLock(pMeta);
     code = streamMetaStartAllTasks(pMeta);
+    streamMetaWUnLock(pMeta);
     return 0;
   } else if (type == STREAM_EXEC_T_RESTART_ALL_TASKS) {
+    streamMetaWLock(pMeta);
     code = restartStreamTasks(pMeta, isLeader);
+    streamMetaWUnLock(pMeta);
     return 0;
   } else if (type == STREAM_EXEC_T_STOP_ALL_TASKS) {
     code = streamMetaStopAllTasks(pMeta);
     return 0;
   } else if (type == STREAM_EXEC_T_ADD_FAILED_TASK) {
-    code = streamMetaAddFailedTask(pMeta, req.streamId, req.taskId);
+    code = streamMetaAddFailedTask(pMeta, req.streamId, req.taskId, true);
     return code;
   } else if (type == STREAM_EXEC_T_STOP_ONE_TASK) {
     code = streamMetaStopOneTask(pMeta, req.streamId, req.taskId);
@@ -948,7 +972,7 @@ int32_t tqStartTaskCompleteCallback(SStreamMeta* pMeta) {
   bool    scanWal = false;
   int32_t code = 0;
-  streamMetaWLock(pMeta);
+  // streamMetaWLock(pMeta);
   if (pStartInfo->startAllTasks == 1) {
     tqDebug("vgId:%d already in start tasks procedure in other thread, restartCounter:%d, do nothing", vgId,
             pMeta->startInfo.restartCount);
@@ -960,7 +984,7 @@ int32_t tqStartTaskCompleteCallback(SStreamMeta* pMeta) {
       pStartInfo->restartCount -= 1;
       tqDebug("vgId:%d role:%d need to restart all tasks again, restartCounter:%d", vgId, pMeta->role,
               pStartInfo->restartCount);
-      streamMetaWUnLock(pMeta);
+      // streamMetaWUnLock(pMeta);
       return restartStreamTasks(pMeta, (pMeta->role == NODE_ROLE_LEADER));
     } else {
@@ -975,7 +999,7 @@ int32_t tqStartTaskCompleteCallback(SStreamMeta* pMeta) {
     }
   }
-  streamMetaWUnLock(pMeta);
+  // streamMetaWUnLock(pMeta);
   return code;
 }
@@ -1237,7 +1261,7 @@ static int32_t tqProcessTaskResumeImpl(void* handle, SStreamTask* pTask, int64_t
     pTask->hTaskInfo.operatorOpen = false;
     code = streamStartScanHistoryAsync(pTask, igUntreated);
   } else if (level == TASK_LEVEL__SOURCE && (streamQueueGetNumOfItems(pTask->inputq.queue) == 0)) {
     // code = tqScanWalAsync((STQ*)handle, false);
   } else {
     code = streamTrySchedExec(pTask, false);
   }
@@ -1357,12 +1381,19 @@ int32_t tqStreamTaskProcessConsenChkptIdReq(SStreamMeta* pMeta, SRpcMsg* pMsg) {
   code = streamMetaAcquireTask(pMeta, req.streamId, req.taskId, &pTask);
   if (pTask == NULL || (code != 0)) {
-    tqError("vgId:%d process consensus checkpointId req, failed to acquire task:0x%x, it may have been dropped already",
-            pMeta->vgId, req.taskId);
-    // ignore this code to avoid error code over write
-    int32_t ret = streamMetaAddFailedTask(pMeta, req.streamId, req.taskId);
-    if (ret) {
-      tqError("s-task:0x%x failed add check downstream failed, core:%s", req.taskId, tstrerror(ret));
+    // ignore this code to avoid error code over writing
+    if (pMeta->role == NODE_ROLE_LEADER) {
+      tqError("vgId:%d process consensus checkpointId req:%" PRId64
+              " transId:%d, failed to acquire task:0x%x, it may have been dropped/stopped already",
+              pMeta->vgId, req.checkpointId, req.transId, req.taskId);
+      int32_t ret = streamMetaAddFailedTask(pMeta, req.streamId, req.taskId, true);
+      if (ret) {
+        tqError("s-task:0x%x failed add check downstream failed, core:%s", req.taskId, tstrerror(ret));
+      }
+    } else {
+      tqDebug("vgId:%d task:0x%x stopped in follower node, not set the consensus checkpointId:%" PRId64 " transId:%d",
+              pMeta->vgId, req.taskId, req.checkpointId, req.transId);
     }
     return 0;
@@ -1370,19 +1401,26 @@ int32_t tqStreamTaskProcessConsenChkptIdReq(SStreamMeta* pMeta, SRpcMsg* pMsg) {
   // discard the rsp, since it is expired.
   if (req.startTs < pTask->execInfo.created) {
-    tqWarn("s-task:%s vgId:%d create time:%" PRId64 " recv expired consensus checkpointId:%" PRId64
+    tqWarn("s-task:%s vgId:%d createTs:%" PRId64 " recv expired consensus checkpointId:%" PRId64
            " from task createTs:%" PRId64 " < task createTs:%" PRId64 ", discard",
            pTask->id.idStr, pMeta->vgId, pTask->execInfo.created, req.checkpointId, req.startTs,
            pTask->execInfo.created);
-    streamMetaAddFailedTaskSelf(pTask, now);
+    if (pMeta->role == NODE_ROLE_LEADER) {
+      streamMetaAddFailedTaskSelf(pTask, now, true);
+    }
     streamMetaReleaseTask(pMeta, pTask);
     return TSDB_CODE_SUCCESS;
   }
-  tqDebug("s-task:%s vgId:%d checkpointId:%" PRId64 " restore to consensus-checkpointId:%" PRId64 " from mnode",
-          pTask->id.idStr, vgId, pTask->chkInfo.checkpointId, req.checkpointId);
+  tqDebug("s-task:%s vgId:%d checkpointId:%" PRId64 " restore to consensus-checkpointId:%" PRId64
+          " transId:%d from mnode, reqTs:%" PRId64 " task createTs:%" PRId64,
+          pTask->id.idStr, vgId, pTask->chkInfo.checkpointId, req.checkpointId, req.transId, req.startTs,
+          pTask->execInfo.created);
   streamMutexLock(&pTask->lock);
+  SConsenChkptInfo* pConsenInfo = &pTask->status.consenChkptInfo;
   if (pTask->chkInfo.checkpointId < req.checkpointId) {
     tqFatal("s-task:%s vgId:%d invalid consensus-checkpointId:%" PRId64 ", greater than existed checkpointId:%" PRId64,
             pTask->id.idStr, vgId, req.checkpointId, pTask->chkInfo.checkpointId);
@@ -1392,9 +1430,8 @@ int32_t tqStreamTaskProcessConsenChkptIdReq(SStreamMeta* pMeta, SRpcMsg* pMsg) {
     return 0;
   }
-  SConsenChkptInfo* pConsenInfo = &pTask->status.consenChkptInfo;
   if (pConsenInfo->consenChkptTransId >= req.transId) {
-    tqDebug("s-task:%s vgId:%d latest consensus transId:%d, expired consensus trans:%d, discard", pTask->id.idStr, vgId,
+    tqWarn("s-task:%s vgId:%d latest consensus transId:%d, expired consensus trans:%d, discard", pTask->id.idStr, vgId,
            pConsenInfo->consenChkptTransId, req.transId);
     streamMutexUnlock(&pTask->lock);
     streamMetaReleaseTask(pMeta, pTask);
@@ -1414,6 +1451,19 @@ int32_t tqStreamTaskProcessConsenChkptIdReq(SStreamMeta* pMeta, SRpcMsg* pMsg) {
   streamTaskSetConsenChkptIdRecv(pTask, req.transId, now);
   streamMutexUnlock(&pTask->lock);
+  streamMetaWLock(pTask->pMeta);
+  if (pMeta->startInfo.curStage == START_WAIT_FOR_CHKPTID) {
+    pMeta->startInfo.curStage = START_CHECK_DOWNSTREAM;
+
+    SStartTaskStageInfo info = {.stage = pMeta->startInfo.curStage, .ts = now};
+    taosArrayPush(pMeta->startInfo.pStagesList, &info);
+
+    tqDebug("vgId:%d wait_for_chkptId stage -> check_down_stream stage, reqTs:%" PRId64 " , numOfStageHist:%d",
+            pMeta->vgId, info.ts, (int32_t)taosArrayGetSize(pMeta->startInfo.pStagesList));
+  }
+  streamMetaWUnLock(pTask->pMeta);
+
   if (pMeta->role == NODE_ROLE_LEADER) {
     code = tqStreamStartOneTaskAsync(pMeta, pTask->pMsgCb, req.streamId, req.taskId);
     if (code) {


@@ -1413,4 +1413,62 @@ void tsdbFileSetReaderClose(struct SFileSetReader **ppReader) {
   *ppReader = NULL;
   return;
 }
+
+static FORCE_INLINE void getLevelSize(const STFileObj *fObj, int64_t szArr[TFS_MAX_TIERS]) {
+  if (fObj == NULL) return;
+
+  int64_t sz = fObj->f->size;
+  // level == 0, primary storage
+  // level == 1, second storage
+  // level == 2, third storage
+  int32_t level = fObj->f->did.level;
+  if (level >= 0 && level < TFS_MAX_TIERS) {
+    szArr[level] += sz;
+  }
+}
+
+static FORCE_INLINE int32_t tsdbGetFsSizeImpl(STsdb *tsdb, SDbSizeStatisInfo *pInfo) {
+  int32_t code = 0;
+  int64_t levelSize[TFS_MAX_TIERS] = {0};
+  int64_t s3Size = 0;
+
+  const STFileSet *fset;
+  const SSttLvl   *stt = NULL;
+  const STFileObj *fObj = NULL;
+
+  SVnodeCfg *pCfg = &tsdb->pVnode->config;
+  int64_t    chunksize = (int64_t)pCfg->tsdbPageSize * pCfg->s3ChunkSize;
+
+  TARRAY2_FOREACH(tsdb->pFS->fSetArr, fset) {
+    for (int32_t t = TSDB_FTYPE_MIN; t < TSDB_FTYPE_MAX; ++t) {
+      getLevelSize(fset->farr[t], levelSize);
+    }
+
+    TARRAY2_FOREACH(fset->lvlArr, stt) {
+      TARRAY2_FOREACH(stt->fobjArr, fObj) { getLevelSize(fObj, levelSize); }
+    }
+
+    fObj = fset->farr[TSDB_FTYPE_DATA];
+    if (fObj) {
+      int32_t lcn = fObj->f->lcn;
+      if (lcn > 1) {
+        s3Size += ((lcn - 1) * chunksize);
+      }
+    }
+  }
+
+  pInfo->l1Size = levelSize[0];
+  pInfo->l2Size = levelSize[1];
+  pInfo->l3Size = levelSize[2];
+  pInfo->s3Size = s3Size;
+
+  return code;
+}
+
+int32_t tsdbGetFsSize(STsdb *tsdb, SDbSizeStatisInfo *pInfo) {
+  int32_t code = 0;
+  (void)taosThreadMutexLock(&tsdb->mutex);
+  code = tsdbGetFsSizeImpl(tsdb, pInfo);
+  (void)taosThreadMutexUnlock(&tsdb->mutex);
+  return code;
+}


@@ -761,33 +761,5 @@ int32_t tsdbAsyncS3Migrate(STsdb *tsdb, int64_t now) {
   return code;
 }
-
-static int32_t tsdbGetS3SizeImpl(STsdb *tsdb, int64_t *size) {
-  int32_t code = 0;
-
-  SVnodeCfg *pCfg = &tsdb->pVnode->config;
-  int64_t    chunksize = (int64_t)pCfg->tsdbPageSize * pCfg->s3ChunkSize;
-  STFileSet *fset;
-
-  TARRAY2_FOREACH(tsdb->pFS->fSetArr, fset) {
-    STFileObj *fobj = fset->farr[TSDB_FTYPE_DATA];
-    if (fobj) {
-      int32_t lcn = fobj->f->lcn;
-      if (lcn > 1) {
-        *size += ((lcn - 1) * chunksize);
-      }
-    }
-  }
-
-  return code;
-}
 #endif
-
-int32_t tsdbGetS3Size(STsdb *tsdb, int64_t *size) {
-  int32_t code = 0;
-#ifdef USE_S3
-  (void)taosThreadMutexLock(&tsdb->mutex);
-  code = tsdbGetS3SizeImpl(tsdb, size);
-  (void)taosThreadMutexUnlock(&tsdb->mutex);
-#endif
-  return code;
-}


@@ -401,6 +401,12 @@ int vnodeValidateTableHash(SVnode *pVnode, char *tableFName) {
   }
   if (hashValue < pVnode->config.hashBegin || hashValue > pVnode->config.hashEnd) {
+    vInfo("vgId:%d, %u, %u, hashVal: %u, restored:%d", pVnode->config.vgId, pVnode->config.hashBegin,
+          pVnode->config.hashEnd, hashValue, pVnode->restored);
+
+    vError("vgId:%d invalid table name:%s, hashVal:0x%x, range [0x%x, 0x%x]", pVnode->config.vgId,
+           tableFName, hashValue, pVnode->config.hashBegin, pVnode->config.hashEnd);
+
     return terrno = TSDB_CODE_VND_HASH_MISMATCH;
   }


@@ -14,6 +14,7 @@
  */
 #include "tsdb.h"
+#include "tutil.h"
 #include "vnd.h"
 #define VNODE_GET_LOAD_RESET_VALS(pVar, oVal, vType, tags) \
@@ -617,10 +618,10 @@ int32_t vnodeReadVSubtables(SReadHandle* pHandle, int64_t suid, SArray** ppRes)
   SVCTableRefCols* pTb = NULL;
   int32_t          refColsNum = 0;
   char             tbFName[TSDB_TABLE_FNAME_LEN];
   SArray *pList = taosArrayInit(10, sizeof(uint64_t));
   QUERY_CHECK_NULL(pList, code, line, _return, terrno);
   QUERY_CHECK_CODE(pHandle->api.metaFn.getChildTableList(pHandle->vnode, suid, pList), line, _return);
   size_t num = taosArrayGetSize(pList);
@@ -655,7 +656,7 @@ int32_t vnodeReadVSubtables(SReadHandle* pHandle, int64_t suid, SArray** ppRes)
pTb->uid = mr.me.uid;
pTb->numOfColRefs = refColsNum;
pTb->refCols = (SRefColInfo*)(pTb + 1);
refColsNum = 0;
tSimpleHashClear(pSrcTbls);
for (int32_t j = 0; j < mr.me.colRef.nCols; j++) {
@@ -673,14 +674,14 @@ int32_t vnodeReadVSubtables(SReadHandle* pHandle, int64_t suid, SArray** ppRes)
if (NULL == tSimpleHashGet(pSrcTbls, tbFName, strlen(tbFName))) {
QUERY_CHECK_CODE(tSimpleHashPut(pSrcTbls, tbFName, strlen(tbFName), &code, sizeof(code)), line, _return);
}
refColsNum++;
}
pTb->numOfSrcTbls = tSimpleHashGetSize(pSrcTbls);
QUERY_CHECK_NULL(taosArrayPush(*ppRes, &pTb), code, line, _return, terrno);
pTb = NULL;
pHandle->api.metaReaderFn.clearReader(&mr);
readerInit = false;
}
@@ -694,7 +695,7 @@ _return:
taosArrayDestroy(pList);
taosMemoryFree(pTb);
tSimpleHashCleanup(pSrcTbls);
if (code) {
qError("%s failed since %s", __func__, tstrerror(code));
}
@@ -1158,18 +1159,14 @@ int32_t vnodeGetTableSchema(void *pVnode, int64_t uid, STSchema **pSchema, int64
return tsdbGetTableSchema(((SVnode *)pVnode)->pMeta, uid, pSchema, suid);
}
int32_t vnodeGetDBSize(void *pVnode, SDbSizeStatisInfo *pInfo) {
static FORCE_INLINE int32_t vnodeGetDBPrimaryInfo(SVnode *pVnode, SDbSizeStatisInfo *pInfo) {
SVnode *pVnodeObj = pVnode;
if (pVnodeObj == NULL) {
return TSDB_CODE_VND_NOT_EXIST;
}
int32_t code = 0;
char path[TSDB_FILENAME_LEN] = {0};
char *dirName[] = {VNODE_TSDB_DIR, VNODE_WAL_DIR, VNODE_META_DIR, VNODE_TSDB_CACHE_DIR};
int64_t dirSize[4];
vnodeGetPrimaryDir(pVnodeObj->path, pVnodeObj->diskPrimary, pVnodeObj->pTfs, path, TSDB_FILENAME_LEN);
vnodeGetPrimaryDir(pVnode->path, pVnode->diskPrimary, pVnode->pTfs, path, TSDB_FILENAME_LEN);
int32_t offset = strlen(path);
for (int i = 0; i < sizeof(dirName) / sizeof(dirName[0]); i++) {
@@ -1183,13 +1180,24 @@ int32_t vnodeGetDBSize(void *pVnode, SDbSizeStatisInfo *pInfo) {
dirSize[i] = size;
}
pInfo->l1Size = dirSize[0] - dirSize[3];
pInfo->l1Size = 0;
pInfo->walSize = dirSize[1];
pInfo->metaSize = dirSize[2];
pInfo->cacheSize = dirSize[3];
return code;
}
int32_t vnodeGetDBSize(void *pVnode, SDbSizeStatisInfo *pInfo) {
int32_t code = 0;
int32_t lino = 0;
SVnode *pVnodeObj = pVnode;
if (pVnodeObj == NULL) {
return TSDB_CODE_VND_NOT_EXIST;
}
code = vnodeGetDBPrimaryInfo(pVnode, pInfo);
if (code != 0) goto _exit;
code = tsdbGetS3Size(pVnodeObj->pTsdb, &pInfo->s3Size);
code = tsdbGetFsSize(pVnodeObj->pTsdb, pInfo);
_exit:
return code;
}

View File

@@ -168,6 +168,9 @@ int32_t getQualifiedRowNumDesc(SExprSupp* pExprSup, SSDataBlock* pBlock, TSKEY*
int32_t buildAllResultKey(SStateStore* pStateStore, SStreamState* pState, TSKEY ts, SArray* pUpdated);
int32_t initOffsetInfo(int32_t** ppOffset, SSDataBlock* pRes);
TSKEY compareTs(void* pKey);
void clearGroupResArray(SGroupResInfo* pGroupResInfo);
void clearSessionGroupResInfo(SGroupResInfo* pGroupResInfo);
void destroyResultWinInfo(void* pRes);
int32_t addEventAggNotifyEvent(EStreamNotifyEventType eventType, const SSessionKey* pSessionKey,
const SSDataBlock* pInputBlock, const SNodeList* pCondCols, int32_t ri,

View File

@@ -1005,7 +1005,7 @@ int32_t qKillTask(qTaskInfo_t tinfo, int32_t rspCode, int64_t waitDuration) {
}
if (waitDuration > 0) {
qDebug("%s sync killed execTask, and waiting for %.2fs", GET_TASKID(pTaskInfo), waitDuration/1000.0);
qDebug("%s sync killed execTask, and waiting for at most %.2fs", GET_TASKID(pTaskInfo), waitDuration/1000.0);
} else {
qDebug("%s async killed execTask", GET_TASKID(pTaskInfo));
}
@@ -1033,6 +1033,11 @@ int32_t qKillTask(qTaskInfo_t tinfo, int32_t rspCode, int64_t waitDuration) {
}
}
int64_t et = taosGetTimestampMs() - st;
if (et < waitDuration) {
qInfo("%s waiting %.2fs for executor stopping", GET_TASKID(pTaskInfo), et / 1000.0);
return TSDB_CODE_SUCCESS;
}
return TSDB_CODE_SUCCESS;
}

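The block added above lets qKillTask return as soon as the executor stops within the wait budget. The bounded-wait pattern (poll until done or until the duration elapses) can be sketched as follows; the callback and simulated clock are illustrative stand-ins, not the real qTaskInfo polling:

```c
#include <stdbool.h>
#include <stdint.h>
#include <assert.h>

/* Poll `done` until it reports true or `wait_ms` elapses; `now_ms` supplies
 * the clock. Returns true if the task stopped within the budget. */
static bool wait_for_stop(bool (*done)(void *), void *arg,
                          int64_t wait_ms, int64_t (*now_ms)(void)) {
    int64_t st = now_ms();
    while (!done(arg)) {
        if (now_ms() - st >= wait_ms) {
            return false; /* budget exhausted, caller reports a timeout */
        }
        /* the real code sleeps briefly between polls; omitted here */
    }
    return true;
}

/* Simulated clock for demonstration: each call advances time by 100 ms,
 * and the simulated task counts as stopped once 300 ms have passed. */
static int64_t g_now = 0;
static int64_t fake_now(void) { g_now += 100; return g_now; }
static bool fake_done(void *arg) { (void)arg; return g_now >= 300; }
static void fake_reset(void) { g_now = 0; }
```

A generous budget (like the 10-minute wait in the later hunk) lets the call succeed; a budget shorter than the stop time makes it report failure.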
View File

@@ -58,8 +58,8 @@ void destroyStreamCountAggOperatorInfo(void* param) {
destroyStreamBasicInfo(&pInfo->basic);
cleanupExprSupp(&pInfo->scalarSupp);
clearGroupResInfo(&pInfo->groupResInfo);
clearSessionGroupResInfo(&pInfo->groupResInfo);
taosArrayDestroyP(pInfo->pUpdated, destroyFlusedPos);
taosArrayDestroyEx(pInfo->pUpdated, destroyResultWinInfo);
pInfo->pUpdated = NULL;
destroyStreamAggSupporter(&pInfo->streamAggSup);

View File

@@ -48,8 +48,9 @@ void destroyStreamEventOperatorInfo(void* param) {
}
destroyStreamBasicInfo(&pInfo->basic);
clearGroupResInfo(&pInfo->groupResInfo);
taosArrayDestroyP(pInfo->pUpdated, destroyFlusedPos);
clearSessionGroupResInfo(&pInfo->groupResInfo);
taosArrayDestroyEx(pInfo->pUpdated, destroyResultWinInfo);
pInfo->pUpdated = NULL;
destroyStreamAggSupporter(&pInfo->streamAggSup);

View File

@@ -132,6 +132,13 @@ void destroyStreamFillInfo(SStreamFillInfo* pFillInfo) {
taosMemoryFree(pFillInfo);
}
void clearGroupResArray(SGroupResInfo* pGroupResInfo) {
pGroupResInfo->freeItem = false;
taosArrayDestroy(pGroupResInfo->pRows);
pGroupResInfo->pRows = NULL;
pGroupResInfo->index = 0;
}
void destroyStreamFillOperatorInfo(void* param) {
SStreamFillOperatorInfo* pInfo = (SStreamFillOperatorInfo*)param;
destroyStreamFillInfo(pInfo->pFillInfo);
@@ -145,7 +152,7 @@ void destroyStreamFillOperatorInfo(void* param) {
taosArrayDestroy(pInfo->matchInfo.pList);
pInfo->matchInfo.pList = NULL;
taosArrayDestroy(pInfo->pUpdated);
clearGroupResInfo(&pInfo->groupResInfo);
clearGroupResArray(&pInfo->groupResInfo);
taosArrayDestroy(pInfo->pCloseTs);
if (pInfo->stateStore.streamFileStateDestroy != NULL) {

View File

@@ -166,7 +166,7 @@ void destroyStreamTimeSliceOperatorInfo(void* param) {
cleanupExprSupp(&pInfo->scalarSup);
taosArrayDestroy(pInfo->historyPoints);
taosArrayDestroyP(pInfo->pUpdated, destroyFlusedPos);
taosArrayDestroy(pInfo->pUpdated);
pInfo->pUpdated = NULL;
tSimpleHashCleanup(pInfo->pUpdatedMap);
@@ -174,7 +174,7 @@ void destroyStreamTimeSliceOperatorInfo(void* param) {
taosArrayDestroy(pInfo->pDelWins);
tSimpleHashCleanup(pInfo->pDeletedMap);
clearGroupResInfo(&pInfo->groupResInfo);
clearGroupResArray(&pInfo->groupResInfo);
taosArrayDestroy(pInfo->historyWins);

View File

@@ -470,6 +470,7 @@ void clearGroupResInfo(SGroupResInfo* pGroupResInfo) {
destroyFlusedPos(pPos);
}
}
pGroupResInfo->freeItem = false;
taosArrayDestroy(pGroupResInfo->pRows);
pGroupResInfo->pRows = NULL;
@@ -2182,6 +2183,27 @@ void destroyStreamAggSupporter(SStreamAggSupporter* pSup) {
taosMemoryFreeClear(pSup->pDummyCtx);
}
void destroyResultWinInfo(void* pRes) {
SResultWindowInfo* pWinRes = (SResultWindowInfo*)pRes;
destroyFlusedPos(pWinRes->pStatePos);
}
void clearSessionGroupResInfo(SGroupResInfo* pGroupResInfo) {
int32_t size = taosArrayGetSize(pGroupResInfo->pRows);
if (pGroupResInfo->index >= 0 && pGroupResInfo->index < size) {
for (int32_t i = pGroupResInfo->index; i < size; i++) {
SResultWindowInfo* pRes = (SResultWindowInfo*) taosArrayGet(pGroupResInfo->pRows, i);
destroyFlusedPos(pRes->pStatePos);
pRes->pStatePos = NULL;
}
}
pGroupResInfo->freeItem = false;
taosArrayDestroy(pGroupResInfo->pRows);
pGroupResInfo->pRows = NULL;
pGroupResInfo->index = 0;
}
void destroyStreamSessionAggOperatorInfo(void* param) {
if (param == NULL) {
return;
@@ -2196,8 +2218,8 @@ void destroyStreamSessionAggOperatorInfo(void* param) {
destroyStreamBasicInfo(&pInfo->basic);
cleanupExprSupp(&pInfo->scalarSupp);
clearGroupResInfo(&pInfo->groupResInfo);
clearSessionGroupResInfo(&pInfo->groupResInfo);
taosArrayDestroyP(pInfo->pUpdated, destroyFlusedPos);
taosArrayDestroyEx(pInfo->pUpdated, destroyResultWinInfo);
pInfo->pUpdated = NULL;
destroyStreamAggSupporter(&pInfo->streamAggSup);
@@ -4460,8 +4482,8 @@ void destroyStreamStateOperatorInfo(void* param) {
}
destroyStreamBasicInfo(&pInfo->basic);
clearGroupResInfo(&pInfo->groupResInfo);
clearSessionGroupResInfo(&pInfo->groupResInfo);
taosArrayDestroyP(pInfo->pUpdated, destroyFlusedPos);
taosArrayDestroyEx(pInfo->pUpdated, destroyResultWinInfo);
pInfo->pUpdated = NULL;
destroyStreamAggSupporter(&pInfo->streamAggSup);

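clearSessionGroupResInfo above frees only the entries that have not yet been consumed, i.e. those from pGroupResInfo->index to the end of the array, before dropping the array itself. A minimal standalone sketch of that partial-drain cleanup, with hypothetical types in place of the TDengine structures:

```c
#include <stdlib.h>
#include <assert.h>

/* A result array is consumed from `index` forward, so only entries in
 * [index, size) still own their payloads and must be freed on teardown. */
typedef struct { void *payload; } Row;
typedef struct { Row *rows; int size; int index; } GroupRes;

static int clear_group_res(GroupRes *g) {
    int freed = 0;
    if (g->index >= 0 && g->index < g->size) {
        for (int i = g->index; i < g->size; i++) {
            free(g->rows[i].payload);   /* payload of an unconsumed entry */
            g->rows[i].payload = NULL;
            freed++;
        }
    }
    free(g->rows);                      /* then drop the container itself */
    g->rows = NULL;
    g->size = 0;
    g->index = 0;
    return freed;                       /* number of payloads released */
}
```

Entries before `index` were already handed off downstream, which is why freeing them here (as a blanket destroy would) could double-free.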
View File

@@ -618,8 +618,8 @@ int32_t getVnodeSysTableTargetName(int32_t acctId, SNode* pWhere, SName* pName)
static int32_t userAuthToString(int32_t acctId, const char* pUser, const char* pDb, const char* pTable, AUTH_TYPE type,
char* pStr, bool isView) {
return snprintf(pStr, USER_AUTH_KEY_MAX_LEN, "%s*%d*%s*%s*%d*%d", pUser, acctId, pDb,
return snprintf(pStr, USER_AUTH_KEY_MAX_LEN, "`%s`*%d*`%s`*`%s`*%d*%d", pUser, acctId, pDb,
(NULL == pTable || '\0' == pTable[0]) ? "``" : pTable, type, isView);
(NULL == pTable || '\0' == pTable[0]) ? "" : pTable, type, isView);
}
static int32_t getIntegerFromAuthStr(const char* pStart, char** pNext) {
@@ -635,6 +635,30 @@ static int32_t getIntegerFromAuthStr(const char* pStart, char** pNext) {
return taosStr2Int32(buf, NULL, 10);
}
static int32_t getBackQuotedStringFromAuthStr(const char* pStart, char* pStr, uint32_t dstLen, char** pNext) {
const char* pBeginQuote = strchr(pStart, '`');
if (!pBeginQuote) {
qWarn("failed to get string from auth string, %s, should be quoted with `", pStart);
return TSDB_CODE_INVALID_PARA;
}
const char* pEndQuote = strchr(pBeginQuote + 1, '`');
if (!pEndQuote) {
qWarn("failed to get string from auth string, %s, should be quoted with `", pStart);
return TSDB_CODE_INVALID_PARA;
}
pStr[0] = '\0';
strncpy(pStr, pBeginQuote + 1, TMIN(dstLen, pEndQuote - pBeginQuote - 1));
char* pSeperator = strchr(pEndQuote + 1, '*');
if (!pSeperator) {
*pNext = NULL;
} else {
*pNext = ++pSeperator;
}
return 0;
}
static void getStringFromAuthStr(const char* pStart, char* pStr, uint32_t dstLen, char** pNext) {
char* p = strchr(pStart, '*');
if (NULL == p) {
@@ -649,19 +673,26 @@ static void getStringFromAuthStr(const char* pStart, char* pStr, uint32_t dstLen
}
}
static void stringToUserAuth(const char* pStr, int32_t len, SUserAuthInfo* pUserAuth) {
static int32_t stringToUserAuth(const char* pStr, int32_t len, SUserAuthInfo* pUserAuth) {
char* p = NULL;
getStringFromAuthStr(pStr, pUserAuth->user, TSDB_USER_LEN, &p);
int32_t code = getBackQuotedStringFromAuthStr(pStr, pUserAuth->user, TSDB_USER_LEN, &p);
pUserAuth->tbName.acctId = getIntegerFromAuthStr(p, &p);
if (code == TSDB_CODE_SUCCESS) {
getStringFromAuthStr(p, pUserAuth->tbName.dbname, TSDB_DB_NAME_LEN, &p);
pUserAuth->tbName.acctId = getIntegerFromAuthStr(p, &p);
getStringFromAuthStr(p, pUserAuth->tbName.tname, TSDB_TABLE_NAME_LEN, &p);
code = getBackQuotedStringFromAuthStr(p, pUserAuth->tbName.dbname, TSDB_DB_NAME_LEN, &p);
if (pUserAuth->tbName.tname[0]) {
pUserAuth->tbName.type = TSDB_TABLE_NAME_T;
} else {
pUserAuth->tbName.type = TSDB_DB_NAME_T;
}
pUserAuth->type = getIntegerFromAuthStr(p, &p);
if (code == TSDB_CODE_SUCCESS) {
pUserAuth->isView = getIntegerFromAuthStr(p, &p);
code = getBackQuotedStringFromAuthStr(p, pUserAuth->tbName.tname, TSDB_TABLE_NAME_LEN, &p);
}
if (code == TSDB_CODE_SUCCESS) {
if (pUserAuth->tbName.tname[0]) {
pUserAuth->tbName.type = TSDB_TABLE_NAME_T;
} else {
pUserAuth->tbName.type = TSDB_DB_NAME_T;
}
pUserAuth->type = getIntegerFromAuthStr(p, &p);
pUserAuth->isView = getIntegerFromAuthStr(p, &p);
}
return code;
}
static int32_t buildTableReq(SHashObj* pTablesHash, SArray** pTables) {
@@ -762,8 +793,9 @@ static int32_t buildUserAuthReq(SHashObj* pUserAuthHash, SArray** pUserAuth) {
char key[USER_AUTH_KEY_MAX_LEN] = {0};
strncpy(key, pKey, len);
SUserAuthInfo userAuth = {0};
stringToUserAuth(key, len, &userAuth);
int32_t code = stringToUserAuth(key, len, &userAuth);
if (NULL == taosArrayPush(*pUserAuth, &userAuth)) {
if (TSDB_CODE_SUCCESS != code) terrno = code;
if (code != 0 || NULL == taosArrayPush(*pUserAuth, &userAuth)) {
taosHashCancelIterate(pUserAuthHash, p);
taosArrayDestroy(*pUserAuth);
*pUserAuth = NULL;

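The new userAuthToString format wraps each name in backquotes so that names containing the `*` separator no longer corrupt the user*acctId*db*table*type*isView key, and getBackQuotedStringFromAuthStr undoes it. The extraction step can be sketched standalone; this is a simplified hypothetical helper, not the TDengine function:

```c
#include <string.h>
#include <assert.h>

/* Copy the first `...`-quoted field of `src` into `dst` (capacity `cap`,
 * always NUL-terminated) and set *next just past the following '*'
 * separator, or NULL if there is none. Returns 0 on success, -1 if the
 * field is not backquote-delimited. */
static int read_backquoted(const char *src, char *dst, size_t cap, const char **next) {
    const char *open = strchr(src, '`');
    if (!open) return -1;
    const char *close = strchr(open + 1, '`');
    if (!close) return -1;
    size_t len = (size_t)(close - open - 1);
    if (len >= cap) len = cap - 1;       /* truncate to the destination */
    memcpy(dst, open + 1, len);
    dst[len] = '\0';
    const char *sep = strchr(close + 1, '*');
    *next = sep ? sep + 1 : NULL;        /* position for the next field */
    return 0;
}
```

Because the field is delimited by backquotes rather than by `*`, a user or table name such as `ro*ot` round-trips intact.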
View File

@@ -1,20 +1,18 @@
MESSAGE(STATUS "build filter unit test")
IF(TD_DARWIN)
# GoogleTest requires at least C++11
SET(CMAKE_CXX_STANDARD 11)
AUX_SOURCE_DIRECTORY(${CMAKE_CURRENT_SOURCE_DIR} SOURCE_LIST)
ADD_EXECUTABLE(filterTest ${SOURCE_LIST})
TARGET_LINK_LIBRARIES(
filterTest
PUBLIC os util common gtest qcom function nodes scalar parser catalog transport
)
TARGET_INCLUDE_DIRECTORIES(
filterTest
PUBLIC "${TD_SOURCE_DIR}/include/libs/scalar/"
PRIVATE "${TD_SOURCE_DIR}/source/libs/scalar/inc"
)
ENDIF()

View File

@@ -1,25 +1,23 @@
MESSAGE(STATUS "build scalar unit test")
IF(NOT TD_DARWIN)
# GoogleTest requires at least C++11
SET(CMAKE_CXX_STANDARD 11)
AUX_SOURCE_DIRECTORY(${CMAKE_CURRENT_SOURCE_DIR} SOURCE_LIST)
ADD_EXECUTABLE(scalarTest ${SOURCE_LIST})
TARGET_LINK_LIBRARIES(
scalarTest
PUBLIC os util common gtest qcom function nodes scalar parser catalog transport
)
TARGET_INCLUDE_DIRECTORIES(
scalarTest
PUBLIC "${TD_SOURCE_DIR}/include/libs/scalar/"
PUBLIC "${TD_SOURCE_DIR}/source/libs/parser/inc"
PRIVATE "${TD_SOURCE_DIR}/source/libs/scalar/inc"
)
add_test(
NAME scalarTest
COMMAND scalarTest
)
ENDIF()

View File

@@ -45,7 +45,7 @@ typedef struct {
TdThreadMutex cfMutex;
SHashObj* cfInst;
int64_t defaultCfInit;
int64_t vgId;
} SBackendWrapper;
typedef struct {

View File

@@ -843,6 +843,8 @@ int32_t streamBackendInit(const char* streamPath, int64_t chkpId, int32_t vgId,
pHandle->cfInst = taosHashInit(64, taosGetDefaultHashFunction(TSDB_DATA_TYPE_BINARY), false, HASH_NO_LOCK);
TSDB_CHECK_NULL(pHandle->cfInst, code, lino, _EXIT, terrno);
pHandle->vgId = vgId;
rocksdb_env_t* env = rocksdb_create_default_env(); // rocksdb_envoptions_create();
int32_t nBGThread = tsNumOfSnodeStreamThreads <= 2 ? 1 : tsNumOfSnodeStreamThreads / 2;
@@ -914,6 +916,7 @@ _EXIT:
taosMemoryFree(backendPath);
return code;
}
void streamBackendCleanup(void* arg) {
SBackendWrapper* pHandle = (SBackendWrapper*)arg;
@@ -930,6 +933,7 @@ void streamBackendCleanup(void* arg) {
rocksdb_close(pHandle->db);
pHandle->db = NULL;
}
rocksdb_options_destroy(pHandle->dbOpt);
rocksdb_env_destroy(pHandle->env);
rocksdb_cache_destroy(pHandle->cache);
@@ -945,16 +949,16 @@ void streamBackendCleanup(void* arg) {
streamMutexDestroy(&pHandle->mutex);
streamMutexDestroy(&pHandle->cfMutex);
stDebug("destroy stream backend :%p", pHandle);
stDebug("vgId:%d destroy stream backend:%p", (int32_t) pHandle->vgId, pHandle);
taosMemoryFree(pHandle);
return;
}
void streamBackendHandleCleanup(void* arg) {
SBackendCfWrapper* wrapper = arg;
bool remove = wrapper->remove;
TAOS_UNUSED(taosThreadRwlockWrlock(&wrapper->rwLock));
stDebug("start to do-close backendwrapper %p, %s", wrapper, wrapper->idstr);
stDebug("start to do-close backendWrapper %p, %s", wrapper, wrapper->idstr);
if (wrapper->rocksdb == NULL) {
TAOS_UNUSED(taosThreadRwlockUnlock(&wrapper->rwLock));
return;
@@ -2613,11 +2617,14 @@ int32_t taskDbOpen(const char* path, const char* key, int64_t chkptId, int64_t*
void taskDbDestroy(void* pDb, bool flush) {
STaskDbWrapper* wrapper = pDb;
if (wrapper == NULL) return;
if (wrapper == NULL) {
return;
}
int64_t st = taosGetTimestampMs();
streamMetaRemoveDB(wrapper->pMeta, wrapper->idstr);
stDebug("succ to destroy stream backend:%p", wrapper);
stDebug("%s succ to destroy stream backend:%p", wrapper->idstr, wrapper);
int8_t nCf = tListLen(ginitDict);
if (flush && wrapper->removeAllFiles == 0) {
@@ -2674,25 +2681,26 @@ void taskDbDestroy(void* pDb, bool flush) {
rocksdb_comparator_destroy(compare);
rocksdb_block_based_options_destroy(tblOpt);
}
taosMemoryFree(wrapper->pCompares);
taosMemoryFree(wrapper->pCfOpts);
taosMemoryFree(wrapper->pCfParams);
streamMutexDestroy(&wrapper->mutex);
taskDbDestroyChkpOpt(wrapper);
taosMemoryFree(wrapper->idstr);
if (wrapper->removeAllFiles) {
char* err = NULL;
stInfo("drop task remove backend dat:%s", wrapper->path);
stInfo("drop task remove backend data:%s", wrapper->path);
taosRemoveDir(wrapper->path);
}
int64_t et = taosGetTimestampMs();
stDebug("%s destroy stream backend:%p completed, elapsed time:%.2fs", wrapper->idstr, wrapper, (et - st)/1000.0);
taosMemoryFree(wrapper->idstr);
taosMemoryFree(wrapper->path);
taosMemoryFree(wrapper);
return;
}
void taskDbDestroy2(void* pDb) { taskDbDestroy(pDb, true); }

View File

@@ -20,7 +20,7 @@
#define CHECK_NOT_RSP_DURATION 60 * 1000 // 60 sec
static void processDownstreamReadyRsp(SStreamTask* pTask);
static void processDownstreamReadyRsp(SStreamTask* pTask, bool lock);
static void rspMonitorFn(void* param, void* tmrId);
static void streamTaskInitTaskCheckInfo(STaskCheckInfo* pInfo, STaskOutputInfo* pOutputInfo, int64_t startTs);
static int32_t streamTaskStartCheckDownstream(STaskCheckInfo* pInfo, const char* id);
@@ -158,9 +158,11 @@ void streamTaskSendCheckMsg(SStreamTask* pTask) {
}
}
} else { // for sink task, set it ready directly.
// streamTaskSetConsenChkptIdRecv(pTask, 0, taosGetTimestampMs());
//
stDebug("s-task:%s (vgId:%d) set downstream ready, since no downstream", idstr, pTask->info.nodeId);
streamTaskStopMonitorCheckRsp(&pTask->taskCheckInfo, idstr);
processDownstreamReadyRsp(pTask);
processDownstreamReadyRsp(pTask, false);
}
if (code) {
@@ -233,7 +235,7 @@ int32_t streamTaskProcessCheckRsp(SStreamTask* pTask, const SStreamTaskCheckRsp*
}
if (left == 0) {
processDownstreamReadyRsp(pTask); // all downstream tasks are ready, set the complete check downstream flag
processDownstreamReadyRsp(pTask, true); // all downstream tasks are ready, set the complete check downstream flag
streamTaskStopMonitorCheckRsp(pInfo, id);
} else {
stDebug("s-task:%s (vgId:%d) recv check rsp from task:0x%x (vgId:%d) status:%d, total:%d not ready:%d", id,
@@ -259,7 +261,7 @@ int32_t streamTaskProcessCheckRsp(SStreamTask* pTask, const SStreamTaskCheckRsp*
code = streamTaskAddIntoNodeUpdateList(pTask, pRsp->downstreamNodeId);
}
streamMetaAddFailedTaskSelf(pTask, now);
streamMetaAddFailedTaskSelf(pTask, now, true);
} else { // TASK_DOWNSTREAM_NOT_READY, rsp-check monitor will retry in 300 ms
stDebug("s-task:%s (vgId:%d) recv check rsp from task:0x%x (vgId:%d) status:%d, total:%d not ready:%d", id,
pRsp->upstreamNodeId, pRsp->downstreamTaskId, pRsp->downstreamNodeId, pRsp->status, total, left);
@@ -354,7 +356,7 @@ void streamTaskCleanupCheckInfo(STaskCheckInfo* pInfo) {
}
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
void processDownstreamReadyRsp(SStreamTask* pTask) {
void processDownstreamReadyRsp(SStreamTask* pTask, bool lock) {
EStreamTaskEvent event = (pTask->info.fillHistory != STREAM_HISTORY_TASK) ? TASK_EVENT_INIT : TASK_EVENT_INIT_SCANHIST;
int32_t code = streamTaskOnHandleEventSuccess(pTask->status.pSM, event, NULL, NULL);
if (code) {
@@ -363,7 +365,12 @@ void processDownstreamReadyRsp(SStreamTask* pTask) {
int64_t checkTs = pTask->execInfo.checkTs;
int64_t readyTs = pTask->execInfo.readyTs;
code = streamMetaAddTaskLaunchResult(pTask->pMeta, pTask->id.streamId, pTask->id.taskId, checkTs, readyTs, true);
if (lock) {
code = streamMetaAddTaskLaunchResult(pTask->pMeta, pTask->id.streamId, pTask->id.taskId, checkTs, readyTs, true);
} else {
code = streamMetaAddTaskLaunchResultNoLock(pTask->pMeta, pTask->id.streamId, pTask->id.taskId, checkTs, readyTs, true);
}
if (code) {
stError("s-task:%s failed to record the downstream task status, code:%s", pTask->id.idStr, tstrerror(code));
}
@@ -388,7 +395,7 @@ void processDownstreamReadyRsp(SStreamTask* pTask) {
// todo: let's retry
if (HAS_RELATED_FILLHISTORY_TASK(pTask)) {
stDebug("s-task:%s try to launch related task", pTask->id.idStr);
code = streamLaunchFillHistoryTask(pTask);
code = streamLaunchFillHistoryTask(pTask, lock);
if (code) {
stError("s-task:%s failed to launch related task, code:%s", pTask->id.idStr, tstrerror(code));
}

View File

@@ -631,8 +631,9 @@ static int32_t doUpdateCheckpointInfoCheck(SStreamTask* pTask, bool restored, SV
code = streamMetaUnregisterTask(pMeta, pReq->hStreamId, pReq->hTaskId);
int32_t numOfTasks = streamMetaGetNumOfTasks(pMeta);
stDebug("s-task:%s vgId:%d related fill-history task:0x%x dropped in update checkpointInfo, remain tasks:%d",
id, vgId, pReq->taskId, numOfTasks);
stDebug("s-task:%s vgId:%d related fill-history task:0x%" PRIx64
" dropped in update checkpointInfo, remain tasks:%d",
id, vgId, pReq->hTaskId, numOfTasks);
// todo: task may not exist, commit anyway, optimize this later
code = streamMetaCommit(pMeta);
@@ -1646,18 +1647,27 @@ int32_t streamTaskSendNegotiateChkptIdMsg(SStreamTask* pTask) {
streamTaskSetReqConsenChkptId(pTask, taosGetTimestampMs());
streamMutexUnlock(&pTask->lock);
// 1. stop the executor at first
if (pTask->exec.pExecutor != NULL) {
// we need to make sure the underlying operator is stopped right, otherwise, SIGSEG may occur,
// waiting at most for 10min
if (pTask->info.taskLevel != TASK_LEVEL__SINK && pTask->exec.pExecutor != NULL) {
int32_t code = qKillTask(pTask->exec.pExecutor, TSDB_CODE_SUCCESS, 600000);
if (code != TSDB_CODE_SUCCESS) {
stError("s-task:%s failed to kill task related query handle, code:%s", pTask->id.idStr, tstrerror(code));
}
}
qDestroyTask(pTask->exec.pExecutor);
pTask->exec.pExecutor = NULL;
}
// 2. destroy backend after stop executor
if (pTask->pBackend != NULL) {
streamFreeTaskState(pTask, p);
pTask->pBackend = NULL;
}
streamMetaWLock(pTask->pMeta);
if (pTask->exec.pExecutor != NULL) {
qDestroyTask(pTask->exec.pExecutor);
pTask->exec.pExecutor = NULL;
}
streamMetaWUnLock(pTask->pMeta);
return 0;
}

View File

@@ -147,7 +147,7 @@ int32_t streamTaskBroadcastRetrieveReq(SStreamTask* pTask, SStreamRetrieveReq* r
static int32_t buildStreamRetrieveReq(SStreamTask* pTask, const SSDataBlock* pBlock, SStreamRetrieveReq* req) {
SRetrieveTableRsp* pRetrieve = NULL;
size_t dataEncodeSize = blockGetEncodeSize(pBlock);
int32_t len = sizeof(SRetrieveTableRsp) + dataEncodeSize + PAYLOAD_PREFIX_LEN;
pRetrieve = taosMemoryCalloc(1, len);
@@ -795,6 +795,9 @@ static int32_t doAddDispatchBlock(SStreamTask* pTask, SStreamDispatchReq* pReqs,
}
if (hashValue >= pVgInfo->hashBegin && hashValue <= pVgInfo->hashEnd) {
stDebug("s-task:%s dst table hashVal:0x%x assign to vgId:%d range[0x%x, 0x%x]", pTask->id.idStr, hashValue,
pVgInfo->vgId, pVgInfo->hashBegin, pVgInfo->hashEnd);
if ((code = streamAddBlockIntoDispatchMsg(pDataBlock, &pReqs[j])) < 0) {
stError("s-task:%s failed to add dispatch block, code:%s", pTask->id.idStr, tstrerror(terrno));
return code;
@@ -838,6 +841,8 @@ int32_t streamSearchAndAddBlock(SStreamTask* pTask, SStreamDispatchReq* pReqs, S
if (!pDataBlock->info.parTbName[0]) {
memset(pDataBlock->info.parTbName, 0, TSDB_TABLE_NAME_LEN);
memcpy(pDataBlock->info.parTbName, pBln->parTbName, strlen(pBln->parTbName));
stDebug("s-task:%s cached table name:%s, groupId:%" PRId64 " hashVal:0x%x", pTask->id.idStr, pBln->parTbName,
groupId, hashValue);
} }
} else { } else {
char ctbName[TSDB_TABLE_FNAME_LEN] = {0}; char ctbName[TSDB_TABLE_FNAME_LEN] = {0};
@@ -863,9 +868,9 @@ int32_t streamSearchAndAddBlock(SStreamTask* pTask, SStreamDispatchReq* pReqs, S
       }
     }

-    snprintf(ctbName, TSDB_TABLE_NAME_LEN, "%s.%s", pTask->outputInfo.shuffleDispatcher.dbInfo.db,
+    snprintf(ctbName, TSDB_TABLE_FNAME_LEN, "%s.%s", pTask->outputInfo.shuffleDispatcher.dbInfo.db,
              pDataBlock->info.parTbName);
-    /*uint32_t hashValue = MurmurHash3_32(ctbName, strlen(ctbName));*/
+
     SUseDbRsp* pDbInfo = &pTask->outputInfo.shuffleDispatcher.dbInfo;
     hashValue =
         taosGetTbHashVal(ctbName, strlen(ctbName), pDbInfo->hashMethod, pDbInfo->hashPrefix, pDbInfo->hashSuffix);
@@ -873,6 +878,8 @@ int32_t streamSearchAndAddBlock(SStreamTask* pTask, SStreamDispatchReq* pReqs, S
     bln.hashValue = hashValue;
     memcpy(bln.parTbName, pDataBlock->info.parTbName, strlen(pDataBlock->info.parTbName));
+    stDebug("s-task:%s dst table:%s hashVal:0x%x groupId:%"PRId64, pTask->id.idStr, ctbName, hashValue, groupId);
+
     // failed to put into name buffer, no need to do anything
     if (tSimpleHashGetSize(pTask->pNameMap) < MAX_BLOCK_NAME_NUM) {  // allow error, and do nothing
       code = tSimpleHashPut(pTask->pNameMap, &groupId, sizeof(int64_t), &bln, sizeof(SBlockName));
@@ -1104,8 +1111,8 @@ static int32_t doTaskChkptStatusCheck(SStreamTask* pTask, void* param, int32_t n
   }

   if (taosArrayGetSize(pTask->upstreamInfo.pList) != num) {
-    stWarn("s-task:%s vgId:%d upstream number:%d not equals sent readyMsg:%d, quit from readyMsg send tmr", id,
-           vgId, (int32_t)taosArrayGetSize(pTask->upstreamInfo.pList), num);
+    stWarn("s-task:%s vgId:%d upstream number:%d not equals sent readyMsg:%d, quit from readyMsg send tmr", id, vgId,
+           (int32_t)taosArrayGetSize(pTask->upstreamInfo.pList), num);
     return -1;
   }
@@ -1265,8 +1272,7 @@ static void chkptReadyMsgSendMonitorFn(void* param, void* tmrId) {
   // 1. check status in the first place
   if (state.state != TASK_STATUS__CK) {
     streamCleanBeforeQuitTmr(pTmrInfo, param);
-    stDebug("s-task:%s vgId:%d status:%s not in checkpoint, quit from monitor checkpoint-ready", id, vgId,
-            state.name);
+    stDebug("s-task:%s vgId:%d status:%s not in checkpoint, quit from monitor checkpoint-ready", id, vgId, state.name);
     streamMetaReleaseTask(pTask->pMeta, pTask);
     taosArrayDestroy(pNotRspList);
     return;
@@ -1395,7 +1401,7 @@ int32_t streamTaskSendCheckpointSourceRsp(SStreamTask* pTask) {
 }

 int32_t streamAddBlockIntoDispatchMsg(const SSDataBlock* pBlock, SStreamDispatchReq* pReq) {
   size_t  dataEncodeSize = blockGetEncodeSize(pBlock);
   int32_t dataStrLen = sizeof(SRetrieveTableRsp) + dataEncodeSize + PAYLOAD_PREFIX_LEN;
   void*   buf = taosMemoryCalloc(1, dataStrLen);
   if (buf == NULL) {
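The dispatch hunks above route each block to a destination vgroup by hashing the child-table name (via `taosGetTbHashVal`) and then finding the vgroup whose `[hashBegin, hashEnd]` range contains the hash. A minimal standalone sketch of that range lookup; the hash function here is an illustrative FNV-1a stand-in, not TDengine's actual algorithm, and the type names are hypothetical:

```c
#include <stdint.h>

/* Illustrative stand-in for taosGetTbHashVal; the real function also takes
 * the database's hash method, prefix, and suffix into account. */
static uint32_t toyTbHash(const char* name) {
  uint32_t h = 2166136261u; /* FNV-1a offset basis */
  for (; *name != '\0'; ++name) {
    h = (h ^ (uint8_t)*name) * 16777619u;
  }
  return h;
}

typedef struct {
  uint32_t hashBegin;
  uint32_t hashEnd;
  int32_t  vgId;
} SVgRangeSketch;

/* Mirror of the range test in doAddDispatchBlock: a hash value belongs to
 * the vgroup whose closed range [hashBegin, hashEnd] covers it. */
static int32_t pickVgId(const SVgRangeSketch* pRanges, int32_t num, uint32_t hashValue) {
  for (int32_t i = 0; i < num; ++i) {
    if (hashValue >= pRanges[i].hashBegin && hashValue <= pRanges[i].hashEnd) {
      return pRanges[i].vgId;
    }
  }
  return -1; /* no vgroup covers this hash; cannot happen if ranges partition the space */
}
```

With ranges that partition the full 32-bit space, every table name maps to exactly one vgroup, which is why the hash value cached per group id in `streamSearchAndAddBlock` stays valid across blocks.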

View File

@@ -195,6 +195,7 @@ int32_t streamMetaSendHbHelper(SStreamMeta* pMeta) {
   int32_t      numOfTasks = streamMetaGetNumOfTasks(pMeta);
   SMetaHbInfo* pInfo = pMeta->pHbInfo;
   int32_t      code = 0;
+  bool         setReqCheckpointId = false;

   // not recv the hb msg rsp yet, send current hb msg again
   if (pInfo->msgSendTs > 0) {
@@ -243,7 +244,7 @@ int32_t streamMetaSendHbHelper(SStreamMeta* pMeta) {
       continue;
     }

-    // todo: this lock may blocked by lock in streamMetaStartOneTask function, which may lock a very long time when
+    // todo: this lock may be blocked by lock in streamMetaStartOneTask function, which may lock a very long time when
     // trying to load remote checkpoint data
     streamMutexLock(&pTask->lock);
     STaskStatusEntry entry = streamTaskGetStatusEntry(pTask);
@@ -274,7 +275,8 @@ int32_t streamMetaSendHbHelper(SStreamMeta* pMeta) {
     streamMutexLock(&pTask->lock);
     entry.checkpointInfo.consensusChkptId = streamTaskCheckIfReqConsenChkptId(pTask, pMsg->ts);
     if (entry.checkpointInfo.consensusChkptId) {
-      entry.checkpointInfo.consensusTs = pMsg->ts;
+      entry.checkpointInfo.consensusTs = pTask->status.consenChkptInfo.statusTs;
+      setReqCheckpointId = true;
     }
     streamMutexUnlock(&pTask->lock);
@@ -294,6 +296,20 @@ int32_t streamMetaSendHbHelper(SStreamMeta* pMeta) {
     streamMetaReleaseTask(pMeta, pTask);
   }

+  if (setReqCheckpointId) {
+    if (pMeta->startInfo.curStage != START_MARK_REQ_CHKPID) {
+      stError("vgId:%d internal unknown error, current stage is:%d expected:%d", pMeta->vgId, pMeta->startInfo.curStage,
+              START_MARK_REQ_CHKPID);
+    }
+
+    pMeta->startInfo.curStage = START_WAIT_FOR_CHKPTID;
+    SStartTaskStageInfo info = {.stage = pMeta->startInfo.curStage, .ts = pMsg->ts};
+
+    taosArrayPush(pMeta->startInfo.pStagesList, &info);
+    stDebug("vgId:%d mark_req stage -> wait_for_chkptId stage, reqTs:%" PRId64 " , numOfStageHist:%d", pMeta->vgId,
+            info.ts, (int32_t)taosArrayGetSize(pMeta->startInfo.pStagesList));
+  }
+
   pMsg->numOfTasks = taosArrayGetSize(pMsg->pTaskStatus);

   if (hasMnodeEpset) {
@@ -317,7 +333,6 @@ void streamMetaHbToMnode(void* param, void* tmrId) {
   SStreamMeta* pMeta = taosAcquireRef(streamMetaRefPool, rid);
   if (pMeta == NULL) {
     stError("invalid meta rid:%" PRId64 " failed to acquired stream-meta", rid);
-    // taosMemoryFree(param);
     return;
   }
@@ -345,7 +360,6 @@ void streamMetaHbToMnode(void* param, void* tmrId) {
     } else {
       stError("vgId:%d role:%d not leader not send hb to mnode, failed to release meta rid:%" PRId64, vgId, role, rid);
     }
-    // taosMemoryFree(param);
     return;
   }
@@ -381,7 +395,10 @@ void streamMetaHbToMnode(void* param, void* tmrId) {
   }

   if (!send) {
-    stError("vgId:%d failed to send hmMsg to mnode, retry again in 5s, code:%s", pMeta->vgId, tstrerror(code));
+    stError("vgId:%d failed to send hbMsg to mnode due to acquire lock failure, retry again in 5s", pMeta->vgId);
+  }
+
+  if (code) {
+    stError("vgId:%d failed to send hbMsg to mnode, retry in 5, code:%s", pMeta->vgId, tstrerror(code));
   }

   streamTmrStart(streamMetaHbToMnode, META_HB_CHECK_INTERVAL, param, streamTimer, &pMeta->pHbInfo->hbTmr, pMeta->vgId,
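The new `setReqCheckpointId` path in the heartbeat helper records a stage transition (`START_MARK_REQ_CHKPID` to `START_WAIT_FOR_CHKPTID`) and appends it to a stage-history list. A simplified sketch of that bookkeeping follows; the enum names come from the patch but their numeric values are assumptions, and a fixed-size array stands in for the `SArray* pStagesList`:

```c
#include <stdint.h>

/* Enum names follow the patch; the numeric values are assumed here. */
enum { START_STAGE_INIT = 0, START_MARK_REQ_CHKPID = 1, START_WAIT_FOR_CHKPTID = 2 };

typedef struct {
  int32_t stage;
  int64_t ts;
} SStageInfoSketch;

typedef struct {
  int32_t          curStage;
  SStageInfoSketch hist[8]; /* stand-in for the SArray* pStagesList */
  int32_t          histLen;
} SStartInfoSketch;

/* Advance to the next stage only from the expected one, and append the new
 * stage to the history list. Note: this sketch treats a stage mismatch as a
 * hard error, while the patched code only logs it and proceeds. */
static int32_t advanceStage(SStartInfoSketch* pInfo, int32_t expected, int32_t next, int64_t ts) {
  if (pInfo->curStage != expected) {
    return -1; /* the real code logs an "internal unknown error" here */
  }
  pInfo->curStage = next;
  pInfo->hist[pInfo->histLen].stage = next;
  pInfo->hist[pInfo->histLen].ts = ts;
  pInfo->histLen += 1;
  return 0;
}
```

Keeping the per-transition timestamps in a history list is what lets the debug log report `numOfStageHist` alongside each stage change.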

View File

@@ -240,7 +240,7 @@ int32_t streamMetaCvtDbFormat(SStreamMeta* pMeta) {
     void* key = taosHashGetKey(pIter, NULL);
     code = streamStateCvtDataFormat(pMeta->path, key, *(void**)pIter);
     if (code != 0) {
-      stError("failed to cvt data");
+      stError("vgId:%d failed to cvt data", pMeta->vgId);
       goto _EXIT;
     }
@@ -513,6 +513,7 @@ _err:
   if (pMeta->startInfo.pFailedTaskSet) taosHashCleanup(pMeta->startInfo.pFailedTaskSet);
   if (pMeta->bkdChkptMgt) bkdMgtDestroy(pMeta->bkdChkptMgt);
+  if (pMeta->startInfo.pStagesList) taosArrayDestroy(pMeta->startInfo.pStagesList);

   taosMemoryFree(pMeta);
   stError("vgId:%d failed to open stream meta, at line:%d reason:%s", vgId, lino, tstrerror(code));
@@ -544,7 +545,9 @@ void streamMetaInitBackend(SStreamMeta* pMeta) {
 void streamMetaClear(SStreamMeta* pMeta) {
   // remove all existed tasks in this vnode
-  void*   pIter = NULL;
+  int64_t st = taosGetTimestampMs();
+  void*   pIter = NULL;
+
   while ((pIter = taosHashIterate(pMeta->pTasksMap, pIter)) != NULL) {
     int64_t refId = *(int64_t*)pIter;
     SStreamTask* p = taosAcquireRef(streamTaskRefPool, refId);
@@ -570,6 +573,9 @@ void streamMetaClear(SStreamMeta* pMeta) {
     }
   }

+  int64_t et = taosGetTimestampMs();
+  stDebug("vgId:%d clear task map, elapsed time:%.2fs", pMeta->vgId, (et - st) / 1000.0);
+
   if (pMeta->streamBackendRid != 0) {
     int32_t code = taosRemoveRef(streamBackendId, pMeta->streamBackendRid);
     if (code) {
@@ -577,6 +583,9 @@ void streamMetaClear(SStreamMeta* pMeta) {
     }
   }

+  int64_t et1 = taosGetTimestampMs();
+  stDebug("vgId:%d clear backend completed, elapsed time:%.2fs", pMeta->vgId, (et1 - et) / 1000.0);
+
   taosHashClear(pMeta->pTasksMap);
   taosArrayClear(pMeta->pTaskList);
@@ -589,6 +598,8 @@ void streamMetaClear(SStreamMeta* pMeta) {
   // the willrestart/starting flag can NOT be cleared
   taosHashClear(pMeta->startInfo.pReadyTaskSet);
   taosHashClear(pMeta->startInfo.pFailedTaskSet);
+  taosArrayClear(pMeta->startInfo.pStagesList);
+
   pMeta->startInfo.readyTs = 0;
 }
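The `_err:` hunk above extends the open-failure path so that the newly added `startInfo.pStagesList` is also destroyed. The underlying pattern is the usual C init-or-unwind idiom: free only the members that were actually allocated before jumping out. A minimal sketch with hypothetical names and a `failAt` knob so the error path can be exercised:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical cut-down meta object; only the members relevant to the
 * unwind pattern are kept. */
typedef struct {
  void* pReadyTaskSet;
  void* pFailedTaskSet;
  void* pStagesList;
} SMetaSketch;

/* failAt > 0 forces a failure right after the n-th allocation. */
static int32_t openMetaSketch(SMetaSketch** ppMeta, int32_t failAt) {
  SMetaSketch* pMeta = calloc(1, sizeof(SMetaSketch));
  if (pMeta == NULL) return -1;

  if ((pMeta->pReadyTaskSet = malloc(16)) == NULL || failAt == 1) goto _err;
  if ((pMeta->pFailedTaskSet = malloc(16)) == NULL || failAt == 2) goto _err;
  if ((pMeta->pStagesList = malloc(16)) == NULL || failAt == 3) goto _err;

  *ppMeta = pMeta;
  return 0;

_err:
  /* Guarded frees mirror the patched _err: block: each member is released
   * only if its allocation succeeded (calloc zeroed the pointers). */
  if (pMeta->pReadyTaskSet) free(pMeta->pReadyTaskSet);
  if (pMeta->pFailedTaskSet) free(pMeta->pFailedTaskSet);
  if (pMeta->pStagesList) free(pMeta->pStagesList);
  free(pMeta);
  return -1;
}
```

Because `calloc` zero-initializes every pointer, a member added later (like `pStagesList`) only needs its guarded free appended to the single `_err:` block.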

View File

@@ -25,9 +25,9 @@
 #define SCANHISTORY_IDLE_TICK ((SCANHISTORY_MAX_IDLE_TIME * 1000) / SCANHISTORY_IDLE_TIME_SLICE)

 typedef struct SLaunchHTaskInfo {
   int64_t metaRid;
   STaskId id;
   STaskId hTaskId;
 } SLaunchHTaskInfo;

 static int32_t streamSetParamForScanHistory(SStreamTask* pTask);
@@ -40,7 +40,7 @@ static void doExecScanhistoryInFuture(void* param, void* tmrId);
 static int32_t doStartScanHistoryTask(SStreamTask* pTask);
 static int32_t streamTaskStartScanHistory(SStreamTask* pTask);
 static void checkFillhistoryTaskStatus(SStreamTask* pTask, SStreamTask* pHTask);
-static int32_t launchNotBuiltFillHistoryTask(SStreamTask* pTask);
+static int32_t launchNotBuiltFillHistoryTask(SStreamTask* pTask, bool lock);
 static void doRetryLaunchFillHistoryTask(SStreamTask* pTask, SLaunchHTaskInfo* pInfo, int64_t now);
 static void notRetryLaunchFillHistoryTask(SStreamTask* pTask, SLaunchHTaskInfo* pInfo, int64_t now);
@@ -122,7 +122,7 @@ int32_t streamTaskStartScanHistory(SStreamTask* pTask) {
 int32_t streamTaskOnNormalTaskReady(SStreamTask* pTask) {
   const char* id = pTask->id.idStr;
   int32_t     code = 0;

   code = streamTaskSetReady(pTask);
   if (code) {
@@ -196,8 +196,8 @@ int32_t streamSetParamForStreamScannerStep2(SStreamTask* pTask, SVersionRange* p
   return qStreamSourceScanParamForHistoryScanStep2(pTask->exec.pExecutor, pVerRange, pWindow);
 }

-// A fill history task needs to be started.
-int32_t streamLaunchFillHistoryTask(SStreamTask* pTask) {
+// an fill history task needs to be started.
+int32_t streamLaunchFillHistoryTask(SStreamTask* pTask, bool lock) {
   SStreamMeta* pMeta = pTask->pMeta;
   STaskExecStatisInfo* pExecInfo = &pTask->execInfo;
   const char* idStr = pTask->id.idStr;
@@ -205,29 +205,44 @@
   int32_t hTaskId = pTask->hTaskInfo.id.taskId;
   int64_t now = taosGetTimestampMs();
   int32_t code = 0;
+  SStreamTask* pHisTask = NULL;

   // check stream task status in the first place.
-  SStreamTaskState pStatus = streamTaskGetStatus(pTask);
-  if (pStatus.state != TASK_STATUS__READY && pStatus.state != TASK_STATUS__HALT &&
-      pStatus.state != TASK_STATUS__PAUSE) {
+  SStreamTaskState status = streamTaskGetStatus(pTask);
+  if (status.state != TASK_STATUS__READY && status.state != TASK_STATUS__HALT && status.state != TASK_STATUS__PAUSE) {
     stDebug("s-task:%s not launch related fill-history task:0x%" PRIx64 "-0x%x, status:%s", idStr, hStreamId, hTaskId,
-            pStatus.name);
+            status.name);

-    return streamMetaAddTaskLaunchResult(pMeta, hStreamId, hTaskId, pExecInfo->checkTs, pExecInfo->readyTs, false);
+    if (lock) {
+      return streamMetaAddTaskLaunchResult(pMeta, hStreamId, hTaskId, pExecInfo->checkTs, pExecInfo->readyTs, false);
+    } else {
+      return streamMetaAddTaskLaunchResultNoLock(pMeta, hStreamId, hTaskId, pExecInfo->checkTs, pExecInfo->readyTs,
+                                                 false);
+    }
   }

   stDebug("s-task:%s start to launch related fill-history task:0x%" PRIx64 "-0x%x", idStr, hStreamId, hTaskId);

   // Set the execution conditions, including the query time window and the version range
-  streamMetaRLock(pMeta);
-  SStreamTask* pHisTask = NULL;
-  code = streamMetaAcquireTaskUnsafe(pMeta, &pTask->hTaskInfo.id, &pHisTask);
-  streamMetaRUnLock(pMeta);
-  if (code == 0) {  // it is already added into stream meta store.
+  if (lock) {
+    streamMetaRLock(pMeta);
+  }
+
+  code = streamMetaAcquireTaskUnsafe(pMeta, &pTask->hTaskInfo.id, &pHisTask);
+
+  if (lock) {
+    streamMetaRUnLock(pMeta);
+  }
+
+  if (code == 0) {  // it is already added into stream meta store.
     if (pHisTask->status.downstreamReady == 1) {  // it's ready now, do nothing
       stDebug("s-task:%s fill-history task is ready, no need to check downstream", pHisTask->id.idStr);
-      code = streamMetaAddTaskLaunchResult(pMeta, hStreamId, hTaskId, pExecInfo->checkTs, pExecInfo->readyTs, true);
+      if (lock) {
+        code = streamMetaAddTaskLaunchResult(pMeta, hStreamId, hTaskId, pExecInfo->checkTs, pExecInfo->readyTs, true);
+      } else {
+        code = streamMetaAddTaskLaunchResultNoLock(pMeta, hStreamId, hTaskId, pExecInfo->checkTs, pExecInfo->readyTs,
+                                                   true);
+      }
       if (code) {
         stError("s-task:%s failed to record start task status, code:%s", idStr, tstrerror(code));
       }
@@ -235,7 +250,7 @@ int32_t streamLaunchFillHistoryTask(SStreamTask* pTask) {
     if (pHisTask->pBackend == NULL) {
       code = pMeta->expandTaskFn(pHisTask);
       if (code != TSDB_CODE_SUCCESS) {
-        streamMetaAddFailedTaskSelf(pHisTask, now);
+        streamMetaAddFailedTaskSelf(pHisTask, now, lock);
         stError("s-task:%s failed to expand fill-history task, code:%s", pHisTask->id.idStr, tstrerror(code));
       }
     }
@@ -248,7 +263,7 @@ int32_t streamLaunchFillHistoryTask(SStreamTask* pTask) {
     streamMetaReleaseTask(pMeta, pHisTask);
     return code;
   } else {
-    return launchNotBuiltFillHistoryTask(pTask);
+    return launchNotBuiltFillHistoryTask(pTask, lock);
   }
 }
@@ -298,8 +313,8 @@ void notRetryLaunchFillHistoryTask(SStreamTask* pTask, SLaunchHTaskInfo* pInfo,
   if (code) {
     stError("s-task:%s failed to record the start task status, code:%s", pTask->id.idStr, tstrerror(code));
   } else {
-    stError("s-task:%s max retry:%d reached, quit from retrying launch related fill-history task:0x%x",
-            pTask->id.idStr, MAX_RETRY_LAUNCH_HISTORY_TASK, (int32_t)pHTaskInfo->id.taskId);
+    stError("s-task:%s max retry:%d reached, quit from retrying launch related fill-history task:0x%x", pTask->id.idStr,
+            MAX_RETRY_LAUNCH_HISTORY_TASK, (int32_t)pHTaskInfo->id.taskId);
   }

   pHTaskInfo->id.taskId = 0;
@@ -338,7 +353,7 @@ static void doCleanup(SStreamTask* pTask, int64_t metaRid, SLaunchHTaskInfo* pIn
   streamMetaReleaseTask(pMeta, pTask);
   int32_t ret = taosReleaseRef(streamMetaRefPool, metaRid);
   if (ret) {
-    stError("vgId:%d failed to release meta refId:%"PRId64, vgId, metaRid);
+    stError("vgId:%d failed to release meta refId:%" PRId64, vgId, metaRid);
   }

   if (pInfo != NULL) {
@@ -373,7 +388,7 @@ void tryLaunchHistoryTask(void* param, void* tmrId) {
     int32_t ret = taosReleaseRef(streamMetaRefPool, metaRid);
     if (ret) {
-      stError("vgId:%d failed to release meta refId:%"PRId64, vgId, metaRid);
+      stError("vgId:%d failed to release meta refId:%" PRId64, vgId, metaRid);
     }

     // already dropped, no need to set the failure info into the stream task meta.
@@ -426,7 +441,7 @@ void tryLaunchHistoryTask(void* param, void* tmrId) {
   if (pHTask->pBackend == NULL) {
     code = pMeta->expandTaskFn(pHTask);
     if (code != TSDB_CODE_SUCCESS) {
-      streamMetaAddFailedTaskSelf(pHTask, now);
+      streamMetaAddFailedTaskSelf(pHTask, now, true);
       stError("failed to expand fill-history task:%s, code:%s", pHTask->id.idStr, tstrerror(code));
     }
   }
@@ -461,13 +476,14 @@ int32_t createHTaskLaunchInfo(SStreamMeta* pMeta, STaskId* pTaskId, int64_t hStr
   return TSDB_CODE_SUCCESS;
 }

-int32_t launchNotBuiltFillHistoryTask(SStreamTask* pTask) {
+int32_t launchNotBuiltFillHistoryTask(SStreamTask* pTask, bool lock) {
   SStreamMeta* pMeta = pTask->pMeta;
   STaskExecStatisInfo* pExecInfo = &pTask->execInfo;
   const char* idStr = pTask->id.idStr;
   int64_t hStreamId = pTask->hTaskInfo.id.streamId;
   int32_t hTaskId = pTask->hTaskInfo.id.taskId;
   SLaunchHTaskInfo* pInfo = NULL;
+  int32_t ret = 0;

   stWarn("s-task:%s vgId:%d failed to launch history task:0x%x, since not built yet", idStr, pMeta->vgId, hTaskId);
@@ -475,10 +491,16 @@ int32_t launchNotBuiltFillHistoryTask(SStreamTask* pTask) {
   int32_t code = createHTaskLaunchInfo(pMeta, &id, hStreamId, hTaskId, &pInfo);
   if (code) {
     stError("s-task:%s failed to launch related fill-history task, since Out Of Memory", idStr);
-    int32_t ret = streamMetaAddTaskLaunchResult(pMeta, hStreamId, hTaskId, pExecInfo->checkTs, pExecInfo->readyTs, false);
+    if (lock) {
+      ret = streamMetaAddTaskLaunchResult(pMeta, hStreamId, hTaskId, pExecInfo->checkTs, pExecInfo->readyTs, false);
+    } else {
+      ret = streamMetaAddTaskLaunchResultNoLock(pMeta, hStreamId, hTaskId, pExecInfo->checkTs, pExecInfo->readyTs, false);
+    }
+
     if (ret) {
       stError("s-task:%s add task check downstream result failed, code:%s", idStr, tstrerror(ret));
     }

     return code;
   }
@@ -493,7 +515,13 @@ int32_t launchNotBuiltFillHistoryTask(SStreamTask* pTask) {
     stError("s-task:%s failed to start timer, related fill-history task not launched", idStr);
     taosMemoryFree(pInfo);

-    code = streamMetaAddTaskLaunchResult(pMeta, hStreamId, hTaskId, pExecInfo->checkTs, pExecInfo->readyTs, false);
+    if (lock) {
+      code = streamMetaAddTaskLaunchResult(pMeta, hStreamId, hTaskId, pExecInfo->checkTs, pExecInfo->readyTs, false);
+    } else {
+      code = streamMetaAddTaskLaunchResultNoLock(pMeta, hStreamId, hTaskId, pExecInfo->checkTs, pExecInfo->readyTs, false);
+    }
+
     if (code) {
       stError("s-task:0x%x failed to record the start task status, code:%s", hTaskId, tstrerror(code));
     }
@@ -518,8 +546,8 @@ int32_t streamTaskResetTimewindowFilter(SStreamTask* pTask) {
 bool streamHistoryTaskSetVerRangeStep2(SStreamTask* pTask, int64_t nextProcessVer) {
   SVersionRange* pRange = &pTask->dataRange.range;
   if (nextProcessVer < pRange->maxVer) {
-    stError("s-task:%s next processdVer:%"PRId64" is less than range max ver:%"PRId64, pTask->id.idStr, nextProcessVer,
-            pRange->maxVer);
+    stError("s-task:%s next processdVer:%" PRId64 " is less than range max ver:%" PRId64, pTask->id.idStr,
+            nextProcessVer, pRange->maxVer);
     return true;
   }
@@ -585,7 +613,7 @@ int32_t streamTaskSetRangeStreamCalc(SStreamTask* pTask) {
 }

 void doExecScanhistoryInFuture(void* param, void* tmrId) {
-  int64_t taskRefId = *(int64_t*) param;
+  int64_t taskRefId = *(int64_t*)param;
   SStreamTask* pTask = taosAcquireRef(streamTaskRefPool, taskRefId);
   if (pTask == NULL) {
@@ -610,8 +638,7 @@ void doExecScanhistoryInFuture(void* param, void* tmrId) {
       stError("s-task:%s async start history task failed", pTask->id.idStr);
     }

-    stDebug("s-task:%s fill-history:%d start scan-history data, out of tmr", pTask->id.idStr,
-            pTask->info.fillHistory);
+    stDebug("s-task:%s fill-history:%d start scan-history data, out of tmr", pTask->id.idStr, pTask->info.fillHistory);
   } else {
    int64_t* pTaskRefId = NULL;
    int32_t code = streamTaskAllocRefId(pTask, &pTaskRefId);
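A recurring change in this file is threading a `bool lock` parameter through `streamLaunchFillHistoryTask` and `launchNotBuiltFillHistoryTask`, so that a caller which already holds the meta lock (such as `streamMetaStartAllTasks` in the next file) can take the `...NoLock` result path instead of re-acquiring it and self-deadlocking. The shape of that pattern, reduced to a toy lock flag (all names here are illustrative, not TDengine APIs):

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
  bool    held;    /* toy stand-in for the meta rw-lock */
  int32_t results; /* toy stand-in for the launch-result map */
} SMetaLockSketch;

static void metaLock(SMetaLockSketch* pMeta)   { pMeta->held = true; }
static void metaUnlock(SMetaLockSketch* pMeta) { pMeta->held = false; }

/* "NoLock" variant: assumes the caller already holds the lock. */
static int32_t addResultNoLock(SMetaLockSketch* pMeta) {
  if (!pMeta->held) return -1; /* would be a data race in real code */
  pMeta->results += 1;
  return 0;
}

/* Entry point for top-level callers. When lock is false, the caller
 * promises it already holds the lock, so taking it again is skipped. */
static int32_t addResult(SMetaLockSketch* pMeta, bool lock) {
  if (lock) metaLock(pMeta);
  int32_t code = addResultNoLock(pMeta);
  if (lock) metaUnlock(pMeta);
  return code;
}
```

The alternative, a recursive lock, would hide the ownership contract; passing `lock` explicitly keeps each call site's locking assumption visible in the diff.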

View File

@@ -39,19 +39,18 @@ int32_t streamMetaStartAllTasks(SStreamMeta* pMeta) {
   int32_t vgId = pMeta->vgId;
   int64_t now = taosGetTimestampMs();
   SArray* pTaskList = NULL;
+  int32_t numOfConsensusChkptIdTasks = 0;
+  int32_t numOfTasks = 0;

-  int32_t numOfTasks = taosArrayGetSize(pMeta->pTaskList);
-  stInfo("vgId:%d start to consensus checkpointId for all %d task(s), start ts:%" PRId64, vgId, numOfTasks, now);
+  numOfTasks = taosArrayGetSize(pMeta->pTaskList);
   if (numOfTasks == 0) {
     stInfo("vgId:%d no tasks exist, quit from consensus checkpointId", pMeta->vgId);
+    streamMetaWLock(pMeta);
     streamMetaResetStartInfo(&pMeta->startInfo, vgId);
+    streamMetaWUnLock(pMeta);
     return TSDB_CODE_SUCCESS;
   }

+  stInfo("vgId:%d start to consensus checkpointId for all %d task(s), start ts:%" PRId64, vgId, numOfTasks, now);
   code = prepareBeforeStartTasks(pMeta, &pTaskList, now);
   if (code != TSDB_CODE_SUCCESS) {
     return TSDB_CODE_SUCCESS;  // ignore the error and return directly
@@ -65,10 +64,11 @@ int32_t streamMetaStartAllTasks(SStreamMeta* pMeta) {
   for (int32_t i = 0; i < numOfTasks; ++i) {
     SStreamTaskId* pTaskId = taosArrayGet(pTaskList, i);
     SStreamTask* pTask = NULL;
-    code = streamMetaAcquireTask(pMeta, pTaskId->streamId, pTaskId->taskId, &pTask);
+
+    code = streamMetaAcquireTaskNoLock(pMeta, pTaskId->streamId, pTaskId->taskId, &pTask);
     if ((pTask == NULL) || (code != 0)) {
       stError("vgId:%d failed to acquire task:0x%x during start task, it may be dropped", pMeta->vgId, pTaskId->taskId);
-      int32_t ret = streamMetaAddFailedTask(pMeta, pTaskId->streamId, pTaskId->taskId);
+      int32_t ret = streamMetaAddFailedTask(pMeta, pTaskId->streamId, pTaskId->taskId, false);
       if (ret) {
         stError("s-task:0x%x add check downstream failed, core:%s", pTaskId->taskId, tstrerror(ret));
       }
@@ -79,7 +79,7 @@ int32_t streamMetaStartAllTasks(SStreamMeta* pMeta) {
     code = pMeta->expandTaskFn(pTask);
     if (code != TSDB_CODE_SUCCESS) {
       stError("s-task:0x%x vgId:%d failed to expand stream backend", pTaskId->taskId, vgId);
-      streamMetaAddFailedTaskSelf(pTask, pTask->execInfo.readyTs);
+      streamMetaAddFailedTaskSelf(pTask, pTask->execInfo.readyTs, false);
     }
   }
@@ -91,10 +91,10 @@ int32_t streamMetaStartAllTasks(SStreamMeta* pMeta) {
     SStreamTaskId* pTaskId = taosArrayGet(pTaskList, i);
     SStreamTask* pTask = NULL;
-    code = streamMetaAcquireTask(pMeta, pTaskId->streamId, pTaskId->taskId, &pTask);
-    if ((pTask == NULL )|| (code != 0)) {
+    code = streamMetaAcquireTaskNoLock(pMeta, pTaskId->streamId, pTaskId->taskId, &pTask);
+    if ((pTask == NULL) || (code != 0)) {
       stError("vgId:%d failed to acquire task:0x%x during start tasks", pMeta->vgId, pTaskId->taskId);
-      int32_t ret = streamMetaAddFailedTask(pMeta, pTaskId->streamId, pTaskId->taskId);
+      int32_t ret = streamMetaAddFailedTask(pMeta, pTaskId->streamId, pTaskId->taskId, false);
       if (ret) {
         stError("s-task:0x%x failed add check downstream failed, core:%s", pTaskId->taskId, tstrerror(ret));
       }
@@ -116,14 +116,14 @@ int32_t streamMetaStartAllTasks(SStreamMeta* pMeta) {
       if (HAS_RELATED_FILLHISTORY_TASK(pTask)) {
         stDebug("s-task:%s downstream ready, no need to check downstream, check only related fill-history task",
                 pTask->id.idStr);
-        code = streamLaunchFillHistoryTask(pTask);  // todo: how about retry launch fill-history task?
+        code = streamLaunchFillHistoryTask(pTask, false);  // todo: how about retry launch fill-history task?
         if (code) {
           stError("s-task:%s failed to launch history task, code:%s", pTask->id.idStr, tstrerror(code));
         }
       }

-      code = streamMetaAddTaskLaunchResult(pMeta, pTaskId->streamId, pTaskId->taskId, pInfo->checkTs, pInfo->readyTs,
-                                           true);
+      code = streamMetaAddTaskLaunchResultNoLock(pMeta, pTaskId->streamId, pTaskId->taskId, pInfo->checkTs,
+                                                 pInfo->readyTs, true);
       streamMetaReleaseTask(pMeta, pTask);
       continue;
     }
@@ -136,7 +136,7 @@ int32_t streamMetaStartAllTasks(SStreamMeta* pMeta) {
     // do no added into result hashmap if it is failed due to concurrently starting of this stream task.
     if (code != TSDB_CODE_STREAM_CONFLICT_EVENT) {
-      streamMetaAddFailedTaskSelf(pTask, pInfo->readyTs);
+      streamMetaAddFailedTaskSelf(pTask, pInfo->readyTs, false);
     }
   }
@@ -146,11 +146,23 @@ int32_t streamMetaStartAllTasks(SStreamMeta* pMeta) {
     // negotiate the consensus checkpoint id for current task
     code = streamTaskSendNegotiateChkptIdMsg(pTask);
+    if (code == 0) {
+      numOfConsensusChkptIdTasks += 1;
+    }

     // this task may have no checkpoint, but others tasks may generate checkpoint already?
     streamMetaReleaseTask(pMeta, pTask);
   }

+  if (numOfConsensusChkptIdTasks > 0) {
+    pMeta->startInfo.curStage = START_MARK_REQ_CHKPID;
+    SStartTaskStageInfo info = {.stage = pMeta->startInfo.curStage, .ts = now};
+
+    taosArrayPush(pMeta->startInfo.pStagesList, &info);
+    stDebug("vgId:%d %d task(s) 0 stage -> mark_req stage, reqTs:%" PRId64 " numOfStageHist:%d", pMeta->vgId,
+            numOfConsensusChkptIdTasks, info.ts, (int32_t)taosArrayGetSize(pMeta->startInfo.pStagesList));
+  }
+
   // prepare the fill-history task before starting all stream tasks, to avoid fill-history tasks are started without
   // initialization, when the operation of check downstream tasks status is executed far quickly.
   stInfo("vgId:%d start all task(s) completed", pMeta->vgId);
@ -159,54 +171,76 @@ int32_t streamMetaStartAllTasks(SStreamMeta* pMeta) {
} }
int32_t prepareBeforeStartTasks(SStreamMeta* pMeta, SArray** pList, int64_t now) { int32_t prepareBeforeStartTasks(SStreamMeta* pMeta, SArray** pList, int64_t now) {
streamMetaWLock(pMeta); STaskStartInfo* pInfo = &pMeta->startInfo;
if (pMeta->closeFlag) { if (pMeta->closeFlag) {
streamMetaWUnLock(pMeta);
stError("vgId:%d vnode is closed, not start check task(s) downstream status", pMeta->vgId); stError("vgId:%d vnode is closed, not start check task(s) downstream status", pMeta->vgId);
return TSDB_CODE_FAILED; return TSDB_CODE_FAILED;
} }
*pList = taosArrayDup(pMeta->pTaskList, NULL); *pList = taosArrayDup(pMeta->pTaskList, NULL);
if (*pList == NULL) { if (*pList == NULL) {
stError("vgId:%d failed to dup tasklist, before restart tasks, code:%s", pMeta->vgId, tstrerror(terrno));
return terrno; return terrno;
} }
taosHashClear(pMeta->startInfo.pReadyTaskSet); taosHashClear(pInfo->pReadyTaskSet);
taosHashClear(pMeta->startInfo.pFailedTaskSet); taosHashClear(pInfo->pFailedTaskSet);
pMeta->startInfo.startTs = now; taosArrayClear(pInfo->pStagesList);
pInfo->curStage = 0;
pInfo->startTs = now;
int32_t code = streamMetaResetTaskStatus(pMeta); int32_t code = streamMetaResetTaskStatus(pMeta);
streamMetaWUnLock(pMeta);
return code; return code;
} }
void streamMetaResetStartInfo(STaskStartInfo* pStartInfo, int32_t vgId) { void streamMetaResetStartInfo(STaskStartInfo* pStartInfo, int32_t vgId) {
taosHashClear(pStartInfo->pReadyTaskSet); taosHashClear(pStartInfo->pReadyTaskSet);
taosHashClear(pStartInfo->pFailedTaskSet); taosHashClear(pStartInfo->pFailedTaskSet);
taosArrayClear(pStartInfo->pStagesList);
pStartInfo->tasksWillRestart = 0; pStartInfo->tasksWillRestart = 0;
pStartInfo->readyTs = 0; pStartInfo->readyTs = 0;
pStartInfo->elapsedTime = 0; pStartInfo->elapsedTime = 0;
pStartInfo->curStage = 0;
// reset the sentinel flag value to be 0 // reset the sentinel flag value to be 0
pStartInfo->startAllTasks = 0; pStartInfo->startAllTasks = 0;
stDebug("vgId:%d clear start-all-task info", vgId); stDebug("vgId:%d clear start-all-task info", vgId);
} }
int32_t streamMetaAddTaskLaunchResult(SStreamMeta* pMeta, int64_t streamId, int32_t taskId, int64_t startTs, static void streamMetaLogLaunchTasksInfo(SStreamMeta* pMeta, int32_t numOfTotal, int32_t taskId, bool ready) {
int64_t endTs, bool ready) { STaskStartInfo* pStartInfo = &pMeta->startInfo;
pStartInfo->readyTs = taosGetTimestampMs();
pStartInfo->elapsedTime = (pStartInfo->startTs != 0) ? pStartInfo->readyTs - pStartInfo->startTs : 0;
for (int32_t i = 0; i < taosArrayGetSize(pStartInfo->pStagesList); ++i) {
SStartTaskStageInfo* pStageInfo = taosArrayGet(pStartInfo->pStagesList, i);
stDebug("vgId:%d start task procedure, stage:%d, ts:%" PRId64, pMeta->vgId, pStageInfo->stage, pStageInfo->ts);
}
stDebug("vgId:%d all %d task(s) check downstream completed, last completed task:0x%x (succ:%d) startTs:%" PRId64
", readyTs:%" PRId64 " total elapsed time:%.2fs",
pMeta->vgId, numOfTotal, taskId, ready, pStartInfo->startTs, pStartInfo->readyTs,
pStartInfo->elapsedTime / 1000.0);
// print the initialization elapsed time and info
displayStatusInfo(pMeta, pStartInfo->pReadyTaskSet, true);
displayStatusInfo(pMeta, pStartInfo->pFailedTaskSet, false);
}
int32_t streamMetaAddTaskLaunchResultNoLock(SStreamMeta* pMeta, int64_t streamId, int32_t taskId,
int64_t startTs, int64_t endTs, bool ready) {
STaskStartInfo* pStartInfo = &pMeta->startInfo; STaskStartInfo* pStartInfo = &pMeta->startInfo;
STaskId id = {.streamId = streamId, .taskId = taskId}; STaskId id = {.streamId = streamId, .taskId = taskId};
int32_t vgId = pMeta->vgId; int32_t vgId = pMeta->vgId;
bool allRsp = true; bool allRsp = true;
SStreamTask* p = NULL; SStreamTask* p = NULL;
streamMetaWLock(pMeta);
int32_t code = streamMetaAcquireTaskUnsafe(pMeta, &id, &p); int32_t code = streamMetaAcquireTaskUnsafe(pMeta, &id, &p);
if (code != 0) { // task does not exist in current vnode, not record the complete info if (code != 0) { // task does not exist in current vnode, not record the complete info
stError("vgId:%d s-task:0x%x not exists discard the check downstream info", vgId, taskId); stError("vgId:%d s-task:0x%x not exists discard the check downstream info", vgId, taskId);
streamMetaWUnLock(pMeta);
return 0; return 0;
} }
@ -218,7 +252,6 @@ int32_t streamMetaAddTaskLaunchResult(SStreamMeta* pMeta, int64_t streamId, int3
"vgId:%d not in start all task(s) process, not record launch result status, s-task:0x%x launch succ:%d elapsed " "vgId:%d not in start all task(s) process, not record launch result status, s-task:0x%x launch succ:%d elapsed "
"time:%" PRId64 "ms", "time:%" PRId64 "ms",
vgId, taskId, ready, el); vgId, taskId, ready, el);
streamMetaWUnLock(pMeta);
return 0; return 0;
} }
@ -230,35 +263,24 @@ int32_t streamMetaAddTaskLaunchResult(SStreamMeta* pMeta, int64_t streamId, int3
stError("vgId:%d record start task result failed, s-task:0x%" PRIx64 stError("vgId:%d record start task result failed, s-task:0x%" PRIx64
" already exist start results in meta start task result hashmap", " already exist start results in meta start task result hashmap",
vgId, id.taskId); vgId, id.taskId);
code = 0;
} else { } else {
stError("vgId:%d failed to record start task:0x%" PRIx64 " results, start all tasks failed", vgId, id.taskId); stError("vgId:%d failed to record start task:0x%" PRIx64 " results, start all tasks failed, code:%s", vgId,
id.taskId, tstrerror(code));
} }
streamMetaWUnLock(pMeta);
return code;
} }
int32_t numOfTotal = streamMetaGetNumOfTasks(pMeta); int32_t numOfTotal = streamMetaGetNumOfTasks(pMeta);
int32_t numOfRecv = taosHashGetSize(pStartInfo->pReadyTaskSet) + taosHashGetSize(pStartInfo->pFailedTaskSet); int32_t numOfSucc = taosHashGetSize(pStartInfo->pReadyTaskSet);
int32_t numOfRecv = numOfSucc + taosHashGetSize(pStartInfo->pFailedTaskSet);
allRsp = allCheckDownstreamRsp(pMeta, pStartInfo, numOfTotal); allRsp = allCheckDownstreamRsp(pMeta, pStartInfo, numOfTotal);
if (allRsp) { if (allRsp) {
pStartInfo->readyTs = taosGetTimestampMs(); streamMetaLogLaunchTasksInfo(pMeta, numOfTotal, taskId, ready);
pStartInfo->elapsedTime = (pStartInfo->startTs != 0) ? pStartInfo->readyTs - pStartInfo->startTs : 0;
stDebug("vgId:%d all %d task(s) check downstream completed, last completed task:0x%x (succ:%d) startTs:%" PRId64
", readyTs:%" PRId64 " total elapsed time:%.2fs",
vgId, numOfTotal, taskId, ready, pStartInfo->startTs, pStartInfo->readyTs,
pStartInfo->elapsedTime / 1000.0);
// print the initialization elapsed time and info
displayStatusInfo(pMeta, pStartInfo->pReadyTaskSet, true);
displayStatusInfo(pMeta, pStartInfo->pFailedTaskSet, false);
streamMetaResetStartInfo(pStartInfo, vgId); streamMetaResetStartInfo(pStartInfo, vgId);
streamMetaWUnLock(pMeta);
code = pStartInfo->completeFn(pMeta); code = pStartInfo->completeFn(pMeta);
} else { } else {
streamMetaWUnLock(pMeta);
stDebug("vgId:%d recv check downstream results, s-task:0x%x succ:%d, received:%d, total:%d", vgId, taskId, ready, stDebug("vgId:%d recv check downstream results, s-task:0x%x succ:%d, received:%d, total:%d", vgId, taskId, ready,
numOfRecv, numOfTotal); numOfRecv, numOfTotal);
} }
@ -266,6 +288,17 @@ int32_t streamMetaAddTaskLaunchResult(SStreamMeta* pMeta, int64_t streamId, int3
return code; return code;
} }
int32_t streamMetaAddTaskLaunchResult(SStreamMeta* pMeta, int64_t streamId, int32_t taskId, int64_t startTs,
int64_t endTs, bool ready) {
int32_t code = 0;
streamMetaWLock(pMeta);
code = streamMetaAddTaskLaunchResultNoLock(pMeta, streamId, taskId, startTs, endTs, ready);
streamMetaWUnLock(pMeta);
return code;
}
// check all existed tasks are received rsp // check all existed tasks are received rsp
bool allCheckDownstreamRsp(SStreamMeta* pMeta, STaskStartInfo* pStartInfo, int32_t numOfTotal) { bool allCheckDownstreamRsp(SStreamMeta* pMeta, STaskStartInfo* pStartInfo, int32_t numOfTotal) {
for (int32_t i = 0; i < numOfTotal; ++i) { for (int32_t i = 0; i < numOfTotal; ++i) {
@ -279,6 +312,7 @@ bool allCheckDownstreamRsp(SStreamMeta* pMeta, STaskStartInfo* pStartInfo, int32
if (px == NULL) { if (px == NULL) {
px = taosHashGet(pStartInfo->pFailedTaskSet, &idx, sizeof(idx)); px = taosHashGet(pStartInfo->pFailedTaskSet, &idx, sizeof(idx));
if (px == NULL) { if (px == NULL) {
stDebug("vgId:%d s-task:0x%x start result not rsp yet", pMeta->vgId, (int32_t) idx.taskId);
return false; return false;
} }
} }
@ -292,7 +326,7 @@ void displayStatusInfo(SStreamMeta* pMeta, SHashObj* pTaskSet, bool succ) {
void* pIter = NULL; void* pIter = NULL;
size_t keyLen = 0; size_t keyLen = 0;
stInfo("vgId:%d %d tasks check-downstream completed, %s", vgId, taosHashGetSize(pTaskSet), stInfo("vgId:%d %d tasks complete check-downstream, %s", vgId, taosHashGetSize(pTaskSet),
succ ? "success" : "failed"); succ ? "success" : "failed");
while ((pIter = taosHashIterate(pTaskSet, pIter)) != NULL) { while ((pIter = taosHashIterate(pTaskSet, pIter)) != NULL) {
@ -323,12 +357,19 @@ int32_t streamMetaInitStartInfo(STaskStartInfo* pStartInfo) {
return terrno; return terrno;
} }
pStartInfo->pStagesList = taosArrayInit(4, sizeof(SStartTaskStageInfo));
if (pStartInfo->pStagesList == NULL) {
return terrno;
}
return 0; return 0;
} }
void streamMetaClearStartInfo(STaskStartInfo* pStartInfo) { void streamMetaClearStartInfo(STaskStartInfo* pStartInfo) {
taosHashCleanup(pStartInfo->pReadyTaskSet); taosHashCleanup(pStartInfo->pReadyTaskSet);
taosHashCleanup(pStartInfo->pFailedTaskSet); taosHashCleanup(pStartInfo->pFailedTaskSet);
taosArrayDestroy(pStartInfo->pStagesList);
pStartInfo->readyTs = 0; pStartInfo->readyTs = 0;
pStartInfo->elapsedTime = 0; pStartInfo->elapsedTime = 0;
pStartInfo->startTs = 0; pStartInfo->startTs = 0;
@ -348,7 +389,7 @@ int32_t streamMetaStartOneTask(SStreamMeta* pMeta, int64_t streamId, int32_t tas
code = streamMetaAcquireTask(pMeta, streamId, taskId, &pTask); code = streamMetaAcquireTask(pMeta, streamId, taskId, &pTask);
if ((pTask == NULL) || (code != 0)) { if ((pTask == NULL) || (code != 0)) {
stError("vgId:%d failed to acquire task:0x%x when starting task", vgId, taskId); stError("vgId:%d failed to acquire task:0x%x when starting task", vgId, taskId);
int32_t ret = streamMetaAddFailedTask(pMeta, streamId, taskId); int32_t ret = streamMetaAddFailedTask(pMeta, streamId, taskId, true);
if (ret) { if (ret) {
stError("s-task:0x%x add check downstream failed, core:%s", taskId, tstrerror(ret)); stError("s-task:0x%x add check downstream failed, core:%s", taskId, tstrerror(ret));
} }
@ -365,7 +406,7 @@ int32_t streamMetaStartOneTask(SStreamMeta* pMeta, int64_t streamId, int32_t tas
} }
// the start all tasks procedure may happen to start the newly deployed stream task, and results in the // the start all tasks procedure may happen to start the newly deployed stream task, and results in the
// concurrently start this task by two threads. // concurrent start this task by two threads.
streamMutexLock(&pTask->lock); streamMutexLock(&pTask->lock);
SStreamTaskState status = streamTaskGetStatus(pTask); SStreamTaskState status = streamTaskGetStatus(pTask);
@ -382,12 +423,14 @@ int32_t streamMetaStartOneTask(SStreamMeta* pMeta, int64_t streamId, int32_t tas
return TSDB_CODE_STREAM_TASK_IVLD_STATUS; return TSDB_CODE_STREAM_TASK_IVLD_STATUS;
} }
if(pTask->status.downstreamReady != 0) { if (pTask->status.downstreamReady != 0) {
stFatal("s-task:0x%x downstream should be not ready, but it ready here, internal error happens", taskId); stFatal("s-task:0x%x downstream should be not ready, but it ready here, internal error happens", taskId);
streamMetaReleaseTask(pMeta, pTask); streamMetaReleaseTask(pMeta, pTask);
return TSDB_CODE_STREAM_INTERNAL_ERROR; return TSDB_CODE_STREAM_INTERNAL_ERROR;
} }
streamMetaWLock(pMeta);
// avoid initialization and destroy running concurrently. // avoid initialization and destroy running concurrently.
streamMutexLock(&pTask->lock); streamMutexLock(&pTask->lock);
if (pTask->pBackend == NULL) { if (pTask->pBackend == NULL) {
@ -395,7 +438,7 @@ int32_t streamMetaStartOneTask(SStreamMeta* pMeta, int64_t streamId, int32_t tas
streamMutexUnlock(&pTask->lock); streamMutexUnlock(&pTask->lock);
if (code != TSDB_CODE_SUCCESS) { if (code != TSDB_CODE_SUCCESS) {
streamMetaAddFailedTaskSelf(pTask, pInfo->readyTs); streamMetaAddFailedTaskSelf(pTask, pInfo->readyTs, false);
} }
} else { } else {
streamMutexUnlock(&pTask->lock); streamMutexUnlock(&pTask->lock);
@ -410,12 +453,14 @@ int32_t streamMetaStartOneTask(SStreamMeta* pMeta, int64_t streamId, int32_t tas
// do no added into result hashmap if it is failed due to concurrently starting of this stream task. // do no added into result hashmap if it is failed due to concurrently starting of this stream task.
if (code != TSDB_CODE_STREAM_CONFLICT_EVENT) { if (code != TSDB_CODE_STREAM_CONFLICT_EVENT) {
streamMetaAddFailedTaskSelf(pTask, pInfo->readyTs); streamMetaAddFailedTaskSelf(pTask, pInfo->readyTs, false);
} }
} }
} }
streamMetaWUnLock(pMeta);
streamMetaReleaseTask(pMeta, pTask); streamMetaReleaseTask(pMeta, pTask);
return code; return code;
} }
@ -470,26 +515,21 @@ int32_t streamMetaStopAllTasks(SStreamMeta* pMeta) {
int32_t streamTaskCheckIfReqConsenChkptId(SStreamTask* pTask, int64_t ts) { int32_t streamTaskCheckIfReqConsenChkptId(SStreamTask* pTask, int64_t ts) {
SConsenChkptInfo* pConChkptInfo = &pTask->status.consenChkptInfo; SConsenChkptInfo* pConChkptInfo = &pTask->status.consenChkptInfo;
int32_t vgId = pTask->pMeta->vgId;
int32_t vgId = pTask->pMeta->vgId; if (pTask->pMeta->startInfo.curStage == START_MARK_REQ_CHKPID) {
if (pConChkptInfo->status == TASK_CONSEN_CHKPT_REQ) { if (pConChkptInfo->status == TASK_CONSEN_CHKPT_REQ) {
// mark the sending of req consensus checkpoint request. // mark the sending of req consensus checkpoint request.
pConChkptInfo->status = TASK_CONSEN_CHKPT_SEND; pConChkptInfo->status = TASK_CONSEN_CHKPT_SEND;
pConChkptInfo->statusTs = ts; stDebug("s-task:%s vgId:%d set requiring consensus-chkptId in hbMsg, ts:%" PRId64, pTask->id.idStr, vgId,
stDebug("s-task:%s vgId:%d set requiring consensus-chkptId in hbMsg, ts:%" PRId64, pTask->id.idStr, pConChkptInfo->statusTs);
vgId, pConChkptInfo->statusTs);
return 1;
} else {
int32_t el = (ts - pConChkptInfo->statusTs) / 1000;
// not recv consensus-checkpoint rsp for 60sec, send it again in hb to mnode
if ((pConChkptInfo->status == TASK_CONSEN_CHKPT_SEND) && el > 60) {
pConChkptInfo->statusTs = ts;
stWarn(
"s-task:%s vgId:%d not recv consensus-chkptId for %ds(more than 60s), set requiring in Hb again, ts:%" PRId64,
pTask->id.idStr, vgId, el, pConChkptInfo->statusTs);
return 1; return 1;
} else if (pConChkptInfo->status == 0) {
stDebug("vgId:%d s-task:%s not need to set the req checkpointId, current stage:%d", vgId, pTask->id.idStr,
pConChkptInfo->status);
} else {
stWarn("vgId:%d, s-task:%s restart procedure expired, start stage:%d", vgId, pTask->id.idStr,
pConChkptInfo->status);
} }
} }
@ -513,10 +553,11 @@ void streamTaskSetReqConsenChkptId(SStreamTask* pTask, int64_t ts) {
pInfo->statusTs = ts; pInfo->statusTs = ts;
pInfo->consenChkptTransId = 0; pInfo->consenChkptTransId = 0;
stDebug("s-task:%s set req consen-checkpointId flag, prev transId:%d, ts:%" PRId64, pTask->id.idStr, prevTrans, ts); stDebug("s-task:%s set req consen-checkpointId flag, prev transId:%d, ts:%" PRId64 ", task created ts:%" PRId64,
pTask->id.idStr, prevTrans, ts, pTask->execInfo.created);
} }
int32_t streamMetaAddFailedTask(SStreamMeta* pMeta, int64_t streamId, int32_t taskId) { int32_t streamMetaAddFailedTask(SStreamMeta* pMeta, int64_t streamId, int32_t taskId, bool lock) {
int32_t code = TSDB_CODE_SUCCESS; int32_t code = TSDB_CODE_SUCCESS;
int64_t now = taosGetTimestampMs(); int64_t now = taosGetTimestampMs();
int64_t startTs = 0; int64_t startTs = 0;
@ -527,7 +568,9 @@ int32_t streamMetaAddFailedTask(SStreamMeta* pMeta, int64_t streamId, int32_t ta
stDebug("vgId:%d add start failed task:0x%x", pMeta->vgId, taskId); stDebug("vgId:%d add start failed task:0x%x", pMeta->vgId, taskId);
streamMetaRLock(pMeta); if (lock) {
streamMetaRLock(pMeta);
}
code = streamMetaAcquireTaskUnsafe(pMeta, &id, &pTask); code = streamMetaAcquireTaskUnsafe(pMeta, &id, &pTask);
if (code == 0) { if (code == 0) {
@ -536,15 +579,26 @@ int32_t streamMetaAddFailedTask(SStreamMeta* pMeta, int64_t streamId, int32_t ta
hId = pTask->hTaskInfo.id; hId = pTask->hTaskInfo.id;
streamMetaReleaseTask(pMeta, pTask); streamMetaReleaseTask(pMeta, pTask);
streamMetaRUnLock(pMeta); if (lock) {
streamMetaRUnLock(pMeta);
}
// add the failed task info, along with the related fill-history task info into tasks list. // add the failed task info, along with the related fill-history task info into tasks list.
code = streamMetaAddTaskLaunchResult(pMeta, streamId, taskId, startTs, now, false); if (lock) {
if (hasFillhistoryTask) { code = streamMetaAddTaskLaunchResult(pMeta, streamId, taskId, startTs, now, false);
code = streamMetaAddTaskLaunchResult(pMeta, hId.streamId, hId.taskId, startTs, now, false); if (hasFillhistoryTask) {
code = streamMetaAddTaskLaunchResult(pMeta, hId.streamId, hId.taskId, startTs, now, false);
}
} else {
code = streamMetaAddTaskLaunchResultNoLock(pMeta, streamId, taskId, startTs, now, false);
if (hasFillhistoryTask) {
code = streamMetaAddTaskLaunchResultNoLock(pMeta, hId.streamId, hId.taskId, startTs, now, false);
}
} }
} else { } else {
streamMetaRUnLock(pMeta); if (lock) {
streamMetaRUnLock(pMeta);
}
stError("failed to locate the stream task:0x%" PRIx64 "-0x%x (vgId:%d), it may have been destroyed or stopped", stError("failed to locate the stream task:0x%" PRIx64 "-0x%x (vgId:%d), it may have been destroyed or stopped",
streamId, taskId, pMeta->vgId); streamId, taskId, pMeta->vgId);
@ -554,9 +608,17 @@ int32_t streamMetaAddFailedTask(SStreamMeta* pMeta, int64_t streamId, int32_t ta
return code; return code;
} }
void streamMetaAddFailedTaskSelf(SStreamTask* pTask, int64_t failedTs) { void streamMetaAddFailedTaskSelf(SStreamTask* pTask, int64_t failedTs, bool lock) {
int32_t startTs = pTask->execInfo.checkTs; int32_t startTs = pTask->execInfo.checkTs;
int32_t code = streamMetaAddTaskLaunchResult(pTask->pMeta, pTask->id.streamId, pTask->id.taskId, startTs, failedTs, false); int32_t code = 0;
if (lock) {
code = streamMetaAddTaskLaunchResult(pTask->pMeta, pTask->id.streamId, pTask->id.taskId, startTs, failedTs, false);
} else {
code = streamMetaAddTaskLaunchResultNoLock(pTask->pMeta, pTask->id.streamId, pTask->id.taskId, startTs, failedTs,
false);
}
if (code) { if (code) {
stError("s-task:%s failed to add self task failed to start, code:%s", pTask->id.idStr, tstrerror(code)); stError("s-task:%s failed to add self task failed to start, code:%s", pTask->id.idStr, tstrerror(code));
} }
@ -564,7 +626,13 @@ void streamMetaAddFailedTaskSelf(SStreamTask* pTask, int64_t failedTs) {
// automatically set the related fill-history task to be failed. // automatically set the related fill-history task to be failed.
if (HAS_RELATED_FILLHISTORY_TASK(pTask)) { if (HAS_RELATED_FILLHISTORY_TASK(pTask)) {
STaskId* pId = &pTask->hTaskInfo.id; STaskId* pId = &pTask->hTaskInfo.id;
code = streamMetaAddTaskLaunchResult(pTask->pMeta, pId->streamId, pId->taskId, startTs, failedTs, false);
if (lock) {
code = streamMetaAddTaskLaunchResult(pTask->pMeta, pId->streamId, pId->taskId, startTs, failedTs, false);
} else {
code = streamMetaAddTaskLaunchResultNoLock(pTask->pMeta, pId->streamId, pId->taskId, startTs, failedTs, false);
}
if (code) { if (code) {
stError("s-task:0x%" PRIx64 " failed to add self task failed to start, code:%s", pId->taskId, tstrerror(code)); stError("s-task:0x%" PRIx64 " failed to add self task failed to start, code:%s", pId->taskId, tstrerror(code));
} }

View File

@@ -152,11 +152,12 @@ _end:
   return NULL;
 }

 SStreamState* streamStateOpen(const char* path, void* pTask, int64_t streamId, int32_t taskId) {
   int32_t      code = TSDB_CODE_SUCCESS;
   int32_t      lino = 0;
+  SStreamTask* pStreamTask = pTask;

   SStreamState* pState = taosMemoryCalloc(1, sizeof(SStreamState));
-  stDebug("open stream state %p, %s", pState, path);
+  stDebug("s-task:%s open stream state %p, %s", pStreamTask->id.idStr, pState, path);

   TAOS_UNUSED(tsnprintf(pState->pTaskIdStr, sizeof(pState->pTaskIdStr), "TID:0x%x QID:0x%" PRIx64,
                         taskId, streamId));
@@ -173,7 +174,6 @@ SStreamState* streamStateOpen(const char* path, void* pTask, int64_t streamId, i
     QUERY_CHECK_CODE(code, lino, _end);
   }

-  SStreamTask* pStreamTask = pTask;
   pState->streamId = streamId;
   pState->taskId = taskId;
   TAOS_UNUSED(tsnprintf(pState->pTdbState->idstr, sizeof(pState->pTdbState->idstr), "0x%" PRIx64 "-0x%x",
@@ -189,8 +189,8 @@ SStreamState* streamStateOpen(const char* path, void* pTask, int64_t streamId, i
   pState->parNameMap = tSimpleHashInit(1024, hashFn);
   QUERY_CHECK_NULL(pState->parNameMap, code, lino, _end, terrno);

-  stInfo("open state %p on backend %p 0x%" PRIx64 "-%d succ", pState, pMeta->streamBackend, pState->streamId,
-         pState->taskId);
+  stInfo("s-task:%s open state %p on backend %p 0x%" PRIx64 "-%d succ", pStreamTask->id.idStr, pState,
+         pMeta->streamBackend, pState->streamId, pState->taskId);
   return pState;

 _end:

View File

@@ -15,21 +15,21 @@

 #include "streamInt.h"

-void streamMutexLock(TdThreadMutex *pMutex) {
+void streamMutexLock(TdThreadMutex* pMutex) {
   int32_t code = taosThreadMutexLock(pMutex);
   if (code) {
     stError("%p mutex lock failed, code:%s", pMutex, tstrerror(code));
   }
 }

-void streamMutexUnlock(TdThreadMutex *pMutex) {
+void streamMutexUnlock(TdThreadMutex* pMutex) {
   int32_t code = taosThreadMutexUnlock(pMutex);
   if (code) {
     stError("%p mutex unlock failed, code:%s", pMutex, tstrerror(code));
   }
 }

-void streamMutexDestroy(TdThreadMutex *pMutex) {
+void streamMutexDestroy(TdThreadMutex* pMutex) {
   int32_t code = taosThreadMutexDestroy(pMutex);
   if (code) {
     stError("%p mutex destroy, code:%s", pMutex, tstrerror(code));
@@ -37,7 +37,7 @@ void streamMutexDestroy(TdThreadMutex *pMutex) {
 }

 void streamMetaRLock(SStreamMeta* pMeta) {
   // stTrace("vgId:%d meta-rlock", pMeta->vgId);
   int32_t code = taosThreadRwlockRdlock(&pMeta->lock);
   if (code) {
     stError("vgId:%d meta-rlock failed, code:%s", pMeta->vgId, tstrerror(code));
@@ -45,7 +45,7 @@ void streamMetaRLock(SStreamMeta* pMeta) {
 }

 void streamMetaRUnLock(SStreamMeta* pMeta) {
   // stTrace("vgId:%d meta-runlock", pMeta->vgId);
   int32_t code = taosThreadRwlockUnlock(&pMeta->lock);
   if (code != TSDB_CODE_SUCCESS) {
     stError("vgId:%d meta-runlock failed, code:%s", pMeta->vgId, tstrerror(code));
@@ -57,14 +57,16 @@ void streamMetaRUnLock(SStreamMeta* pMeta) {
 int32_t streamMetaTryRlock(SStreamMeta* pMeta) {
   int32_t code = taosThreadRwlockTryRdlock(&pMeta->lock);
   if (code) {
-    stError("vgId:%d try meta-rlock failed, code:%s", pMeta->vgId, tstrerror(code));
+    if (code != TAOS_SYSTEM_ERROR(EBUSY)) {
+      stError("vgId:%d try meta-rlock failed, code:%s", pMeta->vgId, tstrerror(code));
+    }
   }

   return code;
 }

 void streamMetaWLock(SStreamMeta* pMeta) {
   // stTrace("vgId:%d meta-wlock", pMeta->vgId);
   int32_t code = taosThreadRwlockWrlock(&pMeta->lock);
   if (code) {
     stError("vgId:%d failed to apply wlock, code:%s", pMeta->vgId, tstrerror(code));
@@ -72,7 +74,7 @@ void streamMetaWLock(SStreamMeta* pMeta) {
 }

 void streamMetaWUnLock(SStreamMeta* pMeta) {
   // stTrace("vgId:%d meta-wunlock", pMeta->vgId);
   int32_t code = taosThreadRwlockUnlock(&pMeta->lock);
   if (code) {
     stError("vgId:%d failed to apply wunlock, code:%s", pMeta->vgId, tstrerror(code));
@@ -94,5 +96,5 @@ void streamSetFatalError(SStreamMeta* pMeta, int32_t code, const char* funcName,
 }

 int32_t streamGetFatalError(const SStreamMeta* pMeta) {
-  return atomic_load_32((volatile int32_t*) &pMeta->fatalInfo.code);
+  return atomic_load_32((volatile int32_t*)&pMeta->fatalInfo.code);
 }

View File

@@ -2421,6 +2421,7 @@ void syncNodeVoteForTerm(SSyncNode* pSyncNode, SyncTerm term, SRaftId* pRaftId)
     sError("vgId:%d, failed to vote for term, term:%" PRId64 ", storeTerm:%" PRId64, pSyncNode->vgId, term, storeTerm);
     return;
   }

+  sTrace("vgId:%d, begin hasVoted", pSyncNode->vgId);
   bool voted = raftStoreHasVoted(pSyncNode);
   if (voted) {
     sError("vgId:%d, failed to vote for term since not voted", pSyncNode->vgId);
@@ -3578,7 +3579,7 @@ int32_t syncNodeOnHeartbeat(SSyncNode* ths, const SRpcMsg* pRpcMsg) {
   SRpcMsg rpcMsg = {0};
   TAOS_CHECK_RETURN(syncBuildHeartbeatReply(&rpcMsg, ths->vgId));

-  SyncTerm currentTerm = raftStoreGetTerm(ths);
+  SyncTerm currentTerm = raftStoreTryGetTerm(ths);

   SyncHeartbeatReply* pMsgReply = rpcMsg.pCont;
   pMsgReply->destId = pMsg->srcId;
@@ -3588,6 +3589,15 @@ int32_t syncNodeOnHeartbeat(SSyncNode* ths, const SRpcMsg* pRpcMsg) {
   pMsgReply->startTime = ths->startTime;
   pMsgReply->timeStamp = tsMs;

+  // reply
+  TRACE_SET_MSGID(&(rpcMsg.info.traceId), tGenIdPI64());
+  trace = &(rpcMsg.info.traceId);
+  sGTrace("vgId:%d, send sync-heartbeat-reply to dnode:%d term:%" PRId64 " timestamp:%" PRId64, ths->vgId,
+          DID(&(pMsgReply->destId)), pMsgReply->term, pMsgReply->timeStamp);
+  TAOS_CHECK_RETURN(syncNodeSendMsgById(&pMsgReply->destId, ths, &rpcMsg));
+
+  if (currentTerm == 0) currentTerm = raftStoreGetTerm(ths);
+
   sGTrace("vgId:%d, process sync-heartbeat msg from dnode:%d, cluster:%d, Msgterm:%" PRId64 " currentTerm:%" PRId64,
           ths->vgId, DID(&(pMsg->srcId)), CID(&(pMsg->srcId)), pMsg->term, currentTerm);
@@ -3647,9 +3657,6 @@ int32_t syncNodeOnHeartbeat(SSyncNode* ths, const SRpcMsg* pRpcMsg) {
     }
   }

-  // reply
-  TAOS_CHECK_RETURN(syncNodeSendMsgById(&pMsgReply->destId, ths, &rpcMsg));
-
   if (resetElect) syncNodeResetElectTimer(ths);
   return 0;
 }

View File

@@ -108,6 +108,7 @@ int32_t syncNodeOnRequestVote(SSyncNode* ths, const SRpcMsg* pRpcMsg) {
   SyncTerm currentTerm = raftStoreGetTerm(ths);
   if (!(pMsg->term <= currentTerm)) return TSDB_CODE_SYN_INTERNAL_ERROR;

+  sTrace("vgId:%d, begin hasVoted", ths->vgId);
   bool grant = (pMsg->term == currentTerm) && logOK &&
                ((!raftStoreHasVoted(ths)) || (syncUtilSameId(&ths->raftStore.voteFor, &pMsg->srcId)));
   if (grant) {

View File

@@ -329,9 +329,9 @@ char *strbetween(char *string, char *begin, char *end) {
   return result;
 }

-int32_t tintToHex(uint64_t val, char hex[]) {
-  const char hexstr[16] = {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'};
+static const char hexstr[16] = {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'};

+int32_t tintToHex(uint64_t val, char hex[]) {
   int32_t j = 0, k = 0;
   if (val == 0) {
     hex[j++] = hexstr[0];
@@ -355,13 +355,12 @@ int32_t titoa(uint64_t val, size_t radix, char str[]) {
     return 0;
   }

-  const char *s = "0123456789abcdef";
-
   char     buf[65] = {0};
   int32_t  i = 0;
   uint64_t v = val;

   do {
-    buf[i++] = s[v % radix];
+    buf[i++] = hexstr[v % radix];
     v /= radix;
   } while (v > 0);
@@ -373,13 +372,12 @@ int32_t titoa(uint64_t val, size_t radix, char str[]) {
   return i;
 }

-int32_t taosByteArrayToHexStr(char bytes[], int32_t len, char hexstr[]) {
+int32_t taosByteArrayToHexStr(char bytes[], int32_t len, char str[]) {
   int32_t i;
-  char    hexval[16] = {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'};

   for (i = 0; i < len; i++) {
-    hexstr[i * 2] = hexval[((bytes[i] >> 4u) & 0xF)];
-    hexstr[(i * 2) + 1] = hexval[(bytes[i]) & 0x0F];
+    str[i * 2] = hexstr[((bytes[i] >> 4u) & 0xF)];
+    str[(i * 2) + 1] = hexstr[(bytes[i]) & 0x0F];
   }

   return 0;

View File

@@ -241,20 +241,23 @@ Please refer to the [Unit Test](#31-unit-test)、[System Test](#32-system-test)

 ### 3.7.1 How to run tests?

-TSBS test can be started locally by running command below. Ensure that your virtual machine supports the AVX instruction set:
+TSBS test can be started locally by running the command below. Ensure that your virtual machine supports the AVX instruction set.
+You need to use `sudo -s` to start a new shell session as the superuser (root) before beginning the test:

 ```bash
 cd /usr/local/src && \
 git clone https://github.com/taosdata/tsbs.git && \
 cd tsbs && \
-git checkout enh/chr-td-33357 && \
+git checkout enh/add-influxdb3.0 && \
 cd scripts/tsdbComp && \
-./testTsbs.sh
+./tsbs_test.sh -s scenario4
 ```

 > [!NOTE]
-> 1. TSBS test is written in Golang, in order to run the test smoothly, a Go proxy in China is set in above script by default. If this is not what you want, please unset it with command `sed -i '/GOPROXY/d' /usr/local/src/tsbs/scripts/tsdbComp/installTsbsCommand.sh` before starting the test.
-> 2. To check your current Go proxy setting, please run `go env | grep GOPROXY`.
+> 1. TSBS test is written in Golang. If you are unable to connect to the [international Go proxy](https://proxy.golang.org), the script automatically switches to the [China Go proxy](https://goproxy.cn).
+> 2. To remove the China Go proxy setting, run `go env -u GOPROXY` in your environment.
+> 3. To check your current Go proxy setting, run `go env | grep GOPROXY`.

 ### 3.7.2 How to start client and server on different hosts?
@@ -277,4 +280,9 @@ serverPass="taosdata123" # server root password

 ### 3.7.3 Check test results

-When the test is done, the result can be found in `/data2/` directory, which can also be configured in `test.ini`.
+When the test is done, the results can be found in the `${installPath}/tsbs/scripts/tsdbComp/log/` directory, where `${installPath}` can be configured in `test.ini`.
+
+### 3.7.4 Test more scenarios
+
+Use `./tsbs_test.sh -h` to list more test scenarios.


@@ -579,6 +579,8 @@
 ,,y,system-test,./pytest.sh python3 ./test.py -f 0-others/view/non_marterial_view/test_view.py
 ,,y,system-test,./pytest.sh python3 ./test.py -f 0-others/test_show_table_distributed.py
 ,,y,system-test,./pytest.sh python3 ./test.py -f 0-others/test_show_disk_usage.py
+,,y,system-test,./pytest.sh python3 ./test.py -f 0-others/show_disk_usage_multilevel.py
 ,,n,system-test,python3 ./test.py -f 0-others/compatibility.py
 ,,n,system-test,python3 ./test.py -f 0-others/tag_index_basic.py
 ,,n,system-test,python3 ./test.py -f 0-others/udfpy_main.py


@@ -0,0 +1,220 @@
###################################################################
# Copyright (c) 2016 by TAOS Technologies, Inc.
# All rights reserved.
#
# This file is proprietary and confidential to TAOS Technologies.
# No part of this file may be reproduced, stored, transmitted,
# disclosed or used in any form or by any means other than as
# expressly provided by the written permission from Jianhui Tao
#
###################################################################
# -*- coding: utf-8 -*-
import os
import time
from util.log import *
from util.cases import *
from util.sql import *
from util.common import *
from util.sqlset import *
import subprocess
from datetime import datetime, timedelta
def get_disk_usage(path):
try:
result = subprocess.run(['du', '-sb', path], stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
if result.returncode == 0:
# The output is in the format "size\tpath"
size = int(result.stdout.split()[0])
return size
else:
print(f"Error: {result.stderr}")
return None
except Exception as e:
print(f"Exception occurred: {e}")
return None
class TDTestCase:
def _prepare_env1(self):
tdLog.info("============== prepare environment 1 ===============")
level_0_path = f'{self.dnode_path}/data00'
cfg = {
level_0_path: 'dataDir',
}
tdSql.createDir(level_0_path)
tdDnodes.stop(1)
tdDnodes.deploy(1, cfg)
tdDnodes.start(1)
def _prepare_env2(self):
tdLog.info("============== prepare environment 2 ===============")
level_0_path = f'{self.dnode_path}/data00'
level_1_path = f'{self.dnode_path}/data01'
cfg = {
f'{level_0_path}': 'dataDir',
f'{level_1_path} 1 0': 'dataDir',
}
tdSql.createDir(level_1_path)
tdDnodes.stop(1)
tdDnodes.deploy(1, cfg)
tdDnodes.start(1)
def _write_bulk_data(self):
tdLog.info("============== write bulk data ===============")
json_content = f"""
{{
"filetype": "insert",
"cfgdir": "{self.cfg_path}",
"host": "localhost",
"port": 6030,
"user": "root",
"password": "taosdata",
"connection_pool_size": 8,
"thread_count": 16,
"create_table_thread_count": 10,
"result_file": "./insert_res.txt",
"confirm_parameter_prompt": "no",
"insert_interval": 0,
"interlace_rows": 5,
"num_of_records_per_req": 1540,
"prepared_rand": 10000,
"chinese": "no",
"databases": [
{{
"dbinfo": {{
"name": "{self.db_name}",
"drop": "yes",
"vgroups": {self.vgroups},
"duration": "1d",
"keep": "3d,6d",
"wal_retention_period": 0,
"stt_trigger": 1
}},
"super_tables": [
{{
"name": "stb",
"child_table_exists": "no",
"childtable_count": 1000,
"childtable_prefix": "ctb",
"escape_character": "yes",
"auto_create_table": "no",
"batch_create_tbl_num": 500,
"data_source": "rand",
"insert_mode": "taosc",
"non_stop_mode": "no",
"line_protocol": "line",
"insert_rows": 10000,
"childtable_limit": 10,
"childtable_offset": 100,
"interlace_rows": 0,
"insert_interval": 0,
"partial_col_num": 0,
"disorder_ratio": 0,
"disorder_range": 1000,
"timestamp_step": 40000,
"start_timestamp": "{(datetime.now() - timedelta(days=5)).strftime('%Y-%m-%d %H:%M:%S')}",
"use_sample_ts": "no",
"tags_file": "",
"columns": [
{{
"type": "bigint",
"count": 10
}}
],
"tags": [
{{
"type": "TINYINT",
"name": "groupid",
"max": 10,
"min": 1
}},
{{
"name": "location",
"type": "BINARY",
"len": 16,
"values": [
"beijing",
"shanghai"
]
}}
]
}}
]
}}
]
}}
"""
json_file = '/tmp/test.json'
with open(json_file, 'w') as f:
f.write(json_content)
# Use subprocess.run() to wait for the command to finish
subprocess.run(f'taosBenchmark -f {json_file}', shell=True, check=True)
def _check_retention(self):
for vgid in range(2, 2+self.vgroups):
tsdb_path = self.dnode_path+f'/data01/vnode/vnode{vgid}/tsdb'
# check the path should not be empty
if not os.listdir(tsdb_path):
tdLog.error(f'{tsdb_path} is empty')
assert False
    def _calculate_disk_usage(self, path):
        size = 0
        for vgid in range(2, 2 + self.vgroups):
            tsdb_path = self.dnode_path + f'/{path}/vnode/vnode{vgid}/tsdb'
            usage = get_disk_usage(tsdb_path)
            if usage is None:
                tdLog.exit(f'failed to get disk usage of {tsdb_path}')
            size += usage
        return int(size / 1024)
    def _value_check(self, size1, size2, threshold=1000):
        if abs(size1 - size2) < threshold:
            tdLog.info(f"checkEqual success, base_value={size1}, check_value={size2}")
        else:
            tdLog.exit(f"checkEqual error, base_value={size1}, check_value={size2}")
def run(self):
self._prepare_env1()
self._write_bulk_data()
tdSql.execute(f'flush database {self.db_name}')
tdDnodes.stop(1)
self._prepare_env2()
tdSql.execute(f'trim database {self.db_name}')
time.sleep(10)
self._check_retention()
size1 = self._calculate_disk_usage('data00')
size2 = self._calculate_disk_usage('data01')
tdSql.query(f'select sum(data1), sum(data2) from information_schema.ins_disk_usage where db_name="{self.db_name}"')
data1 = int(tdSql.queryResult[0][0])
data2 = int(tdSql.queryResult[0][1])
self._value_check(size1, data1)
self._value_check(size2, data2)
def init(self, conn, logSql, replicaVar=1):
tdLog.debug("start to execute %s" % __file__)
tdSql.init(conn.cursor())
self.dnode_path = tdCom.getTaosdPath()
self.cfg_path = f'{self.dnode_path}/cfg'
self.log_path = f'{self.dnode_path}/log'
self.db_name = 'test'
self.vgroups = 10
def stop(self):
tdSql.close()
tdLog.success("%s successfully executed" % __file__)
tdCases.addWindows(__file__, TDTestCase())
tdCases.addLinux(__file__, TDTestCase())


@@ -299,7 +299,16 @@ class TDTestCase:
         tdSql.query(f'select * from information_schema.ins_streams where stream_name = "{stream_name}"')
         tdSql.checkEqual(tdSql.queryResult[0][4],f'create stream {stream_name} trigger at_once ignore expired 0 into stb1 as select * from tb')
         tdSql.execute(f'drop database {self.dbname}')

+    def test_table_name_with_star(self):
+        dbname = "test_tbname_with_star"
+        tbname = 's_*cszl01_207602da'
+        tdSql.execute(f'create database {dbname} replica 1 wal_retention_period 3600')
+        tdSql.execute(f'create table {dbname}.`{tbname}` (ts timestamp, c1 int)', queryTimes=1, show=1)
+        tdSql.execute(f"drop table {dbname}.`{tbname}`")
+        tdSql.execute(f"drop database {dbname}")
+
     def run(self):
+        self.test_table_name_with_star()
         self.drop_ntb_check()
         self.drop_stb_ctb_check()
         self.drop_stable_with_check()


@@ -17,7 +17,7 @@ from util.sql import *
 from util.common import *

 class TDTestCase:
-    updatecfgDict = {'ttlUnit':5,'ttlPushInterval':3}
+    updatecfgDict = {'ttlUnit':5,'ttlPushInterval':3, 'mdebugflag':143}

     def init(self, conn, logSql, replicaVar=1):
         self.replicaVar = int(replicaVar)
         tdLog.debug("start to execute %s" % __file__)