diff --git a/deps/win/x64/dm_static/dmodule.lib b/deps/win/x64/dm_static/dmodule.lib index 36ce1d81d3..55afb81e3f 100644 Binary files a/deps/win/x64/dm_static/dmodule.lib and b/deps/win/x64/dm_static/dmodule.lib differ diff --git a/docs/zh/01-index.md b/docs/zh/01-index.md index 130ea3b204..6ba8f021a7 100644 --- a/docs/zh/01-index.md +++ b/docs/zh/01-index.md @@ -4,9 +4,9 @@ sidebar_label: 文档首页 slug: / --- -TDengine 是一款[开源](https://www.taosdata.com/tdengine/open_source_time-series_database)、[高性能](https://www.taosdata.com/fast)、[云原生](https://www.taosdata.com/tdengine/cloud_native_time-series_database)的时序数据库Time Series Database, TSDB), 它专为物联网、车联网、工业互联网、金融、IT 运维等场景优化设计。同时它还带有内建的缓存、流式计算、数据订阅等系统功能,能大幅减少系统设计的复杂度,降低研发和运营成本,是一款极简的时序数据处理平台。本文档是 TDengine 的用户手册,主要是介绍 TDengine 的基本概念、安装、使用、功能、开发接口、运营维护、TDengine 内核设计等等,它主要是面向架构师、开发工程师与系统管理员的。 +TDengine 是一款[开源](https://www.taosdata.com/tdengine/open_source_time-series_database)、[高性能](https://www.taosdata.com/fast)、[云原生](https://www.taosdata.com/tdengine/cloud_native_time-series_database)的时序数据库(Time Series Database, TSDB), 它专为物联网、车联网、工业互联网、金融、IT 运维等场景优化设计。同时它还带有内建的缓存、流式计算、数据订阅等系统功能,能大幅减少系统设计的复杂度,降低研发和运营成本,是一款极简的时序数据处理平台。本文档是 TDengine 的用户手册,主要是介绍 TDengine 的基本概念、安装、使用、功能、开发接口、运营维护、TDengine 内核设计等等,它主要是面向架构师、开发工程师与系统管理员的。如果你对时序数据的基本概念、价值以及其所能带来的业务价值尚不了解,请参考[时序数据基础](./concept)。 -TDengine 充分利用了时序数据的特点,提出了“一个数据采集点一张表”与“超级表”的概念,设计了创新的存储引擎,让数据的写入、查询和存储效率都得到极大的提升。为正确理解并使用 TDengine,无论如何,请您仔细阅读[基本概念](./concept)一章。 +TDengine 充分利用了时序数据的特点,提出了“一个数据采集点一张表”与“超级表”的概念,设计了创新的存储引擎,让数据的写入、查询和存储效率都得到极大的提升。为正确理解并使用 TDengine,无论如何,请您仔细阅读[快速入门](./basic)一章。 如果你是开发工程师,请一定仔细阅读[开发指南](./develop)一章,该部分对数据库连接、建模、插入数据、查询、流式计算、缓存、数据订阅、用户自定义函数等功能都做了详细介绍,并配有各种编程语言的示例代码。大部分情况下,你只要复制粘贴示例代码,针对自己的应用稍作改动,就能跑起来。 diff --git a/docs/zh/02-concept.md b/docs/zh/02-concept.md index a1e45d8a0a..d793d11c36 100644 --- a/docs/zh/02-concept.md +++ b/docs/zh/02-concept.md @@ -63,7 +63,7 @@ toc_max_heading_level: 4 1. 
数据库(Database):数据库提供时序数据的高效存储和读取能力。在工业、物联网场景,由设备所产生的时序数据量是十分惊人的。从存储数据的角度来说,数据库需要把这些数据持久化到硬盘上并最大程度地压缩,从而降低存储成本。从读取数据的角度来说,数据库需要保证实时查询,以及历史数据的查询效率。比较传统的存储方案是使用 MySql、Oracle 等关系型数据库,也有 Hadoop 体系的 HBase,专用的时序数据库则有 InfluxDB、OpenTSDB、Prometheus 等。 -2. 数据订阅(Data Subscription):很多时序数据应用都需要在第一时间订阅到业务所需的实时数据,从而及时了解被监测对对象的最新状态,用 AI 或其他工具做实时的数据分析。同时,由于数据的隐私以及安全,你只能容许应用订阅他有权限访问的数据。因此,一个时序数据处理平台一定需要具备数据订阅的能力,帮助应用实时获取最新数据。 +2. 数据订阅(Data Subscription):很多时序数据应用都需要在第一时间订阅到业务所需的实时数据,从而及时了解被监测对象的最新状态,用 AI 或其他工具做实时的数据分析。同时,由于数据的隐私以及安全,你只能允许应用订阅它有权限访问的数据。因此,一个时序数据处理平台一定需要具备数据订阅的能力,帮助应用实时获取最新数据。 3. ETL(Extract, Transform, Load):在实际的物联网、工业场景中,时序数据的采集需要特定的 ETL 工具进行数据的提取、清洗和转换操作,才能把数据写入数据库中,以保证数据的质量。因为不同数据采集系统往往使用不同的标准,比如采集的温度的物理单位不一致,有的用摄氏度,有的用华氏度;系统之间所在的时区不一致,要进行转换;时间分辨率也可能不统一,因此这些从不同系统汇聚来的数据需要进行转换才能写入数据库。 @@ -109,7 +109,7 @@ toc_max_heading_level: 4 5. 必须拥有高效的缓存功能:绝大部分场景,都需要能快速获取设备当前状态或其他信息,用以报警、大屏展示或其他。系统需要提供一高效机制,让用户可以获取全部、或符合过滤条件的部分设备的最新状态。 -6. 必须拥有实时流式计算:各种实时预警或预测已经不是简单的基于某一个阈值进行,而是需要通过将一个或多个设备产生的数据流进行实时聚合计算,不只是基于一个时间点、而是基于一个时间窗口进行计算。不仅如此,计算的需求也相当复杂,因场景而异,应容许用户自定义函数进行计算。 +6. 必须拥有实时流式计算:各种实时预警或预测已经不是简单的基于某一个阈值进行,而是需要通过将一个或多个设备产生的数据流进行实时聚合计算,不只是基于一个时间点、而是基于一个时间窗口进行计算。不仅如此,计算的需求也相当复杂,因场景而异,应允许用户自定义函数进行计算。 7. 必须支持数据订阅:与通用大数据平台比较一致,同一组数据往往有很多应用都需要,因此系统应该提供订阅功能,只要有新的数据更新,就应该实时提醒应用。由于数据隐私和安全,而且这个订阅也应该是个性化的,只能订阅有权查看的数据,比如仅仅能订阅每小时的平均功率,而不能订阅原始的电流、电压值。 @@ -119,7 +119,7 @@ toc_max_heading_level: 4 10. 必须支持灵活的多维度分析:对于联网设备产生的数据,需要进行各种维度的统计分析,比如从设备所处的地域进行分析,从设备的型号、供应商进行分析,从设备所使用的人员进行分析等等。而且这些维度的分析是无法事先想好的,是在实际运营过程中,根据业务发展的需求定下来的。因此时序大数据系统需要一个灵活的机制增加某个维度的分析。 -11. 需要支持即席分析和查询。为提高大数据分析师的工作效率,系统应该提供一命令行工具或容许用户通过其他工具,执行 SQL 查询,而不是非要通过编程接口。查询分析的结果可以很方便的导出,再制作成各种图表。 +11. 需要支持即席分析和查询。为提高大数据分析师的工作效率,系统应该提供一命令行工具或允许用户通过其他工具,执行 SQL 查询,而不是非要通过编程接口。查询分析的结果可以很方便的导出,再制作成各种图表。 12. 
必须支持数据降频、插值、特殊函数计算等操作。原始数据的采集频次可能很高,但具体分析往往不需要对原始数据执行,而是数据降频之后。系统需要提供高效的数据降频操作。设备是很难同步的,不同设备采集数据的时间点是很难对齐的,因此分析一个特定时间点的值,往往需要插值才能解决,系统需要提供线性插值、设置固定值等多种插值策略才行。工业互联网里,除通用的统计操作之外,往往还需要支持一些特殊函数,比如时间加权平均、累计求和、差值等。 diff --git a/docs/zh/03-intro.md b/docs/zh/03-intro.md index 096b18757a..d62b906277 100644 --- a/docs/zh/03-intro.md +++ b/docs/zh/03-intro.md @@ -6,7 +6,7 @@ toc_max_heading_level: 4 TDengine 是一个高性能、分布式的时序数据库。通过集成的缓存、数据订阅、流计算和数据清洗与转换等功能,TDengine 已经发展成为一个专为物联网、工业互联网、金融和 IT 运维等关键行业量身定制的时序大数据平台。该平台能够高效地汇聚、存储、分析、计算和分发来自海量数据采集点的大规模数据流,每日处理能力可达 TB 乃至 PB 级别。借助 TDengine,企业可以实现实时的业务监控和预警,进而发掘出有价值的商业洞察。 -自 2019 年 7 月 以 来, 涛 思 数 据 陆 续 将 TDengine 的 不 同 版 本 开 源, 包 括 单 机版(2019 年 7 月)、集群版(2020 年 8 月)以及云原生版(2022 年 8 月)。开源之后,TDengine 迅速获得了全球开发者的关注,多次在 GitHub 网站全球趋势排行榜上位居榜首。截至编写本书时,TDengine 在 GitHub 网站上已积累近 2.3 万颗星,安装实例超过53 万个,覆盖 60 多个国家和地区,广泛应用于电力、石油、化工、新能源、智能制造、汽车、环境监测等行业或领域,赢得了全球开发者的广泛认可 +自 2019 年 7 月 以来, 涛思数据陆续将 TDengine 的不同版本开源,包括单机版(2019 年 7 月)、集群版(2020 年 8 月)以及云原生版(2022 年 8 月)。开源之后,TDengine 迅速获得了全球开发者的关注,多次在 GitHub 网站全球趋势排行榜上位居榜首,最新的关注热度见[涛思数据首页](https://www.taosdata.com/)。 ## TDengine 产品 @@ -20,9 +20,9 @@ TDengine OSS 是一个开源的高性能时序数据库,与其他时序数据 ## TDengine 主要功能与特性 -TDengine 经过特别优化,以适应时间序列数据的独特需求,引入了“一个数据采集点一张表”和“超级表”的创新数据组织策略。这些策略背后的支撑是一个革命性的存储引擎,它极大地提升了数据处理的速度和效率,无论是在数据的写入、查询还是存储方面。接下来,逐一探索 TDengine 的众多功能,帮助您全面了解这个为高效处理时间序列数据而生的大数据平台。 +TDengine 经过特别优化,以适应时间序列数据的独特需求,引入了 “一个数据采集点一张表” 和 “超级表” 的创新数据组织策略。这些策略背后的支撑是一个革命性的存储引擎,它极大地提升了数据处理的速度和效率,无论是在数据的写入、查询还是存储方面。接下来,逐一探索 TDengine 的众多功能,帮助您全面了解这个为高效处理时间序列数据而生的大数据平台。 -1. 写入数据:TDengine 支持多种数据写入方式。首先,它完全兼容 SQL,允许用户使用标准的 SQL 语法进行数据写入。而且 TDengine 还支持无模式(Schemaless)写入,包括流行的 InfluxDB Line 协议、OpenTSDB 的 Telnet 和 JSON 协议,这些协议的加入使得数据的导入变得更加灵活和高效。更进一步,TDengine 与众多第三方工具实现了无缝集成,例如 Telegraf、Prometheus、EMQX、StatsD、collectd 和 HiveMQ 等。对于 TDengine Enterprise, TDengine 还提供了 MQTT、OPC-UA、OPC-DA、PI、Wonderware, Kafka 等连接器。这些工具通过简单的配置,无需一行代码,就可以将来自各种数据源的数据源源不断的写入数据库,极大地简化了数据收集和存储的过程。 +1. 
写入数据:TDengine 支持多种数据写入方式。首先,它完全兼容 SQL,允许用户使用标准的 SQL 语法进行数据写入。而且 TDengine 还支持无模式(Schemaless)写入,包括流行的 InfluxDB Line 协议、OpenTSDB 的 Telnet 和 JSON 协议,这些协议的加入使得数据的导入变得更加灵活和高效。更进一步,TDengine 与众多第三方工具实现了无缝集成,例如 Telegraf、Prometheus、EMQX、StatsD、collectd 和 HiveMQ 等。在 TDengine Enterprise 中, 还提供了 MQTT、OPC-UA、OPC-DA、PI、Wonderware、Kafka、InfluxDB、OpenTSDB、MySQL、Oracle 和 SQL Server 等连接器。这些工具通过简单的配置,无需一行代码,就可以将来自各种数据源的数据源源不断的写入数据库,极大地简化了数据收集和存储的过程。 2. 查询数据:TDengine 提供标准的 SQL 查询语法,并针对时序数据和业务的特点优化和新增了许多语法和功能,例如降采样、插值、累计求和、时间加权平均、状态窗口、时间窗口、会话窗口、滑动窗口等。TDengine 还支持用户自定义函数(UDF) @@ -38,13 +38,11 @@ TDengine 经过特别优化,以适应时间序列数据的独特需求,引 8. 数据迁移:TDengine 提供了多种便捷的数据导入导出功能,包括脚本文件导入导出、数据文件导入导出、taosdump 工具导入导出等。企业版还支持边云协同、数据同步等场景,兼容多种数据源,如 AVEVA PI System 等。 -9. 编程连接器:TDengine 提供不同语言的连接器,包括 C/C++、Java、Go、Node.js、Rust、Python、C#、R、PHP 等。而且 TDengine 支持 REST 接口,应用可以直接通过 HTTP POST 请求 BODY 中包含的 SQL 语句来操作数据库。 +9. 编程连接器:TDengine 提供不同语言的连接器,包括 C/C++、Java、Go、Node.js、Rust、Python、C#、R、PHP 等。这些连接器大多都支持原生连接和 WebSocket 两种连接方式。TDengine 也提供 REST 接口,任何语言的应用程序可以直接通过 HTTP 请求访问数据库。 -10. 数据安全共享:TDengine 通过数据库视图功能和权限管理,确保数据访问的安全性。结合数据订阅功能实现灵活精细的数据分发控制,保护数据安全和隐私 +10. 数据安全:TDengine 提供了丰富的用户管理和权限管理功能以控制不同用户对数据库和表的访问权限,提供了 IP 白名单功能以控制不同帐号只能从特定的服务器接入集群。TDengine 支持系统管理员对不同数据库按需加密,数据加密后对读写完全透明且对性能的影响很小。还提供了审计日志功能以记录系统中的敏感操作。 -11. 编程连接器:TDengine 提供了丰富的编程语言连接器,包括 C/C++、Java、Go、Node.js、Rust、Python、C#、R、PHP 等,并支持 REST ful 接口,方便应用通过HTTP POST 请求操作数据库。 - -12. 常用工具:TDengine 还提供了交互式命令行程序(CLI),便于管理集群、检查系统状态、做即时查询。压力测试工具 taosBenchmark,用于测试 TDengine 的性能。TDengine 还提供了图形化管理界面,简化了操作和管理过程。 +11. 
常用工具:TDengine 还提供了交互式命令行程序(CLI),便于管理集群、检查系统状态、做即时查询。压力测试工具 taosBenchmark,用于测试 TDengine 的性能。TDengine 还提供了图形化管理界面,简化了操作和管理过程。 ## TDengine 与典型时序数据库的区别 diff --git a/docs/zh/04-get-started/01-docker.md b/docs/zh/04-get-started/01-docker.md index d0a40e8e28..364e00f8f2 100644 --- a/docs/zh/04-get-started/01-docker.md +++ b/docs/zh/04-get-started/01-docker.md @@ -71,4 +71,54 @@ taos> ## 快速体验 -想要快速体验 TDengine 的写入和查询能力,请参考[快速体验](../use) \ No newline at end of file +### 体验写入 + +taosBenchmark 是一个专为测试 TDengine 性能而设计的工具,它能够全面评估TDengine 在写入、查询和订阅等方面的功能表现。该工具能够模拟大量设备产生的数据,并允许用户灵活控制数据库、超级表、标签列的数量和类型、数据列的数量和类型、子表数量、每张子表的数据量、写入数据的时间间隔、工作线程数量以及是否写入乱序数据等策略。 + +启动 TDengine 的服务,在终端中执行如下命令 + +```shell +taosBenchmark -y +``` + +系统将自动在数据库 test 下创建一张名为 meters的超级表。这张超级表将包含 10,000 张子表,表名从 d0 到 d9999,每张表包含 10,000条记录。每条记录包含 ts(时间戳)、current(电流)、voltage(电压)和 phase(相位)4个字段。时间戳范围从“2017-07-14 10:40:00 000” 到 “2017-07-14 10:40:09 999”。每张表还带有 location 和 groupId 两个标签,其中,groupId 设置为 1 到 10,而 location 则设置为 California.Campbell、California.Cupertino 等城市信息。 + +执行该命令后,系统将迅速完成 1 亿条记录的写入过程。实际所需时间取决于硬件性能,但即便在普通 PC 服务器上,这个过程通常也只需要十几秒。 + +taosBenchmark 提供了丰富的选项,允许用户自定义测试参数,如表的数目、记录条数等。要查看详细的参数列表,请在终端中输入如下命令 +```shell +taosBenchmark --help +``` + +有关taosBenchmark 的详细使用方法,请参考[taosBenchmark 参考手册](../../reference/components/taosbenchmark) + +### 体验查询 + +使用上述 taosBenchmark 插入数据后,可以在 TDengine CLI(taos)输入查询命令,体验查询速度。 + +1. 查询超级表 meters 下的记录总条数 +```shell +SELECT COUNT(*) FROM test.meters; +``` + +2. 查询 1 亿条记录的平均值、最大值、最小值 +```shell +SELECT AVG(current), MAX(voltage), MIN(phase) FROM test.meters; +``` + +3. 查询 location = "California.SanFrancisco" 的记录总条数 +```shell +SELECT COUNT(*) FROM test.meters WHERE location = "California.SanFrancisco"; +``` + +4. 查询 groupId = 10 的所有记录的平均值、最大值、最小值 +```shell +SELECT AVG(current), MAX(voltage), MIN(phase) FROM test.meters WHERE groupId = 10; +``` + +5. 
对表 d1001 按每 10 秒进行平均值、最大值和最小值聚合统计 +```shell +SELECT _wstart, AVG(current), MAX(voltage), MIN(phase) FROM test.d1001 INTERVAL(10s); +``` + +在上面的查询中,使用系统提供的伪列_wstart 来给出每个窗口的开始时间。 \ No newline at end of file diff --git a/docs/zh/04-get-started/03-package.md b/docs/zh/04-get-started/03-package.md index 93804f31dc..4906a2fcfa 100644 --- a/docs/zh/04-get-started/03-package.md +++ b/docs/zh/04-get-started/03-package.md @@ -267,4 +267,54 @@ Query OK, 2 row(s) in set (0.003128s) ## 快速体验 -想要快速体验 TDengine 的写入和查询能力,请参考[快速体验](../use) \ No newline at end of file +### 体验写入 + +taosBenchmark 是一个专为测试 TDengine 性能而设计的工具,它能够全面评估TDengine 在写入、查询和订阅等方面的功能表现。该工具能够模拟大量设备产生的数据,并允许用户灵活控制数据库、超级表、标签列的数量和类型、数据列的数量和类型、子表数量、每张子表的数据量、写入数据的时间间隔、工作线程数量以及是否写入乱序数据等策略。 + +启动 TDengine 的服务,在终端中执行如下命令 + +```shell +taosBenchmark -y +``` + +系统将自动在数据库 test 下创建一张名为 meters的超级表。这张超级表将包含 10,000 张子表,表名从 d0 到 d9999,每张表包含 10,000条记录。每条记录包含 ts(时间戳)、current(电流)、voltage(电压)和 phase(相位)4个字段。时间戳范围从 “2017-07-14 10:40:00 000” 到 “2017-07-14 10:40:09 999”。每张表还带有 location 和 groupId 两个标签,其中,groupId 设置为 1 到 10,而 location 则设置为 California.Campbell、California.Cupertino 等城市信息。 + +执行该命令后,系统将迅速完成 1 亿条记录的写入过程。实际所需时间取决于硬件性能,但即便在普通 PC 服务器上,这个过程通常也只需要十几秒。 + +taosBenchmark 提供了丰富的选项,允许用户自定义测试参数,如表的数目、记录条数等。要查看详细的参数列表,请在终端中输入如下命令 +```shell +taosBenchmark --help +``` + +有关taosBenchmark 的详细使用方法,请参考[taosBenchmark 参考手册](../../reference/components/taosbenchmark) + +### 体验查询 + +使用上述 taosBenchmark 插入数据后,可以在 TDengine CLI(taos)输入查询命令,体验查询速度。 + +1. 查询超级表 meters 下的记录总条数 +```shell +SELECT COUNT(*) FROM test.meters; +``` + +2. 查询 1 亿条记录的平均值、最大值、最小值 +```shell +SELECT AVG(current), MAX(voltage), MIN(phase) FROM test.meters; +``` + +3. 查询 location = "California.SanFrancisco" 的记录总条数 +```shell +SELECT COUNT(*) FROM test.meters WHERE location = "California.SanFrancisco"; +``` + +4. 查询 groupId = 10 的所有记录的平均值、最大值、最小值 +```shell +SELECT AVG(current), MAX(voltage), MIN(phase) FROM test.meters WHERE groupId = 10; +``` + +5. 
对表 d1001 按每 10 秒进行平均值、最大值和最小值聚合统计 +```shell +SELECT _wstart, AVG(current), MAX(voltage), MIN(phase) FROM test.d1001 INTERVAL(10s); +``` + +在上面的查询中,使用系统提供的伪列_wstart 来给出每个窗口的开始时间。 \ No newline at end of file diff --git a/docs/zh/04-get-started/07-use.md b/docs/zh/04-get-started/_07-use.md similarity index 100% rename from docs/zh/04-get-started/07-use.md rename to docs/zh/04-get-started/_07-use.md diff --git a/docs/zh/05-basic/03-query.md b/docs/zh/05-basic/03-query.md index e7ef86888f..1b4c3731e6 100644 --- a/docs/zh/05-basic/03-query.md +++ b/docs/zh/05-basic/03-query.md @@ -41,7 +41,7 @@ LIMIT 5 TDengine 支持通过 GROUP BY 子句,对数据进行聚合查询。SQL 语句包含 GROUP BY 子句时,SELECT 列表只能包含如下表达式: 1. 常量 -2. 聚集函数 +2. 聚合函数 3. 与 GROUP BY 后表达式相同的表达式 4. 包含前面表达式的表达式 @@ -158,7 +158,7 @@ window_clause: { **注意** 在使用窗口子句时应注意以下规则: 1. 窗口子句位于数据切分子句之后,不可以和 GROUP BY 子句一起使用。 -2. 窗口子句将数据按窗口进行切分,对每个窗口进行 SELECT 列表中的表达式的计算,SELECT 列表中的表达式只能包含:常量;伪列:_wstart 伪列、_wend 伪列和 _wduration 伪列;聚集函数(包括选择函数和可以由参数确定输出行数的时序特有函数) +2. 窗口子句将数据按窗口进行切分,对每个窗口进行 SELECT 列表中的表达式的计算,SELECT 列表中的表达式只能包含:常量;伪列:_wstart 伪列、_wend 伪列和 _wduration 伪列;聚合函数(包括选择函数和可以由参数确定输出行数的时序特有函数) 3. 
WHERE 语句可以指定查询的起止时间和其他过滤条件。 ### 时间戳伪列 diff --git a/docs/zh/06-advanced/02-cache.md b/docs/zh/06-advanced/02-cache.md index d1ad356c17..ca1da30dbf 100644 --- a/docs/zh/06-advanced/02-cache.md +++ b/docs/zh/06-advanced/02-cache.md @@ -16,12 +16,12 @@ TDengine 采用了一种创新的时间驱动缓存管理策略,亦称为写 为了实现数据的分布式存储和高可用性,TDengine 引入了虚拟节点(vnode)的概念。每个 vnode 可以拥有多达 3 个副本,这些副本共同组成一个 vnode group,简称 vgroup。在创建数据库时,用户需要确定每个 vnode 的写入缓存大小,以确保数据的合理分配和高效存储。 -创建数据库时的两个关键参数—vgroups 和 buffer—分别决定了数据库中的数据由多少个 vgroup 进行处理,以及为每个 vnode 分配多少写入缓存。通过合理配置这两个 +创建数据库时的两个关键参数 `vgroups` 和 `buffer` 分别决定了数据库中的数据由多少个 vgroup 进行处理,以及为每个 vnode 分配多少写入缓存。通过合理配置这两个 参数,用户可以根据实际需求调整数据库的性能和存储容量,从而实现最佳的性能和成本效益。 例 如, 下面的 SQL 创建了包含 10 个 vgroup,每个 vnode 占 用 256MB 内存的数据库。 -```ssql -create database power vgroups 10 buffer 256 cachemodel 'none' pages 128 pagesize 16 +```sql +CREATE DATABASE POWER VGROUPS 10 BUFFER 256 CACHEMODEL 'NONE' PAGES 128 PAGESIZE 16; ``` 缓存越大越好,但超过一定阈值后再增加缓存对写入性能提升并无帮助。 @@ -43,7 +43,7 @@ create database power vgroups 10 buffer 256 cachemodel 'none' pages 128 pagesize 为了提升查询和写入操作的效率,每个 vnode 都配备了缓存机制,用于存储其曾经获取过的元数据。这一元数据缓存的大小由创建数据库时的两个参数 pages 和 pagesize 共同决定。其中,pagesize 参数的单位是 KB,用于指定每个缓存页的大小。如下 SQL 会为数据库 power 的每个 vnode 创建 128 个 page、每个 page 16KB 的元数据缓存 ```sql -create database power pages 128 pagesize 16 +CREATE DATABASE POWER PAGES 128 PAGESIZE 16; ``` ## 文件系统缓存 @@ -57,7 +57,7 @@ TDengine 利用这些日志文件实现故障前的状态恢复。在写入 WAL - wal_fsync_period:当 wal_level 设置为 2 时,这个参数控制执行 fsync 的频率。设置为 0 表示每次写入后立即执行 fsync,这可以确保数据的安全性,但可能会牺牲一些性能。当设置为大于 0 的数值时,表示 fsync 周期,默认为 3000,范围是[1, 180000],单位毫秒。 ```sql -create database power wal_level 1 wal_fsync_period 3000 +CREATE DATABASE POWER WAL_LEVEL 1 WAL_FSYNC_PERIOD 3000; ``` 在创建数据库时可以选择不同的参数类型,来选择性能优先或者可靠性优先。 diff --git a/docs/zh/06-advanced/05-data-in/index.md b/docs/zh/06-advanced/05-data-in/index.md index 99dfbf098b..32d530acb5 100644 --- a/docs/zh/06-advanced/05-data-in/index.md +++ b/docs/zh/06-advanced/05-data-in/index.md @@ -1,276 +1,276 @@ ---- -sidebar_label: 数据写入 -title: 
零代码数据源接入 -toc_max_heading_level: 4 ---- - -## 概述 - -TDengine Enterprise 配备了一个强大的可视化数据管理工具—taosExplorer。借助 taosExplorer,用户只须在浏览器中简单配置,就能轻松地向 TDengine 提交任务,实现以零代码方式将来自不同数据源的数据无缝导入 TDengine。在导入过程中,TDengine 会对数据进行自动提取、过滤和转换,以保证导入的数据质量。通过这种零代码数据源接入方式,TDengine 成功转型为一个卓越的时序大数据汇聚平台。用户无须部署额外的 ETL 工具,从而大大简化整体架构的设计,提高了数据处理效率。 - -下图展示了零代码接入平台的系统架构。 - -![零代码数据接入架构图](./data-in.png) - -## 支持的数据源 - -目前 TDengine 支持的数据源如下: - -1. Aveva PI System:一个工业数据管理和分析平台,前身为 OSIsoft PI System,它能够实时采集、整合、分析和可视化工业数据,助力企业实现智能化决策和精细化管理 -2. Aveva Historian:一个工业大数据分析软件,前身为 Wonderware Historian,专为工业环境设计,用于存储、管理和分析来自各种工业设备、传感器的实时和历史数据。 -3. OPC DA/UA:OPC 是 Open Platform Communications 的缩写,是一种开放式、标准化的通信协议,用于不同厂商的自动化设备之间进行数据交换。它最初由微软公司开发,旨在解决工业控制领域中不同设备之间互操作性差的问题。OPC 协议最初于 1996 年发布,当时称为 OPC DA (Data Access),主要用于实时数据采集和控制;2006 年,OPC 基金会发布了 OPC UA (Unified Architecture) 标准,它是一种基于服务的面向对象的协议,具有更高的灵活性和可扩展性,已成为 OPC 协议的主流版本。 -4. MQTT:Message Queuing Telemetry Transport 的缩写,一种基于发布/订阅模式的轻量级通讯协议,专为低开销、低带宽占用的即时通讯设计,广泛适用于物联网、小型设备、移动应用等领域。 -5. Kafka:由 Apache 软件基金会开发的一个开源流处理平台,主要用于处理实时数据,并提供一个统一、高通量、低延迟的消息系统。它具备高速度、可伸缩性、持久性和分布式设计等特点,使得它能够在每秒处理数十万次的读写操作,支持上千个客户端,同时保持数据的可靠性和可用性。 -6. OpenTSDB:基于 HBase 的分布式、可伸缩的时序数据库。它主要用于存储、索引和提供从大规模集群(包括网络设备、操作系统、应用程序等)中收集的指标数据,使这些数据更易于访问和图形化展示。 -7. CSV:Comma Separated Values 的缩写,是一种以逗号分隔的纯文本文件格式,通常用于电子表格或数据库软件。 -8. TDengine 2:泛指运行 TDengine 2.x 版本的 TDengine 实例。 -9. TDengine 3:泛指运行 TDengine 3.x 版本的 TDengine 实例。 -10. MySQL, PostgreSQL, Oracle 等关系型数据库。 - -## 数据提取、过滤和转换 - -因为数据源可以有多个,每个数据源的物理单位可能不一样,命名规则也不一样,时区也可能不同。为解决这个问题,TDengine 内置 ETL 功能,可以从数据源的数据包中解析、提取需要的数据,并进行过滤和转换,以保证写入数据的质量,提供统一的命名空间。具体的功能如下: - -1. 解析:使用 JSON Path 或正则表达式,从原始消息中解析字段 -2. 从列中提取或拆分:使用 split 或正则表达式,从一个原始字段中提取多个字段 -3. 过滤:只有表达式的值为 true 时,消息才会被写入 TDengine -4. 
转换:建立解析后的字段和 TDengine 超级表字段之间的转换与映射关系。 - -下面详细讲解数据转换规则 - - -### 解析 - -仅非结构化的数据源需要这个步骤,目前 MQTT 和 Kafka 数据源会使用这个步骤提供的规则来解析非结构化数据,以初步获取结构化数据,即可以以字段描述的行列数据。在 explorer 中您需要提供示例数据和解析规则,来预览解析出以表格呈现的结构化数据。 - -#### 示例数据 - -![示例数据](./pic/transform-01.png) - -如图,textarea 输入框中就是示例数据,可以通过三种方式来获取示例数据: - -1. 直接在 textarea 中输入示例数据; -2. 点击右侧按钮 “从服务器检索” 则从配置的服务器获取示例数据,并追加到示例数据 textarea 中; -3. 上传文件,将文件内容追加到示例数据 textarea 中。 - -#### 解析 - -解析就是通过解析规则,将非结构化字符串解析为结构化数据。消息体的解析规则目前支持 JSON、Regex 和 UDT。 - -##### JSON 解析 - -如下 JSON 示例数据,可自动解析出字段:`groupid`、`voltage`、`current`、`ts`、`inuse`、`location`。 - -``` json -{"groupid": 170001, "voltage": "221V", "current": 12.3, "ts": "2023-12-18T22:12:00", "inuse": true, "location": "beijing.chaoyang.datun"} -{"groupid": 170001, "voltage": "220V", "current": 12.2, "ts": "2023-12-18T22:12:02", "inuse": true, "location": "beijing.chaoyang.datun"} -{"groupid": 170001, "voltage": "216V", "current": 12.5, "ts": "2023-12-18T22:12:04", "inuse": false, "location": "beijing.chaoyang.datun"} -``` - -如下嵌套结构的 JSON 数据,可自动解析出字段`groupid`、`data_voltage`、`data_current`、`ts`、`inuse`、`location_0_province`、`location_0_city`、`location_0_datun`,也可以选择要解析的字段,并设置解析的别名。 - -``` json -{"groupid": 170001, "data": { "voltage": "221V", "current": 12.3 }, "ts": "2023-12-18T22:12:00", "inuse": true, "location": [{"province": "beijing", "city":"chaoyang", "street": "datun"}]} -``` - -![JSON 解析](./pic/transform-02.png) - -##### Regex 正则表达式 - -可以使用正则表达式的**命名捕获组**从任何字符串(文本)字段中提取多个字段。如图所示,从 nginx 日志中提取访问ip、时间戳、访问的url等字段。 - -``` re -(?\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b)\s-\s-\s\[(?\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2}\s\+\d{4})\]\s"(?[A-Z]+)\s(?[^\s"]+).*(?\d{3})\s(?\d+) -``` - -![Regex 解析](./pic/transform-03.png) - -##### UDT 自定义解析脚本 - -自定义 rhai 语法脚本解析输入数据(参考 `https://rhai.rs/book/` ),脚本目前仅支持 json 格式原始数据。 - -**输入**:脚本中可以使用参数 data, data 是原始数据 json 解析后的 Object Map; - -**输出**:输出的数据必须是数组。 - -例如对于数据,一次上报三相电压值,分别入到三个子表中。则需要对这类数据做解析 - -``` json -{ - "ts": "2024-06-27 18:00:00", - "voltage": 
"220.1,220.3,221.1", - "dev_id": "8208891" -} -``` - -那么可以使用如下脚本来提取三个电压数据。 - -``` -let v3 = data["voltage"].split(","); - -[ -#{"ts": data["ts"], "val": v3[0], "dev_id": data["dev_id"]}, -#{"ts": data["ts"], "val": v3[1], "dev_id": data["dev_id"]}, -#{"ts": data["ts"], "val": v3[2], "dev_id": data["dev_id"]} -] -``` - -最终解析结果如下所示: - -![UDT](./pic/transform-udf.png) - -### 提取或拆分 - -解析后的数据,可能还无法满足目标表的数据要求。比如智能表原始采集数据如下( json 格式): - -``` json -{"groupid": 170001, "voltage": "221V", "current": 12.3, "ts": "2023-12-18T22:12:00", "inuse": true, "location": "beijing.chaoyang.datun"} -{"groupid": 170001, "voltage": "220V", "current": 12.2, "ts": "2023-12-18T22:12:02", "inuse": true, "location": "beijing.chaoyang.datun"} -{"groupid": 170001, "voltage": "216V", "current": 12.5, "ts": "2023-12-18T22:12:04", "inuse": false, "location": "beijing.chaoyang.datun"} -``` - -使用 json 规则解析出的电压是字符串表达的带单位形式,最终入库希望能使用 int 类型记录电压值和电流值,便于统计分析,此时就需要对电压进一步拆分;另外日期期望拆分为日期和时间入库。 - -如下图所示可以对源字段`ts`使用 split 规则拆分成日期和时间,对字段`voltage`使用 regex 提取出电压值和电压单位。split 规则需要设置**分隔符**和**拆分数量**,拆分后的字段命名规则为`{原字段名}_{顺序号}`,Regex 规则同解析过程中的一样,使用**命名捕获组**命名提取字段。 - -![拆分和提取](./pic/transform-04.png) - -### 过滤 - -过滤功能可以设置过滤条件,满足条件的数据行 才会被写入目标表。过滤条件表达式的结果必须是 boolean 类型。在编写过滤条件前,必须确定 解析字段的类型,根据解析字段的类型,可以使用判断函数、比较操作符(`>`、`>=`、`<=`、`<`、`==`、`!=`)来判断。 - -#### 字段类型及转换 - -只有明确解析出的每个字段的类型,才能使用正确的语法做数据过滤。 - -使用 json 规则解析出的字段,按照属性值来自动设置类型: - -1. bool 类型:"inuse": true -2. int 类型:"voltage": 220 -3. float 类型:"current" : 12.2 -4. String 类型:"location": "MX001" - -使用 regex 规则解析的数据都是 string 类型。 -使用 split 和 regex 提取或拆分的数据是 string 类型。 - -如果提取出的数据类型不是预期中的类型,可以做数据类型转换。常用的数据类型转换就是把字符串转换成为数值类型。支持的转换函数如下: - -|Function|From type|To type|e.g.| -|:----|:----|:----|:----| -| parse_int | string | int | parse_int("56") // 结果为整数 56 | -| parse_float | string | float | parse_float("12.3") // 结果为浮点数 12.3 | - -#### 判断表达式 - -不同的数据类型有各自判断表达式的写法。 - -##### BOOL 类型 - -可以使用变量或者使用操作符`!`,比如对于字段 "inuse": true,可以编写以下表达式: - -> 1. inuse -> 2. 
!inuse - -##### 数值类型(int/float) - -数值类型支持使用比较操作符`==`、`!=`、`>`、`>=`、`<`、`<=`。 - -##### 字符串类型 - -使用比较操作符,比较字符串。 - -字符串函数 - -|Function|Description|e.g.| -|:----|:----|:----| -| is_empty | returns true if the string is empty | s.is_empty() | -| contains | checks if a certain character or sub-string occurs in the string | s.contains("substring") | -| starts_with | returns true if the string starts with a certain string | s.starts_with("prefix") | -| ends_with | returns true if the string ends with a certain string | s.ends_with("suffix") | -| len | returns the number of characters (not number of bytes) in the string,must be used with comparison operator | s.len == 5 判断字符串长度是否为5;len作为属性返回 int ,和前四个函数有区别,前四个直接返回 bool。 | - -##### 复合表达式 - -多个判断表达式,可以使用逻辑操作符(&&、||、!)来组合。 -比如下面的表达式表示获取北京市安装的并且电压值大于 200 的智能表数据。 - -> location.starts_with("beijing") && voltage > 200 - -### 映射 - -映射是将解析、提取、拆分的**源字段**对应到**目标表字段**,可以直接对应,也可以通过一些规则计算后再映射到目标表。 - -#### 选择目标超级表 - -选择目标超级表后,会加载出超级表所有的 tags 和 columns。 -源字段根据名称自动使用 mapping 规则映射到目标超级表的 tag 和 column。 -例如有如下解析、提取、拆分后的预览数据: - -#### 映射规则 - -支持的映射规则如下表所示: - -|rule|description| -|:----|:----| -| mapping | 直接映射,需要选择映射源字段。| -| value | 常量,可以输入字符串常量,也可以是数值常量,输入的常量值直接入库。| -| generator | 生成器,目前仅支持时间戳生成器 now,入库时会将当前时间入库。| -| join | 字符串连接器,可指定连接字符拼接选择的多个源字段。| -| format | **字符串格式化工具**,填写格式化字符串,比如有三个源字段 year, month, day 分别表示年月日,入库希望以yyyy-MM-dd的日期格式入库,则可以提供格式化字符串为 `${year}-${month}-${day}`。其中`${}`作为占位符,占位符中可以是一个源字段,也可以是 string 类型字段的函数处理| -| sum | 选择多个数值型字段做加法计算。| -| expr | **数值运算表达式**,可以对数值型字段做更加复杂的函数处理和数学运算。| - -##### format 中支持的字符串处理函数 - -|Function|description|e.g.| -|:----|:----|:----| -| pad(len, pad_chars) | pads the string with a character or a string to at least a specified length | "1.2".pad(5, '0') // 结果为"1.200" | -|trim|trims the string of whitespace at the beginning and end|" abc ee ".trim() // 结果为"abc ee"| -|sub_string(start_pos, len)|extracts a sub-string,两个参数:
1. start position, counting from end if < 0
2. (optional) number of characters to extract, none if ≤ 0, to end if omitted|"012345678".sub_string(5) // "5678"
"012345678".sub_string(5, 2) // "56"
"012345678".sub_string(-2) // "78"| -|replace(substring, replacement)|replaces a sub-string with another|"012345678".replace("012", "abc") // "abc345678"| - -##### expr 数学计算表达式 - -基本数学运算支持加`+`、减`-`、乘`*`、除`/`。 - -比如数据源采集数值以设置度为单位,目标库存储华氏度温度值。那么就需要对采集的温度数据做转换。 - -解析的源字段为`temperature`,则需要使用表达式 `temperature * 1.8 + 32`。 - -数值表达式中也支持使用数学函数,可用的数学函数如下表所示: - -|Function|description|e.g.| -|:----|:----|:----| -|sin、cos、tan、sinh、cosh|Trigonometry|a.sin() | -|asin、acos、atan、 asinh、acosh|arc-trigonometry|a.asin()| -|sqrt|Square root|a.sqrt() // 4.sqrt() == 2| -|exp|Exponential|a.exp()| -|ln、log|Logarithmic|a.ln() // e.ln() == 1
a.log() // 10.log() == 1| -|floor、ceiling、round、int、fraction|rounding|a.floor() // (4.2).floor() == 4
a.ceiling() // (4.2).ceiling() == 5
a.round() // (4.2).round() == 4
a.int() // (4.2).int() == 4
a.fraction() // (4.2).fraction() == 0.2| - -#### 子表名映射 - -子表名类型为字符串,可以使用映射规则中的字符串格式化 format 表达式定义子表名。 - -## 任务的创建 - -下面以 MQTT 数据源为例,介绍如何创建一个 MQTT 类型的任务,从 MQTT Broker 消费数据,并写入 TDengine。 - -1. 登录至 taosExplorer 以后,点击左侧导航栏上的“数据写入”,即可进入任务列表页面 -2. 在任务列表页面,点击“+ 新增数据源”,即可进入任务创建页面 -3. 输入任务名称后,选择类型为 MQTT, 然后可以创建一个新的代理,或者选择已创建的代理 -4. 输入 MQTT broker 的 IP 地址和端口号,例如:192.168.1.100:1883 -5. 配置认证和 SSL 加密: - - 如果 MQTT broker 开启了用户认证,则在认证部分,输入 MQTT broker 的用户名和密码; - - 如果 MQTT broker 开启了 SSL 加密,则可以打开页面上的 SSL 证书开关,并上传 CA 的证书,以及客户端的证书和私钥文件; -6. 在“采集配置“部分,可选择 MQTT 协议的版本,目前支持 3.1, 3.1.1, 5.0 三个版本;配置 Client ID 时要注意,如果对同一个 MQTT broker 创建了多个任务,Client ID 应不同,否则会造成 Client ID 冲突,导致任务无法正常运行;在对主题和 QoS 进行配置时,需要使用 `::` 的形式,即订阅的主题与 QoS 之间要使用两个冒号分隔,其中 QoS 的取值范围为 0, 1, 2, 分别代表 at most once, at lease once, exactly once;配置完成以上信息后,可以点击“检查连通性”按钮,对以上配置进行检查,如果连通性检查失败,请按照页面上返回的具体错误提示进行修改; -7. 在从 MQTT broker 同步数据的过程中,taosX 还支持对消息体中的字段进行提取,过滤、映射等操作。在位于 “Payload 转换”下方的文本框中,可以直接输入输入消息体样例,或是以上传文件的方式导入,以后还会支持直接从所配置的服务器中检索样例消息; -8. 对消息体字段的提取,目前支持 2 种方式:JSON 和正则表达式。对于简单的 key/value 格式的 JSON 数据,可以直接点击提取按钮,即可展示解析出的字段名;对于复杂的 JSON 数据,可以使用 JSON Path 提取感兴趣的字段;当使用正则表达式提取字段时,要保证正则表达式的正确性; -9. 消息体中的字段被解析后,可以基于解析出的字段名设置过滤规则,只有满足过滤规则的数据,才会写入 TDengine,否则会忽略该消息;例如:可以配置过滤规则为 voltage > 200,即只有当电压大于 200V 的数据才会被同步至 TDengine; -10. 最后,在配置完消息体中的字段和超级表中的字段的映射规则后,就可以提交任务了;除了基本的映射以外,在这里还可以对消息中字段的值进行转换,例如:可以通过表达式 (expr) 将原消息体中的电压和电流,计算为功率后再写入 TDengine; -11. 任务提交后,会自动返回任务列表页面,如果提交成功,任务的状态会切换至“运行中”,如果提交失败,可通过查看该任务的活动日志,查找错误原因; -12. 
对于运行中的任务,点击指标的查看按钮,可以查看该任务的详细运行指标,弹出窗口划分为 2 个标签页,分别展示该任务多次运行的累计指标和本次运行的指标,这些指标每 2 秒钟会自动刷新一次。 - -## 任务管理 - +--- +sidebar_label: 数据接入 +title: 零代码第三方数据接入 +toc_max_heading_level: 4 +--- + +## 概述 + +TDengine Enterprise 配备了一个强大的可视化数据管理工具—taosExplorer。借助 taosExplorer,用户只须在浏览器中简单配置,就能轻松地向 TDengine 提交任务,实现以零代码方式将来自不同数据源的数据无缝导入 TDengine。在导入过程中,TDengine 会对数据进行自动提取、过滤和转换,以保证导入的数据质量。通过这种零代码数据源接入方式,TDengine 成功转型为一个卓越的时序大数据汇聚平台。用户无须部署额外的 ETL 工具,从而大大简化整体架构的设计,提高了数据处理效率。 + +下图展示了零代码接入平台的系统架构。 + +![零代码数据接入架构图](./data-in.png) + +## 支持的数据源 + +目前 TDengine 支持的数据源如下: + +1. Aveva PI System:一个工业数据管理和分析平台,前身为 OSIsoft PI System,它能够实时采集、整合、分析和可视化工业数据,助力企业实现智能化决策和精细化管理 +2. Aveva Historian:一个工业大数据分析软件,前身为 Wonderware Historian,专为工业环境设计,用于存储、管理和分析来自各种工业设备、传感器的实时和历史数据。 +3. OPC DA/UA:OPC 是 Open Platform Communications 的缩写,是一种开放式、标准化的通信协议,用于不同厂商的自动化设备之间进行数据交换。它最初由微软公司开发,旨在解决工业控制领域中不同设备之间互操作性差的问题。OPC 协议最初于 1996 年发布,当时称为 OPC DA (Data Access),主要用于实时数据采集和控制;2006 年,OPC 基金会发布了 OPC UA (Unified Architecture) 标准,它是一种基于服务的面向对象的协议,具有更高的灵活性和可扩展性,已成为 OPC 协议的主流版本。 +4. MQTT:Message Queuing Telemetry Transport 的缩写,一种基于发布/订阅模式的轻量级通讯协议,专为低开销、低带宽占用的即时通讯设计,广泛适用于物联网、小型设备、移动应用等领域。 +5. Kafka:由 Apache 软件基金会开发的一个开源流处理平台,主要用于处理实时数据,并提供一个统一、高通量、低延迟的消息系统。它具备高速度、可伸缩性、持久性和分布式设计等特点,使得它能够在每秒处理数十万次的读写操作,支持上千个客户端,同时保持数据的可靠性和可用性。 +6. OpenTSDB:基于 HBase 的分布式、可伸缩的时序数据库。它主要用于存储、索引和提供从大规模集群(包括网络设备、操作系统、应用程序等)中收集的指标数据,使这些数据更易于访问和图形化展示。 +7. CSV:Comma Separated Values 的缩写,是一种以逗号分隔的纯文本文件格式,通常用于电子表格或数据库软件。 +8. TDengine 2:泛指运行 TDengine 2.x 版本的 TDengine 实例。 +9. TDengine 3:泛指运行 TDengine 3.x 版本的 TDengine 实例。 +10. MySQL, PostgreSQL, Oracle 等关系型数据库。 + +## 数据提取、过滤和转换 + +因为数据源可以有多个,每个数据源的物理单位可能不一样,命名规则也不一样,时区也可能不同。为解决这个问题,TDengine 内置 ETL 功能,可以从数据源的数据包中解析、提取需要的数据,并进行过滤和转换,以保证写入数据的质量,提供统一的命名空间。具体的功能如下: + +1. 解析:使用 JSON Path 或正则表达式,从原始消息中解析字段 +2. 从列中提取或拆分:使用 split 或正则表达式,从一个原始字段中提取多个字段 +3. 过滤:只有表达式的值为 true 时,消息才会被写入 TDengine +4. 
转换:建立解析后的字段和 TDengine 超级表字段之间的转换与映射关系。 + +下面详细讲解数据转换规则 + + +### 解析 + +仅非结构化的数据源需要这个步骤,目前 MQTT 和 Kafka 数据源会使用这个步骤提供的规则来解析非结构化数据,以初步获取结构化数据,即可以以字段描述的行列数据。在 explorer 中您需要提供示例数据和解析规则,来预览解析出以表格呈现的结构化数据。 + +#### 示例数据 + +![示例数据](./pic/transform-01.png) + +如图,textarea 输入框中就是示例数据,可以通过三种方式来获取示例数据: + +1. 直接在 textarea 中输入示例数据; +2. 点击右侧按钮 “从服务器检索” 则从配置的服务器获取示例数据,并追加到示例数据 textarea 中; +3. 上传文件,将文件内容追加到示例数据 textarea 中。 + +#### 解析 + +解析就是通过解析规则,将非结构化字符串解析为结构化数据。消息体的解析规则目前支持 JSON、Regex 和 UDT。 + +##### JSON 解析 + +如下 JSON 示例数据,可自动解析出字段:`groupid`、`voltage`、`current`、`ts`、`inuse`、`location`。 + +``` json +{"groupid": 170001, "voltage": "221V", "current": 12.3, "ts": "2023-12-18T22:12:00", "inuse": true, "location": "beijing.chaoyang.datun"} +{"groupid": 170001, "voltage": "220V", "current": 12.2, "ts": "2023-12-18T22:12:02", "inuse": true, "location": "beijing.chaoyang.datun"} +{"groupid": 170001, "voltage": "216V", "current": 12.5, "ts": "2023-12-18T22:12:04", "inuse": false, "location": "beijing.chaoyang.datun"} +``` + +如下嵌套结构的 JSON 数据,可自动解析出字段`groupid`、`data_voltage`、`data_current`、`ts`、`inuse`、`location_0_province`、`location_0_city`、`location_0_datun`,也可以选择要解析的字段,并设置解析的别名。 + +``` json +{"groupid": 170001, "data": { "voltage": "221V", "current": 12.3 }, "ts": "2023-12-18T22:12:00", "inuse": true, "location": [{"province": "beijing", "city":"chaoyang", "street": "datun"}]} +``` + +![JSON 解析](./pic/transform-02.png) + +##### Regex 正则表达式 + +可以使用正则表达式的**命名捕获组**从任何字符串(文本)字段中提取多个字段。如图所示,从 nginx 日志中提取访问ip、时间戳、访问的url等字段。 + +``` re +(?\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b)\s-\s-\s\[(?\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2}\s\+\d{4})\]\s"(?[A-Z]+)\s(?[^\s"]+).*(?\d{3})\s(?\d+) +``` + +![Regex 解析](./pic/transform-03.png) + +##### UDT 自定义解析脚本 + +自定义 rhai 语法脚本解析输入数据(参考 `https://rhai.rs/book/` ),脚本目前仅支持 json 格式原始数据。 + +**输入**:脚本中可以使用参数 data, data 是原始数据 json 解析后的 Object Map; + +**输出**:输出的数据必须是数组。 + +例如对于数据,一次上报三相电压值,分别入到三个子表中。则需要对这类数据做解析 + +``` json +{ + "ts": "2024-06-27 18:00:00", + "voltage": 
"220.1,220.3,221.1", + "dev_id": "8208891" +} +``` + +那么可以使用如下脚本来提取三个电压数据。 + +``` +let v3 = data["voltage"].split(","); + +[ +#{"ts": data["ts"], "val": v3[0], "dev_id": data["dev_id"]}, +#{"ts": data["ts"], "val": v3[1], "dev_id": data["dev_id"]}, +#{"ts": data["ts"], "val": v3[2], "dev_id": data["dev_id"]} +] +``` + +最终解析结果如下所示: + +![UDT](./pic/transform-udf.png) + +### 提取或拆分 + +解析后的数据,可能还无法满足目标表的数据要求。比如智能表原始采集数据如下( json 格式): + +``` json +{"groupid": 170001, "voltage": "221V", "current": 12.3, "ts": "2023-12-18T22:12:00", "inuse": true, "location": "beijing.chaoyang.datun"} +{"groupid": 170001, "voltage": "220V", "current": 12.2, "ts": "2023-12-18T22:12:02", "inuse": true, "location": "beijing.chaoyang.datun"} +{"groupid": 170001, "voltage": "216V", "current": 12.5, "ts": "2023-12-18T22:12:04", "inuse": false, "location": "beijing.chaoyang.datun"} +``` + +使用 json 规则解析出的电压是字符串表达的带单位形式,最终入库希望能使用 int 类型记录电压值和电流值,便于统计分析,此时就需要对电压进一步拆分;另外日期期望拆分为日期和时间入库。 + +如下图所示可以对源字段`ts`使用 split 规则拆分成日期和时间,对字段`voltage`使用 regex 提取出电压值和电压单位。split 规则需要设置**分隔符**和**拆分数量**,拆分后的字段命名规则为`{原字段名}_{顺序号}`,Regex 规则同解析过程中的一样,使用**命名捕获组**命名提取字段。 + +![拆分和提取](./pic/transform-04.png) + +### 过滤 + +过滤功能可以设置过滤条件,满足条件的数据行 才会被写入目标表。过滤条件表达式的结果必须是 boolean 类型。在编写过滤条件前,必须确定 解析字段的类型,根据解析字段的类型,可以使用判断函数、比较操作符(`>`、`>=`、`<=`、`<`、`==`、`!=`)来判断。 + +#### 字段类型及转换 + +只有明确解析出的每个字段的类型,才能使用正确的语法做数据过滤。 + +使用 json 规则解析出的字段,按照属性值来自动设置类型: + +1. bool 类型:"inuse": true +2. int 类型:"voltage": 220 +3. float 类型:"current" : 12.2 +4. String 类型:"location": "MX001" + +使用 regex 规则解析的数据都是 string 类型。 +使用 split 和 regex 提取或拆分的数据是 string 类型。 + +如果提取出的数据类型不是预期中的类型,可以做数据类型转换。常用的数据类型转换就是把字符串转换成为数值类型。支持的转换函数如下: + +|Function|From type|To type|e.g.| +|:----|:----|:----|:----| +| parse_int | string | int | parse_int("56") // 结果为整数 56 | +| parse_float | string | float | parse_float("12.3") // 结果为浮点数 12.3 | + +#### 判断表达式 + +不同的数据类型有各自判断表达式的写法。 + +##### BOOL 类型 + +可以使用变量或者使用操作符`!`,比如对于字段 "inuse": true,可以编写以下表达式: + +> 1. inuse +> 2. 
!inuse + +##### 数值类型(int/float) + +数值类型支持使用比较操作符`==`、`!=`、`>`、`>=`、`<`、`<=`。 + +##### 字符串类型 + +使用比较操作符,比较字符串。 + +字符串函数 + +|Function|Description|e.g.| +|:----|:----|:----| +| is_empty | returns true if the string is empty | s.is_empty() | +| contains | checks if a certain character or sub-string occurs in the string | s.contains("substring") | +| starts_with | returns true if the string starts with a certain string | s.starts_with("prefix") | +| ends_with | returns true if the string ends with a certain string | s.ends_with("suffix") | +| len | returns the number of characters (not number of bytes) in the string,must be used with comparison operator | s.len == 5 判断字符串长度是否为5;len作为属性返回 int ,和前四个函数有区别,前四个直接返回 bool。 | + +##### 复合表达式 + +多个判断表达式,可以使用逻辑操作符(&&、||、!)来组合。 +比如下面的表达式表示获取北京市安装的并且电压值大于 200 的智能表数据。 + +> location.starts_with("beijing") && voltage > 200 + +### 映射 + +映射是将解析、提取、拆分的**源字段**对应到**目标表字段**,可以直接对应,也可以通过一些规则计算后再映射到目标表。 + +#### 选择目标超级表 + +选择目标超级表后,会加载出超级表所有的 tags 和 columns。 +源字段根据名称自动使用 mapping 规则映射到目标超级表的 tag 和 column。 +例如有如下解析、提取、拆分后的预览数据: + +#### 映射规则 + +支持的映射规则如下表所示: + +|rule|description| +|:----|:----| +| mapping | 直接映射,需要选择映射源字段。| +| value | 常量,可以输入字符串常量,也可以是数值常量,输入的常量值直接入库。| +| generator | 生成器,目前仅支持时间戳生成器 now,入库时会将当前时间入库。| +| join | 字符串连接器,可指定连接字符拼接选择的多个源字段。| +| format | **字符串格式化工具**,填写格式化字符串,比如有三个源字段 year, month, day 分别表示年月日,入库希望以yyyy-MM-dd的日期格式入库,则可以提供格式化字符串为 `${year}-${month}-${day}`。其中`${}`作为占位符,占位符中可以是一个源字段,也可以是 string 类型字段的函数处理| +| sum | 选择多个数值型字段做加法计算。| +| expr | **数值运算表达式**,可以对数值型字段做更加复杂的函数处理和数学运算。| + +##### format 中支持的字符串处理函数 + +|Function|description|e.g.| +|:----|:----|:----| +| pad(len, pad_chars) | pads the string with a character or a string to at least a specified length | "1.2".pad(5, '0') // 结果为"1.200" | +|trim|trims the string of whitespace at the beginning and end|" abc ee ".trim() // 结果为"abc ee"| +|sub_string(start_pos, len)|extracts a sub-string,两个参数:
1. start position, counting from end if < 0
2. (optional) number of characters to extract, none if ≤ 0, to end if omitted|"012345678".sub_string(5) // "5678"
"012345678".sub_string(5, 2) // "56"
"012345678".sub_string(-2) // "78"| +|replace(substring, replacement)|replaces a sub-string with another|"012345678".replace("012", "abc") // "abc345678"| + +##### expr 数学计算表达式 + +基本数学运算支持加`+`、减`-`、乘`*`、除`/`。 + +比如数据源采集数值以摄氏度为单位,目标库存储华氏度温度值。那么就需要对采集的温度数据做转换。 + +解析的源字段为`temperature`,则需要使用表达式 `temperature * 1.8 + 32`。 + +数值表达式中也支持使用数学函数,可用的数学函数如下表所示: + +|Function|description|e.g.| +|:----|:----|:----| +|sin、cos、tan、sinh、cosh|Trigonometry|a.sin() | +|asin、acos、atan、 asinh、acosh|arc-trigonometry|a.asin()| +|sqrt|Square root|a.sqrt() // 4.sqrt() == 2| +|exp|Exponential|a.exp()| +|ln、log|Logarithmic|a.ln() // e.ln() == 1
a.log() // 10.log() == 1| +|floor、ceiling、round、int、fraction|rounding|a.floor() // (4.2).floor() == 4
a.ceiling() // (4.2).ceiling() == 5
a.round() // (4.2).round() == 4
a.int() // (4.2).int() == 4
a.fraction() // (4.2).fraction() == 0.2| + +#### 子表名映射 + +子表名类型为字符串,可以使用映射规则中的字符串格式化 format 表达式定义子表名。 + +## 任务的创建 + +下面以 MQTT 数据源为例,介绍如何创建一个 MQTT 类型的任务,从 MQTT Broker 消费数据,并写入 TDengine。 + +1. 登录至 taosExplorer 以后,点击左侧导航栏上的“数据写入”,即可进入任务列表页面 +2. 在任务列表页面,点击“+ 新增数据源”,即可进入任务创建页面 +3. 输入任务名称后,选择类型为 MQTT, 然后可以创建一个新的代理,或者选择已创建的代理 +4. 输入 MQTT broker 的 IP 地址和端口号,例如:192.168.1.100:1883 +5. 配置认证和 SSL 加密: + - 如果 MQTT broker 开启了用户认证,则在认证部分,输入 MQTT broker 的用户名和密码; + - 如果 MQTT broker 开启了 SSL 加密,则可以打开页面上的 SSL 证书开关,并上传 CA 的证书,以及客户端的证书和私钥文件; +6. 在“采集配置”部分,可选择 MQTT 协议的版本,目前支持 3.1, 3.1.1, 5.0 三个版本;配置 Client ID 时要注意,如果对同一个 MQTT broker 创建了多个任务,Client ID 应不同,否则会造成 Client ID 冲突,导致任务无法正常运行;在对主题和 QoS 进行配置时,需要使用 `::` 的形式,即订阅的主题与 QoS 之间要使用两个冒号分隔,其中 QoS 的取值范围为 0, 1, 2, 分别代表 at most once, at least once, exactly once;配置完成以上信息后,可以点击“检查连通性”按钮,对以上配置进行检查,如果连通性检查失败,请按照页面上返回的具体错误提示进行修改; +7. 在从 MQTT broker 同步数据的过程中,taosX 还支持对消息体中的字段进行提取、过滤、映射等操作。在位于 “Payload 转换”下方的文本框中,可以直接输入消息体样例,或是以上传文件的方式导入,以后还会支持直接从所配置的服务器中检索样例消息; +8. 对消息体字段的提取,目前支持 2 种方式:JSON 和正则表达式。对于简单的 key/value 格式的 JSON 数据,可以直接点击提取按钮,即可展示解析出的字段名;对于复杂的 JSON 数据,可以使用 JSON Path 提取感兴趣的字段;当使用正则表达式提取字段时,要保证正则表达式的正确性; +9. 消息体中的字段被解析后,可以基于解析出的字段名设置过滤规则,只有满足过滤规则的数据,才会写入 TDengine,否则会忽略该消息;例如:可以配置过滤规则为 voltage > 200,即只有当电压大于 200V 的数据才会被同步至 TDengine; +10. 最后,在配置完消息体中的字段和超级表中的字段的映射规则后,就可以提交任务了;除了基本的映射以外,在这里还可以对消息中字段的值进行转换,例如:可以通过表达式 (expr) 将原消息体中的电压和电流,计算为功率后再写入 TDengine; +11. 任务提交后,会自动返回任务列表页面,如果提交成功,任务的状态会切换至“运行中”,如果提交失败,可通过查看该任务的活动日志,查找错误原因; +12. 
对于运行中的任务,点击指标的查看按钮,可以查看该任务的详细运行指标,弹出窗口划分为 2 个标签页,分别展示该任务多次运行的累计指标和本次运行的指标,这些指标每 2 秒钟会自动刷新一次。 + +## 任务管理 + 在任务列表页面,还可以对任务进行启动、停止、查看、删除、复制等操作,也可以查看各个任务的运行情况,包括写入的记录条数、流量等。 \ No newline at end of file diff --git a/docs/zh/07-operation/02-planning.md b/docs/zh/07-operation/02-planning.md index dc119b5166..66da1df8bf 100644 --- a/docs/zh/07-operation/02-planning.md +++ b/docs/zh/07-operation/02-planning.md @@ -63,7 +63,7 @@ M = (T × S × 3 + (N / 4096) + 100) TDengine 用户对 CPU 的需求主要受以下 3 个因素影响: - 数据分片:在 TDengine 中,每个 CPU 核心可以服务 1 至 2 个 vnode。假设一个集群配置了 100 个 vgroup,并且采用三副本策略,那么建议该集群的 CPU 核心数量为 150~300 个,以实现最佳性能。 -- 数据写入:TDengine 的单核每秒至少能处理 10 000 个写入请求。值得注意的是,每个写入请求可以包含多条记录,而且一次写入一条记录与同时写入 10 条记录相比,消耗的计算资源相差无几。因此,每次写入的记录数越多,写入效率越高。例如,如果一个写入请求包含 200 条以上记录,单核就能实现每秒写入 100 万条记录的速度。然而,这要求前端数据采集系统具备更高的能力,因为它需要缓存记录,然后批量写入。 +- 数据写入:TDengine 的单核每秒至少能处理 10,000 个写入请求。值得注意的是,每个写入请求可以包含多条记录,而且一次写入一条记录与同时写入 10 条记录相比,消耗的计算资源相差无几。因此,每次写入的记录数越多,写入效率越高。例如,如果一个写入请求包含 200 条以上记录,单核就能实现每秒写入 100 万条记录的速度。然而,这要求前端数据采集系统具备更高的能力,因为它需要缓存记录,然后批量写入。 - 查询需求:虽然 TDengine 提供了高效的查询功能,但由于每个应用场景的查询差异较大,且查询频次也会发生变化,因此很难给出一个具体的数字来衡量查询所需的计算资源。用户需要根据自己的实际场景编写一些查询语句,以便更准确地确定所需的计算资源。 综上所述,对于数据分片和数据写入,CPU 的需求是可以预估的。然而,查询需求所消耗的计算资源则难以预测。在实际运行过程中,建议保持 CPU 使用率不超过 50%,以确保系统的稳定性和性能。一旦 CPU 使用率超过这一阈值,就需要考虑增加新的节点或增加 CPU 核心数量,以提供更多的计算资源。 @@ -152,5 +152,5 @@ TDengine 的多级存储功能在使用上还具备以下优点。 |RESTful 接口 | 6041 | |WebSocket 接口 |6041 | |taosKeeper | 6043 | -|taosX | 6055 | +|taosX | 6050, 6055 | |taosExplorer | 6060 | \ No newline at end of file diff --git a/docs/zh/07-operation/03-deployment.md b/docs/zh/07-operation/03-deployment.md index e1ff48da04..83b2c91843 100644 --- a/docs/zh/07-operation/03-deployment.md +++ b/docs/zh/07-operation/03-deployment.md @@ -1,828 +1,840 @@ ---- -sidebar_label: 集群部署 -title: 集群部署 -toc_max_heading_level: 4 ---- - -由于 TDengine 设计之初就采用了分布式架构,具有强大的水平扩展能力,以满足不断增长的数据处理需求,因此 TDengine 支持集群,并将此核心功能开源。用户可以根据实际环境和需求选择 4 种部署方式—手动部署、Docker 部署、Kubernetes 部署和 Helm 部署。 - -## 手动部署 - -按照以下步骤手动搭建 TDengine 集群。 - -### 
清除数据 - -如果搭建集群的物理节点中存在之前的测试数据或者装过其他版本(如 1.x/2.x)的TDengine,请先将其删除,并清空所有数据。 - -### 检查环境 - -在进行 TDengine 集群部署之前,全面检查所有 dnode 以及应用程序所在物理节点的网络设置至关重要。以下是检查步骤: - -第 1 步,在每个物理节点上执行 hostname -f 命令,以查看并确认所有节点的hostname 是唯一的。对于应用程序驱动所在的节点,这一步骤可以省略。 -第 2 步,在每个物理节点上执行 ping host 命令,其中 host 是其他物理节点的 hostname。这一步骤旨在检测当前节点与其他物理节点之间的网络连通性。如果发现无法 ping 通,请立即检查网络和 DNS 设置。对于 Linux 操作系统,请检查 /etc/hosts 文件;对于 Windows 操作系统,请检查C:\Windows\system32\drivers\etc\hosts 文件。网络不通畅将导致无法组建集群,请务必解决此问题。 -第 3 步,在应用程序运行的物理节点上重复上述网络监测步骤。如果发现网络不通畅,应用程序将无法连接到 taosd 服务。此时,请仔细检查应用程序所在物理节点的DNS 设置或 hosts 文件,确保其配置正确无误。 -第 4 步,检查端口,确保集群中所有主机在端口 6030 上的 TCP 能够互通。 - -通过以上步骤,你可以确保所有节点在网络层面顺利通信,从而为成功部署TDengine 集群奠定坚实基础 - -### 安装 TDengine - -为了确保集群内各物理节点的一致性和稳定性,请在所有物理节点上安装相同版本的 TDengine。 - -### 修改配置 - -修改 TDengine 的配置文件(所有节点的配置文件都需要修改)。假设准备启动的第 1 个 dnode 的 endpoint 为 h1.taosdata.com:6030,其与集群配置相关参数如下。 - -```shell -# firstEp 是每个 dnode 首次启动后连接的第 1 个 dnode -firstEp h1.taosdata.com:6030 -# 必须配置为本 dnode 的 FQDN,如果本机只有一个 hostname,可注释或删除如下这行代码 -fqdn h1.taosdata.com -# 配置本 dnode 的端口,默认是 6030 -serverPort 6030 -``` - -一定要修改的参数是 f irstEp 和 fqdn。对于每个 dnode,f irstEp 配置应该保持一致,但 fqdn 一定要配置成其所在 dnode 的值。其他参数可不做任何修改,除非你很清楚为什么要修改。 - -对于希望加入集群的 dnode 节点,必须确保下表所列的与 TDengine 集群相关的参数设置完全一致。任何参数的不匹配都可能导致 dnode 节点无法成功加入集群。 - -| 参数名称 | 含义 | -|:---------------:|:----------------------------------------------------------:| -|statusInterval | dnode 向 mnode 报告状态的间隔 | -|timezone | 时区 | -|locale | 系统区位信息及编码格式 | -|charset | 字符集编码 | -|ttlChangeOnWrite | ttl 到期时间是否伴随表的修改操作而改变 | - -### 启动集群 - -按照前述步骤启动第 1 个 dnode,例如 h1.taosdata.com。接着在终端中执行 taos,启动 TDengine 的 CLI 程序 taos,并在其中执行 show dnodes 命令,以查看当前集群中的所有 dnode 信息。 - -```shell -taos> show dnodes; - id | endpoint | vnodes|support_vnodes|status| create_time | note | -=================================================================================== - 1| h1.taosdata.com:6030 | 0| 1024| ready| 2022-07-16 10:50:42.673 | | -``` - -可以看到,刚刚启动的 dnode 节点的 endpoint 为 h1.taosdata.com:6030。这个地址就是新建集群的 first Ep。 - 
-### 添加 dnode - -按照前述步骤,在每个物理节点启动 taosd。每个 dnode 都需要在 taos.cfg 文件中将 firstEp 参数配置为新建集群首个节点的 endpoint,在本例中是 h1.taosdata.com:6030。在第 1 个 dnode 所在机器,在终端中运行 taos,打开 TDengine 的 CLI 程序 taos,然后登录TDengine 集群,执行如下 SQL。 - -```shell -create dnode "h2.taosdata.com:6030" -``` - -将新 dnode 的 endpoint 添加进集群的 endpoint 列表。需要为 `fqdn:port` 加上双引号,否则运行时出错。请注意将示例的 h2.taosdata.com:6030 替换为这个新 dnode 的 endpoint。然后执行如下 SQL 查看新节点是否成功加入。若要加入的 dnode 当前处于离线状态,请参考本节后面的 “常见问题”部分进行解决。 - -```shell -show dnodes; -``` - -在日志中,请确认输出的 dnode 的 fqdn 和端口是否与你刚刚尝试添加的 endpoint 一致。如果不一致,请修正为正确的 endpoint。遵循上述步骤,你可以持续地将新的 dnode 逐个加入集群,从而扩展集群规模并提高整体性能。确保在添加新节点时遵循正确的流程,这有助于维持集群的稳定性和可靠性。 - -**Tips** -- 任何已经加入集群的 dnode 都可以作为后续待加入节点的 firstEp。firstEp 参数仅仅在该 dnode 首次加入集群时起作用,加入集群后,该 dnode 会保存最新的 mnode 的 endpoint 列表,后续不再依赖这个参数。之后配置文件中的 firstEp 参数主要用于客户端连接,如果没有为 TDengine 的 CLI 设置参数,则默认连接由 firstEp 指定的节点。 -- 两个没有配置 firstEp 参数的 dnode 在启动后会独立运行。这时无法将其中一个dnode 加入另外一个 dnode,形成集群。 -- TDengine 不允许将两个独立的集群合并成新的集群。 - -### 添加 mnode - -在创建 TDengine 集群时,首个 dnode 将自动成为集群的 mnode,负责集群的管理和协调工作。为了实现 mnode 的高可用性,后续添加的 dnode 需要手动创建 mnode。请注意,一个集群最多允许创建 3 个 mnode,且每个 dnode 上只能创建一个 mnode。当集群中的 dnode 数量达到或超过 3 个时,你可以为现有集群创建 mnode。在第 1个 dnode 中,首先通过 TDengine 的 CLI 程序 taos 登录 TDengine,然后执行如下 SQL。 - -```shell -create mnode on dnode -``` - -请注意将上面示例中的 dnodeId 替换为刚创建 dnode 的序号(可以通过执行 `show dnodes` 命令获得)。最后执行如下 `show mnodes`,查看新创建的 mnode 是否成功加入集群。 - -### 删除 dnode - -对于错误加入集群的 dnode 可以通过 `drop dnode ` 命令删除。 - -**Tips** -- 一旦 dnode 被删除,它将无法直接重新加入集群。如果需要重新加入此类节点,你应首先对该节点进行初始化操作,即清空其数据文件夹。 -- 在执行 drop dnode 命令时,集群会先将待删除 dnode 上的数据迁移至其他节点。请注意,drop dnode 与停止 taosd 进程是两个截然不同的操作,请勿混淆。由于删除 dnode 前须执行数据迁移,因此被删除的 dnode 必须保持在线状态,直至删除操作完成。删除操作结束后,方可停止 taosd 进程。 -- 一旦 dnode 被删除,集群中的其他节点将感知到此操作,并且不再接收该 dnodeId 的请求。dnodeId 是由集群自动分配的,用户无法手动指定。 - -### 常见问题 - -在搭建 TDengine 集群的过程中,如果在执行 create dnode 命令以添加新节点后,新节点始终显示为离线状态,请按照以下步骤进行排查。 -第 1 步,检查新节点上的 taosd 服务是否已经正常启动。你可以通过查看日志文件或使用 ps 命令来确认。 -第 2 步,如果 taosd 
服务已启动,接下来请检查新节点的网络连接是否畅通,并确认防火墙是否已关闭。网络不通或防火墙设置可能会阻止节点与集群的其他节点通信。 -第 3 步,使用 taos -h fqdn 命令尝试连接到新节点,然后执行 show dnodes 命令。这将显示新节点作为独立集群的运行状态。如果显示的列表与主节点上显示的不一致,说明新节点可能已自行组成一个单节点集群。要解决这个问题,请按照以下步骤操作。首先,停止新节点上的 taosd 服务。其次,清空新节点上 taos.cfg 配置文件中指定的 dataDir 目录下的所有文件。这将删除与该节点相关的所有数据和配置信息。最后,重新启动新节点上的 taosd 服务。这将使新节点恢复到初始状态,并准备好重新加入主集群。 - -### 部署 taosAdapter - -本节讲述如何部署 taosAdapter,taosAdapter 为 TDengine 集群提供 RESTful 和 WebSocket 接入能力,因而在集群中扮演着很重要的角色。 - -1. 安装 - -TDengine Enterprise 安装完成后,即可使用 taosAdapter。如果想在不同的服务器上分别部署 taosAdapter,需要在这些服务器上都安装 TDengine Enterprise。 - -2. 单一实例部署 - -部署 taosAdapter 的单一实例非常简单,具体命令和配置参数请参考手册中 taosAdapter 部分。 - -3. 多实例部署 - -部署 taosAdapter 的多个实例的主要目的如下: -- 提升集群的吞吐量,避免 taosAdapter 成为系统瓶颈。 -- 提升集群的健壮性和高可用能力,当有一个实例因某种故障而不再提供服务时,可以将进入业务系统的请求自动路由到其他实例。 - -在部署 taosAdapter 的多个实例时,需要解决负载均衡问题,以避免某个节点过载而其他节点闲置。在部署过程中,需要分别部署多个单一实例,每个实例的部署步骤与部署单一实例完全相同。接下来关键的部分是配置 Nginx。以下是一个经过验证的较佳实践配置,你只须将其中的 endpoint 替换为实际环境中的正确地址即可。关于各参数的含义,请参考 Nginx 的官方文档。 - -```json -user root; -worker_processes auto; -error_log /var/log/nginx_error.log; - - -events { - use epoll; - worker_connections 1024; -} - -http { - - access_log off; - - map $http_upgrade $connection_upgrade { - default upgrade; - '' close; - } - - server { - listen 6041; - location ~* { - proxy_pass http://dbserver; - proxy_read_timeout 600s; - proxy_send_timeout 600s; - proxy_connect_timeout 600s; - proxy_next_upstream error http_502 non_idempotent; - proxy_http_version 1.1; - proxy_set_header Upgrade $http_upgrade; - proxy_set_header Connection $http_connection; - } - } - server { - listen 6043; - location ~* { - proxy_pass http://keeper; - proxy_read_timeout 60s; - proxy_next_upstream error http_502 http_500 non_idempotent; - } - } - - server { - listen 6060; - location ~* { - proxy_pass http://explorer; - proxy_read_timeout 60s; - proxy_next_upstream error http_502 http_500 non_idempotent; - } - } - upstream dbserver { - least_conn; - server 172.16.214.201:6041 max_fails=0; - server 172.16.214.202:6041 
max_fails=0; - server 172.16.214.203:6041 max_fails=0; - } - upstream keeper { - ip_hash; - server 172.16.214.201:6043 ; - server 172.16.214.202:6043 ; - server 172.16.214.203:6043 ; - } - upstream explorer{ - ip_hash; - server 172.16.214.201:6060 ; - server 172.16.214.202:6060 ; - server 172.16.214.203:6060 ; - } -} -``` - -## Docker 部署 - -本节将介绍如何在 Docker 容器中启动 TDengine 服务并对其进行访问。你可以在 docker run 命令行或者 docker-compose 文件中使用环境变量来控制容器中服务的行为。 - -### 启动 TDengine - -TDengine 镜像启动时默认激活 HTTP 服务,使用下列命令便可创建一个带有 HTTP 服务的容器化 TDengine 环境。 -```shell -docker run -d --name tdengine \ --v ~/data/taos/dnode/data:/var/lib/taos \ --v ~/data/taos/dnode/log:/var/log/taos \ --p 6041:6041 tdengine/tdengine -``` - -详细的参数说明如下。 -- /var/lib/taos:TDengine 默认数据文件目录,可通过配置文件修改位置。 -- /var/log/taos:TDengine 默认日志文件目录,可通过配置文件修改位置。 - -以上命令启动了一个名为 tdengine 的容器,并把其中的 HTTP 服务的端口 6041 映射到主机端口 6041。如下命令可以验证该容器中提供的 HTTP 服务是否可用。 - -```shell -curl -u root:taosdata -d "show databases" localhost:6041/rest/sql -``` - -运行如下命令可在容器中访问 TDengine。 -```shell -$ docker exec -it tdengine taos - -taos> show databases; - name | -================================= - information_schema | - performance_schema | -Query OK, 2 rows in database (0.033802s) -``` - -在容器中,TDengine CLI 或者各种连接器(例如 JDBC-JNI)与服务器通过容器的 hostname 建立连接。从容器外访问容器内的 TDengine 比较复杂,通过 RESTful/WebSocket 连接方式是最简单的方法。 - -### 在 host 网络模式下启动 TDengine - -运行以下命令可以在 host 网络模式下启动 TDengine,这样可以使用主机的 FQDN 建立连接,而不是使用容器的 hostname。 -```shell -docker run -d --name tdengine --network host tdengine/tdengine -``` - -这种方式与在主机上使用 systemctl 命令启动 TDengine 的效果相同。在主机上已安装 TDengine 客户端的情况下,可以直接使用下面的命令访问 TDengine 服务。 -```shell -$ taos - -taos> show dnodes; - id | endpoint | vnodes | support_vnodes | status | create_time | note | -================================================================================================================================================= - 1 | vm98:6030 | 0 | 32 | ready | 2022-08-19 14:50:05.337 | | -Query OK, 1 rows in database (0.010654s) -``` - -### 
以指定的 hostname 和 port 启动 TDengine - -使用如下命令可以利用 TAOS_FQDN 环境变量或者 taos.cfg 中的 fqdn 配置项使TDengine 在指定的 hostname 上建立连接。这种方式为部署 TDengine 提供了更大的灵活性。 - -```shell -docker run -d \ - --name tdengine \ - -e TAOS_FQDN=tdengine \ - -p 6030:6030 \ - -p 6041-6049:6041-6049 \ - -p 6041-6049:6041-6049/udp \ - tdengine/tdengine -``` - -首先,上面的命令在容器中启动一个 TDengine 服务,其所监听的 hostname 为tdengine,并将容器的端口 6030 映射到主机的端口 6030,将容器的端口段 6041~6049 映射到主机的端口段 6041~6049。如果主机上该端口段已经被占用,可以修改上述命令以指定一个主机上空闲的端口段。 - -其次,要确保 tdengine 这个 hostname 在 /etc/hosts 中可解析。通过如下命令可将正确的配置信息保存到 hosts 文件中。 -```shell -echo 127.0.0.1 tdengine |sudo tee -a /etc/hosts -``` - -最后,可以通过 TDengine CLI 以 tdengine 为服务器地址访问 TDengine 服务,命令如下。 -```shell -taos -h tdengine -P 6030 -``` - -如果 TAOS_FQDN 被设置为与所在主机名相同,则效果与“在 host 网络模式下启动TDengine”相同。 - -## Kubernetes 部署 - -作为面向云原生架构设计的时序数据库,TDengine 本身就支持 Kubernetes 部署。这里介绍如何使用 YAML 文件从头一步一步创建一个可用于生产使用的高可用 TDengine 集群,并重点介绍 Kubernetes 环境下 TDengine 的常用操作。本小节要求读者对 Kubernetes 有一定的了解,可以熟练运行常见的 kubectl 命令,了解 statefulset、service、pvc 等概念,对这些概念不熟悉的读者,可以先参考 Kubernetes 的官网进行学习。 -为了满足高可用的需求,集群需要满足如下要求: -- 3 个及以上 dnode :TDengine 的同一个 vgroup 中的多个 vnode ,不允许同时分布在一个 dnode ,所以如果创建 3 副本的数据库,则 dnode 数大于等于 3 -- 3 个 mnode :mnode 负责整个集群的管理工作,TDengine 默认是一个 mnode。如果这个 mnode 所在的 dnode 掉线,则整个集群不可用。 -- 数据库的 3 副本:TDengine 的副本配置是数据库级别,所以数据库 3 副本可满足在 3 个 dnode 的集群中,任意一个 dnode 下线,都不影响集群的正常使用。如果下线 dnode 个数为 2 时,此时集群不可用,因为 RAFT 无法完成选举。(企业版:在灾难恢复场景,任一节点数据文件损坏,都可以通过重新拉起 dnode 进行恢复) - -### 前置条件 - -要使用 Kubernetes 部署管理 TDengine 集群,需要做好如下准备工作。 -- 本文适用 Kubernetes v1.19 以上版本 -- 本文使用 kubectl 工具进行安装部署,请提前安装好相应软件 -- Kubernetes 已经安装部署并能正常访问使用或更新必要的容器仓库或其他服务 - -### 配置 Service 服务 - -创建一个 Service 配置文件:taosd-service.yaml,服务名称 metadata.name (此处为 "taosd") 将在下一步中使用到。首先添加 TDengine 所用到的端口,然后在选择器设置确定的标签 app (此处为 “tdengine”)。 - -```yaml ---- -apiVersion: v1 -kind: Service -metadata: - name: "taosd" - labels: - app: "tdengine" -spec: - ports: - - name: tcp6030 - protocol: "TCP" - port: 6030 - - name: tcp6041 - protocol: "TCP" - port: 6041 - 
selector: - app: "tdengine" -``` - -### 有状态服务 StatefulSet - -根据 Kubernetes 对各类部署的说明,我们将使用 StatefulSet 作为 TDengine 的部署资源类型。 创建文件 tdengine.yaml,其中 replicas 定义集群节点的数量为 3。节点时区为中国(Asia/Shanghai),每个节点分配 5G 标准(standard)存储,你也可以根据实际情况进行相应修改。 - -请特别注意 startupProbe 的配置,在 dnode 的 Pod 掉线一段时间后,再重新启动,这个时候新上线的 dnode 会短暂不可用。如果 startupProbe 配置过小,Kubernetes 会认为该 Pod 处于不正常的状态,并尝试重启该 Pod,该 dnode 的 Pod 会频繁重启,始终无法恢复到正常状态。 - -```yaml ---- -apiVersion: apps/v1 -kind: StatefulSet -metadata: - name: "tdengine" - labels: - app: "tdengine" -spec: - serviceName: "taosd" - replicas: 3 - updateStrategy: - type: RollingUpdate - selector: - matchLabels: - app: "tdengine" - template: - metadata: - name: "tdengine" - labels: - app: "tdengine" - spec: - containers: - - name: "tdengine" - image: "tdengine/tdengine:3.2.3.0" - imagePullPolicy: "IfNotPresent" - ports: - - name: tcp6030 - protocol: "TCP" - containerPort: 6030 - - name: tcp6041 - protocol: "TCP" - containerPort: 6041 - env: - # POD_NAME for FQDN config - - name: POD_NAME - valueFrom: - fieldRef: - fieldPath: metadata.name - # SERVICE_NAME and NAMESPACE for fqdn resolve - - name: SERVICE_NAME - value: "taosd" - - name: STS_NAME - value: "tdengine" - - name: STS_NAMESPACE - valueFrom: - fieldRef: - fieldPath: metadata.namespace - # TZ for timezone settings, we recommend to always set it. - - name: TZ - value: "Asia/Shanghai" - # Environment variables with prefix TAOS_ will be parsed and converted into corresponding parameter in taos.cfg. For example, serverPort in taos.cfg should be configured by TAOS_SERVER_PORT when using K8S to deploy - - name: TAOS_SERVER_PORT - value: "6030" - # Must set if you want a cluster. - - name: TAOS_FIRST_EP - value: "$(STS_NAME)-0.$(SERVICE_NAME).$(STS_NAMESPACE).svc.cluster.local:$(TAOS_SERVER_PORT)" - # TAOS_FQND should always be set in k8s env. 
- - name: TAOS_FQDN - value: "$(POD_NAME).$(SERVICE_NAME).$(STS_NAMESPACE).svc.cluster.local" - volumeMounts: - - name: taosdata - mountPath: /var/lib/taos - startupProbe: - exec: - command: - - taos-check - failureThreshold: 360 - periodSeconds: 10 - readinessProbe: - exec: - command: - - taos-check - initialDelaySeconds: 5 - timeoutSeconds: 5000 - livenessProbe: - exec: - command: - - taos-check - initialDelaySeconds: 15 - periodSeconds: 20 - volumeClaimTemplates: - - metadata: - name: taosdata - spec: - accessModes: - - "ReadWriteOnce" - storageClassName: "standard" - resources: - requests: - storage: "5Gi" -``` - -### 使用 kubectl 命令部署 TDengine 集群 - -首先创建对应的 namespace dengine-test,以及 pvc,并保证 storageClassName 是 standard 的剩余空间足够。然后顺序执行以下命令: -```shell -kubectl apply -f taosd-service.yaml -n tdengine-test -``` - -上面的配置将生成一个三节点的 TDengine 集群,dnode 为自动配置,可以使用 show dnodes 命令查看当前集群的节点: -```shell -kubectl exec -it tdengine-0 -n tdengine-test -- taos -s "show dnodes" -kubectl exec -it tdengine-1 -n tdengine-test -- taos -s "show dnodes" -kubectl exec -it tdengine-2 -n tdengine-test -- taos -s "show dnodes" -``` - -输出如下: -```shell -taos show dnodes - id | endpoint | vnodes | support_vnodes | status | create_time | reboot_time | note | active_code | c_active_code | -============================================================================================================================================================================================================================================= - 1 | tdengine-0.ta... | 0 | 16 | ready | 2023-07-19 17:54:18.552 | 2023-07-19 17:54:18.469 | | | | - 2 | tdengine-1.ta... | 0 | 16 | ready | 2023-07-19 17:54:37.828 | 2023-07-19 17:54:38.698 | | | | - 3 | tdengine-2.ta... 
| 0 | 16 | ready | 2023-07-19 17:55:01.141 | 2023-07-19 17:55:02.039 | | | | -Query OK, 3 row(s) in set (0.001853s) -``` - -查看当前 mnode -```shell -kubectl exec -it tdengine-1 -n tdengine-test -- taos -s "show mnodes\G" -taos> show mnodes\G -*************************** 1.row *************************** - id: 1 - endpoint: tdengine-0.taosd.tdengine-test.svc.cluster.local:6030 - role: leader - status: ready -create_time: 2023-07-19 17:54:18.559 -reboot_time: 2023-07-19 17:54:19.520 -Query OK, 1 row(s) in set (0.001282s) -``` - -创建 mnode -```shell -kubectl exec -it tdengine-0 -n tdengine-test -- taos -s "create mnode on dnode 2" -kubectl exec -it tdengine-0 -n tdengine-test -- taos -s "create mnode on dnode 3" -``` - -查看 mnode -```shell -kubectl exec -it tdengine-1 -n tdengine-test -- taos -s "show mnodes\G" - -taos> show mnodes\G -*************************** 1.row *************************** - id: 1 - endpoint: tdengine-0.taosd.tdengine-test.svc.cluster.local:6030 - role: leader - status: ready -create_time: 2023-07-19 17:54:18.559 -reboot_time: 2023-07-20 09:19:36.060 -*************************** 2.row *************************** - id: 2 - endpoint: tdengine-1.taosd.tdengine-test.svc.cluster.local:6030 - role: follower - status: ready -create_time: 2023-07-20 09:22:05.600 -reboot_time: 2023-07-20 09:22:12.838 -*************************** 3.row *************************** - id: 3 - endpoint: tdengine-2.taosd.tdengine-test.svc.cluster.local:6030 - role: follower - status: ready -create_time: 2023-07-20 09:22:20.042 -reboot_time: 2023-07-20 09:22:23.271 -Query OK, 3 row(s) in set (0.003108s) -``` - -### 端口转发 - -利用 kubectl 端口转发功能可以使应用可以访问 Kubernetes 环境运行的 TDengine 集群。 - -```shell -kubectl port-forward -n tdengine-test tdengine-0 6041:6041 & -``` - -使用 curl 命令验证 TDengine REST API 使用的 6041 接口。 -```shell -curl -u root:taosdata -d "show databases" 127.0.0.1:6041/rest/sql 
-{"code":0,"column_meta":[["name","VARCHAR",64]],"data":[["information_schema"],["performance_schema"],["test"],["test1"]],"rows":4} -``` - -### 集群扩容 - -TDengine 支持集群扩容: -```shell -kubectl scale statefulsets tdengine -n tdengine-test --replicas=4 -``` - -上面命令行中参数 `--replica=4` 表示要将 TDengine 集群扩容到 4 个节点,执行后首先检查 POD 的状态: -```shell -kubectl get pod -l app=tdengine -n tdengine-test -o wide -``` - -输出如下: -```text -NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES -tdengine-0 1/1 Running 4 (6h26m ago) 6h53m 10.244.2.75 node86 -tdengine-1 1/1 Running 1 (6h39m ago) 6h53m 10.244.0.59 node84 -tdengine-2 1/1 Running 0 5h16m 10.244.1.224 node85 -tdengine-3 1/1 Running 0 3m24s 10.244.2.76 node86 -``` - -此时 Pod 的状态仍然是 Running,TDengine 集群中的 dnode 状态要等 Pod 状态为 ready 之后才能看到: -```shell -kubectl exec -it tdengine-3 -n tdengine-test -- taos -s "show dnodes" -``` - -扩容后的四节点 TDengine 集群的 dnode 列表: -```text -taos> show dnodes - id | endpoint | vnodes | support_vnodes | status | create_time | reboot_time | note | active_code | c_active_code | -============================================================================================================================================================================================================================================= - 1 | tdengine-0.ta... | 10 | 16 | ready | 2023-07-19 17:54:18.552 | 2023-07-20 09:39:04.297 | | | | - 2 | tdengine-1.ta... | 10 | 16 | ready | 2023-07-19 17:54:37.828 | 2023-07-20 09:28:24.240 | | | | - 3 | tdengine-2.ta... | 10 | 16 | ready | 2023-07-19 17:55:01.141 | 2023-07-20 10:48:43.445 | | | | - 4 | tdengine-3.ta... 
| 0 | 16 | ready | 2023-07-20 16:01:44.007 | 2023-07-20 16:01:44.889 | | | | -Query OK, 4 row(s) in set (0.003628s) -``` - -### 清理集群 - -**Warning** -删除 pvc 时需要注意下 pv persistentVolumeReclaimPolicy 策略,建议改为 Delete,这样在删除 pvc 时才会自动清理 pv,同时会清理底层的 csi 存储资源,如果没有配置删除 pvc 自动清理 pv 的策略,再删除 pvc 后,在手动清理 pv 时,pv 对应的 csi 存储资源可能不会被释放。 - -完整移除 TDengine 集群,需要分别清理 statefulset、svc、pvc,最后删除命名空间。 - -```shell -kubectl delete statefulset -l app=tdengine -n tdengine-test -kubectl delete svc -l app=tdengine -n tdengine-test -kubectl delete pvc -l app=tdengine -n tdengine-test -kubectl delete namespace tdengine-test -``` - -### 集群灾备能力 - -对于在 Kubernetes 环境下 TDengine 的高可用和高可靠来说,对于硬件损坏、灾难恢复,分为两个层面来讲: -- 底层的分布式块存储具备的灾难恢复能力,块存储的多副本,当下流行的分布式块存储如 Ceph,就具备多副本能力,将存储副本扩展到不同的机架、机柜、机房、数据中心(或者直接使用公有云厂商提供的块存储服务) -- TDengine 的灾难恢复,在 TDengine Enterprise 中,本身具备了当一个 dnode 永久下线(物理机磁盘损坏,数据分拣丢失)后,重新拉起一个空白的 dnode 来恢复原 dnode 的工作。 - -## 使用 Helm 部署 TDengine 集群 - -Helm 是 Kubernetes 的包管理器。 -上一节使用 Kubernetes 部署 TDengine 集群的操作已经足够简单,但 Helm 可以提供更强大的能力。 - -### 安装 Helm - -```shell -curl -fsSL -o get_helm.sh \ - https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 -chmod +x get_helm.sh -./get_helm.sh -``` - -Helm 会使用 kubectl 和 kubeconfig 的配置来操作 Kubernetes,可以参考 Rancher 安装 Kubernetes 的配置来进行设置。 - -### 安装 TDengine Chart - -TDengine Chart 尚未发布到 Helm 仓库,当前可以从 GitHub 直接下载: -```shell -wget https://github.com/taosdata/TDengine-Operator/raw/3.0/helm/tdengine-3.0.2.tgz -``` - -获取当前 Kubernetes 的存储类: -```shell -kubectl get storageclass -``` - -在 minikube 默认为 standard。之后,使用 helm 命令安装: -```shell -helm install tdengine tdengine-3.0.2.tgz \ - --set storage.className= \ - --set image.tag=3.2.3.0 - -``` - -在 minikube 环境下,可以设置一个较小的容量避免超出磁盘可用空间: -```shell -helm install tdengine tdengine-3.0.2.tgz \ - --set storage.className=standard \ - --set storage.dataSize=2Gi \ - --set storage.logSize=10Mi \ - --set image.tag=3.2.3.0 -``` - -部署成功后,TDengine Chart 将会输出操作 TDengine 的说明: -```shell -export POD_NAME=$(kubectl get pods --namespace 
default \ - -l "app.kubernetes.io/name=tdengine,app.kubernetes.io/instance=tdengine" \ - -o jsonpath="{.items[0].metadata.name}") -kubectl --namespace default exec $POD_NAME -- taos -s "show dnodes; show mnodes" -kubectl --namespace default exec -it $POD_NAME -- taos -``` - -可以创建一个表进行测试: -```shell -kubectl --namespace default exec $POD_NAME -- \ - taos -s "create database test; - use test; - create table t1 (ts timestamp, n int); - insert into t1 values(now, 1)(now + 1s, 2); - select * from t1;" -``` - -### 配置 values - -TDengine 支持 `values.yaml` 自定义。 -通过 helm show values 可以获取 TDengine Chart 支持的全部 values 列表: -```shell -helm show values tdengine-3.0.2.tgz -``` - -你可以将结果保存为 values.yaml,之后可以修改其中的各项参数,如 replica 数量,存储类名称,容量大小,TDengine 配置等,然后使用如下命令安装 TDengine 集群: -```shell -helm install tdengine tdengine-3.0.2.tgz -f values.yaml -``` - -全部参数如下: -```yaml -# Default values for tdengine. -# This is a YAML-formatted file. -# Declare variables to be passed into helm templates. - -replicaCount: 1 - -image: - prefix: tdengine/tdengine - #pullPolicy: Always - # Overrides the image tag whose default is the chart appVersion. -# tag: "3.0.2.0" - -service: - # ClusterIP is the default service type, use NodeIP only if you know what you are doing. - type: ClusterIP - ports: - # TCP range required - tcp: [6030, 6041, 6042, 6043, 6044, 6046, 6047, 6048, 6049, 6060] - # UDP range - udp: [6044, 6045] - - -# Set timezone here, not in taoscfg -timezone: "Asia/Shanghai" - -resources: - # We usually recommend not to specify default resources and to leave this as a conscious - # choice for the user. This also increases chances charts run on environments with little - # resources, such as Minikube. If you do want to specify resources, uncomment the following - # lines, adjust them as necessary, and remove the curly braces after 'resources:'. - # limits: - # cpu: 100m - # memory: 128Mi - # requests: - # cpu: 100m - # memory: 128Mi - -storage: - # Set storageClassName for pvc. 
K8s use default storage class if not set. - # - className: "" - dataSize: "100Gi" - logSize: "10Gi" - -nodeSelectors: - taosd: - # node selectors - -clusterDomainSuffix: "" -# Config settings in taos.cfg file. -# -# The helm/k8s support will use environment variables for taos.cfg, -# converting an upper-snake-cased variable like `TAOS_DEBUG_FLAG`, -# to a camelCase taos config variable `debugFlag`. -# -# See the variable list at https://www.taosdata.com/cn/documentation/administrator . -# -# Note: -# 1. firstEp/secondEp: should not be set here, it's auto generated at scale-up. -# 2. serverPort: should not be set, we'll use the default 6030 in many places. -# 3. fqdn: will be auto generated in kubernetes, user should not care about it. -# 4. role: currently role is not supported - every node is able to be mnode and vnode. -# -# Btw, keep quotes "" around the value like below, even the value will be number or not. -taoscfg: - # Starts as cluster or not, must be 0 or 1. - # 0: all pods will start as a separate TDengine server - # 1: pods will start as TDengine server cluster. 
[default] - CLUSTER: "1" - - # number of replications, for cluster only - TAOS_REPLICA: "1" - - - # TAOS_NUM_OF_RPC_THREADS: number of threads for RPC - #TAOS_NUM_OF_RPC_THREADS: "2" - - # - # TAOS_NUM_OF_COMMIT_THREADS: number of threads to commit cache data - #TAOS_NUM_OF_COMMIT_THREADS: "4" - - # enable/disable installation / usage report - #TAOS_TELEMETRY_REPORTING: "1" - - # time interval of system monitor, seconds - #TAOS_MONITOR_INTERVAL: "30" - - # time interval of dnode status reporting to mnode, seconds, for cluster only - #TAOS_STATUS_INTERVAL: "1" - - # time interval of heart beat from shell to dnode, seconds - #TAOS_SHELL_ACTIVITY_TIMER: "3" - - # minimum sliding window time, milli-second - #TAOS_MIN_SLIDING_TIME: "10" - - # minimum time window, milli-second - #TAOS_MIN_INTERVAL_TIME: "1" - - # the compressed rpc message, option: - # -1 (no compression) - # 0 (all message compressed), - # > 0 (rpc message body which larger than this value will be compressed) - #TAOS_COMPRESS_MSG_SIZE: "-1" - - # max number of connections allowed in dnode - #TAOS_MAX_SHELL_CONNS: "50000" - - # stop writing logs when the disk size of the log folder is less than this value - #TAOS_MINIMAL_LOG_DIR_G_B: "0.1" - - # stop writing temporary files when the disk size of the tmp folder is less than this value - #TAOS_MINIMAL_TMP_DIR_G_B: "0.1" - - # if disk free space is less than this value, taosd service exit directly within startup process - #TAOS_MINIMAL_DATA_DIR_G_B: "0.1" - - # One mnode is equal to the number of vnode consumed - #TAOS_MNODE_EQUAL_VNODE_NUM: "4" - - # enbale/disable http service - #TAOS_HTTP: "1" - - # enable/disable system monitor - #TAOS_MONITOR: "1" - - # enable/disable async log - #TAOS_ASYNC_LOG: "1" - - # - # time of keeping log files, days - #TAOS_LOG_KEEP_DAYS: "0" - - # The following parameters are used for debug purpose only. 
- # debugFlag 8 bits mask: FILE-SCREEN-UNUSED-HeartBeat-DUMP-TRACE_WARN-ERROR - # 131: output warning and error - # 135: output debug, warning and error - # 143: output trace, debug, warning and error to log - # 199: output debug, warning and error to both screen and file - # 207: output trace, debug, warning and error to both screen and file - # - # debug flag for all log type, take effect when non-zero value\ - #TAOS_DEBUG_FLAG: "143" - - # generate core file when service crash - #TAOS_ENABLE_CORE_FILE: "1" -``` - -### 扩容 - -关于扩容可参考上一节的说明,有一些额外的操作需要从 helm 的部署中获取。 -首先,从部署中获取 StatefulSet 的名称。 -```shell -export STS_NAME=$(kubectl get statefulset \ - -l "app.kubernetes.io/name=tdengine" \ - -o jsonpath="{.items[0].metadata.name}") -``` - -扩容操作极其简单,增加 replica 即可。以下命令将 TDengine 扩充到三节点: -```shell -kubectl scale --replicas 3 statefulset/$STS_NAME -``` - -使用命令 `show dnodes` 和 `show mnodes` 检查是否扩容成功。 - -### 清理集群 - -Helm 管理下,清理操作也变得简单: - -```shell -helm uninstall tdengine -``` - +--- +sidebar_label: 集群部署 +title: 集群部署 +toc_max_heading_level: 4 +--- + +由于 TDengine 设计之初就采用了分布式架构,具有强大的水平扩展能力,以满足不断增长的数据处理需求,因此 TDengine 支持集群,并将此核心功能开源。用户可以根据实际环境和需求选择 4 种部署方式—手动部署、Docker 部署、Kubernetes 部署和 Helm 部署。 + +## 手动部署 + +### 部署 taosd + +taosd 是 TDengine 集群中最主要的服务组件,本节介绍手动部署 taosd 集群的步骤。 + +#### 1. 清除数据 + +如果搭建集群的物理节点中存在之前的测试数据或者装过其他版本(如 1.x/2.x)的TDengine,请先将其删除,并清空所有数据。 + +#### 2. 检查环境 + +在进行 TDengine 集群部署之前,全面检查所有 dnode 以及应用程序所在物理节点的网络设置至关重要。以下是检查步骤: + +- 第 1 步,在每个物理节点上执行 hostname -f 命令,以查看并确认所有节点的hostname 是唯一的。对于应用程序驱动所在的节点,这一步骤可以省略。 +- 第 2 步,在每个物理节点上执行 ping host 命令,其中 host 是其他物理节点的 hostname。这一步骤旨在检测当前节点与其他物理节点之间的网络连通性。如果发现无法 ping 通,请立即检查网络和 DNS 设置。对于 Linux 操作系统,请检查 /etc/hosts 文件;对于 Windows 操作系统,请检查C:\Windows\system32\drivers\etc\hosts 文件。网络不通畅将导致无法组建集群,请务必解决此问题。 +- 第 3 步,在应用程序运行的物理节点上重复上述网络检测步骤。如果发现网络不通畅,应用程序将无法连接到 taosd 服务。此时,请仔细检查应用程序所在物理节点的DNS 设置或 hosts 文件,确保其配置正确无误。 +- 第 4 步,检查端口,确保集群中所有主机在端口 6030 上的 TCP 能够互通。 + +通过以上步骤,你可以确保所有节点在网络层面顺利通信,从而为成功部署TDengine 集群奠定坚实基础 + +#### 3. 
安装 + +为了确保集群内各物理节点的一致性和稳定性,请在所有物理节点上安装相同版本的 TDengine。 + +#### 4. 修改配置 + +修改 TDengine 的配置文件(所有节点的配置文件都需要修改)。假设准备启动的第 1 个 dnode 的 endpoint 为 h1.taosdata.com:6030,其与集群配置相关参数如下。 + +```shell +# firstEp 是每个 dnode 首次启动后连接的第 1 个 dnode +firstEp h1.taosdata.com:6030 +# 必须配置为本 dnode 的 FQDN,如果本机只有一个 hostname,可注释或删除如下这行代码 +fqdn h1.taosdata.com +# 配置本 dnode 的端口,默认是 6030 +serverPort 6030 +``` + +一定要修改的参数是 firstEp 和 fqdn。对于每个 dnode,firstEp 配置应该保持一致,但 fqdn 一定要配置成其所在 dnode 的值。其他参数可不做任何修改,除非你很清楚为什么要修改。 + +对于希望加入集群的 dnode 节点,必须确保下表所列的与 TDengine 集群相关的参数设置完全一致。任何参数的不匹配都可能导致 dnode 节点无法成功加入集群。 + +| 参数名称 | 含义 | +|:---------------:|:----------------------------------------------------------:| +|statusInterval | dnode 向 mnode 报告状态的间隔 | +|timezone | 时区 | +|locale | 系统区位信息及编码格式 | +|charset | 字符集编码 | +|ttlChangeOnWrite | ttl 到期时间是否伴随表的修改操作而改变 | + +#### 5. 启动 + +按照前述步骤启动第 1 个 dnode,例如 h1.taosdata.com。接着在终端中执行 taos,启动 TDengine 的 CLI 程序 taos,并在其中执行 show dnodes 命令,以查看当前集群中的所有 dnode 信息。 + +```shell +taos> show dnodes; + id | endpoint | vnodes|support_vnodes|status| create_time | note | +=================================================================================== + 1| h1.taosdata.com:6030 | 0| 1024| ready| 2022-07-16 10:50:42.673 | | +``` + +可以看到,刚刚启动的 dnode 节点的 endpoint 为 h1.taosdata.com:6030。这个地址就是新建集群的 first Ep。 + +#### 6. 
添加 dnode + +按照前述步骤,在每个物理节点启动 taosd。每个 dnode 都需要在 taos.cfg 文件中将 firstEp 参数配置为新建集群首个节点的 endpoint,在本例中是 h1.taosdata.com:6030。在第 1 个 dnode 所在机器,在终端中运行 taos,打开 TDengine 的 CLI 程序 taos,然后登录TDengine 集群,执行如下 SQL。 + +```shell +create dnode "h2.taosdata.com:6030" +``` + +将新 dnode 的 endpoint 添加进集群的 endpoint 列表。需要为 `fqdn:port` 加上双引号,否则运行时出错。请注意将示例的 h2.taosdata.com:6030 替换为这个新 dnode 的 endpoint。然后执行如下 SQL 查看新节点是否成功加入。若要加入的 dnode 当前处于离线状态,请参考本节后面的 “常见问题”部分进行解决。 + +```shell +show dnodes; +``` + +在日志中,请确认输出的 dnode 的 fqdn 和端口是否与你刚刚尝试添加的 endpoint 一致。如果不一致,请修正为正确的 endpoint。遵循上述步骤,你可以持续地将新的 dnode 逐个加入集群,从而扩展集群规模并提高整体性能。确保在添加新节点时遵循正确的流程,这有助于维持集群的稳定性和可靠性。 + +**Tips** +- 任何已经加入集群的 dnode 都可以作为后续待加入节点的 firstEp。firstEp 参数仅仅在该 dnode 首次加入集群时起作用,加入集群后,该 dnode 会保存最新的 mnode 的 endpoint 列表,后续不再依赖这个参数。之后配置文件中的 firstEp 参数主要用于客户端连接,如果没有为 TDengine 的 CLI 设置参数,则默认连接由 firstEp 指定的节点。 +- 两个没有配置 firstEp 参数的 dnode 在启动后会独立运行。这时无法将其中一个dnode 加入另外一个 dnode,形成集群。 +- TDengine 不允许将两个独立的集群合并成新的集群。 + +#### 7. 添加 mnode + +在创建 TDengine 集群时,首个 dnode 将自动成为集群的 mnode,负责集群的管理和协调工作。为了实现 mnode 的高可用性,后续添加的 dnode 需要手动创建 mnode。请注意,一个集群最多允许创建 3 个 mnode,且每个 dnode 上只能创建一个 mnode。当集群中的 dnode 数量达到或超过 3 个时,你可以为现有集群创建 mnode。在第 1个 dnode 中,首先通过 TDengine 的 CLI 程序 taos 登录 TDengine,然后执行如下 SQL。 + +```shell +create mnode on dnode +``` + +请注意将上面示例中的 dnodeId 替换为刚创建 dnode 的序号(可以通过执行 `show dnodes` 命令获得)。最后执行如下 `show mnodes`,查看新创建的 mnode 是否成功加入集群。 + + +**Tips** + +在搭建 TDengine 集群的过程中,如果在执行 create dnode 命令以添加新节点后,新节点始终显示为离线状态,请按照以下步骤进行排查。 + +- 第 1 步,检查新节点上的 taosd 服务是否已经正常启动。你可以通过查看日志文件或使用 ps 命令来确认。 +- 第 2 步,如果 taosd 服务已启动,接下来请检查新节点的网络连接是否畅通,并确认防火墙是否已关闭。网络不通或防火墙设置可能会阻止节点与集群的其他节点通信。 +- 第 3 步,使用 taos -h fqdn 命令尝试连接到新节点,然后执行 show dnodes 命令。这将显示新节点作为独立集群的运行状态。如果显示的列表与主节点上显示的不一致,说明新节点可能已自行组成一个单节点集群。要解决这个问题,请按照以下步骤操作。首先,停止新节点上的 taosd 服务。其次,清空新节点上 taos.cfg 配置文件中指定的 dataDir 目录下的所有文件。这将删除与该节点相关的所有数据和配置信息。最后,重新启动新节点上的 taosd 服务。这将使新节点恢复到初始状态,并准备好重新加入主集群。 + +### 部署 taosAdapter + +本节讲述如何部署 taosAdapter,taosAdapter 为 TDengine 集群提供 RESTful 和 WebSocket 
接入能力,因而在集群中扮演着很重要的角色。 + +1. 安装 + +TDengine Enterprise 安装完成后,即可使用 taosAdapter。如果想在不同的服务器上分别部署 taosAdapter,需要在这些服务器上都安装 TDengine Enterprise。 + +2. 单一实例部署 + +部署 taosAdapter 的单一实例非常简单,具体命令和配置参数请参考手册中 taosAdapter 部分。 + +3. 多实例部署 + +部署 taosAdapter 的多个实例的主要目的如下: +- 提升集群的吞吐量,避免 taosAdapter 成为系统瓶颈。 +- 提升集群的健壮性和高可用能力,当有一个实例因某种故障而不再提供服务时,可以将进入业务系统的请求自动路由到其他实例。 + +在部署 taosAdapter 的多个实例时,需要解决负载均衡问题,以避免某个节点过载而其他节点闲置。在部署过程中,需要分别部署多个单一实例,每个实例的部署步骤与部署单一实例完全相同。接下来关键的部分是配置 Nginx。以下是一个经过验证的较佳实践配置,你只需将其中的 endpoint 替换为实际环境中的正确地址即可。关于各参数的含义,请参考 Nginx 的官方文档。 + +```nginx +user root; +worker_processes auto; +error_log /var/log/nginx_error.log; + + +events { + use epoll; + worker_connections 1024; +} + +http { + + access_log off; + + map $http_upgrade $connection_upgrade { + default upgrade; + '' close; + } + + server { + listen 6041; + location ~* { + proxy_pass http://dbserver; + proxy_read_timeout 600s; + proxy_send_timeout 600s; + proxy_connect_timeout 600s; + proxy_next_upstream error http_502 non_idempotent; + proxy_http_version 1.1; + proxy_set_header Upgrade $http_upgrade; + proxy_set_header Connection $http_connection; + } + } + server { + listen 6043; + location ~* { + proxy_pass http://keeper; + proxy_read_timeout 60s; + proxy_next_upstream error http_502 http_500 non_idempotent; + } + } + + server { + listen 6060; + location ~* { + proxy_pass http://explorer; + proxy_read_timeout 60s; + proxy_next_upstream error http_502 http_500 non_idempotent; + } + } + upstream dbserver { + least_conn; + server 172.16.214.201:6041 max_fails=0; + server 172.16.214.202:6041 max_fails=0; + server 172.16.214.203:6041 max_fails=0; + } + upstream keeper { + ip_hash; + server 172.16.214.201:6043 ; + server 172.16.214.202:6043 ; + server 172.16.214.203:6043 ; + } + upstream explorer{ + ip_hash; + server 172.16.214.201:6060 ; + server 172.16.214.202:6060 ; + server 172.16.214.203:6060 ; + } +} +``` + +### 部署 taosKeeper + +如果想使用 TDengine 的监控功能,taosKeeper 
是一个必要的组件,关于监控请参考[TDinsight](../../reference/components/tdinsight),关于部署 taosKeeper 的细节请参考[taosKeeper参考手册](../../reference/components/taoskeeper)。 + +### 部署 taosX + +如果想使用 TDengine 的数据接入能力,需要部署 taosX 服务,关于它的详细说明和部署请参考[taosX 参考手册](../../reference/components/taosx)。 + +### 部署 taosX-Agent + +有些数据源如 Pi, OPC 等,因为网络条件和数据源访问的限制,taosX 无法直接访问数据源,这种情况下需要部署一个代理服务 taosX-Agent,关于它的详细说明和部署请参考[taosX-Agent 参考手册](../../reference/components/taosx-agent)。 + +### 部署 taos-Explorer + +TDengine 提供了可视化管理 TDengine 集群的能力,要想使用图形化界面需要部署 taos-Explorer 服务,关于它的详细说明和部署请参考[taos-Explorer 参考手册](../../reference/components/explorer) + + +## Docker 部署 + +本节将介绍如何在 Docker 容器中启动 TDengine 服务并对其进行访问。你可以在 docker run 命令行或者 docker-compose 文件中使用环境变量来控制容器中服务的行为。 + +### 启动 TDengine + +TDengine 镜像启动时默认激活 HTTP 服务,使用下列命令便可创建一个带有 HTTP 服务的容器化 TDengine 环境。 +```shell +docker run -d --name tdengine \ +-v ~/data/taos/dnode/data:/var/lib/taos \ +-v ~/data/taos/dnode/log:/var/log/taos \ +-p 6041:6041 tdengine/tdengine +``` + +详细的参数说明如下。 +- /var/lib/taos:TDengine 默认数据文件目录,可通过配置文件修改位置。 +- /var/log/taos:TDengine 默认日志文件目录,可通过配置文件修改位置。 + +以上命令启动了一个名为 tdengine 的容器,并把其中的 HTTP 服务的端口 6041 映射到主机端口 6041。如下命令可以验证该容器中提供的 HTTP 服务是否可用。 + +```shell +curl -u root:taosdata -d "show databases" localhost:6041/rest/sql +``` + +运行如下命令可在容器中访问 TDengine。 +```shell +$ docker exec -it tdengine taos + +taos> show databases; + name | +================================= + information_schema | + performance_schema | +Query OK, 2 rows in database (0.033802s) +``` + +在容器中,TDengine CLI 或者各种连接器(例如 JDBC-JNI)与服务器通过容器的 hostname 建立连接。从容器外访问容器内的 TDengine 比较复杂,通过 RESTful/WebSocket 连接方式是最简单的方法。 + +### 在 host 网络模式下启动 TDengine + +运行以下命令可以在 host 网络模式下启动 TDengine,这样可以使用主机的 FQDN 建立连接,而不是使用容器的 hostname。 +```shell +docker run -d --name tdengine --network host tdengine/tdengine +``` + +这种方式与在主机上使用 systemctl 命令启动 TDengine 的效果相同。在主机上已安装 TDengine 客户端的情况下,可以直接使用下面的命令访问 TDengine 服务。 +```shell +$ taos + +taos> show dnodes; + id | endpoint | vnodes | support_vnodes | status | 
create_time | note | +================================================================================================================================================= + 1 | vm98:6030 | 0 | 32 | ready | 2022-08-19 14:50:05.337 | | +Query OK, 1 rows in database (0.010654s) +``` + +### 以指定的 hostname 和 port 启动 TDengine + +使用如下命令可以利用 TAOS_FQDN 环境变量或者 taos.cfg 中的 fqdn 配置项使TDengine 在指定的 hostname 上建立连接。这种方式为部署 TDengine 提供了更大的灵活性。 + +```shell +docker run -d \ + --name tdengine \ + -e TAOS_FQDN=tdengine \ + -p 6030:6030 \ + -p 6041-6049:6041-6049 \ + -p 6041-6049:6041-6049/udp \ + tdengine/tdengine +``` + +首先,上面的命令在容器中启动一个 TDengine 服务,其所监听的 hostname 为tdengine,并将容器的端口 6030 映射到主机的端口 6030,将容器的端口段 [6041, 6049] 映射到主机的端口段 [6041, 6049]。如果主机上该端口段已经被占用,可以修改上述命令以指定一个主机上空闲的端口段。 + +其次,要确保 tdengine 这个 hostname 在 /etc/hosts 中可解析。通过如下命令可将正确的配置信息保存到 hosts 文件中。 +```shell +echo 127.0.0.1 tdengine |sudo tee -a /etc/hosts +``` + +最后,可以通过 TDengine CLI 以 tdengine 为服务器地址访问 TDengine 服务,命令如下。 +```shell +taos -h tdengine -P 6030 +``` + +如果 TAOS_FQDN 被设置为与所在主机名相同,则效果与“在 host 网络模式下启动TDengine”相同。 + +## Kubernetes 部署 + +作为面向云原生架构设计的时序数据库,TDengine 本身就支持 Kubernetes 部署。这里介绍如何使用 YAML 文件从头一步一步创建一个可用于生产使用的高可用 TDengine 集群,并重点介绍 Kubernetes 环境下 TDengine 的常用操作。本小节要求读者对 Kubernetes 有一定的了解,可以熟练运行常见的 kubectl 命令,了解 statefulset、service、pvc 等概念,对这些概念不熟悉的读者,可以先参考 Kubernetes 的官网进行学习。 +为了满足高可用的需求,集群需要满足如下要求: +- 3 个及以上 dnode :TDengine 的同一个 vgroup 中的多个 vnode ,不允许同时分布在一个 dnode ,所以如果创建 3 副本的数据库,则 dnode 数大于等于 3 +- 3 个 mnode :mnode 负责整个集群的管理工作,TDengine 默认是一个 mnode。如果这个 mnode 所在的 dnode 掉线,则整个集群不可用。 +- 数据库的 3 副本:TDengine 的副本配置是数据库级别,所以数据库 3 副本可满足在 3 个 dnode 的集群中,任意一个 dnode 下线,都不影响集群的正常使用。如果下线 dnode 个数为 2 时,此时集群不可用,因为 RAFT 无法完成选举。(企业版:在灾难恢复场景,任一节点数据文件损坏,都可以通过重新拉起 dnode 进行恢复) + +### 前置条件 + +要使用 Kubernetes 部署管理 TDengine 集群,需要做好如下准备工作。 +- 本文适用 Kubernetes v1.19 以上版本 +- 本文使用 kubectl 工具进行安装部署,请提前安装好相应软件 +- Kubernetes 已经安装部署并能正常访问使用或更新必要的容器仓库或其他服务 + +### 配置 Service 服务 + +创建一个 Service 配置文件:taosd-service.yaml,服务名称 metadata.name (此处为 
"taosd") 将在下一步中使用到。首先添加 TDengine 所用到的端口,然后在选择器设置确定的标签 app (此处为 “tdengine”)。 + +```yaml +--- +apiVersion: v1 +kind: Service +metadata: + name: "taosd" + labels: + app: "tdengine" +spec: + ports: + - name: tcp6030 + protocol: "TCP" + port: 6030 + - name: tcp6041 + protocol: "TCP" + port: 6041 + selector: + app: "tdengine" +``` + +### 有状态服务 StatefulSet + +根据 Kubernetes 对各类部署的说明,我们将使用 StatefulSet 作为 TDengine 的部署资源类型。 创建文件 tdengine.yaml,其中 replicas 定义集群节点的数量为 3。节点时区为中国(Asia/Shanghai),每个节点分配 5G 标准(standard)存储,你也可以根据实际情况进行相应修改。 + +请特别注意 startupProbe 的配置,在 dnode 的 Pod 掉线一段时间后,再重新启动,这个时候新上线的 dnode 会短暂不可用。如果 startupProbe 配置过小,Kubernetes 会认为该 Pod 处于不正常的状态,并尝试重启该 Pod,该 dnode 的 Pod 会频繁重启,始终无法恢复到正常状态。 + +```yaml +--- +apiVersion: apps/v1 +kind: StatefulSet +metadata: + name: "tdengine" + labels: + app: "tdengine" +spec: + serviceName: "taosd" + replicas: 3 + updateStrategy: + type: RollingUpdate + selector: + matchLabels: + app: "tdengine" + template: + metadata: + name: "tdengine" + labels: + app: "tdengine" + spec: + containers: + - name: "tdengine" + image: "tdengine/tdengine:3.2.3.0" + imagePullPolicy: "IfNotPresent" + ports: + - name: tcp6030 + protocol: "TCP" + containerPort: 6030 + - name: tcp6041 + protocol: "TCP" + containerPort: 6041 + env: + # POD_NAME for FQDN config + - name: POD_NAME + valueFrom: + fieldRef: + fieldPath: metadata.name + # SERVICE_NAME and NAMESPACE for fqdn resolve + - name: SERVICE_NAME + value: "taosd" + - name: STS_NAME + value: "tdengine" + - name: STS_NAMESPACE + valueFrom: + fieldRef: + fieldPath: metadata.namespace + # TZ for timezone settings, we recommend to always set it. + - name: TZ + value: "Asia/Shanghai" + # Environment variables with prefix TAOS_ will be parsed and converted into corresponding parameter in taos.cfg. For example, serverPort in taos.cfg should be configured by TAOS_SERVER_PORT when using K8S to deploy + - name: TAOS_SERVER_PORT + value: "6030" + # Must set if you want a cluster. 
+ - name: TAOS_FIRST_EP + value: "$(STS_NAME)-0.$(SERVICE_NAME).$(STS_NAMESPACE).svc.cluster.local:$(TAOS_SERVER_PORT)" + # TAOS_FQDN should always be set in k8s env. + - name: TAOS_FQDN + value: "$(POD_NAME).$(SERVICE_NAME).$(STS_NAMESPACE).svc.cluster.local" + volumeMounts: + - name: taosdata + mountPath: /var/lib/taos + startupProbe: + exec: + command: + - taos-check + failureThreshold: 360 + periodSeconds: 10 + readinessProbe: + exec: + command: + - taos-check + initialDelaySeconds: 5 + timeoutSeconds: 5000 + livenessProbe: + exec: + command: + - taos-check + initialDelaySeconds: 15 + periodSeconds: 20 + volumeClaimTemplates: + - metadata: + name: taosdata + spec: + accessModes: + - "ReadWriteOnce" + storageClassName: "standard" + resources: + requests: + storage: "5Gi" +``` + +### 使用 kubectl 命令部署 TDengine 集群 + +首先创建对应的 namespace tdengine-test,以及 pvc,并保证 storageClassName 为 standard 的存储剩余空间足够。然后顺序执行以下命令: +```shell +kubectl apply -f taosd-service.yaml -n tdengine-test +kubectl apply -f tdengine.yaml -n tdengine-test +``` + +上面的配置将生成一个三节点的 TDengine 集群,dnode 为自动配置,可以使用 show dnodes 命令查看当前集群的节点: +```shell +kubectl exec -it tdengine-0 -n tdengine-test -- taos -s "show dnodes" +kubectl exec -it tdengine-1 -n tdengine-test -- taos -s "show dnodes" +kubectl exec -it tdengine-2 -n tdengine-test -- taos -s "show dnodes" +``` + +输出如下: +```shell +taos> show dnodes + id | endpoint | vnodes | support_vnodes | status | create_time | reboot_time | note | active_code | c_active_code | +============================================================================================================================================================================================================================================= + 1 | tdengine-0.ta... | 0 | 16 | ready | 2023-07-19 17:54:18.552 | 2023-07-19 17:54:18.469 | | | | + 2 | tdengine-1.ta... | 0 | 16 | ready | 2023-07-19 17:54:37.828 | 2023-07-19 17:54:38.698 | | | | + 3 | tdengine-2.ta... 
| 0 | 16 | ready | 2023-07-19 17:55:01.141 | 2023-07-19 17:55:02.039 | | | | +Query OK, 3 row(s) in set (0.001853s) +``` + +查看当前 mnode +```shell +kubectl exec -it tdengine-1 -n tdengine-test -- taos -s "show mnodes\G" +taos> show mnodes\G +*************************** 1.row *************************** + id: 1 + endpoint: tdengine-0.taosd.tdengine-test.svc.cluster.local:6030 + role: leader + status: ready +create_time: 2023-07-19 17:54:18.559 +reboot_time: 2023-07-19 17:54:19.520 +Query OK, 1 row(s) in set (0.001282s) +``` + +创建 mnode +```shell +kubectl exec -it tdengine-0 -n tdengine-test -- taos -s "create mnode on dnode 2" +kubectl exec -it tdengine-0 -n tdengine-test -- taos -s "create mnode on dnode 3" +``` + +查看 mnode +```shell +kubectl exec -it tdengine-1 -n tdengine-test -- taos -s "show mnodes\G" + +taos> show mnodes\G +*************************** 1.row *************************** + id: 1 + endpoint: tdengine-0.taosd.tdengine-test.svc.cluster.local:6030 + role: leader + status: ready +create_time: 2023-07-19 17:54:18.559 +reboot_time: 2023-07-20 09:19:36.060 +*************************** 2.row *************************** + id: 2 + endpoint: tdengine-1.taosd.tdengine-test.svc.cluster.local:6030 + role: follower + status: ready +create_time: 2023-07-20 09:22:05.600 +reboot_time: 2023-07-20 09:22:12.838 +*************************** 3.row *************************** + id: 3 + endpoint: tdengine-2.taosd.tdengine-test.svc.cluster.local:6030 + role: follower + status: ready +create_time: 2023-07-20 09:22:20.042 +reboot_time: 2023-07-20 09:22:23.271 +Query OK, 3 row(s) in set (0.003108s) +``` + +### 端口转发 + +利用 kubectl 端口转发功能可以使应用可以访问 Kubernetes 环境运行的 TDengine 集群。 + +```shell +kubectl port-forward -n tdengine-test tdengine-0 6041:6041 & +``` + +使用 curl 命令验证 TDengine REST API 使用的 6041 接口。 +```shell +curl -u root:taosdata -d "show databases" 127.0.0.1:6041/rest/sql 
+{"code":0,"column_meta":[["name","VARCHAR",64]],"data":[["information_schema"],["performance_schema"],["test"],["test1"]],"rows":4} +``` + +### 集群扩容 + +TDengine 支持集群扩容: +```shell +kubectl scale statefulsets tdengine -n tdengine-test --replicas=4 +``` + +上面命令行中参数 `--replicas=4` 表示要将 TDengine 集群扩容到 4 个节点,执行后首先检查 Pod 的状态: +```shell +kubectl get pod -l app=tdengine -n tdengine-test -o wide +``` + +输出如下: +```text +NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES +tdengine-0 1/1 Running 4 (6h26m ago) 6h53m 10.244.2.75 node86 +tdengine-1 1/1 Running 1 (6h39m ago) 6h53m 10.244.0.59 node84 +tdengine-2 1/1 Running 0 5h16m 10.244.1.224 node85 +tdengine-3 1/1 Running 0 3m24s 10.244.2.76 node86 +``` + +此时 Pod 的状态仍然是 Running,TDengine 集群中的 dnode 状态要等 Pod 状态为 ready 之后才能看到: +```shell +kubectl exec -it tdengine-3 -n tdengine-test -- taos -s "show dnodes" +``` + +扩容后的四节点 TDengine 集群的 dnode 列表: +```text +taos> show dnodes + id | endpoint | vnodes | support_vnodes | status | create_time | reboot_time | note | active_code | c_active_code | +============================================================================================================================================================================================================================================= + 1 | tdengine-0.ta... | 10 | 16 | ready | 2023-07-19 17:54:18.552 | 2023-07-20 09:39:04.297 | | | | + 2 | tdengine-1.ta... | 10 | 16 | ready | 2023-07-19 17:54:37.828 | 2023-07-20 09:28:24.240 | | | | + 3 | tdengine-2.ta... | 10 | 16 | ready | 2023-07-19 17:55:01.141 | 2023-07-20 10:48:43.445 | | | | + 4 | tdengine-3.ta... 
| 0 | 16 | ready | 2023-07-20 16:01:44.007 | 2023-07-20 16:01:44.889 | | | | +Query OK, 4 row(s) in set (0.003628s) +``` + +### 清理集群 + +**Warning** +删除 pvc 时需要注意下 pv persistentVolumeReclaimPolicy 策略,建议改为 Delete,这样在删除 pvc 时才会自动清理 pv,同时会清理底层的 csi 存储资源,如果没有配置删除 pvc 自动清理 pv 的策略,再删除 pvc 后,在手动清理 pv 时,pv 对应的 csi 存储资源可能不会被释放。 + +完整移除 TDengine 集群,需要分别清理 statefulset、svc、pvc,最后删除命名空间。 + +```shell +kubectl delete statefulset -l app=tdengine -n tdengine-test +kubectl delete svc -l app=tdengine -n tdengine-test +kubectl delete pvc -l app=tdengine -n tdengine-test +kubectl delete namespace tdengine-test +``` + +### 集群灾备能力 + +对于在 Kubernetes 环境下 TDengine 的高可用和高可靠来说,对于硬件损坏、灾难恢复,分为两个层面来讲: +- 底层的分布式块存储具备的灾难恢复能力,块存储的多副本,当下流行的分布式块存储如 Ceph,就具备多副本能力,将存储副本扩展到不同的机架、机柜、机房、数据中心(或者直接使用公有云厂商提供的块存储服务) +- TDengine 的灾难恢复,在 TDengine Enterprise 中,本身具备了当一个 dnode 永久下线(物理机磁盘损坏,数据分拣丢失)后,重新拉起一个空白的 dnode 来恢复原 dnode 的工作。 + +## 使用 Helm 部署 TDengine 集群 + +Helm 是 Kubernetes 的包管理器。 +上一节使用 Kubernetes 部署 TDengine 集群的操作已经足够简单,但 Helm 可以提供更强大的能力。 + +### 安装 Helm + +```shell +curl -fsSL -o get_helm.sh \ + https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 +chmod +x get_helm.sh +./get_helm.sh +``` + +Helm 会使用 kubectl 和 kubeconfig 的配置来操作 Kubernetes,可以参考 Rancher 安装 Kubernetes 的配置来进行设置。 + +### 安装 TDengine Chart + +TDengine Chart 尚未发布到 Helm 仓库,当前可以从 GitHub 直接下载: +```shell +wget https://github.com/taosdata/TDengine-Operator/raw/3.0/helm/tdengine-3.0.2.tgz +``` + +获取当前 Kubernetes 的存储类: +```shell +kubectl get storageclass +``` + +在 minikube 默认为 standard。之后,使用 helm 命令安装: +```shell +helm install tdengine tdengine-3.0.2.tgz \ + --set storage.className= \ + --set image.tag=3.2.3.0 + +``` + +在 minikube 环境下,可以设置一个较小的容量避免超出磁盘可用空间: +```shell +helm install tdengine tdengine-3.0.2.tgz \ + --set storage.className=standard \ + --set storage.dataSize=2Gi \ + --set storage.logSize=10Mi \ + --set image.tag=3.2.3.0 +``` + +部署成功后,TDengine Chart 将会输出操作 TDengine 的说明: +```shell +export POD_NAME=$(kubectl get pods --namespace 
default \ + -l "app.kubernetes.io/name=tdengine,app.kubernetes.io/instance=tdengine" \ + -o jsonpath="{.items[0].metadata.name}") +kubectl --namespace default exec $POD_NAME -- taos -s "show dnodes; show mnodes" +kubectl --namespace default exec -it $POD_NAME -- taos +``` + +可以创建一个表进行测试: +```shell +kubectl --namespace default exec $POD_NAME -- \ + taos -s "create database test; + use test; + create table t1 (ts timestamp, n int); + insert into t1 values(now, 1)(now + 1s, 2); + select * from t1;" +``` + +### 配置 values + +TDengine 支持 `values.yaml` 自定义。 +通过 helm show values 可以获取 TDengine Chart 支持的全部 values 列表: +```shell +helm show values tdengine-3.0.2.tgz +``` + +你可以将结果保存为 values.yaml,之后可以修改其中的各项参数,如 replica 数量,存储类名称,容量大小,TDengine 配置等,然后使用如下命令安装 TDengine 集群: +```shell +helm install tdengine tdengine-3.0.2.tgz -f values.yaml +``` + +全部参数如下: +```yaml +# Default values for tdengine. +# This is a YAML-formatted file. +# Declare variables to be passed into helm templates. + +replicaCount: 1 + +image: + prefix: tdengine/tdengine + #pullPolicy: Always + # Overrides the image tag whose default is the chart appVersion. +# tag: "3.0.2.0" + +service: + # ClusterIP is the default service type, use NodeIP only if you know what you are doing. + type: ClusterIP + ports: + # TCP range required + tcp: [6030, 6041, 6042, 6043, 6044, 6046, 6047, 6048, 6049, 6060] + # UDP range + udp: [6044, 6045] + + +# Set timezone here, not in taoscfg +timezone: "Asia/Shanghai" + +resources: + # We usually recommend not to specify default resources and to leave this as a conscious + # choice for the user. This also increases chances charts run on environments with little + # resources, such as Minikube. If you do want to specify resources, uncomment the following + # lines, adjust them as necessary, and remove the curly braces after 'resources:'. + # limits: + # cpu: 100m + # memory: 128Mi + # requests: + # cpu: 100m + # memory: 128Mi + +storage: + # Set storageClassName for pvc. 
K8s use default storage class if not set. + # + className: "" + dataSize: "100Gi" + logSize: "10Gi" + +nodeSelectors: + taosd: + # node selectors + +clusterDomainSuffix: "" +# Config settings in taos.cfg file. +# +# The helm/k8s support will use environment variables for taos.cfg, +# converting an upper-snake-cased variable like `TAOS_DEBUG_FLAG`, +# to a camelCase taos config variable `debugFlag`. +# +# See the variable list at https://www.taosdata.com/cn/documentation/administrator . +# +# Note: +# 1. firstEp/secondEp: should not be set here, it's auto generated at scale-up. +# 2. serverPort: should not be set, we'll use the default 6030 in many places. +# 3. fqdn: will be auto generated in kubernetes, user should not care about it. +# 4. role: currently role is not supported - every node is able to be mnode and vnode. +# +# Btw, keep quotes "" around the value like below, even the value will be number or not. +taoscfg: + # Starts as cluster or not, must be 0 or 1. + # 0: all pods will start as a separate TDengine server + # 1: pods will start as TDengine server cluster. 
[default] + CLUSTER: "1" + + # number of replications, for cluster only + TAOS_REPLICA: "1" + + + # TAOS_NUM_OF_RPC_THREADS: number of threads for RPC + #TAOS_NUM_OF_RPC_THREADS: "2" + + # + # TAOS_NUM_OF_COMMIT_THREADS: number of threads to commit cache data + #TAOS_NUM_OF_COMMIT_THREADS: "4" + + # enable/disable installation / usage report + #TAOS_TELEMETRY_REPORTING: "1" + + # time interval of system monitor, seconds + #TAOS_MONITOR_INTERVAL: "30" + + # time interval of dnode status reporting to mnode, seconds, for cluster only + #TAOS_STATUS_INTERVAL: "1" + + # time interval of heart beat from shell to dnode, seconds + #TAOS_SHELL_ACTIVITY_TIMER: "3" + + # minimum sliding window time, milli-second + #TAOS_MIN_SLIDING_TIME: "10" + + # minimum time window, milli-second + #TAOS_MIN_INTERVAL_TIME: "1" + + # the compressed rpc message, option: + # -1 (no compression) + # 0 (all message compressed), + # > 0 (rpc message body which larger than this value will be compressed) + #TAOS_COMPRESS_MSG_SIZE: "-1" + + # max number of connections allowed in dnode + #TAOS_MAX_SHELL_CONNS: "50000" + + # stop writing logs when the disk size of the log folder is less than this value + #TAOS_MINIMAL_LOG_DIR_G_B: "0.1" + + # stop writing temporary files when the disk size of the tmp folder is less than this value + #TAOS_MINIMAL_TMP_DIR_G_B: "0.1" + + # if disk free space is less than this value, taosd service exit directly within startup process + #TAOS_MINIMAL_DATA_DIR_G_B: "0.1" + + # One mnode is equal to the number of vnode consumed + #TAOS_MNODE_EQUAL_VNODE_NUM: "4" + + # enbale/disable http service + #TAOS_HTTP: "1" + + # enable/disable system monitor + #TAOS_MONITOR: "1" + + # enable/disable async log + #TAOS_ASYNC_LOG: "1" + + # + # time of keeping log files, days + #TAOS_LOG_KEEP_DAYS: "0" + + # The following parameters are used for debug purpose only. 
+ # debugFlag 8 bits mask: FILE-SCREEN-UNUSED-HeartBeat-DUMP-TRACE_WARN-ERROR + # 131: output warning and error + # 135: output debug, warning and error + # 143: output trace, debug, warning and error to log + # 199: output debug, warning and error to both screen and file + # 207: output trace, debug, warning and error to both screen and file + # + # debug flag for all log type, take effect when non-zero value\ + #TAOS_DEBUG_FLAG: "143" + + # generate core file when service crash + #TAOS_ENABLE_CORE_FILE: "1" +``` + +### 扩容 + +关于扩容可参考上一节的说明,有一些额外的操作需要从 helm 的部署中获取。 +首先,从部署中获取 StatefulSet 的名称。 +```shell +export STS_NAME=$(kubectl get statefulset \ + -l "app.kubernetes.io/name=tdengine" \ + -o jsonpath="{.items[0].metadata.name}") +``` + +扩容操作极其简单,增加 replica 即可。以下命令将 TDengine 扩充到三节点: +```shell +kubectl scale --replicas 3 statefulset/$STS_NAME +``` + +使用命令 `show dnodes` 和 `show mnodes` 检查是否扩容成功。 + +### 清理集群 + +Helm 管理下,清理操作也变得简单: + +```shell +helm uninstall tdengine +``` + 但 Helm 也不会自动移除 PVC,需要手动获取 PVC 然后删除掉。 \ No newline at end of file diff --git a/docs/zh/07-operation/04-maintenance.md b/docs/zh/07-operation/04-maintenance.md index a7b307b5ae..3c02e4dd39 100644 --- a/docs/zh/07-operation/04-maintenance.md +++ b/docs/zh/07-operation/04-maintenance.md @@ -8,9 +8,13 @@ sidebar_label: 集群维护 本节介绍 TDengine Enterprise 中提供的高阶集群维护手段,能够使 TDengine 集群长期运行得更健壮和高效。 +## 节点管理 + +如何管理集群节点请参考[节点管理](../../reference/taos-sql/node) + ## 数据重整 -TDengine 面向多种写入场景,而很多写入场景下,TDengine 的存储会导致数据存储的放大或数据文件的空洞等。这一方面影响数据的存储效率,另一方面也会影响查询效率。为了解决上述问题,TDengine 企业版提供了对数据的重整功能,即 DATA COMPACT 功能,将存储的数据文件重新整理,删除文件空洞和无效数据,提高数据的组织度,从而提高存储和查询的效率。 +TDengine 面向多种写入场景,而很多写入场景下,TDengine 的存储会导致数据存储的放大或数据文件的空洞等。这一方面影响数据的存储效率,另一方面也会影响查询效率。为了解决上述问题,TDengine 企业版提供了对数据的重整功能,即 DATA COMPACT 功能,将存储的数据文件重新整理,删除文件空洞和无效数据,提高数据的组织度,从而提高存储和查询的效率。数据重整功能在 3.0.3.0 版本第一次发布,此后又经过了多次迭代优化,建议使用最新版本。 ### 语法 @@ -39,7 +43,7 @@ KILL COMPACT compact_id; ## Vgroup Leader 再平衡 -当多副本集群中的一个或多个节点因为升级或其它原因而重启后,有可能出现集群中各个 dnode 负载不均衡的现象,极端情况下会出现所有 
vgroup 的 leader 都位于同一个 dnode 的情况。为了解决这个问题,可以使用下面的命令 +当多副本集群中的一个或多个节点因为升级或其它原因而重启后,有可能出现集群中各个 dnode 负载不均衡的现象,极端情况下会出现所有 vgroup 的 leader 都位于同一个 dnode 的情况。为了解决这个问题,可以使用下面的命令,该命令在 3.0.4.0 版本中首次发布,建议尽可能使用最新版本。 ```SQL balance vgroup leader; # 再平衡所有 vgroup 的 leader @@ -73,7 +77,7 @@ restore qnode on dnode ;# 恢复dnode上的qnode ## 分裂虚拟组 -当一个 vgroup 因为子表数过多而导致 CPU 或 Disk 资源使用量负载过高时,增加 dnode 节点后,可通过split vgroup命令把该vgroup分裂为两个虚拟组。分裂完成后,新产生的两个 vgroup 承担原来由一个 vgroup 提供的读写服务。 +当一个 vgroup 因为子表数过多而导致 CPU 或 Disk 资源使用量负载过高时,增加 dnode 节点后,可通过split vgroup命令把该vgroup分裂为两个虚拟组。分裂完成后,新产生的两个 vgroup 承担原来由一个 vgroup 提供的读写服务。该命令在 3.0.6.0 版本第一次发布,建议尽可能使用最新版本。 ```sql split vgroup @@ -97,7 +101,7 @@ split vgroup ## 双副本 -双副本是一种特殊的数据库高可用配置,本节对它的使用和维护操作进行特别说明。 +双副本是一种特殊的数据库高可用配置,本节对它的使用和维护操作进行特别说明。该功能在 3.3.0.0 版本中第一次发布,建议尽可能使用最新版本。 ### 查看 Vgroups 的状态 diff --git a/docs/zh/07-operation/10-disaster.md b/docs/zh/07-operation/10-disaster.md index 71589cf07e..b274e1373b 100644 --- a/docs/zh/07-operation/10-disaster.md +++ b/docs/zh/07-operation/10-disaster.md @@ -11,7 +11,7 @@ toc_max_heading_level: 4 TDengine 支持 WAL 机制,实现数据的容错能力,保证数据的高可靠。TDengine 接收到应用程序的请求数据包时,会先将请求的原始数据包写入数据库日志文件,等数据成功写入数据库数据文件后,再删除相应的 WAL。这样保证了 TDengine 能够在断电等因素导致的服务重启时,从数据库日志文件中恢复数据,避免数据丢失。涉及的配置参数有如下两个: - wal_level :WAL 级别,1 表示写 WAL,但不执行 fsync ; 2 表示写 WAL,而且执行 fsync。默认值为 1。 -- wal_fsync_period:当 wal_level 设置为 2 时,执行 fsync 的周期;当 wal-level 设置为 0 时,表示每次写入,立即执行 fsync。 +- wal_fsync_period:当 wal_level 设置为 2 时,执行 fsync 的周期;当 wal_fsync_period 设置为 0 时,表示每次写入,立即执行 fsync。 如果要 100% 保证数据不丢失,则需要将 wal_level 设置为 2,wal_fsync_period 设置为 0。这时写入速度将会下降。但如果应用程序侧启动的写数据的线程数达到一定的数量(超过 50),那么写入数据的性能也会很不错,只会比 wal_fsync_period 设置为 3000ms 下降 30% 左右。 diff --git a/docs/zh/07-operation/12-multi.md b/docs/zh/07-operation/12-multi.md index 6a221edf35..9b91a47e39 100644 --- a/docs/zh/07-operation/12-multi.md +++ b/docs/zh/07-operation/12-multi.md @@ -4,14 +4,13 @@ title: 多级存储与对象存储 toc_max_heading_level: 4 --- -TDengine 
特有的多级存储功能,其作用是将较近的热度较高的数据存储在高速介质上,而时间久远热度很低的数据存储在低成本介质上,达成了以下目标: +本节介绍 TDengine Enterprise 特有的多级存储功能,其作用是将较近的热度较高的数据存储在高速介质上,而时间久远热度很低的数据存储在低成本介质上,达成了以下目标: - 降低存储成本 -- 将数据分级存储后,海量极冷数据存入廉价存储介质带来显著经济性 - 提升写入性能 -- 得益于每级存储可支持多个挂载点,WAL 预写日志也支持 0 级的多挂载点并行写入,极大提升写入性能(实际场景测得支持持续写入每秒 3 亿测点以上),在机械硬盘上可获得极高磁盘 IO 吞吐(实测可达 2GB/s) - 方便维护 -- 配置好各级存储挂载点后,系统数据迁移等工作,无需人工干预;存储扩容更灵活、方便 - 对 SQL 透明 -- 无论查询的数据是否跨级,一条 SQL 可返回所有数据,简单高效 - -多级存储所涉及的各层存储介质都是本地存储设备。除了本地存储设备之外,TDengine 还支持使用对象存储(S3),将最冷的一批数据保存在最廉价的介质上,以进一步降低存储成本,并在必要时仍然可以进行查询,且数据存储在哪里也对 SQL 透明。 +多级存储所涉及的各层存储介质都是本地存储设备。除了本地存储设备之外,TDengine Enterprise 还支持使用对象存储(S3),将最冷的一批数据保存在最廉价的介质上,以进一步降低存储成本,并在必要时仍然可以进行查询,且数据存储在哪里也对 SQL 透明。支持对象存储在 3.3.0.0 版本中首次发布,建议使用最新版本。 ## 多级存储 diff --git a/docs/zh/07-operation/14-user.md b/docs/zh/07-operation/14-user.md index 4e7087f10b..03a838462f 100644 --- a/docs/zh/07-operation/14-user.md +++ b/docs/zh/07-operation/14-user.md @@ -4,7 +4,7 @@ title: 用户和权限管理 toc_max_heading_level: 4 --- -TDengine 默认仅配置了一个 root 用户,该用户拥有最高权限。TDengine 支持对系统资源、库、表、视图和主题的访问权限控制。root 用户可以为每个用户针对不同的资源设置不同的访问权限。本节介绍 TDengine 中的用户和权限管理。 +TDengine 默认仅配置了一个 root 用户,该用户拥有最高权限。TDengine 支持对系统资源、库、表、视图和主题的访问权限控制。root 用户可以为每个用户针对不同的资源设置不同的访问权限。本节介绍 TDengine 中的用户和权限管理。用户和权限管理是 TDengine Enterprise 特有功能。 ## 用户管理 @@ -76,7 +76,7 @@ drop user user_name 在 TDengine 中,库和表的权限分为 read (读)和 write (写)两种。这些权限可以单独授予,也可以同时授予用户。 - read 权限:拥有 read 权限的用户仅能查询库或表中的数据,而无法对数据进行修改或删除。这种权限适用于需要访问数据但不需要对数据进行写入操作的场景,如数据分析师、报表生成器等。 -- write 权限:拥有 write 权限的用户既可以查询库或表中的数据,也可以向库或表中写入数据。这种权限适用于需要对数据进行写入操作的场景,如数据采集器、数据处理器等。 +- write 权限:拥有 write 权限的用户可以向库或表中写入数据。这种权限适用于需要对数据进行写入操作的场景,如数据采集器、数据处理器等。如果只拥有 write 权限而没有 read 权限,则只能写入数据但不能查询数据。 对某个用户进行库和表访问授权的语法如下。 diff --git a/docs/zh/07-operation/16-security.md b/docs/zh/07-operation/16-security.md index 48aa85d26b..4f47a644f7 100644 --- a/docs/zh/07-operation/16-security.md +++ b/docs/zh/07-operation/16-security.md @@ -4,7 +4,7 @@ title: 更多安全策略 toc_max_heading_level: 4 --- -除了传统的用户和权限管理之外,TDengine 还有其他的安全策略,例如 IP 
白名单、审计日志、数据加密等。 +除了传统的用户和权限管理之外,TDengine 还有其他的安全策略,例如 IP 白名单、审计日志、数据加密等,这些都是 TDengine Enterprise 特有功能,其中白名单功能在 3.2.0.0 版本首次发布,审计日志在 3.1.1.0 版本中首次发布,数据库加密在 3.3.0.0 中首次发布,建议使用最新版本。 ## IP 白名单 @@ -18,13 +18,13 @@ alter user test add host host_name1 查询 IP 白名单的 SQL 如下。 ```sql -select test, allowed_host from ins_user_privileges; -show users; +SELECT TEST, ALLOWED_HOST FROM INS_USERS; +SHOW USERS; ``` 删除 IP 白名单的命令如下。 ```sql -alter user test drop host host_name1 +ALTER USER TEST DROP HOST HOST_NAME1 ``` ## 审计日志 diff --git a/docs/zh/07-operation/18-dual.md b/docs/zh/07-operation/18-dual.md index f50bf223d7..9de6a75b18 100644 --- a/docs/zh/07-operation/18-dual.md +++ b/docs/zh/07-operation/18-dual.md @@ -6,7 +6,7 @@ toc_max_heading_level: 4 ## 简介 -1. 部分用户因为部署环境的特殊性只能部署两台服务器,同时希望实现一定的服务高可用和数据高可靠。本文主要描述基于数据复制和客户端 Failover 两项关键技术的 TDengine 双活系统的产品行为,包括双活系统的架构、配置、运维等。TDengine 双活既可以用于前面所述资源受限的环境,也可用于在两套 TDengine 集群(不限资源)之间的灾备场景。 +1. 部分用户因为部署环境的特殊性只能部署两台服务器,同时希望实现一定的服务高可用和数据高可靠。本文主要描述基于数据复制和客户端 Failover 两项关键技术的 TDengine 双活系统的产品行为,包括双活系统的架构、配置、运维等。TDengine 双活既可以用于前面所述资源受限的环境,也可用于在两套 TDengine 集群(不限资源)之间的灾备场景。双活是 TDengine Enterprise 特有功能,在 3.3.0.0 版本中第一次发布,建议使用最新版本。 2. 双活系统的定义是:业务系统中有且仅有两台服务器,其上分别部署一套服务,在业务层看来这两台机器和两套服务是一个完整的系统,对其中的细节业务层不需要感知。双活中的两个节点通常被称为 Master-Slave,意为”主从“或”主备“,本文档中可能会出现混用的情况。 diff --git a/docs/zh/08-develop/01-connect/index.md b/docs/zh/08-develop/01-connect/index.md index 8e279e586e..fdde4aea2e 100644 --- a/docs/zh/08-develop/01-connect/index.md +++ b/docs/zh/08-develop/01-connect/index.md @@ -41,7 +41,7 @@ TDengine 提供了丰富的应用程序开发接口,为了便于用户快速 3. 使用 Websocket 连接,用户也无需安装客户端驱动程序 taosc。 4. 
连接云服务实例,必须使用 REST 连接 或 Websocket 连接。 -一般我们建议使用 **Websocket 连接**。 +**推荐使用 WebSocket 连接** ## 安装客户端驱动 taosc diff --git a/docs/zh/08-develop/05-stmt.md b/docs/zh/08-develop/05-stmt.md index 8bb4a4f270..0e94af4a34 100644 --- a/docs/zh/08-develop/05-stmt.md +++ b/docs/zh/08-develop/05-stmt.md @@ -11,7 +11,9 @@ import TabItem from "@theme/TabItem"; - 减少解析时间:通过参数绑定,SQL 语句的结构在第一次执行时就已经确定,后续的执行只需要替换参数值,这样可以避免每次执行时都进行语法解析,从而减少解析时间。 - 预编译:当使用参数绑定时,SQL 语句可以被预编译并缓存,后续使用不同的参数值执行时,可以直接使用预编译的版本,提高执行效率。 -- 减少网络开销:参数绑定还可以减少发送到数据库的数据量,因为只需要发送参数值而不是完整的 SQL 语句,特别是在执行大量相似的插入或更新操作时,这种差异尤为明显。 +- 减少网络开销:参数绑定还可以减少发送到数据库的数据量,因为只需要发送参数值而不是完整的 SQL 语句,特别是在执行大量相似的插入或更新操作时,这种差异尤为明显。 + +**Tips: 数据写入推荐使用参数绑定方式** 下面我们继续以智能电表为例,展示各语言连接器使用参数绑定高效写入的功能: 1. 准备一个参数化的 SQL 插入语句,用于向超级表 `meters` 中插入数据。这个语句允许动态地指定子表名、标签和列值。 diff --git a/docs/zh/08-develop/09-udf.md b/docs/zh/08-develop/09-udf.md index 45e4ae6134..b16700b460 100644 --- a/docs/zh/08-develop/09-udf.md +++ b/docs/zh/08-develop/09-udf.md @@ -6,29 +6,28 @@ toc_max_heading_level: 4 ## UDF 简介 -在某些应用场景中,应用逻辑需要的查询功能无法直接使用TDengine内置的函数来实现。TDengine允许编写用户自定义函数(UDF),以便解决特殊应用场景中的使用需求。UDF在集群中注册成功后,可以像系统内置函数一样在SQL中调用,就使用角度而言没有任何区别。UDF分为标量函数和聚合函数。标量函数对每行数据输出一个值,如求绝对值abs、正弦函数sin、字符串拼接函数concat等。聚合函数对多行数据输出一个值,如求平均数avg、取最大值max等。 +在某些应用场景中,应用逻辑需要的查询功能无法直接使用内置函数来实现,TDengine 允许编写用户自定义函数(UDF),以便解决特殊应用场景中的使用需求。UDF 在集群中注册成功后,可以像系统内置函数一样在 SQL 中调用,就使用角度而言没有任何区别。UDF 分为标量函数和聚合函数。标量函数对每行数据输出一个值,如求绝对值(abs)、正弦函数(sin)、字符串拼接函数(concat)等。聚合函数对多行数据输出一个值,如求平均数(avg)、取最大值(max)等。 -TDengine支持用C和Python两种编程语言编写UDF。C语言编写的UDF与内置函数的性能几乎相同,Python语言编写的UDF可以利用丰富的Python运算库。为了避免UDF执行中发生异常影响数据库服务,TDengine使用了进程分离技术,把UDF的执行放到另一个进程中完成,即使用户编写的UDF崩溃,也不会影响TDengine的正常运行。 +TDengine 支持用 C 和 Python 两种编程语言编写 UDF。C 语言编写的 UDF 与内置函数的性能几乎相同,Python 语言编写的 UDF 可以利用丰富的 Python 运算库。为了避免 UDF 执行中发生异常影响数据库服务,TDengine 使用了进程分离技术,把 UDF 的执行放到另一个进程中完成,即使用户编写的 UDF 崩溃,也不会影响 TDengine 的正常运行。 ## 用 C 语言开发 UDF 使用 C 语言实现 UDF 时,需要实现规定的接口函数 - 标量函数需要实现标量接口函数 scalarfn 。 -- 聚合函数需要实现聚合接口函数 aggfn_start , aggfn , aggfn_finish。 -- 如果需要初始化,实现 
udf_init;如果需要清理工作,实现udf_destroy。 - -接口函数的名称是 UDF 名称,或者是 UDF 名称和特定后缀(`_start`, `_finish`, `_init`, `_destroy`)的连接。列表中的scalarfn,aggfn, udf需要替换成udf函数名。 +- 聚合函数需要实现聚合接口函数 aggfn_start、aggfn、aggfn_finish。 +- 如果需要初始化,实现 udf_init。 +- 如果需要清理工作,实现 udf_destroy。 ### 接口定义 -在TDengine中,UDF的接口函数名称可以是UDF名称,也可以是UDF名称和特定后缀(如_start、_finish、_init、_destroy)的连接。后面内容中描述的函数名称,例如scalarfn、aggfn,需要替换成UDF名称。。 +接口函数的名称是 UDF 名称,或者是 UDF 名称和特定后缀(_start、_finish、_init、_destroy)的连接。后面内容中描述的函数名称,例如 scalarfn、aggfn,需要替换成 UDF 名称。 #### 标量函数接口 标量函数是一种将输入数据转换为输出数据的函数,通常用于对单个数据值进行计算和转换。标量函数的接口函数原型如下。 ```c -int32_t scalarfn(SUdfDataBlock* inputDataBlock, SUdfColumn *resultColumn) +int32_t scalarfn(SUdfDataBlock* inputDataBlock, SUdfColumn *resultColumn); ``` 主要参数说明如下。 - inputDataBlock:输入的数据块。 @@ -37,23 +36,22 @@ int32_t scalarfn(SUdfDataBlock* inputDataBlock, SUdfColumn *resultColumn) #### 聚合函数接口 聚合函数是一种特殊的函数,用于对数据进行分组和计算,从而生成汇总信息。聚合函数的工作原理如下。 -- 初始化结果缓冲区:首先调用aggfn_start函数,生成一个结果缓冲区(result buffer),用于存储中间结果。 +- 初始化结果缓冲区:首先调用 aggfn_start 函数,生成一个结果缓冲区(result buffer),用于存储中间结果。 - 分组数据:相关数据会被分为多个行数据块(row data block),每个行数据块包含一组具有相同分组键(grouping key)的数据。 -- 更新中间结果:对于每个数据块,调用aggfn函数更新中间结果。aggfn函数会根据聚合函数的类型(如sum、avg、count等)对数据进行相应的计算,并将计算结 +- 更新中间结果:对于每个数据块,调用 aggfn 函数更新中间结果。aggfn 函数会根据聚合函数的类型(如 sum、avg、count 等)对数据进行相应的计算,并将计算结 果存储在结果缓冲区中。 -- 生成最终结果:在所有数据块的中间结果更新完成后,调用aggfn_finish函数从结果缓冲区中提取最终结果。最终结果通常只包含0条或1条数据,具体取决于聚 +- 生成最终结果:在所有数据块的中间结果更新完成后,调用 aggfn_finish 函数从结果缓冲区中提取最终结果。最终结果只包含 0 条或 1 条数据,具体取决于聚 合函数的类型和输入数据。 聚合函数的接口函数原型如下。 ```c -int32_t aggfn_start(SUdfInterBuf *interBuf) -int32_t aggfn(SUdfDataBlock* inputBlock, SUdfInterBuf *interBuf, SUdfInterBuf *newInterBuf) -int32_t aggfn_finish(SUdfInterBuf* interBuf, SUdfInterBuf *result) +int32_t aggfn_start(SUdfInterBuf *interBuf); +int32_t aggfn(SUdfDataBlock* inputBlock, SUdfInterBuf *interBuf, SUdfInterBuf *newInterBuf); +int32_t aggfn_finish(SUdfInterBuf* interBuf, SUdfInterBuf *result); ``` - -其中 aggfn 是函数名的占位符。首先调用aggfn_start生成结果buffer,然后相关的数据会被分为多个行数据块,对每个数据块调用 
aggfn 用数据块更新中间结果,最后再调用 aggfn_finish 从中间结果产生最终结果,最终结果只能含 0 或 1 条结果数据。 +其中 aggfn 是函数名的占位符。首先调用 aggfn_start 生成结果 buffer,然后相关的数据会被分为多个行数据块,对每个数据块调用 aggfn 用数据块更新中间结果,最后再调用 aggfn_finish 从中间结果产生最终结果,最终结果只能含 0 或 1 条结果数据。 主要参数说明如下。 - interBuf:中间结果缓存区。 @@ -61,29 +59,49 @@ int32_t aggfn_finish(SUdfInterBuf* interBuf, SUdfInterBuf *result) - newInterBuf:新的中间结果缓冲区。 - result:最终结果。 - #### 初始化和销毁接口 -初始化和销毁接口是标量函数和聚合函数共同使用的接口,相关API如下。 +初始化和销毁接口是标量函数和聚合函数共同使用的接口,相关 API 如下。 ```c int32_t udf_init() int32_t udf_destroy() ``` -其中,udf_init函数完成初始化工作,udf_destroy函数完成清理工作。如果没有初始化工作,无须定义udf_init函数;如果没有清理工作,无须定义udf_destroy函数。 +其中,udf_init 函数完成初始化工作,udf_destroy 函数完成清理工作。如果没有初始化工作,无须定义 udf_init 函数;如果没有清理工作,无须定义 udf_destroy 函数。 ### 标量函数模板 -用C语言开发标量函数的模板如下。 +用 C 语言开发标量函数的模板如下。 ```c +#include "taos.h" +#include "taoserror.h" +#include "taosudf.h" + +// Initialization function. +// If no initialization, we can skip definition of it. +// The initialization function shall be concatenation of the udf name and _init suffix. +// @return error number defined in taoserror.h int32_t scalarfn_init() { + // initialization. return TSDB_CODE_SUCCESS; } + +// Scalar function main computation function. +// @param inputDataBlock, input data block composed of multiple columns with each column defined by SUdfColumn +// @param resultColumn, output column +// @return error number defined in taoserror.h int32_t scalarfn(SUdfDataBlock* inputDataBlock, SUdfColumn* resultColumn) { + // read data from inputDataBlock and process, then output to resultColumn. return TSDB_CODE_SUCCESS; } + +// Cleanup function. +// If no cleanup related processing, we can skip definition of it. +// The destroy function shall be concatenation of the udf name and _destroy suffix. 
+// @return error number defined in taoserror.h int32_t scalarfn_destroy() { + // clean up return TSDB_CODE_SUCCESS; } ``` @@ -91,53 +109,211 @@ int32_t scalarfn_destroy() { 用C语言开发聚合函数的模板如下。 ```c +#include "taos.h" +#include "taoserror.h" +#include "taosudf.h" + +// Initialization function. +// If no initialization, we can skip definition of it. +// The initialization function shall be concatenation of the udf name and _init suffix. +// @return error number defined in taoserror.h int32_t aggfn_init() { + // initialization. return TSDB_CODE_SUCCESS; } + +// Aggregate start function. +// The intermediate value or the state(@interBuf) is initialized in this function. +// The function name shall be concatenation of udf name and _start suffix. +// @param interBuf intermediate value to initialize +// @return error number defined in taoserror.h int32_t aggfn_start(SUdfInterBuf* interBuf) { + // initialize intermediate value in interBuf return TSDB_CODE_SUCCESS; } + +// Aggregate reduce function. +// This function aggregates the old state(@interBuf) and one data block(@inputBlock) and outputs a new state(@newInterBuf). +// @param inputBlock input data block +// @param interBuf old state +// @param newInterBuf new state +// @return error number defined in taoserror.h int32_t aggfn(SUdfDataBlock* inputBlock, SUdfInterBuf *interBuf, SUdfInterBuf *newInterBuf) { + // read from inputBlock and interBuf and output to newInterBuf return TSDB_CODE_SUCCESS; } + +// Aggregate function finish function. +// This function transforms the intermediate value(@interBuf) into the final output(@result). +// The function name must be concatenation of aggfn and _finish suffix. +// @interBuf : intermediate value +// @result: final result +// @return error number defined in taoserror.h int32_t aggfn_finish(SUdfInterBuf* interBuf, SUdfInterBuf *result) { + // read data from interBuf and process, then output to result return TSDB_CODE_SUCCESS; } + +// Cleanup function. 
+// If no cleanup related processing, we can skip definition of it. +// The destroy function shall be concatenation of the udf name and _destroy suffix. +// @return error number defined in taoserror.h int32_t aggfn_destroy() { + // clean up return TSDB_CODE_SUCCESS; } ``` ### 编译 -在TDengine中,为了实现UDF,需要编写C语言源代码,并按照TDengine的规范编译为动态链接库文件。 -按照前面描述的规则,准备UDF的源代码bit_and.c。以Linux操作系统为例,执行如下指令,编译得到动态链接库文件。 +在 TDengine 中,为了实现 UDF,需要编写 C 语言源代码,并按照 TDengine 的规范编译为动态链接库文件。 +按照前面描述的规则,准备 UDF 的源代码 bit_and.c。以 Linux 操作系统为例,执行如下指令,编译得到动态链接库文件。 ```shell -gcc-g-O0-fPIC-sharedbit_and.c-olibbitand.so +gcc -g -O0 -fPIC -shared bit_and.c -o libbitand.so ``` -为了保证可靠运行,推荐使用7.5及以上版本的GCC。 +为了保证可靠运行,推荐使用 7.5 及以上版本的 GCC。 + +### C UDF 数据结构 +```c +typedef struct SUdfColumnMeta { + int16_t type; + int32_t bytes; + uint8_t precision; + uint8_t scale; +} SUdfColumnMeta; + +typedef struct SUdfColumnData { + int32_t numOfRows; + int32_t rowsAlloc; + union { + struct { + int32_t nullBitmapLen; + char *nullBitmap; + int32_t dataLen; + char *data; + } fixLenCol; + + struct { + int32_t varOffsetsLen; + int32_t *varOffsets; + int32_t payloadLen; + char *payload; + int32_t payloadAllocLen; + } varLenCol; + }; +} SUdfColumnData; + +typedef struct SUdfColumn { + SUdfColumnMeta colMeta; + bool hasNull; + SUdfColumnData colData; +} SUdfColumn; + +typedef struct SUdfDataBlock { + int32_t numOfRows; + int32_t numOfCols; + SUdfColumn **udfCols; +} SUdfDataBlock; + +typedef struct SUdfInterBuf { + int32_t bufLen; + char *buf; + int8_t numOfResult; //zero or one +} SUdfInterBuf; +``` +数据结构说明如下: + +- SUdfDataBlock 数据块包含行数 numOfRows 和列数 numCols。udfCols[i] (0 \<= i \<= numCols-1)表示每一列数据,类型为SUdfColumn*。 +- SUdfColumn 包含列的数据类型定义 colMeta 和列的数据 colData。 +- SUdfColumnMeta 成员定义同 taos.h 数据类型定义。 +- SUdfColumnData 数据可以变长,varLenCol 定义变长数据,fixLenCol 定义定长数据。 +- SUdfInterBuf 定义中间结构 buffer,以及 buffer 中结果个数 numOfResult + +为了更好的操作以上数据结构,提供了一些便利函数,定义在 taosudf.h。 + + +### C UDF 示例代码 + +#### 标量函数示例 
[bit_and](https://github.com/taosdata/TDengine/blob/3.0/tests/script/sh/bit_and.c) + +bit_and 实现多列的按位与功能。如果只有一列,返回这一列。bit_and 忽略空值。 + +
+bit_and.c + +```c +{{#include tests/script/sh/bit_and.c}} +``` + +
+ +#### 聚合函数示例1 返回值为数值类型 [l2norm](https://github.com/taosdata/TDengine/blob/3.0/tests/script/sh/l2norm.c) + +l2norm 实现了输入列的所有数据的二阶范数,即对每个数据先平方,再累加求和,最后开方。 + +
+l2norm.c + +```c +{{#include tests/script/sh/l2norm.c}} +``` + +
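For reference, the computation described above can be written as a formula. The symbols are introduced here only for illustration: $x_1, \dots, x_n$ are the input values, $B_1, \dots, B_m$ are the row data blocks passed to the reduce step, and $S_k$ is the running sum of squares after block $k$. The included source file is not visible in this diff, so treating $S_k$ as the content of the intermediate buffer is an assumption.

$$
\lVert x \rVert_2 = \sqrt{\sum_{i=1}^{n} x_i^2},
\qquad
S_0 = 0,\quad
S_k = S_{k-1} + \sum_{x_i \in B_k} x_i^2,\quad
\lVert x \rVert_2 = \sqrt{S_m}
$$

Under that assumption the start function would set $S_0$, each reduce call would compute one $S_k$, and the finish function would take the square root.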
+ +#### 聚合函数示例2 返回值为字符串类型 [max_vol](https://github.com/taosdata/TDengine/blob/3.0/tests/script/sh/max_vol.c) + +max_vol 实现了从多个输入的电压列中找到最大电压,返回由设备 ID + 最大电压所在(行,列)+ 最大电压值 组成的组合字符串值。 + +创建表: +```sql +create table battery(ts timestamp, vol1 float, vol2 float, vol3 float, deviceId varchar(16)); +``` +创建自定义函数: +```sql +create aggregate function max_vol as '/root/udf/libmaxvol.so' outputtype binary(64) bufsize 10240 language 'C'; +``` +使用自定义函数: +```sql +select max_vol(vol1, vol2, vol3, deviceid) from battery; +``` + +
+max_vol.c + +```c +{{#include tests/script/sh/max_vol.c}} +``` + +
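The C source above is pulled in by an include directive and is not reproduced in this diff. To make the start / reduce / finish flow of such an aggregate concrete, here is a minimal sketch of the same idea written in Python rather than C. Several things are assumptions made for illustration only: `FakeBlock` is a hypothetical stand-in for the input data block so the code runs outside TDengine, the last input column is taken to be the device ID (as in the `select` statement above), the row index is relative to the current block, and the output string layout is invented and may differ from what max_vol.c produces.

```python
import pickle


class FakeBlock:
    """Hypothetical stand-in for a UDF data block (not a TDengine class)."""

    def __init__(self, rows):
        self._rows = rows

    def shape(self):
        n_rows = len(self._rows)
        n_cols = len(self._rows[0]) if n_rows else 0
        return n_rows, n_cols

    def data(self, i, j):
        return self._rows[i][j]


def start():
    # State is None until a non-null voltage has been seen,
    # afterwards a tuple (voltage, row, col, device_id).
    return pickle.dumps(None)


def reduce(block, buf):
    best = pickle.loads(buf)
    rows, cols = block.shape()
    for i in range(rows):
        device_id = block.data(i, cols - 1)  # assumed: last column is the device ID
        for j in range(cols - 1):            # all other columns are voltages
            v = block.data(i, j)
            if v is not None and (best is None or v > best[0]):
                best = (v, i, j, device_id)
    return pickle.dumps(best)


def finish(buf):
    best = pickle.loads(buf)
    if best is None:
        return None  # no input data: zero results
    v, i, j, device_id = best
    if isinstance(device_id, bytes):
        device_id = device_id.decode()
    # Assumed layout: "<device>:(<row>,<col>):<voltage>"
    return f"{device_id}:({i},{j}):{v}"


if __name__ == "__main__":
    block = FakeBlock([[1.0, 2.5, 2.0, b"d1"], [3.5, None, 1.0, b"d2"]])
    print(finish(reduce(block, start())))
```

Keeping the whole state in one small serialized tuple is what lets the result be carried from one data block to the next; the `bufsize` given when the function is created has to be large enough for that serialized state.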
## 用 Python 语言开发 UDF ### 准备环境 准备环境的具体步骤如下: -- 第1步,准备好Python运行环境。 -- 第2步,安装Python包taospyudf。命令如下。 +- 第1步,准备好 Python 运行环境。 +- 第2步,安装 Python 包 taospyudf。命令如下。 ```shell pip3 install taospyudf ``` -- 第3步,执行命令ldconfig。 -- 第4步,启动taosd服务。 +- 第3步,执行命令 ldconfig。 +- 第4步,启动 taosd 服务。 + +安装过程中会编译 C++ 源码,因此系统上要有 cmake 和 gcc。编译生成的 libtaospyudf.so 文件自动会被复制到 /usr/local/lib/ 目录,因此如果是非 root 用户,安装时需加 sudo。安装完可以检查这个目录是否有了这个文件: + +```shell +root@slave11 ~/udf $ ls -l /usr/local/lib/libtaos* +-rw-r--r-- 1 root root 671344 May 24 22:54 /usr/local/lib/libtaospyudf.so +``` ### 接口定义 -当使用Python语言开发UDF时,需要实现规定的接口函数。具体要求如下。 -- 标量函数需要实现标量接口函数process。 -- 聚合函数需要实现聚合接口函数start、reduce、finish。 -- 如果需要初始化,则应实现函数init。 -- 如果需要清理工作,则实现函数destroy。 +当使用 Python 语言开发 UDF 时,需要实现规定的接口函数。具体要求如下。 +- 标量函数需要实现标量接口函数 process。 +- 聚合函数需要实现聚合接口函数 start、reduce、finish。 +- 如果需要初始化,则应实现函数 init。 +- 如果需要清理工作,则实现函数 destroy。 #### 标量函数接口 @@ -147,7 +323,7 @@ def process(input: datablock) -> tuple[output_type]: ``` 主要参数说明如下: -- input:datablock 类似二维矩阵,通过成员方法 data(row,col)返回位于 row 行,col 列的 python 对象 +- input:datablock 类似二维矩阵,通过成员方法 data(row, col) 读取位于 row 行、col 列的 python 对象 - 返回值是一个 Python 对象元组,每个元素类型为输出类型。 #### 聚合函数接口 @@ -159,13 +335,13 @@ def reduce(inputs: datablock, buf: bytes) -> bytes def finish(buf: bytes) -> output_type: ``` -上述代码定义了3个函数,分别用于实现一个自定义的聚合函数。具体过程如下。 +上述代码定义了 3 个函数,分别用于实现一个自定义的聚合函数。具体过程如下。 -首先,调用start函数生成最初的结果缓冲区。这个结果缓冲区用于存储聚合函数的内部状态,随着输入数据的处理而不断更新。 +首先,调用 start 函数生成最初的结果缓冲区。这个结果缓冲区用于存储聚合函数的内部状态,随着输入数据的处理而不断更新。 -然后,输入数据会被分为多个行数据块。对于每个行数据块,调用reduce函数,并将当前行数据块(inputs)和当前的中间结果(buf)作为参数传递。reduce函数会根据输入数据和当前状态来更新聚合函数的内部状态,并返回新的中间结果 +然后,输入数据会被分为多个行数据块。对于每个行数据块,调用 reduce 函数,并将当前行数据块(inputs)和当前的中间结果(buf)作为参数传递。reduce 函数会根据输入数据和当前状态来更新聚合函数的内部状态,并返回新的中间结果。 -最后,当所有行数据块都处理完毕后,调用finish函数。这个函数接收最终的中间结果(buf)作为参数,并从中生成最终的输出。由于聚合函数的特性,最终输出只能包含0条或1条数据。这个输出结果将作为聚合函数的计算结果返回给调用者。 +最后,当所有行数据块都处理完毕后,调用 finish 函数。这个函数接收最终的中间结果(buf)作为参数,并从中生成最终的输出。由于聚合函数的特性,最终输出只能包含 0 条或 1 条数据。这个输出结果将作为聚合函数的计算结果返回给调用者。 #### 初始化和销毁接口 @@ -179,7 +355,7 @@ 
def destroy() - init 完成初始化工作 - destroy 完成清理工作 -**注意** 用Python开发UDF时必须定义init函数和destroy函数 +**注意** 用 Python 开发 UDF 时必须定义 init 函数和 destroy 函数 ### 标量函数模板 @@ -204,7 +380,7 @@ def start() -> bytes: def reduce(inputs: datablock, buf: bytes) -> bytes # deserialize buf to state # reduce the inputs and state into new_state. - # use inputs.data(i,j) to access python object of location(i,j) + # use inputs.data(i, j) to access python object of location(i, j) # serialize new_state into new_state_bytes return new_state_bytes def finish(buf: bytes) -> output_type: @@ -217,13 +393,13 @@ def finish(buf: bytes) -> output_type: | **TDengine SQL数据类型** | **Python数据类型** | | :-----------------------: | ------------ | -|TINYINT / SMALLINT / INT / BIGINT | int | -|TINYINT UNSIGNED / SMALLINT UNSIGNED / INT UNSIGNED / BIGINT UNSIGNED | int | -|FLOAT / DOUBLE | float | -|BOOL | bool | -|BINARY / VARCHAR / NCHAR | bytes| -|TIMESTAMP | int | -|JSON and other types | 不支持 | +| TINYINT / SMALLINT / INT / BIGINT | int | +| TINYINT UNSIGNED / SMALLINT UNSIGNED / INT UNSIGNED / BIGINT UNSIGNED | int | +| FLOAT / DOUBLE | float | +| BOOL | bool | +| BINARY / VARCHAR / NCHAR | bytes| +| TIMESTAMP | int | +| JSON and other types | 不支持 | ### 开发示例 @@ -234,7 +410,7 @@ def finish(buf: bytes) -> output_type: #### 示例一 编写一个只接收一个整数的 UDF 函数: 输入 n, 输出 ln(n^2 + 1)。 -首先编写一个 Python 文件,存在系统某个目录,比如 /root/udf/myfun.py 内容如下 +首先编写一个 Python 文件,存在系统某个目录,比如 /root/udf/myfun.py 内容如下。 ```python from math import log @@ -250,23 +426,25 @@ def process(block): return [log(block.data(i, 0) ** 2 + 1) for i in range(rows)] ``` -这个文件包含 3 个函数, init 和 destroy 都是空函数,它们是 UDF 的生命周期函数,即使什么都不做也要定义。最关键的是 process 函数, 它接受一个数据块,这个数据块对象有两个方法: +这个文件包含 3 个函数, init 和 destroy 都是空函数,它们是 UDF 的生命周期函数,即使什么都不做也要定义。最关键的是 process 函数, 它接受一个数据块,这个数据块对象有两个方法。 1. shape() 返回数据块的行数和列数 2. 
data(i, j) 返回 i 行 j 列的数据 -标量函数的 process 方法传人的数据块有多少行,就需要返回多少个数据。上述代码中我们忽略的列数,因为我们只想对每行的第一个数做计算。 -接下来我们创建对应的 UDF 函数,在 TDengine CLI 中执行下面语句: + +标量函数的 process 方法传入的数据块有多少行,就需要返回多少行数据。上述代码忽略列数,因为只需对每行的第一列做计算。 + +接下来创建对应的 UDF 函数,在 TDengine CLI 中执行下面语句。 ```sql create function myfun as '/root/udf/myfun.py' outputtype double language 'Python' ``` -其输出如下 +其输出如下。 ```shell - taos> create function myfun as '/root/udf/myfun.py' outputtype double language 'Python'; +taos> create function myfun as '/root/udf/myfun.py' outputtype double language 'Python'; Create OK, 0 row(s) affected (0.005202s) ``` -看起来很顺利,接下来 show 一下系统中所有的自定义函数,确认创建成功: +看起来很顺利,接下来查看系统中所有的自定义函数,确认创建成功。 ```text taos> show functions; @@ -276,7 +454,7 @@ taos> show functions; Query OK, 1 row(s) in set (0.005767s) ``` -接下来就来测试一下这个函数,测试之前先执行下面的 SQL 命令,制造些测试数据,在 TDengine CLI 中执行下述命令 +生成测试数据,可以在 TDengine CLI 中执行下述命令。 ```sql create database test; @@ -286,7 +464,7 @@ insert into t values('2023-05-03 08:09:10', 2, 3, 4); insert into t values('2023-05-10 07:06:05', 3, 4, 5); ``` -测试 myfun 函数: +测试 myfun 函数。 ```sql taos> select myfun(v1, v2) from t; @@ -294,14 +472,13 @@ taos> select myfun(v1, v2) from t; DB error: udf function execution failure (0.011088s) ``` -不幸的是执行失败了,什么原因呢? -查看 udfd 进程的日志 +不幸的是执行失败了,什么原因呢?查看 udfd 进程的日志。 ```shell tail -10 /var/log/taos/udfd.log ``` -发现以下错误信息: +发现以下错误信息。 ```text 05/24 22:46:28.733545 01665799 UDF ERROR can not load library libtaospyudf.so. error: operation not permitted @@ -310,7 +487,7 @@ tail -10 /var/log/taos/udfd.log 错误很明确:没有加载到 Python 插件 libtaospyudf.so,如果遇到此错误,请参考前面的准备环境一节。 -修复环境错误后再次执行,如下: +修复环境错误后再次执行,如下。 ```sql taos> select myfun(v1) from t; @@ -325,7 +502,7 @@ taos> select myfun(v1) from t; #### 示例二 -上面的 myfun 虽然测试测试通过了,但是有两个缺点: +上面的 myfun 虽然测试测试通过了,但是有两个缺点。 1. 这个标量函数只接受 1 列数据作为输入,如果用户传入了多列也不会抛异常。 @@ -338,8 +515,7 @@ taos> select myfun(v1, v2) from t; 2.302585093 | ``` -2. 没有处理 null 值。我们期望如果输入有 null,则会抛异常终止执行。 -因此 process 函数改进如下: +2. 
没有处理 null 值。我们期望如果输入有 null,则会抛异常终止执行。因此 process 函数改进如下。 ```python def process(block): @@ -349,13 +525,13 @@ def process(block): return [ None if block.data(i, 0) is None else log(block.data(i, 0) ** 2 + 1) for i in range(rows)] ``` -然后执行下面的语句更新已有的 UDF: +执行如下语句更新已有的 UDF。 ```sql create or replace function myfun as '/root/udf/myfun.py' outputtype double language 'Python'; ``` -再传入 myfun 两个参数,就会执行失败了 +再传入 myfun 两个参数,就会执行失败了。 ```sql taos> select myfun(v1, v2) from t; @@ -363,7 +539,7 @@ taos> select myfun(v1, v2) from t; DB error: udf function execution failure (0.014643s) ``` -但遗憾的是我们自定义的异常信息没有展示给用户,而是在插件的日志文件 /var/log/taos/taospyudf.log 中: +自定义的异常信息打印在插件的日志文件 /var/log/taos/taospyudf.log 中。 ```text 2023-05-24 23:21:06.790 ERROR [1666188] [doPyUdfScalarProc@507] call pyUdfScalar proc function. context 0x7faade26d180. error: Exception: require 1 parameter but given 2 @@ -378,18 +554,17 @@ At: #### 示例三 -编写一个 UDF:输入(x1, x2, ..., xn), 输出每个值和它们的序号的乘积的和: 1 * x1 + 2 * x2 + ... + n * xn。如果 x1 至 xn 中包含 null,则结果为 null。 -这个示例与示例一的区别是,可以接受任意多列作为输入,且要处理每一列的值。编写 UDF 文件 /root/udf/nsum.py: +输入(x1, x2, ..., xn), 输出每个值和它们的序号的乘积的和:1 * x1 + 2 * x2 + ... 
+ n * xn。如果 x1 至 xn 中包含 null,则结果为 null。 + +本例与示例一的区别是,可以接受任意多列作为输入,且要处理每一列的值。编写 UDF 文件 /root/udf/nsum.py。 ```python def init(): pass - def destroy(): pass - def process(block): rows, cols = block.shape() result = [] @@ -405,13 +580,13 @@ def process(block): return result ``` -创建 UDF: +创建 UDF。 ```sql create function nsum as '/root/udf/nsum.py' outputtype double language 'Python'; ``` -测试 UDF: +测试 UDF。 ```sql taos> insert into t values('2023-05-25 09:09:15', 6, null, 8); @@ -430,22 +605,20 @@ Query OK, 4 row(s) in set (0.010653s) #### 示例四 编写一个 UDF,输入一个时间戳,输出距离这个时间最近的下一个周日。比如今天是 2023-05-25, 则下一个周日是 2023-05-28。 -完成这个函数要用到第三方库 momen。先安装这个库: +完成这个函数要用到第三方库 momen。先安装这个库。 ```shell pip3 install moment ``` -然后编写 UDF 文件 /root/udf/nextsunday.py +然后编写 UDF 文件 /root/udf/nextsunday.py。 ```python import moment - def init(): pass - def destroy(): pass @@ -460,13 +633,13 @@ def process(block): for i in range(rows)] ``` -UDF 框架会将 TDengine 的 timestamp 类型映射为 Python 的 int 类型,所以这个函数只接受一个表示毫秒数的整数。process 方法先做参数检查,然后用 moment 包替换时间的星期为星期日,最后格式化输出。输出的字符串长度是固定的10个字符长,因此可以这样创建 UDF 函数: +UDF 框架会将 TDengine 的 timestamp 类型映射为 Python 的 int 类型,所以这个函数只接受一个表示毫秒数的整数。process 方法先做参数检查,然后用 moment 包替换时间的星期为星期日,最后格式化输出。输出的字符串长度是固定的 10 个字符长,因此可以这样创建 UDF 函数。 ```sql create function nextsunday as '/root/udf/nextsunday.py' outputtype binary(10) language 'Python'; ``` -此时测试函数,如果你是用 systemctl 启动的 taosd,肯定会遇到错误: +此时测试函数,如果你是用 systemctl 启动的 taosd,肯定会遇到错误。 ```sql taos> select ts, nextsunday(ts) from t; @@ -475,11 +648,11 @@ DB error: udf function execution failure (1.123615s) ``` ```shell - tail -20 taospyudf.log +tail -20 taospyudf.log 2023-05-25 11:42:34.541 ERROR [1679419] [PyUdf::PyUdf@217] py udf load module failure. 
error ModuleNotFoundError: No module named 'moment' ``` -这是因为 “moment” 所在位置不在 python udf 插件默认的库搜索路径中。怎么确认这一点呢?通过以下命令搜索 taospyudf.log: +这是因为 “moment” 所在位置不在 python udf 插件默认的库搜索路径中。怎么确认这一点呢?通过以下命令搜索 taospyudf.log。 ```shell grep 'sys path' taospyudf.log | tail -1 @@ -492,7 +665,7 @@ grep 'sys path' taospyudf.log | tail -1 ``` 发现 python udf 插件默认搜索的第三方库安装路径是: /lib/python3/dist-packages,而 moment 默认安装到了 /usr/local/lib/python3.8/dist-packages。下面我们修改 python udf 插件默认的库搜索路径。 -先打开 python3 命令行,查看当前的 sys.path +先打开 python3 命令行,查看当前的 sys.path。 ```python >>> import sys @@ -500,13 +673,13 @@ grep 'sys path' taospyudf.log | tail -1 '/usr/lib/python3.8:/usr/lib/python3.8/lib-dynload:/usr/local/lib/python3.8/dist-packages:/usr/lib/python3/dist-packages' ``` -复制上面脚本的输出的字符串,然后编辑 /var/taos/taos.cfg 加入以下配置: +复制上面脚本的输出的字符串,然后编辑 /var/taos/taos.cfg 加入以下配置。 ```shell UdfdLdLibPath /usr/lib/python3.8:/usr/lib/python3.8/lib-dynload:/usr/local/lib/python3.8/dist-packages:/usr/lib/python3/dist-packages ``` -保存后执行 systemctl restart taosd, 再测试就不报错了: +保存后执行 systemctl restart taosd, 再测试就不报错了。 ```sql taos> select ts, nextsunday(ts) from t; @@ -522,7 +695,7 @@ Query OK, 4 row(s) in set (1.011474s) #### 示例五 编写一个聚合函数,计算某一列最大值和最小值的差。 -聚合函数与标量函数的区别是:标量函数是多行输入对应多个输出,聚合函数是多行输入对应一个输出。聚合函数的执行过程有点像经典的 map-reduce 框架的执行过程,框架把数据分成若干块,每个 mapper 处理一个块,reducer 再把 mapper 的结果做聚合。不一样的地方在于,对于 TDengine Python UDF 中的 reduce 函数既有 map 的功能又有 reduce 的功能。reduce 函数接受两个参数:一个是自己要处理的数据,一个是别的任务执行 reduce 函数的处理结果。如下面的示例 /root/udf/myspread.py: +聚合函数与标量函数的区别是:标量函数是多行输入对应多个输出,聚合函数是多行输入对应一个输出。聚合函数的执行过程有点像经典的 map-reduce 框架的执行过程,框架把数据分成若干块,每个 mapper 处理一个块,reducer 再把 mapper 的结果做聚合。不一样的地方在于,对于 TDengine Python UDF 中的 reduce 函数既有 map 的功能又有 reduce 的功能。reduce 函数接受两个参数:一个是自己要处理的数据,一个是别的任务执行 reduce 函数的处理结果。如下面的示例 /root/udf/myspread.py。 ```python import io @@ -531,26 +704,21 @@ import pickle LOG_FILE: io.TextIOBase = None - def init(): global LOG_FILE LOG_FILE = open("/var/log/taos/spread.log", "wt") log("init function myspead success") - def log(o): 
LOG_FILE.write(str(o) + '\n') - def destroy(): log("close log file: spread.log") LOG_FILE.close() - def start(): return pickle.dumps((-math.inf, math.inf)) - def reduce(block, buf): max_number, min_number = pickle.loads(buf) log(f"initial max_number={max_number}, min_number={min_number}") @@ -565,26 +733,26 @@ def reduce(block, buf): min_number = v return pickle.dumps((max_number, min_number)) - def finish(buf): max_number, min_number = pickle.loads(buf) return max_number - min_number ``` -在这个示例中我们不光定义了一个聚合函数,还添加记录执行日志的功能,讲解如下: -1. init 函数不再是空函数,而是打开了一个文件用于写执行日志 -2. log 函数是记录日志的工具,自动将传入的对象转成字符串,加换行符输出 -3. destroy 函数用来在执行结束关闭文件 -4. start 返回了初始的 buffer,用来存聚合函数的中间结果,我们把最大值初始化为负无穷大,最小值初始化为正无穷大 -5. reduce 处理每个数据块并聚合结果 -6. finish 函数将最终的 buffer 转换成最终的输出 -执行下面的 SQL语句创建对应的 UDF: +在这个示例中,我们不但定义了一个聚合函数,还增加了记录执行日志的功能。 +1. init 函数打开一个文件用于记录日志 +2. log 函数记录日志,自动将传入的对象转成字符串,加换行符输出 +3. destroy 函数在执行结束后关闭日志文件 +4. start 函数返回初始的 buffer,用来存聚合函数的中间结果,把最大值初始化为负无穷大,最小值初始化为正无穷大 +5. reduce 函数处理每个数据块并聚合结果 +6. finish 函数将 buffer 转换成最终的输出 + +执行下面 SQL 语句创建对应的 UDF。 ```sql create or replace aggregate function myspread as '/root/udf/myspread.py' outputtype double bufsize 128 language 'Python'; ``` -这个 SQL 语句与创建标量函数的 SQL 语句有两个重要区别: +这个 SQL 语句与创建标量函数的 SQL 语句有两个重要区别。 1. 增加了 aggregate 关键字 2. 增加了 bufsize 关键字,用来指定存储中间结果的内存大小,这个数值可以大于实际使用的数值。本例中间结果是两个浮点数组成的 tuple,序列化后实际占用大小只有 32 个字节,但指定的 bufsize 是128,可以用 python 命令行打印实际占用的字节数 @@ -609,7 +777,7 @@ taos> select spread(v1) from t; Query OK, 1 row(s) in set (0.005501s) ``` -最后,查看我们自己打印的执行日志,从日志可以看出,reduce 函数被执行了 3 次。执行过程中 max 值被更新了 4 次, min 值只被更新 1 次。 +最后,查看执行日志,可以看到 reduce 函数被执行了 3 次,执行过程中 max 值被更新了 4 次,min 值只被更新 1 次。 ```shell root@slave11 /var/log/taos $ cat spread.log @@ -627,39 +795,77 @@ close log file: spread.log 通过这个示例,我们学会了如何定义聚合函数,并打印自定义的日志信息。 +### 更多 Python UDF 示例代码 +#### 标量函数示例 [pybitand](https://github.com/taosdata/TDengine/blob/3.0/tests/script/sh/pybitand.py) + +pybitand 实现多列的按位与功能。如果只有一列,返回这一列。pybitand 忽略空值。 + +
+pybitand.py + +```Python +{{#include tests/script/sh/pybitand.py}} +``` + +
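The included file is not reproduced in this diff. As a rough illustration of the behaviour described above (bitwise AND across columns, a single column returned unchanged, nulls ignored), a scalar UDF could look like the following minimal sketch. It relies only on the `init` / `destroy` / `process` interface and the `shape()` / `data(i, j)` methods described earlier. `FakeBlock` is a hypothetical stand-in for the datablock so the sketch can run outside TDengine, and returning `None` for a row whose inputs are all null is an assumption, not necessarily what pybitand.py does.

```python
class FakeBlock:
    """Hypothetical stand-in for the UDF datablock (not part of taospyudf)."""

    def __init__(self, rows):
        self._rows = rows

    def shape(self):
        n_rows = len(self._rows)
        n_cols = len(self._rows[0]) if n_rows else 0
        return n_rows, n_cols

    def data(self, i, j):
        return self._rows[i][j]


def init():
    pass


def destroy():
    pass


def process(block):
    rows, cols = block.shape()
    result = []
    for i in range(rows):
        acc = None
        for j in range(cols):
            v = block.data(i, j)
            if v is None:
                continue  # nulls are ignored
            acc = v if acc is None else acc & v
        result.append(acc)  # one output value per input row
    return result


if __name__ == "__main__":
    print(process(FakeBlock([[6, 3], [5, None], [None, None]])))
```

The function returns exactly one value per input row, which matches the rule stated earlier that a scalar `process` must return as many rows as it receives.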
+ +#### 聚合函数示例 [pyl2norm](https://github.com/taosdata/TDengine/blob/3.0/tests/script/sh/pyl2norm.py) + +pyl2norm 实现了输入列的所有数据的二阶范数,即对每个数据先平方,再累加求和,最后开方。 + +
+pyl2norm.py + +```python +{{#include tests/script/sh/pyl2norm.py}} +``` + +
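Since the included file is not visible in this diff, the following is a minimal sketch of how an L2-norm aggregate might be written against the `start` / `reduce` / `finish` interface described earlier. It is not a copy of pyl2norm.py. `FakeBlock` is a hypothetical stand-in for the datablock, and skipping null values and using `pickle` for the intermediate state are assumptions made for illustration.

```python
import math
import pickle


class FakeBlock:
    """Hypothetical stand-in for the UDF datablock (not part of taospyudf)."""

    def __init__(self, rows):
        self._rows = rows

    def shape(self):
        n_rows = len(self._rows)
        n_cols = len(self._rows[0]) if n_rows else 0
        return n_rows, n_cols

    def data(self, i, j):
        return self._rows[i][j]


def init():
    pass


def destroy():
    pass


def start():
    # Intermediate state: running sum of squares, initially 0.
    return pickle.dumps(0.0)


def reduce(block, buf):
    sum_squares = pickle.loads(buf)
    rows, cols = block.shape()
    for i in range(rows):
        for j in range(cols):
            v = block.data(i, j)
            if v is not None:  # assumed: nulls are skipped
                sum_squares += v * v
    return pickle.dumps(sum_squares)


def finish(buf):
    # Final result: square root of the accumulated sum of squares.
    return math.sqrt(pickle.loads(buf))


if __name__ == "__main__":
    buf = start()
    for block in (FakeBlock([[3.0]]), FakeBlock([[4.0]])):
        buf = reduce(block, buf)
    print(finish(buf))
```

Only the sum of squares is carried between blocks; the square root is taken once in `finish`, so the result does not depend on how the rows are split into blocks.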
+ +#### 聚合函数示例 [pycumsum](https://github.com/taosdata/TDengine/blob/3.0/tests/script/sh/pycumsum.py) + +pycumsum 使用 numpy 计算输入列所有数据的累积和。 +
+pycumsum.py + +```python +{{#include tests/script/sh/pycumsum.py}} +``` + +
+ ## 管理 UDF -在集群中管理UDF的过程涉及创建、使用和维护这些函数。用户可以通过SQL在集群中创建和管理UDF,一旦创建成功,集群的所有用户都可以在SQL中使用这些函数。由于UDF存储在集群的mnode上,因此即使重启集群,已经创建的UDF也仍然可用。 +在集群中管理 UDF 的过程涉及创建、使用和维护这些函数。用户可以通过 SQL 在集群中创建和管理 UDF,一旦创建成功,集群的所有用户都可以在 SQL 中使用这些函数。由于 UDF 存储在集群的 mnode 上,因此即使重启集群,已经创建的 UDF 也仍然可用。 -在创建UDF时,需要区分标量函数和聚合函数。标量函数接受零个或多个输入参数,并返回一个单一的值。聚合函数接受一组输入值,并通过对这些值进行某种计算(如求和、计数等)来返回一个单一的值。如果创建时声明了错误的函数类别,则通过SQL调用函数时会报错。 +在创建 UDF 时,需要区分标量函数和聚合函数。标量函数接受零个或多个输入参数,并返回一个单一的值。聚合函数接受一组输入值,并通过对这些值进行某种计算(如求和、计数等)来返回一个单一的值。如果创建时声明了错误的函数类别,则通过 SQL 调用函数时会报错。 -此外,用户需要确保输入数据类型与UDF程序匹配,UDF输出的数据类型与outputtype匹配。这意味着在创建UDF时,需要为输入参数和输出值指定正确的数据类型。这有助于确保在调用UDF时,输入数据能够正确地传递给UDF,并且UDF的输出值与预期的数据类型相匹配。 +此外,用户需要确保输入数据类型与 UDF 程序匹配,UDF 输出的数据类型与 outputtype 匹配。这意味着在创建 UDF 时,需要为输入参数和输出值指定正确的数据类型。这有助于确保在调用 UDF 时,输入数据能够正确地传递给 UDF,并且 UDF 的输出值与预期的数据类型相匹配。 ### 创建标量函数 -创建标量函数的SQL语法如下。 +创建标量函数的 SQL 语法如下。 ```sql -CREATE FUNCTION function_name AS library_path OUTPUTTYPE output_type LANGUAGE 'Python'; +CREATE OR REPLACE FUNCTION function_name AS library_path OUTPUTTYPE output_type LANGUAGE 'Python'; ``` 各参数说明如下。 - or replace:如果函数已经存在,则会修改已有的函数属性。 - function_name:标量函数在SQL中被调用时的函数名。 -- language:支持C语言和Python语言(3.7及以上版本),默认为C。 -- library_path:如果编程语言是C,则路径是包含UDF实现的动态链接库的库文件绝对路径,通常指向一个so文件。如果编程语言是Python,则路径是包含UDF -实现的Python文件路径。路径需要用英文单引号或英文双引号括起来。 +- language:支持 C 语言和 Python 语言(3.7 及以上版本),默认为 C。 +- library_path:如果编程语言是 C,则路径是包含 UDF 实现的动态链接库的库文件绝对路径,通常指向一个 so 文件。如果编程语言是 Python,则路径是包含 UDF +实现的 Python 文件路径。路径需要用英文单引号或英文双引号括起来。 - output_type:函数计算结果的数据类型名称。 - ### 创建聚合函数 -创建聚合函数的SQL语法如下。 +创建聚合函数的 SQL 语法如下。 ```sql -CREATE AGGREGATE FUNCTION function_name library_path OUTPUTTYPE output_type LANGUAGE 'Python'; +CREATE OR REPLACE AGGREGATE FUNCTION function_name library_path OUTPUTTYPE output_type LANGUAGE 'Python'; ``` 其中,buffer_size 表示中间计算结果的缓冲区大小,单位是字节。其他参数的含义与标量函数相同。 -如下SQL创建一个名为 l2norm 的UDF。 +如下 SQL 创建一个名为 l2norm 的 UDF。 ```sql CREATE AGGREGATE FUNCTION l2norm AS "/home/taos/udf_example/libl2norm.so" OUTPUTTYPE DOUBLE bufsize 8; ``` @@ -673,8 
+879,15 @@ DROP FUNCTION function_name; ### 查看 UDF -显示集群中当前可用的所有UDF的SQL如下。 +显示集群中当前可用的所有 UDF 的 SQL 如下。 ```sql show functions; ``` +### 查看函数信息 + +同名的 UDF 每更新一次,版本号会增加 1。 +```sql +select * from ins_functions \G; +``` + diff --git a/docs/zh/14-reference/01-components/01-taosd.md b/docs/zh/14-reference/01-components/01-taosd.md index 02cf2155a0..994f557a17 100644 --- a/docs/zh/14-reference/01-components/01-taosd.md +++ b/docs/zh/14-reference/01-components/01-taosd.md @@ -192,13 +192,13 @@ charset 的有效值是 UTF-8。 ### 压缩参数 -| 参数名称 | 参数说明 | -| :-------------: | :----------------------------------------------------------------------------------------------------------------------------------------------: | -| compressMsgSize | 是否对 RPC 消息进行压缩;-1: 所有消息都不压缩; 0: 所有消息都压缩; N (N>0): 只有大于 N 个字节的消息才压缩;缺省值 -1 | -| fPrecision | 设置 float 类型浮点数压缩精度 ,取值范围:0.1 ~ 0.00000001 ,默认值 0.00000001 , 小于此值的浮点数尾数部分将被截断 | -| dPrecision | 设置 double 类型浮点数压缩精度 , 取值范围:0.1 ~ 0.0000000000000001 , 缺省值 0.0000000000000001 , 小于此值的浮点数尾数部分将被截取 | -| lossyColumn | 对 float 和/或 double 类型启用 TSZ 有损压缩;取值范围: float, double, none;缺省值: none,表示关闭无损压缩 | -| ifAdtFse | 在启用 TSZ 有损压缩时,使用 FSE 算法替换 HUFFMAN 算法, FSE 算法压缩速度更快,但解压稍慢,追求压缩速度可选用此算法; 0: 关闭,1:打开;默认值为 0 | +| 参数名称 | 参数说明 | +|:-------------:|:----------------------------------------------------------------:| +| compressMsgSize | 是否对 RPC 消息进行压缩;-1: 所有消息都不压缩; 0: 所有消息都压缩; N (N>0): 只有大于 N 个字节的消息才压缩;缺省值 -1 | +| fPrecision | 设置 float 类型浮点数压缩精度 ,取值范围:0.1 ~ 0.00000001 ,默认值 0.00000001 , 小于此值的浮点数尾数部分将被截断 | +|dPrecision | 设置 double 类型浮点数压缩精度 , 取值范围:0.1 ~ 0.0000000000000001 , 缺省值 0.0000000000000001 , 小于此值的浮点数尾数部分将被截取 | +|lossyColumn | 对 float 和/或 double 类型启用 TSZ 有损压缩;取值范围: float, double, none;缺省值: none,表示关闭无损压缩。**注意:此参数在 3.3.0.0 及更高版本中不再使用** | +|ifAdtFse | 在启用 TSZ 有损压缩时,使用 FSE 算法替换 HUFFMAN 算法, FSE 算法压缩速度更快,但解压稍慢,追求压缩速度可选用此算法; 0: 关闭,1:打开;默认值为 0 | **补充说明** diff --git a/docs/zh/14-reference/01-components/04-taosx.md b/docs/zh/14-reference/01-components/04-taosx.md index e0313d79cb..e378c18800 
100644 --- a/docs/zh/14-reference/01-components/04-taosx.md +++ b/docs/zh/14-reference/01-components/04-taosx.md @@ -3,7 +3,7 @@ title: taosX 参考手册 sidebar_label: taosX --- -taosX 是 TDengine 中的一个核心组件,提供零代码数据接入的能力,taosX 支持两种运行模式:服务模式和命令行模式。本节讲述如何以这两种方式使用 taosX。要想使用 taosX 需要先安装 TDengine Enterprise 安装包。 +taosX 是 TDengine Enterprise 中的一个核心组件,提供零代码数据接入的能力,taosX 支持两种运行模式:服务模式和命令行模式。本节讲述如何以这两种方式使用 taosX。要想使用 taosX 需要先安装 TDengine Enterprise 安装包。 ## 命令行模式 diff --git a/docs/zh/14-reference/01-components/05-taosx-agent.md b/docs/zh/14-reference/01-components/05-taosx-agent.md index 0fc1e825aa..da1c395b3d 100644 --- a/docs/zh/14-reference/01-components/05-taosx-agent.md +++ b/docs/zh/14-reference/01-components/05-taosx-agent.md @@ -3,7 +3,7 @@ title: taosX-Agent 参考手册 sidebar_label: taosX-Agent --- -本节讲述如何部署 `Agent` (for `taosX`)。使用之前需要安装 TDengine Enterprise 安装包之后。 +本节讲述如何部署 `Agent` (for `taosX`)。使用之前需要安装 TDengine Enterprise 安装包之后,taosX-Agent 用于在部分数据接入场景,如 Pi, OPC UA, OPC DA 等对访问数据源有一定限制或者网络环境特殊的场景下,可以将 taosX-Agent 部署在靠近数据源的环境中甚至与数据源在相同的服务器上,由 taosX-Agent 负责从数据源读取数据并发送给 taosX。 ## 配置 diff --git a/docs/zh/14-reference/01-components/07-explorer.md b/docs/zh/14-reference/01-components/07-explorer.md index 5e619ea119..5a84490b81 100644 --- a/docs/zh/14-reference/01-components/07-explorer.md +++ b/docs/zh/14-reference/01-components/07-explorer.md @@ -4,27 +4,27 @@ sidebar_label: taosExplorer toc_max_heading_level: 4 --- -taos-explorer 是一个为用户提供 TDengine 实例的可视化管理交互工具的 web 服务。本节主要讲述其安装和部署。它的各项功能都是基于简单易上手的图形界面,可以直接尝试,如果有需要也可以考高级功能和运维指南中的相关内容。 +taosExplorer 是一个为用户提供 TDengine 实例的可视化管理交互工具的 web 服务。本节主要讲述其安装和部署。它的各项功能都是基于简单易上手的图形界面,可以直接尝试,如果有需要也可以考高级功能和运维指南中的相关内容。 ## 安装 -taos-explorer 无需单独安装,从 TDengine 3.3.0.0 版本开始,它随着 TDengine Enterprise Server 安装包一起发布,安装完成后,就可以看到 `taos-explorer` 服务。 +taosEexplorer 无需单独安装,从 TDengine 3.3.0.0 版本开始,它随着 TDengine Enterprise Server 安装包一起发布,安装完成后,就可以看到 `taos-explorer` 服务。 ## 配置 -在启动 Explorer 之前,请确保配置文件中的内容正确。 +在启动 taosExplorer 之前,请确保配置文件中的内容正确。 ```TOML -# Explorer 
listen port +# listen port port = 6060 -# Explorer listen address for IPv4 +# listen address for IPv4 addr = "0.0.0.0" -# Explorer listen address for IPv4 +# listen address for IPv4 #ipv6 = "::1" -# Explorer log level. Possible: error,warn,info,debug,trace +# log level. Possible: error,warn,info,debug,trace log_level = "info" # taosAdapter address. @@ -49,9 +49,9 @@ cors = false 说明: -- `port`:Explorer 服务绑定的端口。 -- `addr`:Explorer 服务绑定的 IPv4 地址,默认为 `0.0.0.0`。如需修改,请配置为 `localhost` 之外的地址以对外提供服务。 -- `ipv6`:Explorer 服务绑定的 IPv6 地址,默认不绑定 IPv6 地址。 +- `port`:taosExplorer 服务绑定的端口。 +- `addr`:taosExplorer 服务绑定的 IPv4 地址,默认为 `0.0.0.0`。如需修改,请配置为 `localhost` 之外的地址以对外提供服务。 +- `ipv6`:taosExplorer 服务绑定的 IPv6 地址,默认不绑定 IPv6 地址。 - `log_level`:日志级别,可选值为 "error", "warn", "info", "debug", "trace"。 - `cluster`:TDengine 集群的 taosAdapter 地址。 - `x_api`:taosX 的 gRPC 地址。 @@ -62,7 +62,7 @@ cors = false ## 启动停止 -然后启动 Explorer,可以直接在命令行执行 taos-explorer 或者使用 systemctl 命令: +然后启动 taosExplorer,可以直接在命令行执行 taos-explorer 或者使用 systemctl 命令: ```bash systemctl start taos-explorer # Linux @@ -78,7 +78,7 @@ sc.exe stop taos-explorer # Windows ## 问题排查 1. 当通过浏览器打开 Explorer 站点遇到“无法访问此网站”的错误信息时,请通过命令行登录 taosExplorer 所在机器,并使用命令 `systemctl status taos-explorer` 检查服务的状态,如果返回的状态是 `inactive`,请使用命令`systemctl start taos-explorer` 启动服务。 -2. 如果需要获取 Explorer 的详细日志,可通过命令 `journalctl -u taos-explorer`。 +2. 如果需要获取 taosExplorer 的详细日志,可通过命令 `journalctl -u taos-explorer`。 3. 
当使用 Nginx 或其他工具进行转发时,注意进行 CORS 设置或在配置文件中使用 `cors = true`。 这是一个 Nginx 配置文件 CORS 设置的例子: diff --git a/docs/zh/14-reference/01-components/index.md b/docs/zh/14-reference/01-components/index.md index 4f26d202cc..65f3a10eca 100644 --- a/docs/zh/14-reference/01-components/index.md +++ b/docs/zh/14-reference/01-components/index.md @@ -1,6 +1,13 @@ ---- -sidebar_label: 产品组件 -title: 产品组件 -toc_max_heading_level: 4 -description: TDengine 产品组件参考手册 ---- \ No newline at end of file +--- +title: 产品组件 +description: TDengine 产品组件参考手册 +--- + +本节详细说明 TDengine 中的主要产品组件的功能、命令行参数、配置参数等。 + +```mdx-code-block +import DocCardList from '@theme/DocCardList'; +import {useCurrentSidebarCategory} from '@docusaurus/theme-common'; + + +``` \ No newline at end of file diff --git a/docs/zh/14-reference/03-taos-sql/02-database.md b/docs/zh/14-reference/03-taos-sql/02-database.md index 8a366011ae..d2e9ba0646 100644 --- a/docs/zh/14-reference/03-taos-sql/02-database.md +++ b/docs/zh/14-reference/03-taos-sql/02-database.md @@ -13,26 +13,26 @@ database_options: database_option ... 
database_option: { - BUFFER value + VGROUPS value + | PRECISION {'ms' | 'us' | 'ns'} + | REPLICA value + | BUFFER value + | PAGES value + | PAGESIZE value | CACHEMODEL {'none' | 'last_row' | 'last_value' | 'both'} | CACHESIZE value | COMP {0 | 1 | 2} | DURATION value - | WAL_FSYNC_PERIOD value | MAXROWS value | MINROWS value | KEEP value - | PAGES value - | PAGESIZE value - | PRECISION {'ms' | 'us' | 'ns'} - | REPLICA value - | WAL_LEVEL {1 | 2} - | VGROUPS value - | SINGLE_STABLE {0 | 1} | STT_TRIGGER value + | SINGLE_STABLE {0 | 1} | TABLE_PREFIX value | TABLE_SUFFIX value | TSDB_PAGESIZE value + | WAL_LEVEL {1 | 2} + | WAL_FSYNC_PERIOD value | WAL_RETENTION_PERIOD value | WAL_RETENTION_SIZE value } @@ -40,7 +40,14 @@ database_option: { ### 参数说明 +- VGROUPS:数据库中初始 vgroup 的数目。 +- PRECISION:数据库的时间戳精度。ms 表示毫秒,us 表示微秒,ns 表示纳秒,默认 ms 毫秒。 +- REPLICA:表示数据库副本数,取值为 1、2 或 3,默认为 1; 2 仅在企业版 3.3.0.0 及以后版本中可用。在集群中使用,副本数必须小于或等于 DNODE 的数目。且使用时存在以下限制: + - 暂不支持对双副本数据库相关 Vgroup 进行 SPLITE VGROUP 或 REDISTRIBUTE VGROUP 操作 + - 单副本数据库可变更为双副本数据库,但不支持从双副本变更为其它副本数,也不支持从三副本变更为双副本 - BUFFER: 一个 VNODE 写入内存池大小,单位为 MB,默认为 256,最小为 3,最大为 16384。 +- PAGES:一个 VNODE 中元数据存储引擎的缓存页个数,默认为 256,最小 64。一个 VNODE 元数据存储占用 PAGESIZE \* PAGES,默认情况下为 1MB 内存。 +- PAGESIZE:一个 VNODE 中元数据存储引擎的页大小,单位为 KB,默认为 4 KB。范围为 1 到 16384,即 1 KB 到 16 MB。 - CACHEMODEL:表示是否在内存中缓存子表的最近数据。默认为 none。 - none:表示不缓存。 - last_row:表示缓存子表最近一行数据。这将显著改善 LAST_ROW 函数的性能表现。 @@ -53,27 +60,20 @@ database_option: { - 1:表示一阶段压缩。 - 2:表示两阶段压缩。 - DURATION:数据文件存储数据的时间跨度。可以使用加单位的表示形式,如 DURATION 100h、DURATION 10d 等,支持 m(分钟)、h(小时)和 d(天)三个单位。不加时间单位时默认单位为天,如 DURATION 50 表示 50 天。 -- WAL_FSYNC_PERIOD:当 WAL_LEVEL 参数设置为 2 时,用于设置落盘的周期。默认为 3000,单位毫秒。最小为 0,表示每次写入立即落盘;最大为 180000,即三分钟。 - MAXROWS:文件块中记录的最大条数,默认为 4096 条。 - MINROWS:文件块中记录的最小条数,默认为 100 条。 - KEEP:表示数据文件保存的天数,缺省值为 3650,取值范围 [1, 365000],且必须大于或等于3倍的 DURATION 参数值。数据库会自动删除保存时间超过 KEEP 值的数据。KEEP 可以使用加单位的表示形式,如 KEEP 100h、KEEP 10d 等,支持 m(分钟)、h(小时)和 d(天)三个单位。也可以不写单位,如 KEEP 
50,此时默认单位为天。企业版支持[多级存储](https://docs.taosdata.com/tdinternal/arch/#%E5%A4%9A%E7%BA%A7%E5%AD%98%E5%82%A8)功能, 因此, 可以设置多个保存时间(多个以英文逗号分隔,最多 3 个,满足 keep 0 \<= keep 1 \<= keep 2,如 KEEP 100h,100d,3650d); 社区版不支持多级存储功能(即使配置了多个保存时间, 也不会生效, KEEP 会取最大的保存时间)。 -- PAGES:一个 VNODE 中元数据存储引擎的缓存页个数,默认为 256,最小 64。一个 VNODE 元数据存储占用 PAGESIZE \* PAGES,默认情况下为 1MB 内存。 -- PAGESIZE:一个 VNODE 中元数据存储引擎的页大小,单位为 KB,默认为 4 KB。范围为 1 到 16384,即 1 KB 到 16 MB。 -- PRECISION:数据库的时间戳精度。ms 表示毫秒,us 表示微秒,ns 表示纳秒,默认 ms 毫秒。 -- REPLICA:表示数据库副本数,取值为 1、2 或 3,默认为 1; 2 仅在企业版 3.3.0.0 及以后版本中可用。在集群中使用,副本数必须小于或等于 DNODE 的数目。且使用时存在以下限制: - - 暂不支持对双副本数据库相关 Vgroup 进行 SPLITE VGROUP 或 REDISTRIBUTE VGROUP 操作 - - 单副本数据库可变更为双副本数据库,但不支持从双副本变更为其它副本数,也不支持从三副本变更为双副本 -- WAL_LEVEL:WAL 级别,默认为 1。 - - 1:写 WAL,但不执行 fsync。 - - 2:写 WAL,而且执行 fsync。 -- VGROUPS:数据库中初始 vgroup 的数目。 +- STT_TRIGGER:表示落盘文件触发文件合并的个数。默认为 1,范围 1 到 16。对于少表高频场景,此参数建议使用默认配置,或较小的值;而对于多表低频场景,此参数建议配置较大的值。 - SINGLE_STABLE:表示此数据库中是否只可以创建一个超级表,用于超级表列非常多的情况。 - 0:表示可以创建多张超级表。 - 1:表示只可以创建一张超级表。 -- STT_TRIGGER:表示落盘文件触发文件合并的个数。默认为 1,范围 1 到 16。对于少表高频场景,此参数建议使用默认配置,或较小的值;而对于多表低频场景,此参数建议配置较大的值。 - TABLE_PREFIX:当其为正值时,在决定把一个表分配到哪个 vgroup 时要忽略表名中指定长度的前缀;当其为负值时,在决定把一个表分配到哪个 vgroup 时只使用表名中指定长度的前缀;例如,假定表名为 "v30001",当 TSDB_PREFIX = 2 时 使用 "0001" 来决定分配到哪个 vgroup ,当 TSDB_PREFIX = -2 时使用 "v3" 来决定分配到哪个 vgroup - TABLE_SUFFIX:当其为正值时,在决定把一个表分配到哪个 vgroup 时要忽略表名中指定长度的后缀;当其为负值时,在决定把一个表分配到哪个 vgroup 时只使用表名中指定长度的后缀;例如,假定表名为 "v30001",当 TSDB_SUFFIX = 2 时 使用 "v300" 来决定分配到哪个 vgroup ,当 TSDB_SUFFIX = -2 时使用 "01" 来决定分配到哪个 vgroup。 - TSDB_PAGESIZE:一个 VNODE 中时序数据存储引擎的页大小,单位为 KB,默认为 4 KB。范围为 1 到 16384,即 1 KB到 16 MB。 +- WAL_LEVEL:WAL 级别,默认为 1。 + - 1:写 WAL,但不执行 fsync。 + - 2:写 WAL,而且执行 fsync。 +- WAL_FSYNC_PERIOD:当 WAL_LEVEL 参数设置为 2 时,用于设置落盘的周期。默认为 3000,单位毫秒。最小为 0,表示每次写入立即落盘;最大为 180000,即三分钟。 - WAL_RETENTION_PERIOD: 为了数据订阅消费,需要WAL日志文件额外保留的最大时长策略。WAL日志清理,不受订阅客户端消费状态影响。单位为 s。默认为 3600,表示在 WAL 保留最近 3600 秒的数据,请根据数据订阅的需要修改这个参数为适当值。 - WAL_RETENTION_SIZE:为了数据订阅消费,需要WAL日志文件额外保留的最大累计大小策略。单位为 KB。默认为 0,表示累计大小无上限。 ### 创建数据库示例 diff 
--git a/docs/zh/14-reference/03-taos-sql/21-node.md b/docs/zh/14-reference/03-taos-sql/21-node.md index 1f57411838..967cb51127 100644 --- a/docs/zh/14-reference/03-taos-sql/21-node.md +++ b/docs/zh/14-reference/03-taos-sql/21-node.md @@ -1,7 +1,7 @@ --- -sidebar_label: 集群管理 -title: 集群管理 -description: 管理集群的 SQL 命令的详细解析 +sidebar_label: 节点管理 +title: 节点管理 +description: 管理集群节点的 SQL 命令的详细解析 --- 组成 TDengine 集群的物理实体是 dnode (data node 的缩写),它是一个运行在操作系统之上的进程。在 dnode 中可以建立负责时序数据存储的 vnode (virtual node),在多节点集群环境下当某个数据库的 replica 为 3 时,该数据库中的每个 vgroup 由 3 个 vnode 组成;当数据库的 replica 为 1 时,该数据库中的每个 vgroup 由 1 个 vnode 组成。如果要想配置某个数据库为多副本,则集群中的 dnode 数量至少为 3。在 dnode 还可以创建 mnode (management node),单个集群中最多可以创建三个 mnode。在 TDengine 3.0.0.0 中为了支持存算分离,引入了一种新的逻辑节点 qnode (query node),qnode 和 vnode 既可以共存在一个 dnode 中,也可以完全分离在不同的 dnode 上。 diff --git a/docs/zh/14-reference/03-taos-sql/30-join.md b/docs/zh/14-reference/03-taos-sql/30-join.md index c0daeb41c0..60b634d310 100644 --- a/docs/zh/14-reference/03-taos-sql/30-join.md +++ b/docs/zh/14-reference/03-taos-sql/30-join.md @@ -202,7 +202,7 @@ SELECT ... FROM table_name1 LEFT|RIGHT ASOF JOIN table_name2 [ON ...] 
[JLIMIT jl 表 d1001 电压值大于 220V 且表 d1002 中同一时刻或稍早前最后时刻出现电压大于 220V 的时间及各自的电压值: ```sql -SELECT a.ts, a.voltage, a.ts, b.voltage FROM d1001 a LEFT ASOF JOIN d1002 b ON a.ts >= b.ts where a.voltage > 220 and b.voltage > 220 +SELECT a.ts, a.voltage, b.ts, b.voltage FROM d1001 a LEFT ASOF JOIN d1002 b ON a.ts >= b.ts where a.voltage > 220 and b.voltage > 220 ``` ### Left/Right Window Join diff --git a/docs/zh/14-reference/05-connector/index.md b/docs/zh/14-reference/05-connector/index.md index f9e1bd837d..5c58a4e7bc 100644 --- a/docs/zh/14-reference/05-connector/index.md +++ b/docs/zh/14-reference/05-connector/index.md @@ -123,3 +123,9 @@ import VerifyMacOS from "./_verify_macos.mdx"; +```mdx-code-block +import DocCardList from '@theme/DocCardList'; +import {useCurrentSidebarCategory} from '@docusaurus/theme-common'; + + +``` \ No newline at end of file diff --git a/docs/zh/10-tdinternal/01-arch.md b/docs/zh/26-tdinternal/01-arch.md similarity index 99% rename from docs/zh/10-tdinternal/01-arch.md rename to docs/zh/26-tdinternal/01-arch.md index c45ae56bb6..04e47797a8 100644 --- a/docs/zh/10-tdinternal/01-arch.md +++ b/docs/zh/26-tdinternal/01-arch.md @@ -178,7 +178,7 @@ TDengine 集群可以容纳单个、多个甚至几千个数据节点。应用 TDengine 存储的数据包括采集的时序数据以及库、表相关的元数据、标签数据等,这些数据具体分为三部分: -- 时序数据:TDengine 的核心存储对象,存放于 vnode 里,由 data、head 和 last 三个文件组成,数据量大,查询量取决于应用场景。容许乱序写入,但暂时不支持删除操作,并且仅在 update 参数设置为 1 时允许更新操作。通过采用一个采集点一张表的模型,一个时间段的数据是连续存储,对单张表的写入是简单的追加操作,一次读,可以读到多条记录,这样保证对单个采集点的插入和查询操作,性能达到最优。 +- 时序数据:TDengine 的核心存储对象,存放于 vnode 里,由 data、head 和 last 三个文件组成,数据量大,查询量取决于应用场景。允许乱序写入,但暂时不支持删除操作,并且仅在 update 参数设置为 1 时允许更新操作。通过采用一个采集点一张表的模型,一个时间段的数据是连续存储,对单张表的写入是简单的追加操作,一次读,可以读到多条记录,这样保证对单个采集点的插入和查询操作,性能达到最优。 - 数据表元数据:包含标签信息和 Table Schema 信息,存放于 vnode 里的 meta 文件,支持增删改查四个标准操作。数据量很大,有 N 张表,就有 N 条记录,因此采用 LRU 存储,支持标签数据的索引。TDengine 支持多核多线程并发查询。只要计算内存足够,元数据全内存存储,千万级别规模的标签数据过滤结果能毫秒级返回。在内存资源不足的情况下,仍然可以支持数千万张表的快速查询。 - 数据库元数据:存放于 mnode 里,包含系统节点、用户、DB、STable Schema 
等信息,支持增删改查四个标准操作。这部分数据的量不大,可以全内存保存,而且由于客户端有缓存,查询量也不大。因此目前的设计虽是集中式存储管理,但不会构成性能瓶颈。 @@ -301,7 +301,7 @@ TDengine 采用了一种数据驱动的策略来实现缓存数据的持久化 对于采集的数据,通常会有一定的保留期限,该期限由数据库参数 keep 指定。超出设定天数的数据文件将被集群自动移除,并释放相应的存储空间。 -当设置 duration 和 keep 两个参数后,一个处于典型工作状态的 vnode 中,总的数据文件数量应为向上取整 (keep/duration)+1 个。数据文件的总个数应保持在一个合理的范围内,不宜过多也不宜过少,通常介于 10 到 100 之间较为适宜。基于这一原则,可以合理设置 duration 参数。在本书编写时的版本中,可以调整参数 keep,但参数 duration 一旦设定,则无法更改。 +当设置 duration 和 keep 两个参数后,一个处于典型工作状态的 vnode 中,总的数据文件数量应为向上取整 (keep/duration)+1 个。数据文件的总个数应保持在一个合理的范围内,不宜过多也不宜过少,通常介于 10 到 100 之间较为适宜。基于这一原则,可以合理设置 duration 参数。可以调整参数 keep,但参数 duration 一旦设定,则无法更改。 在每个数据文件中,表的数据是以块的形式存储的。一张表可能包含一到多个数据文件块。在一个文件块内,数据采用列式存储,占据连续的存储空间,这有助于显著提高读取度。文件块的大小由数据库参数 maxRows(每块最大记录条数)控制,默认值为 4096。这个值应适中,过大可能导致定位特定时间段数据的搜索时间变长,影响读取速度;过小则可能导致数据文件块的索引过大,压缩效率降低,同样影响读取速度。 diff --git a/docs/zh/10-tdinternal/03-storage.md b/docs/zh/26-tdinternal/03-storage.md similarity index 100% rename from docs/zh/10-tdinternal/03-storage.md rename to docs/zh/26-tdinternal/03-storage.md diff --git a/docs/zh/10-tdinternal/05-query.md b/docs/zh/26-tdinternal/05-query.md similarity index 100% rename from docs/zh/10-tdinternal/05-query.md rename to docs/zh/26-tdinternal/05-query.md diff --git a/docs/zh/10-tdinternal/07-topic.md b/docs/zh/26-tdinternal/07-topic.md similarity index 100% rename from docs/zh/10-tdinternal/07-topic.md rename to docs/zh/26-tdinternal/07-topic.md diff --git a/docs/zh/10-tdinternal/09-stream.md b/docs/zh/26-tdinternal/09-stream.md similarity index 100% rename from docs/zh/10-tdinternal/09-stream.md rename to docs/zh/26-tdinternal/09-stream.md diff --git a/docs/zh/10-tdinternal/11-compress.md b/docs/zh/26-tdinternal/11-compress.md similarity index 100% rename from docs/zh/10-tdinternal/11-compress.md rename to docs/zh/26-tdinternal/11-compress.md diff --git a/docs/zh/10-tdinternal/_category_.yml b/docs/zh/26-tdinternal/_category_.yml similarity index 100% rename from docs/zh/10-tdinternal/_category_.yml rename to 
docs/zh/26-tdinternal/_category_.yml diff --git a/docs/zh/10-tdinternal/aggquery.png b/docs/zh/26-tdinternal/aggquery.png similarity index 100% rename from docs/zh/10-tdinternal/aggquery.png rename to docs/zh/26-tdinternal/aggquery.png diff --git a/docs/zh/10-tdinternal/brin.png b/docs/zh/26-tdinternal/brin.png similarity index 100% rename from docs/zh/10-tdinternal/brin.png rename to docs/zh/26-tdinternal/brin.png diff --git a/docs/zh/10-tdinternal/btree.png b/docs/zh/26-tdinternal/btree.png similarity index 100% rename from docs/zh/10-tdinternal/btree.png rename to docs/zh/26-tdinternal/btree.png diff --git a/docs/zh/10-tdinternal/btreepage.png b/docs/zh/26-tdinternal/btreepage.png similarity index 100% rename from docs/zh/10-tdinternal/btreepage.png rename to docs/zh/26-tdinternal/btreepage.png diff --git a/docs/zh/10-tdinternal/cache.png b/docs/zh/26-tdinternal/cache.png similarity index 100% rename from docs/zh/10-tdinternal/cache.png rename to docs/zh/26-tdinternal/cache.png diff --git a/docs/zh/10-tdinternal/column.png b/docs/zh/26-tdinternal/column.png similarity index 100% rename from docs/zh/10-tdinternal/column.png rename to docs/zh/26-tdinternal/column.png diff --git a/docs/zh/10-tdinternal/compression.png b/docs/zh/26-tdinternal/compression.png similarity index 100% rename from docs/zh/10-tdinternal/compression.png rename to docs/zh/26-tdinternal/compression.png diff --git a/docs/zh/10-tdinternal/consumer.png b/docs/zh/26-tdinternal/consumer.png similarity index 100% rename from docs/zh/10-tdinternal/consumer.png rename to docs/zh/26-tdinternal/consumer.png diff --git a/docs/zh/10-tdinternal/consuming.png b/docs/zh/26-tdinternal/consuming.png similarity index 100% rename from docs/zh/10-tdinternal/consuming.png rename to docs/zh/26-tdinternal/consuming.png diff --git a/docs/zh/10-tdinternal/dnode.webp b/docs/zh/26-tdinternal/dnode.webp similarity index 100% rename from docs/zh/10-tdinternal/dnode.webp rename to docs/zh/26-tdinternal/dnode.webp diff 
--git a/docs/zh/10-tdinternal/fileset.png b/docs/zh/26-tdinternal/fileset.png similarity index 100% rename from docs/zh/10-tdinternal/fileset.png rename to docs/zh/26-tdinternal/fileset.png diff --git a/docs/zh/10-tdinternal/index.md b/docs/zh/26-tdinternal/index.md similarity index 100% rename from docs/zh/10-tdinternal/index.md rename to docs/zh/26-tdinternal/index.md diff --git a/docs/zh/10-tdinternal/key-value.png b/docs/zh/26-tdinternal/key-value.png similarity index 100% rename from docs/zh/10-tdinternal/key-value.png rename to docs/zh/26-tdinternal/key-value.png diff --git a/docs/zh/10-tdinternal/message.webp b/docs/zh/26-tdinternal/message.webp similarity index 100% rename from docs/zh/10-tdinternal/message.webp rename to docs/zh/26-tdinternal/message.webp diff --git a/docs/zh/10-tdinternal/meta.png b/docs/zh/26-tdinternal/meta.png similarity index 100% rename from docs/zh/10-tdinternal/meta.png rename to docs/zh/26-tdinternal/meta.png diff --git a/docs/zh/10-tdinternal/modules.webp b/docs/zh/26-tdinternal/modules.webp similarity index 100% rename from docs/zh/10-tdinternal/modules.webp rename to docs/zh/26-tdinternal/modules.webp diff --git a/docs/zh/10-tdinternal/multi_tables.webp b/docs/zh/26-tdinternal/multi_tables.webp similarity index 100% rename from docs/zh/10-tdinternal/multi_tables.webp rename to docs/zh/26-tdinternal/multi_tables.webp diff --git a/docs/zh/10-tdinternal/offset.png b/docs/zh/26-tdinternal/offset.png similarity index 100% rename from docs/zh/10-tdinternal/offset.png rename to docs/zh/26-tdinternal/offset.png diff --git a/docs/zh/10-tdinternal/rebalance.png b/docs/zh/26-tdinternal/rebalance.png similarity index 100% rename from docs/zh/10-tdinternal/rebalance.png rename to docs/zh/26-tdinternal/rebalance.png diff --git a/docs/zh/10-tdinternal/replica-forward.webp b/docs/zh/26-tdinternal/replica-forward.webp similarity index 100% rename from docs/zh/10-tdinternal/replica-forward.webp rename to 
docs/zh/26-tdinternal/replica-forward.webp diff --git a/docs/zh/10-tdinternal/replica-master.webp b/docs/zh/26-tdinternal/replica-master.webp similarity index 100% rename from docs/zh/10-tdinternal/replica-master.webp rename to docs/zh/26-tdinternal/replica-master.webp diff --git a/docs/zh/10-tdinternal/replica-restore.webp b/docs/zh/26-tdinternal/replica-restore.webp similarity index 100% rename from docs/zh/10-tdinternal/replica-restore.webp rename to docs/zh/26-tdinternal/replica-restore.webp diff --git a/docs/zh/10-tdinternal/skiplist.png b/docs/zh/26-tdinternal/skiplist.png similarity index 100% rename from docs/zh/10-tdinternal/skiplist.png rename to docs/zh/26-tdinternal/skiplist.png diff --git a/docs/zh/10-tdinternal/statetransition.png b/docs/zh/26-tdinternal/statetransition.png similarity index 100% rename from docs/zh/10-tdinternal/statetransition.png rename to docs/zh/26-tdinternal/statetransition.png diff --git a/docs/zh/10-tdinternal/streamarch.png b/docs/zh/26-tdinternal/streamarch.png similarity index 100% rename from docs/zh/10-tdinternal/streamarch.png rename to docs/zh/26-tdinternal/streamarch.png diff --git a/docs/zh/10-tdinternal/streamtask.png b/docs/zh/26-tdinternal/streamtask.png similarity index 100% rename from docs/zh/10-tdinternal/streamtask.png rename to docs/zh/26-tdinternal/streamtask.png diff --git a/docs/zh/10-tdinternal/structure.webp b/docs/zh/26-tdinternal/structure.webp similarity index 100% rename from docs/zh/10-tdinternal/structure.webp rename to docs/zh/26-tdinternal/structure.webp diff --git a/docs/zh/10-tdinternal/taskarch.png b/docs/zh/26-tdinternal/taskarch.png similarity index 100% rename from docs/zh/10-tdinternal/taskarch.png rename to docs/zh/26-tdinternal/taskarch.png diff --git a/docs/zh/10-tdinternal/topic.png b/docs/zh/26-tdinternal/topic.png similarity index 100% rename from docs/zh/10-tdinternal/topic.png rename to docs/zh/26-tdinternal/topic.png diff --git a/docs/zh/10-tdinternal/topicarch.png 
b/docs/zh/26-tdinternal/topicarch.png similarity index 100% rename from docs/zh/10-tdinternal/topicarch.png rename to docs/zh/26-tdinternal/topicarch.png diff --git a/docs/zh/10-tdinternal/tsdb.png b/docs/zh/26-tdinternal/tsdb.png similarity index 100% rename from docs/zh/10-tdinternal/tsdb.png rename to docs/zh/26-tdinternal/tsdb.png diff --git a/docs/zh/10-tdinternal/tuple.png b/docs/zh/26-tdinternal/tuple.png similarity index 100% rename from docs/zh/10-tdinternal/tuple.png rename to docs/zh/26-tdinternal/tuple.png diff --git a/docs/zh/10-tdinternal/vnode.png b/docs/zh/26-tdinternal/vnode.png similarity index 100% rename from docs/zh/10-tdinternal/vnode.png rename to docs/zh/26-tdinternal/vnode.png diff --git a/docs/zh/10-tdinternal/vnode.webp b/docs/zh/26-tdinternal/vnode.webp similarity index 100% rename from docs/zh/10-tdinternal/vnode.webp rename to docs/zh/26-tdinternal/vnode.webp diff --git a/docs/zh/10-tdinternal/write_follower.webp b/docs/zh/26-tdinternal/write_follower.webp similarity index 100% rename from docs/zh/10-tdinternal/write_follower.webp rename to docs/zh/26-tdinternal/write_follower.webp diff --git a/docs/zh/10-tdinternal/write_leader.webp b/docs/zh/26-tdinternal/write_leader.webp similarity index 100% rename from docs/zh/10-tdinternal/write_leader.webp rename to docs/zh/26-tdinternal/write_leader.webp diff --git a/include/client/taos.h b/include/client/taos.h index 1d2b3a913c..73ab52357a 100644 --- a/include/client/taos.h +++ b/include/client/taos.h @@ -270,7 +270,10 @@ DLL_EXPORT TAOS_RES *taos_schemaless_insert_raw_ttl(TAOS *taos, char *lines, int int precision, int32_t ttl); DLL_EXPORT TAOS_RES *taos_schemaless_insert_raw_ttl_with_reqid(TAOS *taos, char *lines, int len, int32_t *totalRows, int protocol, int precision, int32_t ttl, int64_t reqid); - +DLL_EXPORT TAOS_RES *taos_schemaless_insert_raw_ttl_with_reqid_tbname_key(TAOS *taos, char *lines, int len, int32_t *totalRows, + int protocol, int precision, int32_t ttl, int64_t reqid, char 
*tbnameKey); +DLL_EXPORT TAOS_RES *taos_schemaless_insert_ttl_with_reqid_tbname_key(TAOS *taos, char *lines[], int numLines, int protocol, + int precision, int32_t ttl, int64_t reqid, char *tbnameKey); /* --------------------------TMQ INTERFACE------------------------------- */ typedef struct tmq_t tmq_t; diff --git a/include/common/rsync.h b/include/common/rsync.h index 5bb98f9eab..30f542857e 100644 --- a/include/common/rsync.h +++ b/include/common/rsync.h @@ -13,8 +13,8 @@ extern "C" { void stopRsync(); int32_t startRsync(); -int32_t uploadByRsync(const char* id, const char* path); -int32_t downloadRsync(const char* id, const char* path); +int32_t uploadByRsync(const char* id, const char* path, int64_t checkpointId); +int32_t downloadByRsync(const char* id, const char* path, int64_t checkpointId); int32_t deleteRsync(const char* id); #ifdef __cplusplus diff --git a/include/libs/executor/storageapi.h b/include/libs/executor/storageapi.h index a0ff3353bc..61ae034450 100644 --- a/include/libs/executor/storageapi.h +++ b/include/libs/executor/storageapi.h @@ -245,12 +245,12 @@ typedef struct SStoreSnapshotFn { } SStoreSnapshotFn; typedef struct SStoreMeta { - SMTbCursor* (*openTableMetaCursor)(void* pVnode); // metaOpenTbCursor - void (*closeTableMetaCursor)(SMTbCursor* pTbCur); // metaCloseTbCursor - void (*pauseTableMetaCursor)(SMTbCursor* pTbCur); // metaPauseTbCursor - void (*resumeTableMetaCursor)(SMTbCursor* pTbCur, int8_t first, int8_t move); // metaResumeTbCursor - int32_t (*cursorNext)(SMTbCursor* pTbCur, ETableType jumpTableType); // metaTbCursorNext - int32_t (*cursorPrev)(SMTbCursor* pTbCur, ETableType jumpTableType); // metaTbCursorPrev + SMTbCursor* (*openTableMetaCursor)(void* pVnode); // metaOpenTbCursor + void (*closeTableMetaCursor)(SMTbCursor* pTbCur); // metaCloseTbCursor + void (*pauseTableMetaCursor)(SMTbCursor* pTbCur); // metaPauseTbCursor + int32_t (*resumeTableMetaCursor)(SMTbCursor* pTbCur, int8_t first, int8_t move); // metaResumeTbCursor 
+ int32_t (*cursorNext)(SMTbCursor* pTbCur, ETableType jumpTableType); // metaTbCursorNext + int32_t (*cursorPrev)(SMTbCursor* pTbCur, ETableType jumpTableType); // metaTbCursorPrev int32_t (*getTableTags)(void* pVnode, uint64_t suid, SArray* uidList); int32_t (*getTableTagsByUid)(void* pVnode, int64_t suid, SArray* uidList); diff --git a/include/libs/function/taosudf.h b/include/libs/function/taosudf.h index 04b92a897a..91487e5d1d 100644 --- a/include/libs/function/taosudf.h +++ b/include/libs/function/taosudf.h @@ -131,6 +131,14 @@ static FORCE_INLINE char *udfColDataGetData(const SUdfColumn *pColumn, int32_t r } } +static FORCE_INLINE int32_t udfColDataGetDataLen(const SUdfColumn *pColumn, int32_t row) { + if (IS_VAR_DATA_TYPE(pColumn->colMeta.type)) { + return *(uint16_t*)(pColumn->colData.varLenCol.payload + pColumn->colData.varLenCol.varOffsets[row]); + } else { + return pColumn->colMeta.bytes; + } +} + static FORCE_INLINE bool udfColDataIsNull(const SUdfColumn *pColumn, int32_t row) { if (IS_VAR_DATA_TYPE(pColumn->colMeta.type)) { if (pColumn->colMeta.type == TSDB_DATA_TYPE_JSON) { @@ -320,6 +328,30 @@ typedef int32_t (*TScriptUdfDestoryFunc)(void *udfCtx); typedef int32_t (*TScriptOpenFunc)(SScriptUdfEnvItem *items, int numItems); typedef int32_t (*TScriptCloseFunc)(); +// clang-format off +#ifdef WINDOWS + #define fnFatal(...) {} + #define fnError(...) {} + #define fnWarn(...) {} + #define fnInfo(...) {} + #define fnDebug(...) {} + #define fnTrace(...) {} +#else + DLL_EXPORT void taosPrintLog(const char *flags, int32_t level, int32_t dflag, const char *format, ...) +#ifdef __GNUC__ + __attribute__((format(printf, 4, 5))) +#endif + ; + extern int32_t udfDebugFlag; + #define udfFatal(...) { if (udfDebugFlag & 1) { taosPrintLog("UDF FATAL ", 1, 255, __VA_ARGS__); }} + #define udfError(...) { if (udfDebugFlag & 1) { taosPrintLog("UDF ERROR ", 1, 255, __VA_ARGS__); }} + #define udfWarn(...) 
{ if (udfDebugFlag & 2) { taosPrintLog("UDF WARN ", 2, 255, __VA_ARGS__); }} + #define udfInfo(...) { if (udfDebugFlag & 2) { taosPrintLog("UDF ", 2, 255, __VA_ARGS__); }} + #define udfDebug(...) { if (udfDebugFlag & 4) { taosPrintLog("UDF ", 4, udfDebugFlag, __VA_ARGS__); }} + #define udfTrace(...) { if (udfDebugFlag & 8) { taosPrintLog("UDF ", 8, udfDebugFlag, __VA_ARGS__); }} +#endif +// clang-format on + #ifdef __cplusplus } #endif diff --git a/include/libs/nodes/querynodes.h b/include/libs/nodes/querynodes.h index bb06b65898..198163582b 100644 --- a/include/libs/nodes/querynodes.h +++ b/include/libs/nodes/querynodes.h @@ -636,7 +636,7 @@ bool nodesExprsHasColumn(SNodeList* pList); void* nodesGetValueFromNode(SValueNode* pNode); int32_t nodesSetValueNodeValue(SValueNode* pNode, void* value); char* nodesGetStrValueFromNode(SValueNode* pNode); -void nodesValueNodeToVariant(const SValueNode* pNode, SVariant* pVal); +int32_t nodesValueNodeToVariant(const SValueNode* pNode, SVariant* pVal); int32_t nodesMakeValueNodeFromString(char* literal, SValueNode** ppValNode); int32_t nodesMakeValueNodeFromBool(bool b, SValueNode** ppValNode); int32_t nodesMakeValueNodeFromInt32(int32_t value, SNode** ppNode); diff --git a/include/libs/qcom/query.h b/include/libs/qcom/query.h index f56860dd4f..6e2b83dce7 100644 --- a/include/libs/qcom/query.h +++ b/include/libs/qcom/query.h @@ -25,9 +25,9 @@ extern "C" { #include "tarray.h" #include "thash.h" #include "tlog.h" -#include "tsimplehash.h" #include "tmsg.h" #include "tmsgcb.h" +#include "tsimplehash.h" typedef enum { JOB_TASK_STATUS_NULL = 0, @@ -69,16 +69,16 @@ typedef enum { #define QUERY_MSG_MASK_SHOW_REWRITE() (1 << 0) #define QUERY_MSG_MASK_AUDIT() (1 << 1) #define QUERY_MSG_MASK_VIEW() (1 << 2) -#define TEST_SHOW_REWRITE_MASK(m) (((m) & QUERY_MSG_MASK_SHOW_REWRITE()) != 0) -#define TEST_AUDIT_MASK(m) (((m) & QUERY_MSG_MASK_AUDIT()) != 0) -#define TEST_VIEW_MASK(m) (((m) & QUERY_MSG_MASK_VIEW()) != 0) +#define 
TEST_SHOW_REWRITE_MASK(m) (((m)&QUERY_MSG_MASK_SHOW_REWRITE()) != 0) +#define TEST_AUDIT_MASK(m) (((m)&QUERY_MSG_MASK_AUDIT()) != 0) +#define TEST_VIEW_MASK(m) (((m)&QUERY_MSG_MASK_VIEW()) != 0) typedef struct STableComInfo { uint8_t numOfTags; // the number of tags in schema uint8_t precision; // the number of precision col_id_t numOfColumns; // the number of columns int16_t numOfPKs; - int32_t rowSize; // row size of the schema + int32_t rowSize; // row size of the schema } STableComInfo; typedef struct SIndexMeta { @@ -119,8 +119,9 @@ typedef struct STableMeta { int32_t sversion; int32_t tversion; STableComInfo tableInfo; - SSchemaExt* schemaExt; // There is no additional memory allocation, and the pointer is fixed to the next address of the schema content. - SSchema schema[]; + SSchemaExt* schemaExt; // There is no additional memory allocation, and the pointer is fixed to the next address of + // the schema content. + SSchema schema[]; } STableMeta; #pragma pack(pop) @@ -196,9 +197,9 @@ typedef struct SBoundColInfo { } SBoundColInfo; typedef struct STableColsData { - char tbName[TSDB_TABLE_NAME_LEN]; - SArray* aCol; - bool getFromHash; + char tbName[TSDB_TABLE_NAME_LEN]; + SArray* aCol; + bool getFromHash; } STableColsData; typedef struct STableVgUid { @@ -207,15 +208,14 @@ typedef struct STableVgUid { } STableVgUid; typedef struct STableBufInfo { - void* pCurBuff; - SArray* pBufList; - int64_t buffUnit; - int64_t buffSize; - int64_t buffIdx; - int64_t buffOffset; + void* pCurBuff; + SArray* pBufList; + int64_t buffUnit; + int64_t buffSize; + int64_t buffIdx; + int64_t buffOffset; } STableBufInfo; - typedef struct STableDataCxt { STableMeta* pMeta; STSchema* pSchema; @@ -237,23 +237,22 @@ typedef struct SStbInterlaceInfo { void* pRequest; uint64_t requestId; int64_t requestSelf; - bool tbFromHash; + bool tbFromHash; SHashObj* pVgroupHash; SArray* pVgroupList; SSHashObj* pTableHash; int64_t tbRemainNum; STableBufInfo tbBuf; char firstName[TSDB_TABLE_NAME_LEN]; - 
STSchema *pTSchema; - STableDataCxt *pDataCtx; - void *boundTags; + STSchema* pTSchema; + STableDataCxt* pDataCtx; + void* boundTags; - bool tableColsReady; - SArray *pTableCols; - int32_t pTableColsIdx; + bool tableColsReady; + SArray* pTableCols; + int32_t pTableColsIdx; } SStbInterlaceInfo; - typedef int32_t (*__async_send_cb_fn_t)(void* param, SDataBuf* pMsg, int32_t code); typedef int32_t (*__async_exec_fn_t)(void* param); @@ -308,6 +307,8 @@ void destroyAhandle(void* ahandle); int32_t asyncSendMsgToServerExt(void* pTransporter, SEpSet* epSet, int64_t* pTransporterId, SMsgSendInfo* pInfo, bool persistHandle, void* ctx); +int32_t asyncFreeConnById(void* pTransporter, int64_t pid); +; /** * Asynchronously send message to server, after the response received, the callback will be incured. * @@ -325,7 +326,7 @@ void initQueryModuleMsgHandle(); const SSchema* tGetTbnameColumnSchema(); bool tIsValidSchema(struct SSchema* pSchema, int32_t numOfCols, int32_t numOfTags); -int32_t getAsofJoinReverseOp(EOperatorType op); +int32_t getAsofJoinReverseOp(EOperatorType op); int32_t queryCreateCTableMetaFromMsg(STableMetaRsp* msg, SCTableMeta* pMeta); int32_t queryCreateTableMetaFromMsg(STableMetaRsp* msg, bool isSuperTable, STableMeta** pMeta); @@ -384,7 +385,7 @@ extern int32_t (*queryProcessMsgRsp[TDMT_MAX])(void* output, char* msg, int32_t #define NEED_CLIENT_RM_TBLMETA_REQ(_type) \ ((_type) == TDMT_VND_CREATE_TABLE || (_type) == TDMT_MND_CREATE_STB || (_type) == TDMT_VND_DROP_TABLE || \ - (_type) == TDMT_MND_DROP_STB || (_type) == TDMT_MND_CREATE_VIEW || (_type) == TDMT_MND_DROP_VIEW || \ + (_type) == TDMT_MND_DROP_STB || (_type) == TDMT_MND_CREATE_VIEW || (_type) == TDMT_MND_DROP_VIEW || \ (_type) == TDMT_MND_CREATE_TSMA || (_type) == TDMT_MND_DROP_TSMA || (_type) == TDMT_MND_DROP_TB_WITH_TSMA) #define NEED_SCHEDULER_REDIRECT_ERROR(_code) \ diff --git a/include/libs/stream/streamMsg.h b/include/libs/stream/streamMsg.h index 34921daac3..0ceaa93a72 100644 --- 
a/include/libs/stream/streamMsg.h +++ b/include/libs/stream/streamMsg.h @@ -164,6 +164,7 @@ int32_t tDecodeStreamTaskCheckpointReq(SDecoder* pDecoder, SStreamTaskCheckpoint typedef struct SStreamHbMsg { int32_t vgId; int32_t msgId; + int64_t ts; int32_t numOfTasks; SArray* pTaskStatus; // SArray SArray* pUpdateNodes; // SArray, needs update the epsets in stream tasks for those nodes. diff --git a/include/libs/transport/trpc.h b/include/libs/transport/trpc.h index 6c0d04354a..5b860cc23a 100644 --- a/include/libs/transport/trpc.h +++ b/include/libs/transport/trpc.h @@ -125,6 +125,7 @@ typedef struct SRpcInit { int32_t timeToGetConn; int8_t supportBatch; // 0: no batch, 1. batch int32_t batchSize; + int8_t notWaitAvaliableConn; // 1: wait to get, 0: no wait void *parent; } SRpcInit; @@ -158,18 +159,21 @@ void *rpcReallocCont(void *ptr, int64_t contLen); // Because taosd supports multi-process mode // These functions should not be used on the server side // Please use tmsg functions, which are defined in tmsgcb.h -int rpcSendRequest(void *thandle, const SEpSet *pEpSet, SRpcMsg *pMsg, int64_t *rid); -int rpcSendResponse(const SRpcMsg *pMsg); -int rpcRegisterBrokenLinkArg(SRpcMsg *msg); -int rpcReleaseHandle(void *handle, int8_t type); // just release conn to rpc instance, no close sock +int32_t rpcSendRequest(void *thandle, const SEpSet *pEpSet, SRpcMsg *pMsg, int64_t *rid); +int32_t rpcSendResponse(const SRpcMsg *pMsg); +int32_t rpcRegisterBrokenLinkArg(SRpcMsg *msg); +int32_t rpcReleaseHandle(void *handle, int8_t type); // just release conn to rpc instance, no close sock // These functions will not be called in the child process -int rpcSendRequestWithCtx(void *thandle, const SEpSet *pEpSet, SRpcMsg *pMsg, int64_t *rid, SRpcCtx *ctx); -int rpcSendRecv(void *shandle, SEpSet *pEpSet, SRpcMsg *pReq, SRpcMsg *pRsp); -int rpcSendRecvWithTimeout(void *shandle, SEpSet *pEpSet, SRpcMsg *pMsg, SRpcMsg *pRsp, int8_t *epUpdated, +int32_t rpcSendRequestWithCtx(void *thandle, 
const SEpSet *pEpSet, SRpcMsg *pMsg, int64_t *rid, SRpcCtx *ctx); +int32_t rpcSendRecv(void *shandle, SEpSet *pEpSet, SRpcMsg *pReq, SRpcMsg *pRsp); +int32_t rpcSendRecvWithTimeout(void *shandle, SEpSet *pEpSet, SRpcMsg *pMsg, SRpcMsg *pRsp, int8_t *epUpdated, int32_t timeoutMs); -int rpcSetDefaultAddr(void *thandle, const char *ip, const char *fqdn); -void *rpcAllocHandle(); + +int32_t rpcFreeConnById(void *shandle, int64_t connId); + +int32_t rpcSetDefaultAddr(void *thandle, const char *ip, const char *fqdn); +int32_t rpcAllocHandle(int64_t *refId); int32_t rpcSetIpWhite(void *thandl, void *arg); int32_t rpcUtilSIpRangeToStr(SIpV4Range *pRange, char *buf); diff --git a/include/util/taoserror.h b/include/util/taoserror.h index f0cb30e7e0..b091d870ec 100644 --- a/include/util/taoserror.h +++ b/include/util/taoserror.h @@ -529,6 +529,7 @@ int32_t taosGetErrSize(); #define TSDB_CODE_VND_META_DATA_UNSAFE_DELETE TAOS_DEF_ERROR_CODE(0, 0x0535) #define TSDB_CODE_VND_COLUMN_COMPRESS_ALREADY_EXIST TAOS_DEF_ERROR_CODE(0, 0x0536) #define TSDB_CODE_VND_ARB_NOT_SYNCED TAOS_DEF_ERROR_CODE(0, 0x0537) // internal +#define TSDB_CODE_VND_WRITE_DISABLED TAOS_DEF_ERROR_CODE(0, 0x0538) // internal // tsdb #define TSDB_CODE_TDB_INVALID_TABLE_ID TAOS_DEF_ERROR_CODE(0, 0x0600) diff --git a/include/util/tcompare.h b/include/util/tcompare.h index 80f992f646..c7a29cad57 100644 --- a/include/util/tcompare.h +++ b/include/util/tcompare.h @@ -49,6 +49,7 @@ int32_t InitRegexCache(); void DestroyRegexCache(); int32_t patternMatch(const char *pattern, size_t psize, const char *str, size_t ssize, const SPatternCompareInfo *pInfo); int32_t checkRegexPattern(const char *pPattern); +void DestoryThreadLocalRegComp(); int32_t wcsPatternMatch(const TdUcs4 *pattern, size_t psize, const TdUcs4 *str, size_t ssize, const SPatternCompareInfo *pInfo); diff --git a/include/util/tlog.h b/include/util/tlog.h index 67aafbfe44..b4dbbef532 100644 --- a/include/util/tlog.h +++ b/include/util/tlog.h @@ -74,13 +74,13 
@@ void taosCloseLog(); void taosResetLog(); void taosDumpData(uint8_t *msg, int32_t len); -void taosPrintLog(const char *flags, ELogLevel level, int32_t dflag, const char *format, ...) +void taosPrintLog(const char *flags, int32_t level, int32_t dflag, const char *format, ...) #ifdef __GNUC__ __attribute__((format(printf, 4, 5))) #endif ; -void taosPrintLongString(const char *flags, ELogLevel level, int32_t dflag, const char *format, ...) +void taosPrintLongString(const char *flags, int32_t level, int32_t dflag, const char *format, ...) #ifdef __GNUC__ __attribute__((format(printf, 4, 5))) #endif diff --git a/include/util/tutil.h b/include/util/tutil.h index f1f2914eed..1ee3bb0e83 100644 --- a/include/util/tutil.h +++ b/include/util/tutil.h @@ -80,6 +80,11 @@ static FORCE_INLINE void taosEncryptPass_c(uint8_t *inBuf, size_t len, char *tar (void)memcpy(target, buf, TSDB_PASSWORD_LEN); } +static FORCE_INLINE int32_t taosHashBinary(char* pBuf, int32_t len) { + uint64_t hashVal = MurmurHash3_64(pBuf, len); + return sprintf(pBuf, "%" PRIu64, hashVal); +} + static FORCE_INLINE int32_t taosCreateMD5Hash(char *pBuf, int32_t len) { T_MD5_CTX ctx; tMD5Init(&ctx); @@ -87,11 +92,10 @@ static FORCE_INLINE int32_t taosCreateMD5Hash(char *pBuf, int32_t len) { tMD5Final(&ctx); char *p = pBuf; int32_t resLen = 0; - for (uint8_t i = 0; i < tListLen(ctx.digest); ++i) { - resLen += snprintf(p, 3, "%02x", ctx.digest[i]); - p += 2; - } - return resLen; + return sprintf(pBuf, "%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x", ctx.digest[0], ctx.digest[1], + ctx.digest[2], ctx.digest[3], ctx.digest[4], ctx.digest[5], ctx.digest[6], ctx.digest[7], + ctx.digest[8], ctx.digest[9], ctx.digest[10], ctx.digest[11], ctx.digest[12], ctx.digest[13], + ctx.digest[14], ctx.digest[15]); } static FORCE_INLINE int32_t taosGetTbHashVal(const char *tbname, int32_t tblen, int32_t method, int32_t prefix, diff --git a/source/client/inc/clientSml.h b/source/client/inc/clientSml.h index 
e901f86b41..209c376f30 100644 --- a/source/client/inc/clientSml.h +++ b/source/client/inc/clientSml.h @@ -204,6 +204,7 @@ typedef struct { STableMeta *currSTableMeta; STableDataCxt *currTableDataCtx; bool needModifySchema; + char *tbnameKey; } SSmlHandle; extern int64_t smlFactorNS[]; @@ -219,9 +220,10 @@ bool smlParseNumberOld(SSmlKv *kvVal, SSmlMsgBuf *msg); void smlBuildInvalidDataMsg(SSmlMsgBuf *pBuf, const char *msg1, const char *msg2); int32_t smlParseNumber(SSmlKv *kvVal, SSmlMsgBuf *msg); int64_t smlGetTimeValue(const char *value, int32_t len, uint8_t fromPrecision, uint8_t toPrecision); -int32_t smlBuildTableInfo(int numRows, const char* measure, int32_t measureLen, SSmlTableInfo** tInfo); + +int32_t smlBuildTableInfo(int numRows, const char* measure, int32_t measureLen, SSmlTableInfo** tInfo); int32_t smlBuildSTableMeta(bool isDataFormat, SSmlSTableMeta** sMeta); -int32_t smlSetCTableName(SSmlTableInfo *oneTable); +int32_t smlSetCTableName(SSmlTableInfo *oneTable, char *tbnameKey); int32_t getTableUid(SSmlHandle *info, SSmlLineInfo *currElement, SSmlTableInfo *tinfo); int32_t smlGetMeta(SSmlHandle *info, const void* measure, int32_t measureLen, STableMeta **pTableMeta); int32_t is_same_child_table_telnet(const void *a, const void *b); diff --git a/source/client/src/clientEnv.c b/source/client/src/clientEnv.c index 35e6651c41..d63701c529 100644 --- a/source/client/src/clientEnv.c +++ b/source/client/src/clientEnv.c @@ -374,8 +374,8 @@ int32_t openTransporter(const char *user, const char *auth, int32_t numOfThread, *pDnodeConn = rpcOpen(&rpcInit); if (*pDnodeConn == NULL) { - tscError("failed to init connection to server."); - code = TSDB_CODE_FAILED; + tscError("failed to init connection to server since %s", tstrerror(terrno)); + code = terrno; } return code; @@ -526,19 +526,17 @@ int32_t createRequest(uint64_t connId, int32_t type, int64_t reqid, SRequestObj int32_t code = TSDB_CODE_SUCCESS; *pRequest = (SRequestObj *)taosMemoryCalloc(1, 
sizeof(SRequestObj)); if (NULL == *pRequest) { - return TSDB_CODE_OUT_OF_MEMORY; + return terrno; } STscObj *pTscObj = acquireTscObj(connId); if (pTscObj == NULL) { - code = TSDB_CODE_TSC_DISCONNECTED; - goto _return; + TSC_ERR_JRET(TSDB_CODE_TSC_DISCONNECTED); } SSyncQueryParam *interParam = taosMemoryCalloc(1, sizeof(SSyncQueryParam)); if (interParam == NULL) { releaseTscObj(connId); - code = TSDB_CODE_OUT_OF_MEMORY; - goto _return; + TSC_ERR_JRET(terrno); } TSC_ERR_JRET(tsem_init(&interParam->sem, 0, 0)); interParam->pRequest = *pRequest; @@ -566,7 +564,11 @@ int32_t createRequest(uint64_t connId, int32_t type, int64_t reqid, SRequestObj return TSDB_CODE_SUCCESS; _return: - doDestroyRequest(*pRequest); + if ((*pRequest)->pTscObj) { + doDestroyRequest(*pRequest); + } else { + taosMemoryFree(*pRequest); + } return code; } @@ -869,7 +871,7 @@ _return: TSC_ERR_RET(terrno); } - return TSDB_CODE_SUCCESS; + return code; } void tscStopCrashReport() { diff --git a/source/client/src/clientHb.c b/source/client/src/clientHb.c index a174ae4b13..415c2d6685 100644 --- a/source/client/src/clientHb.c +++ b/source/client/src/clientHb.c @@ -1405,7 +1405,7 @@ _return: TSC_ERR_RET(terrno); } - return TSDB_CODE_SUCCESS; + return code; } static void hbStopThread() { diff --git a/source/client/src/clientImpl.c b/source/client/src/clientImpl.c index 4aa78caa15..664de5619f 100644 --- a/source/client/src/clientImpl.c +++ b/source/client/src/clientImpl.c @@ -2467,7 +2467,8 @@ TSDB_SERVER_STATUS taos_check_server_status(const char* fqdn, int port, char* de clientRpc = rpcOpen(&rpcInit); if (clientRpc == NULL) { - tscError("failed to init server status client"); + code = terrno; + tscError("failed to init server status client since %s", tstrerror(code)); goto _OVER; } diff --git a/source/client/src/clientSml.c b/source/client/src/clientSml.c index c244d36eb7..55be514e40 100644 --- a/source/client/src/clientSml.c +++ b/source/client/src/clientSml.c @@ -402,7 +402,7 @@ int32_t 
smlProcessChildTable(SSmlHandle *info, SSmlLineInfo *elements) { if (kv->valueEscaped) kv->value = NULL; } - code = smlSetCTableName(tinfo); + code = smlSetCTableName(tinfo, info->tbnameKey); if (code != TSDB_CODE_SUCCESS){ smlDestroyTableInfo(&tinfo); return code; @@ -486,10 +486,10 @@ int32_t smlParseEndLine(SSmlHandle *info, SSmlLineInfo *elements, SSmlKv *kvTs) return TSDB_CODE_SUCCESS; } -static int32_t smlParseTableName(SArray *tags, char *childTableName) { +static int32_t smlParseTableName(SArray *tags, char *childTableName, char *tbnameKey) { bool autoChildName = false; size_t delimiter = strlen(tsSmlAutoChildTableNameDelimiter); - if (delimiter > 0) { + if(delimiter > 0 && tbnameKey == NULL){ size_t totalNameLen = delimiter * (taosArrayGetSize(tags) - 1); for (int i = 0; i < taosArrayGetSize(tags); i++) { SSmlKv *tag = (SSmlKv *)taosArrayGet(tags, i); @@ -517,8 +517,11 @@ static int32_t smlParseTableName(SArray *tags, char *childTableName) { if (tsSmlDot2Underline) { smlStrReplace(childTableName, strlen(childTableName)); } - } else { - size_t childTableNameLen = strlen(tsSmlChildTableName); + }else{ + if (tbnameKey == NULL){ + tbnameKey = tsSmlChildTableName; + } + size_t childTableNameLen = strlen(tbnameKey); if (childTableNameLen <= 0) return TSDB_CODE_SUCCESS; for (int i = 0; i < taosArrayGetSize(tags); i++) { @@ -527,7 +530,7 @@ static int32_t smlParseTableName(SArray *tags, char *childTableName) { return TSDB_CODE_SML_INVALID_DATA; } // handle child table name - if (childTableNameLen == tag->keyLen && strncmp(tag->key, tsSmlChildTableName, tag->keyLen) == 0) { + if (childTableNameLen == tag->keyLen && strncmp(tag->key, tbnameKey, tag->keyLen) == 0) { (void)memset(childTableName, 0, TSDB_TABLE_NAME_LEN); (void)strncpy(childTableName, tag->value, (tag->length < TSDB_TABLE_NAME_LEN ? 
tag->length : TSDB_TABLE_NAME_LEN)); if (tsSmlDot2Underline) { @@ -542,8 +545,8 @@ static int32_t smlParseTableName(SArray *tags, char *childTableName) { return TSDB_CODE_SUCCESS; } -int32_t smlSetCTableName(SSmlTableInfo *oneTable) { - int32_t code = smlParseTableName(oneTable->tags, oneTable->childTableName); +int32_t smlSetCTableName(SSmlTableInfo *oneTable, char *tbnameKey) { + int32_t code = smlParseTableName(oneTable->tags, oneTable->childTableName, tbnameKey); if(code != TSDB_CODE_SUCCESS){ return code; } @@ -2127,7 +2130,7 @@ void smlSetReqSQL(SRequestObj *request, char *lines[], char *rawLine, char *rawL } TAOS_RES *taos_schemaless_insert_inner(TAOS *taos, char *lines[], char *rawLine, char *rawLineEnd, int numLines, - int protocol, int precision, int32_t ttl, int64_t reqid) { + int protocol, int precision, int32_t ttl, int64_t reqid, char *tbnameKey) { int32_t code = TSDB_CODE_SUCCESS; if (NULL == taos) { uError("SML:taos_schemaless_insert error taos is null"); @@ -2159,6 +2162,7 @@ TAOS_RES *taos_schemaless_insert_inner(TAOS *taos, char *lines[], char *rawLine, info->msgBuf.buf = info->pRequest->msgBuf; info->msgBuf.len = ERROR_MSG_BUF_DEFAULT_SIZE; info->lineNum = numLines; + info->tbnameKey = tbnameKey; smlSetReqSQL(request, lines, rawLine, rawLineEnd); @@ -2237,9 +2241,14 @@ end: * @return TAOS_RES */ +TAOS_RES *taos_schemaless_insert_ttl_with_reqid_tbname_key(TAOS *taos, char *lines[], int numLines, int protocol, + int precision, int32_t ttl, int64_t reqid, char *tbnameKey){ + return taos_schemaless_insert_inner(taos, lines, NULL, NULL, numLines, protocol, precision, ttl, reqid, tbnameKey); +} + TAOS_RES *taos_schemaless_insert_ttl_with_reqid(TAOS *taos, char *lines[], int numLines, int protocol, int precision, int32_t ttl, int64_t reqid) { - return taos_schemaless_insert_inner(taos, lines, NULL, NULL, numLines, protocol, precision, ttl, reqid); + return taos_schemaless_insert_ttl_with_reqid_tbname_key(taos, lines, numLines, protocol, precision, ttl, 
reqid, NULL); } TAOS_RES *taos_schemaless_insert(TAOS *taos, char *lines[], int numLines, int protocol, int precision) { @@ -2272,10 +2281,15 @@ static void getRawLineLen(char *lines, int len, int32_t *totalRows, int protocol } } +TAOS_RES *taos_schemaless_insert_raw_ttl_with_reqid_tbname_key(TAOS *taos, char *lines, int len, int32_t *totalRows, + int protocol, int precision, int32_t ttl, int64_t reqid, char *tbnameKey){ + getRawLineLen(lines, len, totalRows, protocol); + return taos_schemaless_insert_inner(taos, NULL, lines, lines + len, *totalRows, protocol, precision, ttl, reqid, tbnameKey); +} + TAOS_RES *taos_schemaless_insert_raw_ttl_with_reqid(TAOS *taos, char *lines, int len, int32_t *totalRows, int protocol, int precision, int32_t ttl, int64_t reqid) { - getRawLineLen(lines, len, totalRows, protocol); - return taos_schemaless_insert_inner(taos, NULL, lines, lines + len, *totalRows, protocol, precision, ttl, reqid); + return taos_schemaless_insert_raw_ttl_with_reqid_tbname_key(taos, lines, len, totalRows, protocol, precision, ttl, reqid, NULL); } TAOS_RES *taos_schemaless_insert_raw_with_reqid(TAOS *taos, char *lines, int len, int32_t *totalRows, int protocol, diff --git a/source/client/src/clientTmq.c b/source/client/src/clientTmq.c index 8cf39a51b9..8f35a2fad1 100644 --- a/source/client/src/clientTmq.c +++ b/source/client/src/clientTmq.c @@ -1275,7 +1275,7 @@ tmq_t* tmq_consumer_new(tmq_conf_t* conf, char* errstr, int32_t errstrLen) { // init semaphore if (tsem2_init(&pTmq->rspSem, 0, 0) != 0) { - tscError("consumer:0x %" PRIx64 " setup failed since %s, consumer group %s", pTmq->consumerId, terrstr(), + tscError("consumer:0x %" PRIx64 " setup failed since %s, consumer group %s", pTmq->consumerId, tstrerror(TAOS_SYSTEM_ERROR(errno)), pTmq->groupId); SET_ERROR_MSG_TMQ("init t_sem failed") goto _failed; @@ -2141,6 +2141,7 @@ static void* tmqHandleAllRsp(tmq_t* tmq, int64_t timeout) { taosWUnLockLatch(&tmq->lock); } setVgIdle(tmq, pollRspWrapper->topicName, 
pollRspWrapper->vgId); + tmqFreeRspWrapper(pRspWrapper); taosFreeQitem(pRspWrapper); } else if (pRspWrapper->tmqRspType == TMQ_MSG_TYPE__POLL_DATA_RSP) { SMqPollRspWrapper* pollRspWrapper = (SMqPollRspWrapper*)pRspWrapper; @@ -2844,6 +2845,7 @@ int32_t askEpCb(void* param, SDataBuf* pMsg, int32_t code) { pWrapper->epoch = head->epoch; (void)memcpy(&pWrapper->msg, pMsg->pData, sizeof(SMqRspHead)); if (tDecodeSMqAskEpRsp(POINTER_SHIFT(pMsg->pData, sizeof(SMqRspHead)), &pWrapper->msg) == NULL){ + tmqFreeRspWrapper((SMqRspWrapper*)pWrapper); taosFreeQitem(pWrapper); }else{ (void)taosWriteQitem(tmq->mqueue, pWrapper); diff --git a/source/common/src/rsync.c b/source/common/src/rsync.c index c7044864ae..b53c4e416e 100644 --- a/source/common/src/rsync.c +++ b/source/common/src/rsync.c @@ -163,7 +163,7 @@ int32_t startRsync() { return code; } -int32_t uploadByRsync(const char* id, const char* path) { +int32_t uploadByRsync(const char* id, const char* path, int64_t checkpointId) { int64_t st = taosGetTimestampMs(); char command[PATH_MAX] = {0}; @@ -203,12 +203,12 @@ int32_t uploadByRsync(const char* id, const char* path) { // prepare the data directory int32_t code = execCommand(command); if (code != 0) { - uError("[rsync] s-task:%s prepare checkpoint data in %s to %s failed, code:%d," ERRNO_ERR_FORMAT, id, path, + uError("[rsync] s-task:%s prepare checkpoint dir in %s to %s failed, code:%d," ERRNO_ERR_FORMAT, id, path, tsSnodeAddress, code, ERRNO_ERR_DATA); code = TAOS_SYSTEM_ERROR(errno); } else { int64_t el = (taosGetTimestampMs() - st); - uDebug("[rsync] s-task:%s prepare checkpoint data in:%s to %s successfully, elapsed time:%" PRId64 "ms", id, path, + uDebug("[rsync] s-task:%s prepare checkpoint dir in:%s to %s successfully, elapsed time:%" PRId64 "ms", id, path, tsSnodeAddress, el); } @@ -222,7 +222,7 @@ int32_t uploadByRsync(const char* id, const char* path) { #endif snprintf(command, PATH_MAX, "rsync -av --debug=all --log-file=%s/rsynclog --delete --timeout=10 
--bwlimit=100000 %s/ " - "rsync://%s/checkpoint/%s/data/", + "rsync://%s/checkpoint/%s/%" PRId64 "/", tsLogDir, #ifdef WINDOWS pathTransform @@ -230,11 +230,11 @@ int32_t uploadByRsync(const char* id, const char* path) { path #endif , - tsSnodeAddress, id); + tsSnodeAddress, id, checkpointId); } else { snprintf(command, PATH_MAX, "rsync -av --debug=all --log-file=%s/rsynclog --delete --timeout=10 --bwlimit=100000 %s " - "rsync://%s/checkpoint/%s/data/", + "rsync://%s/checkpoint/%s/%" PRId64 "/", tsLogDir, #ifdef WINDOWS pathTransform @@ -242,7 +242,7 @@ int32_t uploadByRsync(const char* id, const char* path) { path #endif , - tsSnodeAddress, id); + tsSnodeAddress, id, checkpointId); } code = execCommand(command); @@ -260,7 +260,7 @@ int32_t uploadByRsync(const char* id, const char* path) { } // abort from retry if quit -int32_t downloadRsync(const char* id, const char* path) { +int32_t downloadByRsync(const char* id, const char* path, int64_t checkpointId) { int64_t st = taosGetTimestampMs(); int32_t MAX_RETRY = 10; int32_t times = 0; @@ -274,8 +274,9 @@ int32_t downloadRsync(const char* id, const char* path) { char command[PATH_MAX] = {0}; snprintf( command, PATH_MAX, - "rsync -av --debug=all --log-file=%s/rsynclog --timeout=10 --bwlimit=100000 rsync://%s/checkpoint/%s/data/ %s", - tsLogDir, tsSnodeAddress, id, + "rsync -av --debug=all --log-file=%s/rsynclog --timeout=10 --bwlimit=100000 rsync://%s/checkpoint/%s/%" PRId64 + "/ %s", + tsLogDir, tsSnodeAddress, id, checkpointId, #ifdef WINDOWS pathTransform #else @@ -283,19 +284,49 @@ int32_t downloadRsync(const char* id, const char* path) { #endif ); - uDebug("[rsync] %s start to sync data from remote to:%s, %s", id, path, command); + uDebug("[rsync] %s start to sync data from remote to:%s, cmd:%s", id, path, command); + + code = execCommand(command); + if (code != TSDB_CODE_SUCCESS) { + uError("[rsync] %s download checkpointId:%" PRId64 + " data:%s failed, retry after 1sec, times:%d, code:%d," ERRNO_ERR_FORMAT, + 
id, checkpointId, path, times, code, ERRNO_ERR_DATA); + } else { + int32_t el = taosGetTimestampMs() - st; + uDebug("[rsync] %s download checkpointId:%" PRId64 " data:%s successfully, elapsed time:%dms", id, checkpointId, + path, el); + } + + if (code != TSDB_CODE_SUCCESS) { // if failed, try to load it from data directory +#ifdef WINDOWS + memset(pathTransform, 0, PATH_MAX); + changeDirFromWindowsToLinux(path, pathTransform); +#endif + + memset(command, 0, PATH_MAX); + snprintf( + command, PATH_MAX, + "rsync -av --debug=all --log-file=%s/rsynclog --timeout=10 --bwlimit=100000 rsync://%s/checkpoint/%s/data/ %s", + tsLogDir, tsSnodeAddress, id, +#ifdef WINDOWS + pathTransform +#else + path +#endif + ); + + uDebug("[rsync] %s start to sync data from remote data dir to:%s, cmd:%s", id, path, command); - while (times++ < MAX_RETRY) { code = execCommand(command); if (code != TSDB_CODE_SUCCESS) { - uError("[rsync] %s download checkpoint data:%s failed, retry after 1sec, times:%d, code:%d," ERRNO_ERR_FORMAT, id, - path, times, code, ERRNO_ERR_DATA); - taosSsleep(1); - code = TAOS_SYSTEM_ERROR(errno); + uError("[rsync] %s download checkpointId:%" PRId64 + " data:%s failed, retry after 1sec, times:%d, code:%d," ERRNO_ERR_FORMAT, + id, checkpointId, path, times, code, ERRNO_ERR_DATA); + code = TAOS_SYSTEM_ERROR(code); } else { int32_t el = taosGetTimestampMs() - st; - uDebug("[rsync] %s download checkpoint data:%s successfully, elapsed time:%dms", id, path, el); - break; + uDebug("[rsync] %s download checkpointId:%" PRId64 " data:%s successfully, elapsed time:%dms", id, checkpointId, + path, el); } } return code; diff --git a/source/common/src/tmsg.c b/source/common/src/tmsg.c index 94eb2047c2..740e517e35 100644 --- a/source/common/src/tmsg.c +++ b/source/common/src/tmsg.c @@ -10140,6 +10140,7 @@ void *tDecodeMqSubTopicEp(void *buf, SMqSubTopicEp *pTopicEp) { buf = tDecodeSMqSubVgEp(buf, &vgEp); if (taosArrayPush(pTopicEp->vgs, &vgEp) == NULL) { 
taosArrayDestroy(pTopicEp->vgs); + pTopicEp->vgs = NULL; return NULL; } } diff --git a/source/common/src/ttime.c b/source/common/src/ttime.c index efabe5cf07..2a8e7951b1 100644 --- a/source/common/src/ttime.c +++ b/source/common/src/ttime.c @@ -815,6 +815,12 @@ int64_t taosTimeTruncate(int64_t ts, const SInterval* pInterval) { if (IS_CALENDAR_TIME_DURATION(pInterval->intervalUnit)) { int64_t news = (ts / pInterval->sliding) * pInterval->sliding; ASSERT(news <= ts); + if (pInterval->slidingUnit == 'd' || pInterval->slidingUnit == 'w') { +#if defined(WINDOWS) && _MSC_VER >= 1900 + int64_t timezone = _timezone; +#endif + news += (int64_t)(timezone * TSDB_TICK_PER_SECOND(precision)); + } if (news <= ts) { int64_t prev = news; diff --git a/source/dnode/mgmt/mgmt_dnode/src/dmWorker.c b/source/dnode/mgmt/mgmt_dnode/src/dmWorker.c index a3e1a64012..aeb519596d 100644 --- a/source/dnode/mgmt/mgmt_dnode/src/dmWorker.c +++ b/source/dnode/mgmt/mgmt_dnode/src/dmWorker.c @@ -293,7 +293,7 @@ int32_t dmStartNotifyThread(SDnodeMgmt *pMgmt) { (void)taosThreadAttrSetDetachState(&thAttr, PTHREAD_CREATE_JOINABLE); if (taosThreadCreate(&pMgmt->notifyThread, &thAttr, dmNotifyThreadFp, pMgmt) != 0) { code = TAOS_SYSTEM_ERROR(errno); - dError("failed to create notify thread since %s", strerror(code)); + dError("failed to create notify thread since %s", tstrerror(code)); return code; } diff --git a/source/dnode/mgmt/node_mgmt/src/dmTransport.c b/source/dnode/mgmt/node_mgmt/src/dmTransport.c index 3d758e1fd3..8186fcd4e9 100644 --- a/source/dnode/mgmt/node_mgmt/src/dmTransport.c +++ b/source/dnode/mgmt/node_mgmt/src/dmTransport.c @@ -109,7 +109,7 @@ static void dmProcessRpcMsg(SDnode *pDnode, SRpcMsg *pRpc, SEpSet *pEpSet) { int32_t svrVer = 0; (void)taosVersionStrToInt(version, &svrVer); if ((code = taosCheckVersionCompatible(pRpc->info.cliVer, svrVer, 3)) != 0) { - dError("Version not compatible, cli ver: %d, svr ver: %d", pRpc->info.cliVer, svrVer); + dError("Version not compatible, cli 
ver: %d, svr ver: %d, ip:0x%x", pRpc->info.cliVer, svrVer, pRpc->info.conn.clientIp); goto _OVER; } @@ -387,13 +387,14 @@ int32_t dmInitClient(SDnode *pDnode) { rpcInit.supportBatch = 1; rpcInit.batchSize = 8 * 1024; rpcInit.timeToGetConn = tsTimeToGetAvailableConn; + rpcInit.notWaitAvaliableConn = 1; (void)taosVersionStrToInt(version, &(rpcInit.compatibilityVer)); pTrans->clientRpc = rpcOpen(&rpcInit); if (pTrans->clientRpc == NULL) { - dError("failed to init dnode rpc client"); - return -1; + dError("failed to init dnode rpc client since:%s", tstrerror(terrno)); + return terrno; } dDebug("dnode rpc client is initialized"); @@ -436,8 +437,8 @@ int32_t dmInitStatusClient(SDnode *pDnode) { pTrans->statusRpc = rpcOpen(&rpcInit); if (pTrans->statusRpc == NULL) { - dError("failed to init dnode rpc status client"); - return TSDB_CODE_OUT_OF_MEMORY; + dError("failed to init dnode rpc status client since %s", tstrerror(terrno)); + return terrno; } dDebug("dnode rpc status client is initialized"); @@ -481,8 +482,8 @@ int32_t dmInitSyncClient(SDnode *pDnode) { pTrans->syncRpc = rpcOpen(&rpcInit); if (pTrans->syncRpc == NULL) { - dError("failed to init dnode rpc sync client"); - return TSDB_CODE_OUT_OF_MEMORY; + dError("failed to init dnode rpc sync client since %s", tstrerror(terrno)); + return terrno; } dDebug("dnode rpc sync client is initialized"); @@ -531,7 +532,7 @@ int32_t dmInitServer(SDnode *pDnode) { (void)taosVersionStrToInt(version, &(rpcInit.compatibilityVer)); pTrans->serverRpc = rpcOpen(&rpcInit); if (pTrans->serverRpc == NULL) { - dError("failed to init dnode rpc server"); + dError("failed to init dnode rpc server since:%s", tstrerror(terrno)); return terrno; } diff --git a/source/dnode/mnode/impl/inc/mndConsumer.h b/source/dnode/mnode/impl/inc/mndConsumer.h index a773c61986..175730b91e 100644 --- a/source/dnode/mnode/impl/inc/mndConsumer.h +++ b/source/dnode/mnode/impl/inc/mndConsumer.h @@ -46,7 +46,7 @@ const char *mndConsumerStatusName(int status); #define 
MND_TMQ_NULL_CHECK(c) \ do { \ if (c == NULL) { \ - code = TSDB_CODE_OUT_OF_MEMORY; \ + code = TAOS_GET_TERRNO(TSDB_CODE_OUT_OF_MEMORY); \ goto END; \ } \ } while (0) diff --git a/source/dnode/mnode/impl/inc/mndStream.h b/source/dnode/mnode/impl/inc/mndStream.h index d713de5158..a5d91c8aa8 100644 --- a/source/dnode/mnode/impl/inc/mndStream.h +++ b/source/dnode/mnode/impl/inc/mndStream.h @@ -57,6 +57,12 @@ typedef struct SStreamTaskResetMsg { int32_t transId; } SStreamTaskResetMsg; +typedef struct SChkptReportInfo { + SArray* pTaskList; + int64_t reportChkpt; + int64_t streamId; +} SChkptReportInfo; + typedef struct SStreamExecInfo { bool initTaskList; SArray *pNodeList; @@ -66,9 +72,9 @@ typedef struct SStreamExecInfo { SArray *pTaskList; TdThreadMutex lock; SHashObj *pTransferStateStreams; - SHashObj *pChkptStreams; + SHashObj *pChkptStreams; // use to update the checkpoint info, if all tasks send the checkpoint-report msgs SHashObj *pStreamConsensus; - SArray *pKilledChkptTrans; // SArray + SArray *pKilledChkptTrans; // SArray } SStreamExecInfo; extern SStreamExecInfo execInfo; @@ -79,6 +85,8 @@ typedef struct SNodeEntry { bool stageUpdated; // the stage has been updated due to the leader/follower change or node reboot. SEpSet epset; // compare the epset to identify the vgroup tranferring between different dnodes. 
int64_t hbTimestamp; // second + int32_t lastHbMsgId; // latest hb msgId + int64_t lastHbMsgTs; } SNodeEntry; typedef struct { @@ -151,6 +159,8 @@ int32_t mndGetConsensusInfo(SHashObj *pHash, int64_t streamId, int32_t numOfTask void mndAddConsensusTasks(SCheckpointConsensusInfo *pInfo, const SRestoreCheckpointInfo *pRestoreInfo); void mndClearConsensusRspEntry(SCheckpointConsensusInfo *pInfo); int64_t mndClearConsensusCheckpointId(SHashObj* pHash, int64_t streamId); +int64_t mndClearChkptReportInfo(SHashObj* pHash, int64_t streamId); +int32_t mndResetChkptReportInfo(SHashObj* pHash, int64_t streamId); int32_t setStreamAttrInResBlock(SStreamObj *pStream, SSDataBlock *pBlock, int32_t numOfRows); int32_t setTaskAttrInResBlock(SStreamObj *pStream, SStreamTask *pTask, SSDataBlock *pBlock, int32_t numOfRows); diff --git a/source/dnode/mnode/impl/src/mndConsumer.c b/source/dnode/mnode/impl/src/mndConsumer.c index 6116d2da19..8f2523d50e 100644 --- a/source/dnode/mnode/impl/src/mndConsumer.c +++ b/source/dnode/mnode/impl/src/mndConsumer.c @@ -142,7 +142,7 @@ static int32_t mndProcessConsumerClearMsg(SRpcMsg *pMsg) { mndConsumerStatusName(pConsumer->status)); MND_TMQ_RETURN_CHECK(tNewSMqConsumerObj(pConsumer->consumerId, pConsumer->cgroup, -1, NULL, NULL, &pConsumerNew)); - pTrans = mndTransCreate(pMnode, TRN_POLICY_ROLLBACK, TRN_CONFLICT_NOTHING, pMsg, "clear-csm"); + pTrans = mndTransCreate(pMnode, TRN_POLICY_RETRY, TRN_CONFLICT_NOTHING, pMsg, "clear-csm"); MND_TMQ_NULL_CHECK(pTrans); MND_TMQ_RETURN_CHECK(mndSetConsumerDropLogs(pTrans, pConsumerNew)); code = mndTransPrepare(pMnode, pTrans); diff --git a/source/dnode/mnode/impl/src/mndMain.c b/source/dnode/mnode/impl/src/mndMain.c index 37a171e9a4..11787a015b 100644 --- a/source/dnode/mnode/impl/src/mndMain.c +++ b/source/dnode/mnode/impl/src/mndMain.c @@ -443,7 +443,7 @@ static int32_t mndInitTimer(SMnode *pMnode) { (void)taosThreadAttrInit(&thAttr); (void)taosThreadAttrSetDetachState(&thAttr, PTHREAD_CREATE_JOINABLE); if 
((code = taosThreadCreate(&pMnode->thread, &thAttr, mndThreadFp, pMnode)) != 0) { - mError("failed to create timer thread since %s", strerror(errno)); + mError("failed to create timer thread since %s", tstrerror(code)); TAOS_RETURN(code); } diff --git a/source/dnode/mnode/impl/src/mndStream.c b/source/dnode/mnode/impl/src/mndStream.c index 0b7562b923..20f0e7b105 100644 --- a/source/dnode/mnode/impl/src/mndStream.c +++ b/source/dnode/mnode/impl/src/mndStream.c @@ -533,7 +533,7 @@ int32_t mndPersistTaskDeployReq(STrans *pTrans, SStreamTask *pTask) { return code; } - code = setTransAction(pTrans, buf, tlen, TDMT_STREAM_TASK_DEPLOY, &pTask->info.epSet, 0, 0); + code = setTransAction(pTrans, buf, tlen, TDMT_STREAM_TASK_DEPLOY, &pTask->info.epSet, 0, TSDB_CODE_VND_INVALID_VGROUP_ID); if (code) { taosMemoryFree(buf); } @@ -2139,7 +2139,7 @@ static int32_t refreshNodeListFromExistedStreams(SMnode *pMnode, SArray *pNodeLi break; } - SNodeEntry entry = {.hbTimestamp = -1, .nodeId = pTask->info.nodeId}; + SNodeEntry entry = {.hbTimestamp = -1, .nodeId = pTask->info.nodeId, .lastHbMsgId = -1}; epsetAssign(&entry.epset, &pTask->info.epSet); (void)taosHashPut(pHash, &entry.nodeId, sizeof(entry.nodeId), &entry, sizeof(entry)); } @@ -2319,7 +2319,7 @@ void saveTaskAndNodeInfoIntoBuf(SStreamObj *pStream, SStreamExecInfo *pExecNode) } if (!exist) { - SNodeEntry nodeEntry = {.hbTimestamp = -1, .nodeId = pTask->info.nodeId}; + SNodeEntry nodeEntry = {.hbTimestamp = -1, .nodeId = pTask->info.nodeId, .lastHbMsgId = -1}; epsetAssign(&nodeEntry.epset, &pTask->info.epSet); void* px = taosArrayPush(pExecNode->pNodeList, &nodeEntry); @@ -2420,7 +2420,7 @@ int32_t mndProcessStreamReqCheckpoint(SRpcMsg *pReq) { if (pStream != NULL) { // TODO:handle error code = mndProcessStreamCheckpointTrans(pMnode, pStream, checkpointId, 0, false); if (code) { - mError("failed to create checkpoint trans, code:%s", strerror(code)); + mError("failed to create checkpoint trans, code:%s", tstrerror(code)); } } 
else { // todo: wait for the create stream trans completed, and launch the checkpoint trans @@ -2454,8 +2454,45 @@ int32_t mndProcessStreamReqCheckpoint(SRpcMsg *pReq) { return 0; } -static void doAddReportStreamTask(SArray* pList, const SCheckpointReport* pReport) { - bool existed = false; +// valid the info according to the HbMsg +static bool validateChkptReport(const SCheckpointReport *pReport, int64_t reportChkptId) { + STaskId id = {.streamId = pReport->streamId, .taskId = pReport->taskId}; + STaskStatusEntry *pTaskEntry = taosHashGet(execInfo.pTaskMap, &id, sizeof(id)); + if (pTaskEntry == NULL) { + mError("invalid checkpoint-report msg from task:0x%x, discard", pReport->taskId); + return false; + } + + if (pTaskEntry->checkpointInfo.latestId >= pReport->checkpointId) { + mError("s-task:0x%x invalid checkpoint-report msg, checkpointId:%" PRId64 " saved checkpointId:%" PRId64 " discard", + pReport->taskId, pReport->checkpointId, pTaskEntry->checkpointInfo.activeId); + return false; + } + + // now the task in checkpoint procedure + if ((pTaskEntry->checkpointInfo.activeId != 0) && (pTaskEntry->checkpointInfo.activeId > pReport->checkpointId)) { + mError("s-task:0x%x invalid checkpoint-report msg, checkpointId:%" PRId64 " active checkpointId:%" PRId64 + " discard", + pReport->taskId, pReport->checkpointId, pTaskEntry->checkpointInfo.activeId); + return false; + } + + if (reportChkptId >= pReport->checkpointId) { + mError("s-task:0x%x expired checkpoint-report msg, checkpointId:%" PRId64 " already update checkpointId:%" PRId64 + " discard", + pReport->taskId, pReport->checkpointId, reportChkptId); + return false; + } + + return true; +} + +static void doAddReportStreamTask(SArray *pList, int64_t reportChkptId, const SCheckpointReport *pReport) { + bool valid = validateChkptReport(pReport, reportChkptId); + if (!valid) { + return; + } + for (int32_t i = 0; i < taosArrayGetSize(pList); ++i) { STaskChkptInfo *p = taosArrayGet(pList, i); if (p == NULL) { @@ -2463,27 
+2500,38 @@ static void doAddReportStreamTask(SArray* pList, const SCheckpointReport* pRepor } if (p->taskId == pReport->taskId) { - existed = true; - break; + if (p->checkpointId > pReport->checkpointId) { + mError("s-task:0x%x invalid checkpoint-report msg, existed:%" PRId64 " req checkpointId:%" PRId64 ", discard", + pReport->taskId, p->checkpointId, pReport->checkpointId); + } else if (p->checkpointId < pReport->checkpointId) { // expired checkpoint-report msg, update it + mDebug("s-task:0x%x expired checkpoint-report msg in checkpoint-report list update from %" PRId64 "->%" PRId64, + pReport->taskId, p->checkpointId, pReport->checkpointId); + + memcpy(p, pReport, sizeof(STaskChkptInfo)); + } else { + mWarn("taskId:0x%x already in checkpoint-report list", pReport->taskId); + } + return; } } - if (!existed) { - STaskChkptInfo info = { - .streamId = pReport->streamId, - .taskId = pReport->taskId, - .transId = pReport->transId, - .dropHTask = pReport->dropHTask, - .version = pReport->checkpointVer, - .ts = pReport->checkpointTs, - .checkpointId = pReport->checkpointId, - .nodeId = pReport->nodeId, - }; + STaskChkptInfo info = { + .streamId = pReport->streamId, + .taskId = pReport->taskId, + .transId = pReport->transId, + .dropHTask = pReport->dropHTask, + .version = pReport->checkpointVer, + .ts = pReport->checkpointTs, + .checkpointId = pReport->checkpointId, + .nodeId = pReport->nodeId, + }; - void* p = taosArrayPush(pList, &info); - if (p == NULL) { - mError("failed to put into task list, taskId:0x%x", pReport->taskId); - } + void *p = taosArrayPush(pList, &info); + if (p == NULL) { + mError("failed to put into task list, taskId:0x%x", pReport->taskId); + } else { + int32_t size = taosArrayGetSize(pList); + mDebug("stream:0x%"PRIx64" %d tasks has send checkpoint-report", pReport->streamId, size); } } @@ -2530,23 +2578,23 @@ int32_t mndProcessCheckpointReport(SRpcMsg *pReq) { int32_t numOfTasks = (pStream == NULL) ? 
0 : mndGetNumOfStreamTasks(pStream); - SArray **pReqTaskList = (SArray **)taosHashGet(execInfo.pChkptStreams, &req.streamId, sizeof(req.streamId)); - if (pReqTaskList == NULL) { - SArray *pList = taosArrayInit(4, sizeof(STaskChkptInfo)); - if (pList != NULL) { - doAddReportStreamTask(pList, &req); - code = taosHashPut(execInfo.pChkptStreams, &req.streamId, sizeof(req.streamId), &pList, POINTER_BYTES); + SChkptReportInfo *pInfo = (SChkptReportInfo*)taosHashGet(execInfo.pChkptStreams, &req.streamId, sizeof(req.streamId)); + if (pInfo == NULL) { + SChkptReportInfo info = {.pTaskList = taosArrayInit(4, sizeof(STaskChkptInfo)), .streamId = req.streamId}; + if (info.pTaskList != NULL) { + doAddReportStreamTask(info.pTaskList, info.reportChkpt, &req); + code = taosHashPut(execInfo.pChkptStreams, &req.streamId, sizeof(req.streamId), &info, sizeof(info)); if (code) { mError("stream:0x%" PRIx64 " failed to put into checkpoint stream", req.streamId); } - pReqTaskList = (SArray **)taosHashGet(execInfo.pChkptStreams, &req.streamId, sizeof(req.streamId)); + pInfo = (SChkptReportInfo *)taosHashGet(execInfo.pChkptStreams, &req.streamId, sizeof(req.streamId)); } } else { - doAddReportStreamTask(*pReqTaskList, &req); + doAddReportStreamTask(pInfo->pTaskList, pInfo->reportChkpt, &req); } - int32_t total = taosArrayGetSize(*pReqTaskList); + int32_t total = taosArrayGetSize(pInfo->pTaskList); if (total == numOfTasks) { // all tasks has send the reqs mInfo("stream:0x%" PRIx64 " %s all %d tasks send checkpoint-report, checkpoint meta-info for checkpointId:%" PRId64 " will be issued soon", diff --git a/source/dnode/mnode/impl/src/mndStreamHb.c b/source/dnode/mnode/impl/src/mndStreamHb.c index 11556a212d..59f07ce977 100644 --- a/source/dnode/mnode/impl/src/mndStreamHb.c +++ b/source/dnode/mnode/impl/src/mndStreamHb.c @@ -211,6 +211,10 @@ int32_t mndProcessResetStatusReq(SRpcMsg *pReq) { SStreamTaskResetMsg* pMsg = pReq->pCont; mndKillTransImpl(pMnode, pMsg->transId, ""); + 
streamMutexLock(&execInfo.lock); + (void) mndResetChkptReportInfo(execInfo.pChkptStreams, pMsg->streamId); + streamMutexUnlock(&execInfo.lock); + code = mndGetStreamObj(pMnode, pMsg->streamId, &pStream); if (pStream == NULL || code != 0) { code = TSDB_CODE_STREAM_TASK_NOT_EXIST; @@ -333,7 +337,8 @@ int32_t mndProcessStreamHb(SRpcMsg *pReq) { } tDecoderClear(&decoder); - mDebug("receive stream-meta hb from vgId:%d, active numOfTasks:%d, msgId:%d", req.vgId, req.numOfTasks, req.msgId); + mDebug("receive stream-meta hb from vgId:%d, active numOfTasks:%d, HbMsgId:%d, HbMsgTs:%" PRId64, req.vgId, + req.numOfTasks, req.msgId, req.ts); pFailedChkpt = taosArrayInit(4, sizeof(SFailedCheckpointInfo)); pOrphanTasks = taosArrayInit(4, sizeof(SOrphanTask)); @@ -356,6 +361,31 @@ int32_t mndProcessStreamHb(SRpcMsg *pReq) { TAOS_RETURN(TSDB_CODE_INVALID_MSG); } + for(int32_t i = 0; i < taosArrayGetSize(execInfo.pNodeList); ++i) { + SNodeEntry* pEntry = taosArrayGet(execInfo.pNodeList, i); + if (pEntry == NULL) { + continue; + } + + if (pEntry->nodeId != req.vgId) { + continue; + } + + if ((pEntry->lastHbMsgId == req.msgId) && (pEntry->lastHbMsgTs == req.ts)) { + mError("vgId:%d HbMsgId:%d already handled, bh msg discard", pEntry->nodeId, req.msgId); + + terrno = TSDB_CODE_INVALID_MSG; + doSendHbMsgRsp(terrno, &pReq->info, req.vgId, req.msgId); + + streamMutexUnlock(&execInfo.lock); + cleanupAfterProcessHbMsg(&req, pFailedChkpt, pOrphanTasks); + return terrno; + } else { + pEntry->lastHbMsgId = req.msgId; + pEntry->lastHbMsgTs = req.ts; + } + } + int32_t numOfUpdated = taosArrayGetSize(req.pUpdateNodes); if (numOfUpdated > 0) { mDebug("%d stream node(s) need updated from hbMsg(vgId:%d)", numOfUpdated, req.vgId); @@ -393,6 +423,7 @@ int32_t mndProcessStreamHb(SRpcMsg *pReq) { SStreamObj *pStream = NULL; code = mndGetStreamObj(pMnode, p->id.streamId, &pStream); if (code) { + mError("stream obj not exist, failed to handle consensus checkpoint-info req, code:%s", tstrerror(code)); 
continue; } @@ -426,7 +457,7 @@ int32_t mndProcessStreamHb(SRpcMsg *pReq) { addIntoCheckpointList(pFailedChkpt, &info); // remove failed trans from pChkptStreams - code = taosHashRemove(execInfo.pChkptStreams, &p->id.streamId, sizeof(p->id.streamId)); + code = mndResetChkptReportInfo(execInfo.pChkptStreams, p->id.streamId); if (code) { mError("failed to remove stream:0x%"PRIx64" in checkpoint stream list", p->id.streamId); } @@ -484,14 +515,14 @@ int32_t mndProcessStreamHb(SRpcMsg *pReq) { } if (pMnode != NULL) { // make sure that the unit test case can work - mndStreamSendUpdateChkptInfoMsg(pMnode); + code = mndStreamSendUpdateChkptInfoMsg(pMnode); } streamMutexUnlock(&execInfo.lock); doSendHbMsgRsp(TSDB_CODE_SUCCESS, &pReq->info, req.vgId, req.msgId); - cleanupAfterProcessHbMsg(&req, pFailedChkpt, pOrphanTasks); + return code; } diff --git a/source/dnode/mnode/impl/src/mndStreamUtil.c b/source/dnode/mnode/impl/src/mndStreamUtil.c index 38fddd8bf0..383ffe16da 100644 --- a/source/dnode/mnode/impl/src/mndStreamUtil.c +++ b/source/dnode/mnode/impl/src/mndStreamUtil.c @@ -129,6 +129,8 @@ int32_t mndTakeVgroupSnapshot(SMnode *pMnode, bool *allReady, SArray **pList) { goto _err; } + *allReady = true; + while (1) { pIter = sdbFetch(pSdb, SDB_VGROUP, pIter, (void **)&pVgroup); if (pIter == NULL) { @@ -320,7 +322,7 @@ static int32_t doSetResumeAction(STrans *pTrans, SMnode *pMnode, SStreamTask *pT return terrno; } - code = setTransAction(pTrans, pReq, sizeof(SVResumeStreamTaskReq), TDMT_STREAM_TASK_RESUME, &epset, 0, 0); + code = setTransAction(pTrans, pReq, sizeof(SVResumeStreamTaskReq), TDMT_STREAM_TASK_RESUME, &epset, 0, TSDB_CODE_VND_INVALID_VGROUP_ID); if (code != 0) { taosMemoryFree(pReq); return terrno; @@ -424,7 +426,7 @@ static int32_t doSetPauseAction(SMnode *pMnode, STrans *pTrans, SStreamTask *pTa (void) epsetToStr(&epset, buf, tListLen(buf)); mDebug("pause stream task in node:%d, epset:%s", pTask->info.nodeId, buf); - code = setTransAction(pTrans, pReq, 
sizeof(SVPauseStreamTaskReq), TDMT_STREAM_TASK_PAUSE, &epset, 0, 0); + code = setTransAction(pTrans, pReq, sizeof(SVPauseStreamTaskReq), TDMT_STREAM_TASK_PAUSE, &epset, 0, TSDB_CODE_VND_INVALID_VGROUP_ID); if (code != 0) { taosMemoryFree(pReq); return code; @@ -484,7 +486,7 @@ static int32_t doSetDropAction(SMnode *pMnode, STrans *pTrans, SStreamTask *pTas } // The epset of nodeId of this task may have been expired now, let's use the newest epset from mnode. - code = setTransAction(pTrans, pReq, sizeof(SVDropStreamTaskReq), TDMT_STREAM_TASK_DROP, &epset, 0, 0); + code = setTransAction(pTrans, pReq, sizeof(SVDropStreamTaskReq), TDMT_STREAM_TASK_DROP, &epset, 0, TSDB_CODE_VND_INVALID_VGROUP_ID); if (code != 0) { taosMemoryFree(pReq); return code; @@ -540,7 +542,7 @@ static int32_t doSetDropActionFromId(SMnode *pMnode, STrans *pTrans, SOrphanTask } // The epset of nodeId of this task may have been expired now, let's use the newest epset from mnode. - code = setTransAction(pTrans, pReq, sizeof(SVDropStreamTaskReq), TDMT_STREAM_TASK_DROP, &epset, 0, 0); + code = setTransAction(pTrans, pReq, sizeof(SVDropStreamTaskReq), TDMT_STREAM_TASK_DROP, &epset, 0, TSDB_CODE_VND_INVALID_VGROUP_ID); if (code != 0) { taosMemoryFree(pReq); return code; @@ -713,7 +715,7 @@ static int32_t doSetResetAction(SMnode *pMnode, STrans *pTrans, SStreamTask *pTa return code; } - code = setTransAction(pTrans, pReq, sizeof(SVResetStreamTaskReq), TDMT_VND_STREAM_TASK_RESET, &epset, 0, 0); + code = setTransAction(pTrans, pReq, sizeof(SVResetStreamTaskReq), TDMT_VND_STREAM_TASK_RESET, &epset, 0, TSDB_CODE_VND_INVALID_VGROUP_ID); if (code != TSDB_CODE_SUCCESS) { taosMemoryFree(pReq); } @@ -811,9 +813,13 @@ void removeExpiredNodeInfo(const SArray *pNodeSnapshot) { } if (pEntry->nodeId == p->nodeId) { + p->hbTimestamp = pEntry->hbTimestamp; + void* px = taosArrayPush(pValidList, p); if (px == NULL) { mError("failed to put node into list, nodeId:%d", p->nodeId); + } else { + mDebug("vgId:%d ts:%" PRId64 " 
HbMsgId:%d is valid", p->nodeId, p->hbTimestamp, p->lastHbMsgId);
         }
         break;
       }
@@ -898,8 +904,9 @@ void removeStreamTasksInBuf(SStreamObj *pStream, SStreamExecInfo *pExecNode) {
 
   ASSERT(taosHashGetSize(pExecNode->pTaskMap) == taosArrayGetSize(pExecNode->pTaskList));
 
-  // 2. remove stream entry in consensus hash table
+  // 2. remove stream entry in consensus hash table and checkpoint-report hash table
   (void) mndClearConsensusCheckpointId(execInfo.pStreamConsensus, pStream->uid);
+  (void) mndClearChkptReportInfo(execInfo.pChkptStreams, pStream->uid);
 
   streamMutexUnlock(&pExecNode->lock);
   destroyStreamTaskIter(pIter);
@@ -967,9 +974,8 @@ int32_t removeExpiredNodeEntryAndTaskInBuf(SArray *pNodeSnapshot) {
 static int32_t doSetUpdateChkptAction(SMnode *pMnode, STrans *pTrans, SStreamTask *pTask) {
   SVUpdateCheckpointInfoReq *pReq = taosMemoryCalloc(1, sizeof(SVUpdateCheckpointInfoReq));
   if (pReq == NULL) {
-    terrno = TSDB_CODE_OUT_OF_MEMORY;
     mError("failed to malloc in reset stream, size:%" PRIzu ", code:%s", sizeof(SVUpdateCheckpointInfoReq),
-           tstrerror(TSDB_CODE_OUT_OF_MEMORY));
+           tstrerror(terrno));
     return terrno;
   }
 
@@ -977,12 +983,14 @@ static int32_t doSetUpdateChkptAction(SMnode *pMnode, STrans *pTrans, SStreamTas
   pReq->taskId = pTask->id.taskId;
   pReq->streamId = pTask->id.streamId;
 
-  SArray **pReqTaskList = (SArray **)taosHashGet(execInfo.pChkptStreams, &pTask->id.streamId, sizeof(pTask->id.streamId));
-  ASSERT(pReqTaskList);
+  SChkptReportInfo *pStreamItem = (SChkptReportInfo*)taosHashGet(execInfo.pChkptStreams, &pTask->id.streamId, sizeof(pTask->id.streamId));
+  if (pStreamItem == NULL) {
+    return TSDB_CODE_INVALID_PARA;
+  }
 
-  int32_t size = taosArrayGetSize(*pReqTaskList);
+  int32_t size = taosArrayGetSize(pStreamItem->pTaskList);
   for(int32_t i = 0; i < size; ++i) {
-    STaskChkptInfo* pInfo = taosArrayGet(*pReqTaskList, i);
+    STaskChkptInfo* pInfo = taosArrayGet(pStreamItem->pTaskList, i);
     if (pInfo == NULL) {
       continue;
     }
@@ -1057,11 +1065,15 @@ int32_t mndScanCheckpointReportInfo(SRpcMsg *pReq) {
   }
 
   mDebug("start to scan checkpoint report info");
+  streamMutexLock(&execInfo.lock);
 
   while ((pIter = taosHashIterate(execInfo.pChkptStreams, pIter)) != NULL) {
-    SArray *pList = *(SArray **)pIter;
+    SChkptReportInfo* px = (SChkptReportInfo *)pIter;
+    if (taosArrayGetSize(px->pTaskList) == 0) {
+      continue;
+    }
 
-    STaskChkptInfo *pInfo = taosArrayGet(pList, 0);
+    STaskChkptInfo *pInfo = taosArrayGet(px->pTaskList, 0);
     if (pInfo == NULL) {
       continue;
     }
@@ -1074,12 +1086,11 @@ int32_t mndScanCheckpointReportInfo(SRpcMsg *pReq) {
       if (p == NULL) {
         mError("failed to put stream into drop list:0x%" PRIx64, pInfo->streamId);
       }
-
       continue;
     }
 
     int32_t total = mndGetNumOfStreamTasks(pStream);
-    int32_t existed = (int32_t)taosArrayGetSize(pList);
+    int32_t existed = (int32_t)taosArrayGetSize(px->pTaskList);
 
     if (total == existed) {
       mDebug("stream:0x%" PRIx64 " %s all %d tasks send checkpoint-report, start to update checkpoint-info",
@@ -1087,14 +1098,11 @@ int32_t mndScanCheckpointReportInfo(SRpcMsg *pReq) {
 
       bool conflict = mndStreamTransConflictCheck(pMnode, pStream->uid, MND_STREAM_CHKPT_UPDATE_NAME, false);
       if (!conflict) {
-        code = mndCreateStreamChkptInfoUpdateTrans(pMnode, pStream, pList);
+        code = mndCreateStreamChkptInfoUpdateTrans(pMnode, pStream, px->pTaskList);
         if (code == TSDB_CODE_SUCCESS || code == TSDB_CODE_ACTION_IN_PROGRESS) {  // remove this entry
-          void* p = taosArrayPush(pDropped, &pInfo->streamId);
-          if (p == NULL) {
-            mError("failed to remove stream:0x%" PRIx64, pInfo->streamId);
-          } else {
-            mDebug("stream:0x%" PRIx64 " removed", pInfo->streamId);
-          }
+          taosArrayClear(px->pTaskList);
+          px->reportChkpt = pInfo->checkpointId;
+          mDebug("stream:0x%" PRIx64 " clear checkpoint-report list", pInfo->streamId);
         } else {
           mDebug("stream:0x%" PRIx64 " not launch chkpt-meta update trans, due to checkpoint not finished yet",
                  pInfo->streamId);
@@ -1129,6 +1137,8 @@ int32_t mndScanCheckpointReportInfo(SRpcMsg *pReq) {
     mDebug("drop %d stream(s) in checkpoint-report list, remain:%d", size, numOfStreams);
   }
 
+  streamMutexUnlock(&execInfo.lock);
+
   taosArrayDestroy(pDropped);
   return TSDB_CODE_SUCCESS;
 }
@@ -1313,7 +1323,7 @@ int64_t mndClearConsensusCheckpointId(SHashObj* pHash, int64_t streamId) {
   int32_t code = 0;
   int32_t numOfStreams = taosHashGetSize(pHash);
   if (numOfStreams == 0) {
-    return TSDB_CODE_SUCCESS;
+    return code;
   }
 
   code = taosHashRemove(pHash, &streamId, sizeof(streamId));
@@ -1326,6 +1336,35 @@ int64_t mndClearConsensusCheckpointId(SHashObj* pHash, int64_t streamId) {
   return code;
 }
 
+int64_t mndClearChkptReportInfo(SHashObj* pHash, int64_t streamId) {
+  int32_t code = 0;
+  int32_t numOfStreams = taosHashGetSize(pHash);
+  if (numOfStreams == 0) {
+    return code;
+  }
+
+  code = taosHashRemove(pHash, &streamId, sizeof(streamId));
+  if (code == 0) {
+    mDebug("drop stream:0x%" PRIx64 " in chkpt-report list, remain:%d", streamId, numOfStreams);
+  } else {
+    mError("failed to remove stream:0x%"PRIx64" in chkpt-report list, remain:%d", streamId, numOfStreams);
+  }
+
+  return code;
+}
+
+int32_t mndResetChkptReportInfo(SHashObj* pHash, int64_t streamId) {
+  SChkptReportInfo* pInfo = taosHashGet(pHash, &streamId, sizeof(streamId));
+  if (pInfo != NULL) {
+    taosArrayClear(pInfo->pTaskList);
+    mDebug("stream:0x%" PRIx64 " checkpoint-report list cleared, prev report checkpointId:%" PRId64, streamId,
+           pInfo->reportChkpt);
+    return 0;
+  }
+
+  return TSDB_CODE_MND_STREAM_NOT_EXIST;
+}
+
 static void mndShowStreamStatus(char *dst, SStreamObj *pStream) {
   int8_t status = atomic_load_8(&pStream->status);
   if (status == STREAM_STATUS__NORMAL) {
diff --git a/source/dnode/mnode/impl/src/mndSubscribe.c b/source/dnode/mnode/impl/src/mndSubscribe.c
index 1131f2f2d5..b3f8cf0aea 100644
--- a/source/dnode/mnode/impl/src/mndSubscribe.c
+++ b/source/dnode/mnode/impl/src/mndSubscribe.c
@@ -643,7 +643,7 @@ static int32_t mndPersistRebResult(SMnode *pMnode, SRpcMsg *pMsg, const SMqRebOu
   char cgroup[TSDB_CGROUP_LEN] = {0};
   mndSplitSubscribeKey(pOutput->pSub->key, topic, cgroup, true);
 
-  pTrans = mndTransCreate(pMnode, TRN_POLICY_ROLLBACK, TRN_CONFLICT_DB_INSIDE, pMsg, "tmq-reb");
+  pTrans = mndTransCreate(pMnode, TRN_POLICY_RETRY, TRN_CONFLICT_DB_INSIDE, pMsg, "tmq-reb");
   if (pTrans == NULL) {
     code = TSDB_CODE_MND_RETURN_VALUE_NULL;
     if (terrno != 0) code = terrno;
@@ -882,7 +882,7 @@ static int32_t buildRebOutput(SMnode *pMnode, SMqRebInputObj *rebInput, SMqRebOu
     rebInput->oldConsumerNum = 0;
     code = mndCreateSubscription(pMnode, pTopic, key, &rebOutput->pSub);
     if (code != 0) {
-      mError("[rebalance] mq rebalance %s failed create sub since %s, ignore", key, terrstr());
+      mError("[rebalance] mq rebalance %s failed create sub since %s, ignore", key, tstrerror(code));
       taosRUnLockLatch(&pTopic->lock);
       mndReleaseTopic(pMnode, pTopic);
       return code;
@@ -1067,7 +1067,7 @@ static int32_t mndProcessDropCgroupReq(SRpcMsg *pMsg) {
       return 0;
     } else {
       code = TSDB_CODE_MND_SUBSCRIBE_NOT_EXIST;
-      mError("topic:%s, cgroup:%s, failed to drop since %s", dropReq.topic, dropReq.cgroup, terrstr());
+      mError("topic:%s, cgroup:%s, failed to drop since %s", dropReq.topic, dropReq.cgroup, tstrerror(code));
       return code;
     }
   }
@@ -1075,11 +1075,11 @@ static int32_t mndProcessDropCgroupReq(SRpcMsg *pMsg) {
   taosWLockLatch(&pSub->lock);
   if (taosHashGetSize(pSub->consumerHash) != 0) {
     code = TSDB_CODE_MND_CGROUP_USED;
-    mError("cgroup:%s on topic:%s, failed to drop since %s", dropReq.cgroup, dropReq.topic, terrstr());
+    mError("cgroup:%s on topic:%s, failed to drop since %s", dropReq.cgroup, dropReq.topic, tstrerror(code));
     goto END;
   }
 
-  pTrans = mndTransCreate(pMnode, TRN_POLICY_ROLLBACK, TRN_CONFLICT_DB, pMsg, "drop-cgroup");
+  pTrans = mndTransCreate(pMnode, TRN_POLICY_RETRY, TRN_CONFLICT_DB, pMsg, "drop-cgroup");
   MND_TMQ_NULL_CHECK(pTrans);
   mInfo("trans:%d, used to drop cgroup:%s on topic %s", pTrans->id, dropReq.cgroup, dropReq.topic);
   mndTransSetDbName(pTrans, pSub->dbName, NULL);
@@ -1372,7 +1372,7 @@ static int32_t buildResult(SSDataBlock *pBlock, int32_t *numOfRows, int64_t cons
           OffsetRows *tmp = taosArrayGet(offsetRows, i);
           MND_TMQ_NULL_CHECK(tmp);
           if (tmp->vgId != pVgEp->vgId) {
-            mError("mnd show subscriptions: do not find vgId:%d, %d in offsetRows", tmp->vgId, pVgEp->vgId);
+            mInfo("mnd show subscriptions: do not find vgId:%d, %d in offsetRows", tmp->vgId, pVgEp->vgId);
             continue;
           }
           data = tmp;
@@ -1396,7 +1396,7 @@ static int32_t buildResult(SSDataBlock *pBlock, int32_t *numOfRows, int64_t cons
         pColInfo = taosArrayGet(pBlock->pDataBlock, cols++);
         MND_TMQ_NULL_CHECK(pColInfo);
         colDataSetNULL(pColInfo, *numOfRows);
-        mError("mnd show subscriptions: do not find vgId:%d in offsetRows", pVgEp->vgId);
+        mInfo("mnd show subscriptions: do not find vgId:%d in offsetRows", pVgEp->vgId);
       }
       (*numOfRows)++;
     }
diff --git a/source/dnode/mnode/impl/src/mndTopic.c b/source/dnode/mnode/impl/src/mndTopic.c
index 4bca13508e..dcac86a7d4 100644
--- a/source/dnode/mnode/impl/src/mndTopic.c
+++ b/source/dnode/mnode/impl/src/mndTopic.c
@@ -431,7 +431,7 @@ static int32_t mndCreateTopic(SMnode *pMnode, SRpcMsg *pReq, SCMCreateTopicReq *
   SQueryPlan *pPlan = NULL;
   SMqTopicObj topicObj = {0};
 
-  pTrans = mndTransCreate(pMnode, TRN_POLICY_ROLLBACK, TRN_CONFLICT_DB, pReq, "create-topic");
+  pTrans = mndTransCreate(pMnode, TRN_POLICY_RETRY, TRN_CONFLICT_DB, pReq, "create-topic");
   MND_TMQ_NULL_CHECK(pTrans);
   mndTransSetDbName(pTrans, pDb->name, NULL);
   MND_TMQ_RETURN_CHECK(mndTransCheckConflict(pMnode, pTrans));
@@ -574,7 +574,7 @@ static int32_t mndProcessCreateTopicReq(SRpcMsg *pReq) {
 
 END:
   if (code != 0 && code != TSDB_CODE_ACTION_IN_PROGRESS) {
-    mError("failed to create topic:%s since %s", createTopicReq.name, terrstr());
+    mError("failed to create topic:%s since %s", createTopicReq.name, tstrerror(code));
   }
 
   mndReleaseTopic(pMnode, pTopic);
@@ -699,7 +699,7 @@ static int32_t mndProcessDropTopicReq(SRpcMsg *pReq) {
       tFreeSMDropTopicReq(&dropReq);
       return 0;
     } else {
-      mError("topic:%s, failed to drop since %s", dropReq.name, terrstr());
+      mError("topic:%s, failed to drop since %s", dropReq.name, tstrerror(code));
       tFreeSMDropTopicReq(&dropReq);
       return code;
     }
@@ -727,7 +727,7 @@ END:
   mndReleaseTopic(pMnode, pTopic);
   mndTransDrop(pTrans);
   if (code != 0) {
-    mError("topic:%s, failed to drop since %s", dropReq.name, terrstr());
+    mError("topic:%s, failed to drop since %s", dropReq.name, tstrerror(code));
     tFreeSMDropTopicReq(&dropReq);
     return code;
   }
diff --git a/source/dnode/mnode/impl/src/mndUser.c b/source/dnode/mnode/impl/src/mndUser.c
index 8bb2f11e7c..c4823dc62e 100644
--- a/source/dnode/mnode/impl/src/mndUser.c
+++ b/source/dnode/mnode/impl/src/mndUser.c
@@ -1677,7 +1677,7 @@ int32_t mndAcquireUser(SMnode *pMnode, const char *userName, SUserObj **ppUser)
 
   *ppUser = sdbAcquire(pSdb, SDB_USER, userName);
   if (*ppUser == NULL) {
-    if (code == TSDB_CODE_SDB_OBJ_NOT_THERE) {
+    if (terrno == TSDB_CODE_SDB_OBJ_NOT_THERE) {
       code = TSDB_CODE_MND_USER_NOT_EXIST;
     } else {
       code = TSDB_CODE_MND_USER_NOT_AVAILABLE;
@@ -3149,7 +3149,8 @@ int32_t mndValidateUserAuthInfo(SMnode *pMnode, SUserAuthVersion *pUsers, int32_
         (void)memcpy(rsp.user, pUsers[i].user, TSDB_USER_LEN);
         (void)taosArrayPush(batchRsp.pArray, &rsp);
       }
-      mError("user:%s, failed to auth user since %s", pUsers[i].user, terrstr());
+      mError("user:%s, failed to auth user since %s", pUsers[i].user, tstrerror(code));
+      code = 0;
       continue;
     }
 
diff --git a/source/dnode/vnode/src/inc/vnodeInt.h b/source/dnode/vnode/src/inc/vnodeInt.h
index c45d6e6edf..e81b858d0e 100644
--- a/source/dnode/vnode/src/inc/vnodeInt.h
+++ b/source/dnode/vnode/src/inc/vnodeInt.h
@@ -131,7 +131,7 @@ typedef SVCreateTSmaReq SSmaCfg;
 SMTbCursor* metaOpenTbCursor(void* pVnode);
 void        metaCloseTbCursor(SMTbCursor* pTbCur);
 void        metaPauseTbCursor(SMTbCursor* pTbCur);
-void        metaResumeTbCursor(SMTbCursor* pTbCur, int8_t first, int8_t move);
+int32_t     metaResumeTbCursor(SMTbCursor* pTbCur, int8_t first, int8_t move);
 int32_t     metaTbCursorNext(SMTbCursor* pTbCur, ETableType jumpTableType);
 int32_t     metaTbCursorPrev(SMTbCursor* pTbCur, ETableType jumpTableType);
 
@@ -362,7 +362,7 @@ int32_t streamTaskSnapReaderClose(SStreamTaskReader* pReader);
 int32_t streamTaskSnapRead(SStreamTaskReader* pReader, uint8_t** ppData);
 
 int32_t streamTaskSnapWriterOpen(STQ* pTq, int64_t sver, int64_t ever, SStreamTaskWriter** ppWriter);
-int32_t streamTaskSnapWriterClose(SStreamTaskWriter* ppWriter, int8_t rollback);
+int32_t streamTaskSnapWriterClose(SStreamTaskWriter* ppWriter, int8_t rollback, int8_t loadTask);
 int32_t streamTaskSnapWrite(SStreamTaskWriter* pWriter, uint8_t* pData, uint32_t nData);
 
 int32_t streamStateSnapReaderOpen(STQ* pTq, int64_t sver, int64_t ever, SStreamStateReader** ppReader);
@@ -473,6 +473,7 @@ struct SVnode {
   STfs*     pTfs;
   int32_t   diskPrimary;
   SMsgCb    msgCb;
+  bool      disableWrite;
 
   // Buffer Pool
   TdThreadMutex mutex;
diff --git a/source/dnode/vnode/src/meta/metaQuery.c b/source/dnode/vnode/src/meta/metaQuery.c
index d4ff980311..27a4179172 100644
--- a/source/dnode/vnode/src/meta/metaQuery.c
+++ b/source/dnode/vnode/src/meta/metaQuery.c
@@ -231,6 +231,7 @@ _exit:
 #if 1  // ===================================================
 SMTbCursor *metaOpenTbCursor(void *pVnode) {
   SMTbCursor *pTbCur = NULL;
+  int32_t     code;
 
   pTbCur = (SMTbCursor *)taosMemoryCalloc(1, sizeof(*pTbCur));
   if (pTbCur == NULL) {
@@ -241,7 +242,12 @@ SMTbCursor *metaOpenTbCursor(void *pVnode) {
   // tdbTbcMoveToFirst((TBC *)pTbCur->pDbc);
   pTbCur->pMeta = pVnodeObj->pMeta;
   pTbCur->paused = 1;
-  metaResumeTbCursor(pTbCur, 1, 0);
+  code = metaResumeTbCursor(pTbCur, 1, 0);
+  if (code) {
+    terrno = code;
+    taosMemoryFree(pTbCur);
+    return NULL;
+  }
   return pTbCur;
 }
 
@@ -266,28 +272,39 @@ void metaPauseTbCursor(SMTbCursor *pTbCur) {
     pTbCur->paused = 1;
   }
 }
-void metaResumeTbCursor(SMTbCursor *pTbCur, int8_t first, int8_t move) {
+int32_t metaResumeTbCursor(SMTbCursor *pTbCur, int8_t first, int8_t move) {
+  int32_t code = 0;
+  int32_t lino;
+
   if (pTbCur->paused) {
     metaReaderDoInit(&pTbCur->mr, pTbCur->pMeta, META_READER_LOCK);
 
-    (void)tdbTbcOpen(((SMeta *)pTbCur->pMeta)->pUidIdx, (TBC **)&pTbCur->pDbc, NULL);
+    code = tdbTbcOpen(((SMeta *)pTbCur->pMeta)->pUidIdx, (TBC **)&pTbCur->pDbc, NULL);
+    TSDB_CHECK_CODE(code, lino, _exit);
 
     if (first) {
-      (void)tdbTbcMoveToFirst((TBC *)pTbCur->pDbc);
+      code = tdbTbcMoveToFirst((TBC *)pTbCur->pDbc);
+      TSDB_CHECK_CODE(code, lino, _exit);
     } else {
       int c = 1;
-      (void)tdbTbcMoveTo(pTbCur->pDbc, pTbCur->pKey, pTbCur->kLen, &c);
+      code = tdbTbcMoveTo(pTbCur->pDbc, pTbCur->pKey, pTbCur->kLen, &c);
+      TSDB_CHECK_CODE(code, lino, _exit);
       if (c == 0) {
         if (move) tdbTbcMoveToNext(pTbCur->pDbc);
       } else if (c < 0) {
-        (void)tdbTbcMoveToPrev(pTbCur->pDbc);
+        code = tdbTbcMoveToPrev(pTbCur->pDbc);
+        TSDB_CHECK_CODE(code, lino, _exit);
       } else {
-        (void)tdbTbcMoveToNext(pTbCur->pDbc);
+        code = tdbTbcMoveToNext(pTbCur->pDbc);
+        TSDB_CHECK_CODE(code, lino, _exit);
       }
     }
 
     pTbCur->paused = 0;
   }
+
+_exit:
+  return code;
 }
 
 int32_t metaTbCursorNext(SMTbCursor *pTbCur, ETableType jumpTableType) {
@@ -298,12 +315,14 @@ int32_t metaTbCursorNext(SMTbCursor *pTbCur, ETableType jumpTableType) {
   for (;;) {
     ret = tdbTbcNext((TBC *)pTbCur->pDbc, &pTbCur->pKey, &pTbCur->kLen, &pTbCur->pVal, &pTbCur->vLen);
     if (ret < 0) {
-      return -1;
+      return ret;
     }
 
     tDecoderClear(&pTbCur->mr.coder);
 
-    (void)metaGetTableEntryByVersion(&pTbCur->mr, ((SUidIdxVal *)pTbCur->pVal)[0].version, *(tb_uid_t *)pTbCur->pKey);
+    ret = metaGetTableEntryByVersion(&pTbCur->mr, ((SUidIdxVal *)pTbCur->pVal)[0].version, *(tb_uid_t *)pTbCur->pKey);
+    if (ret) return ret;
+
     if (pTbCur->mr.me.type == jumpTableType) {
       continue;
     }
@@ -355,7 +374,9 @@ _query:
 
   version = ((SUidIdxVal *)pData)[0].version;
 
-  (void)tdbTbGet(pMeta->pTbDb, &(STbDbKey){.uid = uid, .version = version}, sizeof(STbDbKey), &pData, &nData);
+  if (tdbTbGet(pMeta->pTbDb, &(STbDbKey){.uid = uid, .version = version}, sizeof(STbDbKey), &pData, &nData) != 0) {
+    goto _err;
+  }
 
   SMetaEntry me = {0};
   tDecoderInit(&dc, pData, nData);
@@ -385,7 +406,9 @@ _query:
   }
 
   tDecoderInit(&dc, pData, nData);
-  (void)tDecodeSSchemaWrapperEx(&dc, &schema);
+  if (tDecodeSSchemaWrapperEx(&dc, &schema) != 0) {
+    goto _err;
+  }
   pSchema = tCloneSSchemaWrapper(&schema);
   tDecoderClear(&dc);
 
@@ -588,6 +611,7 @@ STSchema *metaGetTbTSchema(SMeta *pMeta, tb_uid_t uid, int32_t sver, int lock) {
 
 int32_t metaGetTbTSchemaEx(SMeta *pMeta, tb_uid_t suid, tb_uid_t uid, int32_t sver, STSchema **ppTSchema) {
   int32_t code = 0;
+  int32_t lino;
 
   void *pData = NULL;
   int   nData = 0;
@@ -603,7 +627,8 @@ int32_t metaGetTbTSchemaEx(SMeta *pMeta, tb_uid_t suid, tb_uid_t uid, int32_t sv
   skmDbKey.uid = suid ? suid : uid;
   skmDbKey.sver = INT32_MAX;
 
-  (void)tdbTbcOpen(pMeta->pSkmDb, &pSkmDbC, NULL);
+  code = tdbTbcOpen(pMeta->pSkmDb, &pSkmDbC, NULL);
+  TSDB_CHECK_CODE(code, lino, _exit);
   metaRLock(pMeta);
 
   if (tdbTbcMoveTo(pSkmDbC, &skmDbKey, sizeof(skmDbKey), &c) < 0) {
@@ -1440,6 +1465,11 @@ int32_t metaGetTableTags(void *pVnode, uint64_t suid, SArray *pUidTagInfo) {
 
       STUidTagInfo info = {.uid = uid, .pTagVal = pCur->pVal};
       info.pTagVal = taosMemoryMalloc(pCur->vLen);
+      if (!info.pTagVal) {
+        metaCloseCtbCursor(pCur);
+        taosHashCleanup(pSepecifiedUidMap);
+        return TSDB_CODE_OUT_OF_MEMORY;
+      }
       memcpy(info.pTagVal, pCur->pVal, pCur->vLen);
       if (taosArrayPush(pUidTagInfo, &info) == NULL) {
         metaCloseCtbCursor(pCur);
@@ -1462,6 +1492,11 @@ int32_t metaGetTableTags(void *pVnode, uint64_t suid, SArray *pUidTagInfo) {
       STUidTagInfo *pTagInfo = taosArrayGet(pUidTagInfo, *index);
       if (pTagInfo->pTagVal == NULL) {
         pTagInfo->pTagVal = taosMemoryMalloc(pCur->vLen);
+        if (!pTagInfo->pTagVal) {
+          metaCloseCtbCursor(pCur);
+          taosHashCleanup(pSepecifiedUidMap);
+          return TSDB_CODE_OUT_OF_MEMORY;
+        }
         memcpy(pTagInfo->pTagVal, pCur->pVal, pCur->vLen);
       }
     }
diff --git a/source/dnode/vnode/src/meta/metaTable.c b/source/dnode/vnode/src/meta/metaTable.c
index 2badeec2c9..596fbf70b0 100644
--- a/source/dnode/vnode/src/meta/metaTable.c
+++ b/source/dnode/vnode/src/meta/metaTable.c
@@ -306,12 +306,13 @@ _err:
 }
 
 int metaDropSTable(SMeta *pMeta, int64_t verison, SVDropStbReq *pReq, SArray *tbUidList) {
-  void *pKey = NULL;
-  int   nKey = 0;
-  void *pData = NULL;
-  int   nData = 0;
-  int   c = 0;
-  int   rc = 0;
+  void   *pKey = NULL;
+  int     nKey = 0;
+  void   *pData = NULL;
+  int     nData = 0;
+  int     c = 0;
+  int     rc = 0;
+  int32_t lino;
 
   // check if super table exists
   rc = tdbTbGet(pMeta->pNameIdx, pReq->name, strlen(pReq->name) + 1, &pData, &nData);
@@ -323,7 +324,11 @@ int metaDropSTable(SMeta *pMeta, int64_t verison, SVDropStbReq *pReq, SArray *tb
   // drop all child tables
   TBC *pCtbIdxc = NULL;
 
-  (void)(void)tdbTbcOpen(pMeta->pCtbIdx, &pCtbIdxc, NULL);
+  rc = tdbTbcOpen(pMeta->pCtbIdx, &pCtbIdxc, NULL);
+  if (rc) {
+    return (terrno = rc);
+  }
+
   rc = tdbTbcMoveTo(pCtbIdxc, &(SCtbIdxKey){.suid = pReq->suid, .uid = INT64_MIN}, sizeof(SCtbIdxKey), &c);
   if (rc < 0) {
     (void)tdbTbcClose(pCtbIdxc);
@@ -379,20 +384,20 @@ _exit:
   return 0;
 }
 
-static void metaGetSubtables(SMeta *pMeta, int64_t suid, SArray *uids) {
-  if (!uids) return;
+static int32_t metaGetSubtables(SMeta *pMeta, int64_t suid, SArray *uids) {
+  if (!uids) return TSDB_CODE_INVALID_PARA;
 
   int   c = 0;
   void *pKey = NULL;
   int   nKey = 0;
   TBC  *pCtbIdxc = NULL;
 
-  (void)tdbTbcOpen(pMeta->pCtbIdx, &pCtbIdxc, NULL);
+  TAOS_CHECK_RETURN(tdbTbcOpen(pMeta->pCtbIdx, &pCtbIdxc, NULL));
   int rc = tdbTbcMoveTo(pCtbIdxc, &(SCtbIdxKey){.suid = suid, .uid = INT64_MIN}, sizeof(SCtbIdxKey), &c);
   if (rc < 0) {
     (void)tdbTbcClose(pCtbIdxc);
     metaWLock(pMeta);
-    return;
+    return 0;
   }
 
   for (;;) {
@@ -405,12 +410,17 @@ static void metaGetSubtables(SMeta *pMeta, int64_t suid, SArray *uids) {
       break;
     }
 
-    (void)taosArrayPush(uids, &(((SCtbIdxKey *)pKey)->uid));
+    if (taosArrayPush(uids, &(((SCtbIdxKey *)pKey)->uid)) == NULL) {
+      tdbFree(pKey);
+      (void)tdbTbcClose(pCtbIdxc);
+      return terrno;
+    }
   }
 
   tdbFree(pKey);
 
   (void)tdbTbcClose(pCtbIdxc);
+  return 0;
 }
 
 int metaAlterSTable(SMeta *pMeta, int64_t version, SVCreateStbReq *pReq) {
@@ -425,7 +435,7 @@ int metaAlterSTable(SMeta *pMeta, int64_t version, SVCreateStbReq *pReq) {
   int32_t ret;
   int32_t c = -2;
 
-  (void)tdbTbcOpen(pMeta->pUidIdx, &pUidIdxc, NULL);
+  TAOS_CHECK_RETURN(tdbTbcOpen(pMeta->pUidIdx, &pUidIdxc, NULL));
   ret = tdbTbcMoveTo(pUidIdxc, &pReq->suid, sizeof(tb_uid_t), &c);
   if (ret < 0 || c) {
     (void)tdbTbcClose(pUidIdxc);
@@ -442,7 +452,7 @@ int metaAlterSTable(SMeta *pMeta, int64_t version, SVCreateStbReq *pReq) {
 
   oversion = ((SUidIdxVal *)pData)[0].version;
 
-  (void)tdbTbcOpen(pMeta->pTbDb, &pTbDbc, NULL);
+  TAOS_CHECK_RETURN(tdbTbcOpen(pMeta->pTbDb, &pTbDbc, NULL));
   ret = tdbTbcMoveTo(pTbDbc, &((STbDbKey){.uid = pReq->suid, .version = oversion}), sizeof(STbDbKey), &c);
   if (!(ret == 0 && c == 0)) {
     (void)tdbTbcClose(pUidIdxc);
@@ -486,7 +496,7 @@ int metaAlterSTable(SMeta *pMeta, int64_t version, SVCreateStbReq *pReq) {
       int16_t cid = pReq->schemaRow.pSchema[nCols - 1].colId;
       int8_t  col_type = pReq->schemaRow.pSchema[nCols - 1].type;
 
-      metaGetSubtables(pMeta, pReq->suid, uids);
+      TAOS_CHECK_RETURN(metaGetSubtables(pMeta, pReq->suid, uids));
       (void)tsdbCacheNewSTableColumn(pTsdb, uids, cid, col_type);
     } else if (deltaCol == -1) {
       int16_t cid = -1;
@@ -502,7 +512,7 @@ int metaAlterSTable(SMeta *pMeta, int64_t version, SVCreateStbReq *pReq) {
       }
 
       if (cid != -1) {
-        metaGetSubtables(pMeta, pReq->suid, uids);
+        TAOS_CHECK_RETURN(metaGetSubtables(pMeta, pReq->suid, uids));
         (void)tsdbCacheDropSTableColumn(pTsdb, uids, cid, hasPrimaryKey);
       }
     }
@@ -619,7 +629,7 @@ int metaAddIndexToSTable(SMeta *pMeta, int64_t version, SVCreateStbReq *pReq) {
    * iterator all pTdDbc by uid and version
    */
   TBC *pCtbIdxc = NULL;
-  (void)tdbTbcOpen(pMeta->pCtbIdx, &pCtbIdxc, NULL);
+  TAOS_CHECK_RETURN(tdbTbcOpen(pMeta->pCtbIdx, &pCtbIdxc, NULL));
   int rc = tdbTbcMoveTo(pCtbIdxc, &(SCtbIdxKey){.suid = suid, .uid = INT64_MIN}, sizeof(SCtbIdxKey), &c);
   if (rc < 0) {
     (void)tdbTbcClose(pCtbIdxc);
@@ -756,7 +766,7 @@ int metaDropIndexFromSTable(SMeta *pMeta, int64_t version, SDropIndexReq *pReq)
    * iterator all pTdDbc by uid and version
    */
   TBC *pCtbIdxc = NULL;
-  (void)tdbTbcOpen(pMeta->pCtbIdx, &pCtbIdxc, NULL);
+  TAOS_CHECK_RETURN(tdbTbcOpen(pMeta->pCtbIdx, &pCtbIdxc, NULL));
   int rc = tdbTbcMoveTo(pCtbIdxc, &(SCtbIdxKey){.suid = suid, .uid = INT64_MIN}, sizeof(SCtbIdxKey), &c);
   if (rc < 0) {
     (void)tdbTbcClose(pCtbIdxc);
@@ -1424,7 +1434,7 @@ static int metaAlterTableColumn(SMeta *pMeta, int64_t version, SVAlterTbReq *pAl
   // search uid index
   TBC *pUidIdxc = NULL;
 
-  (void)tdbTbcOpen(pMeta->pUidIdx, &pUidIdxc, NULL);
+  TAOS_CHECK_RETURN(tdbTbcOpen(pMeta->pUidIdx, &pUidIdxc, NULL));
   (void)tdbTbcMoveTo(pUidIdxc, &uid, sizeof(uid), &c);
   if (c != 0) {
     (void)tdbTbcClose(pUidIdxc);
@@ -1438,7 +1448,7 @@ static int metaAlterTableColumn(SMeta *pMeta, int64_t version, SVAlterTbReq *pAl
   // search table.db
   TBC *pTbDbc = NULL;
 
-  (void)tdbTbcOpen(pMeta->pTbDb, &pTbDbc, NULL);
+  TAOS_CHECK_RETURN(tdbTbcOpen(pMeta->pTbDb, &pTbDbc, NULL));
   (void)tdbTbcMoveTo(pTbDbc, &((STbDbKey){.uid = uid, .version = oversion}), sizeof(STbDbKey), &c);
   if (c != 0) {
     (void)tdbTbcClose(pUidIdxc);
@@ -1689,7 +1699,7 @@ static int metaUpdateTableTagVal(SMeta *pMeta, int64_t version, SVAlterTbReq *pA
   // search uid index
   TBC *pUidIdxc = NULL;
 
-  (void)tdbTbcOpen(pMeta->pUidIdx, &pUidIdxc, NULL);
+  TAOS_CHECK_RETURN(tdbTbcOpen(pMeta->pUidIdx, &pUidIdxc, NULL));
   (void)tdbTbcMoveTo(pUidIdxc, &uid, sizeof(uid), &c);
   if (c != 0) {
     (void)tdbTbcClose(pUidIdxc);
@@ -1706,7 +1716,7 @@ static int metaUpdateTableTagVal(SMeta *pMeta, int64_t version, SVAlterTbReq *pA
   SDecoder dc2 = {0};
 
   /* get ctbEntry */
-  (void)tdbTbcOpen(pMeta->pTbDb, &pTbDbc, NULL);
+  TAOS_CHECK_RETURN(tdbTbcOpen(pMeta->pTbDb, &pTbDbc, NULL));
   (void)tdbTbcMoveTo(pTbDbc, &((STbDbKey){.uid = uid, .version = oversion}), sizeof(STbDbKey), &c);
   if (c != 0) {
     (void)tdbTbcClose(pUidIdxc);
@@ -1869,7 +1879,7 @@ static int metaUpdateTableOptions(SMeta *pMeta, int64_t version, SVAlterTbReq *p
   // search uid index
   TBC *pUidIdxc = NULL;
 
-  (void)tdbTbcOpen(pMeta->pUidIdx, &pUidIdxc, NULL);
+  TAOS_CHECK_RETURN(tdbTbcOpen(pMeta->pUidIdx, &pUidIdxc, NULL));
   (void)tdbTbcMoveTo(pUidIdxc, &uid, sizeof(uid), &c);
   if (c != 0) {
     (void)tdbTbcClose(pUidIdxc);
@@ -1883,7 +1893,7 @@ static int metaUpdateTableOptions(SMeta *pMeta, int64_t version, SVAlterTbReq *p
   // search table.db
   TBC *pTbDbc = NULL;
 
-  (void)tdbTbcOpen(pMeta->pTbDb, &pTbDbc, NULL);
+  TAOS_CHECK_RETURN(tdbTbcOpen(pMeta->pTbDb, &pTbDbc, NULL));
   (void)tdbTbcMoveTo(pTbDbc, &((STbDbKey){.uid = uid, .version = oversion}), sizeof(STbDbKey), &c);
   if (c != 0) {
     (void)tdbTbcClose(pUidIdxc);
@@ -2018,7 +2028,7 @@ static int metaAddTagIndex(SMeta *pMeta, int64_t version, SVAlterTbReq *pAlterTb
    * iterator all pTdDbc by uid and version
    */
   TBC *pCtbIdxc = NULL;
-  (void)tdbTbcOpen(pMeta->pCtbIdx, &pCtbIdxc, NULL);
+  TAOS_CHECK_RETURN(tdbTbcOpen(pMeta->pCtbIdx, &pCtbIdxc, NULL));
   int rc = tdbTbcMoveTo(pCtbIdxc, &(SCtbIdxKey){.suid = suid, .uid = INT64_MIN}, sizeof(SCtbIdxKey), &c);
   if (rc < 0) {
     (void)tdbTbcClose(pCtbIdxc);
@@ -2157,7 +2167,7 @@ static int metaDropTagIndex(SMeta *pMeta, int64_t version, SVAlterTbReq *pAlterT
   SArray *tagIdxList = taosArrayInit(512, sizeof(SMetaPair));
 
   TBC *pTagIdxc = NULL;
-  (void)tdbTbcOpen(pMeta->pTagIdx, &pTagIdxc, NULL);
+  TAOS_CHECK_RETURN(tdbTbcOpen(pMeta->pTagIdx, &pTagIdxc, NULL));
   int rc = tdbTbcMoveTo(pTagIdxc, &(STagIdxKey){.suid = suid, .cid = INT32_MIN, .type = pCol->type},
                         sizeof(STagIdxKey), &c);
   for (;;) {
diff --git a/source/dnode/vnode/src/tq/tqScan.c b/source/dnode/vnode/src/tq/tqScan.c
index d072d7199c..4357456790 100644
--- a/source/dnode/vnode/src/tq/tqScan.c
+++ b/source/dnode/vnode/src/tq/tqScan.c
@@ -101,6 +101,7 @@ int32_t tqScanData(STQ* pTq, STqHandle* pHandle, SMqDataRsp* pRsp, STqOffsetVal*
   TSDB_CHECK_CODE(code, line, END);
 
   qStreamSetSourceExcluded(task, pRequest->sourceExcluded);
+  uint64_t st = taosGetTimestampMs();
   while (1) {
     SSDataBlock* pDataBlock = NULL;
     code = getDataBlock(task, pHandle, vgId, &pDataBlock);
@@ -160,7 +161,7 @@ int32_t tqScanData(STQ* pTq, STqHandle* pHandle, SMqDataRsp* pRsp, STqOffsetVal*
     pRsp->common.blockNum++;
     totalRows += pDataBlock->info.rows;
 
-    if (totalRows >= tmqRowSize) {
+    if (totalRows >= tmqRowSize || (taosGetTimestampMs() - st > 1000)) {
       break;
     }
   }
diff --git a/source/dnode/vnode/src/tq/tqStreamTaskSnap.c b/source/dnode/vnode/src/tq/tqStreamTaskSnap.c
index c89ad807c7..8dff60c4cf 100644
--- a/source/dnode/vnode/src/tq/tqStreamTaskSnap.c
+++ b/source/dnode/vnode/src/tq/tqStreamTaskSnap.c
@@ -192,7 +192,7 @@ int32_t streamTaskSnapWriterOpen(STQ* pTq, int64_t sver, int64_t ever, SStreamTa
   return code;
 }
 
-int32_t streamTaskSnapWriterClose(SStreamTaskWriter* pWriter, int8_t rollback) {
+int32_t streamTaskSnapWriterClose(SStreamTaskWriter* pWriter, int8_t rollback, int8_t loadTask) {
   int32_t code = 0;
   STQ*    pTq = pWriter->pTq;
 
@@ -213,6 +213,10 @@ int32_t streamTaskSnapWriterClose(SStreamTaskWriter* pWriter, int8_t rollback) {
   }
   streamMetaWUnLock(pTq->pStreamMeta);
   taosMemoryFree(pWriter);
+
+  if (loadTask == 1) {
+    streamMetaLoadAllTasks(pTq->pStreamMeta);
+  }
   return code;
 
 _err:
diff --git a/source/dnode/vnode/src/tq/tqUtil.c b/source/dnode/vnode/src/tq/tqUtil.c
index 9724973440..b3d2300996 100644
--- a/source/dnode/vnode/src/tq/tqUtil.c
+++ b/source/dnode/vnode/src/tq/tqUtil.c
@@ -347,6 +347,7 @@ static int32_t extractDataAndRspForDbStbSubscribe(STQ* pTq, STqHandle* pHandle,
           code = TAOS_GET_TERRNO(TSDB_CODE_OUT_OF_MEMORY);
           goto END;
         }
+        totalMetaRows++;
         if ((taosArrayGetSize(btMetaRsp.batchMetaReq) >= tmqRowSize) || (taosGetTimestampMs() - st > 1000)) {
           tqOffsetResetToLog(&btMetaRsp.rspOffset, fetchVer);
           code = tqSendBatchMetaPollRsp(pHandle, pMsg, pRequest, &btMetaRsp, vgId);
diff --git a/source/dnode/vnode/src/tqCommon/tqCommon.c b/source/dnode/vnode/src/tqCommon/tqCommon.c
index a4c490e9b5..7164c7f543 100644
--- a/source/dnode/vnode/src/tqCommon/tqCommon.c
+++ b/source/dnode/vnode/src/tqCommon/tqCommon.c
@@ -417,7 +417,6 @@ int32_t tqStreamTaskProcessDispatchRsp(SStreamMeta* pMeta, SRpcMsg* pMsg) {
     return code;
   } else {
     tqDebug("vgId:%d failed to handle the dispatch rsp, since find task:0x%x failed", vgId, pRsp->upstreamTaskId);
-    terrno = TSDB_CODE_STREAM_TASK_NOT_EXIST;
     return TSDB_CODE_STREAM_TASK_NOT_EXIST;
   }
 }
@@ -563,7 +562,7 @@ int32_t tqStreamTaskProcessCheckpointReadyMsg(SStreamMeta* pMeta, SRpcMsg* pMsg)
            pTask->id.idStr, req.downstreamTaskId, req.downstreamNodeId);
   }
 
-  code = streamProcessCheckpointReadyMsg(pTask, req.checkpointId, req.downstreamTaskId, req.downstreamNodeId);
+  code = streamProcessCheckpointReadyMsg(pTask, req.checkpointId, req.downstreamNodeId, req.downstreamTaskId);
   streamMetaReleaseTask(pMeta, pTask);
   if (code) {
     return code;
@@ -996,7 +995,13 @@ int32_t tqStreamTaskProcessRetrieveTriggerReq(SStreamMeta* pMeta, SRpcMsg* pMsg)
   int64_t checkpointId = 0;
 
   streamTaskGetActiveCheckpointInfo(pTask, &transId, &checkpointId);
-  ASSERT(checkpointId == pReq->checkpointId);
+  if (checkpointId != pReq->checkpointId) {
+    tqError("s-task:%s invalid checkpoint-trigger retrieve msg from 0x%" PRIx64 ", current checkpointId:%" PRId64
+            " req:%" PRId64,
+            pTask->id.idStr, pReq->downstreamTaskId, checkpointId, pReq->checkpointId);
+    streamMetaReleaseTask(pMeta, pTask);
+    return TSDB_CODE_INVALID_MSG;
+  }
 
   if (streamTaskAlreadySendTrigger(pTask, pReq->downstreamNodeId)) {
     // re-send the lost checkpoint-trigger msg to downstream task
diff --git a/source/dnode/vnode/src/tsdb/tsdbMergeTree.c b/source/dnode/vnode/src/tsdb/tsdbMergeTree.c
index e943ef2442..e55ede560e 100644
--- a/source/dnode/vnode/src/tsdb/tsdbMergeTree.c
+++ b/source/dnode/vnode/src/tsdb/tsdbMergeTree.c
@@ -28,7 +28,7 @@ int32_t tCreateSttBlockLoadInfo(STSchema *pSchema, int16_t *colList, int32_t num
 
   SSttBlockLoadInfo *pLoadInfo = taosMemoryCalloc(1, sizeof(SSttBlockLoadInfo));
   if (pLoadInfo == NULL) {
-    return TSDB_CODE_OUT_OF_MEMORY;
+    return terrno;
   }
 
   pLoadInfo->blockData[0].sttBlockIndex = -1;
@@ -50,9 +50,8 @@ int32_t tCreateSttBlockLoadInfo(STSchema *pSchema, int16_t *colList, int32_t num
 
   pLoadInfo->aSttBlk = taosArrayInit(4, sizeof(SSttBlk));
   if (pLoadInfo->aSttBlk == NULL) {
-    code = TSDB_CODE_OUT_OF_MEMORY;
     taosMemoryFreeClear(pLoadInfo);
-    return code;
+    return terrno;
   }
 
   pLoadInfo->pSchema = pSchema;
@@ -358,7 +357,7 @@ static int32_t tValueDupPayload(SValue *pVal) {
     char *p = (char *)pVal->pData;
     char *pBuf = taosMemoryMalloc(pVal->nData);
     if (pBuf == NULL) {
-      return TSDB_CODE_OUT_OF_MEMORY;
+      return terrno;
     }
 
     memcpy(pBuf, p, pVal->nData);
@@ -371,13 +370,15 @@ static int32_t tValueDupPayload(SValue *pVal) {
 static int32_t loadSttStatisticsBlockData(SSttFileReader *pSttFileReader, SSttBlockLoadInfo *pBlockLoadInfo,
                                           TStatisBlkArray *pStatisBlkArray, uint64_t suid, const char *id) {
   int32_t code = TSDB_CODE_SUCCESS;
+  int32_t lino = 0;
   void*   px = NULL;
+  int32_t startIndex = 0;
+
   int32_t numOfBlocks = TARRAY2_SIZE(pStatisBlkArray);
   if (numOfBlocks <= 0) {
     return code;
   }
 
-  int32_t startIndex = 0;
   while ((startIndex < numOfBlocks) && (pStatisBlkArray->data[startIndex].maxTbid.suid < suid)) {
     ++startIndex;
   }
@@ -413,150 +414,113 @@ static int32_t loadSttStatisticsBlockData(SSttFileReader *pSttFileReader, SSttBl
 
     // existed
     if (i < rows) {
-      if (pBlockLoadInfo->info.pUid == NULL) {
-        pBlockLoadInfo->info.pUid = taosArrayInit(rows, sizeof(int64_t));
-        pBlockLoadInfo->info.pFirstTs = taosArrayInit(rows, sizeof(int64_t));
-        pBlockLoadInfo->info.pLastTs = taosArrayInit(rows, sizeof(int64_t));
-        pBlockLoadInfo->info.pCount = taosArrayInit(rows, sizeof(int64_t));
+      SSttTableRowsInfo* pInfo = &pBlockLoadInfo->info;
 
-        pBlockLoadInfo->info.pFirstKey = taosArrayInit(rows, sizeof(SValue));
-        pBlockLoadInfo->info.pLastKey = taosArrayInit(rows, sizeof(SValue));
+      if (pInfo->pUid == NULL) {
+        pInfo->pUid = taosArrayInit(rows, sizeof(int64_t));
+        pInfo->pFirstTs = taosArrayInit(rows, sizeof(int64_t));
+        pInfo->pLastTs = taosArrayInit(rows, sizeof(int64_t));
+        pInfo->pCount = taosArrayInit(rows, sizeof(int64_t));
+
+        pInfo->pFirstKey = taosArrayInit(rows, sizeof(SValue));
+        pInfo->pLastKey = taosArrayInit(rows, sizeof(SValue));
+
+        if (pInfo->pUid == NULL || pInfo->pFirstTs == NULL || pInfo->pLastTs == NULL || pInfo->pCount == NULL ||
+            pInfo->pFirstKey == NULL || pInfo->pLastKey == NULL) {
+          code = terrno;
+          goto _end;
+        }
       }
 
       if (pStatisBlkArray->data[k].maxTbid.suid == suid) {
         int32_t size = rows - i;
         int32_t offset = i * sizeof(int64_t);
 
-        px = taosArrayAddBatch(pBlockLoadInfo->info.pUid, tBufferGetDataAt(&block.uids, offset), size);
-        if (px == NULL) {
-          return terrno;
-        }
+        px = taosArrayAddBatch(pInfo->pUid, tBufferGetDataAt(&block.uids, offset), size);
+        TSDB_CHECK_NULL(px, code, lino, _end, terrno);
 
-        px = taosArrayAddBatch(pBlockLoadInfo->info.pFirstTs, tBufferGetDataAt(&block.firstKeyTimestamps, offset), size);
-        if (px == NULL){
-          return terrno;
-        }
+        px = taosArrayAddBatch(pInfo->pFirstTs, tBufferGetDataAt(&block.firstKeyTimestamps, offset), size);
+        TSDB_CHECK_NULL(px, code, lino, _end, terrno);
 
-        px = taosArrayAddBatch(pBlockLoadInfo->info.pLastTs, tBufferGetDataAt(&block.lastKeyTimestamps, offset), size);
-        if (px == NULL){
-          return terrno;
-        }
+        px = taosArrayAddBatch(pInfo->pLastTs, tBufferGetDataAt(&block.lastKeyTimestamps, offset), size);
+        TSDB_CHECK_NULL(px, code, lino, _end, terrno);
 
-        px = taosArrayAddBatch(pBlockLoadInfo->info.pCount, tBufferGetDataAt(&block.counts, offset), size);
-        if (px == NULL){
-          return terrno;
-        }
+        px = taosArrayAddBatch(pInfo->pCount, tBufferGetDataAt(&block.counts, offset), size);
+        TSDB_CHECK_NULL(px, code, lino, _end, terrno);
 
         if (block.numOfPKs > 0) {
           SValue vFirst = {0}, vLast = {0};
           for (int32_t f = i; f < rows; ++f) {
             code = tValueColumnGet(&block.firstKeyPKs[0], f, &vFirst);
-            if (code) {
-              break;
-            }
+            TSDB_CHECK_CODE(code, lino, _end);
 
             code = tValueDupPayload(&vFirst);
-            if (code) {
-              break;
-            }
+            TSDB_CHECK_CODE(code, lino, _end);
 
-            px = taosArrayPush(pBlockLoadInfo->info.pFirstKey, &vFirst);
-            if (px == NULL) {
-              return terrno;
-            }
+            px = taosArrayPush(pInfo->pFirstKey, &vFirst);
+            TSDB_CHECK_NULL(px, code, lino, _end, terrno);
 
             // todo add api to clone the original data
             code = tValueColumnGet(&block.lastKeyPKs[0], f, &vLast);
-            if (code) {
-              break;
-            }
+            TSDB_CHECK_CODE(code, lino, _end);
 
             code = tValueDupPayload(&vLast);
-            if (code) {
-              break;
-            }
+            TSDB_CHECK_CODE(code, lino, _end);
 
-            px = taosArrayPush(pBlockLoadInfo->info.pLastKey, &vLast);
-            if (px == NULL) {
-              return terrno;
-            }
+            px = taosArrayPush(pInfo->pLastKey, &vLast);
+            TSDB_CHECK_NULL(px, code, lino, _end, terrno);
           }
         } else {
           SValue vFirst = {0};
           for (int32_t j = 0; j < size; ++j) {
-            px = taosArrayPush(pBlockLoadInfo->info.pFirstKey, &vFirst);
-            if (px == NULL) {
-              return terrno;
-            }
+            px = taosArrayPush(pInfo->pFirstKey, &vFirst);
+            TSDB_CHECK_NULL(px, code, lino, _end, terrno);
 
-            px = taosArrayPush(pBlockLoadInfo->info.pLastKey, &vFirst);
-            if (px == NULL) {
-              return terrno;
-            }
+            px = taosArrayPush(pInfo->pLastKey, &vFirst);
+            TSDB_CHECK_NULL(px, code, lino, _end, terrno);
           }
         }
       } else {
         STbStatisRecord record = {0};
-
         while (i < rows) {
           (void)tStatisBlockGet(&block, i, &record);
           if (record.suid != suid) {
             break;
           }
 
-          px = taosArrayPush(pBlockLoadInfo->info.pUid, &record.uid);
-          if (px == NULL) {
-            return terrno;
-          }
+          px = taosArrayPush(pInfo->pUid, &record.uid);
+          TSDB_CHECK_NULL(px, code, lino, _end, terrno);
 
-          px = taosArrayPush(pBlockLoadInfo->info.pCount, &record.count);
-          if (px == NULL) {
-            return terrno;
-          }
+          px = taosArrayPush(pInfo->pCount, &record.count);
+          TSDB_CHECK_NULL(px, code, lino, _end, terrno);
 
-          px = taosArrayPush(pBlockLoadInfo->info.pFirstTs, &record.firstKey.ts);
-          if (px == NULL) {
-            return terrno;
-          }
+          px = taosArrayPush(pInfo->pFirstTs, &record.firstKey.ts);
+          TSDB_CHECK_NULL(px, code, lino, _end, terrno);
 
-          px = taosArrayPush(pBlockLoadInfo->info.pLastTs, &record.lastKey.ts);
-          if (px == NULL) {
-            return terrno;
-          }
+          px = taosArrayPush(pInfo->pLastTs, &record.lastKey.ts);
+          TSDB_CHECK_NULL(px, code, lino, _end, terrno);
 
           if (record.firstKey.numOfPKs > 0) {
             SValue s = record.firstKey.pks[0];
             code = tValueDupPayload(&s);
-            if (code) {
-              return code;
-            }
+            TSDB_CHECK_CODE(code, lino, _end);
 
-            px = taosArrayPush(pBlockLoadInfo->info.pFirstKey, &s);
-            if (px == NULL) {
-              return terrno;
-            }
+            px = taosArrayPush(pInfo->pFirstKey, &s);
+            TSDB_CHECK_NULL(px, code, lino, _end, terrno);
 
             s = record.lastKey.pks[0];
             code = tValueDupPayload(&s);
-            if (code) {
-              return code;
-            }
+            TSDB_CHECK_CODE(code, lino, _end);
 
-            px = taosArrayPush(pBlockLoadInfo->info.pLastKey, &s);
-            if (px == NULL) {
-              return terrno;
-            }
+            px = taosArrayPush(pInfo->pLastKey, &s);
+            TSDB_CHECK_NULL(px, code, lino, _end, terrno);
           } else {
             SValue v = {0};
-            px = taosArrayPush(pBlockLoadInfo->info.pFirstKey, &v);
-            if (px == NULL) {
-              return terrno;
-            }
+            px = taosArrayPush(pInfo->pFirstKey, &v);
+            TSDB_CHECK_NULL(px, code, lino, _end, terrno);
 
-            px = taosArrayPush(pBlockLoadInfo->info.pLastKey, &v);
-            if (px == NULL) {
-              return terrno;
-            }
+            px = taosArrayPush(pInfo->pLastKey, &v);
+            TSDB_CHECK_NULL(px, code, lino, _end, terrno);
           }
 
           i += 1;
@@ -565,6 +529,7 @@ static int32_t loadSttStatisticsBlockData(SSttFileReader *pSttFileReader, SSttBl
     }
   }
 
+  _end:
   (void)tStatisBlockDestroy(&block);
 
   double el = (taosGetTimestampUs() - st) / 1000.0;
diff --git a/source/dnode/vnode/src/tsdb/tsdbRead2.c b/source/dnode/vnode/src/tsdb/tsdbRead2.c
index 09ca1cdc84..880e73c5c0 100644
--- a/source/dnode/vnode/src/tsdb/tsdbRead2.c
+++ b/source/dnode/vnode/src/tsdb/tsdbRead2.c
@@ -3519,8 +3519,10 @@ static int32_t initForFirstBlockInFile(STsdbReader* pReader, SDataBlockIter* pBl
     resetTableListIndex(&pReader->status);
   }
 
-  // set the correct start position according to the query time window
-  initBlockDumpInfo(pReader, pBlockIter);
+  if (code == TSDB_CODE_SUCCESS) {
+    // set the correct start position according to the query time window
+    initBlockDumpInfo(pReader, pBlockIter);
+  }
   taosArrayDestroy(pTableList);
   return code;
 }
@@ -4706,8 +4708,7 @@ int32_t tsdbReaderOpen2(void* pVnode, SQueryTableDataCond* pCond, void* pTableLi
   pReader->pSchemaMap = tSimpleHashInit(8, taosFastHash);
   if (pReader->pSchemaMap == NULL) {
     tsdbError("failed init schema hash for reader %s", pReader->idStr);
-    code = TSDB_CODE_OUT_OF_MEMORY;
-    goto _err;
+    TSDB_CHECK_NULL(pReader->pSchemaMap, code, lino, _err, terrno);
   }
 
   tSimpleHashSetFreeFp(pReader->pSchemaMap, freeSchemaFunc);
diff --git a/source/dnode/vnode/src/tsdb/tsdbReadUtil.c b/source/dnode/vnode/src/tsdb/tsdbReadUtil.c
index 1d0cfecdd0..4dabffc10a 100644
--- a/source/dnode/vnode/src/tsdb/tsdbReadUtil.c
+++ b/source/dnode/vnode/src/tsdb/tsdbReadUtil.c
@@ -1074,8 +1074,12 @@ int32_t doAdjustValidDataIters(SArray* pLDIterList, int32_t numOfFileObj) {
     int32_t inc = numOfFileObj - size;
     for (int32_t k = 0; k < inc; ++k) {
       SLDataIter* pIter = taosMemoryCalloc(1, sizeof(SLDataIter));
-      void*       px = taosArrayPush(pLDIterList, &pIter);
+      if (!pIter) {
+        return terrno;
+      }
+      void* px = taosArrayPush(pLDIterList, &pIter);
       if (px == NULL) {
+        taosMemoryFree(pIter);
         return TSDB_CODE_OUT_OF_MEMORY;
       }
     }
diff --git a/source/dnode/vnode/src/vnd/vnodeOpen.c b/source/dnode/vnode/src/vnd/vnodeOpen.c
index 8a2b10d2ef..4f5d7c24e1 100644
--- a/source/dnode/vnode/src/vnd/vnodeOpen.c
+++ b/source/dnode/vnode/src/vnd/vnodeOpen.c
@@ -403,6 +403,7 @@ SVnode *vnodeOpen(const char *path, int32_t diskPrimary, STfs *pTfs, SMsgCb msgC
   pVnode->msgCb = msgCb;
   (void)taosThreadMutexInit(&pVnode->lock, NULL);
   pVnode->blocked = false;
+  pVnode->disableWrite = false;
 
   (void)tsem_init(&pVnode->syncSem, 0, 0);
   (void)taosThreadMutexInit(&pVnode->mutex, NULL);
diff --git a/source/dnode/vnode/src/vnd/vnodeSnapshot.c b/source/dnode/vnode/src/vnd/vnodeSnapshot.c
index 92e2ffeb7f..28e7ae97ca 100644
--- a/source/dnode/vnode/src/vnd/vnodeSnapshot.c
+++ b/source/dnode/vnode/src/vnd/vnodeSnapshot.c
@@ -609,7 +609,10 @@ int32_t vnodeSnapWriterOpen(SVnode *pVnode, SSnapshotParam *pParam, SVSnapWriter
   int64_t sver = pParam->start;
   int64_t ever = pParam->end;
 
-  // cancel and disable all bg task
+  // disable write, cancel and disable all bg tasks
+  (void)taosThreadMutexLock(&pVnode->mutex);
+  pVnode->disableWrite = true;
+  (void)taosThreadMutexUnlock(&pVnode->mutex);
   (void)vnodeCancelAndDisableAllBgTask(pVnode);
 
   // alloc
@@ -722,7 +725,8 @@ int32_t vnodeSnapWriterClose(SVSnapWriter *pWriter, int8_t rollback, SSnapshot *
   }
 
   if (pWriter->pStreamTaskWriter) {
-    code = streamTaskSnapWriterClose(pWriter->pStreamTaskWriter, rollback);
+    code = streamTaskSnapWriterClose(pWriter->pStreamTaskWriter, rollback, pWriter->pStreamStateWriter == NULL ? 1 : 0);
+
     if (code) goto _exit;
   }
 
@@ -741,6 +745,9 @@ int32_t vnodeSnapWriterClose(SVSnapWriter *pWriter, int8_t rollback, SSnapshot *
   }
 
   (void)vnodeBegin(pVnode);
+  (void)taosThreadMutexLock(&pVnode->mutex);
+  pVnode->disableWrite = false;
+  (void)taosThreadMutexUnlock(&pVnode->mutex);
 
 _exit:
   if (code) {
diff --git a/source/dnode/vnode/src/vnd/vnodeSvr.c b/source/dnode/vnode/src/vnd/vnodeSvr.c
index feaad0f46d..3f6ca053cd 100644
--- a/source/dnode/vnode/src/vnd/vnodeSvr.c
+++ b/source/dnode/vnode/src/vnd/vnodeSvr.c
@@ -518,6 +518,14 @@ int32_t vnodeProcessWriteMsg(SVnode *pVnode, SRpcMsg *pMsg, int64_t ver, SRpcMsg
   void   *pReq;
   int32_t len;
 
+  (void)taosThreadMutexLock(&pVnode->mutex);
+  if (pVnode->disableWrite) {
+    (void)taosThreadMutexUnlock(&pVnode->mutex);
+    vError("vgId:%d write is disabled for snapshot, version:%" PRId64, TD_VID(pVnode), ver);
+    return TSDB_CODE_VND_WRITE_DISABLED;
+  }
+  (void)taosThreadMutexUnlock(&pVnode->mutex);
+
   if (ver <= pVnode->state.applied) {
     vError("vgId:%d, duplicate write request.
ver: %" PRId64 ", applied: %" PRId64 "", TD_VID(pVnode), ver, pVnode->state.applied); diff --git a/source/libs/command/src/command.c b/source/libs/command/src/command.c index 2811402ea1..11ddc89d4c 100644 --- a/source/libs/command/src/command.c +++ b/source/libs/command/src/command.c @@ -38,7 +38,7 @@ static int32_t buildRetrieveTableRsp(SSDataBlock* pBlock, int32_t numOfCols, SRe size_t rspSize = sizeof(SRetrieveTableRsp) + blockGetEncodeSize(pBlock) + PAYLOAD_PREFIX_LEN; *pRsp = taosMemoryCalloc(1, rspSize); if (NULL == *pRsp) { - return TSDB_CODE_OUT_OF_MEMORY; + return terrno; } (*pRsp)->useconds = 0; @@ -289,7 +289,7 @@ static int32_t buildRetension(SArray* pRetension, char **ppRetentions ) { char* p1 = taosMemoryCalloc(1, 100); if(NULL == p1) { - return TSDB_CODE_OUT_OF_MEMORY; + return terrno; } int32_t len = 0; @@ -849,7 +849,7 @@ _return: static int32_t buildLocalVariablesResultDataBlock(SSDataBlock** pOutput) { SSDataBlock* pBlock = taosMemoryCalloc(1, sizeof(SSDataBlock)); if (NULL == pBlock) { - return TSDB_CODE_OUT_OF_MEMORY; + return terrno; } pBlock->info.hasVarCol = true; diff --git a/source/libs/command/src/explain.c b/source/libs/command/src/explain.c index 8d9f1fb9cc..3a73c05de2 100644 --- a/source/libs/command/src/explain.c +++ b/source/libs/command/src/explain.c @@ -227,7 +227,7 @@ int32_t qExplainGenerateResNode(SPhysiNode *pNode, SExplainGroup *group, SExplai SExplainResNode *resNode = taosMemoryCalloc(1, sizeof(SExplainResNode)); if (NULL == resNode) { qError("calloc SPhysiNodeExplainRes failed"); - return TSDB_CODE_OUT_OF_MEMORY; + return terrno; } int32_t code = 0; diff --git a/source/libs/executor/inc/executorInt.h b/source/libs/executor/inc/executorInt.h index 668d40dd0b..d295e868e9 100644 --- a/source/libs/executor/inc/executorInt.h +++ b/source/libs/executor/inc/executorInt.h @@ -202,6 +202,7 @@ typedef struct SExchangeInfo { SLimitInfo limitInfo; int64_t openedTs; // start exec time stamp, todo: move to SLoadRemoteDataInfo char* 
pTaskId; + SArray* pFetchRpcHandles; } SExchangeInfo; typedef struct SScanInfo { diff --git a/source/libs/executor/src/aggregateoperator.c b/source/libs/executor/src/aggregateoperator.c index f0aadd12de..093555c9c5 100644 --- a/source/libs/executor/src/aggregateoperator.c +++ b/source/libs/executor/src/aggregateoperator.c @@ -147,7 +147,10 @@ _error: if (pInfo != NULL) { destroyAggOperatorInfo(pInfo); } - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; return code; } @@ -340,6 +343,7 @@ int32_t doAggregateImpl(SOperatorInfo* pOperator, SqlFunctionCtx* pCtx) { static int32_t createDataBlockForEmptyInput(SOperatorInfo* pOperator, SSDataBlock** ppBlock) { int32_t code = TSDB_CODE_SUCCESS; int32_t lino = 0; + SSDataBlock* pBlock = NULL; if (!tsCountAlwaysReturnValue) { return TSDB_CODE_SUCCESS; } @@ -363,7 +367,6 @@ static int32_t createDataBlockForEmptyInput(SOperatorInfo* pOperator, SSDataBloc return TSDB_CODE_SUCCESS; } - SSDataBlock* pBlock = NULL; code = createDataBlock(&pBlock); if (code) { return code; @@ -411,6 +414,7 @@ static int32_t createDataBlockForEmptyInput(SOperatorInfo* pOperator, SSDataBloc _end: if (code != TSDB_CODE_SUCCESS) { + blockDataDestroy(pBlock); qError("%s failed at line %d since %s", __func__, __LINE__, tstrerror(code)); } return code; @@ -449,11 +453,14 @@ void doSetTableGroupOutputBuf(SOperatorInfo* pOperator, int32_t numOfOutput, uin SResultRow* pResultRow = doSetResultOutBufByKey(pAggInfo->aggSup.pResultBuf, pResultRowInfo, (char*)&groupId, sizeof(groupId), true, groupId, pTaskInfo, false, &pAggInfo->aggSup, true); + if (pResultRow == NULL || pTaskInfo->code != 0) { + T_LONG_JMP(pTaskInfo->env, pTaskInfo->code); + } /* * not assign result buffer yet, add new result buffer * all group belong to one result set, and each group result has different group id so set the id to be one */ - if (pResultRow == NULL || pResultRow->pageId == -1) { + if 
(pResultRow->pageId == -1) { int32_t ret = addNewResultRowBuf(pResultRow, pAggInfo->aggSup.pResultBuf, pAggInfo->binfo.pRes->info.rowSize); if (ret != TSDB_CODE_SUCCESS) { T_LONG_JMP(pTaskInfo->env, terrno); diff --git a/source/libs/executor/src/cachescanoperator.c b/source/libs/executor/src/cachescanoperator.c index 652ebc0f9a..2751cf2851 100644 --- a/source/libs/executor/src/cachescanoperator.c +++ b/source/libs/executor/src/cachescanoperator.c @@ -245,9 +245,14 @@ _error: if (code != TSDB_CODE_SUCCESS) { qError("%s failed at line %d since %s", __func__, lino, tstrerror(code)); } - pInfo->pTableList = NULL; - destroyCacheScanOperator(pInfo); - destroyOperator(pOperator); + if (pInfo != NULL) { + pInfo->pTableList = NULL; + destroyCacheScanOperator(pInfo); + } + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } return code; } @@ -448,7 +453,7 @@ void destroyCacheScanOperator(void* param) { taosArrayDestroy(pInfo->matchInfo.pList); tableListDestroy(pInfo->pTableList); - if (pInfo->pLastrowReader != NULL) { + if (pInfo->pLastrowReader != NULL && pInfo->readHandle.api.cacheFn.closeReader != NULL) { pInfo->readHandle.api.cacheFn.closeReader(pInfo->pLastrowReader); pInfo->pLastrowReader = NULL; } diff --git a/source/libs/executor/src/countwindowoperator.c b/source/libs/executor/src/countwindowoperator.c index 63c0c5fe87..9019fa0fef 100644 --- a/source/libs/executor/src/countwindowoperator.c +++ b/source/libs/executor/src/countwindowoperator.c @@ -294,10 +294,11 @@ int32_t createCountwindowOperatorInfo(SOperatorInfo* downstream, SPhysiNode* phy SSDataBlock* pResBlock = createDataBlockFromDescNode(pCountWindowNode->window.node.pOutputDataBlockDesc); QUERY_CHECK_NULL(pResBlock, code, lino, _error, terrno); + initBasicInfo(&pInfo->binfo, pResBlock); + code = blockDataEnsureCapacity(pResBlock, pOperator->resultInfo.capacity); QUERY_CHECK_CODE(code, lino, _error); - initBasicInfo(&pInfo->binfo, pResBlock); 
initResultRowInfo(&pInfo->binfo.resultRowInfo); pInfo->binfo.inputTsOrder = physiNode->inputTsOrder; pInfo->binfo.outputTsOrder = physiNode->outputTsOrder; @@ -341,7 +342,10 @@ _error: destroyCountWindowOperatorInfo(pInfo); } - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; return code; } diff --git a/source/libs/executor/src/eventwindowoperator.c b/source/libs/executor/src/eventwindowoperator.c index f9ae8be84f..d4e5dedd20 100644 --- a/source/libs/executor/src/eventwindowoperator.c +++ b/source/libs/executor/src/eventwindowoperator.c @@ -110,11 +110,11 @@ int32_t createEventwindowOperatorInfo(SOperatorInfo* downstream, SPhysiNode* phy SSDataBlock* pResBlock = createDataBlockFromDescNode(pEventWindowNode->window.node.pOutputDataBlockDesc); QUERY_CHECK_NULL(pResBlock, code, lino, _error, terrno); + initBasicInfo(&pInfo->binfo, pResBlock); code = blockDataEnsureCapacity(pResBlock, pOperator->resultInfo.capacity); QUERY_CHECK_CODE(code, lino, _error); - initBasicInfo(&pInfo->binfo, pResBlock); initResultRowInfo(&pInfo->binfo.resultRowInfo); pInfo->binfo.inputTsOrder = physiNode->inputTsOrder; pInfo->binfo.outputTsOrder = physiNode->outputTsOrder; @@ -145,7 +145,10 @@ _error: destroyEWindowOperatorInfo(pInfo); } - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; return code; } diff --git a/source/libs/executor/src/exchangeoperator.c b/source/libs/executor/src/exchangeoperator.c index 21b1c2838b..5afae596a4 100644 --- a/source/libs/executor/src/exchangeoperator.c +++ b/source/libs/executor/src/exchangeoperator.c @@ -298,13 +298,13 @@ _end: pTaskInfo->code = code; T_LONG_JMP(pTaskInfo->env, code); } - (*ppRes) = NULL; + (*ppRes) = NULL; return code; } static SSDataBlock* loadRemoteData(SOperatorInfo* pOperator) { SSDataBlock* pRes = NULL; - int32_t code = loadRemoteDataNext(pOperator, &pRes); 
+ int32_t code = loadRemoteDataNext(pOperator, &pRes); return pRes; } @@ -346,6 +346,14 @@ static int32_t initExchangeOperator(SExchangePhysiNode* pExNode, SExchangeInfo* qError("%s invalid number: %d of sources in exchange operator", id, (int32_t)numOfSources); return TSDB_CODE_INVALID_PARA; } + pInfo->pFetchRpcHandles = taosArrayInit(numOfSources, sizeof(int64_t)); + if (!pInfo->pFetchRpcHandles) { + return terrno; + } + void* ret = taosArrayReserve(pInfo->pFetchRpcHandles, numOfSources); + if (!ret) { + return terrno; + } pInfo->pSources = taosArrayInit(numOfSources, sizeof(SDownstreamSourceNode)); if (pInfo->pSources == NULL) { @@ -382,7 +390,14 @@ static int32_t initExchangeOperator(SExchangePhysiNode* pExNode, SExchangeInfo* } initLimitInfo(pExNode->node.pLimit, pExNode->node.pSlimit, &pInfo->limitInfo); - pInfo->self = taosAddRef(exchangeObjRefPool, pInfo); + int64_t refId = taosAddRef(exchangeObjRefPool, pInfo); + if (refId < 0) { + int32_t code = terrno; + qError("%s failed at line %d since %s", __func__, __LINE__, tstrerror(code)); + return code; + } else { + pInfo->self = refId; + } return initDataSource(numOfSources, pInfo, id); } @@ -391,7 +406,7 @@ int32_t createExchangeOperatorInfo(void* pTransporter, SExchangePhysiNode* pExNo SOperatorInfo** pOptrInfo) { QRY_OPTR_CHECK(pOptrInfo); - int32_t code = 0; + int32_t code = 0; int32_t lino = 0; SExchangeInfo* pInfo = taosMemoryCalloc(1, sizeof(SExchangeInfo)); SOperatorInfo* pOperator = taosMemoryCalloc(1, sizeof(SOperatorInfo)); @@ -443,7 +458,10 @@ _error: doDestroyExchangeOperatorInfo(pInfo); } - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; return code; } @@ -468,6 +486,16 @@ void freeSourceDataInfo(void* p) { void doDestroyExchangeOperatorInfo(void* param) { SExchangeInfo* pExInfo = (SExchangeInfo*)param; + if (pExInfo->pFetchRpcHandles) { + for (int32_t i = 0; i < pExInfo->pFetchRpcHandles->size; ++i) { + 
int64_t* pRpcHandle = taosArrayGet(pExInfo->pFetchRpcHandles, i); + if (*pRpcHandle > 0) { + SDownstreamSourceNode* pSource = taosArrayGet(pExInfo->pSources, i); + (void)asyncFreeConnById(pExInfo->pTransporter, *pRpcHandle); + } + } + taosArrayDestroy(pExInfo->pFetchRpcHandles); + } taosArrayDestroy(pExInfo->pSources); taosArrayDestroyEx(pExInfo->pSourceDataInfo, freeSourceDataInfo); @@ -495,6 +523,8 @@ int32_t loadRemoteDataCallback(void* param, SDataBuf* pMsg, int32_t code) { } int32_t index = pWrapper->sourceIndex; + int64_t* pRpcHandle = taosArrayGet(pExchangeInfo->pFetchRpcHandles, index); + *pRpcHandle = -1; SSourceDataInfo* pSourceDataInfo = taosArrayGet(pExchangeInfo->pSourceDataInfo, index); if (!pSourceDataInfo) { return terrno; @@ -668,6 +698,8 @@ int32_t doSendFetchDataRequest(SExchangeInfo* pExchangeInfo, SExecTaskInfo* pTas int64_t transporterId = 0; code = asyncSendMsgToServer(pExchangeInfo->pTransporter, &pSource->addr.epSet, &transporterId, pMsgSendInfo); QUERY_CHECK_CODE(code, lino, _end); + int64_t* pRpcHandle = taosArrayGet(pExchangeInfo->pFetchRpcHandles, sourceIndex); + *pRpcHandle = transporterId; } _end: @@ -686,11 +718,12 @@ void updateLoadRemoteInfo(SLoadRemoteDataInfo* pInfo, int64_t numOfRows, int32_t } int32_t extractDataBlockFromFetchRsp(SSDataBlock* pRes, char* pData, SArray* pColList, char** pNextStart) { - int32_t code = TSDB_CODE_SUCCESS; - int32_t lino = 0; + int32_t code = TSDB_CODE_SUCCESS; + int32_t lino = 0; + SSDataBlock* pBlock = NULL; if (pColList == NULL) { // data from other sources blockDataCleanup(pRes); - code = blockDecode(pRes, pData, (const char**) pNextStart); + code = blockDecode(pRes, pData, (const char**)pNextStart); if (code) { return code; } @@ -710,7 +743,7 @@ int32_t extractDataBlockFromFetchRsp(SSDataBlock* pRes, char* pData, SArray* pCo pStart += sizeof(SSysTableSchema); } - SSDataBlock* pBlock = NULL; + pBlock = NULL; code = createDataBlock(&pBlock); QUERY_CHECK_CODE(code, lino, _end); @@ -735,10 +768,12 
@@ int32_t extractDataBlockFromFetchRsp(SSDataBlock* pRes, char* pData, SArray* pCo QUERY_CHECK_CODE(code, lino, _end); blockDataDestroy(pBlock); + pBlock = NULL; } _end: if (code != TSDB_CODE_SUCCESS) { + blockDataDestroy(pBlock); qError("%s failed at line %d since %s", __func__, lino, tstrerror(code)); } return code; diff --git a/source/libs/executor/src/executil.c b/source/libs/executor/src/executil.c index c2297d9fba..616c2593cb 100644 --- a/source/libs/executor/src/executil.c +++ b/source/libs/executor/src/executil.c @@ -278,7 +278,7 @@ SSDataBlock* createDataBlockFromDescNode(SDataBlockDescNode* pNode) { qError("%s failed at line %d since %s", __func__, __LINE__, tstrerror(code)); blockDataDestroy(pBlock); pBlock = NULL; - terrno = code; + terrno = TSDB_CODE_INVALID_PARA; break; } SColumnInfoData idata = @@ -1094,7 +1094,7 @@ SSDataBlock* createTagValBlockForFilter(SArray* pColList, int32_t numOfTables, S code = blockDataEnsureCapacity(pResBlock, numOfTables); if (code != TSDB_CODE_SUCCESS) { terrno = code; - taosMemoryFree(pResBlock); + blockDataDestroy(pResBlock); return NULL; } @@ -1166,7 +1166,7 @@ SSDataBlock* createTagValBlockForFilter(SArray* pColList, int32_t numOfTables, S _end: if (code != TSDB_CODE_SUCCESS) { - taosMemoryFree(pResBlock); + blockDataDestroy(pResBlock); qError("%s failed at line %d since %s", __func__, lino, tstrerror(code)); terrno = code; return NULL; @@ -1781,7 +1781,7 @@ int32_t createExprFromOneNode(SExprInfo* pExp, SNode* pNode, int16_t slotId) { pExp->base.resSchema = createResSchema(pType->type, pType->bytes, slotId, pType->scale, pType->precision, pValNode->node.aliasName); pExp->base.pParam[0].type = FUNC_PARAM_TYPE_VALUE; - nodesValueNodeToVariant(pValNode, &pExp->base.pParam[0].param); + code = nodesValueNodeToVariant(pValNode, &pExp->base.pParam[0].param); } else if (type == QUERY_NODE_FUNCTION) { pExp->pExpr->nodeType = QUERY_NODE_FUNCTION; SFunctionNode* pFuncNode = (SFunctionNode*)pNode; @@ -1811,12 +1811,10 @@ 
int32_t createExprFromOneNode(SExprInfo* pExp, SNode* pNode, int16_t slotId) { if (TSDB_CODE_SUCCESS == code) { code = nodesMakeNode(QUERY_NODE_VALUE, (SNode**)&res); } - if (TSDB_CODE_SUCCESS != code) { // todo handle error - } else { - res->node.resType = (SDataType){.bytes = sizeof(int64_t), .type = TSDB_DATA_TYPE_BIGINT}; - code = nodesListAppend(pFuncNode->pParameterList, (SNode*)res); - QUERY_CHECK_CODE(code, lino, _end); - } + QUERY_CHECK_CODE(code, lino, _end); + res->node.resType = (SDataType){.bytes = sizeof(int64_t), .type = TSDB_DATA_TYPE_BIGINT}; + code = nodesListAppend(pFuncNode->pParameterList, (SNode*)res); + QUERY_CHECK_CODE(code, lino, _end); } #endif @@ -1826,7 +1824,7 @@ int32_t createExprFromOneNode(SExprInfo* pExp, SNode* pNode, int16_t slotId) { QUERY_CHECK_NULL(pExp->base.pParam, code, lino, _end, terrno); pExp->base.numOfParams = numOfParam; - for (int32_t j = 0; j < numOfParam; ++j) { + for (int32_t j = 0; j < numOfParam && TSDB_CODE_SUCCESS == code; ++j) { SNode* p1 = nodesListGetNode(pFuncNode->pParameterList, j); QUERY_CHECK_NULL(p1, code, lino, _end, terrno); if (p1->type == QUERY_NODE_COLUMN) { @@ -1839,7 +1837,8 @@ int32_t createExprFromOneNode(SExprInfo* pExp, SNode* pNode, int16_t slotId) { } else if (p1->type == QUERY_NODE_VALUE) { SValueNode* pvn = (SValueNode*)p1; pExp->base.pParam[j].type = FUNC_PARAM_TYPE_VALUE; - nodesValueNodeToVariant(pvn, &pExp->base.pParam[j].param); + code = nodesValueNodeToVariant(pvn, &pExp->base.pParam[j].param); + QUERY_CHECK_CODE(code, lino, _end); } } } else if (type == QUERY_NODE_OPERATOR) { @@ -1871,13 +1870,10 @@ int32_t createExprFromOneNode(SExprInfo* pExp, SNode* pNode, int16_t slotId) { SLogicConditionNode* pCond = (SLogicConditionNode*)pNode; pExp->base.pParam = taosMemoryCalloc(1, sizeof(SFunctParam)); QUERY_CHECK_NULL(pExp->base.pParam, code, lino, _end, terrno); - - if (TSDB_CODE_SUCCESS == code) { - pExp->base.numOfParams = 1; - SDataType* pType = &pCond->node.resType; - 
pExp->base.resSchema = createResSchema(pType->type, pType->bytes, slotId, pType->scale, pType->precision, pCond->node.aliasName); - pExp->pExpr->_optrRoot.pRootNode = pNode; - } + pExp->base.numOfParams = 1; + SDataType* pType = &pCond->node.resType; + pExp->base.resSchema = createResSchema(pType->type, pType->bytes, slotId, pType->scale, pType->precision, pCond->node.aliasName); + pExp->pExpr->_optrRoot.pRootNode = pNode; } else { ASSERT(0); } diff --git a/source/libs/executor/src/filloperator.c b/source/libs/executor/src/filloperator.c index 882a0dc4b6..f61b514626 100644 --- a/source/libs/executor/src/filloperator.c +++ b/source/libs/executor/src/filloperator.c @@ -567,7 +567,10 @@ _error: } pTaskInfo->code = code; - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } return code; } diff --git a/source/libs/executor/src/groupcacheoperator.c b/source/libs/executor/src/groupcacheoperator.c index f2e24f160c..00b8c3b9ae 100644 --- a/source/libs/executor/src/groupcacheoperator.c +++ b/source/libs/executor/src/groupcacheoperator.c @@ -1504,7 +1504,10 @@ _error: destroyGroupCacheOperator(pInfo); } - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; return code; } diff --git a/source/libs/executor/src/groupoperator.c b/source/libs/executor/src/groupoperator.c index 309fb4b52f..69a9045004 100644 --- a/source/libs/executor/src/groupoperator.c +++ b/source/libs/executor/src/groupoperator.c @@ -1246,7 +1246,10 @@ _error: destroyPartitionOperatorInfo(pInfo); } pTaskInfo->code = code; - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } TAOS_RETURN(code); } @@ -1792,7 +1795,10 @@ int32_t createStreamPartitionOperatorInfo(SOperatorInfo* downstream, SStreamPart _error: pTaskInfo->code = code; if (pInfo != NULL) destroyStreamPartitionOperatorInfo(pInfo); - 
destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } qError("%s failed at line %d since %s", __func__, lino, tstrerror(code)); return code; } diff --git a/source/libs/executor/src/operator.c b/source/libs/executor/src/operator.c index 174e68ea7a..af52c31364 100644 --- a/source/libs/executor/src/operator.c +++ b/source/libs/executor/src/operator.c @@ -650,7 +650,7 @@ void destroyOperator(SOperatorInfo* pOperator) { freeResetOperatorParams(pOperator, OP_GET_PARAM, true); freeResetOperatorParams(pOperator, OP_NOTIFY_PARAM, true); - if (pOperator->fpSet.closeFn != NULL) { + if (pOperator->fpSet.closeFn != NULL && pOperator->info != NULL) { pOperator->fpSet.closeFn(pOperator->info); } diff --git a/source/libs/executor/src/projectoperator.c b/source/libs/executor/src/projectoperator.c index 7c9f242a63..66a7408b13 100644 --- a/source/libs/executor/src/projectoperator.c +++ b/source/libs/executor/src/projectoperator.c @@ -179,8 +179,11 @@ int32_t createProjectOperatorInfo(SOperatorInfo* downstream, SProjectPhysiNode* return code; _error: - destroyProjectOperatorInfo(pInfo); - destroyOperator(pOperator); + if (pInfo != NULL) destroyProjectOperatorInfo(pInfo); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; return code; } @@ -528,8 +531,11 @@ int32_t createIndefinitOutputOperatorInfo(SOperatorInfo* downstream, SPhysiNode* return code; _error: - destroyIndefinitOperatorInfo(pInfo); - destroyOperator(pOperator); + if (pInfo != NULL) destroyIndefinitOperatorInfo(pInfo); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; return code; } diff --git a/source/libs/executor/src/scanoperator.c b/source/libs/executor/src/scanoperator.c index da0df786fb..e15dcf806a 100644 --- a/source/libs/executor/src/scanoperator.c +++ b/source/libs/executor/src/scanoperator.c @@ -88,20 +88,26 @@ static void 
switchCtxOrder(SqlFunctionCtx* pCtx, int32_t numOfOutput) { } } -static bool overlapWithTimeWindow(SInterval* pInterval, SDataBlockInfo* pBlockInfo, int32_t order) { +static int32_t overlapWithTimeWindow(SInterval* pInterval, SDataBlockInfo* pBlockInfo, int32_t order, bool* overlap) { + int32_t code = TSDB_CODE_SUCCESS; STimeWindow w = {0}; // 0 by default, which means it is not a interval operator of the upstream operator. if (pInterval->interval == 0) { - return false; + *overlap = false; + return code; } if (order == TSDB_ORDER_ASC) { w = getAlignQueryTimeWindow(pInterval, pBlockInfo->window.skey); - ASSERT(w.ekey >= pBlockInfo->window.skey); + if(w.ekey < pBlockInfo->window.skey) { + qError("w.ekey:%" PRId64 " < pBlockInfo->window.skey:%" PRId64, w.ekey, pBlockInfo->window.skey); + return TSDB_CODE_QRY_EXECUTOR_INTERNAL_ERROR; + } if (w.ekey < pBlockInfo->window.ekey) { - return true; + *overlap = true; + return code; } while (1) { @@ -110,17 +116,25 @@ static bool overlapWithTimeWindow(SInterval* pInterval, SDataBlockInfo* pBlockIn break; } - ASSERT(w.ekey > pBlockInfo->window.ekey); + if(w.ekey <= pBlockInfo->window.ekey) { + qError("w.ekey:%" PRId64 " <= pBlockInfo->window.ekey:%" PRId64, w.ekey, pBlockInfo->window.ekey); + return TSDB_CODE_QRY_EXECUTOR_INTERNAL_ERROR; + } if (TMAX(w.skey, pBlockInfo->window.skey) <= pBlockInfo->window.ekey) { - return true; + *overlap = true; + return code; } } } else { w = getAlignQueryTimeWindow(pInterval, pBlockInfo->window.ekey); - ASSERT(w.skey <= pBlockInfo->window.ekey); + if(w.skey > pBlockInfo->window.ekey) { + qError("w.skey:%" PRId64 " > pBlockInfo->window.ekey:%" PRId64, w.skey, pBlockInfo->window.ekey); + return TSDB_CODE_QRY_EXECUTOR_INTERNAL_ERROR; + } if (w.skey > pBlockInfo->window.skey) { - return true; + *overlap = true; + return code; } while (1) { @@ -129,14 +143,19 @@ static bool overlapWithTimeWindow(SInterval* pInterval, SDataBlockInfo* pBlockIn break; } - ASSERT(w.skey < pBlockInfo->window.skey); + 
if(w.skey >= pBlockInfo->window.skey){ + qError("w.skey:%" PRId64 " >= pBlockInfo->window.skey:%" PRId64, w.skey, pBlockInfo->window.skey); + return TSDB_CODE_QRY_EXECUTOR_INTERNAL_ERROR; + } if (pBlockInfo->window.skey <= TMIN(w.ekey, pBlockInfo->window.ekey)) { - return true; + *overlap = true; + return code; } } } - return false; + *overlap = false; + return code; } // this function is for table scanner to extract temporary results of upstream aggregate results. @@ -319,9 +338,18 @@ static int32_t loadDataBlock(SOperatorInfo* pOperator, STableScanBase* pTableSca bool loadSMA = false; *status = pTableScanInfo->dataBlockLoadFlag; - if (pOperator->exprSupp.pFilterInfo != NULL || - overlapWithTimeWindow(&pTableScanInfo->pdInfo.interval, &pBlock->info, pTableScanInfo->cond.order)) { + if (pOperator->exprSupp.pFilterInfo != NULL) { (*status) = FUNC_DATA_REQUIRED_DATA_LOAD; + } else { + bool overlap = false; + int ret = + overlapWithTimeWindow(&pTableScanInfo->pdInfo.interval, &pBlock->info, pTableScanInfo->cond.order, &overlap); + if (ret != TSDB_CODE_SUCCESS) { + return ret; + } + if (overlap) { + (*status) = FUNC_DATA_REQUIRED_DATA_LOAD; + } } SDataBlockInfo* pBlockInfo = &pBlock->info; @@ -358,7 +386,10 @@ static int32_t loadDataBlock(SOperatorInfo* pOperator, STableScanBase* pTableSca } } - ASSERT(*status == FUNC_DATA_REQUIRED_DATA_LOAD); + if(*status != FUNC_DATA_REQUIRED_DATA_LOAD) { + qError("[loadDataBlock] invalid status:%d", *status); + return TSDB_CODE_QRY_EXECUTOR_INTERNAL_ERROR; + } // try to filter data block according to sma info if (pOperator->exprSupp.pFilterInfo != NULL && (!loadSMA)) { @@ -413,7 +444,10 @@ static int32_t loadDataBlock(SOperatorInfo* pOperator, STableScanBase* pTableSca return code; } - ASSERT(p == pBlock); + if(p != pBlock) { + qError("[loadDataBlock] p != pBlock"); + return TSDB_CODE_QRY_EXECUTOR_INTERNAL_ERROR; + } doSetTagColumnData(pTableScanInfo, pBlock, pTaskInfo, pBlock->info.rows); // restore the previous value @@ -1432,7 
+1466,10 @@ _error: destroyTableScanOperatorInfo(pInfo); } - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; return code; } @@ -1531,6 +1568,8 @@ void resetTableScanInfo(STableScanInfo* pTableScanInfo, STimeWindow* pWin, uint6 static SSDataBlock* readPreVersionData(SOperatorInfo* pTableScanOp, uint64_t tbUid, TSKEY startTs, TSKEY endTs, int64_t maxVersion) { + int32_t code = TSDB_CODE_SUCCESS; + int32_t lino = 0; STableKeyInfo tblInfo = {.uid = tbUid, .groupId = 0}; STableScanInfo* pTableScanInfo = pTableScanOp->info; @@ -1545,37 +1584,33 @@ static SSDataBlock* readPreVersionData(SOperatorInfo* pTableScanOp, uint64_t tbU SSDataBlock* pBlock = pTableScanInfo->pResBlock; STsdbReader* pReader = NULL; - int32_t code = pAPI->tsdReader.tsdReaderOpen(pTableScanInfo->base.readHandle.vnode, &cond, &tblInfo, 1, pBlock, + code = pAPI->tsdReader.tsdReaderOpen(pTableScanInfo->base.readHandle.vnode, &cond, &tblInfo, 1, pBlock, (void**)&pReader, GET_TASKID(pTaskInfo), NULL); - if (code != TSDB_CODE_SUCCESS) { - terrno = code; - T_LONG_JMP(pTaskInfo->env, code); - return NULL; - } + QUERY_CHECK_CODE(code, lino, _end); bool hasNext = false; code = pAPI->tsdReader.tsdNextDataBlock(pReader, &hasNext); - if (code != TSDB_CODE_SUCCESS) { - terrno = code; - T_LONG_JMP(pTaskInfo->env, code); - return NULL; - } + QUERY_CHECK_CODE(code, lino, _end); if (hasNext) { SSDataBlock* p = NULL; code = pAPI->tsdReader.tsdReaderRetrieveDataBlock(pReader, &p, NULL); - if (code != TSDB_CODE_SUCCESS) { - return NULL; - } + QUERY_CHECK_CODE(code, lino, _end); doSetTagColumnData(&pTableScanInfo->base, pBlock, pTaskInfo, pBlock->info.rows); pBlock->info.id.groupId = tableListGetTableGroupId(pTableScanInfo->base.pTableListInfo, pBlock->info.id.uid); } +_end: pAPI->tsdReader.tsdReaderClose(pReader); qDebug("retrieve prev rows:%" PRId64 ", skey:%" PRId64 ", ekey:%" PRId64 " uid:%" PRIu64 ", max ver:%" PRId64 ", suid:%" 
PRIu64, pBlock->info.rows, startTs, endTs, tbUid, maxVersion, cond.suid); + if (code != TSDB_CODE_SUCCESS) { + qError("%s failed at line %d since %s", __func__, lino, tstrerror(code)); + terrno = code; + return NULL; + } return pBlock->info.rows > 0 ? pBlock : NULL; } @@ -2222,6 +2257,10 @@ static int32_t generatePartitionDelResBlock(SStreamScanInfo* pInfo, SSDataBlock* uint64_t srcUid = srcUidData[delI]; char tbname[VARSTR_HEADER_SIZE + TSDB_TABLE_NAME_LEN] = {0}; SSDataBlock* pPreRes = readPreVersionData(pInfo->pTableScanOp, srcUid, srcStartTsCol[delI], srcEndTsCol[delI], ver); + if (!pPreRes) { + qError("%s failed at line %d since %s", __func__, __LINE__, tstrerror(terrno)); + continue; + } code = blockDataEnsureCapacity(pDestBlock, pDestBlock->info.rows + pPreRes->info.rows); QUERY_CHECK_CODE(code, lino, _end); for (int32_t preJ = 0; preJ < pPreRes->info.rows; preJ++) { @@ -2294,6 +2333,10 @@ static int32_t generateDeleteResultBlockImpl(SStreamScanInfo* pInfo, SSDataBlock if (winCode != TSDB_CODE_SUCCESS) { SSDataBlock* pPreRes = readPreVersionData(pInfo->pTableScanOp, srcUid, srcStartTsCol[i], srcStartTsCol[i], ver); + if (!pPreRes) { + qError("%s failed at line %d since %s", __func__, __LINE__, tstrerror(terrno)); + continue; + } printDataBlock(pPreRes, "pre res", GET_TASKID(pInfo->pStreamScanOp->pTaskInfo)); code = calBlockTbName(pInfo, pPreRes, 0); QUERY_CHECK_CODE(code, lino, _end); @@ -3743,7 +3786,7 @@ static void destroyStreamScanOperatorInfo(void* param) { destroyOperator(pStreamScan->pTableScanOp); } - if (pStreamScan->tqReader) { + if (pStreamScan->tqReader != NULL && pStreamScan->readerFn.tqReaderClose != NULL) { pStreamScan->readerFn.tqReaderClose(pStreamScan->tqReader); } if (pStreamScan->matchInfo.pList) { @@ -4113,7 +4156,10 @@ _error: destroyStreamScanOperatorInfo(pInfo); } - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; return code; } @@ -4279,11 
+4325,12 @@ static int32_t tagScanFilterByTagCond(SArray* aUidTags, SNode* pTagCond, SArray* int32_t code = TSDB_CODE_SUCCESS; int32_t lino = 0; int32_t numOfTables = taosArrayGetSize(aUidTags); + SArray* pBlockList = NULL; SSDataBlock* pResBlock = createTagValBlockForFilter(pInfo->filterCtx.cInfoList, numOfTables, aUidTags, pVnode, pAPI); QUERY_CHECK_NULL(pResBlock, code, lino, _end, terrno); - SArray* pBlockList = taosArrayInit(1, POINTER_BYTES); + pBlockList = taosArrayInit(1, POINTER_BYTES); QUERY_CHECK_NULL(pBlockList, code, lino, _end, terrno); void* tmp = taosArrayPush(pBlockList, &pResBlock); @@ -4573,7 +4620,7 @@ static SSDataBlock* doTagScanFromMetaEntry(SOperatorInfo* pOperator) { static void destroyTagScanOperatorInfo(void* param) { STagScanInfo* pInfo = (STagScanInfo*)param; - if (pInfo->pCtbCursor != NULL) { + if (pInfo->pCtbCursor != NULL && pInfo->pStorageAPI != NULL) { pInfo->pStorageAPI->metaFn.closeCtbCursor(pInfo->pCtbCursor); } taosHashCleanup(pInfo->filterCtx.colHash); @@ -4670,7 +4717,10 @@ _error: } if (pInfo != NULL) destroyTagScanOperatorInfo(pInfo); - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } return code; } @@ -5735,8 +5785,10 @@ void destroyTableMergeScanOperatorInfo(void* param) { STableMergeScanInfo* pTableScanInfo = (STableMergeScanInfo*)param; // start one reader variable - pTableScanInfo->base.readerAPI.tsdReaderClose(pTableScanInfo->base.dataReader); - pTableScanInfo->base.dataReader = NULL; + if (pTableScanInfo->base.readerAPI.tsdReaderClose != NULL) { + pTableScanInfo->base.readerAPI.tsdReaderClose(pTableScanInfo->base.dataReader); + pTableScanInfo->base.dataReader = NULL; + } for (int32_t i = 0; i < pTableScanInfo->numNextDurationBlocks; ++i) { if (pTableScanInfo->nextDurationBlocks[i] != NULL) { @@ -5791,7 +5843,8 @@ int32_t createTableMergeScanOperatorInfo(STableScanPhysiNode* pTableScanNode, SR SOperatorInfo** pOptrInfo) { QRY_OPTR_CHECK(pOptrInfo); - 
int32_t code = 0; + int32_t code = TSDB_CODE_SUCCESS; + int32_t lino = 0; STableMergeScanInfo* pInfo = taosMemoryCalloc(1, sizeof(STableMergeScanInfo)); SOperatorInfo* pOperator = taosMemoryCalloc(1, sizeof(SOperatorInfo)); if (pInfo == NULL || pOperator == NULL) { @@ -5804,16 +5857,10 @@ int32_t createTableMergeScanOperatorInfo(STableScanPhysiNode* pTableScanNode, SR int32_t numOfCols = 0; code = extractColMatchInfo(pTableScanNode->scan.pScanCols, pDescNode, &numOfCols, COL_MATCH_FROM_COL_ID, &pInfo->base.matchInfo); - int32_t lino = 0; - if (code != TSDB_CODE_SUCCESS) { - goto _error; - } + QUERY_CHECK_CODE(code, lino, _error); code = initQueryTableDataCond(&pInfo->base.cond, pTableScanNode, readHandle); - if (code != TSDB_CODE_SUCCESS) { - taosArrayDestroy(pInfo->base.matchInfo.pList); - goto _error; - } + QUERY_CHECK_CODE(code, lino, _error); if (pTableScanNode->scan.pScanPseudoCols != NULL) { SExprSupp* pSup = &pInfo->base.pseudoSup; @@ -5828,10 +5875,7 @@ int32_t createTableMergeScanOperatorInfo(STableScanPhysiNode* pTableScanNode, SR pInfo->scanInfo = (SScanInfo){.numOfAsc = pTableScanNode->scanSeq[0], .numOfDesc = pTableScanNode->scanSeq[1]}; pInfo->base.metaCache.pTableMetaEntryCache = taosLRUCacheInit(1024 * 128, -1, .5); - if (pInfo->base.metaCache.pTableMetaEntryCache == NULL) { - code = terrno; - goto _error; - } + QUERY_CHECK_NULL(pInfo->base.metaCache.pTableMetaEntryCache, code, lino, _error, terrno); pInfo->base.readerAPI = pTaskInfo->storageAPI.tsdReader; pInfo->base.dataBlockLoadFlag = FUNC_DATA_REQUIRED_DATA_LOAD; @@ -5848,9 +5892,7 @@ int32_t createTableMergeScanOperatorInfo(STableScanPhysiNode* pTableScanNode, SR pInfo->sample.seed = taosGetTimestampSec(); code = filterInitFromNode((SNode*)pTableScanNode->scan.node.pConditions, &pOperator->exprSupp.pFilterInfo, 0); - if (code != TSDB_CODE_SUCCESS) { - goto _error; - } + QUERY_CHECK_CODE(code, lino, _error); initLimitInfo(pTableScanNode->scan.node.pLimit, pTableScanNode->scan.node.pSlimit, 
&pInfo->limitInfo); @@ -5912,10 +5954,18 @@ int32_t createTableMergeScanOperatorInfo(STableScanPhysiNode* pTableScanNode, SR return code; _error: + if (code != TSDB_CODE_SUCCESS) { + qError("%s failed at line %d since %s", __func__, lino, tstrerror(code)); + } pTaskInfo->code = code; - pInfo->base.pTableListInfo = NULL; - if (pInfo != NULL) destroyTableMergeScanOperatorInfo(pInfo); - destroyOperator(pOperator); + if (pInfo != NULL) { + pInfo->base.pTableListInfo = NULL; + destroyTableMergeScanOperatorInfo(pInfo); + } + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } return code; } @@ -6072,7 +6122,10 @@ _error: if (pInfo != NULL) { destoryTableCountScanOperator(pInfo); } - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; return code; } diff --git a/source/libs/executor/src/sortoperator.c b/source/libs/executor/src/sortoperator.c index 2a1b4f3e23..59b4e1cbbb 100644 --- a/source/libs/executor/src/sortoperator.c +++ b/source/libs/executor/src/sortoperator.c @@ -164,7 +164,10 @@ _error: destroySortOperatorInfo(pInfo); } - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; return code; } @@ -836,6 +839,9 @@ _error: if (pInfo != NULL) { destroyGroupSortOperatorInfo(pInfo); } - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } return code; } diff --git a/source/libs/executor/src/streamcountwindowoperator.c b/source/libs/executor/src/streamcountwindowoperator.c index 39fe78502d..62506858fc 100644 --- a/source/libs/executor/src/streamcountwindowoperator.c +++ b/source/libs/executor/src/streamcountwindowoperator.c @@ -926,7 +926,10 @@ _error: destroyStreamCountAggOperatorInfo(pInfo); } - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } 
pTaskInfo->code = code; qError("%s failed at line %d since %s", __func__, lino, tstrerror(code)); return code; diff --git a/source/libs/executor/src/streameventwindowoperator.c b/source/libs/executor/src/streameventwindowoperator.c index 57e31cfebe..bde6198709 100644 --- a/source/libs/executor/src/streameventwindowoperator.c +++ b/source/libs/executor/src/streameventwindowoperator.c @@ -981,7 +981,10 @@ int32_t createStreamEventAggOperatorInfo(SOperatorInfo* downstream, SPhysiNode* _error: if (pInfo != NULL) destroyStreamEventOperatorInfo(pInfo); - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; qError("%s failed at line %d since %s", __func__, lino, tstrerror(code)); return code; diff --git a/source/libs/executor/src/streamfilloperator.c b/source/libs/executor/src/streamfilloperator.c index b8686d0f19..8d91af46b1 100644 --- a/source/libs/executor/src/streamfilloperator.c +++ b/source/libs/executor/src/streamfilloperator.c @@ -1459,7 +1459,10 @@ _error: qError("%s failed at line %d since %s. 
task:%s", __func__, lino, tstrerror(code), GET_TASKID(pTaskInfo)); } if (pInfo != NULL) destroyStreamFillOperatorInfo(pInfo); - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; return code; } diff --git a/source/libs/executor/src/streamtimewindowoperator.c b/source/libs/executor/src/streamtimewindowoperator.c index 08b644b6ec..61d4eed156 100644 --- a/source/libs/executor/src/streamtimewindowoperator.c +++ b/source/libs/executor/src/streamtimewindowoperator.c @@ -2013,7 +2013,10 @@ int32_t createStreamFinalIntervalOperatorInfo(SOperatorInfo* downstream, SPhysiN _error: if (pInfo != NULL) destroyStreamFinalIntervalOperatorInfo(pInfo); - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; return code; } @@ -3832,7 +3835,10 @@ _error: destroyStreamSessionAggOperatorInfo(pInfo); } - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; qError("%s failed at line %d since %s", __func__, lino, tstrerror(code)); return code; @@ -4088,7 +4094,10 @@ _error: if (pInfo != NULL) { destroyStreamSessionAggOperatorInfo(pInfo); } - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; if (code != TSDB_CODE_SUCCESS) { qError("%s failed at line %d since %s. 
task:%s", __func__, lino, tstrerror(code), GET_TASKID(pTaskInfo)); @@ -4978,7 +4987,10 @@ int32_t createStreamStateAggOperatorInfo(SOperatorInfo* downstream, SPhysiNode* _error: if (pInfo != NULL) destroyStreamStateOperatorInfo(pInfo); - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; qError("%s failed at line %d since %s", __func__, lino, tstrerror(code)); return code; @@ -5191,6 +5203,8 @@ int32_t createStreamIntervalOperatorInfo(SOperatorInfo* downstream, SPhysiNode* SSDataBlock* pResBlock = createDataBlockFromDescNode(pPhyNode->pOutputDataBlockDesc); QUERY_CHECK_NULL(pResBlock, code, lino, _error, terrno); + initBasicInfo(&pInfo->binfo, pResBlock); + pInfo->interval = (SInterval){ .interval = pIntervalPhyNode->interval, .sliding = pIntervalPhyNode->sliding, @@ -5218,7 +5232,6 @@ int32_t createStreamIntervalOperatorInfo(SOperatorInfo* downstream, SPhysiNode* SExprSupp* pSup = &pOperator->exprSupp; pSup->hasWindowOrGroup = true; - initBasicInfo(&pInfo->binfo, pResBlock); code = initExecTimeWindowInfo(&pInfo->twAggSup.timeWindowData, &pTaskInfo->window); QUERY_CHECK_CODE(code, lino, _error); @@ -5313,7 +5326,10 @@ int32_t createStreamIntervalOperatorInfo(SOperatorInfo* downstream, SPhysiNode* _error: if (pInfo != NULL) destroyStreamFinalIntervalOperatorInfo(pInfo); - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; return code; } diff --git a/source/libs/executor/src/sysscanoperator.c b/source/libs/executor/src/sysscanoperator.c index cdd22c2adc..e11ee6b0dc 100644 --- a/source/libs/executor/src/sysscanoperator.c +++ b/source/libs/executor/src/sysscanoperator.c @@ -568,7 +568,7 @@ static SSDataBlock* sysTableScanUserCols(SOperatorInfo* pOperator) { if (pInfo->pCur == NULL) { pInfo->pCur = pAPI->metaFn.openTableMetaCursor(pInfo->readHandle.vnode); } else { - 
pAPI->metaFn.resumeTableMetaCursor(pInfo->pCur, 0, 0); + (void)pAPI->metaFn.resumeTableMetaCursor(pInfo->pCur, 0, 0); } if (pInfo->pSchema == NULL) { @@ -672,6 +672,7 @@ static SSDataBlock* sysTableScanUserCols(SOperatorInfo* pOperator) { } blockDataDestroy(pDataBlock); + pDataBlock = NULL; if (ret != 0) { pAPI->metaFn.closeTableMetaCursor(pInfo->pCur); pInfo->pCur = NULL; @@ -683,6 +684,7 @@ static SSDataBlock* sysTableScanUserCols(SOperatorInfo* pOperator) { _end: if (code != TSDB_CODE_SUCCESS) { + blockDataDestroy(pDataBlock); qError("%s failed at line %d since %s", __func__, lino, tstrerror(code)); pTaskInfo->code = code; T_LONG_JMP(pTaskInfo->env, code); @@ -695,6 +697,7 @@ static SSDataBlock* sysTableScanUserTags(SOperatorInfo* pOperator) { int32_t lino = 0; SExecTaskInfo* pTaskInfo = pOperator->pTaskInfo; SStorageAPI* pAPI = &pTaskInfo->storageAPI; + SSDataBlock* dataBlock = NULL; SSysTableScanInfo* pInfo = pOperator->info; if (pOperator->status == OP_EXEC_DONE) { @@ -704,7 +707,7 @@ static SSDataBlock* sysTableScanUserTags(SOperatorInfo* pOperator) { blockDataCleanup(pInfo->pRes); int32_t numOfRows = 0; - SSDataBlock* dataBlock = buildInfoSchemaTableMetaBlock(TSDB_INS_TABLE_TAGS); + dataBlock = buildInfoSchemaTableMetaBlock(TSDB_INS_TABLE_TAGS); code = blockDataEnsureCapacity(dataBlock, pOperator->resultInfo.capacity); QUERY_CHECK_CODE(code, lino, _end); @@ -777,8 +780,9 @@ static SSDataBlock* sysTableScanUserTags(SOperatorInfo* pOperator) { int32_t ret = 0; if (pInfo->pCur == NULL) { pInfo->pCur = pAPI->metaFn.openTableMetaCursor(pInfo->readHandle.vnode); + QUERY_CHECK_NULL(pInfo->pCur, code, lino, _end, terrno); } else { - pAPI->metaFn.resumeTableMetaCursor(pInfo->pCur, 0, 0); + (void)pAPI->metaFn.resumeTableMetaCursor(pInfo->pCur, 0, 0); } while ((ret = pAPI->metaFn.cursorNext(pInfo->pCur, TSDB_SUPER_TABLE)) == 0) { @@ -826,6 +830,7 @@ static SSDataBlock* sysTableScanUserTags(SOperatorInfo* pOperator) { } blockDataDestroy(dataBlock); + dataBlock = NULL; 
if (ret != 0) { pAPI->metaFn.closeTableMetaCursor(pInfo->pCur); pInfo->pCur = NULL; @@ -837,6 +842,7 @@ static SSDataBlock* sysTableScanUserTags(SOperatorInfo* pOperator) { _end: if (code != TSDB_CODE_SUCCESS) { qError("%s failed at line %d since %s", __func__, lino, tstrerror(code)); + blockDataDestroy(dataBlock); pAPI->metaFn.closeTableMetaCursor(pInfo->pCur); pInfo->pCur = NULL; pTaskInfo->code = code; @@ -1196,7 +1202,7 @@ static SSDataBlock* buildInfoSchemaTableMetaBlock(char* tableName) { } SSDataBlock* pBlock = NULL; - int32_t code = createDataBlock(&pBlock); + int32_t code = createDataBlock(&pBlock); if (code) { terrno = code; return NULL; @@ -1310,9 +1316,11 @@ int32_t buildSysDbTableInfo(const SSysTableScanInfo* pInfo, int32_t capacity) { QUERY_CHECK_CODE(code, lino, _end); blockDataDestroy(p); + p = NULL; _end: if (code != TSDB_CODE_SUCCESS) { + blockDataDestroy(p); qError("%s failed at line %d since %s", __func__, lino, tstrerror(code)); } return code; @@ -1325,6 +1333,7 @@ static SSDataBlock* sysTableBuildUserTablesByUids(SOperatorInfo* pOperator) { SStorageAPI* pAPI = &pTaskInfo->storageAPI; SSysTableScanInfo* pInfo = pOperator->info; SSysTableIndex* pIdx = pInfo->pIdx; + SSDataBlock* p = NULL; blockDataCleanup(pInfo->pRes); int32_t numOfRows = 0; @@ -1344,7 +1353,7 @@ static SSDataBlock* sysTableBuildUserTablesByUids(SOperatorInfo* pOperator) { varDataSetLen(dbname, strlen(varDataVal(dbname))); - SSDataBlock* p = buildInfoSchemaTableMetaBlock(TSDB_INS_TABLE_TABLES); + p = buildInfoSchemaTableMetaBlock(TSDB_INS_TABLE_TABLES); code = blockDataEnsureCapacity(p, pOperator->resultInfo.capacity); QUERY_CHECK_CODE(code, lino, _end); @@ -1366,7 +1375,7 @@ static SSDataBlock* sysTableBuildUserTablesByUids(SOperatorInfo* pOperator) { // table name SColumnInfoData* pColInfoData = taosArrayGet(p->pDataBlock, 0); QUERY_CHECK_NULL(pColInfoData, code, lino, _end, terrno); - + code = colDataSetVal(pColInfoData, numOfRows, n, false); QUERY_CHECK_CODE(code, lino, 
_end); @@ -1545,12 +1554,14 @@ static SSDataBlock* sysTableBuildUserTablesByUids(SOperatorInfo* pOperator) { } blockDataDestroy(p); + p = NULL; pInfo->loadInfo.totalRows += pInfo->pRes->info.rows; _end: if (code != TSDB_CODE_SUCCESS) { qError("%s failed at line %d since %s", __func__, lino, tstrerror(code)); + blockDataDestroy(p); pTaskInfo->code = code; T_LONG_JMP(pTaskInfo->env, code); } @@ -1563,14 +1574,16 @@ static SSDataBlock* sysTableBuildUserTables(SOperatorInfo* pOperator) { SExecTaskInfo* pTaskInfo = pOperator->pTaskInfo; SStorageAPI* pAPI = &pTaskInfo->storageAPI; int8_t firstMetaCursor = 0; + SSDataBlock* p = NULL; SSysTableScanInfo* pInfo = pOperator->info; if (pInfo->pCur == NULL) { pInfo->pCur = pAPI->metaFn.openTableMetaCursor(pInfo->readHandle.vnode); + QUERY_CHECK_NULL(pInfo->pCur, code, lino, _end, terrno); firstMetaCursor = 1; } if (!firstMetaCursor) { - pAPI->metaFn.resumeTableMetaCursor(pInfo->pCur, 0, 1); + (void)pAPI->metaFn.resumeTableMetaCursor(pInfo->pCur, 0, 1); } blockDataCleanup(pInfo->pRes); @@ -1590,7 +1603,7 @@ static SSDataBlock* sysTableBuildUserTables(SOperatorInfo* pOperator) { varDataSetLen(dbname, strlen(varDataVal(dbname))); - SSDataBlock* p = buildInfoSchemaTableMetaBlock(TSDB_INS_TABLE_TABLES); + p = buildInfoSchemaTableMetaBlock(TSDB_INS_TABLE_TABLES); QUERY_CHECK_NULL(p, code, lino, _end, terrno); code = blockDataEnsureCapacity(p, pOperator->resultInfo.capacity); @@ -1783,6 +1796,7 @@ static SSDataBlock* sysTableBuildUserTables(SOperatorInfo* pOperator) { } blockDataDestroy(p); + p = NULL; // todo temporarily free the cursor here, the true reason why the free is not valid needs to be found if (ret != 0) { @@ -1796,6 +1810,7 @@ static SSDataBlock* sysTableBuildUserTables(SOperatorInfo* pOperator) { _end: if (code != TSDB_CODE_SUCCESS) { qError("%s failed at line %d since %s", __func__, lino, tstrerror(code)); + blockDataDestroy(p); pTaskInfo->code = code; T_LONG_JMP(pTaskInfo->env, code); } @@ -2021,7 +2036,7 @@ static 
void sysTableScanFillTbName(SOperatorInfo* pOperator, const SSysTableScan if (pInfo->tbnameSlotId != -1) { SColumnInfoData* pColumnInfoData = (SColumnInfoData*)taosArrayGet(pBlock->pDataBlock, pInfo->tbnameSlotId); QUERY_CHECK_NULL(pColumnInfoData, code, lino, _end, terrno); - char varTbName[TSDB_TABLE_FNAME_LEN - 1 + VARSTR_HEADER_SIZE] = {0}; + char varTbName[TSDB_TABLE_FNAME_LEN - 1 + VARSTR_HEADER_SIZE] = {0}; STR_TO_VARSTR(varTbName, name); code = colDataSetNItems(pColumnInfoData, 0, varTbName, pBlock->info.rows, true); @@ -2138,8 +2153,8 @@ static SSDataBlock* sysTableScanFromMNode(SOperatorInfo* pOperator, SSysTableSca } } -int32_t createSysTableScanOperatorInfo(void* readHandle, SSystemTableScanPhysiNode* pScanPhyNode, - const char* pUser, SExecTaskInfo* pTaskInfo, SOperatorInfo** pOptrInfo) { +int32_t createSysTableScanOperatorInfo(void* readHandle, SSystemTableScanPhysiNode* pScanPhyNode, const char* pUser, + SExecTaskInfo* pTaskInfo, SOperatorInfo** pOptrInfo) { QRY_OPTR_CHECK(pOptrInfo); int32_t code = TSDB_CODE_SUCCESS; @@ -2148,7 +2163,7 @@ int32_t createSysTableScanOperatorInfo(void* readHandle, SSystemTableScanPhysiNo SOperatorInfo* pOperator = taosMemoryCalloc(1, sizeof(SOperatorInfo)); if (pInfo == NULL || pOperator == NULL) { code = TSDB_CODE_OUT_OF_MEMORY; - lino = __LINE__; + lino = __LINE__; goto _error; } @@ -2209,7 +2224,10 @@ _error: if (code != TSDB_CODE_SUCCESS) { qError("%s failed at line %d since %s", __func__, lino, tstrerror(code)); } - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; return code; } @@ -2244,7 +2262,7 @@ void destroySysScanOperator(void* param) { if (strncasecmp(name, TSDB_INS_TABLE_TABLES, TSDB_TABLE_FNAME_LEN) == 0 || strncasecmp(name, TSDB_INS_TABLE_TAGS, TSDB_TABLE_FNAME_LEN) == 0 || strncasecmp(name, TSDB_INS_TABLE_COLS, TSDB_TABLE_FNAME_LEN) == 0 || pInfo->pCur != NULL) { - if (pInfo->pAPI->metaFn.closeTableMetaCursor != 
NULL) { + if (pInfo->pAPI != NULL && pInfo->pAPI->metaFn.closeTableMetaCursor != NULL) { pInfo->pAPI->metaFn.closeTableMetaCursor(pInfo->pCur); } @@ -2437,10 +2455,10 @@ static FORCE_INLINE int optSysBinarySearch(SArray* arr, int s, int e, uint64_t k } int32_t optSysIntersection(SArray* in, SArray* out) { - int32_t code = TSDB_CODE_SUCCESS; - int32_t lino = 0; + int32_t code = TSDB_CODE_SUCCESS; + int32_t lino = 0; MergeIndex* mi = NULL; - int32_t sz = (int32_t)taosArrayGetSize(in); + int32_t sz = (int32_t)taosArrayGetSize(in); if (sz <= 0) { goto _end; } @@ -2678,7 +2696,6 @@ static int32_t doBlockInfoScanNext(SOperatorInfo* pOperator, SSDataBlock** ppRes int32_t slotId = pOperator->exprSupp.pExprInfo->base.resSchema.slotId; SColumnInfoData* pColInfo = taosArrayGet(pBlock->pDataBlock, slotId); QUERY_CHECK_NULL(pColInfo, code, lino, _end, terrno); - int32_t len = tSerializeBlockDistInfo(NULL, 0, &blockDistInfo); char* p = taosMemoryCalloc(1, len + VARSTR_HEADER_SIZE); @@ -2700,7 +2717,7 @@ static int32_t doBlockInfoScanNext(SOperatorInfo* pOperator, SSDataBlock** ppRes if (slotId != 0) { SColumnInfoData* p1 = taosArrayGet(pBlock->pDataBlock, 0); QUERY_CHECK_NULL(p1, code, lino, _end, terrno); - int64_t v = 0; + int64_t v = 0; colDataSetInt64(p1, 0, &v); } @@ -2726,7 +2743,9 @@ static SSDataBlock* doBlockInfoScan(SOperatorInfo* pOperator) { static void destroyBlockDistScanOperatorInfo(void* param) { SBlockDistInfo* pDistInfo = (SBlockDistInfo*)param; blockDataDestroy(pDistInfo->pResBlock); - pDistInfo->readHandle.api.tsdReader.tsdReaderClose(pDistInfo->pHandle); + if (pDistInfo->readHandle.api.tsdReader.tsdReaderClose != NULL) { + pDistInfo->readHandle.api.tsdReader.tsdReaderClose(pDistInfo->pHandle); + } tableListDestroy(pDistInfo->pTableListInfo); taosMemoryFreeClear(param); } @@ -2760,11 +2779,12 @@ static int32_t initTableblockDistQueryCond(uint64_t uid, SQueryTableDataCond* pC } int32_t createDataBlockInfoScanOperator(SReadHandle* readHandle, 
SBlockDistScanPhysiNode* pBlockScanNode, - STableListInfo* pTableListInfo, SExecTaskInfo* pTaskInfo, SOperatorInfo** pOptrInfo) { + STableListInfo* pTableListInfo, SExecTaskInfo* pTaskInfo, + SOperatorInfo** pOptrInfo) { QRY_OPTR_CHECK(pOptrInfo); - int32_t code = 0; - int32_t lino = 0; + int32_t code = 0; + int32_t lino = 0; SBlockDistInfo* pInfo = taosMemoryCalloc(1, sizeof(SBlockDistInfo)); SOperatorInfo* pOperator = taosMemoryCalloc(1, sizeof(SOperatorInfo)); if (pInfo == NULL || pOperator == NULL) { @@ -2815,6 +2835,9 @@ _error: pInfo->pTableListInfo = NULL; destroyBlockDistScanOperatorInfo(pInfo); } - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } return code; } diff --git a/source/libs/executor/src/tfill.c b/source/libs/executor/src/tfill.c index 3158c85987..957a5d1d2e 100644 --- a/source/libs/executor/src/tfill.c +++ b/source/libs/executor/src/tfill.c @@ -758,7 +758,10 @@ SFillColInfo* createFillColInfo(SExprInfo* pExpr, int32_t numOfFillExpr, SExprIn SValueNode* pv = (SValueNode*)nodesListGetNode(pValNode->pNodeList, index); QUERY_CHECK_NULL(pv, code, lino, _end, terrno); - nodesValueNodeToVariant(pv, &pFillCol[i].fillVal); + code = nodesValueNodeToVariant(pv, &pFillCol[i].fillVal); + } + if (TSDB_CODE_SUCCESS != code) { + goto _end; } } pFillCol->numOfFillExpr = numOfFillExpr; diff --git a/source/libs/executor/src/timesliceoperator.c b/source/libs/executor/src/timesliceoperator.c index 47ba8e560a..65fcc4d4bc 100644 --- a/source/libs/executor/src/timesliceoperator.c +++ b/source/libs/executor/src/timesliceoperator.c @@ -1208,7 +1208,10 @@ _error: qError("%s failed at line %d since %s", __func__, lino, tstrerror(code)); } if (pInfo != NULL) destroyTimeSliceOperatorInfo(pInfo); - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; return code; } diff --git a/source/libs/executor/src/timewindowoperator.c 
b/source/libs/executor/src/timewindowoperator.c index a367afeed4..ca6b89f7c5 100644 --- a/source/libs/executor/src/timewindowoperator.c +++ b/source/libs/executor/src/timewindowoperator.c @@ -666,7 +666,9 @@ static bool isCalculatedWin(SIntervalAggOperatorInfo* pInfo, const STimeWindow* * every tuple in every block. * And the boundedQueue keeps refreshing all records with smaller ts key. */ -static bool filterWindowWithLimit(SIntervalAggOperatorInfo* pOperatorInfo, STimeWindow* win, uint64_t groupId) { +static bool filterWindowWithLimit(SIntervalAggOperatorInfo* pOperatorInfo, STimeWindow* win, uint64_t groupId, SExecTaskInfo* pTaskInfo) { + int32_t code = TSDB_CODE_SUCCESS; + int32_t lino = 0; if (!pOperatorInfo->limited // if no limit info, no filter will be applied || pOperatorInfo->binfo.inputTsOrder != pOperatorInfo->binfo.outputTsOrder // if input/output ts order mismatch, no filter @@ -678,6 +680,7 @@ static bool filterWindowWithLimit(SIntervalAggOperatorInfo* pOperatorInfo, STime if (pOperatorInfo->pBQ == NULL) { pOperatorInfo->pBQ = createBoundedQueue(pOperatorInfo->limit - 1, tsKeyCompFn, taosMemoryFree, pOperatorInfo); + QUERY_CHECK_NULL(pOperatorInfo->pBQ, code, lino, _end, terrno); } bool shouldFilter = false; @@ -694,12 +697,21 @@ static bool filterWindowWithLimit(SIntervalAggOperatorInfo* pOperatorInfo, STime // cur win not been filtered out and not been pushed into BQ yet, push it into BQ PriorityQueueNode node = {.data = taosMemoryMalloc(sizeof(TSKEY))}; + QUERY_CHECK_NULL(node.data, code, lino, _end, terrno); + *((TSKEY*)node.data) = win->skey; if (NULL == taosBQPush(pOperatorInfo->pBQ, &node)) { taosMemoryFree(node.data); return true; } + +_end: + if (code != TSDB_CODE_SUCCESS) { + qError("%s failed at line %d since %s", __func__, lino, tstrerror(code)); + pTaskInfo->code = code; + T_LONG_JMP(pTaskInfo->env, code); + } return false; } @@ -731,7 +743,7 @@ static bool hashIntervalAgg(SOperatorInfo* pOperatorInfo, SResultRowInfo* pResul STimeWindow 
win = getActiveTimeWindow(pInfo->aggSup.pResultBuf, pResultRowInfo, ts, &pInfo->interval, pInfo->binfo.inputTsOrder); - if (filterWindowWithLimit(pInfo, &win, tableGroupId)) return false; + if (filterWindowWithLimit(pInfo, &win, tableGroupId, pTaskInfo)) return false; int32_t ret = setTimeWindowOutputBuf(pResultRowInfo, &win, (scanFlag == MAIN_SCAN), &pResult, tableGroupId, pSup->pCtx, numOfOutput, pSup->rowEntryInfoOffset, &pInfo->aggSup, pTaskInfo); @@ -770,7 +782,7 @@ static bool hashIntervalAgg(SOperatorInfo* pOperatorInfo, SResultRowInfo* pResul int32_t prevEndPos = forwardRows - 1 + startPos; startPos = getNextQualifiedWindow(&pInfo->interval, &nextWin, &pBlock->info, tsCols, prevEndPos, pInfo->binfo.inputTsOrder); - if (startPos < 0 || filterWindowWithLimit(pInfo, &nextWin, tableGroupId)) { + if (startPos < 0 || filterWindowWithLimit(pInfo, &nextWin, tableGroupId, pTaskInfo)) { break; } // null data, failed to allocate more memory buffer @@ -1402,7 +1414,10 @@ _error: if (pInfo != NULL) { destroyIntervalOperatorInfo(pInfo); } - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; return code; } @@ -1678,7 +1693,10 @@ _error: destroyStateWindowOperatorInfo(pInfo); } - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; return code; } @@ -1771,7 +1789,10 @@ int32_t createSessionAggOperatorInfo(SOperatorInfo* downstream, SSessionWinodwPh _error: if (pInfo != NULL) destroySWindowOperatorInfo(pInfo); - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; return code; } @@ -2082,8 +2103,11 @@ int32_t createMergeAlignedIntervalOperatorInfo(SOperatorInfo* downstream, SMerge return code; _error: - destroyMAIOperatorInfo(miaInfo); - destroyOperator(pOperator); + if (miaInfo != NULL) 
destroyMAIOperatorInfo(miaInfo); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; return code; } @@ -2415,7 +2439,10 @@ _error: destroyMergeIntervalOperatorInfo(pMergeIntervalInfo); } - destroyOperator(pOperator); + if (pOperator != NULL) { + pOperator->info = NULL; + destroyOperator(pOperator); + } pTaskInfo->code = code; return code; } diff --git a/source/libs/executor/src/tsort.c b/source/libs/executor/src/tsort.c index 9e902140f4..cedc23ed2d 100644 --- a/source/libs/executor/src/tsort.c +++ b/source/libs/executor/src/tsort.c @@ -152,8 +152,10 @@ static void destoryAllocatedTuple(void* t) { taosMemoryFree(t); } * @param colIndex the columnIndex, for setting null bitmap * @return the next offset to add field * */ -static inline size_t tupleAddField(char** t, uint32_t colNum, uint32_t offset, uint32_t colIdx, void* data, - size_t length, bool isNull, uint32_t tupleLen) { +static inline int32_t tupleAddField(char** t, uint32_t colNum, uint32_t offset, uint32_t colIdx, void* data, + size_t length, bool isNull, uint32_t tupleLen, uint32_t* pOffset) { + int32_t code = TSDB_CODE_SUCCESS; + int32_t lino = 0; tupleSetOffset(*t, colIdx, offset); if (isNull) { @@ -161,16 +163,20 @@ static inline size_t tupleAddField(char** t, uint32_t colNum, uint32_t offset, u } else { if (offset + length > tupleLen + tupleGetDataStartOffset(colNum)) { void* px = taosMemoryRealloc(*t, offset + length); - if (px == NULL) { - return terrno; - } + QUERY_CHECK_NULL(px, code, lino, _end, terrno); *t = px; } tupleSetData(*t, offset, data, length); } - return offset + length; + (*pOffset) = offset + length; + +_end: + if (code != TSDB_CODE_SUCCESS) { + qError("%s failed at line %d since %s", __func__, lino, tstrerror(code)); + } + return code; } static void* tupleGetField(char* t, uint32_t colIdx, uint32_t colNum) { @@ -200,6 +206,7 @@ typedef struct ReferencedTuple { } ReferencedTuple; static int32_t createAllocatedTuple(SSDataBlock* 
pBlock, size_t colNum, uint32_t tupleLen, size_t rowIdx, TupleDesc** pDesc) { + int32_t code = TSDB_CODE_SUCCESS; TupleDesc* t = taosMemoryCalloc(1, sizeof(TupleDesc)); if (t == NULL) { return terrno; @@ -216,15 +223,20 @@ static int32_t createAllocatedTuple(SSDataBlock* pBlock, size_t colNum, uint32_t for (size_t colIdx = 0; colIdx < colNum; ++colIdx) { SColumnInfoData* pCol = taosArrayGet(pBlock->pDataBlock, colIdx); if (pCol == NULL) { + qError("%s failed at line %d since %s", __func__, __LINE__, tstrerror(terrno)); return terrno; } if (colDataIsNull_s(pCol, rowIdx)) { - offset = tupleAddField((char**)&pTuple, colNum, offset, colIdx, 0, 0, true, tupleLen); + code = tupleAddField((char**)&pTuple, colNum, offset, colIdx, 0, 0, true, tupleLen, &offset); } else { colLen = colDataGetRowLength(pCol, rowIdx); - offset = - tupleAddField((char**)&pTuple, colNum, offset, colIdx, colDataGetData(pCol, rowIdx), colLen, false, tupleLen); + code = + tupleAddField((char**)&pTuple, colNum, offset, colIdx, colDataGetData(pCol, rowIdx), colLen, false, tupleLen, &offset); + } + if (code != TSDB_CODE_SUCCESS) { + qError("%s failed at line %d since %s", __func__, __LINE__, tstrerror(code)); + return code; } } @@ -232,7 +244,7 @@ static int32_t createAllocatedTuple(SSDataBlock* pBlock, size_t colNum, uint32_t t->data = pTuple; *pDesc = t; - return 0; + return code; } int32_t tupleDescGetField(const TupleDesc* pDesc, int32_t colIdx, uint32_t colNum, void** pResult) { @@ -259,7 +271,7 @@ int32_t tupleDescGetField(const TupleDesc* pDesc, int32_t colIdx, uint32_t colNu void destroyTuple(void* t) { TupleDesc* pDesc = t; - if (pDesc->type == AllocatedTupleType) { + if (pDesc != NULL && pDesc->type == AllocatedTupleType) { destoryAllocatedTuple(pDesc->data); taosMemoryFree(pDesc); } @@ -1686,6 +1698,7 @@ static int32_t initRowIdSort(SSortHandle* pHandle) { biTs.compFn = getKeyComparFunc(TSDB_DATA_TYPE_TIMESTAMP, biTs.order); void* p = taosArrayPush(pOrderInfoList, &biTs); if (p == NULL) { + 
taosArrayDestroy(pOrderInfoList); return terrno; } @@ -1698,6 +1711,7 @@ static int32_t initRowIdSort(SSortHandle* pHandle) { void* px = taosArrayPush(pOrderInfoList, &biPk); if (px == NULL) { + taosArrayDestroy(pOrderInfoList); return terrno; } } diff --git a/source/libs/function/src/functionMgt.c b/source/libs/function/src/functionMgt.c index 49d700e8c8..e50dbf8b14 100644 --- a/source/libs/function/src/functionMgt.c +++ b/source/libs/function/src/functionMgt.c @@ -434,7 +434,7 @@ static int32_t createPartialFunction(const SFunctionNode* pSrcFunc, SFunctionNod (*pPartialFunc)->originalFuncId = pSrcFunc->hasOriginalFunc ? pSrcFunc->originalFuncId : pSrcFunc->funcId; char name[TSDB_FUNC_NAME_LEN + TSDB_NAME_DELIMITER_LEN + TSDB_POINTER_PRINT_BYTES + 1] = {0}; int32_t len = snprintf(name, sizeof(name) - 1, "%s.%p", (*pPartialFunc)->functionName, pSrcFunc); - (void)taosCreateMD5Hash(name, len); + (void)taosHashBinary(name, len); (void)strncpy((*pPartialFunc)->node.aliasName, name, TSDB_COL_NAME_LEN - 1); (*pPartialFunc)->hasPk = pSrcFunc->hasPk; (*pPartialFunc)->pkBytes = pSrcFunc->pkBytes; diff --git a/source/libs/function/src/tudf.c b/source/libs/function/src/tudf.c index ad9e5ce7d4..9a751db801 100644 --- a/source/libs/function/src/tudf.c +++ b/source/libs/function/src/tudf.c @@ -162,7 +162,7 @@ static int32_t udfSpawnUdfd(SUdfdData *pData) { fnInfo("[UDFD]Succsess to set TAOS_FQDN:%s", taosFqdn); } else { fnError("[UDFD]Failed to allocate memory for TAOS_FQDN"); - return TSDB_CODE_OUT_OF_MEMORY; + return terrno; } } @@ -837,10 +837,13 @@ int32_t convertDataBlockToUdfDataBlock(SSDataBlock *block, SUdfDataBlock *udfBlo udfBlock->numOfRows = block->info.rows; udfBlock->numOfCols = taosArrayGetSize(block->pDataBlock); udfBlock->udfCols = taosMemoryCalloc(taosArrayGetSize(block->pDataBlock), sizeof(SUdfColumn *)); + if((udfBlock->udfCols) == NULL) { + return terrno; + } for (int32_t i = 0; i < udfBlock->numOfCols; ++i) { udfBlock->udfCols[i] = taosMemoryCalloc(1, 
sizeof(SUdfColumn)); if(udfBlock->udfCols[i] == NULL) { - return TSDB_CODE_OUT_OF_MEMORY; + return terrno; } SColumnInfoData *col = (SColumnInfoData *)taosArrayGet(block->pDataBlock, i); SUdfColumn *udfCol = udfBlock->udfCols[i]; @@ -854,13 +857,13 @@ int32_t convertDataBlockToUdfDataBlock(SSDataBlock *block, SUdfDataBlock *udfBlo udfCol->colData.varLenCol.varOffsetsLen = sizeof(int32_t) * udfBlock->numOfRows; udfCol->colData.varLenCol.varOffsets = taosMemoryMalloc(udfCol->colData.varLenCol.varOffsetsLen); if(udfCol->colData.varLenCol.varOffsets == NULL) { - return TSDB_CODE_OUT_OF_MEMORY; + return terrno; } memcpy(udfCol->colData.varLenCol.varOffsets, col->varmeta.offset, udfCol->colData.varLenCol.varOffsetsLen); udfCol->colData.varLenCol.payloadLen = colDataGetLength(col, udfBlock->numOfRows); udfCol->colData.varLenCol.payload = taosMemoryMalloc(udfCol->colData.varLenCol.payloadLen); if(udfCol->colData.varLenCol.payload == NULL) { - return TSDB_CODE_OUT_OF_MEMORY; + return terrno; } if (col->reassigned) { for (int32_t row = 0; row < udfCol->colData.numOfRows; ++row) { @@ -882,7 +885,7 @@ int32_t convertDataBlockToUdfDataBlock(SSDataBlock *block, SUdfDataBlock *udfBlo int32_t bitmapLen = udfCol->colData.fixLenCol.nullBitmapLen; udfCol->colData.fixLenCol.nullBitmap = taosMemoryMalloc(udfCol->colData.fixLenCol.nullBitmapLen); if(udfCol->colData.fixLenCol.nullBitmap == NULL) { - return TSDB_CODE_OUT_OF_MEMORY; + return terrno; } char *bitmap = udfCol->colData.fixLenCol.nullBitmap; memcpy(bitmap, col->nullbitmap, bitmapLen); @@ -985,7 +988,7 @@ int32_t convertDataBlockToScalarParm(SSDataBlock *input, SScalarParam *output) { output->columnData = taosMemoryMalloc(sizeof(SColumnInfoData)); if(output->columnData == NULL) { - return TSDB_CODE_OUT_OF_MEMORY; + return terrno; } memcpy(output->columnData, taosArrayGet(input->pDataBlock, 0), sizeof(SColumnInfoData)); output->colAlloced = true; @@ -1724,7 +1727,7 @@ int32_t udfcStartUvTask(SClientUvTaskNode *uvTask) { if(conn 
== NULL) { fnError("udfc event loop start connect task malloc conn failed."); taosMemoryFree(pipe); - return TSDB_CODE_OUT_OF_MEMORY; + return terrno; } conn->pipe = pipe; conn->readBuf.len = 0; @@ -1954,7 +1957,7 @@ int32_t udfcRunUdfUvTask(SClientUdfTask *task, int8_t uvTaskType) { SClientUvTaskNode *uvTask = taosMemoryCalloc(1, sizeof(SClientUvTaskNode)); if(uvTask == NULL) { fnError("udfc client task: %p failed to allocate memory for uvTask", task); - return TSDB_CODE_OUT_OF_MEMORY; + return terrno; } fnDebug("udfc client task: %p created uvTask: %p. pipe: %p", task, uvTask, task->session->udfUvPipe); @@ -1986,13 +1989,13 @@ int32_t doSetupUdf(char udfName[], UdfcFuncHandle *funcHandle) { SClientUdfTask *task = taosMemoryCalloc(1, sizeof(SClientUdfTask)); if(task == NULL) { fnError("doSetupUdf, failed to allocate memory for task"); - return TSDB_CODE_OUT_OF_MEMORY; + return terrno; } task->session = taosMemoryCalloc(1, sizeof(SUdfcUvSession)); if(task->session == NULL) { fnError("doSetupUdf, failed to allocate memory for session"); taosMemoryFree(task); - return TSDB_CODE_OUT_OF_MEMORY; + return terrno; } task->session->udfc = &gUdfcProxy; task->type = UDF_TASK_SETUP; @@ -2037,7 +2040,7 @@ int32_t callUdf(UdfcFuncHandle handle, int8_t callType, SSDataBlock *input, SUdf SClientUdfTask *task = taosMemoryCalloc(1, sizeof(SClientUdfTask)); if(task == NULL) { fnError("udfc call udf. 
failed to allocate memory for task"); - return TSDB_CODE_OUT_OF_MEMORY; + return terrno; } task->session = (SUdfcUvSession *)handle; task->type = UDF_TASK_CALL; @@ -2169,7 +2172,7 @@ int32_t doTeardownUdf(UdfcFuncHandle handle) { if(task == NULL) { fnError("doTeardownUdf, failed to allocate memory for task"); taosMemoryFree(session); - return TSDB_CODE_OUT_OF_MEMORY; + return terrno; } task->session = session; task->type = UDF_TASK_TEARDOWN; diff --git a/source/libs/function/src/udfd.c b/source/libs/function/src/udfd.c index 7339f115a3..2d8a926c72 100644 --- a/source/libs/function/src/udfd.c +++ b/source/libs/function/src/udfd.c @@ -409,6 +409,10 @@ int32_t udfdInitializePythonPlugin(SUdfScriptPlugin *plugin) { int16_t lenPythonPath = strlen(tsUdfdLdLibPath) + strlen(global.udfDataDir) + 1 + 1; // global.udfDataDir:tsUdfdLdLibPath char *pythonPath = taosMemoryMalloc(lenPythonPath); + if(pythonPath == NULL) { + uv_dlclose(&plugin->lib); + return terrno; + } #ifdef WINDOWS snprintf(pythonPath, lenPythonPath, "%s;%s", global.udfDataDir, tsUdfdLdLibPath); #else @@ -705,6 +709,10 @@ void udfdProcessSetupRequest(SUvUdfWork *uvUdf, SUdfRequest *request) { uv_mutex_unlock(&udf->lock); } SUdfcFuncHandle *handle = taosMemoryMalloc(sizeof(SUdfcFuncHandle)); + if(handle == NULL) { + fnError("udfdProcessSetupRequest: malloc failed."); + code = terrno; + } handle->udf = udf; _send: @@ -775,7 +783,7 @@ void udfdProcessCallRequest(SUvUdfWork *uvUdf, SUdfRequest *request) { if (outBuf.buf != NULL) { code = udf->scriptPlugin->udfAggStartFunc(&outBuf, udf->scriptUdfCtx); } else { - code = TSDB_CODE_OUT_OF_MEMORY; + code = terrno; } subRsp->resultBuf = outBuf; break; @@ -784,9 +792,13 @@ void udfdProcessCallRequest(SUvUdfWork *uvUdf, SUdfRequest *request) { SUdfDataBlock input = {0}; if (convertDataBlockToUdfDataBlock(&call->block, &input) == TSDB_CODE_SUCCESS) { SUdfInterBuf outBuf = {.buf = taosMemoryMalloc(udf->bufSize), .bufLen = udf->bufSize, .numOfResult = 0}; - code = 
udf->scriptPlugin->udfAggProcFunc(&input, &call->interBuf, &outBuf, udf->scriptUdfCtx); - freeUdfInterBuf(&call->interBuf); - subRsp->resultBuf = outBuf; + if (outBuf.buf != NULL) { + code = udf->scriptPlugin->udfAggProcFunc(&input, &call->interBuf, &outBuf, udf->scriptUdfCtx); + freeUdfInterBuf(&call->interBuf); + subRsp->resultBuf = outBuf; + } else { + code = terrno; + } } freeUdfDataDataBlock(&input); @@ -794,18 +806,27 @@ void udfdProcessCallRequest(SUvUdfWork *uvUdf, SUdfRequest *request) { } case TSDB_UDF_CALL_AGG_MERGE: { SUdfInterBuf outBuf = {.buf = taosMemoryMalloc(udf->bufSize), .bufLen = udf->bufSize, .numOfResult = 0}; - code = udf->scriptPlugin->udfAggMergeFunc(&call->interBuf, &call->interBuf2, &outBuf, udf->scriptUdfCtx); - freeUdfInterBuf(&call->interBuf); - freeUdfInterBuf(&call->interBuf2); - subRsp->resultBuf = outBuf; + if (outBuf.buf != NULL) { + code = udf->scriptPlugin->udfAggMergeFunc(&call->interBuf, &call->interBuf2, &outBuf, udf->scriptUdfCtx); + freeUdfInterBuf(&call->interBuf); + freeUdfInterBuf(&call->interBuf2); + subRsp->resultBuf = outBuf; + } else { + code = terrno; + } break; } case TSDB_UDF_CALL_AGG_FIN: { SUdfInterBuf outBuf = {.buf = taosMemoryMalloc(udf->bufSize), .bufLen = udf->bufSize, .numOfResult = 0}; - code = udf->scriptPlugin->udfAggFinishFunc(&call->interBuf, &outBuf, udf->scriptUdfCtx); - freeUdfInterBuf(&call->interBuf); - subRsp->resultBuf = outBuf; + if (outBuf.buf != NULL) { + code = udf->scriptPlugin->udfAggFinishFunc(&call->interBuf, &outBuf, udf->scriptUdfCtx); + freeUdfInterBuf(&call->interBuf); + subRsp->resultBuf = outBuf; + } else { + code = terrno; + } + break; } default: @@ -820,19 +841,24 @@ void udfdProcessCallRequest(SUvUdfWork *uvUdf, SUdfRequest *request) { int32_t len = encodeUdfResponse(NULL, rsp); if(len < 0) { fnError("udfdProcessCallRequest: encode udf response failed. 
len %d", len); - return; + goto _exit; } rsp->msgLen = len; void *bufBegin = taosMemoryMalloc(len); + if (bufBegin == NULL) { + fnError("udfdProcessCallRequest: malloc failed. len %d", len); + goto _exit; + } void *buf = bufBegin; if(encodeUdfResponse(&buf, rsp) < 0) { fnError("udfdProcessCallRequest: encode udf response failed. len %d", len); taosMemoryFree(bufBegin); - return; + goto _exit; } uvUdf->output = uv_buf_init(bufBegin, len); +_exit: switch (call->callType) { case TSDB_UDF_CALL_SCALA_PROC: { blockDataFreeRes(&call->block); @@ -906,6 +932,10 @@ _send: } rsp->msgLen = len; void *bufBegin = taosMemoryMalloc(len); + if(bufBegin == NULL) { + fnError("udfdProcessTeardownRequest: malloc failed. len %d", len); + return; + } void *buf = bufBegin; if (encodeUdfResponse(&buf, rsp) < 0) { fnError("udfdProcessTeardownRequest: encode udf response failed. len %d", len); @@ -1173,7 +1203,7 @@ int32_t udfdOpenClientRpc() { global.clientRpc = rpcOpen(&rpcInit); if (global.clientRpc == NULL) { fnError("failed to init dnode rpc client"); - return -1; + return terrno; } return 0; } @@ -1210,6 +1240,11 @@ void udfdSendResponse(uv_work_t *work, int status) { if (udfWork->conn != NULL) { uv_write_t *write_req = taosMemoryMalloc(sizeof(uv_write_t)); + if(write_req == NULL) { + fnError("udfd send response error, malloc failed"); + taosMemoryFree(work); + return; + } write_req->data = udfWork; int32_t code = uv_write(write_req, udfWork->conn->client, &udfWork->output, 1, udfdOnWrite); if (code != 0) { @@ -1269,7 +1304,16 @@ void udfdHandleRequest(SUdfdUvConn *conn) { int32_t inputLen = conn->inputLen; uv_work_t *work = taosMemoryMalloc(sizeof(uv_work_t)); + if(work == NULL) { + fnError("udfd malloc work failed"); + return; + } SUvUdfWork *udfWork = taosMemoryMalloc(sizeof(SUvUdfWork)); + if(udfWork == NULL) { + fnError("udfd malloc udf work failed"); + taosMemoryFree(work); + return; + } udfWork->conn = conn; udfWork->pWorkNext = conn->pWorkList; conn->pWorkList = udfWork; @@ 
-1334,6 +1378,10 @@ void udfdOnNewConnection(uv_stream_t *server, int status) { int32_t code = 0; uv_pipe_t *client = (uv_pipe_t *)taosMemoryMalloc(sizeof(uv_pipe_t)); + if(client == NULL) { + fnError("udfd pipe malloc failed"); + return; + } code = uv_pipe_init(global.loop, client, 0); if (code) { fnError("udfd pipe init error %s", uv_strerror(code)); @@ -1342,6 +1390,10 @@ void udfdOnNewConnection(uv_stream_t *server, int status) { } if (uv_accept(server, (uv_stream_t *)client) == 0) { SUdfdUvConn *ctx = taosMemoryMalloc(sizeof(SUdfdUvConn)); + if(ctx == NULL) { + fnError("udfd conn malloc failed"); + goto _exit; + } ctx->pWorkList = NULL; ctx->client = (uv_stream_t *)client; ctx->inputBuf = 0; @@ -1356,9 +1408,11 @@ void udfdOnNewConnection(uv_stream_t *server, int status) { taosMemoryFree(ctx); taosMemoryFree(client); } - } else { - uv_close((uv_handle_t *)client, NULL); + return; } +_exit: + uv_close((uv_handle_t *)client, NULL); + taosMemoryFree(client); } void udfdIntrSignalHandler(uv_signal_t *handle, int signum) { @@ -1411,6 +1465,10 @@ static int32_t udfdInitLog() { void udfdCtrlAllocBufCb(uv_handle_t *handle, size_t suggested_size, uv_buf_t *buf) { buf->base = taosMemoryMalloc(suggested_size); + if (buf->base == NULL) { + fnError("udfd ctrl pipe alloc buffer failed"); + return; + } buf->len = suggested_size; } @@ -1477,13 +1535,13 @@ static int32_t udfdGlobalDataInit() { uv_loop_t *loop = taosMemoryMalloc(sizeof(uv_loop_t)); if (loop == NULL) { fnError("udfd init uv loop failed, mem overflow"); - return -1; + return terrno; } global.loop = loop; if (uv_mutex_init(&global.scriptPluginsMutex) != 0) { fnError("udfd init script plugins mutex failed"); - return -1; + return TSDB_CODE_UDF_UV_EXEC_FAILURE; } global.udfsHash = taosHashInit(64, taosGetDefaultHashFunction(TSDB_DATA_TYPE_BINARY), true, HASH_NO_LOCK); @@ -1494,7 +1552,7 @@ static int32_t udfdGlobalDataInit() { if (uv_mutex_init(&global.udfsMutex) != 0) { fnError("udfd init udfs mutex failed"); - 
return -2; + return TSDB_CODE_UDF_UV_EXEC_FAILURE; } return 0; diff --git a/source/libs/geometry/src/geomFunc.c b/source/libs/geometry/src/geomFunc.c index 194590c06c..4426427bf5 100644 --- a/source/libs/geometry/src/geomFunc.c +++ b/source/libs/geometry/src/geomFunc.c @@ -156,6 +156,10 @@ _exit: int32_t executeGeomFromTextFunc(SColumnInfoData *pInputData, int32_t i, SColumnInfoData *pOutputData) { int32_t code = TSDB_CODE_FAILED; + if (!IS_VAR_DATA_TYPE((pInputData)->info.type)) { + return TSDB_CODE_FUNC_FUNTION_PARA_VALUE; + } + char *input = colDataGetData(pInputData, i); unsigned char *output = NULL; diff --git a/source/libs/geometry/src/geosWrapper.c b/source/libs/geometry/src/geosWrapper.c index 6ca8a39bb5..dde34edc91 100644 --- a/source/libs/geometry/src/geosWrapper.c +++ b/source/libs/geometry/src/geosWrapper.c @@ -147,7 +147,17 @@ static int32_t initWktRegex(pcre2_code **ppRegex, pcre2_match_data **ppMatchData "*)(([-+]?[0-9]+\\.?[0-9]*)|([-+]?[0-9]*\\.?[0-9]+))(e[-+]?[0-9]+)?){1,3}( *))*( *)\\)))( *))*( *)\\)))( *))*( " "*)\\)))|(GEOCOLLECTION\\((?R)(( *)(,)( *)(?R))*( *)\\))( *)$"); - code = doRegComp(ppRegex, ppMatchData, wktPatternWithSpace); + pcre2_code *pRegex = NULL; + pcre2_match_data *pMatchData = NULL; + code = doRegComp(&pRegex, &pMatchData, wktPatternWithSpace); + if (code < 0) { + taosMemoryFree(wktPatternWithSpace); + return TSDB_CODE_OUT_OF_MEMORY; + } + + *ppRegex = pRegex; + *ppMatchData = pMatchData; + taosMemoryFree(wktPatternWithSpace); return code; } diff --git a/source/libs/nodes/src/nodesUtilFuncs.c b/source/libs/nodes/src/nodesUtilFuncs.c index 6e69c56687..6b06530b3e 100644 --- a/source/libs/nodes/src/nodesUtilFuncs.c +++ b/source/libs/nodes/src/nodesUtilFuncs.c @@ -141,7 +141,7 @@ static int32_t callocNodeChunk(SNodeAllocator* pAllocator, SNodeMemChunk** pOutC static int32_t nodesCallocImpl(int32_t size, void** pOut) { if (NULL == g_pNodeAllocator) { *pOut = taosMemoryCalloc(1, size); - if (!pOut) return TSDB_CODE_OUT_OF_MEMORY; + 
if (!*pOut) return TSDB_CODE_OUT_OF_MEMORY; return TSDB_CODE_SUCCESS; } @@ -2638,11 +2638,12 @@ int32_t nodesGetOutputNumFromSlotList(SNodeList* pSlots) { return num; } -void nodesValueNodeToVariant(const SValueNode* pNode, SVariant* pVal) { +int32_t nodesValueNodeToVariant(const SValueNode* pNode, SVariant* pVal) { + int32_t code = 0; if (pNode->isNull) { pVal->nType = TSDB_DATA_TYPE_NULL; pVal->nLen = tDataTypes[TSDB_DATA_TYPE_NULL].bytes; - return; + return code; } pVal->nType = pNode->node.resType.type; pVal->nLen = pNode->node.resType.bytes; @@ -2676,13 +2677,21 @@ void nodesValueNodeToVariant(const SValueNode* pNode, SVariant* pVal) { case TSDB_DATA_TYPE_VARBINARY: case TSDB_DATA_TYPE_GEOMETRY: pVal->pz = taosMemoryMalloc(pVal->nLen + 1); - memcpy(pVal->pz, pNode->datum.p, pVal->nLen); - pVal->pz[pVal->nLen] = 0; + if (pVal->pz) { + memcpy(pVal->pz, pNode->datum.p, pVal->nLen); + pVal->pz[pVal->nLen] = 0; + } else { + code = terrno; + } break; case TSDB_DATA_TYPE_JSON: pVal->nLen = getJsonValueLen(pNode->datum.p); pVal->pz = taosMemoryMalloc(pVal->nLen); - memcpy(pVal->pz, pNode->datum.p, pVal->nLen); + if (pVal->pz) { + memcpy(pVal->pz, pNode->datum.p, pVal->nLen); + } else { + code = terrno; + } break; case TSDB_DATA_TYPE_DECIMAL: case TSDB_DATA_TYPE_BLOB: @@ -2690,6 +2699,7 @@ void nodesValueNodeToVariant(const SValueNode* pNode, SVariant* pVal) { default: break; } + return code; } int32_t nodesMergeConds(SNode** pDst, SNodeList** pSrc) { diff --git a/source/libs/parser/src/parAstCreater.c b/source/libs/parser/src/parAstCreater.c index cd7cda01e0..ee5f215d72 100644 --- a/source/libs/parser/src/parAstCreater.c +++ b/source/libs/parser/src/parAstCreater.c @@ -316,15 +316,8 @@ SNode* releaseRawExprNode(SAstCreateContext* pCxt, SNode* pNode) { // See TS-3398. // Len of pRawExpr->p could be larger than len of aliasName[TSDB_COL_NAME_LEN]. // If aliasName is truncated, hash value of aliasName could be the same. 
- T_MD5_CTX ctx; - tMD5Init(&ctx); - tMD5Update(&ctx, (uint8_t*)pRawExpr->p, pRawExpr->n); - tMD5Final(&ctx); - char* p = pExpr->aliasName; - for (uint8_t i = 0; i < tListLen(ctx.digest); ++i) { - sprintf(p, "%02x", ctx.digest[i]); - p += 2; - } + uint64_t hashVal = MurmurHash3_64(pRawExpr->p, pRawExpr->n); + sprintf(pExpr->aliasName, "%"PRIu64, hashVal); strncpy(pExpr->userAlias, pRawExpr->p, len); pExpr->userAlias[len] = '\0'; } diff --git a/source/libs/parser/src/parTranslater.c b/source/libs/parser/src/parTranslater.c index 7a2a73d013..03c3355920 100755 --- a/source/libs/parser/src/parTranslater.c +++ b/source/libs/parser/src/parTranslater.c @@ -4805,7 +4805,7 @@ static int32_t createMultiResFunc(SFunctionNode* pSrcFunc, SExprNode* pExpr, SNo strcpy(pFunc->node.aliasName, pCol->colName); } else { len = snprintf(buf, sizeof(buf) - 1, "%s(%s.%s)", pSrcFunc->functionName, pCol->tableAlias, pCol->colName); - (void)taosCreateMD5Hash(buf, len); + (void)taosHashBinary(buf, len); strncpy(pFunc->node.aliasName, buf, TSDB_COL_NAME_LEN - 1); len = snprintf(buf, sizeof(buf) - 1, "%s(%s)", pSrcFunc->functionName, pCol->colName); // note: userAlias could be truncated here @@ -4813,7 +4813,7 @@ static int32_t createMultiResFunc(SFunctionNode* pSrcFunc, SExprNode* pExpr, SNo } } else { len = snprintf(buf, sizeof(buf) - 1, "%s(%s)", pSrcFunc->functionName, pExpr->aliasName); - (void)taosCreateMD5Hash(buf, len); + (void)taosHashBinary(buf, len); strncpy(pFunc->node.aliasName, buf, TSDB_COL_NAME_LEN - 1); len = snprintf(buf, sizeof(buf) - 1, "%s(%s)", pSrcFunc->functionName, pExpr->userAlias); // note: userAlias could be truncated here @@ -11864,7 +11864,7 @@ static int32_t buildCreateTSMAReq(STranslateContext* pCxt, SCreateTSMAStmt* pStm if (checkRecursiveTsmaInterval(pRecursiveTsma->interval, pRecursiveTsma->unit, pInterval->datum.i, pInterval->unit, pDbInfo.precision, true)) { } else { - code = TSDB_CODE_TSMA_INVALID_PARA; + code = TSDB_CODE_TSMA_INVALID_INTERVAL; } } } diff 
--git a/source/libs/planner/src/planOptimizer.c b/source/libs/planner/src/planOptimizer.c index 27d3e80ef0..c7ce9aed76 100644 --- a/source/libs/planner/src/planOptimizer.c +++ b/source/libs/planner/src/planOptimizer.c @@ -3164,7 +3164,7 @@ static void partTagsSetAlias(char* pAlias, const char* pTableAlias, const char* char name[TSDB_COL_FNAME_LEN + 1] = {0}; int32_t len = snprintf(name, TSDB_COL_FNAME_LEN, "%s.%s", pTableAlias, pColName); - (void)taosCreateMD5Hash(name, len); + (void)taosHashBinary(name, len); strncpy(pAlias, name, TSDB_COL_NAME_LEN - 1); } @@ -3827,7 +3827,7 @@ static int32_t rewriteUniqueOptCreateFirstFunc(SFunctionNode* pSelectValue, SNod int64_t pointer = (int64_t)pFunc; char name[TSDB_FUNC_NAME_LEN + TSDB_POINTER_PRINT_BYTES + TSDB_NAME_DELIMITER_LEN + 1] = {0}; int32_t len = snprintf(name, sizeof(name) - 1, "%s.%" PRId64 "", pFunc->functionName, pointer); - (void)taosCreateMD5Hash(name, len); + (void)taosHashBinary(name, len); strncpy(pFunc->node.aliasName, name, TSDB_COL_NAME_LEN - 1); } SNode* pNew = NULL; @@ -7197,7 +7197,7 @@ static int32_t tsmaOptCreateWStart(int8_t precision, SFunctionNode** pWStartOut) int64_t pointer = (int64_t)pWStart; char name[TSDB_COL_NAME_LEN + TSDB_POINTER_PRINT_BYTES + TSDB_NAME_DELIMITER_LEN + 1] = {0}; int32_t len = snprintf(name, sizeof(name) - 1, "%s.%" PRId64 "", pWStart->functionName, pointer); - (void)taosCreateMD5Hash(name, len); + (void)taosHashBinary(name, len); strncpy(pWStart->node.aliasName, name, TSDB_COL_NAME_LEN - 1); pWStart->node.resType.precision = precision; diff --git a/source/libs/planner/src/planPhysiCreater.c b/source/libs/planner/src/planPhysiCreater.c index d75e02bc6b..e50e574f01 100644 --- a/source/libs/planner/src/planPhysiCreater.c +++ b/source/libs/planner/src/planPhysiCreater.c @@ -39,47 +39,101 @@ typedef struct SPhysiPlanContext { bool hasSysScan; } SPhysiPlanContext; -static int32_t getSlotKey(SNode* pNode, const char* pStmtName, char* pKey, int32_t keyBufSize) { - int32_t len 
= 0; +static int32_t getSlotKey(SNode* pNode, const char* pStmtName, char** ppKey, int32_t *pLen) { + int32_t code = 0; if (QUERY_NODE_COLUMN == nodeType(pNode)) { SColumnNode* pCol = (SColumnNode*)pNode; if (NULL != pStmtName) { if ('\0' != pStmtName[0]) { - len = snprintf(pKey, keyBufSize, "%s.%s", pStmtName, pCol->node.aliasName); - return taosCreateMD5Hash(pKey, len); + *ppKey = taosMemoryCalloc(1, TSDB_TABLE_NAME_LEN + 1 + TSDB_COL_NAME_LEN + 1); + if (!*ppKey) { + return terrno; + } + strcat(*ppKey, pStmtName); + strcat(*ppKey, "."); + strcat(*ppKey, pCol->node.aliasName); + *pLen = taosHashBinary(*ppKey, strlen(*ppKey)); + return code; } else { - return snprintf(pKey, keyBufSize, "%s", pCol->node.aliasName); + *ppKey = taosMemoryCalloc(1, TSDB_COL_NAME_LEN + 1); + if (!*ppKey) { + return terrno; + } + strcat(*ppKey, pCol->node.aliasName); + *pLen = strlen(*ppKey); + return code; } } if ('\0' == pCol->tableAlias[0]) { - return snprintf(pKey, keyBufSize, "%s", pCol->colName); + *ppKey = taosMemoryCalloc(1, TSDB_COL_NAME_LEN + 1); + if (!*ppKey) { + return terrno; + } + strcat(*ppKey, pCol->colName); + *pLen = strlen(*ppKey); + return code; } - len = snprintf(pKey, keyBufSize, "%s.%s", pCol->tableAlias, pCol->colName); - return taosCreateMD5Hash(pKey, len); + *ppKey = taosMemoryCalloc(1, TSDB_TABLE_NAME_LEN + 1 + TSDB_COL_NAME_LEN + 1); + if (!*ppKey) { + return terrno; + } + strcat(*ppKey, pCol->tableAlias); + strcat(*ppKey, "."); + strcat(*ppKey, pCol->colName); + *pLen = taosHashBinary(*ppKey, strlen(*ppKey)); + return code; } else if (QUERY_NODE_FUNCTION == nodeType(pNode)) { SFunctionNode* pFunc = (SFunctionNode*)pNode; if (FUNCTION_TYPE_TBNAME == pFunc->funcType) { SValueNode* pVal = (SValueNode*)nodesListGetNode(pFunc->pParameterList, 0); if (pVal) { if (NULL != pStmtName && '\0' != pStmtName[0]) { - len = snprintf(pKey, keyBufSize, "%s.%s", pStmtName, ((SExprNode*)pNode)->aliasName); - return taosCreateMD5Hash(pKey, len); + *ppKey = taosMemoryCalloc(1, 
TSDB_TABLE_NAME_LEN + 1 + TSDB_COL_NAME_LEN + 1); + if (!*ppKey) { + return terrno; + } + strcat(*ppKey, pStmtName); + strcat(*ppKey, "."); + strcat(*ppKey, ((SExprNode*)pNode)->aliasName); + *pLen = taosHashBinary(*ppKey, strlen(*ppKey)); + return code; } - len = snprintf(pKey, keyBufSize, "%s.%s", pVal->literal, ((SExprNode*)pNode)->aliasName); - return taosCreateMD5Hash(pKey, len); + *ppKey = taosMemoryCalloc(1, strlen(pVal->literal) + 1 + TSDB_COL_NAME_LEN + 1); + if (!*ppKey) { + return terrno; + } + strcat(*ppKey, pVal->literal); + strcat(*ppKey, "."); + strcat(*ppKey, ((SExprNode*)pNode)->aliasName); + *pLen = taosHashBinary(*ppKey, strlen(*ppKey)); + return code; } } } if (NULL != pStmtName && '\0' != pStmtName[0]) { - len = snprintf(pKey, keyBufSize, "%s.%s", pStmtName, ((SExprNode*)pNode)->aliasName); - return taosCreateMD5Hash(pKey, len); + *ppKey = taosMemoryCalloc(1, TSDB_TABLE_NAME_LEN + 1 + TSDB_COL_NAME_LEN + 1); + if (!*ppKey) { + return terrno; + } + strcat(*ppKey, pStmtName); + strcat(*ppKey, "."); + strcat(*ppKey, ((SExprNode*)pNode)->aliasName); + *pLen = taosHashBinary(*ppKey, strlen(*ppKey)); + return code; } - return snprintf(pKey, keyBufSize, "%s", ((SExprNode*)pNode)->aliasName); + *ppKey = taosMemoryCalloc(1, TSDB_COL_NAME_LEN + 1); + if (!*ppKey) { + return terrno; + } + strcat(*ppKey, ((SExprNode*)pNode)->aliasName); + *pLen = strlen(*ppKey); + return code; } + static SNode* createSlotDesc(SPhysiPlanContext* pCxt, const char* pName, const SNode* pNode, int16_t slotId, bool output, bool reserve) { SSlotDescNode* pSlot = NULL; @@ -132,8 +186,8 @@ static int32_t putSlotToHashImpl(int16_t dataBlockId, int16_t slotId, const char return taosHashPut(pHash, pName, len, &index, sizeof(SSlotIndex)); } -static int32_t putSlotToHash(const char* pName, int16_t dataBlockId, int16_t slotId, SNode* pNode, SHashObj* pHash) { - return putSlotToHashImpl(dataBlockId, slotId, pName, strlen(pName), pHash); +static int32_t putSlotToHash(const char* pName, 
int32_t len, int16_t dataBlockId, int16_t slotId, SNode* pNode, SHashObj* pHash) { + return putSlotToHashImpl(dataBlockId, slotId, pName, len, pHash); } static int32_t createDataBlockDescHash(SPhysiPlanContext* pCxt, int32_t capacity, int16_t dataBlockId, @@ -162,12 +216,16 @@ static int32_t buildDataBlockSlots(SPhysiPlanContext* pCxt, SNodeList* pList, SD int16_t slotId = 0; SNode* pNode = NULL; FOREACH(pNode, pList) { - char name[TSDB_COL_FNAME_LEN + 1] = {0}; - (void)getSlotKey(pNode, NULL, name, TSDB_COL_FNAME_LEN); - code = nodesListStrictAppend(pDataBlockDesc->pSlots, createSlotDesc(pCxt, name, pNode, slotId, true, false)); + char* name = NULL; + int32_t len = 0; + code = getSlotKey(pNode, NULL, &name, &len); if (TSDB_CODE_SUCCESS == code) { - code = putSlotToHash(name, pDataBlockDesc->dataBlockId, slotId, pNode, pHash); + code = nodesListStrictAppend(pDataBlockDesc->pSlots, createSlotDesc(pCxt, name, pNode, slotId, true, false)); } + if (TSDB_CODE_SUCCESS == code) { + code = putSlotToHash(name, len, pDataBlockDesc->dataBlockId, slotId, pNode, pHash); + } + taosMemoryFree(name); if (TSDB_CODE_SUCCESS == code) { pDataBlockDesc->totalRowSize += ((SExprNode*)pNode)->resType.bytes; pDataBlockDesc->outputRowSize += ((SExprNode*)pNode)->resType.bytes; @@ -226,25 +284,29 @@ static int32_t addDataBlockSlotsImpl(SPhysiPlanContext* pCxt, SNodeList* pList, SNode* pNode = NULL; FOREACH(pNode, pList) { SNode* pExpr = QUERY_NODE_ORDER_BY_EXPR == nodeType(pNode) ? 
((SOrderByExprNode*)pNode)->pExpr : pNode; - char name[TSDB_COL_FNAME_LEN + 1] = {0}; - int32_t len = getSlotKey(pExpr, pStmtName, name, TSDB_COL_FNAME_LEN); - SSlotIndex* pIndex = taosHashGet(pHash, name, len); - if (NULL == pIndex) { - code = + char *name = NULL; + int32_t len = 0; + code = getSlotKey(pExpr, pStmtName, &name, &len); + if (TSDB_CODE_SUCCESS == code) { + SSlotIndex* pIndex = taosHashGet(pHash, name, len); + if (NULL == pIndex) { + code = nodesListStrictAppend(pDataBlockDesc->pSlots, createSlotDesc(pCxt, name, pExpr, nextSlotId, output, reserve)); - if (TSDB_CODE_SUCCESS == code) { - code = putSlotToHashImpl(pDataBlockDesc->dataBlockId, nextSlotId, name, len, pHash); + if (TSDB_CODE_SUCCESS == code) { + code = putSlotToHashImpl(pDataBlockDesc->dataBlockId, nextSlotId, name, len, pHash); + } + pDataBlockDesc->totalRowSize += ((SExprNode*)pExpr)->resType.bytes; + if (output) { + pDataBlockDesc->outputRowSize += ((SExprNode*)pExpr)->resType.bytes; + } + slotId = nextSlotId; + ++nextSlotId; + } else { + slotId = getUnsetSlotId(pIndex->pSlotIdsInfo); } - pDataBlockDesc->totalRowSize += ((SExprNode*)pExpr)->resType.bytes; - if (output) { - pDataBlockDesc->outputRowSize += ((SExprNode*)pExpr)->resType.bytes; - } - slotId = nextSlotId; - ++nextSlotId; - } else { - slotId = getUnsetSlotId(pIndex->pSlotIdsInfo); } + taosMemoryFree(name); if (TSDB_CODE_SUCCESS == code) { SNode* pTarget = NULL; code = createTarget(pNode, pDataBlockDesc->dataBlockId, slotId, &pTarget); @@ -315,8 +377,12 @@ static void dumpSlots(const char* pName, SHashObj* pHash) { static EDealRes doSetSlotId(SNode* pNode, void* pContext) { if (QUERY_NODE_COLUMN == nodeType(pNode) && 0 != strcmp(((SColumnNode*)pNode)->colName, "*")) { SSetSlotIdCxt* pCxt = (SSetSlotIdCxt*)pContext; - char name[TSDB_COL_FNAME_LEN + 1] = {0}; - int32_t len = getSlotKey(pNode, NULL, name, TSDB_COL_FNAME_LEN); + char *name = NULL; + int32_t len = 0; + pCxt->errCode = getSlotKey(pNode, NULL, &name, &len); + if 
(TSDB_CODE_SUCCESS != pCxt->errCode) { + return DEAL_RES_ERROR; + } SSlotIndex* pIndex = taosHashGet(pCxt->pLeftHash, name, len); if (NULL == pIndex) { pIndex = taosHashGet(pCxt->pRightHash, name, len); @@ -327,8 +393,10 @@ static EDealRes doSetSlotId(SNode* pNode, void* pContext) { dumpSlots("left datablock desc", pCxt->pLeftHash); dumpSlots("right datablock desc", pCxt->pRightHash); pCxt->errCode = TSDB_CODE_PLAN_INTERNAL_ERROR; + taosMemoryFree(name); return DEAL_RES_ERROR; } + taosMemoryFree(name); ((SColumnNode*)pNode)->dataBlockId = pIndex->dataBlockId; ((SColumnNode*)pNode)->slotId = ((SSlotIdInfo*)taosArrayGet(pIndex->pSlotIdsInfo, 0))->slotId; return DEAL_RES_IGNORE_CHILD; @@ -1174,7 +1242,6 @@ static int32_t createHashJoinColList(int16_t lBlkId, int16_t rBlkId, SNode* pEq1 static int32_t sortHashJoinTargets(int16_t lBlkId, int16_t rBlkId, SHashJoinPhysiNode* pJoin) { SNode* pNode = NULL; - char name[TSDB_COL_FNAME_LEN + 1] = {0}; SSHashObj* pHash = tSimpleHashInit(pJoin->pTargets->length, taosGetDefaultHashFunction(TSDB_DATA_TYPE_BINARY)); if (NULL == pHash) { return TSDB_CODE_OUT_OF_MEMORY; @@ -1185,8 +1252,13 @@ static int32_t sortHashJoinTargets(int16_t lBlkId, int16_t rBlkId, SHashJoinPhys if (TSDB_CODE_SUCCESS == code) { FOREACH(pNode, pJoin->pTargets) { SColumnNode* pCol = (SColumnNode*)pNode; - int32_t len = getSlotKey(pNode, NULL, name, TSDB_COL_FNAME_LEN); - code = tSimpleHashPut(pHash, name, len, &pCol, POINTER_BYTES); + char *pName = NULL; + int32_t len = 0; + code = getSlotKey(pNode, NULL, &pName, &len); + if (TSDB_CODE_SUCCESS == code) { + code = tSimpleHashPut(pHash, pName, len, &pCol, POINTER_BYTES); + } + taosMemoryFree(pName); if (TSDB_CODE_SUCCESS != code) { break; } @@ -1197,36 +1269,44 @@ static int32_t sortHashJoinTargets(int16_t lBlkId, int16_t rBlkId, SHashJoinPhys pJoin->pTargets = pNew; FOREACH(pNode, pJoin->pOnLeft) { + char* pName = NULL; SColumnNode* pCol = (SColumnNode*)pNode; - int32_t len = getSlotKey(pNode, NULL, name, 
TSDB_COL_FNAME_LEN); - SNode** p = tSimpleHashGet(pHash, name, len); - if (p) { - code = nodesListStrictAppend(pJoin->pTargets, *p); - if (TSDB_CODE_SUCCESS != code) { - break; - } - code = tSimpleHashRemove(pHash, name, len); - if (TSDB_CODE_SUCCESS != code) { - break; + int32_t len = 0; + code = getSlotKey(pNode, NULL, &pName, &len); + if (TSDB_CODE_SUCCESS == code) { + SNode** p = tSimpleHashGet(pHash, pName, len); + if (p) { + code = nodesListStrictAppend(pJoin->pTargets, *p); + if (TSDB_CODE_SUCCESS == code) { + code = tSimpleHashRemove(pHash, pName, len); + } } } + taosMemoryFree(pName); + if (TSDB_CODE_SUCCESS != code) { + break; + } } } if (TSDB_CODE_SUCCESS == code) { FOREACH(pNode, pJoin->pOnRight) { + char* pName = NULL; SColumnNode* pCol = (SColumnNode*)pNode; - int32_t len = getSlotKey(pNode, NULL, name, TSDB_COL_FNAME_LEN); - SNode** p = tSimpleHashGet(pHash, name, len); - if (p) { - code = nodesListStrictAppend(pJoin->pTargets, *p); - if (TSDB_CODE_SUCCESS != code) { - break; - } - code = tSimpleHashRemove(pHash, name, len); - if (TSDB_CODE_SUCCESS != code) { - break; + int32_t len = 0; + code = getSlotKey(pNode, NULL, &pName, &len); + if (TSDB_CODE_SUCCESS == code) { + SNode** p = tSimpleHashGet(pHash, pName, len); + if (p) { + code = nodesListStrictAppend(pJoin->pTargets, *p); + if (TSDB_CODE_SUCCESS == code) { + code = tSimpleHashRemove(pHash, pName, len); + } } } + taosMemoryFree(pName); + if (TSDB_CODE_SUCCESS != code) { + break; + } } } if (TSDB_CODE_SUCCESS == code) { diff --git a/source/libs/planner/src/planSpliter.c b/source/libs/planner/src/planSpliter.c index efbcd79b69..706394507a 100644 --- a/source/libs/planner/src/planSpliter.c +++ b/source/libs/planner/src/planSpliter.c @@ -432,7 +432,7 @@ static int32_t stbSplAppendWStart(SNodeList* pFuncs, int32_t* pIndex, uint8_t pr int64_t pointer = (int64_t)pWStart; char name[TSDB_COL_NAME_LEN + TSDB_POINTER_PRINT_BYTES + TSDB_NAME_DELIMITER_LEN + 1] = {0}; int32_t len = snprintf(name, 
sizeof(name) - 1, "%s.%" PRId64 "", pWStart->functionName, pointer); - (void)taosCreateMD5Hash(name, len); + (void)taosHashBinary(name, len); strncpy(pWStart->node.aliasName, name, TSDB_COL_NAME_LEN - 1); pWStart->node.resType.precision = precision; @@ -464,7 +464,7 @@ static int32_t stbSplAppendWEnd(SWindowLogicNode* pWin, int32_t* pIndex) { int64_t pointer = (int64_t)pWEnd; char name[TSDB_COL_NAME_LEN + TSDB_POINTER_PRINT_BYTES + TSDB_NAME_DELIMITER_LEN + 1] = {0}; int32_t len = snprintf(name, sizeof(name) - 1, "%s.%" PRId64 "", pWEnd->functionName, pointer); - (void)taosCreateMD5Hash(name, len); + (void)taosHashBinary(name, len); strncpy(pWEnd->node.aliasName, name, TSDB_COL_NAME_LEN - 1); code = fmGetFuncInfo(pWEnd, NULL, 0); diff --git a/source/libs/planner/src/planUtil.c b/source/libs/planner/src/planUtil.c index 91dc36b99f..e1e98f221f 100644 --- a/source/libs/planner/src/planUtil.c +++ b/source/libs/planner/src/planUtil.c @@ -631,7 +631,7 @@ SFunctionNode* createGroupKeyAggFunc(SColumnNode* pGroupCol) { if (TSDB_CODE_SUCCESS == code) { char name[TSDB_FUNC_NAME_LEN + TSDB_NAME_DELIMITER_LEN + TSDB_POINTER_PRINT_BYTES + 1] = {0}; int32_t len = snprintf(name, sizeof(name) - 1, "%s.%p", pFunc->functionName, pFunc); - (void)taosCreateMD5Hash(name, len); + (void)taosHashBinary(name, len); strncpy(pFunc->node.aliasName, name, TSDB_COL_NAME_LEN - 1); } } diff --git a/source/libs/qcom/src/queryUtil.c b/source/libs/qcom/src/queryUtil.c index 9b61f81939..d47a183121 100644 --- a/source/libs/qcom/src/queryUtil.c +++ b/source/libs/qcom/src/queryUtil.c @@ -225,6 +225,7 @@ int32_t asyncSendMsgToServerExt(void* pTransporter, SEpSet* epSet, int64_t* pTra .code = 0 }; TRACE_SET_ROOTID(&rpcMsg.info.traceId, pInfo->requestId); + int code = rpcSendRequestWithCtx(pTransporter, epSet, &rpcMsg, pTransporterId, rpcCtx); if (code) { destroySendMsgInfo(pInfo); @@ -235,6 +236,9 @@ int32_t asyncSendMsgToServerExt(void* pTransporter, SEpSet* epSet, int64_t* pTra int32_t 
asyncSendMsgToServer(void* pTransporter, SEpSet* epSet, int64_t* pTransporterId, SMsgSendInfo* pInfo) { return asyncSendMsgToServerExt(pTransporter, epSet, pTransporterId, pInfo, false, NULL); } +int32_t asyncFreeConnById(void* pTransporter, int64_t pid) { + return rpcFreeConnById(pTransporter, pid); +} char* jobTaskStatusStr(int32_t status) { switch (status) { @@ -448,13 +452,13 @@ void parseTagDatatoJson(void* p, char** jsonStr) { if (value == NULL) { goto end; } - if(!cJSON_AddItemToObject(json, tagJsonKey, value)){ + if (!cJSON_AddItemToObject(json, tagJsonKey, value)) { goto end; } } else if (type == TSDB_DATA_TYPE_NCHAR) { cJSON* value = NULL; if (pTagVal->nData > 0) { - char* tagJsonValue = taosMemoryCalloc(pTagVal->nData, 1); + char* tagJsonValue = taosMemoryCalloc(pTagVal->nData, 1); if (tagJsonValue == NULL) { goto end; } @@ -479,7 +483,7 @@ void parseTagDatatoJson(void* p, char** jsonStr) { goto end; } - if(!cJSON_AddItemToObject(json, tagJsonKey, value)){ + if (!cJSON_AddItemToObject(json, tagJsonKey, value)) { goto end; } } else if (type == TSDB_DATA_TYPE_DOUBLE) { @@ -488,7 +492,7 @@ void parseTagDatatoJson(void* p, char** jsonStr) { if (value == NULL) { goto end; } - if(!cJSON_AddItemToObject(json, tagJsonKey, value)){ + if (!cJSON_AddItemToObject(json, tagJsonKey, value)) { goto end; } } else if (type == TSDB_DATA_TYPE_BOOL) { @@ -497,7 +501,7 @@ void parseTagDatatoJson(void* p, char** jsonStr) { if (value == NULL) { goto end; } - if(!cJSON_AddItemToObject(json, tagJsonKey, value)){ + if (!cJSON_AddItemToObject(json, tagJsonKey, value)) { goto end; } } else { diff --git a/source/libs/qcom/src/querymsg.c b/source/libs/qcom/src/querymsg.c index e8deed1df9..207bd91bd9 100644 --- a/source/libs/qcom/src/querymsg.c +++ b/source/libs/qcom/src/querymsg.c @@ -34,7 +34,7 @@ int32_t queryBuildUseDbOutput(SUseDbOutput *pOut, SUseDbRsp *usedbRsp) { pOut->dbVgroup = taosMemoryCalloc(1, sizeof(SDBVgInfo)); if (NULL == pOut->dbVgroup) { - return 
TSDB_CODE_OUT_OF_MEMORY; + return terrno; } pOut->dbVgroup->vgVersion = usedbRsp->vgVersion; @@ -509,7 +509,7 @@ int32_t queryCreateTableMetaFromMsg(STableMetaRsp *msg, bool isStb, STableMeta * STableMeta *pTableMeta = taosMemoryCalloc(1, metaSize + schemaExtSize); if (NULL == pTableMeta) { qError("calloc size[%d] failed", metaSize); - return TSDB_CODE_OUT_OF_MEMORY; + return terrno; } SSchemaExt *pSchemaExt = (SSchemaExt *)((char *)pTableMeta + metaSize); @@ -764,7 +764,7 @@ int32_t queryProcessGetTbCfgRsp(void *output, char *msg, int32_t msgSize) { STableCfgRsp *out = taosMemoryCalloc(1, sizeof(STableCfgRsp)); if(out == NULL) { - return TSDB_CODE_OUT_OF_MEMORY; + return terrno; } if (tDeserializeSTableCfgRsp(msg, msgSize, out) != 0) { qError("tDeserializeSTableCfgRsp failed, msgSize:%d", msgSize); @@ -785,7 +785,7 @@ int32_t queryProcessGetViewMetaRsp(void *output, char *msg, int32_t msgSize) { SViewMetaRsp *out = taosMemoryCalloc(1, sizeof(SViewMetaRsp)); if (out == NULL) { - return TSDB_CODE_OUT_OF_MEMORY; + return terrno; } if (tDeserializeSViewMetaRsp(msg, msgSize, out) != 0) { qError("tDeserializeSViewMetaRsp failed, msgSize:%d", msgSize); diff --git a/source/libs/scalar/inc/filterInt.h b/source/libs/scalar/inc/filterInt.h index ab04f06b02..4d45fb344c 100644 --- a/source/libs/scalar/inc/filterInt.h +++ b/source/libs/scalar/inc/filterInt.h @@ -464,7 +464,7 @@ struct SFilterInfo { (colInfo).type = RANGE_TYPE_UNIT; \ (colInfo).dataType = FILTER_UNIT_DATA_TYPE(u); \ if (taosArrayPush((SArray *)((colInfo).info), &u) == NULL) { \ - FLT_ERR_RET(TSDB_CODE_OUT_OF_MEMORY); \ + FLT_ERR_RET(terrno); \ } \ } while (0) #define FILTER_PUSH_VAR_HASH(colInfo, ha) \ @@ -481,6 +481,9 @@ struct SFilterInfo { #define FILTER_COPY_IDX(dst, src, n) \ do { \ *(dst) = taosMemoryMalloc(sizeof(uint32_t) * n); \ + if (NULL == *(dst)) { \ + FLT_ERR_JRET(terrno); \ + } \ (void)memcpy(*(dst), src, sizeof(uint32_t) * n); \ } while (0) diff --git a/source/libs/scalar/src/filter.c 
b/source/libs/scalar/src/filter.c index 5226668f82..e5d0fe594a 100644 --- a/source/libs/scalar/src/filter.c +++ b/source/libs/scalar/src/filter.c @@ -2343,7 +2343,7 @@ _return: (void)filterFreeRangeCtx(ctx); // No need to handle the return value. - return TSDB_CODE_SUCCESS; + return code; } int32_t filterMergeGroupUnits(SFilterInfo *info, SFilterGroupCtx **gRes, int32_t *gResNum) { @@ -2671,7 +2671,7 @@ _return: (void)filterFreeRangeCtx(ctx); // No need to handle the return value. - return TSDB_CODE_SUCCESS; + return code; } int32_t filterMergeGroups(SFilterInfo *info, SFilterGroupCtx **gRes, int32_t *gResNum) { @@ -2758,7 +2758,7 @@ _return: FILTER_SET_FLAG(info->status, FI_STATUS_ALL); - return TSDB_CODE_SUCCESS; + return code; } int32_t filterConvertGroupFromArray(SFilterInfo *info, SArray *group) { @@ -2958,7 +2958,7 @@ _return: taosMemoryFreeClear(idxNum); taosMemoryFreeClear(idxs); - return TSDB_CODE_SUCCESS; + return code; } int32_t filterPostProcessRange(SFilterInfo *info) { @@ -3601,12 +3601,12 @@ int32_t filterPreprocess(SFilterInfo *info) { if (FILTER_GET_FLAG(info->status, FI_STATUS_ALL)) { fltInfo("Final - FilterInfo: [ALL]"); - goto _return; + goto _return1; } if (FILTER_GET_FLAG(info->status, FI_STATUS_EMPTY)) { fltInfo("Final - FilterInfo: [EMPTY]"); - goto _return; + goto _return1; } FLT_ERR_JRET(filterGenerateColRange(info, gRes, gResNum)); @@ -3619,10 +3619,10 @@ int32_t filterPreprocess(SFilterInfo *info) { FLT_ERR_JRET(filterGenerateComInfo(info)); +_return1: + FLT_ERR_JRET(filterSetExecFunc(info)); + _return: - - FLT_ERR_RET(filterSetExecFunc(info)); - for (int32_t i = 0; i < gResNum; ++i) { filterFreeGroupCtx(gRes[i]); } @@ -3660,15 +3660,25 @@ int32_t fltInitFromNode(SNode *tree, SFilterInfo *info, uint32_t options) { FLT_ERR_JRET(terrno); } - FLT_ERR_JRET(filterInitUnitsFields(info)); + code = filterInitUnitsFields(info); + if(TSDB_CODE_SUCCESS != code) { + taosArrayDestroy(group); + goto _return; + } SFltBuildGroupCtx tctx = {.info = info, 
.group = group}; nodesWalkExpr(tree, fltTreeToGroup, (void *)&tctx); - FLT_ERR_JRET(tctx.code); - - FLT_ERR_JRET(filterConvertGroupFromArray(info, group)); + if (TSDB_CODE_SUCCESS != tctx.code) { + taosArrayDestroy(group); + code = tctx.code; + goto _return; + } + code = filterConvertGroupFromArray(info, group); + if (TSDB_CODE_SUCCESS != code) { + taosArrayDestroy(group); + goto _return; + } taosArrayDestroy(group); - FLT_ERR_JRET(fltInitValFieldData(info)); if (!FILTER_GET_FLAG(info->options, FLT_OPTION_NO_REWRITE)) { @@ -4993,7 +5003,7 @@ int32_t fltOptimizeNodes(SFilterInfo *pInfo, SNode **pNode, SFltTreeStat *pStat) } _return: taosArrayDestroy(sclOpList); - return TSDB_CODE_SUCCESS; + return code; } int32_t fltGetDataFromColId(void *param, int32_t id, void **data) { diff --git a/source/libs/scalar/src/scalar.c b/source/libs/scalar/src/scalar.c index d08f358ce0..9428f051aa 100644 --- a/source/libs/scalar/src/scalar.c +++ b/source/libs/scalar/src/scalar.c @@ -904,9 +904,8 @@ int32_t sclExecOperator(SOperatorNode *node, SScalarCtx *ctx, SScalarParam *outp terrno = TSDB_CODE_SUCCESS; SCL_ERR_JRET(OperatorFn(pLeft, pRight, output, TSDB_ORDER_ASC)); - SCL_ERR_JRET(terrno); -_return: +_return: sclFreeParamList(params, paramNum); SCL_RET(code); } diff --git a/source/libs/scalar/src/sclfunc.c b/source/libs/scalar/src/sclfunc.c index cd67b89208..5db7aebeee 100644 --- a/source/libs/scalar/src/sclfunc.c +++ b/source/libs/scalar/src/sclfunc.c @@ -2913,7 +2913,7 @@ int32_t histogramScalarFunction(SScalarParam *pInput, int32_t inputNum, SScalarP _return: taosMemoryFree(bins); - return TSDB_CODE_SUCCESS; + return code; } int32_t selectScalarFunction(SScalarParam *pInput, int32_t inputNum, SScalarParam *pOutput) { diff --git a/source/libs/scheduler/src/schRemote.c b/source/libs/scheduler/src/schRemote.c index 08a8f684f5..9215254f9c 100644 --- a/source/libs/scheduler/src/schRemote.c +++ b/source/libs/scheduler/src/schRemote.c @@ -17,11 +17,11 @@ #include "command.h" #include 
"query.h" #include "schInt.h" +#include "tglobal.h" +#include "tmisce.h" #include "tmsg.h" #include "tref.h" #include "trpc.h" -#include "tglobal.h" -#include "tmisce.h" // clang-format off int32_t schValidateRspMsgType(SSchJob *pJob, SSchTask *pTask, int32_t msgType) { @@ -975,11 +975,13 @@ int32_t schAsyncSendMsg(SSchJob *pJob, SSchTask *pTask, SSchTrans *trans, SQuery SCH_ERR_JRET(schUpdateSendTargetInfo(pMsgSendInfo, addr, pTask)); if (isHb && persistHandle && trans->pHandle == 0) { - trans->pHandle = rpcAllocHandle(); - if (NULL == trans->pHandle) { - SCH_TASK_ELOG("rpcAllocHandle failed, code:%x", terrno); - SCH_ERR_JRET(terrno); + int64_t refId = 0; + code = rpcAllocHandle(&refId); + if (code != 0) { + SCH_TASK_ELOG("rpcAllocHandle failed, code:%x", code); + SCH_ERR_JRET(code); } + trans->pHandle = (void *)refId; } if (pJob && pTask) { @@ -1200,7 +1202,14 @@ int32_t schBuildAndSendMsg(SSchJob *pJob, SSchTask *pTask, SQueryNodeAddr *addr, } persistHandle = true; - SCH_SET_TASK_HANDLE(pTask, rpcAllocHandle()); + int64_t refId = 0; + code = rpcAllocHandle(&refId); + if (code != 0) { + SCH_TASK_ELOG("rpcAllocHandle failed, code:%x", code); + SCH_ERR_JRET(code); + } + + SCH_SET_TASK_HANDLE(pTask, (void *)refId); break; } case TDMT_SCH_FETCH: diff --git a/source/libs/stream/inc/streamInt.h b/source/libs/stream/inc/streamInt.h index 93d2edd639..350bd35490 100644 --- a/source/libs/stream/inc/streamInt.h +++ b/source/libs/stream/inc/streamInt.h @@ -48,23 +48,30 @@ extern "C" { #define stTrace(...) 
do { if (stDebugFlag & DEBUG_TRACE) { taosPrintLog("STM ", DEBUG_TRACE, stDebugFlag, __VA_ARGS__); }} while(0) // clang-format on +typedef struct SStreamTmrInfo { + int32_t activeCounter; // make sure only launch one checkpoint trigger check tmr + tmr_h tmrHandle; + int64_t launchChkptId; + int8_t isActive; +} SStreamTmrInfo; + struct SActiveCheckpointInfo { - TdThreadMutex lock; - int32_t transId; - int64_t firstRecvTs; // first time to recv checkpoint trigger info - int64_t activeId; // current active checkpoint id - int64_t failedId; - bool dispatchTrigger; - SArray* pDispatchTriggerList; // SArray - SArray* pReadyMsgList; // SArray - int8_t allUpstreamTriggerRecv; - SArray* pCheckpointReadyRecvList; // SArray - int32_t checkCounter; - tmr_h pChkptTriggerTmr; - int32_t sendReadyCheckCounter; - tmr_h pSendReadyMsgTmr; + TdThreadMutex lock; + int32_t transId; + int64_t firstRecvTs; // first time to recv checkpoint trigger info + int64_t activeId; // current active checkpoint id + int64_t failedId; + bool dispatchTrigger; + SArray* pDispatchTriggerList; // SArray + SArray* pReadyMsgList; // SArray + int8_t allUpstreamTriggerRecv; + SArray* pCheckpointReadyRecvList; // SArray + SStreamTmrInfo chkptTriggerMsgTmr; + SStreamTmrInfo chkptReadyMsgTmr; }; +int32_t streamCleanBeforeQuitTmr(SStreamTmrInfo* pInfo, SStreamTask* pTask); + typedef struct { int8_t type; SSDataBlock* pBlock; @@ -222,7 +229,7 @@ int32_t streamMetaSendHbHelper(SStreamMeta* pMeta); ECHECKPOINT_BACKUP_TYPE streamGetCheckpointBackupType(); -int32_t streamTaskDownloadCheckpointData(const char* id, char* path); +int32_t streamTaskDownloadCheckpointData(const char* id, char* path, int64_t checkpointId); int32_t streamTaskOnNormalTaskReady(SStreamTask* pTask); int32_t streamTaskOnScanHistoryTaskReady(SStreamTask* pTask); diff --git a/source/libs/stream/src/streamBackendRocksdb.c b/source/libs/stream/src/streamBackendRocksdb.c index a0c6317126..ee87d3b897 100644 --- 
a/source/libs/stream/src/streamBackendRocksdb.c +++ b/source/libs/stream/src/streamBackendRocksdb.c @@ -247,7 +247,7 @@ int32_t rebuildDirFromCheckpoint(const char* path, int64_t chkpId, char** dst) { } else { stError("failed to start stream backend at %s, reason: %s, restart from default state dir:%s", chkp, - tstrerror(TAOS_SYSTEM_ERROR(errno)), state); + tstrerror(terrno), state); code = taosMkDir(state); if (code != 0) { code = TAOS_SYSTEM_ERROR(errno); @@ -447,7 +447,7 @@ int32_t rebuildFromRemoteChkp_rsync(const char* key, char* checkpointPath, int64 cleanDir(defaultPath, key); stDebug("clear local default dir before downloading checkpoint data:%s succ", defaultPath); - code = streamTaskDownloadCheckpointData(key, checkpointPath); + code = streamTaskDownloadCheckpointData(key, checkpointPath, checkpointId); if (code != 0) { stError("failed to download checkpoint data:%s", key); return code; @@ -482,7 +482,7 @@ int32_t rebuildDataFromS3(char* chkpPath, int64_t chkpId) { int32_t rebuildFromRemoteChkp_s3(const char* key, char* chkpPath, int64_t chkpId, char* defaultPath) { int8_t rename = 0; - int32_t code = streamTaskDownloadCheckpointData(key, chkpPath); + int32_t code = streamTaskDownloadCheckpointData(key, chkpPath, chkpId); if (code != 0) { return code; } @@ -683,7 +683,7 @@ static int32_t rebuildFromLocalCheckpoint(const char* pTaskIdStr, const char* ch defaultPath); } } else { - code = TSDB_CODE_FAILED; + code = terrno; stError("%s no valid data for checkpointId:%" PRId64 " in %s", pTaskIdStr, checkpointId, checkpointPath); } @@ -763,7 +763,7 @@ int32_t restoreCheckpointData(const char* path, const char* key, int64_t chkptId } if (code != 0) { - stError("failed to start stream backend at %s, restart from default defaultPath:%s, reason:%s", checkpointPath, + stError("failed to start stream backend at %s, restart from defaultPath:%s, reason:%s", checkpointPath, defaultPath, tstrerror(code)); code = 0; // reset the error code } @@ -1144,6 +1144,8 @@ int32_t 
chkpMayDelObsolete(void* arg, int64_t chkpId, char* path) { int64_t id = *(int64_t*)taosArrayGet(chkpDel, i); char tbuf[256] = {0}; sprintf(tbuf, "%s%scheckpoint%" PRId64 "", path, TD_DIRSEP, id); + + stInfo("backend remove obsolete checkpoint: %s", tbuf); if (taosIsDir(tbuf)) { taosRemoveDir(tbuf); } @@ -2661,6 +2663,7 @@ void taskDbDestroy(void* pDb, bool flush) { if (wrapper->removeAllFiles) { char* err = NULL; + stInfo("drop task remove backend dat:%s", wrapper->path); taosRemoveDir(wrapper->path); } taosMemoryFree(wrapper->path); diff --git a/source/libs/stream/src/streamCheckpoint.c b/source/libs/stream/src/streamCheckpoint.c index 9da7a5d9c8..270f678d26 100644 --- a/source/libs/stream/src/streamCheckpoint.c +++ b/source/libs/stream/src/streamCheckpoint.c @@ -20,7 +20,7 @@ static int32_t downloadCheckpointDataByName(const char* id, const char* fname, const char* dstName); static int32_t deleteCheckpointFile(const char* id, const char* name); -static int32_t streamTaskUploadCheckpoint(const char* id, const char* path); +static int32_t streamTaskUploadCheckpoint(const char* id, const char* path, int64_t checkpointId); static int32_t deleteCheckpoint(const char* id); static int32_t downloadCheckpointByNameS3(const char* id, const char* fname, const char* dstName); static int32_t continueDispatchCheckpointTriggerBlock(SStreamDataBlock* pBlock, SStreamTask* pTask); @@ -297,14 +297,26 @@ int32_t streamProcessCheckpointTriggerBlock(SStreamTask* pTask, SStreamDataBlock return code; } - int32_t ref = atomic_add_fetch_32(&pTask->status.timerActive, 1); - stDebug("s-task:%s start checkpoint-trigger monitor in 10s, ref:%d ", pTask->id.idStr, ref); - streamMetaAcquireOneTask(pTask); + // if previous launched timer not started yet, not start a new timer + // todo: fix this bug: previous set checkpoint-trigger check tmr is running, while we happen to try to launch + // a new checkpoint-trigger timer right now. 
+ // And if we don't start a new timer, and the lost of checkpoint-trigger message may cause the whole checkpoint + // procedure to be stucked. + SStreamTmrInfo* pTmrInfo = &pActiveInfo->chkptTriggerMsgTmr; + int8_t old = atomic_val_compare_exchange_8(&pTmrInfo->isActive, 0, 1); + if (old == 0) { + int32_t ref = atomic_add_fetch_32(&pTask->status.timerActive, 1); + stDebug("s-task:%s start checkpoint-trigger monitor in 10s, ref:%d ", pTask->id.idStr, ref); + streamMetaAcquireOneTask(pTask); - if (pActiveInfo->pChkptTriggerTmr == NULL) { - pActiveInfo->pChkptTriggerTmr = taosTmrStart(checkpointTriggerMonitorFn, 100, pTask, streamTimer); - } else { - streamTmrReset(checkpointTriggerMonitorFn, 100, pTask, streamTimer, &pActiveInfo->pChkptTriggerTmr, vgId, "trigger-recv-monitor"); + if (pTmrInfo->tmrHandle == NULL) { + pTmrInfo->tmrHandle = taosTmrStart(checkpointTriggerMonitorFn, 200, pTask, streamTimer); + } else { + streamTmrReset(checkpointTriggerMonitorFn, 200, pTask, streamTimer, &pTmrInfo->tmrHandle, vgId, "trigger-recv-monitor"); + } + pTmrInfo->launchChkptId = pActiveInfo->activeId; + } else { // already launched, do nothing + stError("s-task:%s previous checkpoint-trigger monitor tmr is set, not start new one", pTask->id.idStr); } } @@ -349,7 +361,6 @@ int32_t streamProcessCheckpointTriggerBlock(SStreamTask* pTask, SStreamDataBlock (void)streamTaskBuildCheckpoint(pTask); // todo: not handle error yet } else { // source & agg tasks need to forward the checkpoint msg downwards stDebug("s-task:%s process checkpoint-trigger block, all %d upstreams sent, forwards to downstream", id, num); - flushStateDataInExecutor(pTask, (SStreamQueueItem*)pBlock); // Put the checkpoint-trigger block into outputQ, to make sure all blocks with less version have been handled by @@ -364,8 +375,8 @@ int32_t streamProcessCheckpointTriggerBlock(SStreamTask* pTask, SStreamDataBlock // only when all downstream tasks are send checkpoint rsp, we can start the checkpoint procedure for the 
agg task static int32_t processCheckpointReadyHelp(SActiveCheckpointInfo* pInfo, int32_t numOfDownstream, int32_t downstreamNodeId, int64_t streamId, int32_t downstreamTaskId, - const char* id, int32_t* pNotReady, int32_t* pTransId) { - bool received = false; + const char* id, int32_t* pNotReady, int32_t* pTransId, bool* alreadyRecv) { + *alreadyRecv = false; int32_t size = taosArrayGetSize(pInfo->pCheckpointReadyRecvList); for (int32_t i = 0; i < size; ++i) { STaskDownstreamReadyInfo* p = taosArrayGet(pInfo->pCheckpointReadyRecvList, i); @@ -374,12 +385,12 @@ static int32_t processCheckpointReadyHelp(SActiveCheckpointInfo* pInfo, int32_t } if (p->downstreamTaskId == downstreamTaskId) { - received = true; + (*alreadyRecv) = true; break; } } - if (received) { + if (*alreadyRecv) { stDebug("s-task:%s already recv checkpoint-ready msg from downstream:0x%x, ignore. %d/%d downstream not ready", id, downstreamTaskId, (int32_t)(numOfDownstream - taosArrayGetSize(pInfo->pCheckpointReadyRecvList)), numOfDownstream); @@ -415,6 +426,7 @@ int32_t streamProcessCheckpointReadyMsg(SStreamTask* pTask, int64_t checkpointId int32_t code = 0; int32_t notReady = 0; int32_t transId = 0; + bool alreadyHandled = false; // 1. 
not in checkpoint status now SStreamTaskState pStat = streamTaskGetStatus(pTask); @@ -433,12 +445,17 @@ int32_t streamProcessCheckpointReadyMsg(SStreamTask* pTask, int64_t checkpointId streamMutexLock(&pInfo->lock); code = processCheckpointReadyHelp(pInfo, total, downstreamNodeId, pTask->id.streamId, downstreamTaskId, id, &notReady, - &transId); + &transId, &alreadyHandled); streamMutexUnlock(&pInfo->lock); - if ((notReady == 0) && (code == 0)) { - stDebug("s-task:%s all downstream tasks have completed build checkpoint, do checkpoint for current task", id); - (void)appendCheckpointIntoInputQ(pTask, STREAM_INPUT__CHECKPOINT, checkpointId, transId, -1); + if (alreadyHandled) { + stDebug("s-task:%s checkpoint-ready msg checkpointId:%" PRId64 " from task:0x%x already handled, not handle again", + id, checkpointId, downstreamTaskId); + } else { + if ((notReady == 0) && (code == 0) && (!alreadyHandled)) { + stDebug("s-task:%s all downstream tasks have completed build checkpoint, do checkpoint for current task", id); + (void)appendCheckpointIntoInputQ(pTask, STREAM_INPUT__CHECKPOINT, checkpointId, transId, -1); + } } return code; @@ -508,8 +525,8 @@ int32_t streamTaskUpdateTaskCheckpointInfo(SStreamTask* pTask, bool restored, SV streamMutexLock(&pTask->lock); if (pReq->checkpointId <= pInfo->checkpointId) { - stDebug("s-task:%s vgId:%d latest checkpointId:%" PRId64 " checkpointVer:%" PRId64 - " no need to update the checkpoint info, updated checkpointId:%" PRId64 " checkpointVer:%" PRId64 + stDebug("s-task:%s vgId:%d latest checkpointId:%" PRId64 " Ver:%" PRId64 + " no need to update checkpoint info, updated checkpointId:%" PRId64 " Ver:%" PRId64 " transId:%d ignored", id, vgId, pInfo->checkpointId, pInfo->checkpointVer, pReq->checkpointId, pReq->checkpointVer, pReq->transId); @@ -541,28 +558,33 @@ int32_t streamTaskUpdateTaskCheckpointInfo(SStreamTask* pTask, bool restored, SV id, vgId, pStatus.name, pInfo->checkpointId, pReq->checkpointId, pInfo->checkpointVer, 
pReq->checkpointVer, pInfo->checkpointTime, pReq->checkpointTs); } else { // not in restore status, must be in checkpoint status - stDebug("s-task:%s vgId:%d status:%s start to update the checkpoint-info, checkpointId:%" PRId64 "->%" PRId64 - " checkpointVer:%" PRId64 "->%" PRId64 " checkpointTs:%" PRId64 "->%" PRId64, - id, vgId, pStatus.name, pInfo->checkpointId, pReq->checkpointId, pInfo->checkpointVer, pReq->checkpointVer, - pInfo->checkpointTime, pReq->checkpointTs); + if (pStatus.state == TASK_STATUS__CK) { + stDebug("s-task:%s vgId:%d status:%s start to update the checkpoint-info, checkpointId:%" PRId64 "->%" PRId64 + " checkpointVer:%" PRId64 "->%" PRId64 " checkpointTs:%" PRId64 "->%" PRId64, + id, vgId, pStatus.name, pInfo->checkpointId, pReq->checkpointId, pInfo->checkpointVer, + pReq->checkpointVer, pInfo->checkpointTime, pReq->checkpointTs); + } else { + stDebug("s-task:%s vgId:%d status:%s NOT update the checkpoint-info, checkpointId:%" PRId64 "->%" PRId64 + " checkpointVer:%" PRId64 "->%" PRId64, + id, vgId, pStatus.name, pInfo->checkpointId, pReq->checkpointId, pInfo->checkpointVer, + pReq->checkpointVer); + } } ASSERT(pInfo->checkpointId <= pReq->checkpointId && pInfo->checkpointVer <= pReq->checkpointVer && pInfo->processedVer <= pReq->checkpointVer); - pInfo->checkpointId = pReq->checkpointId; - pInfo->checkpointVer = pReq->checkpointVer; - pInfo->checkpointTime = pReq->checkpointTs; + // update only it is in checkpoint status. 
+ if (pStatus.state == TASK_STATUS__CK) { + pInfo->checkpointId = pReq->checkpointId; + pInfo->checkpointVer = pReq->checkpointVer; + pInfo->checkpointTime = pReq->checkpointTs; + + code = streamTaskHandleEvent(pTask->status.pSM, TASK_EVENT_CHECKPOINT_DONE); + } streamTaskClearCheckInfo(pTask, true); - if (pStatus.state == TASK_STATUS__CK) { - // todo handle error - code = streamTaskHandleEvent(pTask->status.pSM, TASK_EVENT_CHECKPOINT_DONE); - } else { - stDebug("s-task:0x%x vgId:%d not handle checkpoint-done event, status:%s", pReq->taskId, vgId, pStatus.name); - } - if (pReq->dropRelHTask) { stDebug("s-task:0x%x vgId:%d drop the related fill-history task:0x%" PRIx64 " after update checkpoint", pReq->taskId, vgId, pReq->hTaskId); @@ -670,7 +692,7 @@ int32_t uploadCheckpointData(SStreamTask* pTask, int64_t checkpointId, int64_t d } if (code == TSDB_CODE_SUCCESS) { - code = streamTaskUploadCheckpoint(idStr, path); + code = streamTaskUploadCheckpoint(idStr, path, checkpointId); if (code == TSDB_CODE_SUCCESS) { stDebug("s-task:%s upload checkpointId:%" PRId64 " to remote succ", idStr, checkpointId); } else { @@ -810,6 +832,7 @@ void checkpointTriggerMonitorFn(void* param, void* tmrId) { const char* id = pTask->id.idStr; SActiveCheckpointInfo* pActiveInfo = pTask->chkInfo.pActiveInfo; + SStreamTmrInfo* pTmrInfo = &pActiveInfo->chkptTriggerMsgTmr; if (pTask->info.taskLevel == TASK_LEVEL__SOURCE) { int32_t ref = atomic_sub_fetch_32(&pTask->status.timerActive, 1); @@ -820,24 +843,24 @@ void checkpointTriggerMonitorFn(void* param, void* tmrId) { // check the status every 100ms if (streamTaskShouldStop(pTask)) { - int32_t ref = atomic_sub_fetch_32(&pTask->status.timerActive, 1); + int32_t ref = streamCleanBeforeQuitTmr(pTmrInfo, pTask); stDebug("s-task:%s vgId:%d quit from monitor checkpoint-trigger, ref:%d", id, vgId, ref); streamMetaReleaseTask(pTask->pMeta, pTask); return; } - if (++pActiveInfo->checkCounter < 100) { - streamTmrReset(checkpointTriggerMonitorFn, 100, 
pTask, streamTimer, &pActiveInfo->pChkptTriggerTmr, vgId, "trigger-recv-monitor"); + if (++pTmrInfo->activeCounter < 50) { + streamTmrReset(checkpointTriggerMonitorFn, 200, pTask, streamTimer, &pTmrInfo->tmrHandle, vgId, "trigger-recv-monitor"); return; } - pActiveInfo->checkCounter = 0; + pTmrInfo->activeCounter = 0; stDebug("s-task:%s vgId:%d checkpoint-trigger monitor in tmr, ts:%" PRId64, id, vgId, now); streamMutexLock(&pTask->lock); SStreamTaskState pState = streamTaskGetStatus(pTask); if (pState.state != TASK_STATUS__CK) { - int32_t ref = atomic_sub_fetch_32(&pTask->status.timerActive, 1); + int32_t ref = streamCleanBeforeQuitTmr(pTmrInfo, pTask); stDebug("s-task:%s vgId:%d not in checkpoint status, quit from monitor checkpoint-trigger, ref:%d", id, vgId, ref); streamMutexUnlock(&pTask->lock); @@ -847,7 +870,7 @@ void checkpointTriggerMonitorFn(void* param, void* tmrId) { // checkpoint-trigger recv flag is set, quit if (pActiveInfo->allUpstreamTriggerRecv) { - int32_t ref = atomic_sub_fetch_32(&pTask->status.timerActive, 1); + int32_t ref = streamCleanBeforeQuitTmr(pTmrInfo, pTask); stDebug("s-task:%s vgId:%d all checkpoint-trigger recv, quit from monitor checkpoint-trigger, ref:%d", id, vgId, ref); @@ -867,6 +890,31 @@ void checkpointTriggerMonitorFn(void* param, void* tmrId) { terrno = TSDB_CODE_OUT_OF_MEMORY; stDebug("s-task:%s start to triggerMonitor, reason:%s", id, tstrerror(terrno)); streamMutexUnlock(&pActiveInfo->lock); + + stDebug("s-task:%s start to monitor checkpoint-trigger in 10s", id); + streamTmrReset(checkpointTriggerMonitorFn, 200, pTask, streamTimer, &pTmrInfo->tmrHandle, vgId, "trigger-recv-monitor"); + return; + } + + if ((pTmrInfo->launchChkptId != pActiveInfo->activeId) || (pActiveInfo->activeId == 0)) { + streamMutexUnlock(&pActiveInfo->lock); + int32_t ref = streamCleanBeforeQuitTmr(pTmrInfo, pTask); + stWarn("s-task:%s vgId:%d checkpoint-trigger retrieve by previous checkpoint procedure, checkpointId:%" PRId64 + ", quit, ref:%d", + 
id, vgId, pTmrInfo->launchChkptId, ref); + + streamMetaReleaseTask(pTask->pMeta, pTask); + return; + } + + // active checkpoint info is cleared for now + if ((pActiveInfo->activeId == 0) || (pActiveInfo->transId == 0) || (pTask->chkInfo.startTs == 0)) { + streamMutexUnlock(&pActiveInfo->lock); + int32_t ref = streamCleanBeforeQuitTmr(pTmrInfo, pTask); + stWarn("s-task:%s vgId:%d active checkpoint may be cleared, quit from retrieve checkpoint-trigger send tmr, ref:%d", + id, vgId, ref); + + streamMetaReleaseTask(pTask->pMeta, pTask); return; } @@ -900,9 +948,9 @@ void checkpointTriggerMonitorFn(void* param, void* tmrId) { // check every 100ms if (size > 0) { stDebug("s-task:%s start to monitor checkpoint-trigger in 10s", id); - streamTmrReset(checkpointTriggerMonitorFn, 100, pTask, streamTimer, &pActiveInfo->pChkptTriggerTmr, vgId, "trigger-recv-monitor"); + streamTmrReset(checkpointTriggerMonitorFn, 200, pTask, streamTimer, &pTmrInfo->tmrHandle, vgId, "trigger-recv-monitor"); } else { - int32_t ref = atomic_sub_fetch_32(&pTask->status.timerActive, 1); + int32_t ref = streamCleanBeforeQuitTmr(pTmrInfo, pTask); stDebug("s-task:%s all checkpoint-trigger recved, quit from monitor checkpoint-trigger tmr, ref:%d", id, ref); streamMetaReleaseTask(pTask->pMeta, pTask); } @@ -1060,11 +1108,8 @@ void streamTaskInitTriggerDispatchInfo(SStreamTask* pTask) { streamMutexUnlock(&pInfo->lock); } -int32_t streamTaskGetNumOfConfirmed(SStreamTask* pTask) { - SActiveCheckpointInfo* pInfo = pTask->chkInfo.pActiveInfo; - +int32_t streamTaskGetNumOfConfirmed(SActiveCheckpointInfo* pInfo) { int32_t num = 0; - streamMutexLock(&pInfo->lock); for (int32_t i = 0; i < taosArrayGetSize(pInfo->pDispatchTriggerList); ++i) { STaskTriggerSendInfo* p = taosArrayGet(pInfo->pDispatchTriggerList, i); if (p == NULL) { @@ -1075,7 +1120,6 @@ int32_t streamTaskGetNumOfConfirmed(SStreamTask* pTask) { num++; } } - streamMutexUnlock(&pInfo->lock); return num; } @@ -1101,9 +1145,9 @@ void 
streamTaskSetTriggerDispatchConfirmed(SStreamTask* pTask, int32_t vgId) { } } + int32_t numOfConfirmed = streamTaskGetNumOfConfirmed(pInfo); streamMutexUnlock(&pInfo->lock); - int32_t numOfConfirmed = streamTaskGetNumOfConfirmed(pTask); int32_t total = streamTaskGetNumOfDownstream(pTask); if (taskId == 0) { stError("s-task:%s recv invalid trigger-dispatch confirm, vgId:%d", pTask->id.idStr, vgId); @@ -1198,7 +1242,7 @@ ECHECKPOINT_BACKUP_TYPE streamGetCheckpointBackupType() { } } -int32_t streamTaskUploadCheckpoint(const char* id, const char* path) { +int32_t streamTaskUploadCheckpoint(const char* id, const char* path, int64_t checkpointId) { int32_t code = 0; if (id == NULL || path == NULL || strlen(id) == 0 || strlen(path) == 0 || strlen(path) >= PATH_MAX) { stError("invalid parameters in upload checkpoint, %s", id); @@ -1206,7 +1250,7 @@ int32_t streamTaskUploadCheckpoint(const char* id, const char* path) { } if (strlen(tsSnodeAddress) != 0) { - code = uploadByRsync(id, path); + code = uploadByRsync(id, path, checkpointId); if (code != 0) { return TAOS_SYSTEM_ERROR(errno); } @@ -1233,14 +1277,14 @@ int32_t downloadCheckpointDataByName(const char* id, const char* fname, const ch return 0; } -int32_t streamTaskDownloadCheckpointData(const char* id, char* path) { +int32_t streamTaskDownloadCheckpointData(const char* id, char* path, int64_t checkpointId) { if (id == NULL || path == NULL || strlen(id) == 0 || strlen(path) == 0 || strlen(path) >= PATH_MAX) { stError("down checkpoint data parameters invalid"); return -1; } if (strlen(tsSnodeAddress) != 0) { - return downloadRsync(id, path); + return downloadByRsync(id, path, checkpointId); } else if (tsS3StreamEnabled) { return s3GetObjectsByPrefix(id, path); } @@ -1281,6 +1325,8 @@ int32_t streamTaskSendRestoreChkptMsg(SStreamTask* pTask) { const char* id = pTask->id.idStr; streamMutexLock(&pTask->lock); + ETaskStatus p = streamTaskGetStatus(pTask).state; + if (pTask->status.sendConsensusChkptId == true) { 
stDebug("s-task:%s already start to consensus-checkpointId, not start again before it completed", id); streamMutexUnlock(&pTask->lock); @@ -1291,9 +1337,15 @@ int32_t streamTaskSendRestoreChkptMsg(SStreamTask* pTask) { streamMutexUnlock(&pTask->lock); + if (pTask->pBackend != NULL) { + streamFreeTaskState(pTask, p); + pTask->pBackend = NULL; + } + ASSERT(pTask->pBackend == NULL); pTask->status.requireConsensusChkptId = true; + stDebug("s-task:%s set the require consensus-checkpointId flag", id); return 0; } diff --git a/source/libs/stream/src/streamDispatch.c b/source/libs/stream/src/streamDispatch.c index dfd9d3e8df..4da108507a 100644 --- a/source/libs/stream/src/streamDispatch.c +++ b/source/libs/stream/src/streamDispatch.c @@ -820,31 +820,32 @@ int32_t initCheckpointReadyMsg(SStreamTask* pTask, int32_t upstreamNodeId, int32 } static void checkpointReadyMsgSendMonitorFn(void* param, void* tmrId) { - SStreamTask* pTask = param; - int32_t vgId = pTask->pMeta->vgId; - const char* id = pTask->id.idStr; + SStreamTask* pTask = param; + int32_t vgId = pTask->pMeta->vgId; + const char* id = pTask->id.idStr; + SActiveCheckpointInfo* pActiveInfo = pTask->chkInfo.pActiveInfo; + SStreamTmrInfo* pTmrInfo = &pActiveInfo->chkptReadyMsgTmr; // check the status every 100ms if (streamTaskShouldStop(pTask)) { - int32_t ref = atomic_sub_fetch_32(&pTask->status.timerActive, 1); + int32_t ref = streamCleanBeforeQuitTmr(pTmrInfo, pTask); stDebug("s-task:%s vgId:%d status:stop, quit from monitor checkpoint-trigger, ref:%d", id, vgId, ref); streamMetaReleaseTask(pTask->pMeta, pTask); return; } - SActiveCheckpointInfo* pActiveInfo = pTask->chkInfo.pActiveInfo; - if (++pActiveInfo->sendReadyCheckCounter < 100) { - streamTmrReset(checkpointReadyMsgSendMonitorFn, 100, pTask, streamTimer, &pActiveInfo->pSendReadyMsgTmr, vgId, "chkpt-ready-monitor"); + if (++pTmrInfo->activeCounter < 50) { + streamTmrReset(checkpointReadyMsgSendMonitorFn, 200, pTask, streamTimer, &pTmrInfo->tmrHandle, vgId, 
"chkpt-ready-monitor"); return; } - pActiveInfo->sendReadyCheckCounter = 0; - stDebug("s-task:%s in sending checkpoint-ready msg monitor timer", id); + pTmrInfo->activeCounter = 0; + stDebug("s-task:%s in sending checkpoint-ready msg monitor tmr", id); streamMutexLock(&pTask->lock); SStreamTaskState pState = streamTaskGetStatus(pTask); if (pState.state != TASK_STATUS__CK) { - int32_t ref = atomic_sub_fetch_32(&pTask->status.timerActive, 1); + int32_t ref = streamCleanBeforeQuitTmr(pTmrInfo, pTask); stDebug("s-task:%s vgId:%d status:%s not in checkpoint, quit from monitor checkpoint-ready send, ref:%d", id, vgId, pState.name, ref); streamMutexUnlock(&pTask->lock); @@ -858,10 +859,21 @@ static void checkpointReadyMsgSendMonitorFn(void* param, void* tmrId) { SArray* pList = pActiveInfo->pReadyMsgList; int32_t num = taosArrayGetSize(pList); - // active checkpoint info is cleared for now - if ((pActiveInfo->activeId == 0) && (pActiveInfo->transId == 0) && (num == 0) && (pTask->chkInfo.startTs == 0)) { + if (pTmrInfo->launchChkptId != pActiveInfo->activeId) { streamMutexUnlock(&pActiveInfo->lock); - int32_t ref = atomic_sub_fetch_32(&pTask->status.timerActive, 1); + int32_t ref = streamCleanBeforeQuitTmr(pTmrInfo, pTask); + stWarn("s-task:%s vgId:%d ready-msg send tmr launched by previous checkpoint procedure, checkpointId:%" PRId64 + ", quit, ref:%d", + id, vgId, pTmrInfo->launchChkptId, ref); + + streamMetaReleaseTask(pTask->pMeta, pTask); + return; + } + + // active checkpoint info is cleared for now + if ((pActiveInfo->activeId == 0) || (pActiveInfo->transId == 0) || (num == 0) || (pTask->chkInfo.startTs == 0)) { + streamMutexUnlock(&pActiveInfo->lock); + int32_t ref = streamCleanBeforeQuitTmr(pTmrInfo, pTask); stWarn("s-task:%s vgId:%d active checkpoint may be cleared, quit from readyMsg send tmr, ref:%d", id, vgId, ref); streamMetaReleaseTask(pTask->pMeta, pTask); @@ -923,10 +935,10 @@ static void checkpointReadyMsgSendMonitorFn(void* param, void* tmrId) { } } - 
streamTmrReset(checkpointReadyMsgSendMonitorFn, 100, pTask, streamTimer, &pActiveInfo->pSendReadyMsgTmr, vgId, "chkpt-ready-monitor"); + streamTmrReset(checkpointReadyMsgSendMonitorFn, 200, pTask, streamTimer, &pTmrInfo->tmrHandle, vgId, "chkpt-ready-monitor"); streamMutexUnlock(&pActiveInfo->lock); } else { - int32_t ref = atomic_sub_fetch_32(&pTask->status.timerActive, 1); + int32_t ref = streamCleanBeforeQuitTmr(pTmrInfo, pTask); stDebug( "s-task:%s vgId:%d recv of checkpoint-ready msg confirmed by all upstream task(s), clear checkpoint-ready msg " "and quit from timer, ref:%d", @@ -975,22 +987,32 @@ int32_t streamTaskSendCheckpointReadyMsg(SStreamTask* pTask) { } } - streamMutexUnlock(&pActiveInfo->lock); stDebug("s-task:%s level:%d checkpoint-ready msg sent to all %d upstreams", id, pTask->info.taskLevel, num); // start to check if checkpoint ready msg has successfully received by upstream tasks. if (pTask->info.taskLevel == TASK_LEVEL__SINK || pTask->info.taskLevel == TASK_LEVEL__AGG) { - int32_t ref = atomic_add_fetch_32(&pTask->status.timerActive, 1); - stDebug("s-task:%s start checkpoint-ready monitor in 10s, ref:%d ", pTask->id.idStr, ref); - streamMetaAcquireOneTask(pTask); + SStreamTmrInfo* pTmrInfo = &pActiveInfo->chkptReadyMsgTmr; - if (pActiveInfo->pSendReadyMsgTmr == NULL) { - pActiveInfo->pSendReadyMsgTmr = taosTmrStart(checkpointReadyMsgSendMonitorFn, 100, pTask, streamTimer); + int8_t old = atomic_val_compare_exchange_8(&pTmrInfo->isActive, 0, 1); + if (old == 0) { + int32_t ref = atomic_add_fetch_32(&pTask->status.timerActive, 1); + stDebug("s-task:%s start checkpoint-ready monitor in 10s, ref:%d ", pTask->id.idStr, ref); + streamMetaAcquireOneTask(pTask); + + if (pTmrInfo->tmrHandle == NULL) { + pTmrInfo->tmrHandle = taosTmrStart(checkpointReadyMsgSendMonitorFn, 200, pTask, streamTimer); + } else { + streamTmrReset(checkpointReadyMsgSendMonitorFn, 200, pTask, streamTimer, &pTmrInfo->tmrHandle, vgId, "chkpt-ready-monitor"); + } + + // mark the 
timer monitor checkpointId + pTmrInfo->launchChkptId = pActiveInfo->activeId; } else { - streamTmrReset(checkpointReadyMsgSendMonitorFn, 100, pTask, streamTimer, &pActiveInfo->pSendReadyMsgTmr, vgId, "chkpt-ready-monitor"); + stError("s-task:%s previous checkpoint-ready monitor tmr is set, not start new one", pTask->id.idStr); } } + streamMutexUnlock(&pActiveInfo->lock); return TSDB_CODE_SUCCESS; } @@ -1061,7 +1083,7 @@ int32_t streamAddBlockIntoDispatchMsg(const SSDataBlock* pBlock, SStreamDispatch int32_t doSendDispatchMsg(SStreamTask* pTask, const SStreamDispatchReq* pReq, int32_t vgId, SEpSet* pEpSet) { void* buf = NULL; - int32_t code = -1; + int32_t code = 0; SRpcMsg msg = {0}; // serialize @@ -1071,9 +1093,9 @@ int32_t doSendDispatchMsg(SStreamTask* pTask, const SStreamDispatchReq* pReq, in goto FAIL; } - code = -1; buf = rpcMallocCont(sizeof(SMsgHead) + tlen); if (buf == NULL) { + code = terrno; goto FAIL; } @@ -1097,6 +1119,10 @@ FAIL: rpcFreeCont(buf); } + if (code == -1) { + code = TSDB_CODE_INVALID_MSG; + } + return code; } @@ -1267,17 +1293,18 @@ static int32_t handleDispatchSuccessRsp(SStreamTask* pTask, int32_t downstreamId } } -static bool setDispatchRspInfo(SDispatchMsgInfo* pMsgInfo, int32_t vgId, int32_t code, int64_t now, int32_t* pNotRsp, const char* id) { +static bool setDispatchRspInfo(SDispatchMsgInfo* pMsgInfo, int32_t vgId, int32_t code, int64_t now, int32_t* pNotRsp, + int32_t* pFailed, const char* id) { int32_t numOfRsp = 0; - bool alreadySet = false; - bool updated = false; - bool allRsp = false; - *pNotRsp = 0; + int32_t numOfFailed = 0; - streamMutexLock(&pMsgInfo->lock); + bool allRsp = false; int32_t numOfDispatchBranch = taosArrayGetSize(pMsgInfo->pSendInfo); - for(int32_t i = 0; i < numOfDispatchBranch; ++i) { + *pNotRsp = 0; + *pFailed = 0; + + for (int32_t i = 0; i < numOfDispatchBranch; ++i) { SDispatchEntry* pEntry = taosArrayGet(pMsgInfo->pSendInfo, i); if (pEntry == NULL) { continue; @@ -1295,24 +1322,34 @@ static bool 
setDispatchRspInfo(SDispatchMsgInfo* pMsgInfo, int32_t vgId, int32_t } if (pEntry->nodeId == vgId) { - ASSERT(!alreadySet); - pEntry->rspTs = now; - pEntry->status = code; - alreadySet = true; - updated = true; - numOfRsp += 1; + if (pEntry->rspTs != -1) { + stDebug("s-task:%s dispatch rsp has already recved at:%" PRId64 ", ignore this rsp, msgId:%d", id, + pEntry->rspTs, pMsgInfo->msgId); + allRsp = false; + } else { + pEntry->rspTs = now; + pEntry->status = code; + numOfRsp += 1; + allRsp = (numOfRsp == numOfDispatchBranch); - stDebug("s-task:%s record the rsp recv, ts:%" PRId64 " code:%d, idx:%d, total recv:%d/%d", id, now, code, j, - numOfRsp, numOfDispatchBranch); + stDebug("s-task:%s record the rsp recv, ts:%" PRId64 " code:%d, idx:%d, total recv:%d/%d", id, now, code, j, + numOfRsp, numOfDispatchBranch); + } + break; } } + // this code may be error code. + for (int32_t i = 0; i < numOfDispatchBranch; ++i) { + SDispatchEntry* pEntry = taosArrayGet(pMsgInfo->pSendInfo, i); + if (pEntry->status != TSDB_CODE_SUCCESS || isDispatchRspTimeout(pEntry, now)) { + numOfFailed += 1; + } + } + + *pFailed = numOfFailed; *pNotRsp = numOfDispatchBranch - numOfRsp; - allRsp = (numOfRsp == numOfDispatchBranch); - streamMutexUnlock(&pMsgInfo->lock); - - ASSERT(updated); return allRsp; } @@ -1345,15 +1382,23 @@ int32_t streamProcessDispatchRsp(SStreamTask* pTask, SStreamDispatchRsp* pRsp, i int64_t now = taosGetTimestampMs(); bool allRsp = false; int32_t notRsp = 0; + int32_t numOfFailed = 0; + bool triggerDispatchRsp = false; + + // we only set the dispatch msg info for current checkpoint trans + streamMutexLock(&pTask->lock); + triggerDispatchRsp = (streamTaskGetStatus(pTask).state == TASK_STATUS__CK) && + (pTask->chkInfo.pActiveInfo->activeId == pMsgInfo->checkpointId); + streamMutexUnlock(&pTask->lock); streamMutexLock(&pMsgInfo->lock); - int32_t msgId = pMsgInfo->msgId; - streamMutexUnlock(&pMsgInfo->lock); + int32_t msgId = pMsgInfo->msgId; // follower not handle the 
dispatch rsp if ((pTask->pMeta->role == NODE_ROLE_FOLLOWER) || (pTask->status.downstreamReady != 1)) { stError("s-task:%s vgId:%d is follower or task just re-launched, not handle the dispatch rsp, discard it", id, vgId); + streamMutexUnlock(&pMsgInfo->lock); return TSDB_CODE_STREAM_TASK_NOT_EXIST; } @@ -1362,6 +1407,7 @@ int32_t streamProcessDispatchRsp(SStreamTask* pTask, SStreamDispatchRsp* pRsp, i stError("s-task:%s vgId:%d not expect rsp, expected: msgId:%d, stage:%" PRId64 " actual msgId:%d, stage:%" PRId64 " discard it", id, vgId, msgId, pTask->pMeta->stage, pRsp->msgId, pRsp->stage); + streamMutexUnlock(&pMsgInfo->lock); return TSDB_CODE_INVALID_MSG; } @@ -1373,18 +1419,18 @@ int32_t streamProcessDispatchRsp(SStreamTask* pTask, SStreamDispatchRsp* pRsp, i if (code == TSDB_CODE_STREAM_TASK_NOT_EXIST) { // destination task does not exist, not retry anymore stError("s-task:%s failed to dispatch msg to task:0x%x(vgId:%d), msgId:%d no retry, since task destroyed already", id, pRsp->downstreamTaskId, pRsp->downstreamNodeId, msgId); - allRsp = setDispatchRspInfo(pMsgInfo, pRsp->downstreamNodeId, TSDB_CODE_SUCCESS, now, &notRsp, id); + allRsp = setDispatchRspInfo(pMsgInfo, pRsp->downstreamNodeId, TSDB_CODE_SUCCESS, now, &notRsp, &numOfFailed, id); } else { stError("s-task:%s failed to dispatch msgId:%d to task:0x%x(vgId:%d), code:%s, add to retry list", id, msgId, pRsp->downstreamTaskId, pRsp->downstreamNodeId, tstrerror(code)); - allRsp = setDispatchRspInfo(pMsgInfo, pRsp->downstreamNodeId, code, now, &notRsp, id); + allRsp = setDispatchRspInfo(pMsgInfo, pRsp->downstreamNodeId, code, now, &notRsp, &numOfFailed, id); } } else { // code == 0 if (pRsp->inputStatus == TASK_INPUT_STATUS__BLOCKED) { pTask->inputq.status = TASK_INPUT_STATUS__BLOCKED; // block the input of current task, to push pressure to upstream - allRsp = setDispatchRspInfo(pMsgInfo, pRsp->downstreamNodeId, pRsp->inputStatus, now, &notRsp, id); + allRsp = setDispatchRspInfo(pMsgInfo, pRsp->downstreamNodeId, 
pRsp->inputStatus, now, &notRsp, &numOfFailed, id); stTrace("s-task:%s inputQ of downstream task:0x%x(vgId:%d) is full, wait for retry dispatch", id, pRsp->downstreamTaskId, pRsp->downstreamNodeId); } else { @@ -1396,15 +1442,13 @@ int32_t streamProcessDispatchRsp(SStreamTask* pTask, SStreamDispatchRsp* pRsp, i id, pRsp->downstreamTaskId, pRsp->downstreamNodeId); } - allRsp = setDispatchRspInfo(pMsgInfo, pRsp->downstreamNodeId, TSDB_CODE_SUCCESS, now, &notRsp, id); + allRsp = setDispatchRspInfo(pMsgInfo, pRsp->downstreamNodeId, TSDB_CODE_SUCCESS, now, &notRsp, &numOfFailed, id); { bool delayDispatch = (pMsgInfo->dispatchMsgType == STREAM_INPUT__CHECKPOINT_TRIGGER); if (delayDispatch) { - streamMutexLock(&pTask->lock); // we only set the dispatch msg info for current checkpoint trans - if (streamTaskGetStatus(pTask).state == TASK_STATUS__CK && - pTask->chkInfo.pActiveInfo->activeId == pMsgInfo->checkpointId) { + if (triggerDispatchRsp) { ASSERT(pTask->chkInfo.pActiveInfo->transId == pMsgInfo->transId); stDebug("s-task:%s checkpoint-trigger msg to 0x%x rsp for checkpointId:%" PRId64 " transId:%d confirmed", pTask->id.idStr, pRsp->downstreamTaskId, pMsgInfo->checkpointId, pMsgInfo->transId); @@ -1415,12 +1459,13 @@ int32_t streamProcessDispatchRsp(SStreamTask* pTask, SStreamDispatchRsp* pRsp, i " transId:%d discard, since expired", pTask->id.idStr, pMsgInfo->checkpointId, pMsgInfo->transId); } - streamMutexUnlock(&pTask->lock); } } } } + streamMutexUnlock(&pMsgInfo->lock); + if (pTask->outputInfo.type == TASK_OUTPUT__SHUFFLE_DISPATCH) { if (!allRsp) { stDebug( @@ -1439,29 +1484,25 @@ int32_t streamProcessDispatchRsp(SStreamTask* pTask, SStreamDispatchRsp* pRsp, i } // all msg rsp already, continue - if (allRsp) { - ASSERT(pTask->outputq.status == TASK_OUTPUT_STATUS__WAIT); + // we need to re-try send dispatch msg to downstream tasks + if (allRsp && (numOfFailed == 0)) { + // trans-state msg has been sent to downstream successfully. 
let's transfer the fill-history task state + if (pMsgInfo->dispatchMsgType == STREAM_INPUT__TRANS_STATE) { + stDebug("s-task:%s dispatch trans-state msgId:%d to downstream successfully, start to prepare transfer state", id, + msgId); + ASSERT(pTask->info.fillHistory == 1); - // we need to re-try send dispatch msg to downstream tasks - int32_t numOfFailed = getFailedDispatchInfo(pMsgInfo, now); - if (numOfFailed == 0) { // this message has been sent successfully, let's try next one. - // trans-state msg has been sent to downstream successfully. let's transfer the fill-history task state - if (pMsgInfo->dispatchMsgType == STREAM_INPUT__TRANS_STATE) { - stDebug("s-task:%s dispatch trans-state msgId:%d to downstream successfully, start to prepare transfer state", - id, msgId); - ASSERT(pTask->info.fillHistory == 1); - - code = streamTransferStatePrepare(pTask); - if (code != TSDB_CODE_SUCCESS) { // todo: do nothing if error happens - } - - clearBufferedDispatchMsg(pTask); - - // now ready for next data output - atomic_store_8(&pTask->outputq.status, TASK_OUTPUT_STATUS__NORMAL); - } else { - code = handleDispatchSuccessRsp(pTask, pRsp->downstreamTaskId, pRsp->downstreamNodeId); + code = streamTransferStatePrepare(pTask); + if (code != TSDB_CODE_SUCCESS) { // todo: do nothing if error happens } + + clearBufferedDispatchMsg(pTask); + + // now ready for next data output + atomic_store_8(&pTask->outputq.status, TASK_OUTPUT_STATUS__NORMAL); + } else { + // this message has been sent successfully, let's try next one. 
+ code = handleDispatchSuccessRsp(pTask, pRsp->downstreamTaskId, pRsp->downstreamNodeId); } } diff --git a/source/libs/stream/src/streamHb.c b/source/libs/stream/src/streamHb.c index 9804943ec2..d2c5cb05b7 100644 --- a/source/libs/stream/src/streamHb.c +++ b/source/libs/stream/src/streamHb.c @@ -142,11 +142,12 @@ int32_t streamMetaSendHbHelper(SStreamMeta* pMeta) { } SStreamHbMsg* pMsg = &pInfo->hbMsg; - stDebug("vgId:%d build stream hbMsg, leader:%d msgId:%d", pMeta->vgId, (pMeta->role == NODE_ROLE_LEADER), - pMeta->pHbInfo->hbCount); - pMsg->vgId = pMeta->vgId; pMsg->msgId = pMeta->pHbInfo->hbCount; + pMsg->ts = taosGetTimestampMs(); + + stDebug("vgId:%d build stream hbMsg, leader:%d HbMsgId:%d, HbMsgTs:%" PRId64, pMeta->vgId, + (pMeta->role == NODE_ROLE_LEADER), pMsg->msgId, pMsg->ts); pMsg->pTaskStatus = taosArrayInit(numOfTasks, sizeof(STaskStatusEntry)); pMsg->pUpdateNodes = taosArrayInit(numOfTasks, sizeof(int32_t)); @@ -292,14 +293,14 @@ void streamMetaHbToMnode(void* param, void* tmrId) { streamMetaRLock(pMeta); code = streamMetaSendHbHelper(pMeta); if (code) { - stError("vgId:%d failed to send hmMsg to mnode, try again in 5s, code:%s", pMeta->vgId, strerror(code)); + stError("vgId:%d failed to send hmMsg to mnode, try again in 5s, code:%s", pMeta->vgId, tstrerror(code)); } - streamMetaRUnLock(pMeta); + streamTmrReset(streamMetaHbToMnode, META_HB_CHECK_INTERVAL, param, streamTimer, &pMeta->pHbInfo->hbTmr, pMeta->vgId, "meta-hb-tmr"); - code = taosReleaseRef(streamMetaId, rid); + if (code) { stError("vgId:%d in meta timer, failed to release the meta rid:%" PRId64, pMeta->vgId, rid); } diff --git a/source/libs/stream/src/streamMeta.c b/source/libs/stream/src/streamMeta.c index 321027c293..5ed9f274a2 100644 --- a/source/libs/stream/src/streamMeta.c +++ b/source/libs/stream/src/streamMeta.c @@ -318,7 +318,19 @@ int32_t streamTaskSetDb(SStreamMeta* pMeta, SStreamTask* pTask, const char* key) pBackend->pTask = pTask; pBackend->pMeta = pMeta; - if (processVer != 
-1) pTask->chkInfo.processedVer = processVer; + if (processVer != -1) { + if (pTask->chkInfo.processedVer != processVer) { + stWarn("s-task:%s vgId:%d update checkpointVer:%" PRId64 "->%" PRId64 " for checkpointId:%" PRId64, + pTask->id.idStr, pTask->pMeta->vgId, pTask->chkInfo.processedVer, processVer, pTask->chkInfo.checkpointId); + pTask->chkInfo.processedVer = processVer; + pTask->chkInfo.checkpointVer = processVer; + pTask->chkInfo.nextProcessVer = processVer + 1; + } else { + stInfo("s-task:%s vgId:%d processedVer:%" PRId64 + " in task meta equals to data in checkpoint data for checkpointId:%" PRId64, + pTask->id.idStr, pTask->pMeta->vgId, pTask->chkInfo.processedVer, pTask->chkInfo.checkpointId); + } + } code = taosHashPut(pMeta->pTaskDbUnique, key, strlen(key), &pBackend, sizeof(void*)); if (code) { @@ -1140,6 +1152,20 @@ void streamMetaNotifyClose(SStreamMeta* pMeta) { taosMsleep(100); } + streamMetaRLock(pMeta); + + SArray* pTaskList = NULL; + int32_t code = streamMetaSendMsgBeforeCloseTasks(pMeta, &pTaskList); + if (code != TSDB_CODE_SUCCESS) { +// return code; + } + + streamMetaRUnLock(pMeta); + + if (pTaskList != NULL) { + taosArrayDestroy(pTaskList); + } + int64_t el = taosGetTimestampMs() - st; stDebug("vgId:%d all stream tasks are not in timer, continue close, elapsed time:%" PRId64 " ms", pMeta->vgId, el); } @@ -1393,7 +1419,6 @@ int32_t streamMetaStartAllTasks(SStreamMeta* pMeta) { } // negotiate the consensus checkpoint id for current task - ASSERT(pTask->pBackend == NULL); code = streamTaskSendRestoreChkptMsg(pTask); // this task may has no checkpoint, but others tasks may generate checkpoint already? 
diff --git a/source/libs/stream/src/streamMsg.c b/source/libs/stream/src/streamMsg.c index 1a48b594ea..6fe2d818b3 100644 --- a/source/libs/stream/src/streamMsg.c +++ b/source/libs/stream/src/streamMsg.c @@ -398,6 +398,7 @@ int32_t tEncodeStreamHbMsg(SEncoder* pEncoder, const SStreamHbMsg* pReq) { } if (tEncodeI32(pEncoder, pReq->msgId) < 0) return -1; + if (tEncodeI64(pEncoder, pReq->ts) < 0) return -1; tEndEncode(pEncoder); return pEncoder->pos; } @@ -470,6 +471,7 @@ int32_t tDecodeStreamHbMsg(SDecoder* pDecoder, SStreamHbMsg* pReq) { } if (tDecodeI32(pDecoder, &pReq->msgId) < 0) return -1; + if (tDecodeI64(pDecoder, &pReq->ts) < 0) return -1; tEndDecode(pDecoder); return 0; diff --git a/source/libs/stream/src/streamSched.c b/source/libs/stream/src/streamSched.c index 6506d449a6..e8c7be5204 100644 --- a/source/libs/stream/src/streamSched.c +++ b/source/libs/stream/src/streamSched.c @@ -107,7 +107,7 @@ void streamTaskResumeHelper(void* param, void* tmrId) { int32_t code = streamTaskSchedTask(pTask->pMsgCb, pTask->info.nodeId, pId->streamId, pId->taskId, STREAM_EXEC_T_RESUME_TASK); int32_t ref = atomic_sub_fetch_32(&pTask->status.timerActive, 1); if (code) { - stError("s-task:%s sched task failed, code:%s, ref:%d", pId->idStr, strerror(code), ref); + stError("s-task:%s sched task failed, code:%s, ref:%d", pId->idStr, tstrerror(code), ref); } else { stDebug("trigger to resume s-task:%s after being idled for %dms, ref:%d", pId->idStr, pTask->status.schedIdleTime, ref); diff --git a/source/libs/stream/src/streamTask.c b/source/libs/stream/src/streamTask.c index 59ce9e8d42..c0b2b16d30 100644 --- a/source/libs/stream/src/streamTask.c +++ b/source/libs/stream/src/streamTask.c @@ -321,7 +321,7 @@ void streamFreeTaskState(SStreamTask* pTask, int8_t remove) { stDebug("s-task:0x%x start to free task state", pTask->id.taskId); streamStateClose(pTask->pState, remove); - taskDbSetClearFileFlag(pTask->pBackend); + if (remove)taskDbSetClearFileFlag(pTask->pBackend); 
taskDbRemoveRef(pTask->pBackend); pTask->pBackend = NULL; pTask->pState = NULL; @@ -1140,14 +1140,16 @@ void streamTaskDestroyActiveChkptInfo(SActiveCheckpointInfo* pInfo) { taosArrayDestroy(pInfo->pCheckpointReadyRecvList); pInfo->pCheckpointReadyRecvList = NULL; - if (pInfo->pChkptTriggerTmr != NULL) { - (void) taosTmrStop(pInfo->pChkptTriggerTmr); - pInfo->pChkptTriggerTmr = NULL; + SStreamTmrInfo* pTriggerTmr = &pInfo->chkptTriggerMsgTmr; + if (pTriggerTmr->tmrHandle != NULL) { + (void) taosTmrStop(pTriggerTmr->tmrHandle); + pTriggerTmr->tmrHandle = NULL; } - if (pInfo->pSendReadyMsgTmr != NULL) { - (void) taosTmrStop(pInfo->pSendReadyMsgTmr); - pInfo->pSendReadyMsgTmr = NULL; + SStreamTmrInfo* pReadyTmr = &pInfo->chkptReadyMsgTmr; + if (pReadyTmr->tmrHandle != NULL) { + (void) taosTmrStop(pReadyTmr->tmrHandle); + pReadyTmr->tmrHandle = NULL; } taosMemoryFree(pInfo); diff --git a/source/libs/stream/src/streamTaskSm.c b/source/libs/stream/src/streamTaskSm.c index a54c17df03..8fd26dda27 100644 --- a/source/libs/stream/src/streamTaskSm.c +++ b/source/libs/stream/src/streamTaskSm.c @@ -96,12 +96,6 @@ static int32_t attachWaitedEvent(SStreamTask* pTask, SFutureHandleEventInfo* pEv } } -static int32_t stopTaskSuccFn(SStreamTask* pTask) { - SStreamTaskSM* pSM = pTask->status.pSM; - streamFreeTaskState(pTask,pSM->current.state == TASK_STATUS__DROPPING ? 
1 : 0); - return TSDB_CODE_SUCCESS; -} - int32_t streamTaskInitStatus(SStreamTask* pTask) { pTask->execInfo.checkTs = taosGetTimestampMs(); stDebug("s-task:%s start init, and check downstream tasks, set the init ts:%" PRId64, pTask->id.idStr, @@ -698,21 +692,21 @@ void doInitStateTransferTable(void) { // resume is completed by restore status of state-machine // stop related event - trans = createStateTransform(TASK_STATUS__READY, TASK_STATUS__STOP, TASK_EVENT_STOP, NULL, stopTaskSuccFn, NULL); + trans = createStateTransform(TASK_STATUS__READY, TASK_STATUS__STOP, TASK_EVENT_STOP, NULL, NULL, NULL); CHECK_RET_VAL(taosArrayPush(streamTaskSMTrans, &trans)); - trans = createStateTransform(TASK_STATUS__DROPPING, TASK_STATUS__STOP, TASK_EVENT_STOP, NULL, stopTaskSuccFn, NULL); + trans = createStateTransform(TASK_STATUS__DROPPING, TASK_STATUS__STOP, TASK_EVENT_STOP, NULL, NULL, NULL); CHECK_RET_VAL(taosArrayPush(streamTaskSMTrans, &trans)); - trans = createStateTransform(TASK_STATUS__UNINIT, TASK_STATUS__STOP, TASK_EVENT_STOP, NULL, stopTaskSuccFn, NULL); + trans = createStateTransform(TASK_STATUS__UNINIT, TASK_STATUS__STOP, TASK_EVENT_STOP, NULL, NULL, NULL); CHECK_RET_VAL(taosArrayPush(streamTaskSMTrans, &trans)); - trans = createStateTransform(TASK_STATUS__STOP, TASK_STATUS__STOP, TASK_EVENT_STOP, NULL, stopTaskSuccFn, NULL); + trans = createStateTransform(TASK_STATUS__STOP, TASK_STATUS__STOP, TASK_EVENT_STOP, NULL, NULL, NULL); CHECK_RET_VAL(taosArrayPush(streamTaskSMTrans, &trans)); - trans = createStateTransform(TASK_STATUS__SCAN_HISTORY, TASK_STATUS__STOP, TASK_EVENT_STOP, NULL, stopTaskSuccFn, NULL); + trans = createStateTransform(TASK_STATUS__SCAN_HISTORY, TASK_STATUS__STOP, TASK_EVENT_STOP, NULL, NULL, NULL); CHECK_RET_VAL(taosArrayPush(streamTaskSMTrans, &trans)); - trans = createStateTransform(TASK_STATUS__HALT, TASK_STATUS__STOP, TASK_EVENT_STOP, NULL, stopTaskSuccFn, NULL); + trans = createStateTransform(TASK_STATUS__HALT, TASK_STATUS__STOP, TASK_EVENT_STOP, 
NULL, NULL, NULL); CHECK_RET_VAL(taosArrayPush(streamTaskSMTrans, &trans)); - trans = createStateTransform(TASK_STATUS__PAUSE, TASK_STATUS__STOP, TASK_EVENT_STOP, NULL, stopTaskSuccFn, NULL); + trans = createStateTransform(TASK_STATUS__PAUSE, TASK_STATUS__STOP, TASK_EVENT_STOP, NULL, NULL, NULL); CHECK_RET_VAL(taosArrayPush(streamTaskSMTrans, &trans)); - trans = createStateTransform(TASK_STATUS__CK, TASK_STATUS__STOP, TASK_EVENT_STOP, NULL, stopTaskSuccFn, NULL); + trans = createStateTransform(TASK_STATUS__CK, TASK_STATUS__STOP, TASK_EVENT_STOP, NULL, NULL, NULL); CHECK_RET_VAL(taosArrayPush(streamTaskSMTrans, &trans)); // dropping related event diff --git a/source/libs/stream/src/streamTimer.c b/source/libs/stream/src/streamTimer.c index fb1740ae0a..8f2fc83b72 100644 --- a/source/libs/stream/src/streamTimer.c +++ b/source/libs/stream/src/streamTimer.c @@ -50,3 +50,13 @@ void streamTmrReset(TAOS_TMR_CALLBACK fp, int32_t mseconds, void* param, void* h // stError("vgId:%d failed to reset tmr: %s, try again", vgId, pMsg); // } } + +int32_t streamCleanBeforeQuitTmr(SStreamTmrInfo* pInfo, SStreamTask* pTask) { + pInfo->activeCounter = 0; + pInfo->launchChkptId = 0; + atomic_store_8(&pInfo->isActive, 0); + + int32_t ref = atomic_sub_fetch_32(&pTask->status.timerActive, 1); + ASSERT(ref >= 0); + return ref; +} \ No newline at end of file diff --git a/source/libs/sync/src/syncMain.c b/source/libs/sync/src/syncMain.c index 171b73eba7..fd1d3e371e 100644 --- a/source/libs/sync/src/syncMain.c +++ b/source/libs/sync/src/syncMain.c @@ -1287,7 +1287,7 @@ SSyncNode* syncNodeOpen(SSyncInfo* pSyncInfo, int32_t vnodeVersion) { } // tools - (void)syncRespMgrCreate(pSyncNode, SYNC_RESP_TTL_MS, &pSyncNode->pSyncRespMgr); // TODO: check return value + (void)syncRespMgrCreate(pSyncNode, SYNC_RESP_TTL_MS, &pSyncNode->pSyncRespMgr); // TODO: check return value if (pSyncNode->pSyncRespMgr == NULL) { sError("vgId:%d, failed to create SyncRespMgr", pSyncNode->vgId); goto _error; @@ -1407,7 
+1407,8 @@ int32_t syncNodeRestore(SSyncNode* pSyncNode) { pSyncNode->commitIndex = TMAX(pSyncNode->commitIndex, commitIndex); sInfo("vgId:%d, restore sync until commitIndex:%" PRId64, pSyncNode->vgId, pSyncNode->commitIndex); - if (pSyncNode->fsmState != SYNC_FSM_STATE_INCOMPLETE && (code = syncLogBufferCommit(pSyncNode->pLogBuf, pSyncNode, pSyncNode->commitIndex)) < 0) { + if (pSyncNode->fsmState != SYNC_FSM_STATE_INCOMPLETE && + (code = syncLogBufferCommit(pSyncNode->pLogBuf, pSyncNode, pSyncNode->commitIndex)) < 0) { TAOS_RETURN(code); } @@ -2187,7 +2188,7 @@ void syncNodeCandidate2Leader(SSyncNode* pSyncNode) { } SyncIndex lastIndex = pSyncNode->pLogStore->syncLogLastIndex(pSyncNode->pLogStore); - ASSERT(lastIndex >= 0); + // ASSERT(lastIndex >= 0); sInfo("vgId:%d, become leader. term:%" PRId64 ", commit index:%" PRId64 ", last index:%" PRId64 "", pSyncNode->vgId, raftStoreGetTerm(pSyncNode), pSyncNode->commitIndex, lastIndex); } diff --git a/source/libs/sync/src/syncPipeline.c b/source/libs/sync/src/syncPipeline.c index ef2cbece79..782d97f789 100644 --- a/source/libs/sync/src/syncPipeline.c +++ b/source/libs/sync/src/syncPipeline.c @@ -892,7 +892,7 @@ int32_t syncLogReplRecover(SSyncLogReplMgr* pMgr, SSyncNode* pNode, SyncAppendEn if (pMsg->matchIndex < pNode->pLogBuf->matchIndex) { code = syncLogReplGetPrevLogTerm(pMgr, pNode, index + 1, &term); - if (term < 0 && (errno == ENFILE || errno == EMFILE)) { + if (term < 0 && (errno == ENFILE || errno == EMFILE || errno == ENOENT)) { sError("vgId:%d, failed to get prev log term since %s. 
index:%" PRId64, pNode->vgId, tstrerror(code), index + 1); TAOS_RETURN(code); } diff --git a/source/libs/sync/test/sync_test_lib/src/syncIO.c b/source/libs/sync/test/sync_test_lib/src/syncIO.c index 11894f7853..f5a32b98d9 100644 --- a/source/libs/sync/test/sync_test_lib/src/syncIO.c +++ b/source/libs/sync/test/sync_test_lib/src/syncIO.c @@ -193,7 +193,7 @@ static int32_t syncIOStartInternal(SSyncIO *io) { io->clientRpc = rpcOpen(&rpcInit); if (io->clientRpc == NULL) { sError("failed to initialize RPC"); - return -1; + return terrno; } } @@ -214,7 +214,7 @@ static int32_t syncIOStartInternal(SSyncIO *io) { void *pRpc = rpcOpen(&rpcInit); if (pRpc == NULL) { sError("failed to start RPC server"); - return -1; + return terrno; } } diff --git a/source/libs/tdb/src/db/tdbBtree.c b/source/libs/tdb/src/db/tdbBtree.c index 419494dc84..59b7f4af2e 100644 --- a/source/libs/tdb/src/db/tdbBtree.c +++ b/source/libs/tdb/src/db/tdbBtree.c @@ -194,7 +194,11 @@ int tdbBtreeInsert(SBTree *pBt, const void *pKey, int kLen, const void *pVal, in int idx; int c; - (void)tdbBtcOpen(&btc, pBt, pTxn); + ret = tdbBtcOpen(&btc, pBt, pTxn); + if (ret) { + tdbError("tdb/btree-insert: btc open failed with ret: %d.", ret); + return ret; + } tdbTrace("tdb insert, btc: %p, pTxn: %p", &btc, pTxn); @@ -235,7 +239,11 @@ int tdbBtreeDelete(SBTree *pBt, const void *pKey, int kLen, TXN *pTxn) { int c; int ret; - (void)tdbBtcOpen(&btc, pBt, pTxn); + ret = tdbBtcOpen(&btc, pBt, pTxn); + if (ret) { + tdbError("tdb/btree-delete: btc open failed with ret: %d.", ret); + return ret; + } /* btc.coder.ofps = taosArrayInit(8, sizeof(SPage *)); // btc.coder.ofps = taosArrayInit(8, sizeof(SPgno)); @@ -337,7 +345,11 @@ int tdbBtreePGet(SBTree *pBt, const void *pKey, int kLen, void **ppKey, int *pkL void *pTVal = NULL; SCellDecoder cd = {0}; - (void)tdbBtcOpen(&btc, pBt, NULL); + ret = tdbBtcOpen(&btc, pBt, NULL); + if (ret) { + tdbError("tdb/btree-pget: btc open failed with ret: %d.", ret); + return ret; + } 
tdbTrace("tdb pget, btc: %p", &btc); @@ -1660,12 +1672,14 @@ int tdbBtcOpen(SBTC *pBtc, SBTree *pBt, TXN *pTxn) { if (pTxn == NULL) { TXN *pTxn = tdbOsCalloc(1, sizeof(*pTxn)); if (!pTxn) { + pBtc->pTxn = NULL; return terrno; } int32_t ret = tdbTxnOpen(pTxn, 0, tdbDefaultMalloc, tdbDefaultFree, NULL, 0); if (ret < 0) { tdbOsFree(pTxn); + pBtc->pTxn = NULL; return ret; } diff --git a/source/libs/tdb/src/db/tdbTable.c b/source/libs/tdb/src/db/tdbTable.c index d885c38864..dacca1aea1 100644 --- a/source/libs/tdb/src/db/tdbTable.c +++ b/source/libs/tdb/src/db/tdbTable.c @@ -224,7 +224,10 @@ int tdbTbcOpen(TTB *pTb, TBC **ppTbc, TXN *pTxn) { return -1; } - (void)tdbBtcOpen(&pTbc->btc, pTb->pBt, pTxn); + if (tdbBtcOpen(&pTbc->btc, pTb->pBt, pTxn)) { + taosMemoryFree(pTbc); + return -1; + } *ppTbc = pTbc; return 0; diff --git a/source/libs/transport/inc/transComm.h b/source/libs/transport/inc/transComm.h index e66941244c..820075787f 100644 --- a/source/libs/transport/inc/transComm.h +++ b/source/libs/transport/inc/transComm.h @@ -148,7 +148,6 @@ typedef struct { STransSyncMsg* pSyncMsg; // for syncchronous with timeout API int64_t syncMsgRef; SCvtAddr cvtAddr; - bool setMaxRetry; int32_t retryMinInterval; int32_t retryMaxInterval; @@ -207,7 +206,7 @@ typedef struct { #pragma pack(pop) -typedef enum { Normal, Quit, Release, Register, Update } STransMsgType; +typedef enum { Normal, Quit, Release, Register, Update, FreeById } STransMsgType; typedef enum { ConnNormal, ConnAcquire, ConnRelease, ConnBroken, ConnInPool } ConnStatus; #define container_of(ptr, type, member) ((type*)((char*)(ptr)-offsetof(type, member))) @@ -304,10 +303,10 @@ int32_t transClearBuffer(SConnBuffer* buf); int32_t transDestroyBuffer(SConnBuffer* buf); int32_t transAllocBuffer(SConnBuffer* connBuf, uv_buf_t* uvBuf); bool transReadComplete(SConnBuffer* connBuf); -int transResetBuffer(SConnBuffer* connBuf, int8_t resetBuf); -int transDumpFromBuffer(SConnBuffer* connBuf, char** buf, int8_t resetBuf); 
+int32_t transResetBuffer(SConnBuffer* connBuf, int8_t resetBuf); +int32_t transDumpFromBuffer(SConnBuffer* connBuf, char** buf, int8_t resetBuf); -int transSetConnOption(uv_tcp_t* stream, int keepalive); +int32_t transSetConnOption(uv_tcp_t* stream, int keepalive); void transRefSrvHandle(void* handle); void transUnrefSrvHandle(void* handle); @@ -315,21 +314,24 @@ void transUnrefSrvHandle(void* handle); void transRefCliHandle(void* handle); void transUnrefCliHandle(void* handle); -int transReleaseCliHandle(void* handle); -int transReleaseSrvHandle(void* handle); +int32_t transReleaseCliHandle(void* handle); +int32_t transReleaseSrvHandle(void* handle); -int transSendRequest(void* shandle, const SEpSet* pEpSet, STransMsg* pMsg, STransCtx* pCtx); -int transSendRecv(void* shandle, const SEpSet* pEpSet, STransMsg* pMsg, STransMsg* pRsp); -int transSendRecvWithTimeout(void* shandle, SEpSet* pEpSet, STransMsg* pMsg, STransMsg* pRsp, int8_t* epUpdated, +int32_t transSendRequest(void* shandle, const SEpSet* pEpSet, STransMsg* pMsg, STransCtx* pCtx); +int32_t transSendRecv(void* shandle, const SEpSet* pEpSet, STransMsg* pMsg, STransMsg* pRsp); +int32_t transSendRecvWithTimeout(void* shandle, SEpSet* pEpSet, STransMsg* pMsg, STransMsg* pRsp, int8_t* epUpdated, int32_t timeoutMs); -int transSendResponse(const STransMsg* msg); -int transRegisterMsg(const STransMsg* msg); -int transSetDefaultAddr(void* shandle, const char* ip, const char* fqdn); +int32_t transSendRequestWithId(void* shandle, const SEpSet* pEpSet, STransMsg* pReq, int64_t* transpointId); +int32_t transFreeConnById(void* shandle, int64_t transpointId); + +int32_t transSendResponse(const STransMsg* msg); +int32_t transRegisterMsg(const STransMsg* msg); +int32_t transSetDefaultAddr(void* shandle, const char* ip, const char* fqdn); int32_t transSetIpWhiteList(void* shandle, void* arg, FilteFunc* func); -int transSockInfo2Str(struct sockaddr* sockname, char* dst); +int32_t transSockInfo2Str(struct sockaddr* sockname, 
char* dst); -int64_t transAllocHandle(); +int32_t transAllocHandle(int64_t* refId); void* transInitServer(uint32_t ip, uint32_t port, char* label, int numOfThreads, void* fp, void* shandle); void* transInitClient(uint32_t ip, uint32_t port, char* label, int numOfThreads, void* fp, void* shandle); diff --git a/source/libs/transport/inc/transportInt.h b/source/libs/transport/inc/transportInt.h index 7853e25cff..703a4dde3e 100644 --- a/source/libs/transport/inc/transportInt.h +++ b/source/libs/transport/inc/transportInt.h @@ -58,6 +58,8 @@ typedef struct { int32_t failFastThreshold; int32_t failFastInterval; + int8_t notWaitAvaliableConn; // 1: no delay, 0: delay + void (*cfp)(void* parent, SRpcMsg*, SEpSet*); bool (*retry)(int32_t code, tmsg_t msgType); bool (*startTimer)(int32_t code, tmsg_t msgType); diff --git a/source/libs/transport/src/trans.c b/source/libs/transport/src/trans.c index fbcc74e8e1..8b99443a84 100644 --- a/source/libs/transport/src/trans.c +++ b/source/libs/transport/src/trans.c @@ -102,6 +102,8 @@ void* rpcOpen(const SRpcInit* pInit) { if (pRpc->timeToGetConn == 0) { pRpc->timeToGetConn = 10 * 1000; } + pRpc->notWaitAvaliableConn = pInit->notWaitAvaliableConn; + pRpc->tcphandle = (*taosInitHandle[pRpc->connType])(ip, pInit->localPort, pRpc->label, pRpc->numOfThreads, NULL, pRpc); @@ -163,38 +165,48 @@ void* rpcReallocCont(void* ptr, int64_t contLen) { return st + TRANS_MSG_OVERHEAD; } -int rpcSendRequest(void* shandle, const SEpSet* pEpSet, SRpcMsg* pMsg, int64_t* pRid) { +int32_t rpcSendRequest(void* shandle, const SEpSet* pEpSet, SRpcMsg* pMsg, int64_t* pRid) { return transSendRequest(shandle, pEpSet, pMsg, NULL); } -int rpcSendRequestWithCtx(void* shandle, const SEpSet* pEpSet, SRpcMsg* pMsg, int64_t* pRid, SRpcCtx* pCtx) { - return transSendRequest(shandle, pEpSet, pMsg, pCtx); -} -int rpcSendRecv(void* shandle, SEpSet* pEpSet, SRpcMsg* pMsg, SRpcMsg* pRsp) { - return transSendRecv(shandle, pEpSet, pMsg, pRsp); -} -int 
rpcSendRecvWithTimeout(void* shandle, SEpSet* pEpSet, SRpcMsg* pMsg, SRpcMsg* pRsp, int8_t* epUpdated, - int32_t timeoutMs) { - return transSendRecvWithTimeout(shandle, pEpSet, pMsg, pRsp, epUpdated, timeoutMs); +int32_t rpcSendRequestWithCtx(void* shandle, const SEpSet* pEpSet, SRpcMsg* pMsg, int64_t* pRid, SRpcCtx* pCtx) { + if (pCtx != NULL || pMsg->info.handle != 0 || pMsg->info.noResp != 0|| pRid == NULL) { + return transSendRequest(shandle, pEpSet, pMsg, pCtx); + } else { + return transSendRequestWithId(shandle, pEpSet, pMsg, pRid); + } } -int rpcSendResponse(const SRpcMsg* pMsg) { return transSendResponse(pMsg); } +int32_t rpcSendRequestWithId(void* shandle, const SEpSet* pEpSet, STransMsg* pReq, int64_t* transpointId) { + return transSendRequestWithId(shandle, pEpSet, pReq, transpointId); +} + +int32_t rpcSendRecv(void* shandle, SEpSet* pEpSet, SRpcMsg* pMsg, SRpcMsg* pRsp) { + return transSendRecv(shandle, pEpSet, pMsg, pRsp); +} +int32_t rpcSendRecvWithTimeout(void* shandle, SEpSet* pEpSet, SRpcMsg* pMsg, SRpcMsg* pRsp, int8_t* epUpdated, + int32_t timeoutMs) { + return transSendRecvWithTimeout(shandle, pEpSet, pMsg, pRsp, epUpdated, timeoutMs); +} +int32_t rpcFreeConnById(void* shandle, int64_t connId) { return transFreeConnById(shandle, connId); } + +int32_t rpcSendResponse(const SRpcMsg* pMsg) { return transSendResponse(pMsg); } void rpcRefHandle(void* handle, int8_t type) { (*taosRefHandle[type])(handle); } void rpcUnrefHandle(void* handle, int8_t type) { (*taosUnRefHandle[type])(handle); } -int rpcRegisterBrokenLinkArg(SRpcMsg* msg) { return transRegisterMsg(msg); } -int rpcReleaseHandle(void* handle, int8_t type) { return (*transReleaseHandle[type])(handle); } +int32_t rpcRegisterBrokenLinkArg(SRpcMsg* msg) { return transRegisterMsg(msg); } +int32_t rpcReleaseHandle(void* handle, int8_t type) { return (*transReleaseHandle[type])(handle); } // client only -int rpcSetDefaultAddr(void* thandle, const char* ip, const char* fqdn) { +int32_t 
rpcSetDefaultAddr(void* thandle, const char* ip, const char* fqdn) { // later return transSetDefaultAddr(thandle, ip, fqdn); } // server only int32_t rpcSetIpWhite(void* thandle, void* arg) { return transSetIpWhiteList(thandle, arg, NULL); } -void* rpcAllocHandle() { return (void*)transAllocHandle(); } +int32_t rpcAllocHandle(int64_t* refId) { return transAllocHandle(refId); } int32_t rpcUtilSIpRangeToStr(SIpV4Range* pRange, char* buf) { return transUtilSIpRangeToStr(pRange, buf); } int32_t rpcUtilSWhiteListToStr(SIpWhiteList* pWhiteList, char** ppBuf) { diff --git a/source/libs/transport/src/transCli.c b/source/libs/transport/src/transCli.c index 2e61f19af8..874fbd7733 100644 --- a/source/libs/transport/src/transCli.c +++ b/source/libs/transport/src/transCli.c @@ -213,8 +213,10 @@ static void cliHandleReq(SCliMsg* pMsg, SCliThrd* pThrd); static void cliHandleQuit(SCliMsg* pMsg, SCliThrd* pThrd); static void cliHandleRelease(SCliMsg* pMsg, SCliThrd* pThrd); static void cliHandleUpdate(SCliMsg* pMsg, SCliThrd* pThrd); -static void (*cliAsyncHandle[])(SCliMsg* pMsg, SCliThrd* pThrd) = {cliHandleReq, cliHandleQuit, cliHandleRelease, NULL, - cliHandleUpdate}; +static void cliHandleFreeById(SCliMsg* pMsg, SCliThrd* pThrd); + +static void (*cliAsyncHandle[])(SCliMsg* pMsg, SCliThrd* pThrd) = {cliHandleReq, cliHandleQuit, cliHandleRelease, + NULL, cliHandleUpdate, cliHandleFreeById}; /// static void (*cliAsyncHandle[])(SCliMsg* pMsg, SCliThrd* pThrd) = {cliHandleReq, cliHandleQuit, cliHandleRelease, /// NULL,cliHandleUpdate}; @@ -660,7 +662,9 @@ static SCliConn* getConnFromPool(SCliThrd* pThrd, char* key, bool* exceed) { if (QUEUE_IS_EMPTY(&plist->conns)) { if (plist->list->numOfConn >= pTranInst->connLimitNum) { *exceed = true; + return NULL;; } + plist->list->numOfConn++; return NULL; } @@ -704,7 +708,7 @@ static SCliConn* getConnFromPool2(SCliThrd* pThrd, char* key, SCliMsg** pMsg) { SMsgList* list = plist->list; if ((list)->numOfConn >= pTransInst->connLimitNum) { 
STraceId* trace = &(*pMsg)->msg.info.traceId; - if (pTransInst->noDelayFp != NULL && pTransInst->noDelayFp((*pMsg)->msg.msgType)) { + if (pTransInst->notWaitAvaliableConn || (pTransInst->noDelayFp != NULL && pTransInst->noDelayFp((*pMsg)->msg.msgType))) { tDebug("%s msg %s not to send, reason: %s", pTransInst->label, TMSG_INFO((*pMsg)->msg.msgType), tstrerror(TSDB_CODE_RPC_NETWORK_BUSY)); doNotifyApp(*pMsg, pThrd, TSDB_CODE_RPC_NETWORK_BUSY); @@ -899,10 +903,12 @@ static int32_t specifyConnRef(SCliConn* conn, bool update, int64_t handle) { exh->handle = conn; exh->pThrd = conn->hostThrd; taosWUnLockLatch(&exh->latch); - + conn->refId = exh->refId; taosWUnLockLatch(&exh->latch); + tDebug("conn %p specified by %"PRId64"", conn, handle); + (void)transReleaseExHandle(transGetRefMgt(), handle); return 0; } @@ -1035,7 +1041,6 @@ static void cliDestroyConn(SCliConn* conn, bool clear) { list->size--; } } - conn->list = NULL; (void)transReleaseExHandle(transGetRefMgt(), conn->refId); @@ -1075,8 +1080,11 @@ static void cliDestroy(uv_handle_t* handle) { (void)atomic_sub_fetch_32(&pThrd->connCount, 1); + if (conn->refId > 0) { (void)transReleaseExHandle(transGetRefMgt(), conn->refId); (void)transRemoveExHandle(transGetRefMgt(), conn->refId); + + } taosMemoryFree(conn->dstAddr); taosMemoryFree(conn->stream); @@ -1589,6 +1597,43 @@ static void cliHandleUpdate(SCliMsg* pMsg, SCliThrd* pThrd) { pThrd->cvtAddr = pCtx->cvtAddr; destroyCmsg(pMsg); } +static void cliHandleFreeById(SCliMsg* pMsg, SCliThrd* pThrd) { + int32_t code = 0; + int64_t refId = (int64_t)(pMsg->msg.info.handle); + SExHandle* exh = transAcquireExHandle(transGetRefMgt(), refId); + if (exh == NULL) { + tDebug("id %" PRId64 " already released", refId); + destroyCmsg(pMsg); + return; + } + + taosRLockLatch(&exh->latch); + SCliConn* conn = exh->handle; + taosRUnLockLatch(&exh->latch); + + if (conn == NULL || conn->refId != refId) { + TAOS_CHECK_GOTO(TSDB_CODE_REF_INVALID_ID, NULL, _exception); + } + tDebug("do free 
conn %p by id %" PRId64 "", conn, refId); + + int32_t size = transQueueSize(&conn->cliMsgs); + if (size == 0) { + // already recv, and notify upper layer + TAOS_CHECK_GOTO(TSDB_CODE_REF_INVALID_ID, NULL, _exception); + } else { + while (T_REF_VAL_GET(conn) >= 1) { + transUnrefCliHandle(conn); + } + return; + } +_exception: + tDebug("already free conn %p by id %" PRId64"", conn, refId); + + (void)transReleaseExHandle(transGetRefMgt(), refId); + (void)transReleaseExHandle(transGetRefMgt(), refId); + (void)transRemoveExHandle(transGetRefMgt(), refId); + destroyCmsg(pMsg); +} SCliConn* cliGetConn(SCliMsg** pMsg, SCliThrd* pThrd, bool* ignore, char* addr) { STransConnCtx* pCtx = (*pMsg)->ctx; @@ -2183,6 +2228,11 @@ static FORCE_INLINE void destroyCmsgAndAhandle(void* param) { pThrd->destroyAhandleFp(pMsg->ctx->ahandle); } + if (pMsg->msg.info.handle !=0) { + (void)transReleaseExHandle(transGetRefMgt(), (int64_t)pMsg->msg.info.handle); + (void)transRemoveExHandle(transGetRefMgt(), (int64_t)pMsg->msg.info.handle); + } + transDestroyConnCtx(pMsg->ctx); transFreeMsg(pMsg->msg.pCont); taosMemoryFree(pMsg); @@ -2759,7 +2809,7 @@ SCliThrd* transGetWorkThrd(STrans* trans, int64_t handle) { SCliThrd* pThrd = transGetWorkThrdFromHandle(trans, handle); return pThrd; } -int transReleaseCliHandle(void* handle) { +int32_t transReleaseCliHandle(void* handle) { int32_t code = 0; SCliThrd* pThrd = transGetWorkThrdFromHandle(NULL, (int64_t)handle); if (pThrd == NULL) { @@ -2823,25 +2873,25 @@ static int32_t transInitMsg(void* shandle, const SEpSet* pEpSet, STransMsg* pReq cliMsg->type = Normal; cliMsg->refId = (int64_t)shandle; QUEUE_INIT(&cliMsg->seqq); + *pCliMsg = cliMsg; return 0; } -int transSendRequest(void* shandle, const SEpSet* pEpSet, STransMsg* pReq, STransCtx* ctx) { +int32_t transSendRequest(void* shandle, const SEpSet* pEpSet, STransMsg* pReq, STransCtx* ctx) { STrans* pTransInst = (STrans*)transAcquireExHandle(transGetInstMgt(), (int64_t)shandle); if (pTransInst == NULL) { 
transFreeMsg(pReq->pCont); - return TSDB_CODE_RPC_BROKEN_LINK; + return TSDB_CODE_RPC_MODULE_QUIT; } int32_t code = 0; int64_t handle = (int64_t)pReq->info.handle; SCliThrd* pThrd = transGetWorkThrd(pTransInst, handle); if (pThrd == NULL) { - transFreeMsg(pReq->pCont); - (void)transReleaseExHandle(transGetInstMgt(), (int64_t)shandle); - return TSDB_CODE_RPC_BROKEN_LINK; + TAOS_CHECK_GOTO(TSDB_CODE_RPC_BROKEN_LINK, NULL, _exception;); } + if (handle != 0) { SExHandle* exh = transAcquireExHandle(transGetRefMgt(), handle); if (exh != NULL) { @@ -2849,26 +2899,27 @@ int transSendRequest(void* shandle, const SEpSet* pEpSet, STransMsg* pReq, STran if (exh->handle == NULL && exh->inited != 0) { SCliMsg* pCliMsg = NULL; code = transInitMsg(shandle, pEpSet, pReq, ctx, &pCliMsg); - ASSERT(code == 0); + if (code != 0) { + taosWUnLockLatch(&exh->latch); + (void)transReleaseExHandle(transGetRefMgt(), handle); + TAOS_CHECK_GOTO(code, NULL, _exception); + } QUEUE_PUSH(&exh->q, &pCliMsg->seqq); taosWUnLockLatch(&exh->latch); tDebug("msg refId: %" PRId64 "", handle); (void)transReleaseExHandle(transGetInstMgt(), (int64_t)shandle); return 0; + } else { + exh->inited = 1; + taosWUnLockLatch(&exh->latch); + (void)transReleaseExHandle(transGetRefMgt(), handle); } - exh->inited = 1; - taosWUnLockLatch(&exh->latch); - (void)transReleaseExHandle(transGetRefMgt(), handle); } } SCliMsg* pCliMsg = NULL; - code = transInitMsg(shandle, pEpSet, pReq, ctx, &pCliMsg); - if (code != 0) { - (void)transReleaseExHandle(transGetInstMgt(), (int64_t)shandle); - return code; - } + TAOS_CHECK_GOTO(transInitMsg(shandle, pEpSet, pReq, ctx, &pCliMsg), NULL, _exception); STraceId* trace = &pReq->info.traceId; tGDebug("%s send request at thread:%08" PRId64 ", dst:%s:%d, app:%p", transLabel(pTransInst), pThrd->pid, @@ -2880,13 +2931,63 @@ int transSendRequest(void* shandle, const SEpSet* pEpSet, STransMsg* pReq, STran } (void)transReleaseExHandle(transGetInstMgt(), (int64_t)shandle); return 0; + +_exception: + 
transFreeMsg(pReq->pCont); + (void)transReleaseExHandle(transGetInstMgt(), (int64_t)shandle); + return code; +} +int32_t transSendRequestWithId(void* shandle, const SEpSet* pEpSet, STransMsg* pReq, int64_t* transpointId) { + if (transpointId == NULL) { + ASSERT(0); + return TSDB_CODE_INVALID_PARA; + } + int32_t code = 0; + + STrans* pTransInst = (STrans*)transAcquireExHandle(transGetInstMgt(), (int64_t)shandle); + if (pTransInst == NULL) { + TAOS_CHECK_GOTO(TSDB_CODE_RPC_MODULE_QUIT, NULL, _exception); + } + + TAOS_CHECK_GOTO(transAllocHandle(transpointId), NULL, _exception); + + SCliThrd* pThrd = transGetWorkThrd(pTransInst, *transpointId); + if (pThrd == NULL) { + TAOS_CHECK_GOTO(TSDB_CODE_RPC_BROKEN_LINK, NULL, _exception); + } + + SExHandle* exh = transAcquireExHandle(transGetRefMgt(), *transpointId); + if (exh == NULL) { + TAOS_CHECK_GOTO(TSDB_CODE_RPC_MODULE_QUIT, NULL, _exception); + } + + pReq->info.handle = (void*)(*transpointId); + + SCliMsg* pCliMsg = NULL; + TAOS_CHECK_GOTO(transInitMsg(shandle, pEpSet, pReq, NULL, &pCliMsg), NULL, _exception); + + STraceId* trace = &pReq->info.traceId; + tGDebug("%s send request at thread:%08" PRId64 ", dst:%s:%d, app:%p", transLabel(pTransInst), pThrd->pid, + EPSET_GET_INUSE_IP(pEpSet), EPSET_GET_INUSE_PORT(pEpSet), pReq->info.ahandle); + if ((code = transAsyncSend(pThrd->asyncPool, &(pCliMsg->q))) != 0) { + destroyCmsg(pCliMsg); + (void)transReleaseExHandle(transGetInstMgt(), (int64_t)shandle); + return (code == TSDB_CODE_RPC_ASYNC_MODULE_QUIT ? 
TSDB_CODE_RPC_MODULE_QUIT : code); + } + (void)transReleaseExHandle(transGetInstMgt(), (int64_t)shandle); + return 0; + +_exception: + transFreeMsg(pReq->pCont); + (void)transReleaseExHandle(transGetInstMgt(), (int64_t)shandle); + return code; } -int transSendRecv(void* shandle, const SEpSet* pEpSet, STransMsg* pReq, STransMsg* pRsp) { +int32_t transSendRecv(void* shandle, const SEpSet* pEpSet, STransMsg* pReq, STransMsg* pRsp) { STrans* pTransInst = (STrans*)transAcquireExHandle(transGetInstMgt(), (int64_t)shandle); if (pTransInst == NULL) { transFreeMsg(pReq->pCont); - return TSDB_CODE_RPC_BROKEN_LINK; + return TSDB_CODE_RPC_MODULE_QUIT; } int32_t code = 0; @@ -2908,8 +3009,7 @@ int transSendRecv(void* shandle, const SEpSet* pEpSet, STransMsg* pReq, STransMs code = tsem_init(sem, 0, 0); if (code != 0) { taosMemoryFree(sem); - code = TAOS_SYSTEM_ERROR(errno); - TAOS_CHECK_GOTO(code, NULL, _RETURN1); + TAOS_CHECK_GOTO(TAOS_SYSTEM_ERROR(errno), NULL, _RETURN1); } TRACE_SET_MSGID(&pReq->info.traceId, tGenIdPI64()); @@ -3003,13 +3103,13 @@ _EXIT: taosMemoryFree(pSyncMsg); return code; } -int transSendRecvWithTimeout(void* shandle, SEpSet* pEpSet, STransMsg* pReq, STransMsg* pRsp, int8_t* epUpdated, - int32_t timeoutMs) { +int32_t transSendRecvWithTimeout(void* shandle, SEpSet* pEpSet, STransMsg* pReq, STransMsg* pRsp, int8_t* epUpdated, + int32_t timeoutMs) { int32_t code = 0; STrans* pTransInst = (STrans*)transAcquireExHandle(transGetInstMgt(), (int64_t)shandle); if (pTransInst == NULL) { transFreeMsg(pReq->pCont); - return TSDB_CODE_RPC_BROKEN_LINK; + return TSDB_CODE_RPC_MODULE_QUIT; } STransMsg* pTransMsg = taosMemoryCalloc(1, sizeof(STransMsg)); @@ -3096,22 +3196,21 @@ _RETURN2: /* * **/ -int transSetDefaultAddr(void* shandle, const char* ip, const char* fqdn) { +int32_t transSetDefaultAddr(void* shandle, const char* ip, const char* fqdn) { + if (ip == NULL || fqdn == NULL) return TSDB_CODE_INVALID_PARA; + STrans* pTransInst = 
(STrans*)transAcquireExHandle(transGetInstMgt(), (int64_t)shandle); if (pTransInst == NULL) { - return TSDB_CODE_RPC_BROKEN_LINK; + return TSDB_CODE_RPC_MODULE_QUIT; } SCvtAddr cvtAddr = {0}; - if (ip != NULL && fqdn != NULL) { - tstrncpy(cvtAddr.ip, ip, sizeof(cvtAddr.ip)); - tstrncpy(cvtAddr.fqdn, fqdn, sizeof(cvtAddr.fqdn)); - cvtAddr.cvt = true; - } + tstrncpy(cvtAddr.ip, ip, sizeof(cvtAddr.ip)); + tstrncpy(cvtAddr.fqdn, fqdn, sizeof(cvtAddr.fqdn)); + cvtAddr.cvt = true; int32_t code = 0; - int8_t i = 0; - for (i = 0; i < pTransInst->numOfThreads; i++) { + for (int8_t i = 0; i < pTransInst->numOfThreads; i++) { STransConnCtx* pCtx = taosMemoryCalloc(1, sizeof(STransConnCtx)); if (pCtx == NULL) { code = TSDB_CODE_OUT_OF_MEMORY; @@ -3136,7 +3235,9 @@ int transSetDefaultAddr(void* shandle, const char* ip, const char* fqdn) { if ((code = transAsyncSend(thrd->asyncPool, &(cliMsg->q))) != 0) { destroyCmsg(cliMsg); - code = (code == TSDB_CODE_RPC_ASYNC_MODULE_QUIT ? TSDB_CODE_RPC_MODULE_QUIT : code); + if (code == TSDB_CODE_RPC_ASYNC_MODULE_QUIT) { + code = TSDB_CODE_RPC_MODULE_QUIT; + } break; } } @@ -3145,7 +3246,7 @@ int transSetDefaultAddr(void* shandle, const char* ip, const char* fqdn) { return code; } -int64_t transAllocHandle() { +int32_t transAllocHandle(int64_t* refId) { SExHandle* exh = taosMemoryCalloc(1, sizeof(SExHandle)); if (exh == NULL) { return TSDB_CODE_OUT_OF_MEMORY; @@ -3166,5 +3267,43 @@ int64_t transAllocHandle() { QUEUE_INIT(&exh->q); taosInitRWLatch(&exh->latch); tDebug("pre alloc refId %" PRId64 "", exh->refId); - return exh->refId; + *refId = exh->refId; + return 0; +} +int32_t transFreeConnById(void* shandle, int64_t transpointId) { + int32_t code = 0; + STrans* pTransInst = (STrans*)transAcquireExHandle(transGetInstMgt(), (int64_t)shandle); + if (pTransInst == NULL) { + return TSDB_CODE_RPC_MODULE_QUIT; + } + if (transpointId == 0) { + tDebug("not free by refId:%"PRId64"", transpointId); + TAOS_CHECK_GOTO(0, NULL, _exception); + } + + 
SCliThrd* pThrd = transGetWorkThrdFromHandle(pTransInst, transpointId); + if (pThrd == NULL) { + TAOS_CHECK_GOTO(TSDB_CODE_REF_INVALID_ID, NULL, _exception); + } + + SCliMsg* pCli = taosMemoryCalloc(1, sizeof(SCliMsg)); + if (pCli == NULL) { + TAOS_CHECK_GOTO(TSDB_CODE_OUT_OF_MEMORY, NULL, _exception); + } + pCli->type = FreeById; + + tDebug("release conn id %" PRId64 "", transpointId); + + STransMsg msg = {.info.handle = (void*)transpointId}; + pCli->msg = msg; + + code = transAsyncSend(pThrd->asyncPool, &pCli->q); + if (code != 0) { + taosMemoryFree(pCli); + TAOS_CHECK_GOTO(code, NULL, _exception); + } + +_exception: + transReleaseExHandle(transGetInstMgt(), (int64_t)shandle); + return code; } diff --git a/source/libs/transport/src/transComm.c b/source/libs/transport/src/transComm.c index 9df0ddb6f3..148f4d4e9a 100644 --- a/source/libs/transport/src/transComm.c +++ b/source/libs/transport/src/transComm.c @@ -234,7 +234,7 @@ bool transReadComplete(SConnBuffer* connBuf) { return (p->left == 0 || p->invalid) ? 
true : false; } -int transSetConnOption(uv_tcp_t* stream, int keepalive) { +int32_t transSetConnOption(uv_tcp_t* stream, int keepalive) { #if defined(WINDOWS) || defined(DARWIN) #else return uv_tcp_keepalive(stream, 1, keepalive); @@ -745,8 +745,7 @@ int32_t transRemoveExHandle(int32_t refMgt, int64_t refId) { return taosRemoveRef(refMgt, refId); } -void* transAcquireExHandle(int32_t refMgt, int64_t refId) { - // acquire extern handle +void* transAcquireExHandle(int32_t refMgt, int64_t refId) { // acquire extern handle return (void*)taosAcquireRef(refMgt, refId); } diff --git a/source/libs/transport/src/transSvr.c b/source/libs/transport/src/transSvr.c index 1e0d54eb5b..0202fbc599 100644 --- a/source/libs/transport/src/transSvr.c +++ b/source/libs/transport/src/transSvr.c @@ -1707,7 +1707,7 @@ void transUnrefSrvHandle(void* handle) { } } -int transReleaseSrvHandle(void* handle) { +int32_t transReleaseSrvHandle(void* handle) { int32_t code = 0; SRpcHandleInfo* info = handle; SExHandle* exh = info->handle; @@ -1747,7 +1747,7 @@ _return2: return code; } -int transSendResponse(const STransMsg* msg) { +int32_t transSendResponse(const STransMsg* msg) { int32_t code = 0; if (msg->info.noResp) { @@ -1800,7 +1800,7 @@ _return2: rpcFreeCont(msg->pCont); return code; } -int transRegisterMsg(const STransMsg* msg) { +int32_t transRegisterMsg(const STransMsg* msg) { int32_t code = 0; SExHandle* exh = msg->info.handle; @@ -1891,4 +1891,4 @@ int32_t transSetIpWhiteList(void* thandle, void* arg, FilteFunc* func) { return code; } -int transGetConnInfo(void* thandle, STransHandleInfo* pConnInfo) { return -1; } +int32_t transGetConnInfo(void* thandle, STransHandleInfo* pConnInfo) { return -1; } diff --git a/source/libs/transport/test/cliBench.c b/source/libs/transport/test/cliBench.c index 8a5276b814..ec08f1baf0 100644 --- a/source/libs/transport/test/cliBench.c +++ b/source/libs/transport/test/cliBench.c @@ -160,7 +160,7 @@ int main(int argc, char *argv[]) { void *pRpc = 
rpcOpen(&rpcInit); if (pRpc == NULL) { tError("failed to initialize RPC"); - return -1; + return terrno; } tInfo("client is initialized"); diff --git a/source/os/src/osSocket.c b/source/os/src/osSocket.c index f9ae6c2157..081ed46c9a 100644 --- a/source/os/src/osSocket.c +++ b/source/os/src/osSocket.c @@ -1075,14 +1075,14 @@ int32_t taosGetFqdn(char *fqdn) { freeaddrinfo(result); -#else +#elif WINDOWS struct addrinfo hints = {0}; struct addrinfo *result = NULL; hints.ai_flags = AI_CANONNAME; int32_t ret = getaddrinfo(hostname, NULL, &hints, &result); if (!result) { - fprintf(stderr, "failed to get fqdn, code:%d, reason:%s\n", ret, gai_strerror(ret)); + fprintf(stderr, "failed to get fqdn, code:%d, hostname:%s, reason:%s\n", ret, hostname, gai_strerror(ret)); return -1; } strcpy(fqdn, result->ai_canonname); diff --git a/source/os/src/osThread.c b/source/os/src/osThread.c index 2f418d5a01..3e37d12759 100644 --- a/source/os/src/osThread.c +++ b/source/os/src/osThread.c @@ -242,7 +242,7 @@ int32_t taosThreadCondTimedWait(TdThreadCond *cond, TdThreadMutex *mutex, const return EINVAL; #else int32_t code = pthread_cond_timedwait(cond, mutex, abstime); - if (code) { + if (code && code != ETIMEDOUT) { terrno = TAOS_SYSTEM_ERROR(code); return terrno; } diff --git a/source/os/src/osTimezone.c b/source/os/src/osTimezone.c index c801347fc2..b3f7c9a368 100644 --- a/source/os/src/osTimezone.c +++ b/source/os/src/osTimezone.c @@ -871,14 +871,14 @@ void taosGetSystemTimezone(char *outTimezoneStr, enum TdTimezone *tsTimezone) { { int n = readlink("/etc/localtime", buf, sizeof(buf)); if (n < 0) { - printf("read /etc/localtime error, reason:%s", strerror(errno)); + printf("read /etc/localtime error, reason:%s\n", strerror(errno)); return; } buf[n] = '\0'; char *zi = strstr(buf, "zoneinfo"); if (!zi) { - printf("parsing /etc/localtime failed"); + printf("parsing /etc/localtime failed\n"); return; } tz = zi + strlen("zoneinfo") + 1; @@ -893,7 +893,7 @@ void taosGetSystemTimezone(char 
*outTimezoneStr, enum TdTimezone *tsTimezone) { // } // } // if (!tz || 0 == strchr(tz, '/')) { - // printf("parsing /etc/localtime failed"); + // printf("parsing /etc/localtime failed\n"); // return; // } @@ -927,7 +927,7 @@ void taosGetSystemTimezone(char *outTimezoneStr, enum TdTimezone *tsTimezone) { { int n = readlink("/etc/localtime", buf, sizeof(buf)-1); if (n < 0) { - (void)printf("read /etc/localtime error, reason:%s", strerror(errno)); + (void)printf("read /etc/localtime error, reason:%s\n", strerror(errno)); if (taosCheckExistFile("/etc/timezone")) { /* @@ -947,7 +947,7 @@ void taosGetSystemTimezone(char *outTimezoneStr, enum TdTimezone *tsTimezone) { int len = taosReadFile(pFile, buf, 64); if (len < 0) { (void)taosCloseFile(&pFile); - (void)printf("read /etc/timezone error, reason:%s", strerror(errno)); + (void)printf("read /etc/timezone error, reason:%s\n", strerror(errno)); return; } @@ -994,7 +994,7 @@ void taosGetSystemTimezone(char *outTimezoneStr, enum TdTimezone *tsTimezone) { char *zi = strstr(buf, "zoneinfo"); if (!zi) { - (void)printf("parsing /etc/localtime failed"); + (void)printf("parsing /etc/localtime failed\n"); return; } tz = zi + strlen("zoneinfo") + 1; diff --git a/source/util/src/tcompare.c b/source/util/src/tcompare.c index 4cb48bffe5..670a70a309 100644 --- a/source/util/src/tcompare.c +++ b/source/util/src/tcompare.c @@ -1208,20 +1208,28 @@ typedef struct UsingRegex { regex_t pRegex; int32_t lastUsedTime; } UsingRegex; +typedef UsingRegex* HashRegexPtr; typedef struct RegexCache { SHashObj *regexHash; void *regexCacheTmr; void *timer; + SRWLatch mutex; + bool exit; } RegexCache; static RegexCache sRegexCache; #define MAX_REGEX_CACHE_SIZE 20 #define REGEX_CACHE_CLEAR_TIME 30 static void checkRegexCache(void* param, void* tmrId) { + int32_t code = 0; + taosRLockLatch(&sRegexCache.mutex); + if(sRegexCache.exit) { + goto _exit; + } (void)taosTmrReset(checkRegexCache, REGEX_CACHE_CLEAR_TIME * 1000, param, sRegexCache.regexCacheTmr, 
&tmrId); if (taosHashGetSize(sRegexCache.regexHash) < MAX_REGEX_CACHE_SIZE) { - return; + goto _exit; } if (taosHashGetSize(sRegexCache.regexHash) >= MAX_REGEX_CACHE_SIZE) { @@ -1235,6 +1243,8 @@ static void checkRegexCache(void* param, void* tmrId) { ppUsingRegex = taosHashIterate(sRegexCache.regexHash, ppUsingRegex); } } +_exit: + taosRUnLockLatch(&sRegexCache.mutex); } void regexCacheFree(void *ppUsingRegex) { @@ -1246,30 +1256,35 @@ int32_t InitRegexCache() { sRegexCache.regexHash = taosHashInit(64, taosGetDefaultHashFunction(TSDB_DATA_TYPE_BINARY), false, HASH_ENTRY_LOCK); if (sRegexCache.regexHash == NULL) { uError("failed to create RegexCache"); - return -1; + return terrno; } taosHashSetFreeFp(sRegexCache.regexHash, regexCacheFree); sRegexCache.regexCacheTmr = taosTmrInit(0, 0, 0, "REGEXCACHE"); if (sRegexCache.regexCacheTmr == NULL) { uError("failed to create regex cache check timer"); - terrno = TSDB_CODE_OUT_OF_MEMORY; - return -1; + return terrno; } + sRegexCache.exit = false; + taosInitRWLatch(&sRegexCache.mutex); sRegexCache.timer = taosTmrStart(checkRegexCache, REGEX_CACHE_CLEAR_TIME * 1000, NULL, sRegexCache.regexCacheTmr); if (sRegexCache.timer == NULL) { uError("failed to start regex cache timer"); - return -1; + return terrno; } - return 0; + return TSDB_CODE_SUCCESS; } void DestroyRegexCache(){ + int32_t code = 0; uInfo("[regex cache] destory regex cache"); (void)taosTmrStopA(&sRegexCache.timer); + taosWLockLatch(&sRegexCache.mutex); + sRegexCache.exit = true; taosHashCleanup(sRegexCache.regexHash); taosTmrCleanUp(sRegexCache.regexCacheTmr); + taosWUnLockLatch(&sRegexCache.mutex); } int32_t checkRegexPattern(const char *pPattern) { @@ -1290,18 +1305,17 @@ int32_t checkRegexPattern(const char *pPattern) { return TSDB_CODE_SUCCESS; } -static UsingRegex **getRegComp(const char *pPattern) { - UsingRegex **ppUsingRegex = (UsingRegex **)taosHashAcquire(sRegexCache.regexHash, pPattern, strlen(pPattern)); +int32_t getRegComp(const char *pPattern, 
HashRegexPtr **regexRet) { + HashRegexPtr* ppUsingRegex = (HashRegexPtr*)taosHashAcquire(sRegexCache.regexHash, pPattern, strlen(pPattern)); if (ppUsingRegex != NULL) { (*ppUsingRegex)->lastUsedTime = taosGetTimestampSec(); - return ppUsingRegex; + *regexRet = ppUsingRegex; + return TSDB_CODE_SUCCESS; } - UsingRegex *pUsingRegex = taosMemoryMalloc(sizeof(UsingRegex)); if (pUsingRegex == NULL) { - terrno = TSDB_CODE_OUT_OF_MEMORY; uError("Failed to Malloc when compile regex pattern %s.", pPattern); - return NULL; + return terrno; } int32_t cflags = REG_EXTENDED; int32_t ret = regcomp(&pUsingRegex->pRegex, pPattern, cflags); @@ -1310,8 +1324,7 @@ static UsingRegex **getRegComp(const char *pPattern) { (void)regerror(ret, &pUsingRegex->pRegex, msgbuf, tListLen(msgbuf)); uError("Failed to compile regex pattern %s. reason %s", pPattern, msgbuf); taosMemoryFree(pUsingRegex); - terrno = TSDB_CODE_PAR_REGULAR_EXPRESSION_ERROR; - return NULL; + return TSDB_CODE_PAR_REGULAR_EXPRESSION_ERROR; } while (true) { @@ -1319,8 +1332,9 @@ static UsingRegex **getRegComp(const char *pPattern) { if (code != 0 && code != TSDB_CODE_DUP_KEY) { regexCacheFree(&pUsingRegex); uError("Failed to put regex pattern %s into cache, exception internal error.", pPattern); - terrno = code; - return NULL; + return code; + } else if (code == TSDB_CODE_DUP_KEY) { + terrno = 0; } ppUsingRegex = (UsingRegex **)taosHashAcquire(sRegexCache.regexHash, pPattern, strlen(pPattern)); if (ppUsingRegex) { @@ -1334,27 +1348,68 @@ static UsingRegex **getRegComp(const char *pPattern) { } } pUsingRegex->lastUsedTime = taosGetTimestampSec(); - return ppUsingRegex; + *regexRet = ppUsingRegex; + return TSDB_CODE_SUCCESS; } void releaseRegComp(UsingRegex **regex){ taosHashRelease(sRegexCache.regexHash, regex); } +static threadlocal UsingRegex ** ppUsingRegex; +static threadlocal regex_t * pRegex; +static threadlocal char *pOldPattern = NULL; +void DestoryThreadLocalRegComp() { + if (NULL != pOldPattern) { + 
releaseRegComp(ppUsingRegex); + taosMemoryFree(pOldPattern); + ppUsingRegex = NULL; + pRegex = NULL; + pOldPattern = NULL; + } +} + +int32_t threadGetRegComp(regex_t **regex, const char *pPattern) { + if (NULL != pOldPattern) { + if (strcmp(pOldPattern, pPattern) == 0) { + *regex = pRegex; + return 0; + } else { + DestoryThreadLocalRegComp(); + } + } + + HashRegexPtr *ppRegex = NULL; + int32_t code = getRegComp(pPattern, &ppRegex); + if (code != TSDB_CODE_SUCCESS) { + return code; + } + pOldPattern = taosStrdup(pPattern); + if (NULL == pOldPattern) { + uError("Failed to Malloc when compile regex pattern %s.", pPattern); + return terrno; + } + ppUsingRegex = ppRegex; + pRegex = &((*ppUsingRegex)->pRegex); + *regex = &(*ppRegex)->pRegex; + return 0; +} + static int32_t doExecRegexMatch(const char *pString, const char *pPattern) { int32_t ret = 0; char msgbuf[256] = {0}; - UsingRegex **pUsingRegex = getRegComp(pPattern); - if (pUsingRegex == NULL) { - return 1; + + regex_t *regex = NULL; + ret = threadGetRegComp(&regex, pPattern); + if (ret != 0) { + return ret; } regmatch_t pmatch[1]; - ret = regexec(&(*pUsingRegex)->pRegex, pString, 1, pmatch, 0); - releaseRegComp(pUsingRegex); + ret = regexec(regex, pString, 1, pmatch, 0); if (ret != 0 && ret != REG_NOMATCH) { terrno = TSDB_CODE_PAR_REGULAR_EXPRESSION_ERROR; - (void)regerror(ret, &(*pUsingRegex)->pRegex, msgbuf, sizeof(msgbuf)); + (void)regerror(ret, regex, msgbuf, sizeof(msgbuf)); uDebug("Failed to match %s with pattern %s, reason %s", pString, pPattern, msgbuf) } @@ -1365,8 +1420,7 @@ int32_t comparestrRegexMatch(const void *pLeft, const void *pRight) { size_t sz = varDataLen(pRight); char *pattern = taosMemoryMalloc(sz + 1); if (NULL == pattern) { - terrno = TSDB_CODE_OUT_OF_MEMORY; - return 1; + return 1; // terrno has been set } (void)memcpy(pattern, varDataVal(pRight), varDataLen(pRight)); @@ -1376,8 +1430,7 @@ int32_t comparestrRegexMatch(const void *pLeft, const void *pRight) { char *str = taosMemoryMalloc(sz 
+ 1); if (NULL == str) { taosMemoryFree(pattern); - terrno = TSDB_CODE_OUT_OF_MEMORY; - return 1; + return 1; // terrno has been set } (void)memcpy(str, varDataVal(pLeft), sz); @@ -1395,14 +1448,13 @@ int32_t comparewcsRegexMatch(const void *pString, const void *pPattern) { size_t len = varDataLen(pPattern); char *pattern = taosMemoryMalloc(len + 1); if (NULL == pattern) { - terrno = TSDB_CODE_OUT_OF_MEMORY; - return 1; + return 1; // terrno has been set } int convertLen = taosUcs4ToMbs((TdUcs4 *)varDataVal(pPattern), len, pattern); if (convertLen < 0) { taosMemoryFree(pattern); - return (terrno = TSDB_CODE_APP_ERROR); + return 1; // terrno has been set } pattern[convertLen] = 0; @@ -1411,15 +1463,14 @@ int32_t comparewcsRegexMatch(const void *pString, const void *pPattern) { char *str = taosMemoryMalloc(len + 1); if (NULL == str) { taosMemoryFree(pattern); - terrno = TSDB_CODE_OUT_OF_MEMORY; - return 1; + return 1; // terrno has been set } convertLen = taosUcs4ToMbs((TdUcs4 *)varDataVal(pString), len, str); if (convertLen < 0) { taosMemoryFree(str); taosMemoryFree(pattern); - return (terrno = TSDB_CODE_APP_ERROR); + return 1; // terrno has been set } str[convertLen] = 0; diff --git a/source/util/src/terror.c b/source/util/src/terror.c index e9bdafcd5a..7b9d48eb58 100644 --- a/source/util/src/terror.c +++ b/source/util/src/terror.c @@ -409,6 +409,7 @@ TAOS_DEFINE_ERROR(TSDB_CODE_VND_ALREADY_IS_VOTER, "Vnode already is a vo TAOS_DEFINE_ERROR(TSDB_CODE_VND_DIR_ALREADY_EXIST, "Vnode directory already exist") TAOS_DEFINE_ERROR(TSDB_CODE_VND_META_DATA_UNSAFE_DELETE, "Single replica vnode data will lost permanently after this operation, if you make sure this, please use drop dnode unsafe to execute") TAOS_DEFINE_ERROR(TSDB_CODE_VND_ARB_NOT_SYNCED, "Vgroup peer is not synced") +TAOS_DEFINE_ERROR(TSDB_CODE_VND_WRITE_DISABLED, "Vnode write is disabled for snapshot") TAOS_DEFINE_ERROR(TSDB_CODE_VND_COLUMN_COMPRESS_ALREADY_EXIST,"Same with old param") @@ -753,7 +754,7 @@ 
TAOS_DEFINE_ERROR(TSDB_CODE_TSMA_INVALID_STAT, "Invalid tsma state" TAOS_DEFINE_ERROR(TSDB_CODE_TSMA_INVALID_PTR, "Invalid tsma pointer") TAOS_DEFINE_ERROR(TSDB_CODE_TSMA_INVALID_PARA, "Invalid tsma parameters") TAOS_DEFINE_ERROR(TSDB_CODE_TSMA_INVALID_TB, "Invalid table to create tsma, only stable or normal table allowed") -TAOS_DEFINE_ERROR(TSDB_CODE_TSMA_INVALID_INTERVAL, "Invalid tsma interval, 1m ~ 1h is allowed") +TAOS_DEFINE_ERROR(TSDB_CODE_TSMA_INVALID_INTERVAL, "Invalid tsma interval, 1m ~ 1y is allowed") TAOS_DEFINE_ERROR(TSDB_CODE_TSMA_INVALID_FUNC_PARAM, "Invalid tsma func param, only one non-tag column allowed") TAOS_DEFINE_ERROR(TSDB_CODE_TSMA_UNSUPPORTED_FUNC, "Tsma func not supported") TAOS_DEFINE_ERROR(TSDB_CODE_TSMA_MUST_BE_DROPPED, "Tsma must be dropped first") diff --git a/source/util/src/tlog.c b/source/util/src/tlog.c index 1946a0a274..86dc767adc 100644 --- a/source/util/src/tlog.c +++ b/source/util/src/tlog.c @@ -567,7 +567,7 @@ static inline void taosPrintLogImp(ELogLevel level, int32_t dflag, const char *b } } -void taosPrintLog(const char *flags, ELogLevel level, int32_t dflag, const char *format, ...) { +void taosPrintLog(const char *flags, int32_t level, int32_t dflag, const char *format, ...) { if (!(dflag & DEBUG_FILE) && !(dflag & DEBUG_SCREEN)) return; char buffer[LOG_MAX_LINE_BUFFER_SIZE]; @@ -590,7 +590,7 @@ void taosPrintLog(const char *flags, ELogLevel level, int32_t dflag, const char } } -void taosPrintLongString(const char *flags, ELogLevel level, int32_t dflag, const char *format, ...) { +void taosPrintLongString(const char *flags, int32_t level, int32_t dflag, const char *format, ...) 
{ if (!osLogSpaceAvailable()) return; if (!(dflag & DEBUG_FILE) && !(dflag & DEBUG_SCREEN)) return; diff --git a/source/util/src/tpcre2.c b/source/util/src/tpcre2.c index 52991c58b8..ba9bd51510 100644 --- a/source/util/src/tpcre2.c +++ b/source/util/src/tpcre2.c @@ -5,14 +5,24 @@ int32_t doRegComp(pcre2_code** ppRegex, pcre2_match_data** ppMatchData, const ch int errorcode; PCRE2_SIZE erroroffset; - *ppRegex = pcre2_compile((PCRE2_SPTR8)pattern, PCRE2_ZERO_TERMINATED, options, &errorcode, &erroroffset, NULL); - if (*ppRegex == NULL) { + pcre2_code* pRegex = NULL; + pcre2_match_data* pMatchData = NULL; + + pRegex = pcre2_compile((PCRE2_SPTR8)pattern, PCRE2_ZERO_TERMINATED, options, &errorcode, &erroroffset, NULL); + if (pRegex == NULL) { PCRE2_UCHAR buffer[256]; (void)pcre2_get_error_message(errorcode, buffer, sizeof(buffer)); - return 1; + return -1; } - *ppMatchData = pcre2_match_data_create_from_pattern(*ppRegex, NULL); + pMatchData = pcre2_match_data_create_from_pattern(pRegex, NULL); + if (pMatchData == NULL) { + pcre2_code_free(pRegex); + return -1; + } + + *ppRegex = pRegex; + *ppMatchData = pMatchData; return 0; } diff --git a/source/util/src/tscalablebf.c b/source/util/src/tscalablebf.c index 72a97fee45..80b633f5e8 100644 --- a/source/util/src/tscalablebf.c +++ b/source/util/src/tscalablebf.c @@ -33,7 +33,7 @@ int32_t tScalableBfInit(uint64_t expectedEntries, double errorRate, SScalableBf* int32_t lino = 0; const uint32_t defaultSize = 8; if (expectedEntries < 1 || errorRate <= 0 || errorRate >= 1.0) { - code = TSDB_CODE_FAILED; + code = TSDB_CODE_INVALID_PARA; QUERY_CHECK_CODE(code, lino, _error); } SScalableBf* pSBf = taosMemoryCalloc(1, sizeof(SScalableBf)); @@ -71,7 +71,7 @@ int32_t tScalableBfPutNoCheck(SScalableBf* pSBf, const void* keyBuf, uint32_t le int32_t code = TSDB_CODE_SUCCESS; int32_t lino = 0; if (pSBf->status == SBF_INVALID) { - code = TSDB_CODE_FAILED; + code = TSDB_CODE_OUT_OF_BUFFER; QUERY_CHECK_CODE(code, lino, _error); } int32_t size = 
taosArrayGetSize(pSBf->bfArray); @@ -92,7 +92,7 @@ int32_t tScalableBfPutNoCheck(SScalableBf* pSBf, const void* keyBuf, uint32_t le _error: if (code != TSDB_CODE_SUCCESS) { - uError("%s failed at line %d since %s", __func__, lino, tstrerror(code)); + uDebug("%s failed at line %d since %s", __func__, lino, tstrerror(code)); } return code; } @@ -101,7 +101,7 @@ int32_t tScalableBfPut(SScalableBf* pSBf, const void* keyBuf, uint32_t len, int3 int32_t code = TSDB_CODE_SUCCESS; int32_t lino = 0; if (pSBf->status == SBF_INVALID) { - code = TSDB_CODE_FAILED; + code = TSDB_CODE_OUT_OF_BUFFER; QUERY_CHECK_CODE(code, lino, _end); } uint64_t h1 = (uint64_t)pSBf->hashFn1(keyBuf, len); @@ -153,7 +153,7 @@ static int32_t tScalableBfAddFilter(SScalableBf* pSBf, uint64_t expectedEntries, int32_t code = TSDB_CODE_SUCCESS; int32_t lino = 0; if (taosArrayGetSize(pSBf->bfArray) >= pSBf->maxBloomFilters) { - code = TSDB_CODE_FAILED; + code = TSDB_CODE_OUT_OF_BUFFER; QUERY_CHECK_CODE(code, lino, _error); } @@ -163,7 +163,7 @@ static int32_t tScalableBfAddFilter(SScalableBf* pSBf, uint64_t expectedEntries, if (taosArrayPush(pSBf->bfArray, &pNormalBf) == NULL) { tBloomFilterDestroy(pNormalBf); - code = TSDB_CODE_OUT_OF_MEMORY; + code = terrno; QUERY_CHECK_CODE(code, lino, _error); } pSBf->numBits += pNormalBf->numBits; @@ -217,7 +217,7 @@ int32_t tScalableBfDecode(SDecoder* pDecoder, SScalableBf** ppSBf) { pSBf->bfArray = NULL; int32_t size = 0; if (tDecodeI32(pDecoder, &size) < 0) { - code = TSDB_CODE_FAILED; + code = terrno; QUERY_CHECK_CODE(code, lino, _error); } if (size == 0) { @@ -242,19 +242,19 @@ int32_t tScalableBfDecode(SDecoder* pDecoder, SScalableBf** ppSBf) { } } if (tDecodeU32(pDecoder, &pSBf->growth) < 0) { - code = TSDB_CODE_FAILED; + code = terrno; QUERY_CHECK_CODE(code, lino, _error); } if (tDecodeU64(pDecoder, &pSBf->numBits) < 0) { - code = TSDB_CODE_FAILED; + code = terrno; QUERY_CHECK_CODE(code, lino, _error); } if (tDecodeU32(pDecoder, &pSBf->maxBloomFilters) < 0) { 
- code = TSDB_CODE_FAILED; + code = terrno; QUERY_CHECK_CODE(code, lino, _error); } if (tDecodeI8(pDecoder, &pSBf->status) < 0) { - code = TSDB_CODE_FAILED; + code = terrno; QUERY_CHECK_CODE(code, lino, _error); } (*ppSBf) = pSBf; diff --git a/source/util/src/tworker.c b/source/util/src/tworker.c index 258d53c335..b2064d6787 100644 --- a/source/util/src/tworker.c +++ b/source/util/src/tworker.c @@ -106,6 +106,7 @@ static void *tQWorkerThreadFp(SQueueWorker *worker) { } destroyThreadLocalGeosCtx(); + DestoryThreadLocalRegComp(); return NULL; } @@ -237,6 +238,7 @@ static void *tAutoQWorkerThreadFp(SQueueWorker *worker) { taosUpdateItemSize(qinfo.queue, 1); } + DestoryThreadLocalRegComp(); return NULL; } @@ -664,6 +666,7 @@ static void *tQueryAutoQWorkerThreadFp(SQueryAutoQWorker *worker) { } destroyThreadLocalGeosCtx(); + DestoryThreadLocalRegComp(); return NULL; } @@ -789,6 +792,16 @@ bool tQueryAutoQWorkerTryRecycleWorker(SQueryAutoQWorkerPool *pPool, SQueryAutoQ int32_t tQueryAutoQWorkerInit(SQueryAutoQWorkerPool *pool) { int32_t code; + + (void)taosThreadMutexInit(&pool->poolLock, NULL); + (void)taosThreadMutexInit(&pool->backupLock, NULL); + (void)taosThreadMutexInit(&pool->waitingAfterBlockLock, NULL); + (void)taosThreadMutexInit(&pool->waitingBeforeProcessMsgLock, NULL); + + (void)taosThreadCondInit(&pool->waitingBeforeProcessMsgCond, NULL); + (void)taosThreadCondInit(&pool->waitingAfterBlockCond, NULL); + (void)taosThreadCondInit(&pool->backupCond, NULL); + code = taosOpenQset(&pool->qset); if (code) return terrno = code; pool->workers = tdListNew(sizeof(SQueryAutoQWorker)); @@ -799,14 +812,6 @@ int32_t tQueryAutoQWorkerInit(SQueryAutoQWorkerPool *pool) { if (!pool->exitedWorkers) return TSDB_CODE_OUT_OF_MEMORY; pool->maxInUse = pool->max * 2 + 2; - (void)taosThreadMutexInit(&pool->poolLock, NULL); - (void)taosThreadMutexInit(&pool->backupLock, NULL); - (void)taosThreadMutexInit(&pool->waitingAfterBlockLock, NULL); - 
(void)taosThreadMutexInit(&pool->waitingBeforeProcessMsgLock, NULL); - - (void)taosThreadCondInit(&pool->waitingBeforeProcessMsgCond, NULL); - (void)taosThreadCondInit(&pool->waitingAfterBlockCond, NULL); - (void)taosThreadCondInit(&pool->backupCond, NULL); if (!pool->pCb) { pool->pCb = taosMemoryCalloc(1, sizeof(SQueryAutoQWorkerPoolCB)); @@ -821,13 +826,17 @@ int32_t tQueryAutoQWorkerInit(SQueryAutoQWorkerPool *pool) { void tQueryAutoQWorkerCleanup(SQueryAutoQWorkerPool *pPool) { (void)taosThreadMutexLock(&pPool->poolLock); pPool->exit = true; - int32_t size = listNEles(pPool->workers); - for (int32_t i = 0; i < size; ++i) { - taosQsetThreadResume(pPool->qset); + int32_t size = 0; + if (pPool->workers) { + size = listNEles(pPool->workers); } - size = listNEles(pPool->backupWorkers); - for (int32_t i = 0; i < size; ++i) { - taosQsetThreadResume(pPool->qset); + if (pPool->backupWorkers) { + size += listNEles(pPool->backupWorkers); + } + if (pPool->qset) { + for (int32_t i = 0; i < size; ++i) { + taosQsetThreadResume(pPool->qset); + } } (void)taosThreadMutexUnlock(&pPool->poolLock); @@ -845,7 +854,7 @@ void tQueryAutoQWorkerCleanup(SQueryAutoQWorkerPool *pPool) { int32_t idx = 0; SQueryAutoQWorker *worker = NULL; - while (true) { + while (pPool->workers) { (void)taosThreadMutexLock(&pPool->poolLock); if (listNEles(pPool->workers) == 0) { (void)taosThreadMutexUnlock(&pPool->poolLock); @@ -861,7 +870,7 @@ void tQueryAutoQWorkerCleanup(SQueryAutoQWorkerPool *pPool) { taosMemoryFree(pNode); } - while (listNEles(pPool->backupWorkers) > 0) { + while (pPool->backupWorkers && listNEles(pPool->backupWorkers) > 0) { SListNode *pNode = tdListPopHead(pPool->backupWorkers); worker = (SQueryAutoQWorker *)pNode->data; if (worker && taosCheckPthreadValid(worker->thread)) { @@ -871,7 +880,7 @@ void tQueryAutoQWorkerCleanup(SQueryAutoQWorkerPool *pPool) { taosMemoryFree(pNode); } - while (listNEles(pPool->exitedWorkers) > 0) { + while (pPool->exitedWorkers && 
listNEles(pPool->exitedWorkers) > 0) { SListNode *pNode = tdListPopHead(pPool->exitedWorkers); worker = (SQueryAutoQWorker *)pNode->data; if (worker && taosCheckPthreadValid(worker->thread)) { @@ -932,7 +941,6 @@ STaosQueue *tQueryAutoQWorkerAllocQueue(SQueryAutoQWorkerPool *pool, void *ahand if (taosThreadCreate(&pWorker->thread, &thAttr, (ThreadFp)tQueryAutoQWorkerThreadFp, pWorker) != 0) { taosCloseQueue(queue); - terrno = TSDB_CODE_OUT_OF_MEMORY; queue = NULL; break; } diff --git a/source/util/test/CMakeLists.txt b/source/util/test/CMakeLists.txt index e8aabfe338..0d8774ba41 100644 --- a/source/util/test/CMakeLists.txt +++ b/source/util/test/CMakeLists.txt @@ -119,6 +119,13 @@ add_test( COMMAND bufferTest ) +add_executable(regexTest "regexTest.cpp") +target_link_libraries(regexTest os util gtest_main ) +add_test( + NAME regexTest + COMMAND regexTest +) + #add_executable(decompressTest "decompressTest.cpp") #target_link_libraries(decompressTest os util common gtest_main) #add_test( diff --git a/source/util/test/regexTest.cpp b/source/util/test/regexTest.cpp new file mode 100644 index 0000000000..5fe3701700 --- /dev/null +++ b/source/util/test/regexTest.cpp @@ -0,0 +1,344 @@ + +#include +#include +#include +#include +#include +#include "os.h" +#include "tutil.h" +#include "regex.h" +#include "osDef.h" +#include "tcompare.h" + +extern "C" { + typedef struct UsingRegex UsingRegex; + typedef struct HashRegexPtr HashRegexPtr; + int32_t getRegComp(const char *pPattern, HashRegexPtr **regexRet); + int32_t threadGetRegComp(regex_t **regex, const char *pPattern); +} + +class regexTest { + public: + regexTest() { (void)InitRegexCache(); } + ~regexTest() { (void)DestroyRegexCache(); } +}; +static regexTest test; + +static threadlocal regex_t pRegex; +static threadlocal char *pOldPattern = NULL; + +void DestoryThreadLocalRegComp1() { + if (NULL != pOldPattern) { + regfree(&pRegex); + taosMemoryFree(pOldPattern); + pOldPattern = NULL; + } +} + +static regex_t 
*threadGetRegComp1(const char *pPattern) { + if (NULL != pOldPattern) { + if( strcmp(pOldPattern, pPattern) == 0) { + return &pRegex; + } else { + DestoryThreadLocalRegComp1(); + } + } + pOldPattern = (char*)taosMemoryMalloc(strlen(pPattern) + 1); + if (NULL == pOldPattern) { + uError("Failed to Malloc when compile regex pattern %s.", pPattern); + return NULL; + } + strcpy(pOldPattern, pPattern); + int32_t cflags = REG_EXTENDED; + int32_t ret = regcomp(&pRegex, pPattern, cflags); + if (ret != 0) { + char msgbuf[256] = {0}; + regerror(ret, &pRegex, msgbuf, tListLen(msgbuf)); + uError("Failed to compile regex pattern %s. reason %s", pPattern, msgbuf); + DestoryThreadLocalRegComp1(); + return NULL; + } + return &pRegex; +} + +TEST(testCase, regexCacheTest1) { + int times = 100000; + char s1[] = "abc"; + auto start = std::chrono::high_resolution_clock::now(); + + uint64_t t0 = taosGetTimestampUs(); + for (int i = 0; i < times; i++) { + HashRegexPtr* ret = NULL; + int32_t code = getRegComp(s1, &ret); + if (code != 0) { + FAIL() << "Failed to compile regex pattern " << s1; + } + } + uint64_t t1 = taosGetTimestampUs(); + + printf("%s regex(current) %d times:%" PRIu64 " us.\n", s1, times, t1 - t0); + + uint64_t t2 = taosGetTimestampUs(); + for(int i = 0; i < times; i++) { + regex_t* rex = threadGetRegComp1(s1); + } + uint64_t t3 = taosGetTimestampUs(); + + printf("%s regex(before) %d times:%" PRIu64 " us.\n", s1, times, t3 - t2); + + t2 = taosGetTimestampUs(); + for(int i = 0; i < times; i++) { + regex_t* rex = NULL; + (void)threadGetRegComp(&rex, s1); + } + t3 = taosGetTimestampUs(); + + printf("%s regex(new) %d times:%" PRIu64 " us.\n", s1, times, t3 - t2); +} + +TEST(testCase, regexCacheTest2) { + int times = 100000; + char s1[] = "abc%*"; + auto start = std::chrono::high_resolution_clock::now(); + + uint64_t t0 = taosGetTimestampUs(); + for (int i = 0; i < times; i++) { + HashRegexPtr* ret = NULL; + int32_t code = getRegComp(s1, &ret); + if (code != 0) { + FAIL() << 
"Failed to compile regex pattern " << s1; + } + } + uint64_t t1 = taosGetTimestampUs(); + + printf("%s regex(current) %d times:%" PRIu64 " us.\n", s1, times, t1 - t0); + + uint64_t t2 = taosGetTimestampUs(); + for(int i = 0; i < times; i++) { + regex_t* rex = threadGetRegComp1(s1); + } + uint64_t t3 = taosGetTimestampUs(); + + printf("%s regex(before) %d times:%" PRIu64 " us.\n", s1, times, t3 - t2); + + t2 = taosGetTimestampUs(); + for(int i = 0; i < times; i++) { + regex_t* rex = NULL; + (void)threadGetRegComp(&rex, s1); + } + t3 = taosGetTimestampUs(); + + printf("%s regex(new) %d times:%" PRIu64 " us.\n", s1, times, t3 - t2); +} + +TEST(testCase, regexCacheTest3) { + int times = 100000; + char s1[] = "abc%*"; + char s2[] = "abc"; + auto start = std::chrono::high_resolution_clock::now(); + + uint64_t t0 = taosGetTimestampUs(); + for (int i = 0; i < times; i++) { + HashRegexPtr* ret = NULL; + int32_t code = getRegComp(s1, &ret); + if (code != 0) { + FAIL() << "Failed to compile regex pattern " << s1; + } + } + uint64_t t1 = taosGetTimestampUs(); + + printf("'%s' and '%s' take place by turn regex(current) %d times:%" PRIu64 " us.\n", s1, s2, times, t1 - t0); + + uint64_t t2 = taosGetTimestampUs(); + for(int i = 0; i < times; i++) { + regex_t* rex = threadGetRegComp1(s1); + rex = threadGetRegComp1(s2); + } + uint64_t t3 = taosGetTimestampUs(); + + printf("'%s' and '%s' take place by turn regex(before) %d times:%" PRIu64 " us.\n", s1, s2, times, t3 - t2); + + t2 = taosGetTimestampUs(); + for(int i = 0; i < times; i++) { + regex_t* rex = NULL; + (void)threadGetRegComp(&rex, s1); + (void)threadGetRegComp(&rex, s2); + } + t3 = taosGetTimestampUs(); + + printf("'%s' and '%s' take place by turn regex(new) %d times:%" PRIu64 " us.\n", s1, s2, times, t3 - t2); +} + +TEST(testCase, regexCacheTest4) { + int times = 100; + int count = 1000; + char s1[] = "abc%*"; + char s2[] = "abc"; + auto start = std::chrono::high_resolution_clock::now(); + + uint64_t t0 = 
taosGetTimestampUs(); + for (int i = 0; i < times; i++) { + for (int j = 0; j < count; ++j) { + HashRegexPtr* ret = NULL; + int32_t code = getRegComp(s1, &ret); + if (code != 0) { + FAIL() << "Failed to compile regex pattern " << s1; + } + } + for (int j = 0; j < count; ++j) { + HashRegexPtr* ret = NULL; + int32_t code = getRegComp(s2, &ret); + if (code != 0) { + FAIL() << "Failed to compile regex pattern " << s2; + } + } + } + uint64_t t1 = taosGetTimestampUs(); + + printf("'%s' and '%s' take place by turn(per %d count) regex(current) %d times:%" PRIu64 " us.\n", s1, s2, count, times, t1 - t0); + + uint64_t t2 = taosGetTimestampUs(); + for (int i = 0; i < times; i++) { + for (int j = 0; j < count; ++j) { + regex_t* rex = threadGetRegComp1(s1); + } + for (int j = 0; j < count; ++j) { + regex_t* rex = threadGetRegComp1(s2); + } + } + uint64_t t3 = taosGetTimestampUs(); + + printf("'%s' and '%s' take place by turn(per %d count) regex(before) %d times:%" PRIu64 " us.\n", s1, s2, count, times, t3 - t2); + + t2 = taosGetTimestampUs(); + for (int i = 0; i < times; i++) { + for (int j = 0; j < count; ++j) { + regex_t* rex = NULL; + (void)threadGetRegComp(&rex, s1); + } + for (int j = 0; j < count; ++j) { + regex_t* rex = NULL; + (void)threadGetRegComp(&rex, s2); + } + } + t3 = taosGetTimestampUs(); + + printf("'%s' and '%s' take place by turn(per %d count) regex(new) %d times:%" PRIu64 " us.\n", s1, s2, count, times, t3 - t2); +} + +// It is not a good idea to test this case, because it will take a long time. 
+/* +TEST(testCase, regexCacheTest5) { + int times = 10000; + int count = 10000; + char s1[] = "abc%*"; + char s2[] = "abc"; + auto start = std::chrono::high_resolution_clock::now(); + + uint64_t t0 = taosGetTimestampUs(); + for (int i = 0; i < times; i++) { + for (int j = 0; j < count; ++j) { + HashRegexPtr* ret = NULL; + int32_t code = getRegComp(s1, &ret); + if (code != 0) { + FAIL() << "Failed to compile regex pattern " << s1; + } + } + for (int j = 0; j < count; ++j) { + HashRegexPtr* ret = NULL; + int32_t code = getRegComp(s2, &ret); + if (code != 0) { + FAIL() << "Failed to compile regex pattern " << s2; + } + } + } + uint64_t t1 = taosGetTimestampUs(); + + printf("'%s' and '%s' take place by turn(per %d count) regex(current) %d times:%" PRIu64 " us.\n", s1, s2, count, times, t1 - t0); + + uint64_t t2 = taosGetTimestampUs(); + for (int i = 0; i < times; i++) { + for (int j = 0; j < count; ++j) { + regex_t* rex = threadGetRegComp1(s1); + } + for (int j = 0; j < count; ++j) { + regex_t* rex = threadGetRegComp1(s2); + } + } + uint64_t t3 = taosGetTimestampUs(); + + printf("'%s' and '%s' take place by turn(per %d count) regex(before) %d times:%" PRIu64 " us.\n", s1, s2, count, times, t3 - t2); + + t2 = taosGetTimestampUs(); + for (int i = 0; i < times; i++) { + for (int j = 0; j < count; ++j) { + regex_t* rex = NULL; + (void)threadGetRegComp(&rex, s1); + } + for (int j = 0; j < count; ++j) { + regex_t* rex = NULL; + (void)threadGetRegComp(&rex, s2); + } + } + t3 = taosGetTimestampUs(); + + printf("'%s' and '%s' take place by turn(per %d count) regex(new) %d times:%" PRIu64 " us.\n", s1, s2, count, times, t3 - t2); +} + +TEST(testCase, regexCacheTest6) { + int times = 10000; + int count = 1000; + char s1[] = "abc%*"; + char s2[] = "abc"; + auto start = std::chrono::high_resolution_clock::now(); + + uint64_t t0 = taosGetTimestampUs(); + for (int i = 0; i < times; i++) { + for (int j = 0; j < count; ++j) { + HashRegexPtr* ret = NULL; + int32_t code = getRegComp(s1, 
&ret); + if (code != 0) { + FAIL() << "Failed to compile regex pattern " << s1; + } + } + for (int j = 0; j < count; ++j) { + HashRegexPtr* ret = NULL; + int32_t code = getRegComp(s2, &ret); + if (code != 0) { + FAIL() << "Failed to compile regex pattern " << s2; + } + } + } + uint64_t t1 = taosGetTimestampUs(); + + printf("'%s' and '%s' take place by turn(per %d count) regex(current) %d times:%" PRIu64 " us.\n", s1, s2, count, times, t1 - t0); + + uint64_t t2 = taosGetTimestampUs(); + for (int i = 0; i < times; i++) { + for (int j = 0; j < count; ++j) { + regex_t* rex = threadGetRegComp1(s1); + } + for (int j = 0; j < count; ++j) { + regex_t* rex = threadGetRegComp1(s2); + } + } + uint64_t t3 = taosGetTimestampUs(); + + printf("'%s' and '%s' take place by turn(per %d count) regex(before) %d times:%" PRIu64 " us.\n", s1, s2, count, times, t3 - t2); + + t2 = taosGetTimestampUs(); + for (int i = 0; i < times; i++) { + for (int j = 0; j < count; ++j) { + regex_t* rex = NULL; + (void)threadGetRegComp(&rex, s1); + } + for (int j = 0; j < count; ++j) { + regex_t* rex = NULL; + (void)threadGetRegComp(&rex, s2); + } + } + t3 = taosGetTimestampUs(); + + printf("'%s' and '%s' take place by turn(per %d count) regex(new) %d times:%" PRIu64 " us.\n", s1, s2, count, times, t3 - t2); +} +*/ diff --git a/tests/army/cmdline/fullopt.py b/tests/army/cmdline/fullopt.py index 6fc8e858a3..b80d7eac4a 100644 --- a/tests/army/cmdline/fullopt.py +++ b/tests/army/cmdline/fullopt.py @@ -60,7 +60,7 @@ class TDTestCase(TBase): "enableCoreFile 1", "fqdn 127.0.0.1", "firstEp 127.0.0.1", - "locale ENG", + "locale en_US.UTF-8", "metaCacheMaxSize 10000", "minimalTmpDirGB 5", "minimalLogDirGB 1", diff --git a/tests/army/query/queryBugs.py b/tests/army/query/queryBugs.py index 20ecb23881..a7a6f35372 100644 --- a/tests/army/query/queryBugs.py +++ b/tests/army/query/queryBugs.py @@ -113,6 +113,34 @@ class TDTestCase(TBase): tdSql.checkData(0, 0, f"nihao{num + 2}") tdSql.checkData(0, 1, f"{11*i}") + def 
FIX_TS_5239(self): + tdLog.info("check bug TS_5239 ...\n") + sqls = [ + "drop database if exists ts_5239", + "create database ts_5239 cachemodel 'both' stt_trigger 1;", + "use ts_5239;", + "CREATE STABLE st (ts timestamp, c1 int) TAGS (groupId int);", + "CREATE TABLE ct1 USING st TAGS (1);" + ] + tdSql.executes(sqls) + # 2024-07-03 06:00:00.000 + start_ts = 1719957600000 + # insert 100 rows + sql = "insert into ct1 values " + for i in range(100): + sql += f"('{start_ts+i * 100}', {i+1})" + sql += ";" + tdSql.execute(sql) + tdSql.execute("flush database ts_5239;") + tdSql.execute("alter database ts_5239 stt_trigger 3;") + tdSql.execute(f"insert into ct1(ts) values({start_ts - 100 * 100})") + tdSql.execute("flush database ts_5239;") + tdSql.execute(f"insert into ct1(ts) values({start_ts + 100 * 200})") + tdSql.execute("flush database ts_5239;") + tdSql.query("select count(*) from ct1;") + tdSql.checkRows(1) + tdSql.checkData(0, 0, 102) + # run def run(self): tdLog.debug(f"start to excute {__file__}") @@ -123,6 +151,7 @@ class TDTestCase(TBase): # TS BUGS self.FIX_TS_5105() self.FIX_TS_5143() + self.FIX_TS_5239() tdLog.success(f"{__file__} successfully executed") diff --git a/tests/army/storage/compressRatio.py b/tests/army/storage/compressRatio.py new file mode 100644 index 0000000000..a5875a02b0 --- /dev/null +++ b/tests/army/storage/compressRatio.py @@ -0,0 +1,98 @@ +from util.log import * +from util.sql import * +from util.cases import * +from util.dnodes import * +from util.common import * +import json +import random + + +class TDTestCase: + def init(self, conn, logSql, replicaVar=1): + self.replicaVar = int(replicaVar) + tdLog.debug(f"start to excute {__file__}") + tdSql.init(conn.cursor(), True) + + + def checksql(self, sql): + result = os.popen(f"taos -s \"{sql}\" ") + res = result.read() + print(res) + if ("Query OK" in res): + tdLog.info(f"checkEqual success") + else : + tdLog.exit(f"checkEqual error") + def generate_random_str(self,randomlength=32): + """ + 
Generate a random string of the specified length + """ + random_str = '' + base_str = 'ABCDEFGHIGKLMNOPQRSTUVWXYZabcdefghigklmnopqrstuvwxyz1234567890' + #base_str = 'ABCDEFGHIGKLMNOPQRSTUVWXYZabcdefghigklmnopqrstuvwxyz' + length = len(base_str) - 1 + count = 0 + for i in range(randomlength): + count = count + 1 + random_str += base_str[random.randint(0, length)] + return random_str + def check(self): + # tdSql.execute("create database db" ) + # tdSql.execute("create table db.jtable (ts timestamp, c1 VARCHAR(64000))",queryTimes=2) + # with open('./1-insert/temp.json', 'r') as f: + # data = json.load(f) + # json_str=json.dumps(data) + # print(data,type(data),type(json_str)) + # json_str=json_str.replace('"','\\"') + # # sql = f"insert into db.jtable values(now,\"{json_str}\") " + # # os.system(f"taos -s {sql} ") + # rowNum = 100 + # step = 1000 + # self.ts = 1537146000000 + # for j in range(1000): + # sql = "insert into db.jtable values" + # for k in range(rowNum): + # self.ts += step + # sql += f"({self.ts},\"{json_str}\") " + # tdSql.execute(sql,queryTimes=2) + # tdSql.execute("flush database db",queryTimes=2) + + tdSql.execute("create database db1" ) + tdSql.execute("create table db1.jtable (ts timestamp, c1 VARCHAR(6400) compress 'zstd')",queryTimes=2) + # with open('./1-insert/seedStr.json', 'r') as f: + # data = f.read() + # json_str=str(data) + # print(data,type(data),type(json_str)) + # json_str=json_str.replace('"','\\"') + + + rowNum = 100 + step = 1000 + self.ts = 1657146000000 + f=self.generate_random_str(5750) + json_str=f.replace('"','\\"') + for j in range(1000): + sql = "insert into db1.jtable values" + # f=self.generate_random_str(5750) + # json_str=f.replace('"','\\"') + for k in range(rowNum): + self.ts += step + f=self.generate_random_str(5750) + json_str=f.replace('"','\\"') + sql += f"({self.ts},\"{json_str}\") " + #print(sql) + tdSql.execute(sql,queryTimes=2) + tdSql.execute("flush database db1",queryTimes=2) + + + def run(self): + self.check() + + + def stop(self): +
tdSql.close() + tdLog.success(f"{__file__} successfully executed") + + + +tdCases.addLinux(__file__, TDTestCase()) +tdCases.addWindows(__file__, TDTestCase()) diff --git a/tests/army/tmq/tmqBugs.py b/tests/army/tmq/tmqBugs.py new file mode 100644 index 0000000000..f2ef433665 --- /dev/null +++ b/tests/army/tmq/tmqBugs.py @@ -0,0 +1,98 @@ + +import taos +import sys +import time +import socket +import os +import threading + +from frame.log import * +from frame.cases import * +from frame.sql import * +from frame.caseBase import * +from frame import * +from taos.tmq import * +import frame.etool + +class TDTestCase: + updatecfgDict = {'debugFlag': 135, 'asynclog': 0} + def init(self, conn, logSql, replicaVar=1): + self.replicaVar = int(replicaVar) + tdLog.debug(f"start to excute {__file__}") + tdSql.init(conn.cursor()) + #tdSql.init(conn.cursor(), logSql) # output sql.txt file + + def td_31283_test(self): + tdSql.execute(f'create database if not exists d1 vgroups 1') + tdSql.execute(f'use d1') + tdSql.execute(f'create table st(ts timestamp, i int) tags(t int)') + tdSql.execute(f'insert into t1 using st tags(1) values(now, 1) (now+1s, 2)') + tdSql.execute(f'insert into t2 using st tags(2) values(now, 1) (now+1s, 2)') + tdSql.execute(f'insert into t3 using st tags(3) values(now, 1) (now+1s, 2)') + tdSql.execute(f'insert into t1 using st tags(1) values(now+5s, 11) (now+10s, 12)') + + tdSql.query("select * from st") + tdSql.checkRows(8) + + tdSql.error(f'create topic t1 with meta as database d2', expectErrInfo="Database not exist") + tdSql.error(f'create topic t1 as database d2', expectErrInfo="Database not exist") + tdSql.error(f'create topic t2 as select * from st2', expectErrInfo="Fail to get table info, error: Table does not exist") + tdSql.error(f'create topic t3 as stable st2', expectErrInfo="STable not exist") + tdSql.error(f'create topic t3 with meta as stable st2', expectErrInfo="STable not exist") + + tdSql.execute(f'create topic t1 with meta as database d1') + + 
consumer_dict = { + "group.id": "g1", + "td.connect.user": "root", + "td.connect.pass": "taosdata", + "auto.offset.reset": "earliest", + # "msg.enable.batchmeta": "true", + "experimental.snapshot.enable": "true", + } + consumer1 = Consumer(consumer_dict) + + try: + consumer1.subscribe(["t1"]) + except TmqError: + tdLog.exit(f"subscribe error") + + index = 0 + try: + while True: + res = consumer1.poll(1) + if not res: + if index != 1: + tdLog.exit("consume error") + break + val = res.value() + if val is None: + continue + cnt = 0; + for block in val: + cnt += len(block.fetchall()) + + if cnt != 8: + tdLog.exit("consume error") + + index += 1 + finally: + consumer1.close() + + + tdSql.query(f'show consumers') + tdSql.checkRows(0) + + tdSql.execute(f'drop topic t1') + tdSql.execute(f'drop database d1') + + def run(self): + self.td_31283_test() + + + def stop(self): + tdSql.close() + tdLog.success(f"{__file__} successfully executed") + +tdCases.addLinux(__file__, TDTestCase()) +tdCases.addWindows(__file__, TDTestCase()) diff --git a/tests/ci/scan_file_path.py b/tests/ci/scan_file_path.py index 5a4f5a535d..aff94158b8 100644 --- a/tests/ci/scan_file_path.py +++ b/tests/ci/scan_file_path.py @@ -122,7 +122,7 @@ def scan_files_path(source_file_path): for file in files: if any(item in root for item in scan_dir_list): file_path = os.path.join(root, file) - if (file_path.endswith(".c") or file_path.endswith(".cpp")) and all(item not in file_path for item in scan_skip_file_list): + if (file_path.endswith(".c") or file_name.endswith(".h") or file_path.endswith(".cpp")) and all(item not in file_path for item in scan_skip_file_list): all_file_path.append(file_path) logger.info("Found %s files" % len(all_file_path)) @@ -134,7 +134,7 @@ def input_files(change_files): for line in file: file_name = line.strip() if any(dir_name in file_name for dir_name in scan_dir_list): - if (file_name.endswith(".c") or file_name.endswith(".h") or line.endswith(".cpp")) and all(dir_name not in 
file_name for dir_name in scan_skip_file_list): + if (file_name.endswith(".c") or line.endswith(".cpp")) and all(dir_name not in file_name for dir_name in scan_skip_file_list): if "enterprise" in file_name: file_name = os.path.join(TD_project_path, file_name) else: diff --git a/tests/parallel_test/cases.task b/tests/parallel_test/cases.task index e70042001d..c7c542762c 100644 --- a/tests/parallel_test/cases.task +++ b/tests/parallel_test/cases.task @@ -27,14 +27,15 @@ ,,y,army,./pytest.sh python3 ./test.py -f insert/insert_basic.py -N 3 ,,y,army,./pytest.sh python3 ./test.py -f cluster/splitVgroupByLearner.py -N 3 ,,y,army,./pytest.sh python3 ./test.py -f authorith/authBasic.py -N 3 -# ,,n,army,python3 ./test.py -f cmdline/fullopt.py -,,n,army,python3 ./test.py -f query/show.py -N 3 -,,n,army,python3 ./test.py -f alter/alterConfig.py -N 3 +,,n,army,python3 ./test.py -f cmdline/fullopt.py +,,y,army,./pytest.sh python3 ./test.py -f query/show.py -N 3 +,,y,army,./pytest.sh python3 ./test.py -f alter/alterConfig.py -N 3 ,,y,army,./pytest.sh python3 ./test.py -f query/subquery/subqueryBugs.py -N 3 ,,y,army,./pytest.sh python3 ./test.py -f storage/oneStageComp.py -N 3 -L 3 -D 1 ,,y,army,./pytest.sh python3 ./test.py -f storage/compressBasic.py -N 3 ,,y,army,./pytest.sh python3 ./test.py -f grant/grantBugs.py -N 3 ,,y,army,./pytest.sh python3 ./test.py -f query/queryBugs.py -N 3 +,,y,army,./pytest.sh python3 ./test.py -f tmq/tmqBugs.py -N 3 # # system test @@ -162,6 +163,7 @@ ,,y,system-test,./pytest.sh python3 ./test.py -f 1-insert/stt_blocks_check.py ,,y,system-test,./pytest.sh python3 ./test.py -f 2-query/out_of_order.py -Q 3 ,,y,system-test,./pytest.sh python3 ./test.py -f 2-query/out_of_order.py +,,y,system-test,./pytest.sh python3 ./test.py -f 2-query/agg_null.py ,,y,system-test,./pytest.sh python3 ./test.py -f 2-query/insert_null_none.py ,,y,system-test,./pytest.sh python3 ./test.py -f 2-query/insert_null_none.py -R ,,y,system-test,./pytest.sh python3 ./test.py -f 
2-query/insert_null_none.py -Q 2 @@ -279,8 +281,8 @@ ,,y,system-test,./pytest.sh python3 ./test.py -f 7-tmq/tmq3mnodeSwitch.py -N 6 -M 3 -n 3 -i True ,,y,system-test,./pytest.sh python3 test.py -f 7-tmq/tmqVnodeTransform-db-removewal.py -N 2 -n 1 ,,y,system-test,./pytest.sh python3 test.py -f 7-tmq/tmqVnodeTransform-stb-removewal.py -N 6 -n 3 -#,,y,system-test,./pytest.sh python3 test.py -f 7-tmq/tmqVnodeTransform-stb.py -N 2 -n 1 -#,,y,system-test,./pytest.sh python3 test.py -f 7-tmq/tmqVnodeTransform-stb.py -N 6 -n 3 +,,y,system-test,./pytest.sh python3 test.py -f 7-tmq/tmqVnodeTransform-stb.py -N 2 -n 1 +,,y,system-test,./pytest.sh python3 test.py -f 7-tmq/tmqVnodeTransform-stb.py -N 6 -n 3 #,,y,system-test,./pytest.sh python3 test.py -f 7-tmq/tmqVnodeTransform-db.py -N 6 -n 3 ,,y,system-test,./pytest.sh python3 test.py -f 7-tmq/tmqVnodeSplit-stb-select.py -N 2 -n 1 ,,y,system-test,./pytest.sh python3 test.py -f 7-tmq/tmqVnodeSplit-stb-select-duplicatedata.py -N 3 -n 3 diff --git a/tests/requirements.txt b/tests/requirements.txt index 5cdd9e02be..c6dd044c86 100644 --- a/tests/requirements.txt +++ b/tests/requirements.txt @@ -9,3 +9,4 @@ requests pexpect faker pyopenssl +hyperloglog \ No newline at end of file diff --git a/tests/script/sh/bit_and.c b/tests/script/sh/bit_and.c index f3bf71ce94..c35f1da171 100644 --- a/tests/script/sh/bit_and.c +++ b/tests/script/sh/bit_and.c @@ -8,13 +8,17 @@ DLL_EXPORT int32_t bit_and_init() { return 0; } DLL_EXPORT int32_t bit_and_destroy() { return 0; } DLL_EXPORT int32_t bit_and(SUdfDataBlock* block, SUdfColumn* resultCol) { + udfTrace("block:%p, processing begins, rows:%d cols:%d", block, block->numOfRows, block->numOfCols); + if (block->numOfCols < 2) { + udfError("block:%p, cols:%d needs to be greater than 2", block, block->numOfCols); return TSDB_CODE_UDF_INVALID_INPUT; } for (int32_t i = 0; i < block->numOfCols; ++i) { SUdfColumn* col = block->udfCols[i]; - if (!(col->colMeta.type == TSDB_DATA_TYPE_INT)) { + if 
(col->colMeta.type != TSDB_DATA_TYPE_INT) { + udfError("block:%p, col:%d type:%d should be int(%d)", block, i, col->colMeta.type, TSDB_DATA_TYPE_INT); return TSDB_CODE_UDF_INVALID_INPUT; } } @@ -24,24 +28,34 @@ DLL_EXPORT int32_t bit_and(SUdfDataBlock* block, SUdfColumn* resultCol) { for (int32_t i = 0; i < block->numOfRows; ++i) { if (udfColDataIsNull(block->udfCols[0], i)) { udfColDataSetNull(resultCol, i); + udfTrace("block:%p, row:%d result is null since col:0 is null", block, i); continue; } + int32_t result = *(int32_t*)udfColDataGetData(block->udfCols[0], i); - int j = 1; + udfTrace("block:%p, row:%d col:0 data:%d", block, i, result); + + int32_t j = 1; for (; j < block->numOfCols; ++j) { if (udfColDataIsNull(block->udfCols[j], i)) { udfColDataSetNull(resultCol, i); + udfTrace("block:%p, row:%d result is null since col:%d is null", block, i, j); break; } char* colData = udfColDataGetData(block->udfCols[j], i); result &= *(int32_t*)colData; + udfTrace("block:%p, row:%d col:%d data:%d", block, i, j, *(int32_t*)colData); } + if (j == block->numOfCols) { udfColDataSet(resultCol, i, (char*)&result, false); + udfTrace("block:%p, row:%d result is %d", block, i, result); } } + resultData->numOfRows = block->numOfRows; + udfTrace("block:%p, processing completed", block); return TSDB_CODE_SUCCESS; } diff --git a/tests/script/sh/l2norm.c b/tests/script/sh/l2norm.c index 0b7f5bf7f6..e2f379fd29 100644 --- a/tests/script/sh/l2norm.c +++ b/tests/script/sh/l2norm.c @@ -1,71 +1,87 @@ -#include -#include -#include #include - +#include +#include +#include #include "taosudf.h" -DLL_EXPORT int32_t l2norm_init() { - return 0; -} +DLL_EXPORT int32_t l2norm_init() { return 0; } -DLL_EXPORT int32_t l2norm_destroy() { - return 0; -} +DLL_EXPORT int32_t l2norm_destroy() { return 0; } -DLL_EXPORT int32_t l2norm_start(SUdfInterBuf *buf) { +DLL_EXPORT int32_t l2norm_start(SUdfInterBuf* buf) { + int32_t bufLen = sizeof(double); + if (buf->bufLen < bufLen) { + udfError("failed to execute 
udf since input buflen:%d < %d", buf->bufLen, bufLen); + return TSDB_CODE_UDF_INVALID_BUFSIZE; + } + + udfTrace("start aggregation, buflen:%d used:%d", buf->bufLen, bufLen); *(int64_t*)(buf->buf) = 0; - buf->bufLen = sizeof(double); - buf->numOfResult = 1; + buf->bufLen = bufLen; + buf->numOfResult = 0; return 0; } -DLL_EXPORT int32_t l2norm(SUdfDataBlock* block, SUdfInterBuf *interBuf, SUdfInterBuf *newInterBuf) { - double sumSquares = *(double*)interBuf->buf; - int8_t numNotNull = 0; +DLL_EXPORT int32_t l2norm(SUdfDataBlock* block, SUdfInterBuf* interBuf, SUdfInterBuf* newInterBuf) { + udfTrace("block:%p, processing begins, cols:%d rows:%d", block, block->numOfCols, block->numOfRows); + for (int32_t i = 0; i < block->numOfCols; ++i) { SUdfColumn* col = block->udfCols[i]; - if (!(col->colMeta.type == TSDB_DATA_TYPE_INT || - col->colMeta.type == TSDB_DATA_TYPE_DOUBLE)) { + if (col->colMeta.type != TSDB_DATA_TYPE_INT && col->colMeta.type != TSDB_DATA_TYPE_DOUBLE) { + udfError("block:%p, col:%d type:%d should be int(%d) or double(%d)", block, i, col->colMeta.type, + TSDB_DATA_TYPE_INT, TSDB_DATA_TYPE_DOUBLE); return TSDB_CODE_UDF_INVALID_INPUT; } } + + double sumSquares = *(double*)interBuf->buf; + int8_t numNotNull = 0; + for (int32_t i = 0; i < block->numOfCols; ++i) { for (int32_t j = 0; j < block->numOfRows; ++j) { SUdfColumn* col = block->udfCols[i]; if (udfColDataIsNull(col, j)) { + udfTrace("block:%p, col:%d row:%d is null", block, i, j); continue; } + switch (col->colMeta.type) { case TSDB_DATA_TYPE_INT: { - char* cell = udfColDataGetData(col, j); + char* cell = udfColDataGetData(col, j); int32_t num = *(int32_t*)cell; sumSquares += (double)num * num; + udfTrace("block:%p, col:%d row:%d data:%d", block, i, j, num); break; } case TSDB_DATA_TYPE_DOUBLE: { - char* cell = udfColDataGetData(col, j); + char* cell = udfColDataGetData(col, j); double num = *(double*)cell; sumSquares += num * num; + udfTrace("block:%p, col:%d row:%d data:%f", block, i, j, num); break; 
} - default: + default: break; } ++numNotNull; } + udfTrace("block:%p, col:%d result is %f", block, i, sumSquares); } *(double*)(newInterBuf->buf) = sumSquares; newInterBuf->bufLen = sizeof(double); newInterBuf->numOfResult = 1; + + udfTrace("block:%p, result is %f", block, sumSquares); return 0; } -DLL_EXPORT int32_t l2norm_finish(SUdfInterBuf* buf, SUdfInterBuf *resultData) { +DLL_EXPORT int32_t l2norm_finish(SUdfInterBuf* buf, SUdfInterBuf* resultData) { double sumSquares = *(double*)(buf->buf); *(double*)(resultData->buf) = sqrt(sumSquares); resultData->bufLen = sizeof(double); resultData->numOfResult = 1; + + udfTrace("end aggregation, result is %f", *(double*)(resultData->buf)); return 0; } diff --git a/tests/script/sh/max_vol.c b/tests/script/sh/max_vol.c index 4f9ecd33a7..1a7a3f8210 100644 --- a/tests/script/sh/max_vol.c +++ b/tests/script/sh/max_vol.c @@ -1,101 +1,113 @@ -#include -#include -#include #include - +#include +#include +#include #include "taosudf.h" -#define STR_MAX_LEN 256 // inter buffer length +#define STR_MAX_LEN 256 // inter buffer length -// init -DLL_EXPORT int32_t max_vol_init() -{ - return 0; -} +DLL_EXPORT int32_t max_vol_init() { return 0; } -// destory -DLL_EXPORT int32_t max_vol_destroy() -{ - return 0; -} +DLL_EXPORT int32_t max_vol_destroy() { return 0; } -// start -DLL_EXPORT int32_t max_vol_start(SUdfInterBuf *buf) -{ - memset(buf->buf, 0, sizeof(float) + STR_MAX_LEN); - // set init value - *((float*)buf->buf) = -10000000; - buf->bufLen = sizeof(float) + STR_MAX_LEN; - buf->numOfResult = 0; - return 0; +DLL_EXPORT int32_t max_vol_start(SUdfInterBuf *buf) { + int32_t bufLen = sizeof(float) + STR_MAX_LEN; + if (buf->bufLen < bufLen) { + udfError("failed to execute udf since input buflen:%d < %d", buf->bufLen, bufLen); + return TSDB_CODE_UDF_INVALID_BUFSIZE; + } + + udfTrace("start aggregation, buflen:%d used:%d", buf->bufLen, bufLen); + memset(buf->buf, 0, sizeof(float) + STR_MAX_LEN); + *((float *)buf->buf) = INT32_MIN; + 
buf->bufLen = bufLen; + buf->numOfResult = 0; + return 0; } DLL_EXPORT int32_t max_vol(SUdfDataBlock *block, SUdfInterBuf *interBuf, SUdfInterBuf *newInterBuf) { - float maxValue = *(float *)interBuf->buf; - char strBuff[STR_MAX_LEN] = "inter1buf"; - - if (block->numOfCols < 2) - { + udfTrace("block:%p, processing begins, cols:%d rows:%d", block, block->numOfCols, block->numOfRows); + + float maxValue = *(float *)interBuf->buf; + char strBuff[STR_MAX_LEN] = "inter1buf"; + + if (block->numOfCols < 2) { + udfError("block:%p, cols:%d needs to be greater than 2", block, block->numOfCols); + return TSDB_CODE_UDF_INVALID_INPUT; + } + + // check data type + for (int32_t i = 0; i < block->numOfCols; ++i) { + SUdfColumn *col = block->udfCols[i]; + if (i == block->numOfCols - 1) { + // last column is device id , must varchar + if (col->colMeta.type != TSDB_DATA_TYPE_VARCHAR) { + udfError("block:%p, col:%d type:%d should be varchar(%d)", block, i, col->colMeta.type, TSDB_DATA_TYPE_VARCHAR); return TSDB_CODE_UDF_INVALID_INPUT; + } + } else { + if (col->colMeta.type != TSDB_DATA_TYPE_FLOAT) { + udfError("block:%p, col:%d type:%d should be float(%d)", block, i, col->colMeta.type, TSDB_DATA_TYPE_FLOAT); + return TSDB_CODE_UDF_INVALID_INPUT; + } } + } - // check data type - for (int32_t i = 0; i < block->numOfCols; ++i) - { - SUdfColumn *col = block->udfCols[i]; - if( i == block->numOfCols - 1) { - // last column is device id , must varchar - if (col->colMeta.type != TSDB_DATA_TYPE_VARCHAR ) { - return TSDB_CODE_UDF_INVALID_INPUT; - } - } else { - if (col->colMeta.type != TSDB_DATA_TYPE_FLOAT) { - return TSDB_CODE_UDF_INVALID_INPUT; - } - } + // calc max voltage + SUdfColumn *lastCol = block->udfCols[block->numOfCols - 1]; + for (int32_t i = 0; i < block->numOfCols - 1; ++i) { + for (int32_t j = 0; j < block->numOfRows; ++j) { + SUdfColumn *col = block->udfCols[i]; + if (udfColDataIsNull(col, j)) { + udfTrace("block:%p, col:%d row:%d is null", block, i, j); + continue; + } + + 
char *data = udfColDataGetData(col, j); + float voltage = *(float *)data; + + if (voltage <= maxValue) { + udfTrace("block:%p, col:%d row:%d data:%f", block, i, j, voltage); + } else { + maxValue = voltage; + char *valData = udfColDataGetData(lastCol, j); + int32_t valDataLen = udfColDataGetDataLen(lastCol, j); + + // get device id + char *deviceId = valData + sizeof(uint16_t); + int32_t deviceIdLen = valDataLen < (STR_MAX_LEN - 1) ? valDataLen : (STR_MAX_LEN - 1); + + strncpy(strBuff, deviceId, deviceIdLen); + snprintf(strBuff + deviceIdLen, STR_MAX_LEN - deviceIdLen, "_(%d,%d)_%f", j, i, maxValue); + udfTrace("block:%p, col:%d row:%d data:%f, as max_val:%s", block, i, j, voltage, strBuff); + } } + } - // calc max voltage - SUdfColumn *lastCol = block->udfCols[block->numOfCols - 1]; - for (int32_t i = 0; i < (block->numOfCols - 1); ++i) { - for (int32_t j = 0; j < block->numOfRows; ++j) { - SUdfColumn *col = block->udfCols[i]; - if (udfColDataIsNull(col, j)) { - continue; - } - char *data = udfColDataGetData(col, j); - float voltage = *(float *)data; - if (voltage > maxValue) { - maxValue = voltage; - char *valData = udfColDataGetData(lastCol, j); - // get device id - char *deviceId = valData + sizeof(uint16_t); - sprintf(strBuff, "%s_(%d,%d)_%f", deviceId, j, i, maxValue); - } - } - } + *(float *)newInterBuf->buf = maxValue; + strncpy(newInterBuf->buf + sizeof(float), strBuff, STR_MAX_LEN); + newInterBuf->bufLen = sizeof(float) + strlen(strBuff) + 1; + newInterBuf->numOfResult = 1; - *(float*)newInterBuf->buf = maxValue; - strcpy(newInterBuf->buf + sizeof(float), strBuff); - newInterBuf->bufLen = sizeof(float) + strlen(strBuff)+1; - newInterBuf->numOfResult = 1; - return 0; + udfTrace("block:%p, result is %s", block, strBuff); + return 0; } -DLL_EXPORT int32_t max_vol_finish(SUdfInterBuf *buf, SUdfInterBuf *resultData) -{ - char * str = buf->buf + sizeof(float); - // copy to des - char * des = resultData->buf + sizeof(uint16_t); - strcpy(des, str); +DLL_EXPORT 
int32_t max_vol_finish(SUdfInterBuf *buf, SUdfInterBuf *resultData) { + char *str = buf->buf + sizeof(float); + // copy to des + char *des = resultData->buf + sizeof(uint16_t); + strcpy(des, str); - // set binary type len - uint16_t len = strlen(str); - *((uint16_t*)resultData->buf) = len; + // set binary type len + uint16_t len = strlen(str); + *((uint16_t *)resultData->buf) = len; - // set buf len - resultData->bufLen = len + sizeof(uint16_t); - // set row count - resultData->numOfResult = 1; - return 0; + // set buf len + resultData->bufLen = len + sizeof(uint16_t); + // set row count + resultData->numOfResult = 1; + + udfTrace("end aggregation, result is %s", str); + return 0; } diff --git a/tests/script/tsim/stream/checkpointInterval0.sim b/tests/script/tsim/stream/checkpointInterval0.sim index a548f05c82..a5e5c87704 100644 --- a/tests/script/tsim/stream/checkpointInterval0.sim +++ b/tests/script/tsim/stream/checkpointInterval0.sim @@ -76,6 +76,8 @@ system sh/stop_dnodes.sh system sh/exec.sh -n dnode1 -s start +run tsim/stream/checkTaskStatus.sim + sql insert into t1 values(1648791213002,3,2,3,1.1); $loop_count = 0 diff --git a/tests/system-test/2-query/agg_null.py b/tests/system-test/2-query/agg_null.py new file mode 100644 index 0000000000..bec879abbe --- /dev/null +++ b/tests/system-test/2-query/agg_null.py @@ -0,0 +1,136 @@ +################################################################### +# Copyright (c) 2016 by TAOS Technologies, Inc. +# All rights reserved. +# +# This file is proprietary and confidential to TAOS Technologies. 
+# No part of this file may be reproduced, stored, transmitted, +# disclosed or used in any form or by any means other than as +# expressly provided by the written permission from Jianhui Tao +# +################################################################### + +# -*- coding: utf-8 -*- + +import numpy as np +from util.log import * +from util.cases import * +from util.sql import * +from util.common import * +from util.sqlset import * +from hyperloglog import HyperLogLog +''' +Test case for TS-5150 +''' +class TDTestCase: + def init(self, conn, logSql, replicaVar=1): + self.replicaVar = int(replicaVar) + tdLog.debug("start to execute %s" % __file__) + tdSql.init(conn.cursor()) + self.ts = 1537146000000 + def initdabase(self): + tdSql.execute('create database if not exists db_test vgroups 2 buffer 10') + tdSql.execute('use db_test') + tdSql.execute('create stable stb(ts timestamp, delay int) tags(groupid int)') + tdSql.execute('create table t1 using stb tags(1)') + tdSql.execute('create table t2 using stb tags(2)') + tdSql.execute('create table t3 using stb tags(3)') + tdSql.execute('create table t4 using stb tags(4)') + tdSql.execute('create table t5 using stb tags(5)') + tdSql.execute('create table t6 using stb tags(6)') + def insert_data(self): + for i in range(5000): + tdSql.execute(f"insert into t1 values({self.ts + i * 1000}, {i%5})") + tdSql.execute(f"insert into t2 values({self.ts + i * 1000}, {i%5})") + tdSql.execute(f"insert into t3 values({self.ts + i * 1000}, {i%5})") + + def verify_agg_null(self): + for i in range(20): + col_val_list = [] + tdSql.query(f'select CASE WHEN delay != 0 THEN delay ELSE NULL END from stb where ts between {1537146000000 + i * 1000} and {1537146000000 + (i+10) * 1000}') + for col_va in tdSql.queryResult: + if col_va[0] is not None: + col_val_list.append(col_va[0]) + tdSql.query(f'SELECT APERCENTILE(CASE WHEN delay != 0 THEN delay ELSE NULL END,50) AS apercentile,\ + MAX(CASE WHEN delay != 0 THEN delay ELSE NULL END) AS 
maxDelay,\ + MIN(CASE WHEN delay != 0 THEN delay ELSE NULL END) AS minDelay,\ + AVG(CASE WHEN delay != 0 THEN delay ELSE NULL END) AS avgDelay,\ + STDDEV(CASE WHEN delay != 0 THEN delay ELSE NULL END) AS jitter,\ + COUNT(CASE WHEN delay = 0 THEN 1 ELSE NULL END) AS timeoutCount,\ + COUNT(*) AS totalCount ,\ + ELAPSED(ts) AS elapsed_time,\ + SPREAD(CASE WHEN delay != 0 THEN delay ELSE NULL END) AS spread,\ + SUM(CASE WHEN delay != 0 THEN delay ELSE NULL END) AS sum,\ + HYPERLOGLOG(CASE WHEN delay != 0 THEN delay ELSE NULL END) AS hyperloglog from stb where ts between {1537146000000 + i * 1000} and {1537146000000 + (i+10) * 1000}') + #verify apercentile + apercentile_res = tdSql.queryResult[0][0] + approximate_median = np.percentile(col_val_list, 50) + assert np.abs(apercentile_res - approximate_median) < 1 + #verify max + max_res = tdSql.queryResult[0][1] + tdSql.checkEqual(max_res,max(col_val_list)) + #verify min + min_res = tdSql.queryResult[0][2] + tdSql.checkEqual(min_res,min(col_val_list)) + #verify avg + avg_res = tdSql.queryResult[0][3] + tdSql.checkEqual(avg_res,np.average(col_val_list)) + #verify stddev + stddev_res = tdSql.queryResult[0][4] + assert np.abs(stddev_res - np.std(col_val_list)) < 0.0001 + #verify count of 0 + count of !0 == count(*) + count_res = tdSql.queryResult[0][6] + tdSql.checkEqual(count_res,len(col_val_list)+tdSql.queryResult[0][5]) + #verify elapsed + elapsed_res = tdSql.queryResult[0][7] + assert elapsed_res == 10000 + #verify spread + spread_res = tdSql.queryResult[0][8] + tdSql.checkEqual(spread_res,max(col_val_list) - min(col_val_list)) + #verify sum + sum_res = tdSql.queryResult[0][9] + tdSql.checkEqual(sum_res,sum(col_val_list)) + #verify hyperloglog + error_rate = 0.01 + hll = HyperLogLog(error_rate) + for col_val in col_val_list: + hll.add(col_val) + hll_res = tdSql.queryResult[0][10] + assert np.abs(hll_res - hll.card()) < 0.01 + #verify leastsquares + tdSql.query(f'SELECT leastsquares(CASE WHEN delay != 0 THEN delay ELSE 
NULL END,1,1) from stb where ts between {1537146000000 + i * 1000} and {1537146000000 + (i+10) * 1000}') + cleaned_data = tdSql.queryResult[0][0].strip('{}').replace(' ', '') + pairs = cleaned_data.split(',') + slope = None + intercept = None + for pair in pairs: + key, value = pair.split(':') + key = key.strip() + value = float(value.strip()) + if key == 'slop': + slope = value + elif key == 'intercept': + intercept = value + assert slope != 0 + assert intercept != 0 + #verify histogram + tdSql.query(f'SELECT histogram(CASE WHEN delay != 0 THEN delay ELSE NULL END, "user_input", "[1,3,5,7]", 1) from stb where ts between {1537146000000 + i * 1000} and {1537146000000 + (i+10) * 1000}') + cleaned_data = tdSql.queryResult[0][0].strip('{}').replace(' ', '') + pairs = cleaned_data.split(',') + count = None + for pair in pairs: + key, value = pair.split(':') + key = key.strip() + if key == 'count': + count = float(value.strip()) + assert count != 0 + def run(self): + self.initdabase() + self.insert_data() + self.verify_agg_null() + def stop(self): + tdSql.close() + tdLog.success(f"{__file__} successfully executed") + + +tdCases.addLinux(__file__, TDTestCase()) +tdCases.addWindows(__file__, TDTestCase()) diff --git a/tests/system-test/2-query/interval_limit_opt.py b/tests/system-test/2-query/interval_limit_opt.py index aa1702fe3c..2f222d5b43 100644 --- a/tests/system-test/2-query/interval_limit_opt.py +++ b/tests/system-test/2-query/interval_limit_opt.py @@ -195,6 +195,10 @@ class TDTestCase: tdSql.checkData(1, 4, 2) tdSql.checkData(2, 4, 9) tdSql.checkData(3, 4, 9) + + sql = "SELECT _wstart, last(c1) FROM t6 INTERVAL(1w);" + tdSql.query(sql) + tdSql.checkRows(11) def test_partition_by_limit_no_agg(self): sql_template = 'select t1 from meters partition by t1 limit %d' diff --git a/tests/system-test/2-query/tsma.py b/tests/system-test/2-query/tsma.py index fccf6291b5..2bf908e250 100644 --- a/tests/system-test/2-query/tsma.py +++ b/tests/system-test/2-query/tsma.py @@ 
-1476,18 +1476,18 @@ class TDTestCase: tdSql.error(sql, -2147473920) # syntax error sql = 'create recursive tsma tsma2 on test.tsma1 interval(1m)' - tdSql.error(sql, -2147471099) # invalid tsma parameter + tdSql.error(sql, -2147471097) # invalid tsma interval sql = 'create recursive tsma tsma2 on test.tsma1 interval(7m)' - tdSql.error(sql, -2147471099) # invalid tsma parameter + tdSql.error(sql, -2147471097) # invalid tsma interval sql = 'create recursive tsma tsma2 on test.tsma1 interval(11m)' - tdSql.error(sql, -2147471099) # invalid tsma parameter + tdSql.error(sql, -2147471097) # invalid tsma interval self.create_recursive_tsma('tsma1', 'tsma2', 'test', '20m', 'meters') sql = 'create recursive tsma tsma3 on test.tsma2 interval(30m)' - tdSql.error(sql, -2147471099) # invalid tsma parameter + tdSql.error(sql, -2147471097) # invalid tsma interval self.create_recursive_tsma('tsma2', 'tsma3', 'test', '40m', 'meters') diff --git a/tests/system-test/2-query/tsma2.py b/tests/system-test/2-query/tsma2.py index 5af75b6fb9..404fb3f00c 100644 --- a/tests/system-test/2-query/tsma2.py +++ b/tests/system-test/2-query/tsma2.py @@ -830,11 +830,21 @@ class TDTestCase: ).ignore_res_order(sql_generator.can_ignore_res_order()).get_qc()) return ctxs + def test_query_interval(self): + sql = 'select count(*), _wstart, _wend from db.meters interval(1n) sliding(1d) limit 1' + tdSql.query(sql) + tdSql.checkData(0, 1, '2017-06-15 00:00:00') + sql = 'select /*+skip_tsma()*/count(*), _wstart, _wend from db.meters interval(1n) sliding(1d) limit 1' + tdSql.query(sql) + tdSql.checkData(0, 1, '2017-06-15 00:00:00') + def test_bigger_tsma_interval(self): db = 'db' tb = 'meters' func = ['max(c1)', 'min(c1)', 'min(c2)', 'max(c2)', 'avg(c1)', 'count(ts)'] self.init_data(db,10, 10000, 1500000000000, 11000000) + self.test_query_interval() + examples = [ ('10m', '1h', True), ('10m','1d',True), ('1m', '120s', True), ('1h','1d',True), ('12h', '1y', False), ('1h', '1n', True), ('1h', '1y', True), @@ 
-842,7 +852,7 @@ class TDTestCase: ] tdSql.execute('use db') for (i, ri, ret) in examples: - self.test_create_recursive_tsma_interval(db, tb, func, i, ri, ret, -2147471099) + self.test_create_recursive_tsma_interval(db, tb, func, i, ri, ret, -2147471097) self.create_tsma('tsma1', db, tb, func, '1h') self.create_recursive_tsma('tsma1', 'tsma2', db, '1n', tb, func) @@ -898,7 +908,7 @@ class TDTestCase: .get_qc()) self.check(ctxs) - tdSql.execute('drop database db') + def stop(self): tdSql.close() diff --git a/tests/system-test/7-tmq/tmqParamsTest.py b/tests/system-test/7-tmq/tmqParamsTest.py index a323dff19e..c14c3fc7d1 100644 --- a/tests/system-test/7-tmq/tmqParamsTest.py +++ b/tests/system-test/7-tmq/tmqParamsTest.py @@ -104,7 +104,7 @@ class TDTestCase: stop_flag = 0 try: while True: - res = consumer.poll(1) + res = consumer.poll(3) tdSql.query('show consumers;') consumer_info = tdSql.queryResult[0][-1] if offset_value == "latest": diff --git a/tests/system-test/7-tmq/tmqVnodeSplit-column-false.py b/tests/system-test/7-tmq/tmqVnodeSplit-column-false.py index 6ef28a4e77..b5105def37 100644 --- a/tests/system-test/7-tmq/tmqVnodeSplit-column-false.py +++ b/tests/system-test/7-tmq/tmqVnodeSplit-column-false.py @@ -52,7 +52,7 @@ class TDTestCase: 'rowsPerTbl': 1000, 'batchNum': 10, 'startTs': 1640966400000, # 2022-01-01 00:00:00.000 - 'pollDelay': 60, + 'pollDelay': 120, 'showMsg': 1, 'showRow': 1, 'snapshot': 0} diff --git a/tests/system-test/7-tmq/tmqVnodeSplit-column.py b/tests/system-test/7-tmq/tmqVnodeSplit-column.py index 8987cf5251..1488e304cb 100644 --- a/tests/system-test/7-tmq/tmqVnodeSplit-column.py +++ b/tests/system-test/7-tmq/tmqVnodeSplit-column.py @@ -52,7 +52,7 @@ class TDTestCase: 'rowsPerTbl': 1000, 'batchNum': 10, 'startTs': 1640966400000, # 2022-01-01 00:00:00.000 - 'pollDelay': 60, + 'pollDelay': 120, 'showMsg': 1, 'showRow': 1, 'snapshot': 0} @@ -121,7 +121,7 @@ class TDTestCase: 'rowsPerTbl': 1000, 'batchNum': 10, 'startTs': 1640966400000, # 
2022-01-01 00:00:00.000 - 'pollDelay': 60, + 'pollDelay': 120, 'showMsg': 1, 'showRow': 1, 'snapshot': 0} diff --git a/tests/system-test/7-tmq/tmqVnodeSplit-db-false.py b/tests/system-test/7-tmq/tmqVnodeSplit-db-false.py index bad9e09da5..f77fb53c85 100644 --- a/tests/system-test/7-tmq/tmqVnodeSplit-db-false.py +++ b/tests/system-test/7-tmq/tmqVnodeSplit-db-false.py @@ -121,7 +121,7 @@ class TDTestCase: 'rowsPerTbl': 1000, 'batchNum': 10, 'startTs': 1640966400000, # 2022-01-01 00:00:00.000 - 'pollDelay': 60, + 'pollDelay': 120, 'showMsg': 1, 'showRow': 1, 'snapshot': 0} diff --git a/tests/system-test/7-tmq/tmqVnodeSplit-db.py b/tests/system-test/7-tmq/tmqVnodeSplit-db.py index a9fb1c2d4b..979d75d558 100644 --- a/tests/system-test/7-tmq/tmqVnodeSplit-db.py +++ b/tests/system-test/7-tmq/tmqVnodeSplit-db.py @@ -52,7 +52,7 @@ class TDTestCase: 'rowsPerTbl': 1000, 'batchNum': 10, 'startTs': 1640966400000, # 2022-01-01 00:00:00.000 - 'pollDelay': 60, + 'pollDelay': 120, 'showMsg': 1, 'showRow': 1, 'snapshot': 0} @@ -121,7 +121,7 @@ class TDTestCase: 'rowsPerTbl': 1000, 'batchNum': 10, 'startTs': 1640966400000, # 2022-01-01 00:00:00.000 - 'pollDelay': 60, + 'pollDelay': 120, 'showMsg': 1, 'showRow': 1, 'snapshot': 0} diff --git a/tests/system-test/7-tmq/tmqVnodeSplit-stb-select-duplicatedata-false.py b/tests/system-test/7-tmq/tmqVnodeSplit-stb-select-duplicatedata-false.py index 3965168fa7..7c8f56f40d 100644 --- a/tests/system-test/7-tmq/tmqVnodeSplit-stb-select-duplicatedata-false.py +++ b/tests/system-test/7-tmq/tmqVnodeSplit-stb-select-duplicatedata-false.py @@ -54,7 +54,7 @@ class TDTestCase: 'rowsPerTbl': 1000, 'batchNum': 10, 'startTs': 1640966400000, # 2022-01-01 00:00:00.000 - 'pollDelay': 60, + 'pollDelay': 120, 'showMsg': 1, 'showRow': 1, 'snapshot': 0} diff --git a/tests/system-test/7-tmq/tmqVnodeSplit-stb-select-duplicatedata.py b/tests/system-test/7-tmq/tmqVnodeSplit-stb-select-duplicatedata.py index d4c76c4f61..5ff2ca6e27 100644 --- 
a/tests/system-test/7-tmq/tmqVnodeSplit-stb-select-duplicatedata.py +++ b/tests/system-test/7-tmq/tmqVnodeSplit-stb-select-duplicatedata.py @@ -54,7 +54,7 @@ class TDTestCase: 'rowsPerTbl': 1000, 'batchNum': 10, 'startTs': 1640966400000, # 2022-01-01 00:00:00.000 - 'pollDelay': 60, + 'pollDelay': 120, 'showMsg': 1, 'showRow': 1, 'snapshot': 0} diff --git a/tests/system-test/7-tmq/tmqVnodeSplit-stb-select-false.py b/tests/system-test/7-tmq/tmqVnodeSplit-stb-select-false.py index a5e61adc8d..9d89e3b1c0 100644 --- a/tests/system-test/7-tmq/tmqVnodeSplit-stb-select-false.py +++ b/tests/system-test/7-tmq/tmqVnodeSplit-stb-select-false.py @@ -56,7 +56,7 @@ class TDTestCase: 'rowsPerTbl': 1000, 'batchNum': 10, 'startTs': 1640966400000, # 2022-01-01 00:00:00.000 - 'pollDelay': 60, + 'pollDelay': 120, 'showMsg': 1, 'showRow': 1, 'snapshot': 0} diff --git a/tests/system-test/7-tmq/tmqVnodeSplit-stb-select.py b/tests/system-test/7-tmq/tmqVnodeSplit-stb-select.py index eb35ebc718..3c5f3ecb30 100644 --- a/tests/system-test/7-tmq/tmqVnodeSplit-stb-select.py +++ b/tests/system-test/7-tmq/tmqVnodeSplit-stb-select.py @@ -56,7 +56,7 @@ class TDTestCase: 'rowsPerTbl': 1000, 'batchNum': 10, 'startTs': 1640966400000, # 2022-01-01 00:00:00.000 - 'pollDelay': 60, + 'pollDelay': 120, 'showMsg': 1, 'showRow': 1, 'snapshot': 0} diff --git a/tests/system-test/7-tmq/tmqVnodeSplit-stb.py b/tests/system-test/7-tmq/tmqVnodeSplit-stb.py index 5aa2054e96..b7cebb51e0 100644 --- a/tests/system-test/7-tmq/tmqVnodeSplit-stb.py +++ b/tests/system-test/7-tmq/tmqVnodeSplit-stb.py @@ -54,7 +54,7 @@ class TDTestCase: 'rowsPerTbl': 1000, 'batchNum': 10, 'startTs': 1640966400000, # 2022-01-01 00:00:00.000 - 'pollDelay': 60, + 'pollDelay': 120, 'showMsg': 1, 'showRow': 1, 'snapshot': 0} @@ -123,7 +123,7 @@ class TDTestCase: 'rowsPerTbl': 1000, 'batchNum': 10, 'startTs': 1640966400000, # 2022-01-01 00:00:00.000 - 'pollDelay': 60, + 'pollDelay': 120, 'showMsg': 1, 'showRow': 1, 'snapshot': 0} diff --git 
a/tests/system-test/7-tmq/tmqVnodeTransform-db-removewal.py b/tests/system-test/7-tmq/tmqVnodeTransform-db-removewal.py index a853489c3f..034197e0f9 100644 --- a/tests/system-test/7-tmq/tmqVnodeTransform-db-removewal.py +++ b/tests/system-test/7-tmq/tmqVnodeTransform-db-removewal.py @@ -140,7 +140,7 @@ class TDTestCase: 'rowsPerTbl': 10000, 'batchNum': 10, 'startTs': 1640966400000, # 2022-01-01 00:00:00.000 - 'pollDelay': 2, + 'pollDelay': 5, 'showMsg': 1, 'showRow': 1, 'snapshot': 0} @@ -190,9 +190,6 @@ class TDTestCase: # redistribute vgroup self.redistributeVgroups() - tdLog.info("start consume processor") - tmqCom.startTmqSimProcess(pollDelay=paraDict['pollDelay'],dbName=paraDict["dbName"],showMsg=paraDict['showMsg'], showRow=paraDict['showRow'],snapshot=paraDict['snapshot']) - tdLog.info("wait the consume result") expectRows = 1 resultList = tmqCom.selectConsumeResult(expectRows) diff --git a/tests/system-test/7-tmq/tmq_taosx.py b/tests/system-test/7-tmq/tmq_taosx.py index d30d88bb1c..d0e682cffb 100644 --- a/tests/system-test/7-tmq/tmq_taosx.py +++ b/tests/system-test/7-tmq/tmq_taosx.py @@ -496,6 +496,43 @@ class TDTestCase: consumer.close() print("consume_ts_4551 ok") + def consume_td_31283(self): + tdSql.execute(f'create database if not exists d31283') + tdSql.execute(f'use d31283') + + tdSql.execute(f'create topic topic_31283 with meta as database d31283') + consumer_dict = { + "group.id": "g1", + "td.connect.user": "root", + "td.connect.pass": "taosdata", + "auto.offset.reset": "earliest", + "experimental.snapshot.enable": "true", + # "msg.enable.batchmeta": "true" + } + consumer = Consumer(consumer_dict) + + try: + consumer.subscribe(["topic_31283"]) + except TmqError: + tdLog.exit(f"subscribe error") + + tdSql.execute(f'create table stt(ts timestamp, i int) tags(t int)') + + hasData = False + try: + while True: + res = consumer.poll(1) + if not res: + break + hasData = True + finally: + consumer.close() + + if not hasData: + tdLog.exit(f"consume_td_31283 
error") + + print("consume_td_31283 ok") + def consume_TS_5067_Test(self): tdSql.execute(f'create database if not exists d1 vgroups 1') tdSql.execute(f'use d1') @@ -632,6 +669,7 @@ class TDTestCase: self.consume_ts_4544() self.consume_ts_4551() self.consume_TS_4540_Test() + self.consume_td_31283() tdSql.prepare() self.checkWal1VgroupOnlyMeta() diff --git a/tests/system-test/8-stream/state_window_case.py b/tests/system-test/8-stream/state_window_case.py index 5ecf8d7832..3015b0db42 100644 --- a/tests/system-test/8-stream/state_window_case.py +++ b/tests/system-test/8-stream/state_window_case.py @@ -30,14 +30,14 @@ class TDTestCase: tdSql.execute("CREATE STREAM stream_device_alarm2 TRIGGER AT_ONCE DELETE_MARK 30d INTO st_device_alarm2 tags(factory_id varchar(20), device_code varchar(80), var_name varchar(200))\ as select _wstart start_time, last(load_time) end_time, first(var_value) var_value, 1 state_flag from st_variable_data\ PARTITION BY tbname tname, factory_id, device_code, var_name STATE_WINDOW(case when lower(var_value)=lower(trigger_value) then '1' else '0' end)") - time.sleep(2) + time.sleep(5) def insert_data(self): try: tdSql.execute("insert into aaa values('2024-07-15 14:00:00', '2024-07-15 14:00:00', 'a8')", queryTimes=5, show=True) time.sleep(0.01) tdSql.execute("insert into aaa values('2024-07-15 14:10:00', '2024-07-15 14:10:00', 'a9')", queryTimes=5, show=True) - time.sleep(1) + time.sleep(5) except Exception as error: tdLog.exit(f"insert data failed {error}") diff --git a/utils/test/c/tmq_taosx_ci.c b/utils/test/c/tmq_taosx_ci.c index 8b9cfce395..49cfa3dff8 100644 --- a/utils/test/c/tmq_taosx_ci.c +++ b/utils/test/c/tmq_taosx_ci.c @@ -633,7 +633,7 @@ void basic_consume_loop(tmq_t* tmq, tmq_list_t* topics) { } int32_t cnt = 0; while (running) { - TAOS_RES* tmqmessage = tmq_consumer_poll(tmq, 1000); + TAOS_RES* tmqmessage = tmq_consumer_poll(tmq, 5000); if (tmqmessage) { cnt++; msg_process(tmqmessage);