# Efficient Data Writing

TDengine supports multiple interfaces to write data, including SQL, Prometheus, Telegraf, EMQ MQTT Broker, HiveMQ Broker, CSV files, etc. Kafka, OPC and other interfaces will be provided in the future. Data can be inserted one record at a time or in batches, from a single data collection point or from many at the same time. TDengine supports multi-thread insertion, out-of-order data insertion, and historical data insertion.

## <a class="anchor" id="sql"></a> SQL Writing

Applications insert data by executing SQL insert statements through the C/C++, JDBC, Go, or Python connectors, and users can manually enter SQL insert statements through the TAOS Shell. For example, the following statement writes a record to table d1001:

```mysql
INSERT INTO d1001 VALUES (1538548685000, 10.3, 219, 0.31);
```

TDengine supports writing multiple records at a time. For example, the following command writes two records to table d1001:

```mysql
INSERT INTO d1001 VALUES (1538548684000, 10.2, 220, 0.23) (1538548696650, 10.3, 218, 0.25);
```

TDengine also supports writing data to multiple tables at a time. For example, the following command writes two records to d1001 and one record to d1002:

```mysql
INSERT INTO d1001 VALUES (1538548685000, 10.3, 219, 0.31) (1538548695000, 12.6, 218, 0.33) d1002 VALUES (1538548696800, 12.3, 221, 0.31);
```

For the SQL INSERT grammar, please refer to [TAOS SQL insert](https://www.taosdata.com/en/documentation/taos-sql#insert).

**Tips:**

- To improve writing efficiency, write in batches: the more records written in a batch, the higher the insertion efficiency. However, a record cannot exceed 16 KB, and the total length of an SQL statement cannot exceed 64 KB (configurable through the parameter maxSQLLength, up to a maximum of 1 MB).
- TDengine supports multi-thread parallel writing. To further improve writing speed, a client may need to open more than 20 threads to write in parallel. However, beyond a certain threshold, adding threads no longer helps and can even hurt performance, because overly frequent thread switching brings extra overhead.
- For the same table, if the timestamp of a newly inserted record already exists, the new record will be discarded by default (if the database was not created with UPDATE 1); that is, the timestamp must be unique within a table. If an application automatically generates records, it is very likely that the generated timestamps will collide, so the number of records successfully inserted will be smaller than the number the application tried to insert. If the UPDATE 1 option is used when creating the database, inserting a new record with an existing timestamp will overwrite the original record; see the example after this list.
- The timestamp of written data must be greater than the current time minus the configuration parameter keep. If keep is configured as 3650 days, data older than 3650 days cannot be written. The timestamp of written data cannot be greater than the current time plus the configuration parameter days. If days is configured as 2, data more than 2 days later than the current time cannot be written.
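
The UPDATE option mentioned above is set when the database is created. As a hedged sketch (the database name `power` is only an example), a database in which records with duplicate timestamps overwrite earlier ones would be created as:

```mysql
-- Sketch only: with UPDATE 1, inserting a record whose timestamp already
-- exists overwrites the old record instead of being discarded.
CREATE DATABASE power UPDATE 1;
```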

## <a class="anchor" id="prometheus"></a> Direct Writing of Prometheus

As a graduated project of the Cloud Native Computing Foundation, [Prometheus](https://www.prometheus.io/) is widely used for performance monitoring, including K8S performance monitoring. TDengine provides a simple tool, [Bailongma](https://github.com/taosdata/Bailongma), which only needs simple configuration in Prometheus, without any code, to write the data collected by Prometheus directly into TDengine and automatically create databases and related tables in TDengine according to rules. The blog post [Use Docker Container to Quickly Build a Devops Monitoring Demo](https://www.taosdata.com/blog/2020/02/03/1189.html) is an example of using Bailongma to write Prometheus and Telegraf data into TDengine.

### Compile blm_prometheus From Source

Users need to download the source code of [Bailongma](https://github.com/taosdata/Bailongma) from GitHub, then compile it into an executable with the Golang compiler. Before you start compiling, complete the following preparations:

- A server running Linux OS
- Golang version 1.10 or higher installed
- An appropriate TDengine version. Because the TDengine client dynamic library is used, the installed TDengine must match the server-side version; for example, if the server runs TDengine 2.0.0, install the same version on the Linux server where Bailongma is located (it can be the same server as TDengine, or a different one)

The Bailongma project has a folder, blm_prometheus, which holds the Prometheus writing API. The compiling process is as follows:

```bash
cd blm_prometheus
go build
```

If everything goes well, an executable of blm_prometheus will be generated in the corresponding directory.

### Install Prometheus

Download and install Prometheus following the instructions on its official website. [Download Address](https://prometheus.io/download/)

### Configure Prometheus

Read the Prometheus [configuration document](https://prometheus.io/docs/prometheus/latest/configuration/configuration/) and add the following configuration to the remote_write section of the Prometheus configuration file:

- url: the URL provided by the Bailongma API service; refer to the blm_prometheus startup example section below

After Prometheus is launched, you can check whether the data is written successfully by querying with the taos client.

### Launch blm_prometheus

blm_prometheus has the following options that you can configure when launching it:

```sh
--tdengine-name

If TDengine is installed on a server with a domain name, you can also access TDengine by configuring its domain name. In a K8S environment, it can be configured as the service name that TDengine runs under.

--batch-size

blm_prometheus assembles the received Prometheus data into a TDengine writing request. This parameter controls the number of data records carried in one writing request sent to TDengine.

--dbname

Set the name of the database created in TDengine; blm_prometheus will automatically create a database named dbname in TDengine. The default value is prometheus.

--dbuser

Set the user name to access TDengine; the default value is 'root'.

--dbpassword

Set the password to access TDengine; the default value is 'taosdata'.

--port

The port number blm_prometheus uses to serve Prometheus.
```

### Example

Launch an API service for blm_prometheus with the following command:

```bash
./blm_prometheus -port 8088
```

Assuming that the IP address of the server where blm_prometheus is located is "10.1.2.3", add the URL to the configuration file of Prometheus as:

```yaml
remote_write:
  - url: "http://10.1.2.3:8088/receive"
```

### Query written data of Prometheus

The format of the data generated by Prometheus is as follows:

```json
{
  Timestamp: 1576466279341,
  Value: 37.000000,
  apiserver_request_latencies_bucket {
    component="apiserver",
    instance="192.168.99.116:8443",
    job="kubernetes-apiservers",
    le="125000",
    resource="persistentvolumes",
    scope="cluster",
    verb="LIST",
    version="v1"
  }
}
```

Where apiserver_request_latencies_bucket is the name of the time series collected by Prometheus, and the tags of the time series are inside the following {}. blm_prometheus automatically creates a STable in TDengine named after the time series, converts the tags in {} into the tag values of TDengine, and uses Timestamp as the timestamp and Value as the value of the time series. Therefore, in the TDengine client, you can check whether this data was successfully written with the following statements:

```mysql
use prometheus;
select * from apiserver_request_latencies_bucket;
```
## <a class="anchor" id="telegraf"></a> Direct Writing of Telegraf

[Telegraf](https://www.influxdata.com/time-series-platform/telegraf/) is a popular open source tool for IT operation data collection. TDengine provides a simple tool, [Bailongma](https://github.com/taosdata/Bailongma), which only needs simple configuration in Telegraf, without any code, to write the data collected by Telegraf directly into TDengine and automatically create databases and related tables in TDengine according to rules. The blog post [Use Docker Container to Quickly Build a Devops Monitoring Demo](https://www.taosdata.com/blog/2020/02/03/1189.html) is an example of using Bailongma to write Prometheus and Telegraf data into TDengine.

### Compile blm_telegraf From Source Code

Users need to download the source code of [Bailongma](https://github.com/taosdata/Bailongma) from GitHub, then compile it into an executable with the Golang compiler. Before you start compiling, complete the following preparations:

- A server running Linux OS
- Golang version 1.10 or higher installed
- An appropriate TDengine version. Because the TDengine client dynamic library is used, the installed TDengine must match the server-side version; for example, if the server runs TDengine 2.0.0, install the same version on the Linux server where Bailongma is located (it can be the same server as TDengine, or a different one)

The Bailongma project has a folder, blm_telegraf, which holds the Telegraf writing API. The compiling process is as follows:

```bash
cd blm_telegraf
go build
```

If everything goes well, an executable of blm_telegraf will be generated in the corresponding directory.

### Install Telegraf

At the moment, TDengine supports Telegraf version 1.7.4 and above. Users can download the installation package from Telegraf's website according to the current operating system. The download address is: https://portal.influxdata.com/downloads

### Configure Telegraf

Modify the TDengine-related configuration in the Telegraf configuration file /etc/telegraf/telegraf.conf.

In the output plugins section, add the [[outputs.http]] configuration:

- url: the URL provided by the Bailongma API service; please refer to the example section below
- data_format: "json"
- json_timestamp_units: "1ms"

In the agent section:

- hostname: the machine name used to distinguish different collection devices; it must be unique
- metric_batch_size: 100, the maximum number of records per batch written by Telegraf; increasing this number can reduce the request frequency of Telegraf

A combined configuration sketch is shown below. For information on how to use Telegraf to collect data and more about Telegraf itself, please refer to the official Telegraf [documentation](https://docs.influxdata.com/telegraf/v1.11/).
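
Concretely, a minimal sketch of those /etc/telegraf/telegraf.conf fragments follows; the hostname value is made up, and the url assumes blm_telegraf listens on 10.1.2.3:8089 as in the example below:

```toml
# Sketch only: agent and output settings described above; values are examples.
[agent]
  hostname = "collector-01"     # must be unique per collection device
  metric_batch_size = 100       # max records per batch written by Telegraf

[[outputs.http]]
  url = "http://10.1.2.3:8089/telegraf"
  data_format = "json"
  json_timestamp_units = "1ms"
```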

### Launch blm_telegraf

blm_telegraf has the following options, which can be set when launching it:

```sh
--host

The IP address of the TDengine server; the default is null.

--batch-size

blm_telegraf assembles the received Telegraf data into a TDengine writing request. This parameter controls the number of data records carried in one writing request sent to TDengine.

--dbname

Set the name of the database created in TDengine; blm_telegraf will automatically create a database named dbname in TDengine. The default value is prometheus.

--dbuser

Set the user name to access TDengine; the default value is 'root'.

--dbpassword

Set the password to access TDengine; the default value is 'taosdata'.

--port

The port number blm_telegraf uses to serve Telegraf.
```

### Example

Launch an API service for blm_telegraf with the following command:

```bash
./blm_telegraf -host 127.0.0.1 -port 8089
```

Assuming that the IP address of the server where blm_telegraf is located is "10.1.2.3", add the URL to the configuration file of Telegraf as:

```yaml
url = "http://10.1.2.3:8089/telegraf"
```

### Query written data of Telegraf

The format of the data generated by Telegraf is as follows:

```json
{
  "fields": {
    "usage_guest": 0,
    "usage_guest_nice": 0,
    "usage_idle": 89.7897897897898,
    "usage_iowait": 0,
    "usage_irq": 0,
    "usage_nice": 0,
    "usage_softirq": 0,
    "usage_steal": 0,
    "usage_system": 5.405405405405405,
    "usage_user": 4.804804804804805
  },
  "name": "cpu",
  "tags": {
    "cpu": "cpu2",
    "host": "bogon"
  },
  "timestamp": 1576464360
}
```

Where the name field is the name of the time series collected by Telegraf, and the tag field holds the tags of the time series. blm_telegraf automatically creates a STable in TDengine named after the time series, converts the tag field into the tag values of TDengine, and uses timestamp as the timestamp and the fields values as the values of the time series. Therefore, in the TDengine client, you can check whether this data was successfully written with the following statements:

```mysql
use telegraf;
select * from cpu;
```

## <a class="anchor" id="emq"></a> Direct Writing of EMQ Broker

MQTT is a popular data transmission protocol in the IoT. TDengine can easily access the data received by an MQTT Broker and write it to TDengine.

[EMQ](https://github.com/emqx/emqx) is an open source MQTT Broker software. With no coding required, MQTT data can be written directly into TDengine simply by configuring "rules" in the EMQ Dashboard. EMQ X supports storing data to TDengine by sending it to a web service, and the Enterprise Edition also provides a native TDengine driver for direct storage. Please refer to the [EMQ official documents](https://docs.emqx.io/broker/latest/cn/rule/rule-example.html#%E4%BF%9D%E5%AD%98%E6%95%B0%E6%8D%AE%E5%88%B0-tdengine) for details.

## <a class="anchor" id="hivemq"></a> Direct Writing of HiveMQ Broker

[HiveMQ](https://www.hivemq.com/) is an MQTT broker that provides Free Personal and Enterprise Edition versions. It is mainly used for enterprise, emerging machine-to-machine (M2M) communication and internal transport, meeting scalability, manageability and security requirements. HiveMQ provides an open source plug-in development kit, and data can be stored to TDengine via the HiveMQ extension-TDengine. Refer to the [HiveMQ extension-TDengine documentation](https://github.com/huskar-t/hivemq-tdengine-extension/blob/b62a26ecc164a310104df57691691b237e091c89/README.md) for details.

# Efficient Data Querying

## <a class="anchor" id="queries"></a> Main Query Features

TDengine uses SQL as the query language. Applications can send SQL statements through the C/C++, Java, Go and Python connectors, and users can run ad-hoc SQL queries manually through the Command Line Interface (CLI) tool TAOS Shell provided by TDengine. TDengine supports the following query functions:

- Single-column and multi-column data query
- Multiple filters for tags and numeric values: >, <, =, <>, like, etc.
- Group by, Order by, Limit/Offset of aggregation results
- Arithmetic operations (+, -, *, /) on numeric columns and aggregation results
- Timestamp-aligned join query (implicit join) operations
- Multiple aggregation/calculation functions: count, max, min, avg, sum, twa, stddev, leastsquares, top, bottom, first, last, percentile, apercentile, last_row, spread, diff, etc.

For example, in TAOS Shell, the records with voltage > 215 are queried from table d1001, sorted in descending order by timestamp, and only two records are output:

```mysql
taos> select * from d1001 where voltage > 215 order by ts desc limit 2;
           ts            | current  | voltage | phase   |
======================================================================================
 2018-10-03 14:38:16.800 | 12.30000 |     221 | 0.31000 |
 2018-10-03 14:38:15.000 | 12.60000 |     218 | 0.33000 |
Query OK, 2 row(s) in set (0.001100s)
```

To meet the needs of IoT scenarios, TDengine supports several special functions, such as twa (time weighted average), spread (difference between maximum and minimum), and last_row (last record); more IoT-oriented functions will be added. An example of these functions follows below. TDengine also supports continuous queries.
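
As a hedged sketch of these functions (the table and time range follow the earlier examples, and the output is omitted; note that twa requires an explicit time range in the where clause):

```mysql
-- Sketch only: time-weighted average and max-min spread over a given window.
select twa(current), spread(voltage) from d1001
  where ts >= '2018-10-03 14:38:00' and ts <= '2018-10-03 14:38:20';
-- Last received record of the table:
select last_row(current) from d1001;
```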

For specific query syntax, please see the [Data Query section of TAOS SQL](https://www.taosdata.com/cn/documentation/taos-sql#select).

## <a class="anchor" id="aggregation"></a> Multi-table Aggregation Query

In IoT scenarios, there are often multiple data collection points of the same type. TDengine uses the concept of STable to describe a certain type of data collection point, and ordinary tables to describe specific data collection points. Meanwhile, TDengine uses tags to describe the static attributes of data collection points; a given data collection point has specific tag values. By specifying filters on tags, TDengine provides an efficient method to aggregate and query the sub-tables of a STable (the data collection points of a certain type). Aggregation functions and most operations on ordinary tables also apply to STables, with exactly the same syntax.

**Example 1**: In TAOS Shell, look up the average voltage collected by all smart meters in Beijing, grouped by location:

```mysql
taos> SELECT AVG(voltage) FROM meters GROUP BY location;
    avg(voltage)   |     location     |
=============================================================
     222.000000000 | Beijing.Haidian  |
     219.200000000 | Beijing.Chaoyang |
Query OK, 2 row(s) in set (0.002136s)
```

**Example 2**: In TAOS Shell, look up the number of records in the past 24 hours and the maximum current of all smart meters with groupId 2:

```mysql
taos> SELECT count(*), max(current) FROM meters where groupId = 2 and ts > now - 24h;
 count(*) | max(current) |
==================================
        5 |         13.4 |
Query OK, 1 row(s) in set (0.002136s)
```

TDengine only allows aggregation queries between tables belonging to the same STable; aggregation queries between different STables are not supported. In the Data Query section of TAOS SQL, each query operation indicates whether STables are supported.

## <a class="anchor" id="sampling"></a> Down Sampling Query, Interpolation

In IoT scenarios, it is often necessary to aggregate the collected data by time intervals through down sampling. TDengine provides a simple keyword, interval, which makes queries over time windows extremely simple. For example, the current values collected by smart meter d1001 are summed every 10 seconds:

```mysql
taos> SELECT sum(current) FROM d1001 INTERVAL(10s);
           ts            | sum(current) |
======================================================
 2018-10-03 14:38:00.000 | 10.300000191 |
 2018-10-03 14:38:10.000 | 24.900000572 |
Query OK, 2 row(s) in set (0.000883s)
```

The down sampling operation is also applicable to STables, such as summing the current values collected by all smart meters in Beijing every second:

```mysql
taos> SELECT SUM(current) FROM meters where location like "Beijing%" INTERVAL(1s);
           ts            | sum(current) |
======================================================
 2018-10-03 14:38:04.000 | 10.199999809 |
 2018-10-03 14:38:05.000 | 32.900000572 |
 2018-10-03 14:38:06.000 | 11.500000000 |
 2018-10-03 14:38:15.000 | 12.600000381 |
 2018-10-03 14:38:16.000 | 36.000000000 |
Query OK, 5 row(s) in set (0.001538s)
```

The down sampling operation also supports time offsets, such as summing the current values collected by all smart meters every second, but requiring each time window to start from 500 milliseconds:

```mysql
taos> SELECT SUM(current) FROM meters INTERVAL(1s, 500a);
           ts            | sum(current) |
======================================================
 2018-10-03 14:38:04.500 | 11.189999809 |
 2018-10-03 14:38:05.500 | 31.900000572 |
 2018-10-03 14:38:06.500 | 11.600000000 |
 2018-10-03 14:38:15.500 | 12.300000381 |
 2018-10-03 14:38:16.500 | 35.000000000 |
Query OK, 5 row(s) in set (0.001521s)
```

In IoT scenarios, it is difficult to synchronize the timestamps of data collected at different points, but many analysis algorithms (such as FFT) need the data to be strictly aligned at equal time intervals. In many systems, one has to write custom programs to handle this, but the down sampling operation of TDengine solves it easily. If there is no collected data in an interval, TDengine also provides interpolation through the FILL clause, sketched below.
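
A hedged sketch of interpolation (the table and time range follow the earlier examples; output omitted): FILL specifies how empty windows are populated, for example by linear interpolation:

```mysql
-- Sketch only: 1s windows, with empty windows filled by linear interpolation.
SELECT SUM(current) FROM meters
  WHERE ts >= '2018-10-03 14:38:04' AND ts <= '2018-10-03 14:38:16'
  INTERVAL(1s) FILL(LINEAR);
```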

For details of the syntax rules, please refer to the [Time-dimension Aggregation section of TAOS SQL](https://www.taosdata.com/en/documentation/taos-sql#aggregation).

# Advanced Features

## <a class="anchor" id="continuous-query"></a> Continuous Query

A continuous query is a query executed by TDengine periodically with a sliding window; it is simplified, timer-driven stream computing. A continuous query can be applied to a table or a STable automatically and periodically, and the result set can be passed to the application directly via a callback function, or written into a new table in TDengine. The query is always executed on a specified time window (the window size is specified by the parameter interval), and this window slides forward as time flows (the sliding period is specified by the parameter sliding).

Continuous queries in TDengine are time-driven and can be defined directly in TAOS SQL without additional operations. With continuous queries, results can be generated conveniently and quickly according to time windows, thereby down sampling the originally collected data. After the user defines a continuous query through TAOS SQL, TDengine automatically launches the query at the end of each complete time period and pushes the calculated results to the user or writes them back to TDengine.

The continuous query provided by TDengine differs from the time window calculation in ordinary stream computing in the following ways:

- Unlike stream computing, which feeds results back in real time, continuous queries only start calculation after a time window closes. For example, if the time period is 1 day, the results of the current day will only be generated after 23:59:59.
- If a historical record is written to a time interval that has already been calculated, the continuous query will not recalculate, nor will it push the results to the user again. For the write-back mode, the existing calculated results will not be updated either.
- When results are pushed, the server does not cache the client's calculation status, nor does it provide an exactly-once semantic guarantee. If the user's application crashes, the continuous query, once restarted, will only recalculate from the latest complete time window after the restart time. If the write-back mode is used, TDengine can guarantee the validity and continuity of the written-back data.

### How to use continuous query

The following uses the smart meter scenario to introduce the specific use of continuous queries. Suppose we create a STable and sub-tables through the following SQL statements:

```sql
create table meters (ts timestamp, current float, voltage int, phase float) tags (location binary(64), groupId int);
create table D1001 using meters tags ("Beijing.Chaoyang", 2);
create table D1002 using meters tags ("Beijing.Haidian", 2);
...
```

We already know that the average voltage of these meters can be computed with one minute as the time window and 30 seconds as the forward increment, through the following SQL statement:

```sql
select avg(voltage) from meters interval(1m) sliding(30s);
```

Every time this statement is executed, all data will be recalculated. If you need to execute it every 30 seconds and incrementally calculate only the latest minute of data, you can improve the above statement as follows, using a different `startTime` each time and executing it regularly:

```sql
select avg(voltage) from meters where ts > {startTime} interval(1m) sliding(30s);
```

There is no problem with this, but TDengine provides a simpler method: just add `create table {tableName} as` before the initial query statement, for example:

```sql
create table avg_vol as select avg(voltage) from meters interval(1m) sliding(30s);
```

A new table named `avg_vol` will be created automatically, and then every 30 seconds, TDengine will incrementally execute the SQL statement after `as` and write the query result into this table. The user program only needs to query the data from `avg_vol`. For example:

```mysql
taos> select * from avg_vol;
            ts           | avg_voltage_ |
===================================================
 2020-07-29 13:37:30.000 |  222.0000000 |
 2020-07-29 13:38:00.000 |  221.3500000 |
 2020-07-29 13:38:30.000 |  220.1700000 |
 2020-07-29 13:39:00.000 |  223.0800000 |
```

It should be noted that the minimum value of the query time window is 10 milliseconds, and there is no upper limit on the time window range.

In addition, TDengine also allows users to specify the start and end times of a continuous query. If the start time is not entered, the continuous query will start from the time window in which the first original record is located; if no end time is entered, the continuous query will run forever; if an end time is specified, the continuous query stops running after the system time reaches that time. For example, a continuous query created with the following SQL will run for one hour and then stop automatically:

```mysql
create table avg_vol as select avg(voltage) from meters where ts > now and ts <= now + 1h interval(1m) sliding(30s);
```

It should be noted that `now` in the above example refers to the time when the continuous query was created, not the time when it is executed; otherwise the query could not be stopped automatically. In addition, to minimize problems caused by delayed writes of original data, the continuous query calculation in TDengine is slightly delayed. In other words, after a time window has passed, TDengine does not immediately compute the data of that window, so it takes a while (usually no more than 1 minute) before the results become available.

### Manage the Continuous Query

Users can view all continuous queries running in the system through the `show streams` command in the console, and can kill a continuous query through the `kill stream` command, as sketched below. Later versions will provide more fine-grained and convenient continuous query management commands.
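
A hedged sketch of these management commands (the stream id below is a placeholder; use an id as reported by `show streams`):

```mysql
-- Sketch only: list running continuous queries, then stop one by its id.
show streams;
kill stream <stream-id>;
```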

## <a class="anchor" id="subscribe"></a> Publisher/Subscriber

Based on the natural time-series characteristics of data, data insertion in TDengine is logically consistent with data publishing (pub) in messaging systems; it can be regarded as a new record with a timestamp being inserted into the system. At the same time, TDengine stores data in strict accordance with monotonically increasing time order. Essentially, every table in TDengine can be regarded as a standard message queue.

TDengine supports embedded lightweight message subscription and publishing services. Using the APIs provided by the system, users can subscribe to one or more tables in the database using ordinary query statements. The maintenance of the subscription logic and operating status is completed by the client. The client regularly polls the server for new records, and the results are fed back to the client when new records arrive.

The status of the subscription and publishing services of TDengine is maintained by the client, not by the TDengine server. Therefore, if an application restarts, it is up to the application to decide from which point in time to obtain the latest data.

In TDengine, there are three main APIs relevant to subscription:

```c
taos_subscribe
taos_consume
taos_unsubscribe
```

Please refer to the [C/C++ Connector](https://www.taosdata.com/cn/documentation/connector/) for the documentation of these APIs. The following still uses the smart meter scenario as an example to introduce their specific usage (please refer to the previous section "Continuous Query" for the structure of the STable and sub-tables). The complete sample code can be found [here](https://github.com/taosdata/TDengine/blob/master/tests/examples/c/subscribe.c).

If we want to be notified and take action when the current of a smart meter exceeds a certain limit (e.g. 10 A), there are two methods. One is to query each sub-table separately, record the timestamp of the last piece of data after each query, and then only query all data after that timestamp:

```sql
select * from D1001 where ts > {last_timestamp1} and current > 10;
select * from D1002 where ts > {last_timestamp2} and current > 10;
...
```

This is indeed feasible, but as the number of meters increases, the number of queries also increases, and the performance of both the client and the server suffers, until the system can no longer cope.

The other method is to query the STable. In this way, no matter how many meters there are, only one query is required:

```sql
select * from meters where ts > {last_timestamp} and current > 10;
```

However, how to choose `last_timestamp` becomes a new problem. On the one hand, the time of data generation (the data timestamp) and the time of data storage are generally not identical, and sometimes deviate considerably; on the other hand, the time at which the data of different meters arrive at TDengine also varies. Therefore, if we use the timestamp of the data from the slowest meter as `last_timestamp` in the query, we may repeatedly read the data of other meters; if we use the timestamp of the fastest meter, the data of other meters may be missed.

The subscription function of TDengine provides a thorough solution to this problem.

First, use `taos_subscribe` to create a subscription:

```c
TAOS_SUB* tsub = NULL;
if (async) {
  // create an asynchronous subscription, the callback function will be called every 1s
  tsub = taos_subscribe(taos, restart, topic, sql, subscribe_callback, &blockFetch, 1000);
} else {
  // create a synchronous subscription, need to call 'taos_consume' manually
  tsub = taos_subscribe(taos, restart, topic, sql, NULL, NULL, 0);
}
```

Subscriptions in TDengine can be either synchronous or asynchronous, and the above code decides which method to use based on the value of the parameter `async` obtained from the command line. Here, synchronous means that the user program calls `taos_consume` directly to pull data, while asynchronous means that the API calls `taos_consume` in an internal thread and passes the pulled data to the callback function `subscribe_callback` for processing.

The parameter `taos` is an established database connection and has no special requirements in synchronous mode. In asynchronous mode, however, care must be taken that it is not used by other threads, otherwise unpredictable errors may occur, because the callback function is called in an internal thread of the API, and some APIs of TDengine are not thread-safe.

The parameter `sql` is a query statement in which you can specify filters using a where clause. In our example, if you only want to subscribe to data whose current exceeds 10 A, you can write:

```sql
select * from meters where current > 10;
```

Note that the starting time is not specified here, so the data of all time will be read. If you only want to start subscribing from the data of one day ago and do not need earlier historical data, you can add a time condition:

```sql
select * from meters where ts > now - 1d and current > 10;
```

The `topic` of the subscription is actually its name. Because the subscription function is implemented in the client API, it is not necessary to ensure that it is globally unique, but it must be unique on one client machine.

If a subscription named `topic` does not exist, the parameter `restart` is meaningless. However, if the user program exits after creating this subscription, when it starts again and reuses this `topic`, `restart` decides whether to read data from scratch or from the previous position. In this example, if `restart` is **true** (non-zero), the user program will definitely read all the data. However, if this subscription existed before, some data has already been read, and `restart` is **false** (zero), the user program will not read the previously read data.

The last parameter of `taos_subscribe` is the polling period in milliseconds. In synchronous mode, if the interval between two calls to `taos_consume` is less than this time, `taos_consume` blocks until the interval exceeds it. In asynchronous mode, this time is the minimum interval between two calls of the callback function.

The penultimate parameter of `taos_subscribe` is used by the user program to pass additional parameters to the callback function; the subscription API passes it to the callback function as-is without any processing. This parameter is meaningless in synchronous mode.

After the subscription is created, data can be consumed. In synchronous mode, the sample code is the `else` branch below:

```c
if (async) {
  getchar();
} else while(1) {
  TAOS_RES* res = taos_consume(tsub);
  if (res == NULL) {
    printf("failed to consume data.");
    break;
  } else {
    print_result(res, blockFetch);
    getchar();
  }
}
```

Here is a **while** loop: every time the user presses the Enter key, `taos_consume` is called, and its return value is the query result set, exactly the same as `taos_use_result`. In the example, the code that uses this result set is the function `print_result`:

```c
void print_result(TAOS_RES* res, int blockFetch) {
  TAOS_ROW row = NULL;
  int num_fields = taos_num_fields(res);
  TAOS_FIELD* fields = taos_fetch_fields(res);
  int nRows = 0;
  if (blockFetch) {
    nRows = taos_fetch_block(res, &row);
    for (int i = 0; i < nRows; i++) {
      char temp[256];
      taos_print_row(temp, row + i, fields, num_fields);
      puts(temp);
    }
  } else {
    while ((row = taos_fetch_row(res))) {
      char temp[256];
      taos_print_row(temp, row, fields, num_fields);
      puts(temp);
      nRows++;
    }
  }
  printf("%d rows consumed.\n", nRows);
}
```

Among them, `taos_print_row` is used to process the subscribed data; in our example, it prints out all eligible records. In asynchronous mode, consuming subscribed data is simpler:

```c
void subscribe_callback(TAOS_SUB* tsub, TAOS_RES *res, void* param, int code) {
  print_result(res, *(int*)param);
}
```

To end a data subscription, call `taos_unsubscribe`:

```c
taos_unsubscribe(tsub, keep);
```

Its second parameter decides whether to keep the progress information of the subscription on the client. If this parameter is **false** (zero), the subscription can only restart from scratch the next time `taos_subscribe` is called, no matter what its `restart` parameter is. The progress information is saved in the directory {DataDir}/subscribe/; each subscription has a file with the same name as its `topic`, and deleting a file will likewise make the corresponding subscription start from scratch when it is created next time.

After introducing the code, let's take a look at the actual running effect. This example assumes that:

- The sample code has been downloaded locally
- TDengine has been installed on the same machine
- All the databases, STables and sub-tables required by the example have been created

You can compile and start the sample program by executing the following commands in the directory where the sample code is located:

```shell
$ make
$ ./subscribe -sql='select * from meters where current > 10;'
```

After the sample program starts, open another terminal window, start the TDengine shell, and insert a record with a current of 12 A into **D1001**:

```shell
$ taos
> use test;
> insert into D1001 values(now, 12, 220, 1);
```

At this time, because the current exceeds 10 A, you should see that the sample program outputs it to the screen. You can continue to insert more data to observe the output of the sample program.

### Use data subscription in Java

The subscription function also provides a Java development interface, as described in the [Java Connector](https://www.taosdata.com/cn/documentation/connector/). It should be noted that the Java interface does not provide asynchronous subscription mode at present, but user programs can achieve the same effect by creating a TimerTask, as sketched below.
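
A minimal, hedged sketch of that idea, assuming a final `TSDBSubscribe subscribe` has already been created as in the full example below (error handling is reduced to a stack trace):

```java
// Hypothetical polling sketch: call consume() once per second from a timer
// thread, approximating the asynchronous mode of the C interface.
Timer timer = new Timer(true); // daemon timer thread
timer.schedule(new TimerTask() {
    @Override
    public void run() {
        try {
            TSDBResultSet resultSet = subscribe.consume();
            if (resultSet == null)
                return; // no new data yet
            while (resultSet.next()) {
                System.out.println(resultSet.getString(1)); // first column
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}, 0, 1000); // first run immediately, then every 1000 ms
```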

The following example introduces its specific use. What it does is basically the same as the C language example described earlier: it subscribes to all records in the database whose current exceeds 10 A.

#### Prepare data

```sql
# Create the power database
taos> create database power;
# Switch to the database
taos> use power;
# Create a STable
taos> create table meters(ts timestamp, current float, voltage int, phase int) tags(location binary(64), groupId int);
# Create tables
taos> create table d1001 using meters tags ("Beijing.Chaoyang", 2);
taos> create table d1002 using meters tags ("Beijing.Haidian", 2);
# Insert test data
taos> insert into d1001 values("2020-08-15 12:00:00.000", 12, 220, 1),("2020-08-15 12:10:00.000", 12.3, 220, 2),("2020-08-15 12:20:00.000", 12.2, 220, 1);
taos> insert into d1002 values("2020-08-15 12:00:00.000", 9.9, 220, 1),("2020-08-15 12:10:00.000", 10.3, 220, 1),("2020-08-15 12:20:00.000", 11.2, 220, 1);
# Query all records with current over 10A from STable meters
taos> select * from meters where current > 10;
           ts            | current  | voltage | phase |     location     | groupid |
===========================================================================================================
 2020-08-15 12:10:00.000 | 10.30000 |     220 |     1 | Beijing.Haidian  |       2 |
 2020-08-15 12:20:00.000 | 11.20000 |     220 |     1 | Beijing.Haidian  |       2 |
 2020-08-15 12:00:00.000 | 12.00000 |     220 |     1 | Beijing.Chaoyang |       2 |
 2020-08-15 12:10:00.000 | 12.30000 |     220 |     2 | Beijing.Chaoyang |       2 |
 2020-08-15 12:20:00.000 | 12.20000 |     220 |     1 | Beijing.Chaoyang |       2 |
Query OK, 5 row(s) in set (0.004896s)
```

#### Example

```java
public class SubscribeDemo {
    private static final String topic = "topic-meter-current-bg-10";
    private static final String sql = "select * from meters where current > 10";

    public static void main(String[] args) {
        Connection connection = null;
        TSDBSubscribe subscribe = null;

        try {
            Class.forName("com.taosdata.jdbc.TSDBDriver");
            Properties properties = new Properties();
            properties.setProperty(TSDBDriver.PROPERTY_KEY_CHARSET, "UTF-8");
            properties.setProperty(TSDBDriver.PROPERTY_KEY_TIME_ZONE, "UTC-8");
            String jdbcUrl = "jdbc:TAOS://127.0.0.1:6030/power?user=root&password=taosdata";
            connection = DriverManager.getConnection(jdbcUrl, properties);
            subscribe = ((TSDBConnection) connection).subscribe(topic, sql, true); // create a subscription
            int count = 0;
            while (count < 10) {
                TimeUnit.SECONDS.sleep(1); // wait 1 second to avoid calling consume too frequently and putting pressure on the server
                TSDBResultSet resultSet = subscribe.consume(); // consume data
                if (resultSet == null) {
                    continue;
                }
                ResultSetMetaData metaData = resultSet.getMetaData();
                while (resultSet.next()) {
                    int columnCount = metaData.getColumnCount();
                    for (int i = 1; i <= columnCount; i++) {
                        System.out.print(metaData.getColumnLabel(i) + ": " + resultSet.getString(i) + "\t");
                    }
                    System.out.println();
                    count++;
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                if (null != subscribe)
                    subscribe.close(true); // close the subscription
                if (connection != null)
                    connection.close();
            } catch (SQLException throwables) {
                throwables.printStackTrace();
            }
        }
    }
}
```

Run the sample program. First, it consumes all the historical data that meets the query conditions:

```shell
# java -jar subscribe.jar

ts: 1597464000000 current: 12.0 voltage: 220 phase: 1 location: Beijing.Chaoyang groupid: 2
ts: 1597464600000 current: 12.3 voltage: 220 phase: 2 location: Beijing.Chaoyang groupid: 2
ts: 1597465200000 current: 12.2 voltage: 220 phase: 1 location: Beijing.Chaoyang groupid: 2
ts: 1597464600000 current: 10.3 voltage: 220 phase: 1 location: Beijing.Haidian groupid: 2
ts: 1597465200000 current: 11.2 voltage: 220 phase: 1 location: Beijing.Haidian groupid: 2
```

Then, add a record to the table via the taos client:

```sql
# taos
taos> use power;
taos> insert into d1001 values("2020-08-15 12:40:00.000", 12.4, 220, 1);
```

Because the current of this record is greater than 10 A, the sample program will consume it:

```shell
ts: 1597466400000 current: 12.4 voltage: 220 phase: 1 location: Beijing.Chaoyang groupid: 2
```

## <a class="anchor" id="cache"></a> Cache

TDengine adopts a time-driven cache management strategy (First-In-First-Out, FIFO), also known as a write-driven cache management mechanism. This strategy is different from the read-driven data caching mode (Least-Recently-Used, LRU): it directly keeps the most recently written data in the system buffer, and when the buffer reaches a threshold, the earliest data is written to disk in batches. Generally speaking, for IoT data, users are most concerned about the most recently generated data, that is, the current status. TDengine takes full advantage of this characteristic by keeping the most recently arrived (current status) data in the buffer.

TDengine provides millisecond-level data collection capability to users through its query functions. Keeping the recently arrived data directly in the buffer allows faster responses to queries for the latest record or batch of records, and faster database query responses overall. In this way, TDengine can serve as a data cache simply by setting appropriate configuration parameters, without deploying an additional caching system, which effectively simplifies the system architecture and reduces operational costs. Note that after TDengine restarts, the system buffer is emptied, the previously cached data is written to disk in batches, and the previously cached data is not reloaded into the buffer as in some proprietary key-value caching systems.

TDengine allocates a fixed size of memory space as a buffer, which can be configured according to application requirements and hardware resources. By properly setting the buffer space, TDengine can provide extremely high-performance writing and querying. Each virtual node (vnode) in TDengine is allocated a separate cache pool when it is created. Each virtual node manages its own cache pool, and different virtual nodes do not share a pool. All tables belonging to a virtual node share its cache pool.

TDengine manages the memory pool in blocks, and the data inside is stored in rows. The memory pool of a vnode is allocated in blocks when the vnode is created, and each memory block is managed with a First-In-First-Out strategy. When creating a memory pool, the size of a block is determined by the system configuration parameter cache, and the number of memory blocks in each vnode is determined by the configuration parameter blocks. So for a vnode, the total memory size is cache * blocks; see the example below. To be efficient, a cache block needs to be able to hold at least dozens of records of each table.
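
For instance (a hedged sketch; the database name and values are only illustrative), with the cache parameter set to 16 MB per block and blocks set to 6, each vnode of the database would own 16 * 6 = 96 MB of cache:

```mysql
-- Sketch only: 16 MB per block * 6 blocks = 96 MB of cache per vnode.
create database demo cache 16 blocks 6;
```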

You can quickly obtain the last record of a table or a STable through the function last_row, which is very convenient for showing the real-time status or collected values of each device on a large screen. For example:

```mysql
select last_row(voltage) from meters where location='Beijing.Chaoyang';
```

This SQL statement obtains the last recorded voltage value of all smart meters located in Chaoyang District, Beijing.

## <a class="anchor" id="alert"></a> Alert

In TDengine scenarios, alarm monitoring is a common requirement. Conceptually, it requires the program to filter out the data that meets certain conditions from the data of the most recent period, calculate a result from these data according to a defined formula, and, when the result meets certain conditions and lasts for a certain period of time, notify the user in some form.

In order to meet such needs, TDengine provides this function in the form of an independent module. For its installation and use, please refer to the blog [How to Use TDengine for Alarm Monitoring](https://www.taosdata.com/blog/2020/04/14/1438.html).

# Connections with Other Tools

## <a class="anchor" id="grafana"></a> Grafana

TDengine can be quickly integrated with [Grafana](https://www.grafana.com/), an open source data visualization system, to build a data monitoring and alarm system. The whole process requires no code to be written; the contents of data tables in TDengine can be visualized on a dashboard.

### Install Grafana

TDengine currently supports Grafana 5.2.4 and above. You can download and install the package from the Grafana website according to your current operating system. The download address is: https://grafana.com/grafana/download

### Configure Grafana

The TDengine Grafana plugin is in the /usr/local/taos/connector/grafanaplugin directory.

Taking CentOS 7.2 as an example, just copy the grafanaplugin directory to the /var/lib/grafana/plugins directory and restart Grafana:

```bash
sudo cp -rf /usr/local/taos/connector/grafanaplugin /var/lib/grafana/plugins/tdengine
```

### Use Grafana

#### Configure data source

You can log in to the Grafana server (username/password: admin/admin) through localhost:3000, and add data sources through `Configuration -> Data Sources` on the left panel, as shown in the following figure:

![img](page://images/connections/add_datasource1.jpg)

Click `Add data source` to enter the Add Data Source page, and type TDengine in the search box to select it, as shown in the following figure:

![img](page://images/connections/add_datasource2.jpg)

Enter the data source configuration page and modify the corresponding configuration according to the default prompts:

![img](page://images/connections/add_datasource3.jpg)

- Host: the IP address of any server in the TDengine cluster and the port number of the TDengine RESTful interface (6041); default [http://localhost:6041](http://localhost:6041/). A quick way to verify this endpoint is sketched below.
- User: TDengine user name.
- Password: TDengine user password.
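
As a hedged aside, you can sanity-check the RESTful endpoint from a shell before configuring Grafana (host, user and password below follow the defaults above):

```bash
# Expect a JSON document listing databases if the REST service on port 6041
# is reachable; root/taosdata are the default credentials.
curl -u root:taosdata -d 'show databases;' http://localhost:6041/rest/sql
```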

Click `Save & Test` to test; on success, you will be prompted as follows:

![img](page://images/connections/add_datasource4.jpg)

#### Create Dashboard

Go back to the home page to create a dashboard, and click `Add Query` to enter the panel query page:

![img](page://images/connections/create_dashboard1.jpg)

As shown in the figure above, select the TDengine data source in Query, and enter the corresponding SQL in the query box below. Details are as follows:

- INPUT SQL: enter the statement to query (the result set of the SQL statement should be two columns and multiple rows), for example: `select avg(mem_system) from log.dn where ts >= $from and ts < $to interval($interval)`, where `from`, `to` and `interval` are built-in variables of the TDengine plug-in, representing the query range and time interval obtained from the Grafana panel. In addition to built-in variables, custom template variables are also supported.
- ALIAS BY: you can set an alias for the current query.
- GENERATE SQL: clicking this button will automatically replace the corresponding variables and generate the final statement to execute.

According to the default prompts, querying the average system memory usage of the server where the current TDengine is deployed, at the specified interval, looks as follows:

![img](page://images/connections/create_dashboard2.jpg)

> Please refer to the Grafana [documents](https://grafana.com/docs/) for how to use Grafana to create the corresponding monitoring interface and for more about Grafana usage.

#### Import Dashboard

An importable dashboard, `tdengine-grafana.json`, is provided under the Grafana plug-in directory /usr/local/taos/connector/grafana/tdengine/dashboard/.

Click the `Import` button on the left panel and upload the `tdengine-grafana.json` file:

![img](page://images/connections/import_dashboard1.jpg)

After the dashboard is imported, you will see the following:

![img](page://images/connections/import_dashboard2.jpg)

## <a class="anchor" id="matlab"></a> MATLAB

MATLAB can connect directly to TDengine via the JDBC driver provided in the installation package and load data into the local workspace.

### JDBC Interface Adaptation of MATLAB

Several steps are required to adapt MATLAB to TDengine. Taking adapting MATLAB 2017a on Windows 10 as an example:

- Copy the file JDBCDriver-1.0.0-dist.jar from the TDengine package to the directory ${matlab_root}\MATLAB\R2017a\java\jar\toolbox
- Copy the file taos.lib from the TDengine package to ${matlab_root}\MATLAB\R2017a\lib\win64
- Add the .jar package just copied to the MATLAB classpath by appending the line below to the end of the file ${matlab_root}\MATLAB\R2017a\toolbox\local\classpath.txt

  ```
  $matlabroot/java/jar/toolbox/JDBCDriver-1.0.0-dist.jar
  ```

- Create a file called javalibrarypath.txt in the directory ${user_home}\AppData\Roaming\MathWorks\MATLAB\R2017a, and add the taos.dll path to the file. For example, if the file taos.dll is in the directory C:\Windows\System32, then add the following line to javalibrarypath.txt:

  ```
  C:\Windows\System32
  ```

### Connect to TDengine in MATLAB to get data

After the above configuration is done successfully, open MATLAB.

- Create a connection:

```matlab
conn = database('db', 'root', 'taosdata', 'com.taosdata.jdbc.TSDBDriver', 'jdbc:TSDB://127.0.0.1:0/')
```

- Make a query:

```matlab
sql0 = ['select * from tb']
data = select(conn, sql0);
```

- Insert a record:

```matlab
sql1 = ['insert into tb values (now, 1)']
exec(conn, sql1)
```

For more detailed examples, please refer to the examples\Matlab\TDEngineDemo.m file in the package.

## <a class="anchor" id="r"></a> R

The R language supports connecting to the TDengine database through the JDBC interface. First, you need to install the JDBC package for R. Launch the R language environment, and then execute the following command to install the JDBC support library:

```R
install.packages('RJDBC', repos='http://cran.us.r-project.org')
```

After installation, load the RJDBC package by executing the `library('RJDBC')` command.

Then load the TDengine JDBC driver:

```R
drv<-JDBC("com.taosdata.jdbc.TSDBDriver","JDBCDriver-2.0.0-dist.jar", identifier.quote="\"")
```

If it succeeds, no error message will be displayed. Then use the following command to try a database connection:

```R
conn<-dbConnect(drv,"jdbc:TSDB://192.168.0.1:0/?user=root&password=taosdata","root","taosdata")
```
|
||||
|
||||
Please replace the IP address in the command above to the correct one. If no error message is shown, then the connection is established successfully, otherwise the connection command needs to be adjusted according to the error prompt. TDengine supports below functions in *RJDBC* package:
- `dbWriteTable(conn, "test", iris, overwrite=FALSE, append=TRUE)`: write the data in data frame iris to the table test in the TDengine server. Parameter overwrite must be FALSE, append must be TRUE, and the schema of the data frame iris should be the same as that of the table test.
- `dbGetQuery(conn, "select count(*) from test")`: run a query command.
- `dbSendUpdate(conn, "use db")`: execute any non-query SQL statement, e.g. `dbSendUpdate(conn, "use db")` to switch databases, or `dbSendUpdate(conn, "insert into t1 values (now, 99)")` to write data.
- `dbReadTable(conn, "test")`: read all the data in table test.
- `dbDisconnect(conn)`: close a connection.
- `dbRemoveTable(conn, "test")`: remove table test.
The functions below are not supported currently:

- `dbExistsTable(conn, "test")`: check if table test exists.
- `dbListTables(conn)`: list all tables in the connection.
# Installation and Management of TDengine Cluster

Multiple TDengine servers, that is, multiple running instances of taosd, can form a cluster to ensure the highly reliable operation of TDengine and provide scale-out capability. To understand cluster management in TDengine 2.0, it is necessary to understand the basic concepts of clustering; please refer to the chapter "Overall Architecture of TDengine 2.0". Before deploying a cluster, please follow the chapter ["Getting started"](https://www.taosdata.com/en/documentation/getting-started/) to install and experience the single-node functionality.

Each data node of the cluster is uniquely identified by its End Point, which is composed of an FQDN (Fully Qualified Domain Name) plus a port, such as h1.taosdata.com:6030. The FQDN is generally the hostname of the server, which can be obtained through the Linux command `hostname -f` (for how to configure FQDN, please refer to: [All about FQDN of TDengine](https://www.taosdata.com/blog/2020/09/11/1824.html)). The port is the external service port number of the data node; the default is 6030, but it can be modified by configuring the parameter serverPort in taos.cfg. A physical node may be configured with multiple hostnames, and TDengine will automatically pick the first one, but it can also be specified through the configuration parameter fqdn in taos.cfg. If you are accustomed to direct IP address access, you can set the parameter fqdn to the IP address of the node.

The cluster management of TDengine is extremely simple. Except for manual intervention in adding and deleting nodes, all other tasks are completed automatically, thus minimizing the operation workload. This chapter describes cluster management operations in detail.

Please refer to the [video tutorial](https://www.taosdata.com/blog/2020/11/11/1961.html) for cluster building.
## <a class="anchor" id="prepare"></a> Preparation

**Step 0:** Plan the FQDN of all physical nodes in the cluster; add the planned FQDN to /etc/hostname of each physical node respectively, and modify the /etc/hosts of each physical node to add the corresponding IP and FQDN of all cluster physical nodes. (If DNS is deployed, contact your network administrator to configure it on the DNS server.)
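For example, for a three-node cluster, the /etc/hosts file on every physical node might contain entries like the following (the hostnames and IP addresses here are placeholders):

```
192.168.0.1   h1.taosdata.com
192.168.0.2   h2.taosdata.com
192.168.0.3   h3.taosdata.com
```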
**Step 1:** If the physical nodes have previous test data, or were installed with version 1.x or another version of TDengine, please delete the installation and drop all data first. For specific steps, please refer to the blog "[Installation and Uninstallation of Various Packages of TDengine](https://www.taosdata.com/blog/2019/08/09/566.html)".

**Note 1:** Because FQDN information is written into files, if the FQDN has not been configured, or is changed after TDengine has been started, be sure to clean up the previous data (`rm -rf /var/lib/taos/*`), on the premise that the data is useless or has been backed up;

**Note 2:** The client also needs to be able to correctly resolve the FQDN of each node, whether through a DNS service or the hosts file.
**Step 2:** It is recommended to close the firewall of all physical nodes, or at least ensure that TCP and UDP ports 6030-6042 are open. It is **strongly recommended** to close the firewall first and configure the ports after the cluster is built;

**Step 3:** Install TDengine on all physical nodes, and the version must be consistent, **but do not start taosd**. During installation, when prompted whether to join an existing TDengine cluster, press enter directly on the first physical node to create a new cluster; on the subsequent physical nodes, enter the FQDN:port (default 6030) of any online physical node already in the cluster;

**Step 4:** Check the network settings of all data nodes and of the physical nodes where the applications are located (see the sketch after this list):

1. Execute the command `hostname -f` on each physical node, and confirm that the hostnames of all nodes are different (the node where the application driver is located does not need this check).
2. Execute `ping host` on each physical node, where host is the hostname of another physical node, and see whether the other physical nodes can be reached; if not, check the network settings, the /etc/hosts file (the default path on Windows is C:\Windows\system32\drivers\etc\hosts), or the DNS configuration. If ping fails, the cluster cannot be built.
3. From the physical node where the application runs, ping the data nodes where taosd runs. If the ping fails, the application cannot connect to taosd; please check the DNS settings or hosts file of the physical node where the application is located;
4. The End Point of each data node is the output hostname plus the port number, for example, h1.taosdata.com:6030.
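A minimal check sequence on the first node might look like the following (the hostnames are placeholders):

```
$ hostname -f
h1.taosdata.com
$ ping -c 1 h2.taosdata.com    # repeat for each of the other physical nodes
```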
**Step 5:** Modify the TDengine configuration file (the file /etc/taos/taos.cfg needs to be modified on all nodes). Assume that the End Point of the first data node to be started is h1.taosdata.com:6030; its parameters related to cluster configuration are as follows:

```
// firstEp is the first data node connected after each data node's first launch
firstEp h1.taosdata.com:6030
// Must be configured as the FQDN of this data node. If this machine has only one hostname, you can comment out this configuration
fqdn h1.taosdata.com
// Configure the port number of this data node, the default is 6030
serverPort 6030
// For application scenarios, please refer to the section "Use of Arbitrator"
arbitrator ha.taosdata.com:6042
```

The parameters that must be modified are firstEp and fqdn. On each data node, firstEp needs to be configured to the same value, **but fqdn must be configured to the FQDN of the data node where it is located**. Other parameters need not be modified unless you have clear reasons.

**A data node dnode added to the cluster must be exactly the same as the existing cluster in the 11 parameters in the following table related to the cluster; otherwise it cannot be successfully added to the cluster.**
| **#** | **Configuration Parameter Name** | **Description**                                              |
| ----- | -------------------------------- | ------------------------------------------------------------ |
| 1     | numOfMnodes                      | Number of management nodes in the system                     |
| 2     | mnodeEqualVnodeNum               | Number of vnodes an mnode is counted as                      |
| 3     | offlineThreshold                 | Offline threshold for judging whether a dnode is offline     |
| 4     | statusInterval                   | The interval at which a dnode reports its status to mnode    |
| 5     | arbitrator                       | The end point of the arbitrator in the system                |
| 6     | timezone                         | Time zone                                                    |
| 7     | locale                           | Locale information and encoding format of the system         |
| 8     | charset                          | Character set encoding                                       |
| 9     | balance                          | Whether to enable load balancing                             |
| 10    | maxTablesPerVnode                | The maximum number of tables that can be created in each vnode |
| 11    | maxVgroupsPerDb                  | The maximum number of vgroups that can be used per DB        |
## <a class="anchor" id="node-one"></a> Launch the First Data Node

Follow the instructions in "[Getting started](https://www.taosdata.com/en/documentation/getting-started/)" to launch the first data node, for example h1.taosdata.com; then execute taos to start the TDengine shell, and run the command `show dnodes` in the shell, as follows:

```
Welcome to the TDengine shell from Linux, Client Version:2.0.0.0
Copyright (c) 2017 by TAOS Data, Inc. All rights reserved.

taos> show dnodes;
 id |      end_point      | vnodes | cores | status | role |       create_time       |
=====================================================================================
  1 |  h1.taos.com:6030   |      0 |     2 | ready  | any  | 2020-07-31 03:49:29.202 |
Query OK, 1 row(s) in set (0.006385s)

taos>
```

In the output above, you can see that the End Point of the newly launched data node is h1.taos.com:6030, which is the firstEp of the new cluster.
## <a class="anchor" id="node-other"></a>Launch Subsequent Data Nodes

To add subsequent data nodes to the existing cluster, follow these steps:

1. Start taosd on each physical node according to the chapter "[Getting started](https://www.taosdata.com/en/documentation/getting-started/)";

2. On the first data node, use the CLI program taos to log in to the TDengine system and execute the command:

   ```
   CREATE DNODE "h2.taos.com:6030";
   ```

   This adds the End Point of the new data node (learned in Step 4 of the preparation) to the cluster's EP list. **"fqdn:port" needs to be enclosed in double quotation marks**, otherwise an error will occur. Note that the example h2.taos.com:6030 should be replaced with the End Point of this new data node.
3. Then execute the command:

   ```
   SHOW DNODES;
   ```

   Check whether the new node joined successfully. If the added data node is offline, then check:

   - Whether the taosd of this data node is working properly. If it is not working properly, find the reason first;
   - Check the first few lines of the data node's taosd log file taosdlog.0 (usually in the /var/log/taos directory) to see whether the FQDN and port number output in the log match the End Point just added. If not, add the correct End Point.

Following the above steps, new data nodes can be added to the cluster one by one.
**Tips**:

- Any data node that has joined the cluster and is online can serve as the firstEp of a subsequent node to be joined.
- firstEp is only effective when a data node joins the cluster for the first time. After joining, the data node saves the latest End Point list of mnodes and no longer relies on this parameter.
- Two data nodes that are not configured with the firstEp parameter will each run independently after startup. In that case, neither can be added to the other to form a cluster, and **you cannot merge two independent clusters into a new cluster**.
## <a class="anchor" id="management"></a> Data Node Management

The above describes how to build a cluster from scratch. After the cluster is formed, new data nodes can be added at any time for expansion, data nodes can be deleted, and the current status of the cluster can be checked.

### Add data nodes

Execute the CLI program taos, log in to the system using the root account, and execute:

```
CREATE DNODE "fqdn:port";
```

This adds the End Point of the new data node to the cluster's EP list. **"fqdn:port" needs to be enclosed in double quotation marks**, otherwise an error will occur. The fqdn and port of a data node's external service can be configured through the configuration file taos.cfg; by default they are obtained automatically. (It is strongly recommended not to rely on automatic FQDN acquisition, which may produce an unexpected End Point for the generated data node.)
### Delete data nodes

Execute the CLI program taos, log in to the TDengine system using the root account, and execute:

```
DROP DNODE "fqdn:port";
```

where fqdn is the FQDN of the node to be deleted, and port is the port number of its external service.
<font color=green>**[Note]**</font>

- Once a data node is dropped, it cannot rejoin the cluster. To rejoin, the node needs to be redeployed (with its data folder emptied). The cluster migrates the data away from the dnode before the drop dnode operation completes.
- Note that dropping a dnode and stopping the taosd process are two different concepts. Do not confuse them: because the data migration must be performed before a dnode is deleted, the dnode being deleted must remain online, and its taosd process can only be stopped after the delete operation completes.
- After a data node is dropped, other nodes will perceive the deletion of this dnodeID, and no node in the cluster will accept requests from this dnodeID.
- dnodeID is automatically assigned by the cluster and cannot be specified manually. It is incremented at generation time and never repeats.
### View data nodes

Execute the CLI program taos, log in to the TDengine system using the root account, and execute:

```
SHOW DNODES;
```

This lists all dnodes in the cluster, with each dnode's fqdn:port, status (ready, offline, etc.), number of vnodes, and number of unused vnodes. You can use this command to check the cluster after adding or deleting a data node.
### View virtual node group

In order to make full use of multi-core technology and provide scalability, data needs to be processed in partitions. Therefore, TDengine splits the data of a DB into multiple parts and stores them in multiple vnodes. These vnodes may be distributed across multiple data node dnodes, thus achieving scale-out. A vnode belongs to only one DB, but a DB can have multiple vnodes. vnodes are allocated automatically by mnode according to the current system resources, without any manual intervention.

Execute the CLI program taos, log in to the TDengine system using the root account, and execute:

```
SHOW VGROUPS;
```
## <a class="anchor" id="high-availability"></a> High-availability of vnode

TDengine provides high availability of the system through a multi-replica mechanism, including high availability of vnode and mnode.

The number of replicas of a vnode is associated with a DB. There can be multiple DBs in a cluster, and each DB can be configured with different replicas according to operational requirements. When creating a database, specify the number of replicas with the parameter replica (the default is 1). If the number of replicas is 1, the reliability of the system cannot be guaranteed: as soon as the node where the data is located goes down, the service becomes unavailable. The number of nodes in the cluster must be greater than or equal to the number of replicas, otherwise the error "more dnodes are needed" will be returned when creating a table. For example, the following command creates a database demo with 3 replicas:

```
CREATE DATABASE demo replica 3;
```

The data in a DB is partitioned and split across multiple vnode groups. The number of vnodes in a vnode group equals the number of replicas of the DB, and the data of all vnodes in the same vnode group is completely consistent. In order to ensure high availability, the vnodes in a vnode group must be distributed on different data node dnodes (in actual deployment, they need to be on different physical machines). As long as more than half of the vnodes in a vgroup are working, the vgroup can serve normally.

A data node dnode may contain data from multiple DBs, so when a dnode goes offline, it may affect multiple DBs. If half or more of the vnodes in a vnode group do not work, the vnode group cannot serve externally and cannot insert or read data, which will affect the read and write operations of some tables in the DB to which it belongs.

Because of the introduction of vnodes, it is impossible to simply conclude that "if more than half of the data node dnodes in the cluster work, the cluster should work". But for simple cases, it is easy to judge. For example, if the number of replicas is 3 and there are only 3 dnodes, the whole cluster can still work normally if only one node is down, but it cannot work normally if two data nodes are down.
## <a class="anchor" id="mnode"></a> High-availability of mnode

A TDengine cluster is managed by mnode (a module of taosd, the management node). In order to ensure high availability of mnode, multiple mnode replicas can be configured. The number of replicas is determined by the system configuration parameter numOfMnodes, and the valid range is 1-3. In order to ensure strong consistency of metadata, mnode replicas are replicated synchronously.

A cluster has multiple data node dnodes, but a dnode runs at most one mnode instance. With multiple dnodes, which dnodes serve as mnodes? This is decided automatically by the system according to the overall resource situation. The user can execute the following command in the console of TDengine through the CLI program taos:

```
SHOW MNODES;
```

This displays the mnode list, which shows the End Point and role (master, slave, unsynced, or offline) of each dnode where an mnode is located. When the first data node in the cluster starts, that data node must run an mnode instance, because a system must have at least one mnode; otherwise it cannot work properly. If numOfMnodes is configured to 2, then when the second dnode starts, it will also run an mnode instance.

To ensure high availability of the mnode service, numOfMnodes must be set to 2 or greater. Because the metadata saved by mnode must be strongly consistent, if numOfMnodes is greater than 2, the replication parameter quorum is automatically set to 2; that is, at least two replicas must be written successfully before the client application is notified of a successful write.
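For example, to run two mnode replicas, each node's taos.cfg could contain the following line (a sketch; as listed in the parameter table above, this value must be identical on all data nodes):

```
// run two synchronously replicated mnode instances
numOfMnodes 2
```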
**Note:** A highly-available TDengine system, whether for vnode or mnode, must be configured with multiple replicas.
## <a class="anchor" id="load-balancing"></a> Load Balancing

There are three situations in which load balancing will be triggered, with no manual intervention required:

- When a new data node is added to the cluster, the system will automatically trigger load balancing, and the data on some nodes will be automatically migrated to the new data node.
- When a data node is removed from the cluster, the system will automatically migrate the data on that data node to other data nodes.
- If a data node is overloaded (its data volume is too large), the system will automatically perform load balancing and migrate some vnodes of that data node to other nodes.

When any of the above three situations occurs, the system computes the load of each data node to decide how to migrate.

**[Tip] Load balancing is controlled by the parameter balance, which determines whether to enable automatic load balancing.**
## <a class="anchor" id="offline"></a> Offline Processing of Data Nodes

If a data node goes offline, the TDengine cluster will automatically detect it. There are two cases:

- If the data node is offline for more than a certain period of time (the configuration parameter offlineThreshold in taos.cfg controls the duration), the system will automatically remove the data node, generate system alarm information, and trigger the load balancing process. If the removed data node comes online again, it will not be able to join the cluster; the system administrator needs to add it to the cluster again.
- If the node comes online again within the duration of offlineThreshold, the system will automatically start the data recovery process. After the data is fully recovered, the node starts working normally.

**Note:** If all the data nodes of a virtual node group (including the mnode group) are in offline or unsynced state, a master can only be elected after all data nodes in the virtual node group are online and can exchange status information; only then can the virtual node group serve externally. For example, suppose the whole cluster has 3 data nodes and the number of replicas is 3. If all 3 data nodes go down and then 2 data nodes restart, the group still cannot work; it can serve externally again only after all 3 data nodes restart successfully.
## <a class="anchor" id="arbitrator"></a> How to Use Arbitrator

If the number of replicas is even, it is impossible to elect a master in a vnode group when half of the vnodes are not working. Similarly, when half of the mnodes are not working, the master of the mnode group cannot be elected because of the "split brain" problem. To solve this problem, TDengine introduces the concept of Arbitrator. The Arbitrator simulates a working vnode or mnode, but is only responsible for network connection and does not handle any data insertion or access. As long as more than half of the vnodes or mnodes, including the Arbitrator, are working, the vnode group or mnode group can normally provide data insertion or query services. For example, in the case of 2 replicas, if one node A is offline but the other node B is online and can connect to the Arbitrator, then node B can work normally.

In short, under the current version, TDengine recommends configuring an Arbitrator in double-replica environments to improve availability.

The name of the Arbitrator executable is tarbitrator. The executable has almost no requirements on system resources and only needs a network connection; any Linux server can run it. The following briefly describes the installation and configuration steps:

1. Click [Package Download](https://www.taosdata.com/cn/all-downloads/), and in the TDengine Arbitrator Linux section, select the appropriate version to download and install.
2. The command-line parameter -p of this application can specify the port number of its external service; the default is 6042.
3. Modify the configuration file of each taosd instance, setting the parameter arbitrator in taos.cfg to the End Point of the tarbitrator. (If this parameter is configured, when the number of replicas is even, the system will automatically connect to the configured Arbitrator. If the number of replicas is odd, the system will not establish a connection even if the Arbitrator is configured.)
4. The Arbitrator configured in the configuration file will appear in the output of the `SHOW DNODES` instruction; the value of the corresponding role column will be "arb".
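For example, starting the arbitrator on the host used in the taos.cfg sample earlier might look like this (a sketch; ha.taosdata.com is a placeholder host):

```
# on host ha.taosdata.com, start tarbitrator on its default port 6042
tarbitrator -p 6042
```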
# TDengine Operation and Maintenance
## <a class="anchor" id="planning"></a> Capacity Planning

When using TDengine to build an IoT big data platform, computing and storage resources need to be planned according to the business scenario. The following discusses the memory, CPU and hard disk space required for the system to run.

### Memory requirements

Each DB can create a fixed number of vgroups, which by default equals the number of CPU cores and can be configured by maxVgroupsPerDb; each replica in a vgroup is a vnode; each vnode takes up a fixed amount of memory (the size is related to the database's configuration parameters blocks and cache); each table takes up memory related to the total length of its tags; in addition, the system has some fixed memory overhead. Therefore, the system memory required by each DB can be calculated by the following formula:
```
Database Memory Size = maxVgroupsPerDb * (blocks * cache + 10MB) + numOfTables * (tagSizePerTable + 0.5KB)
```
Example: assuming a 4-core machine, with cache at the default size of 16 MB and blocks at the default value of 6, and supposing there are 100,000 tables with a total tag length of 256 bytes, the total memory requirement is: 4 * (16 * 6 + 10) + 100,000 * (0.25 + 0.5) / 1000 = 499 MB.

A system in actual operation often stores data in different DBs according to the different characteristics of the data. All of these DBs should be considered when planning.

If there is plenty of memory, the configuration of blocks can be increased so that more data is kept in memory and query speed improves.
### CPU requirements

CPU requirements depend on the following two aspects:

- **Data insertion**: a single TDengine core can handle at least 10,000 insertion requests per second. Each insertion request can carry multiple records, and inserting one record at a time consumes about the same computing resources as inserting 10 records at a time. Therefore, the more records per insert, the higher the insertion efficiency. With insert requests of more than 200 records, a single core can insert 1 million records per second. However, the faster the insertion speed, the higher the requirement on the front-end data collection, because records need to be cached and then inserted in batches.
- **Query requirements**: TDengine provides efficient queries, but the queries and their frequency vary greatly from scenario to scenario, making it difficult to give objective figures. Users need to write some query statements for their own scenarios to determine the requirement.

Therefore, only the CPU needed for data insertion can be estimated; the computing resources consumed by queries cannot be estimated so clearly. In actual operation, it is not recommended to let CPU utilization exceed 50%; beyond that, new nodes need to be added to bring in more computing resources.
### Storage requirements

Compared with general databases, TDengine has an ultra-high compression ratio. In most scenarios, the compression ratio of TDengine is no less than 5:1, and in some scenarios it may exceed 10:1, depending on the actual data characteristics. The raw data size before compression can be calculated as follows:

```
Raw DataSize = numOfTables * rowSizePerTable * rowsPerTable
```

Example: for 10 million smart meters, with each meter collecting data every 15 minutes and each collection being 128 bytes, the raw data volume in one year is: 10,000,000 * 128 * 24 * 60 / 15 * 365 = 44.851 TB. TDengine would consume approximately 44.851 / 5 = 8.97 TB of storage.
Users can set the maximum retention time of data on disk through the parameter `keep`. To further reduce the storage cost, TDengine also provides tiered storage: the coldest data can be stored on the cheapest storage media, with no change needed in application access, only a lower reading speed.

To improve speed, multiple hard disks can be configured so that data can be written and read concurrently. It should be noted that TDengine provides high data reliability through multiple replicas, so it is no longer necessary to use expensive disk arrays.

### Number of physical or virtual machines

Based on the above estimates of memory, CPU and storage, we can calculate how many cores, how much memory and how much storage space the whole system needs. If the number of data replicas is not 1, the total demand needs to be multiplied by the number of replicas.

Because TDengine provides a great scale-out feature, it is easy to decide how many physical or virtual machines to purchase according to the total demand and the resources of a single physical/virtual machine.

**To calculate CPU, memory and storage immediately, see:** [**Resource Estimation**](https://www.taosdata.com/config/config.html)
## Fault Tolerance and Disaster Recovery

### Fault tolerance

TDengine supports the WAL (Write Ahead Log) mechanism to provide fault tolerance and ensure high availability of data.

When TDengine receives the application's request packet, it first writes the original request packet into the database log file, and only deletes the corresponding WAL after the data has been successfully written. This ensures that TDengine can recover data from the database log file when the service is restarted after a power failure or other interruption, thus avoiding data loss.

Two system configuration parameters are involved:

- walLevel: WAL level. 0: do not write WAL; 1: write WAL, but do not execute fsync; 2: write WAL and execute fsync.
- fsync: the period in which fsync is executed when walLevel is set to 2. Setting it to 0 means that fsync is executed immediately on every write.

To guarantee 100% data safety, walLevel needs to be set to 2 and fsync to 0. This lowers the write speed. However, if the number of writer threads on the application side reaches a certain level (more than 50), write performance is still good, only about 30% lower than with fsync set to 3000 milliseconds.
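As a sketch, the corresponding taos.cfg entries for maximum data safety would be:

```
// write WAL and fsync on every write: safest, but slower writes
walLevel 2
fsync 0
```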
### Disaster recovery

A TDengine cluster provides high availability of the system and implements disaster recovery through its multi-replica mechanism.

A TDengine cluster is managed by mnode. To ensure high reliability of the mnode, multiple mnode replicas can be configured, with the number determined by the system configuration parameter numOfMnodes. To achieve high reliability, it needs to be set greater than 1. To ensure strong consistency of metadata, the mnode replicas duplicate data synchronously.

The number of replicas of time-series data in a TDengine cluster is associated with databases. There can be multiple databases in a cluster, and each database can be configured with its own number of replicas, specified by the parameter replica when creating the database. To achieve high reliability, the number of replicas needs to be set greater than 1.

The number of nodes in a TDengine cluster must be greater than or equal to the number of replicas, otherwise an error will be reported at table creation.

When the nodes of a TDengine cluster are deployed on different physical machines and multiple replicas are set, high reliability of the system is achieved without any other software or tools. TDengine Enterprise Edition can also deploy replicas in different server rooms, thus providing remote disaster recovery.
## <a class="anchor" id="config"></a> Server-side Configuration

The background service of the TDengine system is provided by taosd, and its configuration parameters can be modified in the configuration file taos.cfg to meet the requirements of different scenarios. The default location of the configuration file is the /etc/taos directory, but a different directory can be specified with the -c parameter on the taosd command line, e.g. `taosd -c /home/user` specifies that the configuration file is located in the /home/user directory.

You can also use `-C` to show the current server configuration parameters:

```
taosd -C
```

Only some important configuration parameters are listed below. For more parameters, please refer to the instructions in the configuration file. Please refer to the previous chapters for the detailed introduction and function of each parameter; the defaults of these parameters are usable and generally do not need to be changed. **Note: after the configuration is modified, the \*taosd service\* needs to be restarted to take effect.**
- firstEp: the End Point of the first dnode in the cluster that taosd actively connects to at startup; the default value is localhost:6030.
- fqdn: the FQDN of the data node, which defaults to the first hostname configured by the operating system. If you are accustomed to IP address access, you can set it to the IP address of the node.
- serverPort: the port number of the external service after taosd starts; the default value is 6030.
- httpPort: the port number used by the RESTful service; all HTTP query/write requests (TCP) are served on this port. The default value is 6041.
- dataDir: the data file directory, to which all data files are written. Default: /var/lib/taos.
- logDir: the log file directory, to which the running log files of the client and server are written. Default: /var/log/taos.
- arbitrator: the End Point of the arbitrator in the system; the default value is null.
- role: optional role for the dnode. 0: any, it can serve as an mnode and vnodes can be allocated on it; 1: mgmt, it can only be an mnode and no vnodes are allocated on it; 2: dnode, it cannot be an mnode and only vnodes are allocated on it.
- debugFlag: running log switch. 131 (output error and warning logs), 135 (output error, warning, and debug logs), 143 (output error, warning, debug, and trace logs). Default value: 131 or 135 (different modules have different default values).
- numOfLogLines: the maximum number of lines allowed in a single log file. Default: 10,000,000 lines.
- logKeepDays: the maximum retention time of log files. When it is greater than 0, a log file is renamed to taosdlog.xxx, where xxx is the timestamp of the last modification of the log file in seconds. Default: 0 days.
- maxSQLLength: the maximum length allowed for a single SQL statement. Default: 65380 bytes.
- telemetryReporting: whether TDengine is allowed to collect and report basic usage information. 0 means not allowed, and 1 means allowed. Default: 1.
- stream: whether continuous query (the stream computing function) is enabled. 0 means disabled, 1 means enabled. Default: 1.
- queryBufferSize: the amount of memory reserved for all concurrent queries. As a rule of thumb, it can be estimated as the maximum possible number of concurrent queries in the actual application, multiplied by the number of tables involved, multiplied by 170. The unit is MB (in versions before 2.0.15, the unit of this parameter is bytes).
- ratioOfQueryCores: sets the maximum number of query threads. The minimum value of 0 means there is only one query thread; the maximum value of 2 means the maximum number of query threads is 2 times the number of CPU cores. The default is 1, which means the maximum number of query threads equals the number of CPU cores. This value can be a decimal; for example, 0.5 means the number of query threads is half the number of CPU cores.
**Note:** for ports, TDengine uses 13 continuous TCP and UDP port numbers starting from serverPort, so be sure to open them in the firewall. With the default configuration, a total of 13 ports from 6030 to 6042 need to be opened, for both TCP and UDP.

Data in different application scenarios often have different characteristics, such as retention days, number of replicas, collection frequency, record size, number of collection points, compression, etc. In order to achieve the best storage efficiency, TDengine provides the following storage-related system configuration parameters:
- days: the time span covered by one data file, in days; the default value is 10.
- keep: the number of days to keep data in the database, in days; default value: 3650.
- minRows: the minimum number of records in a file block; default: 100.
- maxRows: the maximum number of records in a file block; default: 4096.
- comp: file compression flag. 0: off; 1: one-stage compression; 2: two-stage compression. Default: 2.
- walLevel: WAL level. 1: write WAL, but do not execute fsync; 2: write WAL and execute fsync. Default: 1.
- fsync: the period in which fsync is executed when walLevel is set to 2. Setting it to 0 means that fsync is executed immediately on every write. The unit is milliseconds, and the default value is 3000.
- cache: the size of a memory block, in megabytes (MB); default: 16.
- blocks: how many cache-sized memory blocks each vnode (TSDB) has. Therefore, the memory size used by a vnode is roughly (cache * blocks); the default value is 4.
- replica: the number of replicas; value range: 1-3, default value: 1.
- precision: timestamp precision, ms for milliseconds and us for microseconds. Default: ms.
- cacheLast: whether the last_row of sub-tables is cached in memory. 0: off; 1: on. Default: 0. (This parameter is supported as of version 2.0.11.)
For an application scenario, there may be data with multiple different characteristics coexisting. The best design is to put tables with the same data characteristics in one database. Such an application may have multiple databases, each configured with different storage parameters, thus ensuring the optimal performance of the system. TDengine allows the application to specify the above storage parameters at database creation. If specified, the parameters override the corresponding system configuration parameters. For example, consider the following SQL:

```
create database demo days 10 cache 32 blocks 8 replica 3 update 1;
```

This SQL creates a database demo in which each data file stores 10 days of data, each memory block is 32 MB, each vnode occupies 8 memory blocks, the number of replicas is 3, and updates are allowed; all other parameters stay consistent with the system configuration.

When adding a new dnode to a TDengine cluster, some cluster-related parameters must be the same as the configuration of the existing cluster, otherwise the dnode cannot be successfully added. The parameters that are verified are as follows:
- numOfMnodes: the number of management nodes in the system. Default: 3.
- balance: whether to enable load balancing. 0: no, 1: yes. Default: 1.
- mnodeEqualVnodeNum: the number of vnodes an mnode is counted as. Default: 4.
- offlineThreshold: the offline threshold of a dnode, beyond which the dnode will be removed from the cluster. The unit is seconds, and the default value is 86400*10 (i.e., 10 days).
- statusInterval: the interval at which a dnode reports its status to mnode. The unit is seconds, and the default value is 1.
- maxTablesPerVnode: the maximum number of tables that can be created in each vnode. Default: 1000000.
- maxVgroupsPerDb: the maximum number of vgroups that can be used in each database.
- arbitrator: the End Point of the arbitrator in the system; empty by default.
- timezone, locale, charset: see Client Configuration for their configuration.
For the convenience of debugging, the log configuration of each dnode can be adjusted temporarily through SQL statements; the adjustments become invalid after a system restart:

```mysql
ALTER DNODE <dnode_id> <config>
```

- dnode_id: available from the output of the "SHOW DNODES" command
- config: the log parameter to be adjusted, taken from the following list:
  - resetlog: truncate the old log file and create a new log file
  - debugFlag <131|135|143>: set debugFlag to 131, 135 or 143

For example:

```
alter dnode 1 debugFlag 135;
```
## <a class="anchor" id="client"></a> Client Configuration

The foreground interactive client application of the TDengine system is taos, together with the application driver; it shares the same configuration file taos.cfg with taosd. When running taos, use the parameter -c to specify the configuration file directory, e.g. `taos -c /home/cfg` means using the parameters in the taos.cfg configuration file under the /home/cfg/ directory. The default directory is /etc/taos. For more information on how to use taos, see the help information `taos --help`. This section mainly describes the parameters used by the taos client application in the configuration file taos.cfg.

**Versions after 2.0.10.0 support the following parameters on the command line to display the current client configuration parameters:**

```bash
taos -C
taos --dump-config
```
Client configuration parameters:

- firstEp: the End Point of the first taosd instance in the cluster that taos actively connects to at startup; the default value is localhost:6030.
- secondEp: when taos starts, if it cannot connect to firstEp, it will try to connect to secondEp.
- locale

  Default value: obtained dynamically from the system. If automatic acquisition fails, the user needs to set it in the configuration file or through the API.

  TDengine provides a special field type, nchar, for storing non-ASCII encoded wide characters such as Chinese, Japanese and Korean. Data written to an nchar field is uniformly encoded in UCS4-LE format and sent to the server. It should be noted that the correctness of the encoding is guaranteed by the client. Therefore, if users want to use nchar fields to store non-ASCII characters such as Chinese, Japanese or Korean, the encoding format of the client must be set correctly.

  The characters input by the client all use the current default encoding of the operating system, mostly UTF-8 on Linux systems; some Chinese system encodings may be GB18030 or GBK. The default encoding in a docker environment is POSIX. In Chinese versions of Windows, the encoding is CP936. The client needs to make sure the character set it uses is set correctly, i.e. to the encoding currently used by the operating system the client runs on, to ensure that the data in nchar fields is correctly converted into UCS4-LE encoding format.

  The naming rule of locale in Linux is `<language>_<region>.<character set encoding>`, e.g. zh_CN.UTF-8, where zh stands for Chinese, CN stands for mainland China, and UTF-8 stands for the character set. The character set encoding tells the client how to correctly parse local strings. Linux systems and Mac OSX systems can determine the system's character encoding by setting the locale. Because the locale used by Windows is not in the POSIX standard locale format, another configuration parameter, charset, is needed to specify the character encoding under Windows. charset can also be used to specify the character encoding on Linux systems.
- charset

  Default value: obtained dynamically from the system. If automatic acquisition fails, the user needs to set it in the configuration file or through the API.

  If charset is not set in the configuration file, then on Linux, when taos starts up, it automatically reads the current locale information of the system and extracts the charset encoding format from it. If reading the locale information fails, it attempts to read the charset configuration; if reading the charset configuration also fails, the startup process is aborted.

  On Linux, the locale information contains character encoding information, so after correctly setting the locale of the Linux system, there is no need to set charset separately. For example:

  ```
  locale zh_CN.UTF-8
  ```
  On Windows, the current system encoding cannot be obtained from the locale. If the string encoding information cannot be read from the configuration file, taos defaults to CP936, which is equivalent to adding the following to the configuration file:

  ```
  charset CP936
  ```

  If you need to adjust the character encoding, check the encoding used by the current operating system and set it correctly in the configuration file.

  On Linux, if the user sets both locale and charset and the two are inconsistent, the value set later overrides the value set earlier:
  ```
  locale zh_CN.UTF-8
  charset GBK
  ```

  Here the effective value of charset is GBK; if the two lines are set in the reverse order, the effective value of charset is UTF-8.

  The log configuration parameters of the client are exactly the same as those of the server.
- timezone

  Default value: the current time zone option is obtained dynamically from the system.

  The time zone in which the client runs. In order to handle data writing and querying across multiple time zones, TDengine uses Unix timestamps to record and store time. The nature of Unix timestamps ensures that a generated timestamp is consistent at any moment in any time zone. Note that the Unix timestamp conversion is done on the client side. To ensure that other forms of time on the client are correctly converted into Unix timestamps, the correct time zone needs to be set.

  On Linux, the client automatically reads the time zone set by the system. The user can also set the time zone in the configuration file in a number of ways. For example:

  ```
  timezone UTC-8
  timezone GMT-8
  timezone Asia/Shanghai
  ```

  All of the above are legal ways to set the time zone to the East Eight Zone.
  The time zone setting affects the parsing of non-Unix-timestamp content (timestamp strings, the keyword now) in SQL statements for queries and writes. For example:

  ```sql
  SELECT count(*) FROM table_name WHERE TS<'2019-04-11 12:01:08';
  ```

  In the East Eight Zone, the SQL statement is equivalent to:

  ```sql
  SELECT count(*) FROM table_name WHERE TS<1554955268000;
  ```

  In the UTC time zone, the SQL statement is equivalent to:

  ```sql
  SELECT count(*) FROM table_name WHERE TS<1554984068000;
  ```

  In order to avoid the uncertainty of string time formats, Unix timestamps can also be used directly. In addition, timestamp strings with time zones can be used in SQL statements, such as RFC3339-format timestamp strings, 2013-04-12T15:52:01.123+08:00, or ISO-8601-format timestamp strings, 2013-04-12T15:52:01.123+0800. The conversion of these two strings into Unix timestamps is not affected by the time zone of the system.
When starting taos, the End Point of a taosd instance can also be specified on the command line; otherwise it is read from taos.cfg.

- maxBinaryDisplayWidth

  The upper limit of the display width of binary and nchar fields in the shell; content beyond this limit is hidden. Default: 30. You can modify this parameter dynamically in the shell with the command `set max_binary_display_width <nn>`.
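  For example, to widen the display in a running shell session:

  ```
  taos> set max_binary_display_width 60;
  ```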
## <a class="anchor" id="user"></a>User Management

The system administrator can add and delete users in the CLI, and also modify passwords. The SQL syntax in the CLI is as follows:

```sql
CREATE USER <user_name> PASS <'password'>;
```

Create a user, specifying the user name and password. The password needs to be enclosed in single quotation marks; the single quotation marks are in English half-width form.
```sql
DROP USER <user_name>;
```

Delete a user (root only).
```sql
ALTER USER <user_name> PASS <'password'>;
```

Modify a user's password. To avoid it being converted to lowercase, the password needs to be enclosed in single quotation marks; the single quotation marks are in English half-width form.
```sql
ALTER USER <user_name> PRIVILEGE <write|read>;
```

Modify the user privilege to write or read, without adding single quotation marks.

Note: there are three privilege levels in the system: super/write/read, but it is currently not allowed to grant the super privilege to users through the ALTER instruction.
```mysql
SHOW USERS;
```

Show all users.

**Note:** in the SQL syntax, < > indicates the part that requires user input, but do not enter < > itself.
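For example, a complete round trip with a hypothetical user named monitor might look like this:

```mysql
CREATE USER monitor PASS 'monitor_pw';
ALTER USER monitor PRIVILEGE read;
SHOW USERS;
DROP USER monitor;
```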
## <a class="anchor" id="import"></a> Import Data

TDengine provides a variety of convenient data import functions: import by script file, import by data file, and import via the taosdump tool.

**Import by script file**

The TDengine shell supports the `source <filename>` command, which is used to run the SQL statements in a file in batch. Users can write SQL commands such as creating databases, creating tables and writing data in one file, one command per line. By running `source` in the shell, the SQL statements in the file are run in batch in sequence. SQL statements beginning with '#' are considered comments and are automatically ignored by the shell.
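For example, given a hypothetical script file /home/user/init.sql with the following content:

```
# initialize a database and a table, then write one record
create database if not exists db;
use db;
create table if not exists tb (ts timestamp, speed int);
insert into tb values (now, 10);
```

it can be executed in batch from the shell:

```mysql
taos> source /home/user/init.sql;
```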
**Import by data file**

TDengine also supports importing data from a CSV file into an existing table in the shell. The CSV file must belong to one table only, and the data format in the CSV file should be the same as the structure of the table to be imported. The syntax is as follows:

```mysql
insert into tb1 file 'path/data.csv';
```

Note: if there is descriptive information in the first line of the CSV file, please delete it manually before importing.
For example, suppose there is a sub-table d1001 whose structure is as follows:

```mysql
taos> DESCRIBE d1001
             Field              |         Type         |   Length    |    Note    |
=================================================================================
 ts                             | TIMESTAMP            |           8 |            |
 current                        | FLOAT                |           4 |            |
 voltage                        | INT                  |           4 |            |
 phase                          | FLOAT                |           4 |            |
 location                       | BINARY               |          64 | TAG        |
 groupid                        | INT                  |           4 | TAG        |
```
And the format of the data.csv to import is as follows:

```csv
'2018-10-04 06:38:05.000',10.30000,219,0.31000
'2018-10-05 06:38:15.000',12.60000,218,0.33000
'2018-10-06 06:38:16.800',13.30000,221,0.32000
'2018-10-07 06:38:05.000',13.30000,219,0.33000
'2018-10-08 06:38:05.000',14.30000,219,0.34000
'2018-10-09 06:38:05.000',15.30000,219,0.35000
'2018-10-10 06:38:05.000',16.30000,219,0.31000
'2018-10-11 06:38:05.000',17.30000,219,0.32000
'2018-10-12 06:38:05.000',18.30000,219,0.31000
```
Then we can use the following command to import the data:

```mysql
taos> insert into d1001 file '~/data.csv';
Query OK, 9 row(s) affected (0.004763s)
```
**Import via taosdump tool**

TDengine provides a convenient database import and export tool, taosdump. Users can import data exported by taosdump from one system into another system. Please refer to the blog: [User Guide of TDengine DUMP Tool](https://www.taosdata.com/blog/2020/03/09/1334.html).
## <a class="anchor" id="export"></a> Export Data

To facilitate data export, TDengine provides two export methods: export by table and export via taosdump.

**Export a CSV file by table**

If a user needs to export data from a table or a STable, the following can be run in the shell:

```mysql
select * from <tb_name> >> data.csv;
```

In this way, the data in table tb_name is exported to the file data.csv in CSV format.
**Export data via taosdump**

TDengine provides a convenient database export tool, taosdump. Users can choose to export all databases, one database, or one table in a database; all data or data for a time period; or even just the definition of a table. Please refer to the blog: [User Guide of TDengine DUMP Tool](https://www.taosdata.com/blog/2020/03/09/1334.html)
## <a class="anchor" id="status"></a> System Connection and Task Query Management

The system administrator can query the connections, ongoing queries and stream computing tasks of the system from the CLI, and can close connections and stop ongoing queries and stream computing tasks. The SQL syntax in the CLI is as follows:

```mysql
SHOW CONNECTIONS;
```

Show the connections to the database; one column shows ip:port, which is the IP address and port number of the connection.

```mysql
KILL CONNECTION <connection-id>;
```

Force a database connection to close, where connection-id is the number in the first column displayed by SHOW CONNECTIONS.
```mysql
SHOW QUERIES;
```

Show the data queries, where the first column displays two numbers separated by a colon as the query-id: the connection-id of the application connection that initiated the query, and the query serial number.

```mysql
KILL QUERY <query-id>;
```

Force a data query to close, where query-id is the connection-id:query-no string displayed by SHOW QUERIES, such as "105:2"; copy and paste it.
```mysql
SHOW STREAMS;
```

Show the stream computing tasks, where the first column displays two numbers separated by a colon as the stream-id: the connection-id of the application connection that started the stream, and the stream serial number.

```mysql
KILL STREAM <stream-id>;
```

Force a stream computing task to close, where stream-id is the connection-id:stream-no string displayed by SHOW STREAMS, such as 103:2; copy and paste it.
## System Monitoring

After TDengine starts, it automatically creates a monitoring database, log, and regularly writes the server's CPU, memory, hard disk space, bandwidth, number of requests, disk read-write speed, slow queries and other information into the database. TDengine also records logs of important system operations (such as logging in, creating and deleting databases, etc.) as well as various error and alarm information and stores them in the log database. The system administrator can view this database directly from the CLI or view the monitoring information through a GUI on the web.

The collection of these monitoring metrics is enabled by default, but you can modify the option enableMonitor in the configuration file to turn it off or on.
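As a sketch, turning the collection off would be a one-line change in taos.cfg (using the option name given above):

```
// disable collection of monitoring metrics
enableMonitor 0
```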

## <a class="anchor" id="directories"></a> File Directory Structure

After installing TDengine, the following directories or files are generated in the operating system by default:

| **Directory/File**        | **Description**                                              |
| ------------------------- | ------------------------------------------------------------ |
| /usr/local/taos/bin       | TDengine's executable directory. The executables are linked to the /usr/bin directory via soft links. |
| /usr/local/taos/connector | TDengine's various connector directories. |
| /usr/local/taos/driver    | TDengine's dynamic link library directory, linked to the /usr/lib directory via soft links. |
| /usr/local/taos/examples  | TDengine's application example directory for various languages. |
| /usr/local/taos/include   | TDengine's header files of the C interface for external use. |
| /etc/taos/taos.cfg        | TDengine's default configuration file. |
| /var/lib/taos             | TDengine's default data file directory; the location can be modified via the configuration file. |
| /var/log/taos             | TDengine's default log file directory; the location can be modified via the configuration file. |

**Executables**

All executables of TDengine are stored in the directory /usr/local/taos/bin by default, including:

- *taosd*: TDengine server-side executable
- *taos*: TDengine Shell executable
- *taosdump*: a data import/export tool
- *remove.sh*: a script to uninstall TDengine; execute it with caution. It is linked to the rmtaos command in the /usr/bin directory. It removes the TDengine installation directory /usr/local/taos, but keeps /etc/taos, /var/lib/taos, and /var/log/taos.

You can configure different data directories and log directories by modifying the system configuration file taos.cfg.
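
For example, a minimal sketch of the relevant taos.cfg entries, assuming the dataDir and logDir parameters (the paths below are illustrative, not defaults):

```
# data file directory (default: /var/lib/taos)
dataDir   /mnt/data/taos
# log file directory (default: /var/log/taos)
logDir    /mnt/log/taos
```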

## <a class="anchor" id="keywords"></a> TDengine Parameter Limits and Reserved Keywords

- Database name: cannot contain "." or other special characters, and cannot exceed 32 characters
- Table name: cannot contain "." or other special characters, and, together with the name of the database it belongs to, cannot exceed 192 characters
- Table column name: cannot contain special characters, and cannot exceed 64 characters
- Database name, table name, and column name cannot begin with a digit
- Number of columns in a table: cannot exceed 1024
- Maximum length of a record: including the 8-byte timestamp, cannot exceed 16KB (each BINARY/NCHAR column occupies an additional 2 bytes of storage)
- Default maximum string length of a single SQL statement: 65480 bytes
- Number of database replicas: no more than 3
- User name: no more than 23 bytes
- User password: no more than 15 bytes
- Number of tags: no more than 128
- Total length of tags: cannot exceed 16KB
- Number of records: limited by storage space only
- Number of tables: limited only by the number of nodes
- Number of databases: limited only by the number of nodes
- Number of virtual nodes for a single database: cannot exceed 64

At the moment, TDengine has nearly 200 internal reserved keywords, which cannot be used as a database name, table name, STable name, data column name, or tag column name, regardless of case. The list of these keywords is as follows:

| **List of Keywords** |             |              |            |           |
| -------------------- | ----------- | ------------ | ---------- | --------- |
| ABLOCKS              | CONNECTIONS | GT           | MNODES     | SLIDING   |
| ABORT                | COPY        | ID           | MODULES    | SLIMIT    |
| ACCOUNT              | COUNT       | IF           | NCHAR      | SMALLINT  |
| ACCOUNTS             | CREATE      | IGNORE       | NE         | SPREAD    |
| ADD                  | CTIME       | IMMEDIATE    | NONE       | STABLE    |
| AFTER                | DATABASE    | IMPORT       | NOT        | STABLES   |
| ALL                  | DATABASES   | IN           | NOTNULL    | STAR      |
| ALTER                | DAYS        | INITIALLY    | NOW        | STATEMENT |
| AND                  | DEFERRED    | INSERT       | OF         | STDDEV    |
| AS                   | DELIMITERS  | INSTEAD      | OFFSET     | STREAM    |
| ASC                  | DESC        | INTEGER      | OR         | STREAMS   |
| ATTACH               | DESCRIBE    | INTERVAL     | ORDER      | STRING    |
| AVG                  | DETACH      | INTO         | PASS       | SUM       |
| BEFORE               | DIFF        | IP           | PERCENTILE | TABLE     |
| BEGIN                | DISTINCT    | IS           | PLUS       | TABLES    |
| BETWEEN              | DIVIDE      | ISNULL       | PRAGMA     | TAG       |
| BIGINT               | DNODE       | JOIN         | PREV       | TAGS      |
| BINARY               | DNODES      | KEEP         | PRIVILEGE  | TBLOCKS   |
| BITAND               | DOT         | KEY          | QUERIES    | TBNAME    |
| BITNOT               | DOUBLE      | KILL         | QUERY      | TIMES     |
| BITOR                | DROP        | LAST         | RAISE      | TIMESTAMP |
| BOOL                 | EACH        | LE           | REM        | TINYINT   |
| BOTTOM               | END         | LEASTSQUARES | REPLACE    | TOP       |
| BY                   | EQ          | LIKE         | REPLICA    | TRIGGER   |
| CACHE                | EXISTS      | LIMIT        | RESET      | UMINUS    |
| CASCADE              | EXPLAIN     | LINEAR       | RESTRICT   | UPLUS     |
| CHANGE               | FAIL        | LOCAL        | ROW        | USE       |
| CLOG                 | FILL        | LP           | ROWS       | USER      |
| CLUSTER              | FIRST       | LSHIFT       | RP         | USERS     |
| COLON                | FLOAT       | LT           | RSHIFT     | USING     |
| COLUMN               | FOR         | MATCH        | SCORES     | VALUES    |
| COMMA                | FROM        | MAX          | SELECT     | VARIABLE  |
| COMP                 | GE          | METRIC       | SEMI       | VGROUPS   |
| CONCAT               | GLOB        | METRICS      | SET        | VIEW      |
| CONFIGS              | GRANTS      | MIN          | SHOW       | WAVG      |
| CONFLICT             | GROUP       | MINUS        | SLASH      | WHERE     |
| CONNECTION           |             |              |            |           |
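
A reserved word is rejected wherever an identifier is expected. For example (a hypothetical statement shown only to illustrate the rule), the following CREATE TABLE fails because SELECT is in the keyword list above:

```mysql
CREATE TABLE select (ts TIMESTAMP, v INT);
```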

# FAQ

Tutorials & FAQ

## 0. How to report an issue?

If the contents of this FAQ cannot help you and you need technical support and assistance from the TDengine team, please package the contents of the following two directories:

1. /var/log/taos (if the default path has not been modified)
2. /etc/taos

Provide the necessary description of the problem, including the version of TDengine used, the platform environment, the operations that triggered the problem, the symptoms of the problem and the approximate time it occurred, and submit an issue on [GitHub](https://github.com/taosdata/TDengine).

To ensure there is enough debug information, if the problem can be reproduced, please modify the /etc/taos/taos.cfg file by adding the line "debugFlag 135" at the end (without the quotation marks), restart taosd, reproduce the problem, and then submit the issue. You can also temporarily set the log level of taosd with the following SQL statement:

```mysql
alter dnode <dnode_id> debugFlag 135;
```

However, when the system is running normally, please set debugFlag back to 131; otherwise a large amount of log information will be generated and system efficiency will be reduced.
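
For example, to restore the normal log level on dnode 1 after debugging (the dnode id here is hypothetical; use the id shown by SHOW DNODES):

```mysql
alter dnode 1 debugFlag 131;
```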

## 1. What should I pay attention to when upgrading TDengine from older versions to 2.0 and above? ☆☆☆

Version 2.0 is a complete refactoring of the previous version, and the configuration and data files are incompatible. Be sure to do the following before upgrading:

1. Delete the configuration file: execute `sudo rm -rf /etc/taos/taos.cfg`
2. Delete the log files: execute `sudo rm -rf /var/log/taos/`
3. After making sure the data is no longer needed, delete the data files: execute `sudo rm -rf /var/lib/taos/`
4. Install the latest stable version of TDengine
5. If you need to migrate data or the data files are corrupted, please contact the official technical support team of TAOS Data for assistance

## 2. What can I do when encountering the error "Unable to establish connection" on Windows?

See the [technical blog](https://www.taosdata.com/blog/2019/12/03/jdbcdriver%E6%89%BE%E4%B8%8D%E5%88%B0%E5%8A%A8%E6%80%81%E9%93%BE%E6%8E%A5%E5%BA%93/) for this issue.

## 3. Why do I get "more dnodes are needed" when creating a table?

See the [technical blog](https://www.taosdata.com/blog/2019/12/03/%E5%88%9B%E5%BB%BA%E6%95%B0%E6%8D%AE%E8%A1%A8%E6%97%B6%E6%8F%90%E7%A4%BAmore-dnodes-are-needed/) for this issue.

## 4. How do I generate a core file when TDengine crashes?

See the [technical blog](https://www.taosdata.com/blog/2019/12/06/tdengine-crash%E6%97%B6%E7%94%9F%E6%88%90core%E6%96%87%E4%BB%B6%E7%9A%84%E6%96%B9%E6%B3%95/) for this issue.

## 5. What should I do if I encounter the error "Unable to establish connection"?

When the client encounters a connection failure, please follow the steps below to check:

1. Check your network environment:

   - Cloud server: check whether the security group of the cloud server opens access to TCP/UDP ports 6030-6042
   - Local virtual machine: check whether the network can be pinged, and try to avoid using localhost as the hostname
   - Corporate server: if you are in a NAT network environment, be sure to check whether the server can return messages to the client

2. Make sure the client and server version numbers are exactly the same; the open-source Community Edition and the Enterprise Edition cannot be mixed.

3. On the server, execute `systemctl status taosd` to check the running status of *taosd*. If it is not running, start *taosd*.

4. Verify that the correct server FQDN (Fully Qualified Domain Name, obtainable by executing the Linux command `hostname -f` on the server) is specified when the client connects. FQDN configuration reference: "[All about FQDN of TDengine](https://www.taosdata.com/blog/2020/09/11/1824.html)".

5. Ping the server FQDN. If there is no response, please check your network, DNS settings, or the system hosts file of the computer where the client is located.

6. Check the firewall settings (Ubuntu: `ufw status`; CentOS: `firewall-cmd --list-ports`) to confirm that TCP/UDP ports 6030-6042 are open.

7. For JDBC (ODBC, Python, Go and other interfaces are similar) connections on Linux, make sure that libtaos.so is in the directory /usr/local/taos/driver, and that /usr/local/taos/driver is in the system library search path LD_LIBRARY_PATH.

8. For JDBC, ODBC, Python, Go, etc. connections on Windows, make sure that C:\TDengine\driver\taos.dll is in your system library search path (it is recommended to place taos.dll in the directory C:\Windows\System32).

9. If the connection issue still persists:

   - On Linux, use the command line tool nc to determine whether the TCP and UDP connections on the specified ports are unobstructed. Check whether a UDP port connection works: `nc -vuz {hostIP} {port}`. Check whether the server-side TCP port is listening: `nc -l {port}`. Check whether the client-side TCP port connection works: `nc {hostIP} {port}`.
   - On Windows, use the PowerShell command `Test-NetConnection -ComputerName {fqdn} -Port {port}` to check whether the server-side port is accessible.

10. You can also use the taos program's built-in network connectivity detection function to verify whether the specified port connections between the server and the client are unobstructed (including TCP and UDP): [TDengine's Built-in Network Detection Tool Use Guide](https://www.taosdata.com/blog/2020/09/08/1816.html).

## 6. What should I do if I encounter the error "Unexpected generic error in RPC" or "TDengine error: Unable to resolve FQDN"?

This error occurs because the client or data node cannot resolve the FQDN (Fully Qualified Domain Name). For the TAOS shell or client applications, check the following:

1. Verify that the FQDN of the server being connected to is correct. FQDN configuration reference: "[All about FQDN of TDengine](https://www.taosdata.com/blog/2020/09/11/1824.html)".
2. If the network is configured with a DNS server, check that it is working properly.
3. If the network does not have a DNS server configured, check the hosts file of the machine where the client is located to see whether the FQDN is configured and maps to the correct IP address (see the sketch after this list).
4. If the network configuration is OK, the machine where the client is located must be able to ping the connected FQDN; otherwise the client cannot connect to the server.
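
When no DNS server is available, a hosts entry such as the following sketch (hostname and address are hypothetical) lets the client resolve the server's FQDN:

```
# /etc/hosts on the client machine (C:\Windows\System32\drivers\etc\hosts on Windows)
192.168.1.100   h1.taosdata.com
```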

## 7. Although the syntax is correct, why do I still get the "Invalid SQL" error?

If you have confirmed that the syntax is correct, for versions older than 2.0, please check whether the SQL statement length exceeds 64KB. If it does, this error will also be returned.

## 8. Are "validation queries" supported?

TDengine does not yet have a dedicated set of validation queries. However, it is recommended to use the system-monitoring database "log" for this purpose.

## 9. Can I delete or update a record?

TDengine does not support deletion at present, and may support it in the future according to user requirements.

Starting from version 2.0.8.0, TDengine supports updating written data. Using the update function requires the UPDATE 1 parameter when creating the database; you can then use the INSERT INTO command to update data with a timestamp that has already been written. The UPDATE parameter cannot be modified with the ALTER DATABASE command. In a database created without the UPDATE 1 parameter, writing data with an existing timestamp will not modify the previous data, and no error will be reported.

It should also be noted that when UPDATE is set to 0, data sent later with the same timestamp is discarded directly, but no error is reported, and it is still counted in the affected rows (so the return information of the INSERT command cannot be used to check for duplicate timestamps). The main reason for this design is that TDengine treats written data as a stream: regardless of whether timestamps conflict, TDengine assumes that the original device generating the data really produced it. The UPDATE parameter only controls how such stream data is handled during persistence: when UPDATE is 0, the data written first takes effect and later data with the same timestamp is discarded; when UPDATE is 1, the data written later overwrites the data written first. Which relationship to choose depends on whether the data generated first or later is expected in subsequent use and statistics.
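
A minimal sketch of the UPDATE 1 behavior described above (database and table names are hypothetical):

```mysql
CREATE DATABASE demo UPDATE 1;
USE demo;
CREATE TABLE t1 (ts TIMESTAMP, v FLOAT);
INSERT INTO t1 VALUES (1538548685000, 10.3);
INSERT INTO t1 VALUES (1538548685000, 10.5);
SELECT * FROM t1;
```

Because the database was created with UPDATE 1, the second INSERT overwrites the first and the SELECT returns 10.5; without UPDATE 1, the second record would be silently discarded and the SELECT would return 10.3.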

## 10. How to create a table with more than 1024 columns?

Version 2.0 and above supports 1024 columns by default; older versions of TDengine allowed creating a table with at most 250 columns. If the limit is exceeded, it is recommended to logically split the wide table into several smaller ones according to the data characteristics.
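
A sketch of such a split (names and columns are hypothetical): the wide table is divided into two narrower tables that share the same timestamp, so rows can still be correlated by time.

```mysql
CREATE TABLE sensor_a (ts TIMESTAMP, c001 FLOAT, c002 FLOAT);
CREATE TABLE sensor_b (ts TIMESTAMP, c513 FLOAT, c514 FLOAT);
```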

## 11. What is the most effective way to write data?

Insert in batches. Each write statement can insert multiple records into one table, or into multiple tables, at the same time.

## 12. How to solve the problem that Chinese characters in nchar data inserted on Windows systems are parsed into garbled code?

If there are Chinese characters in nchar data on Windows, first confirm that the region of the system is set to China (this can be set in the Control Panel); the taos client in cmd should then work normally. If you are developing a Java application in an IDE such as Eclipse or IntelliJ, confirm that the file encoding in the IDE is GBK (the default encoding type of Java), and then initialize the client configuration when creating the Connection. The specific statements are as follows:

```JAVA
import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;
import com.taosdata.jdbc.TSDBDriver;

// load the TDengine JDBC driver and pass the locale to the native client
Class.forName("com.taosdata.jdbc.TSDBDriver");
Properties properties = new Properties();
properties.setProperty(TSDBDriver.LOCALE_KEY, "UTF-8");
Connection connection = DriverManager.getConnection(url, properties);
```

## 13. JDBC error: the executed SQL is not a DML or a DDL?

Please update to the latest JDBC driver:

```xml
<dependency>
  <groupId>com.taosdata.jdbc</groupId>
  <artifactId>taos-jdbcdriver</artifactId>
  <version>2.0.27</version>
</dependency>
```

## 14. taos connect failed, reason: invalid timestamp

The common reason is that the server time and client time are not synchronized. This can be corrected by synchronizing with a time server (use the ntpdate command on Linux; on Windows, select automatic synchronization in the time settings).

## 15. Incomplete display of table names

Due to the limited display width of the taos shell in the terminal, a relatively long table name may not be displayed completely. If an operation is carried out using the displayed, incomplete table name, a "Table does not exist" error will occur. This can be worked around by modifying the option maxBinaryDisplayWidth in the taos.cfg file, or by directly entering the command `set max_binary_display_width 100`. Alternatively, use the `\G` parameter at the end of the command to adjust how the results are displayed.

## 16. How to migrate data?

TDengine uniquely identifies a machine by its hostname. When moving data files from machine A to machine B, pay attention to the following three points:

- For versions 2.0.0.0 to 2.0.6.x, reconfigure machine B's hostname to machine A's.
- For version 2.0.7.0 and later, go to /var/lib/taos/dnode, fix the FQDN corresponding to the dnodeId in dnodeEps.json, and restart. Make sure this file is identical on all machines.
- The storage structures of versions 1.x and 2.x are incompatible; you need to use a migration tool or your own application to export and import the data.

## 17. How to temporarily adjust the log level in the command line program taos?

For the convenience of debugging, since version 2.0.16 the command line program taos has two new log-related commands:

```mysql
ALTER LOCAL flag_name flag_value;
```

This modifies the log level of a specific module under the current command line program (it is only valid for the current command line program; if taos is restarted, it needs to be set again), as shown in the example after this list:

- The value of flag_name can be: debugFlag, cDebugFlag, tmrDebugFlag, uDebugFlag, rpcDebugFlag
- The value of flag_value can be: 131 (output error and warning logs), 135 (output error, warning, and debug logs), 143 (output error, warning, debug, and trace logs)
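
For example, a direct instance of the syntax above that turns on debug-level logging for all modules in the current taos session:

```mysql
ALTER LOCAL debugFlag 135;
```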

```mysql
ALTER LOCAL RESETLOG;
```

This clears all log files generated by the client on this machine.