124 lines
6.3 KiB
Markdown
124 lines
6.3 KiB
Markdown
---
|
|
title: Caching
|
|
slug: /advanced-features/caching
|
|
---
|
|
|
|
TDengine includes caching as a built-in component. This includes write caching, read caching, metadata caching, and file system caching.
|
|
|
|
## Write Cache
|
|
|
|
TDengine uses a time-driven cache management strategy that prioritizes caching the most recently ingested data. When the size of the data stored in cache reaches a preset threshold, the earliest data in cache is written to disk.
|
|
|
|
You can optimize database performance for your use case by specifying the number of vgroups in the database and the size of the write cache allocated to each vnode.
|
|
|
|
For example, the following SQL statement creates a database with 10 vgroups, with each vnode having a 256 MB write cache.
|
|
|
|
```sql
|
|
CREATE DATABASE power VGROUPS 10 BUFFER 256 CACHEMODEL 'none' PAGES 128 PAGESIZE 16;
|
|
```
|
|
|
|
Generally, a larger cache size results in improved performance. However, there exists a certain point at which further increasing the cache size has no significant effect on performance.
|
|
|
|
## Read Cache
|
|
|
|
You can configure TDengine databases to cache the most recent data of each subtable, allowing for faster queries. To do so, you specify a cache model for your database by setting the the `CACHEMODEL` parameter to one of the following values:
|
|
|
|
- `none`: The read cache is disabled.
|
|
- `last_row`: The most recent row of data from each subtable is cached. The `LAST_ROW()` function will then retrieve this data from cache.
|
|
- `last_value`: The most recent non-null value for each column of each subtable is cached. The `LAST()` function will then retrieve this data from cache.
|
|
- `both`: The most recent row of each subtable and the most recent non-null value of each column of each subtable are cached. This simultaneously activates the behavior of both the `last_row` and `last_value` cache models.
|
|
|
|
You can also configure the memory size for each vnode by specifying a value for the `CACHESIZE` parameter. This parameter can be set from 1 MB to 65536 MB. The default value is 1 MB.
|
|
|
|
## Metadata Cache
|
|
|
|
Each vnode caches metadata that it has previously accessed. The size of this metadata cache is determined by the `PAGES` and `PAGESIZE` parameters of the database. For example, the following SQL statement creates a database whose vnodes have a metadata cache of 128 pages with each page being 16 KB:
|
|
|
|
```sql
|
|
CREATE DATABASE power PAGES 128 PAGESIZE 16;
|
|
```
|
|
|
|
## File System Cache
|
|
|
|
For reliability purposes, TDengine records changes in a write-ahead log (WAL) file before any data is written to the data storage layer. The `fsync` function is then called to write the data from the WAL to disk. You can control when the `fsync` function is called for a database by specifying the `WAL_LEVEL` and `WAL_FSYNC_PERIOD` parameters.
|
|
|
|
- `WAL_LEVEL`:
|
|
- Specify `1` to wait for the operating system to call `fsync`. In this configuration, TDengine does not call `fsync` itself.
|
|
- Specify `2` for TDengine to call `fsync` at a certain interval, specified by the WAL_FSYNC_PERIOD parameter.
|
|
|
|
The default value is `1`.
|
|
|
|
- `WAL_FSYNC_PERIOD`:
|
|
- Specify `0` to call `fsync` every time data is written to the WAL.
|
|
- Specify a value between `1` and `180000` milliseconds to call `fsync` each time this interval has elapsed.
|
|
|
|
Note that this parameter takes effect only when `WAL_LEVEL` is set to `2`.
|
|
|
|
The following SQL statement creates a database in which data in the WAL is written to disk every 3000 milliseconds:
|
|
|
|
|
|
```sql
|
|
CREATE DATABASE power WAL_LEVEL 2 WAL_FSYNC_PERIOD 3000;
|
|
```
|
|
|
|
The default configuration of `WAL_VALUE 1` delivers the highest performance. In use cases where data reliability is a higher priority than performance, you can set `WAL_LEVEL` to `2`.
|
|
|
|
## Example: Enhancing Query Performance with Read Caching
|
|
|
|
This example demonstrates the performance improvements delivered by read caching. The sample data from [Data Querying](../../basic-features/data-querying/) is used in this section. This data is generated by the following command:
|
|
|
|
```shell
|
|
taosBenchmark -d power -Q --start-timestamp=1600000000000 --tables=10000 --records=10000 --time-step=10000 -y
|
|
```
|
|
|
|
Note that read caching is disabled by default on the sample database generated by taosBenchmark.
|
|
|
|
1. To establish a performance baseline, run the following SQL statements:
|
|
|
|
```text
|
|
taos> SELECT LAST(ts, current) FROM meters;
|
|
last(ts) | last(current) |
|
|
=================================================
|
|
2020-09-15 00:13:10.000 | 1.1294620 |
|
|
Query OK, 1 row(s) in set (0.353815s)
|
|
|
|
taos> SELECT LAST_ROW(ts, current) FROM meters;
|
|
last_row(ts) | last_row(current) |
|
|
=================================================
|
|
2020-09-15 00:13:10.000 | 1.1294620 |
|
|
Query OK, 1 row(s) in set (0.344070s)
|
|
```
|
|
|
|
These return the most recent non-null value and the most recent row from any subtable in the `meters` supertable. It can be seen that these queries return in 353 and 344 milliseconds, respectively.
|
|
|
|
2. Enable read caching on the database:
|
|
|
|
```text
|
|
taos> ALTER DATABASE power CACHEMODEL 'both';
|
|
Query OK, 0 row(s) affected (0.046092s)
|
|
|
|
taos> SHOW CREATE DATABASE power\G;
|
|
*************************** 1.row ***************************
|
|
Database: power
|
|
Create Database: CREATE DATABASE `power` BUFFER 256 CACHESIZE 1 CACHEMODEL 'both' COMP 2 DURATION 14400m WAL_FSYNC_PERIOD 3000 MAXROWS 4096 MINROWS 100 STT_TRIGGER 2 KEEP 5256000m,5256000m,5256000m PAGES 256 PAGESIZE 4 PRECISION 'ms' REPLICA 1 WAL_LEVEL 1 VGROUPS 10 SINGLE_STABLE 0 TABLE_PREFIX 0 TABLE_SUFFIX 0 TSDB_PAGESIZE 4 WAL_RETENTION_PERIOD 3600 WAL_RETENTION_SIZE 0 KEEP_TIME_OFFSET 0
|
|
Query OK, 1 row(s) in set (0.000282s)
|
|
```
|
|
|
|
3. Run the two queries from Step 1 again:
|
|
|
|
```text
|
|
taos> SELECT LAST(ts, current) FROM meters;
|
|
last(ts) | last(current) |
|
|
=================================================
|
|
2020-09-15 00:13:10.000 | 1.1294620 |
|
|
Query OK, 1 row(s) in set (0.044021s)
|
|
|
|
taos> SELECT LAST_ROW(ts, current) FROM meters;
|
|
last_row(ts) | last_row(current) |
|
|
=================================================
|
|
2020-09-15 00:13:10.000 | 1.1294620 |
|
|
Query OK, 1 row(s) in set (0.046682s)
|
|
```
|
|
|
|
It can be seen that these queries now return in 44 and 47 milliseconds, respectively. This indicates that read caching on this system produces an approximately 8-fold improvement in query performance.
|