Securing and compacting a table > Isolation during flushing and merging

FLUSH TABLE rt_table

FLUSH TABLE forcefully flushes RT table RAM chunk contents to disk.

The real-time table RAM chunk is automatically flushed to disk during a clean shutdown, or periodically every rt_flush_period seconds.

Issuing a FLUSH TABLE command not only forces the RAM chunk contents to be written to disk but also triggers the cleanup of binary log files.

‹›

SQL

SQL

📋

FLUSH TABLE rt;

‹›

Response

Query OK, 0 rows affected (0.05 sec)

Compacting a table

Over time, RT tables may become fragmented into numerous disk chunks and/or contaminated with deleted, yet unpurged data, affecting search performance. In these cases, optimization is necessary. Essentially, the optimization process combines pairs of disk chunks, removing documents that were previously deleted using DELETE statements.

Beginning with Manticore 4, this process occurs automatically by default. However, you can also use the following commands to manually initiate table compaction.

OPTIMIZE TABLE table_name [OPTION opt_name = opt_value [,...]]

OPTIMIZE statement adds an RT table to the optimization queue, which will be processed in a background thread.

‹›

SQL

SQL

📋

OPTIMIZE TABLE rt;

By default, OPTIMIZE merges the RT table's disk chunks down to a number less than or equal to the number of logical CPU cores multiplied by 2.

However, if the table has attributes with KNN indexes, this threshold is different. In this case, it is set to the number of physical CPU cores divided by 2 to improve KNN search performance.

You can also control the number of optimized disk chunks manually using the cutoff option.

Additional options include:

Server setting optimize_cutoff for overriding the default threshold
Per-table setting optimize_cutoff

‹›

SQL

SQL

📋

OPTIMIZE TABLE rt OPTION cutoff=4;

When using OPTION sync=1 (0 by default), the command will wait for the optimization process to complete before returning. If the connection is interrupted, the optimization will continue running on the server.

‹›

SQL

SQL

📋

OPTIMIZE TABLE rt OPTION sync=1;

Optimization can be a lengthy and I/O-intensive process. To minimize the impact, all actual merge work is executed serially in a special background thread, and the OPTIMIZE statement simply adds a job to its queue. The optimization thread can be I/O-throttled, and you can control the maximum number of I/Os per second and the maximum I/O size with the rt_merge_iops and rt_merge_maxiosize directives, respectively.

During optimization, the RT table being optimized remains online and available for both searching and updates nearly all the time. It is locked for a very brief period when a pair of disk chunks is successfully merged, allowing for the renaming of old and new files and updating the table header.

As long as auto_optimize is not disabled, tables are optimized automatically.

If you are experiencing unexpected SSTs or want tables across all nodes of the cluster to be binary identical, you need to:

Disable auto_optimize.
Manually optimize tables:
On one of the nodes, drop the table from the cluster:
‹›

SQL

SQL

📋
⚙

ALTER CLUSTER mycluster DROP myindex;
Optimize the table:
‹›

SQL

SQL

📋
⚙

OPTIMIZE TABLE myindex;
Add back the table to the cluster:
‹›

SQL

SQL

📋
⚙

ALTER CLUSTER mycluster ADD myindex;
When the table is added back, the new files created by the optimization process will be replicated to the other nodes in the cluster. Any local changes made to the table on other nodes will be lost.

Table data modifications (inserts, replaces, deletes, updates) should either:

Be postponed, or
Be directed to the node where the optimization process is running.

Note that while the table is out of the cluster, insert/replace/delete/update commands should refer to it without the cluster name prefix (for SQL statements or the cluster property in case of an HTTP JSON request), otherwise they will fail. Once the table is added back to the cluster, you must resume write operations on the table and include the cluster name prefix again, or they will fail.

Search operations are available as usual during the process on any of the nodes.

Flushing RT table to disk Isolation during flushing and merging

Manticore provides isolation during the flushing and merging process of a real-time table to prevent any changes from affecting running queries.

For example, during table compaction, a pair of disk chunks are merged and a new chunk is produced. At one point, a new version of the table is created with the new chunk replacing the original pair. This is done seamlessly so that a long-running query using the original chunks will continue to see the old version of the table, while a new query will see the new version with the resulting merged chunk.

The same applies to flushing a RAM chunk, where suitable RAM segments are merged into a new disk chunk and the participated RAM chunk segments are abandoned. During this operation, Manticore provides isolation for queries that started before the operation began.

Furthermore, these operations are transparent for replaces and updates. If you update an attribute in a document that belongs to a disk chunk being merged with another one, the update will be applied to both that chunk and the resulting merged chunk. If you delete a document during a merge, it will be deleted in the original chunk and also in the resulting merged chunk, which will either have the document marked as deleted or have no such document at all if the deletion happened early in the merging process.

Compacting a table Freezing a table

FREEZE tbl1[, tbl2, ...]

FREEZE readies a real-time/plain table for a secure backup. Specifically, it:

Deactivates table compaction. If the table is currently being compacted, FREEZE will gracefully interrupt it.
Transfers the current RAM chunk to a disk chunk.
Flushes attributes.
Disables implicit operations that could modify the disk files.
Increments the table's locked counter.
Shows the actual file list associated with the table.

If the table is already frozen (locked), FREEZE will:

Increment the table's locked counter.
Shows the actual file list associated with the table.

The built-in tool manticore-backup uses FREEZE to ensure data consistency. You can do the same if you want to create your own backup solution or need to freeze tables for other reasons. Just follow these steps:

FREEZE a table (or a few).
Capture the output of the FREEZE command and back up the specified files.
UNFREEZE the table(s) once finished.

‹›

Example

Example

📋

FREEZE t;

‹›

Response

+-------------------+---------------------------------+
| file              | normalized                      |
+-------------------+---------------------------------+
| data/t/t.0.spa    | /work/anytest/data/t/t.0.spa    |
| data/t/t.0.spd    | /work/anytest/data/t/t.0.spd    |
| data/t/t.0.spds   | /work/anytest/data/t/t.0.spds   |
| data/t/t.0.spe    | /work/anytest/data/t/t.0.spe    |
| data/t/t.0.sph    | /work/anytest/data/t/t.0.sph    |
| data/t/t.0.sphi   | /work/anytest/data/t/t.0.sphi   |
| data/t/t.0.spi    | /work/anytest/data/t/t.0.spi    |
| data/t/t.0.spm    | /work/anytest/data/t/t.0.spm    |
| data/t/t.0.spp    | /work/anytest/data/t/t.0.spp    |
| data/t/t.0.spt    | /work/anytest/data/t/t.0.spt    |
| data/t/t.meta     | /work/anytest/data/t/t.meta     |
| data/t/t.ram      | /work/anytest/data/t/t.ram      |
| data/t/t.settings | /work/anytest/data/t/t.settings |
+-------------------+---------------------------------+
13 rows in set (0.01 sec)

The file column indicates the table's file paths within the data_dir of the running instance. The normalized column displays the absolute paths for the same files. To back up a table, simply copy the provided files without additional preparation.

When a table is frozen, you cannot execute UPDATE queries; they will wait until the table is unfrozen.

Also, DELETE and REPLACE queries have some restrictions while the table is frozen:

If DELETE affects a document in the current RAM chunk - it is permitted.
If DELETE impacts a document in a disk chunk but was previously deleted - it is allowed.
If DELETE would alter an actual disk chunk - it will wait until the table is unfrozen.

Manually FLUSHing a RAM chunk of a frozen table will report 'success', but no real saving will occur.

DROP/TRUNCATE of a frozen table is allowed since these operations are not implicit. We assume that if you truncate or drop a table, you don't need it backed up; therefore, it should not have been frozen initially.

INSERTing into a frozen table is supported but limited: new data will be stored in RAM (as usual) until rt_mem_limit is reached; then, new insertions will wait until the table is unfrozen.

If you shut down the daemon with a frozen table, it will act as if it experienced a dirty shutdown (e.g., kill -9): newly inserted data will not be saved in the RAM-chunk on disk, and upon restart, it will be restored from a binary log (if any) or lost (if binary logging is disabled).

UNFREEZE tbl1[, tbl2, ...]

UNFREEZE command decreases the table's locked counter, and if it reaches zero, reactivates previously blocked operations and resumes the internal compaction service. Any operations that were waiting for the table to unfreeze will also resume and complete normally.

‹›

Example

Example

📋

UNFREEZE tbl;

You can use SHOW table_name STATUS to check if a table is frozen or not.

The locked counter is displayed in the table's status under the locked column. A value of zero indicates that the table is not frozen, while a non-zero value reflects the number of active locks. Each explicit FREEZE command and implicit locking (such as when the table is part of a cluster and a replication routine selects it as a donor for a replica) increases the counter. Each UNFREEZE command decreases the counter, eventually down to zero.

‹›

Example

Example

📋

SHOW TABLE `foo` STATUS LIKE 'locked';

‹›

Response

+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| locked        | 2     |
+---------------+-------+
1 row in set (0,00 sec)

Locked tables are also displayed using the SHOW LOCKS command. The lock counters are shown in the Additional Info column.

‹›

Example

Example

📋

SHOW LOCKS;

‹›

Response

+-----------+------+-----------+-----------------+
| Type      | Name | Lock Type | Additional Info |
+-----------+------+-----------+-----------------+
| rt        | a    | freeze    | Count: 1        |
| percolate | bar  | freeze    | Count: 3        |
| rt        | foo  | freeze    | Count: 2        |
+-----------+------+-----------+-----------------+
3 rows in set (0,01 sec)

Isolation during flushing and merging Flushing attributes

Flushing RAM chunk to disk

FLUSH TABLE

Compacting a Table

OPTIMIZE TABLE

Number of optimized disk chunks

Running in foreground

Throttling the IO impact

Optimizing clustered tables

Isolation during flushing and merging

Freezing a table

Unfreezing a table

Inspecting the lock state of a table

SHOW LOCKS