Compacting a table

Over time, RT tables can become fragmented into many disk chunks and/or polluted with deleted but unpurged data, which hurts search performance. When that happens, the table can be optimized. The optimization pass merges pairs of disk chunks and purges documents previously suppressed by DELETEs.

Starting with Manticore 4, this happens automatically by default, but you can also use the commands below to force table compaction.

OPTIMIZE TABLE

OPTIMIZE TABLE index_name [OPTION opt_name = opt_value [,...]]

The OPTIMIZE statement enqueues an RT table for optimization in a background thread.

OPTIMIZE TABLE rt;

Number of optimized disk chunks

By default, OPTIMIZE merges the RT table's disk chunks down to a number equal to the number of CPU cores multiplied by 2. The target number of disk chunks can be controlled with the cutoff option.
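As a rough illustration of the default described above, the sketch below computes cores * 2 and builds an OPTIMIZE statement with an explicit cutoff. The helper names `default_cutoff` and `optimize_statement` are hypothetical, and `os.cpu_count()` reports the cores of the machine running the script, which may differ from the server's.

```python
import os


def default_cutoff(cpu_cores):
    """Default target number of disk chunks: CPU cores * 2 (per the text above)."""
    return cpu_cores * 2


def optimize_statement(table, cutoff=None):
    """Build an OPTIMIZE TABLE statement; omit OPTION when no cutoff is given."""
    if cutoff is None:
        return f"OPTIMIZE TABLE {table};"
    return f"OPTIMIZE TABLE {table} OPTION cutoff={int(cutoff)};"


if __name__ == "__main__":
    # Assumption: the script runs on the same host as the server.
    cores = os.cpu_count() or 1
    print(optimize_statement("rt", default_cutoff(cores)))
```

Passing an explicit cutoff lower than the default asks for a more aggressive merge, at the cost of more IO during optimization.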

For example, to merge the table down to 4 disk chunks:

OPTIMIZE TABLE rt OPTION cutoff=4;

Running in foreground

If OPTION sync=1 is used (the default is 0), the command waits until the optimization process is done. If the connection is interrupted, the optimization continues to run on the server.

OPTIMIZE TABLE rt OPTION sync=1;

Throttling the IO impact

Optimization can be a lengthy and IO-intensive process. To limit the impact, all the actual merge work is executed serially in a special background thread, and the OPTIMIZE statement simply adds a job to its queue. Currently, there is no way to check the table or queue status (that might be added in the future to the SHOW TABLE STATUS and SHOW STATUS statements respectively). The optimization thread can be IO-throttled: you can control the maximum number of IOs per second and the maximum IO size with the rt_merge_iops and rt_merge_maxiosize directives respectively.

The RT table being optimized stays online and available for both searching and updates at (almost) all times during the optimization. It gets locked for a very short time when a pair of disk chunks is merged successfully, to rename the old and the new files, and update the table header.

Optimizing clustered tables

As long as auto_optimize is not disabled, tables are optimized automatically.

If you are experiencing unexpected SSTs, or want tables across all nodes of the cluster to be binary identical, you need to:

  1. Disable auto_optimize.
  2. Optimize tables manually:

    On one of the nodes, drop the table from the cluster:

    ALTER CLUSTER mycluster DROP myindex;

    Optimize the table:

    OPTIMIZE TABLE myindex;

    Add the table back to the cluster:

    ALTER CLUSTER mycluster ADD myindex;

    When the table is added back, the new files created by the optimize process will be replicated to the other nodes in the cluster. Any changes made locally to the table on other nodes will be lost.

While the table is being optimized, table data modifications (inserts, replaces, deletes, updates) should either:

  1. be postponed, or
  2. be directed to the node where the optimize process is running.

Note that while the table is out of the cluster, insert/replace/delete/update commands must refer to it without the cluster name prefix (for SQL statements) or the cluster property (for HTTP JSON requests); otherwise they will fail. As soon as the table is added back to the cluster, writes can be resumed. From that point on, write operations on the table must include the cluster name prefix again, or they will fail. Search operations remain available as usual on any of the nodes throughout the process.
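The manual procedure above can be sketched as a small helper that returns the statements in order. The function name `manual_optimize_statements` is hypothetical. Adding `OPTION sync=1` is an assumption made here so that the table is only re-added once the merge has finished; it is not part of the steps as written.

```python
def manual_optimize_statements(cluster, table, wait=True):
    """Return the SQL statements for optimizing a clustered table by hand.

    Run them on a single node, with auto_optimize already disabled.
    """
    optimize = f"OPTIMIZE TABLE {table}"
    if wait:
        # Assumption: block until the merge is done before re-adding the table.
        optimize += " OPTION sync=1"
    return [
        f"ALTER CLUSTER {cluster} DROP {table};",  # take the table out of the cluster
        optimize + ";",                            # compact it locally
        f"ALTER CLUSTER {cluster} ADD {table};",   # re-add; merged files get replicated
    ]


for statement in manual_optimize_statements("mycluster", "myindex"):
    print(statement)
```

Between the DROP and ADD statements, any writes sent to this node must address the table without the cluster name prefix, as noted above.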

Isolation during flushing and merging

When flushing and compacting a real-time table, Manticore provides isolation, so that a changed state doesn't affect the queries that were already running when either operation started.

For instance, while compacting a table we have a pair of disk chunks that are being merged and also a new chunk produced by merging those two. Then, at some point, we create a new version of the table in which the new chunk replaces the original pair. That is done seamlessly, so that if there's a long-running query using the original chunks, it will continue seeing the old version of the table, while a new query will see the new version with the resulting merged chunk.

The same is true for flushing a RAM chunk: we merge all suitable RAM segments into a new disk chunk, then add the new disk chunk to the set of disk chunks and discard the RAM chunk segments that took part in the flush. During this operation, Manticore also provides isolation for those queries that started before the operation began.

Moreover, these operations are also transparent for replaces and updates. If you update an attribute in a document that belongs to a disk chunk being merged with another one, the update will be applied both to that chunk and to the resulting chunk after the merge. If you delete a document during a merge, it will be deleted in the original chunk, and the resulting merged chunk will either have the document marked as deleted or not contain the document at all (if the deletion happened at an early stage of the merge).
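The snapshot behaviour described above can be illustrated with a toy model. This is not Manticore's implementation; the `Table` class and its methods are hypothetical, and disk chunks are represented by plain strings. A query captures the chunk list that was current when it started, while a merge publishes a new list.

```python
class Table:
    """Toy model: the table version is an immutable tuple of chunk names."""

    def __init__(self, chunks):
        self._chunks = tuple(chunks)

    def snapshot(self):
        # A query keeps a reference to the version current at its start.
        return self._chunks

    def merge(self, a, b, merged):
        # Publish a new version where the pair (a, b) is replaced by `merged`.
        # The old tuple is never mutated, so running queries are unaffected.
        kept = tuple(c for c in self._chunks if c not in (a, b))
        self._chunks = kept + (merged,)


table = Table(["chunk0", "chunk1", "chunk2"])
long_query_view = table.snapshot()      # long-running query starts here
table.merge("chunk0", "chunk1", "chunk3")
new_query_view = table.snapshot()       # query started after the merge

print(long_query_view)  # ('chunk0', 'chunk1', 'chunk2')
print(new_query_view)   # ('chunk2', 'chunk3')
```

The model leaves out updates and deletes that arrive during a merge, which, as described above, have to be applied to both the original and the resulting chunk.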

Freezing a table

FREEZE prepares a real-time/plain table for a safe backup. In particular, it:

  1. Disables table compaction. If the table is being compacted right now, FREEZE will wait for it to finish.
  2. Flushes the current RAM chunk into a disk chunk.
  3. Flushes attributes.
  4. Disables implicit operations that may change the files on disk.
  5. Displays the actual list of the files belonging to the table.

The built-in tool manticore-backup uses FREEZE to guarantee data consistency. You can do the same if you want to build your own backup solution or need to freeze tables for any other reason. All you need to do is:

  1. FREEZE a table.
  2. Grab the output of the FREEZE command and back up the listed files.
  3. UNFREEZE the table once you are done.

Example:

FREEZE t;

Response:

+-------------------+---------------------------------+
| file              | normalized                      |
+-------------------+---------------------------------+
| data/t/t.0.spa    | /work/anytest/data/t/t.0.spa    |
| data/t/t.0.spd    | /work/anytest/data/t/t.0.spd    |
| data/t/t.0.spds   | /work/anytest/data/t/t.0.spds   |
| data/t/t.0.spe    | /work/anytest/data/t/t.0.spe    |
| data/t/t.0.sph    | /work/anytest/data/t/t.0.sph    |
| data/t/t.0.sphi   | /work/anytest/data/t/t.0.sphi   |
| data/t/t.0.spi    | /work/anytest/data/t/t.0.spi    |
| data/t/t.0.spm    | /work/anytest/data/t/t.0.spm    |
| data/t/t.0.spp    | /work/anytest/data/t/t.0.spp    |
| data/t/t.0.spt    | /work/anytest/data/t/t.0.spt    |
| data/t/t.meta     | /work/anytest/data/t/t.meta     |
| data/t/t.ram      | /work/anytest/data/t/t.ram      |
| data/t/t.settings | /work/anytest/data/t/t.settings |
+-------------------+---------------------------------+
13 rows in set (0.01 sec)

The file column provides paths to the table's files inside the data_dir of the running instance. The normalized column shows the absolute paths of the same files. If you want to back up a table, it's safe to just copy the listed files with no other preparations.
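The three backup steps can be sketched as follows. This assumes a Python DB-API style cursor connected to Manticore over the MySQL protocol (which client library you use is up to you), rows returned as plain tuples, and a script running on the server host so that the `normalized` paths are readable. `backup_table` and `dest_dir` are hypothetical names.

```python
import os
import shutil


def backup_table(cursor, table, dest_dir):
    """FREEZE a table, copy the files it lists, then UNFREEZE it.

    `cursor` is assumed to be a DB-API cursor; rows come back as
    (file, normalized) tuples, matching the FREEZE output columns.
    """
    os.makedirs(dest_dir, exist_ok=True)
    cursor.execute(f"FREEZE {table}")
    try:
        copied = []
        for _file, normalized in cursor.fetchall():
            # Copy by absolute path; keep only the file name at the destination.
            target = os.path.join(dest_dir, os.path.basename(normalized))
            shutil.copy2(normalized, target)
            copied.append(target)
        return copied
    finally:
        # Always unfreeze, even if copying failed part-way.
        cursor.execute(f"UNFREEZE {table}")
```

Putting UNFREEZE in a `finally` block is a deliberate choice in this sketch: a table left frozen after a failed copy would keep blocking updates and compaction.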

When a table is frozen, you can't perform UPDATE queries on it; they will fail with the error message "index is locked now, try again later".

Also, DELETE and REPLACE queries have some limitations while the table is frozen:

  • If DELETE affects a document stored in the current RAM chunk, it is allowed.
  • If DELETE affects a document in a disk chunk that was already deleted before, it is allowed.
  • If DELETE is going to change an actual disk chunk, it will wait until the table is unfrozen.
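The three DELETE rules can be restated as a small decision function. This is only a restatement of the list above for illustration; `delete_outcome_when_frozen` and its parameters are hypothetical and do not correspond to any Manticore API.

```python
def delete_outcome_when_frozen(in_ram_chunk, already_deleted):
    """Return what happens to a DELETE of one document while the table is frozen."""
    if in_ram_chunk:
        return "allowed"         # document lives in the current RAM chunk
    if already_deleted:
        return "allowed"         # disk chunk document, but nothing changes on disk
    return "waits for UNFREEZE"  # would modify an actual disk chunk


print(delete_outcome_when_frozen(in_ram_chunk=False, already_deleted=False))
```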

A manual FLUSH of a RAM chunk of a frozen table will report success, but no actual save will happen.

DROP/TRUNCATE of a frozen table is allowed, since such an operation is not implicit. We assume that if you truncate or drop a table, you don't need it backed up anyway, so it should not have been frozen in the first place.

INSERT into a frozen table is supported, but also limited: new data will be stored in RAM (as usual), until rt_mem_limit is reached; then new insertions will wait until the table is unfrozen.

If you shut down the daemon with a frozen table, it will behave as in the case of a dirty shutdown (e.g. kill -9): newly inserted data will not be saved in the RAM chunk on disk, and on restart it will be restored from a binary log (if any), or lost (if binary logging is disabled).

Unfreezing a table

UNFREEZE re-enables previously blocked operations and restarts the internal compaction service. All operations that were waiting for the table to be unfrozen are resumed and complete normally.

Example:

UNFREEZE tbl;