Logging > Docker logging | Manticore Search Manual

Binary logging serves as a recovery mechanism for real-time table data. When binary logs are enabled, searchd records each transaction to the binlog file and utilizes it for recovery following an unclean shutdown. During a clean shutdown, RAM chunks are saved to disk, and all binlog files are subsequently deleted.

By default, binary logging is enabled to safeguard data integrity. On Linux systems, the default location for binlog.* files in Plain mode is /var/lib/manticore/data/. In RT mode, binary logs are stored in the <data_dir>/binlog/ folder, unless specified otherwise.

To disable binary logging globally, set binlog_path to an empty value in the searchd configuration. Disabling binary logging requires a restart of the daemon and puts data at risk if the system shuts down unexpectedly.

searchd {
...
    binlog_path = # disable logging
...
}

You can use the following directive to set a custom path:

searchd {
...
    binlog_path = /var/data
...
}

For more granular control, binary logging can be disabled at the table level for real-time tables by setting the binlog table parameter to 0. This option is not available for percolate tables.

create table a (id bigint, s string attribute) binlog='0';

For existing RT tables, binary logging can also be disabled by modifying the binlog parameter.

alter table FOO binlog='0';

If binary logging was previously disabled, it can be re-enabled by setting the binlog parameter back to 1:

alter table FOO binlog='1';

Dependency on global settings: per-table binary logging settings only take effect if binary logging is globally enabled in the searchd configuration (binlog_path must not be empty).
Binary logging status and transaction ID: Changing a table's binary logging status forces an immediate flush of the table. When binary logging is turned off for a table, its transaction ID (TID) becomes -1, which means no changes are being tracked. When binary logging is turned on, the TID becomes a non-negative number (zero or higher), which means the table's changes are being recorded. You can check the TID with the command: SHOW TABLE <name> STATUS.
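
The TID rule above can be checked programmatically. The following is a minimal Python sketch, not part of Manticore itself. It assumes you have already fetched the rows of SHOW TABLE <name> STATUS as (name, value) pairs through any MySQL-protocol client, and that the transaction ID is reported in a row named tid (an assumption to verify against your server's output). The helper name binlog_state and the sample rows are hypothetical.

```python
def binlog_state(status_rows):
    """Classify a table's binary logging state from its status rows.

    status_rows: iterable of (variable_name, value) pairs, as returned
    by SHOW TABLE <name> STATUS.
    Assumption: the transaction ID is reported in a row named 'tid'.
    """
    for name, value in status_rows:
        if name == "tid":
            # -1 means binary logging is off; zero or higher means it is on.
            return "disabled" if int(value) == -1 else "enabled"
    return "unknown"  # no TID row found in the status output


# Hypothetical sample rows, for illustration only.
rows = [("index_type", "rt"), ("tid", "-1")]
print(binlog_state(rows))
```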

When binary logging is turned on, every change made to an RT table is saved to a log file. If the system shuts down unexpectedly, these logs are used automatically when the system starts again to bring back all the changes that were logged.

During normal operations, when the amount of data logged reaches a certain limit (set by binlog_max_log_size), a new log file starts. Old log files are kept until all changes in them are completely processed and saved to disk as a disk chunk. If this limit is set to 0, the log files are kept until the system is properly shut down. By default, there's no limit to how large these files can grow.

searchd {
...
    binlog_max_log_size = 16M
...
}
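
To make the rotation and retention behavior more concrete, here is a toy Python model. It is only a sketch under stated assumptions and does not reproduce Manticore's internals: it assumes a new file is opened once the current one has reached the size limit, that a limit of 0 disables rotation, and that the file currently being written is never removed. The function names assign_to_files and removable_files are hypothetical.

```python
def assign_to_files(txn_sizes, max_log_size):
    """Map each transaction to a log file index under size-based rotation.

    Assumed toy rule: once the current file has reached max_log_size
    bytes, the next transaction opens a new file. A limit of 0 means
    the log is never rotated.
    """
    files = []
    current, used = 0, 0
    for size in txn_sizes:
        if max_log_size and used >= max_log_size:
            current += 1  # limit reached: start a new log file
            used = 0
        files.append(current)
        used += size
    return files


def removable_files(files, saved_count):
    """Return file indexes whose transactions are all saved to disk.

    saved_count: number of leading transactions already saved to disk.
    The file currently being written (the last one) is always kept.
    """
    if not files:
        return []
    still_needed = set(files[saved_count:]) | {files[-1]}
    return sorted(set(files[:saved_count]) - still_needed)


files = assign_to_files([6, 6, 6, 6, 6], max_log_size=10)
print(files)                      # [0, 0, 1, 1, 2]
print(removable_files(files, 3))  # [0]
```

In this model, a limit of 0 puts every transaction into file 0, which is always the open file, so nothing becomes removable while the process runs.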

Each binlog file is named with a zero-padded number, like binlog.0000, binlog.0001, etc., typically showing four digits. You can change how many digits the number has with the setting binlog_filename_digits. If you have more binlog files than the number of digits can accommodate, the number of digits will be automatically increased to fit all files.

Important: To change the number of digits, you must first save all table data and properly shut down the system. Then, delete the old log files and restart the system.

searchd {
...
    binlog_filename_digits = 6
...
}
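
The naming scheme can be illustrated with a short Python sketch. This is not Manticore's code; the helper binlog_name is hypothetical and only mirrors the zero-padding rule described above, including the way the number widens once it no longer fits in the configured digits.

```python
def binlog_name(number, digits=4):
    """Build a binlog file name with a zero-padded sequence number.

    digits plays the role of binlog_filename_digits. Padding is a
    minimum width, so larger numbers simply use more digits.
    """
    return f"binlog.{number:0{digits}d}"


print(binlog_name(0))            # binlog.0000
print(binlog_name(1))            # binlog.0001
print(binlog_name(7, digits=6))  # binlog.000007
print(binlog_name(12345))        # binlog.12345
```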

You can choose between two ways to manage binary log files, which can be set with the binlog_common directive:

Separate file for each table (default, 0): Each table saves its changes in its own log file. This setup is good if you have many tables that get updated at different times. It allows tables to be updated without waiting for others. Also, if there is a problem with one table's log file, it does not affect the others.
Single file for all tables (1): All tables use the same binary log file. This method makes it easier to handle files because there are fewer of them. However, this could keep files longer than needed if one table still needs to save its updates. This setting might also slow things down if many tables need to update at the same time because all changes have to wait to be written to one file.

searchd {
...
    binlog_common = 1
...
}

There are four different binlog flushing strategies, controlled by the binlog_flush directive:

0 - Data is flushed to disk every second, and Manticore initiates a sync right after each flush. This method is the fastest, but if the server or host crashes suddenly, recently written data that has not yet been synced may be lost.
1 - Data is written to the binlog and synced immediately after each transaction. This method is the safest because every change is preserved right away, but it slows down writes.
2 - Data is written after each transaction, and a sync is initiated every second. This approach balances speed and safety. However, if the host fails, data whose sync was still in progress may be lost. Also, syncing may take longer than one second depending on the disk.
3 - Similar to 2, but it also ensures the binlog file is synced before it is closed due to exceeding binlog_max_log_size.

The default mode is 2, which writes data after each transaction and starts syncing it every second, balancing speed and safety.

searchd {
...
    binlog_flush = 1 # ultimate safety, low write speed
...
}

In a cluster setup using Galera, node recovery behavior is crucial. Normally, Galera handles node desynchronization via IST (incremental state transfer) if the node was shut down cleanly and its last sequence number (seqno) was properly saved. However, in case of a crash where seqno isn't preserved, Galera will trigger an SST (state snapshot transfer), which is resource-intensive and can significantly slow down the cluster due to high I/O activity.

To address this, cluster binlog support has been introduced. This feature extends the existing binary logging functionality to help reduce the need for SST by allowing a recovering node to replay missing transactions from local binlogs and rejoin the cluster with a valid seqno.

Cluster binlog is enabled by default for any cluster operations. However, it can be disabled by setting the environment variable:

MANTICORE_REPLICATION_BINLOG=0

This feature reduces downtime and avoids full data transfers by combining the local durability of the binary log with Galera's distributed synchronization capabilities.

During recovery after an unclean shutdown, binlogs are replayed, and all logged transactions since the last good on-disk state are restored. Transactions are checksummed, so in case of binlog file corruption, garbage data will not be replayed; such a broken transaction will be detected and will stop the replay.
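
The idea of stopping replay at the first broken transaction can be sketched in Python. This is an illustration only: the record layout (length, payload, CRC32) and the helper names encode and replay are assumptions made for the example and do not describe Manticore's actual on-disk binlog format or checksum algorithm.

```python
import struct
import zlib


def encode(payload: bytes) -> bytes:
    """Frame one transaction as: 4-byte length, payload, 4-byte CRC32."""
    header = struct.pack("<I", len(payload))
    checksum = struct.pack("<I", zlib.crc32(payload))
    return header + payload + checksum


def replay(log: bytes):
    """Return payloads of valid transactions up to the first bad one."""
    applied, pos = [], 0
    while pos + 4 <= len(log):
        (length,) = struct.unpack_from("<I", log, pos)
        end = pos + 4 + length + 4
        if end > len(log):
            break  # truncated record: stop the replay
        payload = log[pos + 4 : pos + 4 + length]
        (stored_crc,) = struct.unpack_from("<I", log, pos + 4 + length)
        if zlib.crc32(payload) != stored_crc:
            break  # checksum mismatch: never apply garbage data
        applied.append(payload)
        pos = end
    return applied


log = encode(b"insert 1") + encode(b"insert 2") + encode(b"insert 3")
damaged = bytearray(log)
damaged[20] ^= 0xFF  # corrupt the first payload byte of the second record
print(replay(log))             # all three transactions
print(replay(bytes(damaged)))  # only the first transaction
```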

Intensive updates to a small RT table that fully fits into a RAM chunk can result in an ever-growing binlog that can never be unlinked until a clean shutdown. Binlogs essentially serve as append-only deltas against the last known good saved state on disk, and they cannot be unlinked unless the RAM chunk is saved. An ever-growing binlog is not ideal for disk usage and crash recovery time. To address this issue, you can configure searchd to perform periodic RAM chunk flushes using the rt_flush_period directive. With periodic flushes enabled, searchd will maintain a separate thread that checks whether RT table RAM chunks need to be written back to disk. Once this occurs, the respective binlogs can be (and are) safely unlinked.

The default RT flush period is set to 10 hours.

searchd {
...
    rt_flush_period = 3600 # 1 hour
...
}

It's important to note that rt_flush_period only controls the frequency at which checks occur. There are no guarantees that a specific RAM chunk will be saved. For example, it doesn't make sense to regularly re-save a large RAM chunk that only receives a few rows worth of updates. Manticore automatically determines whether to perform the flush using a few heuristics.

Docker logging

When you use the official Manticore Docker image, the server log is sent to /dev/stdout, which can be viewed from the host with:

docker logs manticore

The query log can be diverted to the Docker log by passing the environment variable QUERY_LOG_TO_STDOUT=true.

The log folder is the same as for the Linux package: /var/log/manticore. If desired, it can be mounted to a local path to view or process the logs.
