Cluster recovery

There may be cases when the Manticore search daemon gets stopped with no node in the cluster left to serve requests. In these cases someone needs to recover the cluster or a part of it. Because the Galera library used for replication is multi-master, a Manticore replication cluster is one logical entity: it looks after each of its nodes and the consistency of each node's data, and it tracks the status of the cluster as a whole. This allows safe writes on multiple nodes at the same time and maintains cluster integrity, unlike traditional asynchronous replication.

However, this comes with its own challenges. The cases below describe what happens in each situation. As an example, take a cluster of nodes A, B and C, and consider scenarios where some or all nodes go out of service and what has to be done to bring them back.

Case 1

Node A is stopped as usual. The other nodes receive a "normal shutdown" message from node A. The cluster size is reduced and the quorum is recalculated.

After node A is started as usual, it joins the cluster. Node A will not serve any write transactions until the join is complete and it is fully synchronized with the cluster. If the writeset cache on donor node B or C (its size is controlled by the Galera option gcache.size) still contains all the transactions that node A missed, node A receives a fast incremental state transfer (IST), that is, a transfer of only the missed transactions. Otherwise, a snapshot state transfer (SST) starts, that is, a transfer of table files.
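The IST-or-SST choice described above can be pictured with a small Python sketch. This is only a conceptual model under stated assumptions, not Galera's actual logic: the function name and its parameters are hypothetical, and the real library also takes other state into account.

```python
def choose_state_transfer(joiner_seqno: int,
                          donor_cache_first_seqno: int,
                          donor_last_seqno: int) -> str:
    """Conceptual model only: pick "IST" or "SST" for a joining node.

    joiner_seqno            -- last transaction the joiner has applied (-1 if unknown)
    donor_cache_first_seqno -- oldest transaction still held in the donor's writeset cache
    donor_last_seqno        -- newest transaction the donor has applied
    All three names are hypothetical and exist only for this illustration.
    """
    if joiner_seqno < 0:
        # The joiner does not know its own position, so only a full snapshot helps.
        return "SST"
    if joiner_seqno >= donor_last_seqno:
        # Nothing was missed; an (empty) incremental transfer is enough.
        return "IST"
    first_missed = joiner_seqno + 1
    # IST is possible only if the cache still reaches back to the first missed transaction.
    return "IST" if donor_cache_first_seqno <= first_missed else "SST"
```

In this model a larger writeset cache lowers donor_cache_first_seqno, so a node can stay offline longer and still rejoin with an IST.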

Case 2

Nodes A and B are stopped as usual. This is the same situation as in the previous case, but the cluster size is reduced to 1, and node C alone forms the primary component, which allows it to handle write transactions.

Nodes A and B can be started as usual and will join the cluster after they start. Node C becomes the "donor" and transfers the state to nodes A and B.

Case 3

All nodes are stopped as usual and the cluster is off.

The problem now is how to initialize the cluster. On a clean shutdown of searchd, each node writes the number of the last executed transaction into the grastate.dat file in the cluster directory, along with the safe_to_bootstrap flag. The node that was stopped last will have safe_to_bootstrap: 1 and the most advanced seqno.

It is important that this node starts first to form the cluster. To bootstrap a cluster, start the server on this node with the --new-cluster flag. On Linux you can also run manticore_new_cluster, which starts Manticore in --new-cluster mode via systemd.

If another node starts first and bootstraps the cluster, the most advanced node joins that cluster, performs a full SST, and receives table files in which some transactions are missing compared with the table files it had before. That is why it is important to start first the node that was shut down last; it should have the flag safe_to_bootstrap: 1 in grastate.dat.
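As a rough illustration of the check described above, the following Python sketch reads the contents of each node's grastate.dat and picks the node that is marked safe to bootstrap. It assumes the file consists of "key: value" lines, as suggested by the safe_to_bootstrap: 1 flag mentioned above; the function names and the sample contents are hypothetical.

```python
from typing import Dict, Optional


def parse_grastate(text: str) -> Dict[str, str]:
    """Parse "key: value" lines, skipping blank lines and '#' comments."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def pick_bootstrap_node(states: Dict[str, str]) -> Optional[str]:
    """states maps a node name to the text of its grastate.dat.

    Returns the node flagged safe_to_bootstrap: 1 (highest seqno wins if
    several are flagged), or None if no node is flagged.
    """
    candidates = []
    for node, text in states.items():
        fields = parse_grastate(text)
        if fields.get("safe_to_bootstrap") == "1":
            candidates.append((int(fields.get("seqno", "-1")), node))
    return max(candidates)[1] if candidates else None


# Illustrative contents only; real files contain more fields.
states = {
    "A": "seqno: 10\nsafe_to_bootstrap: 0\n",
    "B": "seqno: 11\nsafe_to_bootstrap: 0\n",
    "C": "seqno: 12\nsafe_to_bootstrap: 1\n",
}
print(pick_bootstrap_node(states))
```

A result of None means that no node was flagged as safe to bootstrap.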

Case 4

Node A disappears from the cluster due to a crash or a network failure.

Nodes B and C try to reconnect to the missing node A and, after failing, remove node A from the cluster. The cluster quorum is still valid because 2 out of 3 nodes are running, and the cluster keeps working as usual.

After node A is restarted, it joins the cluster automatically in the same way as in Case 1.

Case 5

Nodes A and B disappear. Node C cannot form a quorum alone, because 1 node is not more than half of the 3-node cluster (1 < 1.5). The cluster on node C therefore switches to the non-primary state, and node C rejects any write transaction with an error message.
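The quorum arithmetic used in these cases can be written as a one-line check. The sketch below is a simplification that assumes every node carries equal weight; the function name is hypothetical.

```python
def has_quorum(reachable_nodes: int, cluster_size: int) -> bool:
    """True if the reachable nodes are strictly more than half of the cluster.

    cluster_size is the size the cluster had before the failure. After a
    clean shutdown the size is reduced, so the remaining nodes are compared
    against the smaller size.
    """
    return reachable_nodes * 2 > cluster_size


print(has_quorum(2, 3))  # Case 4: B and C remain out of A, B, C
print(has_quorum(1, 3))  # Case 5: only C remains after A and B disappear
print(has_quorum(1, 1))  # Case 2: A and B left cleanly, so the size is already 1
print(has_quorum(2, 4))  # an even split of a four-node cluster
```

The strict inequality is what keeps an evenly split cluster from accepting writes on both sides: exactly half is not a quorum.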

Meanwhile, node C waits for the other nodes to connect and also tries to connect to them itself. If the network is restored and nodes A and B are running again, the cluster is re-formed automatically. If nodes A and B are merely cut off from node C but can still reach each other, they keep working as usual because they still form a quorum.

However, if both nodes A and B crashed or restarted due to a power outage, someone should turn on the primary component on node C with the following statement:

SET CLUSTER posts GLOBAL 'pc.bootstrap' = 1

Before doing that, make sure the other nodes are really unreachable; otherwise a split-brain occurs and separate clusters are formed.

Case 6

All nodes crashed. In this case the grastate.dat file in the cluster directory is not updated and does not contain a valid sequence number (seqno).

If this happens, someone should find the most advanced node and start the server on it with the --new-cluster-force command line key. All other nodes are then started as usual, as in Case 3. On Linux you can also run manticore_new_cluster --force, which starts Manticore in --new-cluster-force mode via systemd.

Case 7

A split-brain causes the cluster to enter the non-primary state. For example, suppose the cluster consists of an even number of nodes (four), such as two pairs of nodes located in different datacenters, and a network failure interrupts the connection between the datacenters. A split-brain occurs because each group of nodes holds exactly half of the cluster, which is not enough for a quorum. Both groups stop handling write transactions, because the Galera replication model protects data consistency and the cluster cannot accept writes without a quorum. Nodes in both groups keep trying to reconnect to the nodes in the other group to restore the cluster.

To restore the cluster before the network is restored, follow the same steps as in Case 5, but in only one of the two groups of nodes.

Once the statement below has been run on a node, the group that contains that node can handle write transactions again:

SET CLUSTER posts GLOBAL 'pc.bootstrap' = 1

Note, however, that if the statement is issued in both groups, the result is two separate clusters, and restoring the network afterwards will not make the groups rejoin.

Connecting to the server

With the default configuration, Manticore listens for connections on:

  • port 9306 for MySQL clients
  • port 9308 for HTTP/HTTPS connections
  • port 9312 for HTTP/HTTPS, and connections from other Manticore nodes and clients based on Manticore binary API
mysql -h0 -P9306

MySQL protocol

Manticore Search implements an SQL interface using the MySQL protocol, which allows any MySQL client, library or connector to connect to Manticore Search and work with it as if it were a MySQL server.

However, the SQL dialect is different. It implements only a subset of the SQL commands and functions available in MySQL. In addition, there are clauses and functions that are specific to Manticore Search. The most notable example is the MATCH() clause, which specifies the full-text search query.

Manticore Search does not support server-side prepared statements, but client-side prepared statements can be used. Note that Manticore implements the multi-value (MVA) data type, which has no equivalent in MySQL or in libraries that implement prepared statements. In such cases the MVA values need to be written directly into the raw query.
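One way to write MVA values into a raw query is to format them as a parenthesized list of integers and leave the remaining values to the client's own placeholders. The Python sketch below is a minimal example under that assumption; the helper name format_mva, the products table and its columns are hypothetical, and the %s placeholder style depends on the client library in use.

```python
from typing import Iterable


def format_mva(values: Iterable[int]) -> str:
    """Render integers as an MVA literal such as (3,7,11).

    Only real integers are accepted, so that arbitrary text cannot be
    injected into the raw query through this helper.
    """
    items = []
    for value in values:
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"MVA values must be integers, got {value!r}")
        items.append(str(value))
    return "(" + ",".join(items) + ")"


# The id and title still go through client-side placeholders;
# only the MVA literal is written into the query text.
sql = ("INSERT INTO products (id, title, tags) VALUES (%s, %s, "
       + format_mva([3, 7, 11]) + ")")
print(sql)
```

The resulting string can then be passed to the client library's execute call together with the values for the remaining placeholders.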

Some MySQL clients and connectors require values for the user, the password and/or the database name. Since Manticore Search has no concept of a database and user access control is not implemented yet, these can be set arbitrarily; Manticore simply ignores the values.

Configuration

The default port for the SQL interface is 9306 and it's enabled by default.

In the searchd section of the configuration file, the MySQL port can be defined with the listen directive like this:

searchd {
...
   listen = 127.0.0.1:9306:mysql
...
}

Because Manticore does not have user authentication implemented yet, make sure the MySQL port cannot be accessed by anyone outside your network.

VIP connection

A separate MySQL port can be used for 'VIP' connections. A connection to this port bypasses the thread pool and always creates a new dedicated thread. This is useful for managing the server under severe overload, when it would otherwise either stall or not let you connect via a regular port.

searchd {
...
   listen = 127.0.0.1:9306:mysql
   listen = 127.0.0.1:9307:mysql_vip
...
}

Connecting via standard MySQL client

The easiest way to connect to Manticore is by using a standard MySQL client:

mysql -P9306 -h0

Secured MySQL connection

The MySQL protocol supports SSL encryption. Secured connections can be made on the same mysql listening port.

Compressed MySQL connection

Compression can be used with MySQL connections and is available to clients by default. The client only needs to specify that the connection should use compression.

An example with the MySQL client:

mysql -P9306 -h0 -C

Compression can be used in both secured and non-secured connections.

Notes on MySQL connectors

The official MySQL connectors can be used to connect to Manticore Search. However, they might require certain settings to be passed in the DSN string, because a connector may try to run SQL commands that are not yet implemented in Manticore.

JDBC Connector 6.x and above requires Manticore Search 2.8.2 or later, and the DSN string should contain the following options:

jdbc:mysql://IP:PORT/DB/?characterEncoding=utf8&maxAllowedPacket=512000&serverTimezone=XXX

By default Manticore Search reports its own version to the connector, which may cause problems. To avoid them, set the mysql_version_string directive in the searchd section of the configuration to a version lower than 5.1.1:

searchd {
...
   mysql_version_string = 5.0.37
...
}

.NET MySQL connector uses connection pools by default. To get correct SHOW META statistics, send the query together with the SHOW META command as a single multistatement (SELECT ...;SHOW META). If pooling is enabled, the option Allow Batch=True must be added to the connection string to allow multistatements:

Server=127.0.0.1;Port=9306;Database=somevalue;Uid=somevalue;Pwd=;Allow Batch=True;

Notes on ODBC connectivity

Manticore can be accessed using ODBC. It is recommended to set charset=UTF8 in the ODBC string. Some ODBC drivers reject the version reported by the Manticore server because they see it as a very old MySQL server. This can be overridden with the mysql_version_string option.

Comment syntax

Manticore SQL over MySQL supports C-style comment syntax. Everything from an opening /* sequence to a closing */ sequence is ignored. Comments can span multiple lines, cannot nest, and should not get logged. MySQL-specific /*! ... */ comments are also currently ignored, because comment support was added mainly for compatibility with dumps produced by mysqldump rather than to improve general query interoperability between Manticore and MySQL.

SELECT /*! SQL_CALC_FOUND_ROWS */ col1 FROM table1 WHERE ...
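The comment rules above (everything between /* and the next */ is ignored, comments may span lines, and they do not nest) can be mimicked with a short Python sketch. This is an illustration of the described behaviour, not Manticore's parser; the function name is hypothetical, and the sketch deliberately ignores the case of comment markers inside string literals.

```python
import re

# Non-greedy match from "/*" to the first "*/"; DOTALL lets it span lines.
_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def strip_comments(sql: str) -> str:
    """Remove C-style comments, including MySQL-specific /*! ... */ ones."""
    return _COMMENT.sub("", sql)


print(strip_comments("SELECT /*! SQL_CALC_FOUND_ROWS */ col1 FROM table1"))
```

Because the match stops at the first closing sequence, a "nested" comment such as /* a /* b */ c */ ends at the first */, which is what "cannot nest" means in practice.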