▪️ Security
In many cases you might want to encrypt traffic between your client and the server. To do that, you can specify that the server should use the HTTPS protocol rather than HTTP.
To enable HTTPS, at least the following two directives must be set in the searchd section of the config, and at least one listener must be set to https:
- ssl_cert server certificate file
- ssl_key server private key file
In addition, you can specify the certificate authority's certificate (aka root certificate) in:
- ssl_ca certificate authority's certificate file
Example with CA:
ssl_ca = ca-cert.pem
ssl_cert = server-cert.pem
ssl_key = server-key.pem
Example without CA:
ssl_cert = server-cert.pem
ssl_key = server-key.pem
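The minimum requirements above can be expressed as a small check. This is only a sketch: the `https_ready` helper, the dictionary layout, and the `9443:https` listener string are hypothetical, chosen for illustration, and are not part of Manticore.

```python
def https_ready(searchd: dict) -> bool:
    """Return True if the settings meet the minimum HTTPS requirements."""
    has_cert = bool(searchd.get("ssl_cert"))
    has_key = bool(searchd.get("ssl_key"))
    # Assumption: each listener is a string whose last ":"-separated
    # field names the protocol, e.g. "9443:https".
    listeners = searchd.get("listen", [])
    has_https_listener = any(
        listener.rsplit(":", 1)[-1] == "https" for listener in listeners
    )
    # ssl_ca is optional, so it is deliberately not checked here.
    return has_cert and has_key and has_https_listener
```

Note that ssl_ca does not affect the result, matching the "without CA" example.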
These steps will help you generate the SSL certificates with the 'openssl' tool.
The server can use a Certificate Authority to verify the signature of certificates, but it can also work with just a private key and certificate (without the CA certificate).
Generate the CA private key:
openssl genrsa 2048 > ca-key.pem
Generate self-signed CA (root) certificate from the private key (fill in at least "Common Name"):
openssl req -new -x509 -nodes -days 365 -key ca-key.pem -out ca-cert.pem
The server uses the server certificate to secure communication with the client. Generate a certificate request and the server private key (fill in at least a "Common Name" that differs from the root certificate's common name):
openssl req -newkey rsa:2048 -days 365 -nodes -keyout server-key.pem -out server-req.pem
openssl rsa -in server-key.pem -out server-key.pem
openssl x509 -req -in server-req.pem -days 365 -CA ca-cert.pem -CAkey ca-key.pem -set_serial 01 -out server-cert.pem
When done you can verify the key and certificate files were generated correctly:
openssl verify -CAfile ca-cert.pem server-cert.pem
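As a complementary check, the sketch below uses Python's standard `ssl` module to confirm that a certificate and private key load together as a pair. The `cert_matches_key` helper is a hypothetical name, and this does not replace the `openssl verify` step, which checks the CA signature instead.

```python
import ssl


def cert_matches_key(cert_path: str, key_path: str) -> bool:
    """Return True if the certificate and private key load as a valid pair."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    try:
        # Raises ssl.SSLError on a mismatch and FileNotFoundError on a
        # missing file; both are subclasses of OSError.
        context.load_cert_chain(certfile=cert_path, keyfile=key_path)
    except OSError:
        return False
    return True
```

For the files generated above, the call would be `cert_matches_key("server-cert.pem", "server-key.pem")`.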
When your SSL config is valid, the following things are available:
- You can connect to the multiprotocol port (when no listener type is specified) over HTTPS and run queries. Both the request and the response will be SSL-encrypted.
- You can connect to the dedicated https port over HTTPS and run queries. The connection will be secured. (An attempt to connect to this port via plain HTTP will be rejected with a 400 error code.)
- You can connect to the mysql port with a mysql client using a secured connection. The session will be secured. Note that the Linux mysql client tries to use SSL by default, so an ordinary connection to Manticore with a valid SSL config will most probably be secured. You can check this by running the SQL 'status' command after you connect.
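A client-side sketch of an HTTPS query using only the Python standard library follows. Everything specific here is an assumption for illustration: the host, the port 9443, the `/sql` path with a form-encoded `query` field, and the helper names. Adjust them to your own setup.

```python
import ssl
import urllib.parse
import urllib.request


def build_sql_request(host: str, port: int, sql: str) -> urllib.request.Request:
    """Build a POST request carrying one SQL statement."""
    # Assumption: the server accepts SQL as a form-encoded "query" field on /sql.
    body = urllib.parse.urlencode({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        f"https://{host}:{port}/sql", data=body, method="POST"
    )


def run_query(host: str, port: int, sql: str, ca_file: str = "ca-cert.pem") -> bytes:
    """Send the query over HTTPS, trusting only the given CA certificate."""
    context = ssl.create_default_context(cafile=ca_file)
    request = build_sql_request(host, port, sql)
    with urllib.request.urlopen(request, context=context) as response:
        return response.read()
```

The host passed to `run_query` has to match a name in the server certificate, otherwise the TLS handshake is rejected on the client side.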
When your SSL config is not valid for any reason, which the daemon detects by the fact that a secured connection can't be established (apart from an invalid config there may be other reasons, such as being unable to load the appropriate SSL library at all), the following things will not work or will work in a non-secured way:
- You can't connect to the multiprotocol port with HTTPS. The connection will be dropped.
- You can't connect to the dedicated https port. HTTPS connections will be dropped.
- A connection to the mysql port via a mysql client will not offer SSL. So if the client demands it, the connection will fail. If not, it will use a plain mysql or compressed connection.
Keep in mind the following as well:
- Binary API connections (such as connections from old clients, or inter-daemon master-agent communication) are not secured.
- SSL for replication needs to be set up separately. However, since the SST stage of replication is done over a binary API connection, it is not secured either.
- You can still use any external proxy (e.g. SSH tunnelling) to secure your connections.
▪️ Logging
Query logging can be enabled by setting the query_log directive in the searchd section of the configuration file:
searchd {
...
query_log = /var/log/query.log
...
}
Queries can also be sent to syslog by setting syslog instead of a file path.
In this case all search queries will be sent to the syslog daemon with LOG_INFO priority, prefixed with [query] instead of a timestamp. Only the plain log format is supported for syslog. To use the syslog option, Manticore must be configured with --with-syslog at build time (official packages come with syslog support).
Two query log formats are supported. The plain text format is still the default. It may be more convenient for manual monitoring and review, but it only logs search queries and not other types of requests, and it does not always contain the complete search query data.
The default text format is also harder (and sometimes impossible) to replay for benchmarking purposes. The sphinxql format alleviates that. It aims to be complete and replayable, at the cost of brevity and readability.
By default, searchd logs all successfully executed search queries into a query log file. Here's an example:
[Fri Jun 29 21:17:58 2007] 0.004 sec 0.004 sec [all/0/rel 35254 (0,20)] [lj] test
[Fri Jun 29 21:20:34 2007] 0.024 sec 0.024 sec [all/0/rel 19886 (0,20) @channel_id] [lj] test
This log format is as follows:
[query-date] real-time wall-time [match-mode/filters-count/sort-mode total-matches (offset,limit) @groupby-attr] [index-name] query
- real-time is the time measured from the start to the finish of the query
- wall-time is like real-time, but excludes the time spent waiting for agents and merging result sets
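The format above lends itself to a regular expression. Here is a minimal parsing sketch, assuming lines look exactly like the two examples shown; `parse_plain_line` and the field names in the returned dictionary are hypothetical.

```python
import re

PLAIN_LINE = re.compile(
    r"\[(?P<date>[^\]]+)\] "
    r"(?P<real>[\d.]+) sec (?P<wall>[\d.]+) sec "
    r"\[(?P<match_mode>\w+)/(?P<filters>\d+)/(?P<sort_mode>[\w+-]+) "
    r"(?P<total>\d+) \((?P<offset>\d+),(?P<limit>\d+)\)"
    r"(?: @(?P<groupby>\w+))?\] "
    r"\[(?P<index>[^\]]+)\] "
    r"(?P<query>.*)"
)


def parse_plain_line(line: str):
    """Parse one plain-format query log line into a dict, or return None."""
    match = PLAIN_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields; groupby stays None when it is absent.
    for name in ("real", "wall"):
        fields[name] = float(fields[name])
    for name in ("filters", "total", "offset", "limit"):
        fields[name] = int(fields[name])
    return fields
```

Lines that deviate from the documented layout return None rather than raising, so a caller can skip them.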
Match mode can take one of the following values:
- "all" for SPH_MATCH_ALL mode;
- "any" for SPH_MATCH_ANY mode;
- "phr" for SPH_MATCH_PHRASE mode;
- "bool" for SPH_MATCH_BOOLEAN mode;
- "ext" for SPH_MATCH_EXTENDED mode;
- "ext2" for SPH_MATCH_EXTENDED2 mode;
- "scan" if the full scan mode was used, for example by being specified with SPH_MATCH_FULLSCAN.
Sort mode can take one of the following values:
- "rel" for SPH_SORT_RELEVANCE mode;
- "attr-" for SPH_SORT_ATTR_DESC mode;
- "attr+" for SPH_SORT_ATTR_ASC mode;
- "tsegs" for SPH_SORT_TIME_SEGMENTS mode;
- "ext" for SPH_SORT_EXTENDED mode.
Note: the SPH_* modes are specific to the legacy SphinxAPI interface. The SQL and HTTP interfaces will in most cases log ext2 as the matching mode and ext and rel as the sorting modes.
Additionally, if searchd was started with --iostats, there will be a block of data after the list of index(es) searched.
A query log entry might take the form of:
[Fri Jun 29 21:17:58 2007] 0.004 sec [all/0/rel 35254 (0,20)] [lj] [ios=6 kb=111.1 ms=0.5] test
This additional block is information regarding I/O operations in performing the search: the number of file I/O operations carried out, the amount of data in kilobytes read from the index files and time spent on I/O operations (although there is a background processing component, the bulk of this time is the I/O operation time).
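The I/O block can be picked out of a log line as sketched below, assuming it always has the `[ios=... kb=... ms=...]` shape of the example; `parse_iostats` is a hypothetical helper name.

```python
import re

IOSTATS = re.compile(r"\[ios=(?P<ios>\d+) kb=(?P<kb>[\d.]+) ms=(?P<ms>[\d.]+)\]")


def parse_iostats(line: str):
    """Return the I/O counters from a log line, or None if there is no block."""
    match = IOSTATS.search(line)
    if match is None:
        return None
    return {
        "ios": int(match["ios"]),   # number of file I/O operations
        "kb": float(match["kb"]),   # kilobytes read from the index files
        "ms": float(match["ms"]),   # time spent on I/O operations
    }
```

Lines logged without --iostats have no such block, in which case the function returns None.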
The SQL format can be enabled with the searchd directive query_log_format:
searchd {
...
query_log = /var/log/query.log
query_log_format = sphinxql
...
}
In this format, the example from the previous section would look as follows. (Wrapped below for readability, but with just one query per line in the actual log.)
/* Fri Jun 29 21:17:58.609 2007 conn 2 real 0.004 wall 0.004 found 35254 */
SELECT * FROM test WHERE MATCH('test') OPTION ranker=proximity;
/* Fri Jun 29 21:20:34.555 2007 conn 3 real 0.024 wall 0.024 found 19886 */
SELECT * FROM test WHERE MATCH('test') GROUP BY channel_id
OPTION ranker=proximity;
Note that all requests are logged in this format, including those sent via SphinxAPI and SphinxSE, not just those sent via SQL. Also note that this kind of logging works only with plain log files and will not work if you use the 'syslog' service for logging.
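Because the counters live in a leading SQL comment, they can be extracted without parsing the statement itself. The sketch below assumes the comment carries the `conn`, `real`, `wall` and `found` fields in the order shown in the example; `parse_sphinxql_header` is a hypothetical name.

```python
import re

HEADER = re.compile(
    r"/\* (?P<stamp>.+?) conn (?P<conn>\d+) "
    r"real (?P<real>[\d.]+) wall (?P<wall>[\d.]+) found (?P<found>\d+) \*/"
)


def parse_sphinxql_header(line: str):
    """Extract the counters from the leading comment of a sphinxql log line."""
    match = HEADER.search(line)
    if match is None:
        return None
    return {
        "stamp": match["stamp"],        # timestamp text, left unparsed
        "conn": int(match["conn"]),     # connection id
        "real": float(match["real"]),   # seconds
        "wall": float(match["wall"]),   # seconds
        "found": int(match["found"]),   # total matches
    }
```

The timestamp is kept as raw text on purpose, so the function does not depend on its exact layout.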
The features of Manticore SQL log format compared to the default text one are as follows.
- All request types should be logged. (This is still work in progress.)
- Full statement data will be logged where possible.
- Errors and warnings are logged.
- The log should be automatically re-playable via SphinxQL.
- Additional performance counters (currently, per-agent distributed query times) are logged.
Use sphinxql:compact_in to shorten IN() clauses in the log if they contain too many values.
Every request (both SphinxAPI and SQL) must result in exactly one log line. All request types, including INSERT, CALL SNIPPETS, etc., will eventually get logged, though as of the time of this writing that is a work in progress. Every log line must be a valid Manticore SQL statement that reconstructs the full request, except if the logged request is too big and needs shortening for performance reasons. Additional messages, counters, etc. can be logged in the comments section after the request.
By default all queries are logged. To log only queries whose execution time exceeds a specified limit, use the query_log_min_msec directive:
searchd {
...
query_log = /var/log/query.log
query_log_min_msec = 1000
...
}
The expected unit of measure is milliseconds, but time suffix expressions can be used as well, for example:
searchd {
...
query_log = /var/log/query.log
query_log_min_msec = 1s
...
}
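To make the threshold semantics concrete, here is a sketch that normalises the two value forms shown above (a bare number of milliseconds and the `s` suffix) and applies the "exceeds the limit" rule. Both helper names are hypothetical, and other suffixes are deliberately not handled because they are not covered here.

```python
def min_msec_to_ms(value: str) -> int:
    """Convert a query_log_min_msec value such as "1000" or "1s" to milliseconds."""
    value = value.strip()
    if value.endswith("s"):
        # Only the plain seconds suffix from the example is supported;
        # anything else (e.g. "1ms") raises ValueError in int().
        return int(value[:-1]) * 1000
    return int(value)


def should_log(real_seconds: float, min_msec: str) -> bool:
    """Return True if a query's execution time exceeds the configured limit."""
    return real_seconds * 1000 > min_msec_to_ms(min_msec)
```

Under this reading, the two configurations above are equivalent: both `1000` and `1s` would log a 1.5-second query and skip a 0.004-second one.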
By default the searchd and query log files are created with 600 permissions, so only the user the server runs under and root can read them. query_log_mode allows setting a different permission. This can be handy to let other users read the log files (for example, monitoring solutions running as non-root users).
searchd {
...
query_log = /var/log/query.log
query_log_mode = 666
...
}
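The octal values can be read with Python's `stat` module, as in this small sketch. It assumes the directive value is a plain octal permission string, and `describe_mode` is a hypothetical helper.

```python
import stat


def describe_mode(mode_digits: str) -> str:
    """Render an octal permission string such as "600" in ls-style notation."""
    mode = int(mode_digits, 8)
    # S_IFREG marks a regular file, so the result starts with "-".
    return stat.filemode(stat.S_IFREG | mode)
```

Here `describe_mode("600")` gives `-rw-------` (owner only), while `describe_mode("666")` gives `-rw-rw-rw-`, which also lets every other user write to the log. A read-only grant for others, such as 644, may be enough for monitoring.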