⪢ Fetching from databases
Manticore Search allows fetching data from databases using specialized drivers or ODBC. Current drivers include:
mysql
- for MySQL/MariaDB/Percona MySQL databasespgsql
- for PostgreSQL databasemssql
- for Microsoft SQL databaseodbc
- for any database that accepts connections using ODBC
To fetch data from the database, a source must be configured with type as one of the above. The source requires information about how to connect to the database and the query that will be used to fetch the data. Additional pre- and post-queries can also be set - either to configure session settings or to perform pre/post fetch tasks. The source also must contain definitions of data types for the columns that are fetched.
The source definition must contain the settings of the connection, this includes the host, port, user credentials, or specific settings of a driver.
The database server host to connect to. Note that the MySQL client library chooses whether to connect over TCP/IP or over UNIX socket based on the host name. Specifically, "localhost" will force it to use UNIX socket (this is the default and generally recommended mode) and "127.0.0.1" will force TCP/IP usage.
The server IP port to connect to.
For mysql
the default is 3306 and for pgsql
, it is 5432.
The SQL database to use after the connection is established and perform further queries within.
The username used for connecting.
The user password to use when connecting. If the password includes #
(which can be used to add comments in the configuration file), you can escape it with \
.
UNIX socket name to connect to for local database servers. Note that it depends on the sql_host
setting whether this value will actually be used.
sql_sock = /var/lib/mysql/mysql.sock
MySQL client connection flags. Optional, the default value is 0 (do not set any flags).
This option must contain an integer value with the sum of the flags. The value will be passed to mysql_real_connect() verbatim. The flags are enumerated in mysql_com.h include file. Flags that are especially interesting in regard to indexing, with their respective values, are as follows:
- CLIENT_COMPRESS = 32; can use compression protocol
- CLIENT_SSL = 2048; switch to SSL after handshake
- CLIENT_SECURE_CONNECTION = 32768; new 4.1 authentication For instance, you can specify 2080 (2048+32) to use both compression and SSL, or 32768 to use new authentication only. Initially, this option was introduced to be able to use compression when the indexer and mysqld are on different hosts. Compression on 1 Gbps links is most likely to hurt indexing time though it reduces network traffic, both in theory and in practice. However, enabling compression on 100 Mbps links may improve indexing time significantly (up to 20-30% of the total indexing time improvement was reported). Your mileage may vary.
mysql_connect_flags = 32 # enable compression
mysql_ssl_cert
- path to SSL certificatemysql_ssl_key
- path to SSL key filemysql_ssl_ca
- path to CA certificate
unpack_mysqlcompress_maxsize = 1M
Columns to unpack using MySQL UNCOMPRESS()
algorithm. Multi-value, optional, default value is an empty list of columns.
Columns specified using this directive will be unpacked by the indexer using the modified zlib algorithm used by MySQL COMPRESS()
and UNCOMPRESS()
functions. When indexing on a different box than the database, this lets you offload the database and save on network traffic. This feature is only available if zlib and zlib-devel were both available during build time.
unpack_mysqlcompress = body_compressed
unpack_mysqlcompress = description_compressed
By default, a buffer of 16M is used for uncompressing the data. This can be changed by setting unpack_mysqlcompress_maxsize
.
When using unpack_mysqlcompress, due to implementation intricacies, it is not possible to deduce the required buffer size from the compressed data. So, the buffer must be preallocated in advance, and the unpacked data can not go over the buffer size.
unpack_zlib = col1
unpack_zlib = col2
Columns to unpack using zlib (aka deflate, aka gunzip). Multi-value, optional, default value is an empty list of columns. Applies to source types mysql
and pgsql
only.
Columns specified using this directive will be unpacked by the indexer
using the standard zlib algorithm (called deflate and also implemented by gunzip
). When indexing on a different box than the database, this lets you offload the database and save on network traffic. This feature is only available if zlib and zlib-devel were both available during build time.
MS SQL Windows authentication flag. Whether to use currently logged-in Windows account credentials for authentication when connecting to MS SQL Server.
mssql_winauth = 1
Sources using ODBC require the presence of a DSN (Data Source Name) string which can be set with odbc_dsn
.
odbc_dsn = Driver={Oracle ODBC Driver};Dbq=myDBName;Uid=myUsername;Pwd=myPassword
Please note that the format depends on the specific ODBC driver used.