Please read distributed index. Here we will just provide an example.
Here we have split the data over 4 servers, each serving one of the shards. If one of the servers fails, our distributed index will still work, but we would miss the results from the failed shard.
- ini
index mydist {
type = distributed
agent = box1:9312:shard1
agent = box2:9312:shard2
agent = box3:9312:shard3
agent = box4:9312:shard4
}
Now we added mirrors, each shard is found on 2 servers. By default, the master (the searchd instance with the distributed index) will pick randomly one of the mirrors.
The mode used for picking mirrors can be set with ha_strategy. In addition to random there's also ha_strategy = roundrobin
.
More interesting strategies are the latency-weighted probabilities based ones. noerrors and nodeads not only that take out mirrors with issues, but also monitor the response times and do a balancing. If a mirror responds slower (for example due to some operations running on it), it will receive less requests. When the mirror recovers and provides better times, it will get more requests.
- ini
index mydist {
type = distributed
agent = box1:9312|box5:9312:shard1
agent = box2:9312:|box6:9312:shard2
agent = box3:9312:|box7:9312:shard3
agent = box4:9312:|box8:9312:shard4
}