Booth is an implementation of a Cluster Ticket Registry, also called a Cluster Ticket Manager.
Booth is the instance that manages ticket distribution, and thus the failover process, between the sites of a multi-site cluster. Each of the participating clusters and arbitrators runs a service, boothd. It connects to the booth daemons running at the other sites and exchanges connectivity details. Once a ticket is granted to a site, the booth mechanism manages the ticket automatically: if the site that holds the ticket is out of service, the booth daemons vote on which of the other sites gets the ticket. To protect against brief connection failures, a site that loses the vote (either explicitly or implicitly by being disconnected from the voting body) must relinquish the ticket after a time-out. This ensures that a ticket is only redistributed after the previous site has relinquished it. The resources that depend on that ticket fail over to the new site holding the ticket. The nodes that ran the resources before are treated according to the loss-policy you set within the rsc_ticket constraint.
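The time-out rule above can be illustrated with a small model. This is only a sketch of the idea, not booth's implementation: the class TicketLease, its methods, the site names, and the 60-second time-out are all hypothetical, and time is passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TicketLease:
    """Toy model (hypothetical, not booth code): a ticket is valid only until its lease expires."""
    holder: Optional[str] = None
    expires_at: float = 0.0

    def grant(self, site: str, now: float, timeout: float) -> bool:
        """Grant or renew the ticket for `site` unless another site's lease is still running."""
        if self.holder not in (None, site) and now < self.expires_at:
            return False  # the previous holder has not relinquished the ticket yet
        self.holder = site
        self.expires_at = now + timeout
        return True

    def holder_at(self, now: float) -> Optional[str]:
        """Return the site that validly holds the ticket at time `now`, if any."""
        return self.holder if now < self.expires_at else None


lease = TicketLease()
lease.grant("site-a", now=0.0, timeout=60.0)

# site-a becomes unreachable and stops renewing. site-b cannot take over
# while the old lease is still running, only after it has expired.
early = lease.grant("site-b", now=30.0, timeout=60.0)
late = lease.grant("site-b", now=61.0, timeout=60.0)
```

In this model the first attempt by site-b is refused and the second succeeds, which mirrors the guarantee that a ticket is only redistributed once the previous site has given it up.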
Before booth can manage a ticket within the multi-site cluster, you need to grant it to a site manually with the booth client command. After this initial grant, the booth mechanism takes over and manages the ticket automatically.
The booth client command line tool can be used to grant, list, or revoke tickets. The booth client commands work on any machine where the booth daemon is running.
If you manage tickets with booth, use only booth client for manual intervention, not crm_ticket. This ensures that a ticket is owned by only one cluster site at a time.
Booth includes an implementation of the Paxos and Paxos Lease algorithms, which guarantee distributed consensus among the cluster sites.
Arbitrator
Each site runs one booth instance that is responsible for communicating with the other sites. If you have a setup with an even number of sites, you need an additional instance to reach consensus about decisions such as failover of resources across sites. In this case, add one or more arbitrators running at additional sites. Arbitrators are single machines that run a booth instance in a special mode. As all booth instances communicate with each other, arbitrators help to make more reliable decisions about granting or revoking tickets.
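The need for an extra instance with an even number of sites comes down to simple vote counting. The following sketch assumes that a decision requires a strict majority of all booth instances; the function has_majority and the vote counts are hypothetical illustrations, not part of booth.

```python
def has_majority(reachable_votes: int, total_votes: int) -> bool:
    """Return True if the reachable instances form a strict majority (assumed rule)."""
    return 2 * reachable_votes > total_votes


# Two sites, link between them broken: each side can only count itself.
two_sites_split = has_majority(reachable_votes=1, total_votes=2)

# Two sites plus one arbitrator: the site that still reaches the
# arbitrator counts two of three votes, the cut-off site counts one.
with_arbitrator = has_majority(reachable_votes=2, total_votes=3)
cut_off_site = has_majority(reachable_votes=1, total_votes=3)
```

With two sites alone, neither side of a split reaches a majority, so no decision is possible. Adding a single arbitrator makes the total odd, so exactly one side of a split can still decide.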
An arbitrator is especially important for a two-site scenario. For example, if site A can no longer communicate with site B, there are two possible causes: a network failure between site A and site B, or site B being down. Site A cannot tell these cases apart on its own. However, if site C (the arbitrator) can still communicate with site B, site B must still be up and running.
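The reasoning in this example can be written down as a small decision function. It is a sketch from the point of view of site A, assuming site A can still reach the arbitrator at site C; the function name diagnose_site_b and the returned strings are hypothetical.

```python
def diagnose_site_b(a_reaches_b: bool, c_reaches_b: bool) -> str:
    """Infer the state of site B from what site A and arbitrator C observe."""
    if a_reaches_b:
        return "site B is up"
    if c_reaches_b:
        # A cannot see B, but the arbitrator can: only the A-B link is broken.
        return "site B is up, the link between A and B has failed"
    # Neither A nor C can see B: B is down or cut off from both.
    return "site B is down or unreachable"


link_failure = diagnose_site_b(a_reaches_b=False, c_reaches_b=True)
site_failure = diagnose_site_b(a_reaches_b=False, c_reaches_b=False)
```

Without the second observation from site C, both calls would look identical to site A, which is why the arbitrator is needed to tell the cases apart.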
The most common scenario is probably a multi-site cluster with two sites and a single arbitrator on a third site. Technically, however, there are no limitations with regard to the number of sites or the number of arbitrators involved.
Nodes belonging to the same cluster site should be synchronized via NTP. However, time synchronization is not required between the individual cluster sites.