15.5.2. Granting and Revoking Tickets via a Cluster Ticket Registry

15.5.2.1. Booth

Booth is an implementation of a Cluster Ticket Registry, also called a Cluster Ticket Manager.

Booth manages ticket distribution and thus the failover process between the sites of a multi-site cluster. Each participating cluster and arbitrator runs a service, boothd, which connects to the booth daemons running at the other sites and exchanges connectivity details. Once a ticket is granted to a site, the booth mechanism manages the ticket automatically: if the site that holds the ticket goes out of service, the booth daemons vote on which of the other sites gets the ticket. To protect against brief connection failures, a site that loses the vote (either explicitly or implicitly by being disconnected from the voting body) must relinquish the ticket after a time-out. This ensures that a ticket is only redistributed after the previous site has relinquished it. The resources that depend on the ticket then fail over to the new site holding the ticket. The nodes that ran the resources before are treated according to the loss-policy you set within the rsc_ticket constraint.
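The time-out rule described above can be illustrated with a small model. The following Python sketch is not Booth code; the class name TicketLease, its methods, and the 60-second time-out are hypothetical and only show why a ticket cannot be held by two sites at once as long as the new grant waits for the old lease to run out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TicketLease:
    """Toy model of a ticket that expires unless its owner renews it.

    Hypothetical illustration only; this is not how boothd is implemented.
    """
    owner: Optional[str]
    expires_at: float  # point in time at which the owner must give up the ticket

    def renew(self, now: float, timeout: float) -> None:
        # The current owner extends its lease while it can still reach a majority.
        self.expires_at = now + timeout

    def holder(self, now: float) -> Optional[str]:
        # After the time-out, the ticket counts as relinquished.
        return self.owner if now < self.expires_at else None

    def regrant(self, new_owner: str, now: float, timeout: float) -> bool:
        # Refuse to move the ticket while the previous lease may still be valid.
        if self.owner not in (None, new_owner) and now < self.expires_at:
            return False
        self.owner = new_owner
        self.expires_at = now + timeout
        return True


lease = TicketLease(owner="siteA", expires_at=0.0)
lease.renew(now=0.0, timeout=60.0)
print(lease.regrant("siteB", now=30.0, timeout=60.0))  # lease of siteA still running
print(lease.regrant("siteB", now=61.0, timeout=60.0))  # lease of siteA has run out
```

In this model, the first attempt to move the ticket is rejected and the second succeeds, mirroring the guarantee that a ticket is only redistributed after the previous site has relinquished it.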

Before Booth can manage a ticket within the multi-site cluster, you need to grant it to a site manually with the booth client command. After this initial grant, the booth mechanism takes over and manages the ticket automatically.

Important

The booth client command line tool can be used to grant, list, or revoke tickets. The booth client commands work on any machine where the booth daemon is running.

If you are managing tickets via Booth, use only booth client for manual intervention, not crm_ticket. This ensures that the same ticket is owned by only one cluster site at a time.

Booth includes an implementation of the Paxos and Paxos Lease algorithms, which guarantee distributed consensus among the cluster sites.

Note

Arbitrator

Each site runs one booth instance that is responsible for communicating with the other sites. If you have a setup with an even number of sites, you need an additional instance to reach consensus about decisions such as failover of resources across sites. In this case, add one or more arbitrators running at additional sites. Arbitrators are single machines that run a booth instance in a special mode. As all booth instances communicate with each other, arbitrators help to make more reliable decisions about granting or revoking tickets.

An arbitrator is especially important for a two-site scenario: For example, if site A can no longer communicate with site B, there are two possible causes for that:

A network failure between A and B.
Site B is down.

Site A cannot tell these two cases apart on its own. However, if site C (the arbitrator) can still communicate with site B, site B must still be up and running.
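The role of the arbitrator can be sketched as a simple majority count. The following Python example is a simplified illustration, not Booth's actual Paxos-based voting; the function names visible_members and has_majority and the site labels are hypothetical. It assumes that a site can only decide about a ticket if it, together with the booth instances it can reach directly, forms more than half of all instances.

```python
from typing import FrozenSet, Iterable, Set

Link = FrozenSet[str]  # an undirected working connection between two booth instances


def visible_members(site: str, members: Iterable[str], links: Set[Link]) -> Set[str]:
    """Return the site itself plus every member it can reach directly."""
    return {site} | {m for m in members
                     if m != site and frozenset((site, m)) in links}


def has_majority(site: str, members: Iterable[str], links: Set[Link]) -> bool:
    """True if the site sees more than half of all booth instances."""
    members = list(members)
    return 2 * len(visible_members(site, members, links)) > len(members)


# Two sites without an arbitrator: if the link fails, neither side has a majority.
print(has_majority("A", ["A", "B"], set()))

# Two sites plus arbitrator C, and site B is down: A and C still form a majority.
three = ["A", "B", "C"]
b_down = {frozenset(("A", "C"))}
print(has_majority("A", three, b_down), has_majority("B", three, b_down))
```

With only two instances, a broken link leaves each side with exactly half of the votes, so no decision is possible. Adding the arbitrator makes the total odd, so the side that can still reach C keeps a majority.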

15.5.2.1.1. Requirements

All clusters that will be part of the multi-site cluster must be based on Pacemaker.
Booth must be installed on all cluster nodes and on all arbitrators that will be part of the multi-site cluster.

The most common scenario is probably a multi-site cluster with two sites and a single arbitrator at a third site. Technically, however, there is no limit on the number of sites or arbitrators involved.

Nodes belonging to the same cluster site should be synchronized via NTP. However, time synchronization is not required between the individual cluster sites.