This fixes a rare race condition in ensemble_sync: wait_for_stable
could run before an ensemble had actually become unstable. The wait
would then pass, but the ensemble could still become unavailable during
the next step in the test.
By waiting for the ensemble leader tick counts to increment, we can
guarantee that any failures will have been "noticed" before we call
wait_for_stable, because the leader_tick function checks that a quorum
is present each time it executes, and the leader steps down if it fails
to get one.
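The wait described above can be sketched roughly as follows. This is an
illustrative Python sketch, not the test's actual code: the
wait_for_ticks helper and the read_tick_counts callback are
hypothetical names, and the tick counts are assumed to be available as
a mapping from ensemble to an integer counter.

```python
import time


def wait_for_ticks(read_tick_counts, timeout=5.0, interval=0.01):
    """Return True once every ensemble's tick count exceeds its baseline.

    read_tick_counts is a hypothetical callback returning a dict of
    {ensemble: tick_count}. Returns False if the timeout expires first.
    """
    baseline = read_tick_counts()
    deadline = time.monotonic() + timeout
    while True:
        current = read_tick_counts()
        # Every ensemble seen in the baseline must have ticked at least once.
        if all(current.get(ens, 0) > count for ens, count in baseline.items()):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Only after this returns True would the stability wait be started, so
that each leader has had at least one chance to check for a quorum.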
The naive network partitioning strategy could let the 'minority'
network partition effectively become the majority. When a preference
list of 5 keyspace partitions was created on only 4 distinct nodes, a
2-node 'minority' network partition group could hold a majority of the
keyspace partitions, because 2 keyspace partitions were assigned to 1
node in the 'minority' group. This was fixed so that the 'majority'
group now always has a majority of keyspace partitions: nodes that own
more than 1 keyspace partition are prevented from becoming part of the
'minority' group.
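A minimal sketch of that selection rule, in Python for illustration
only. The split_for_partition function and the node names are
hypothetical, and the preference list is assumed to be given as one
owning node per keyspace partition.

```python
from collections import Counter


def split_for_partition(preflist_owners, minority_size):
    """Split nodes into (majority, minority) groups for a network partition.

    preflist_owners lists the owning node of each keyspace partition in
    the preference list. Nodes owning more than one keyspace partition
    are never placed in the minority group.
    """
    if 2 * minority_size >= len(preflist_owners):
        raise ValueError("minority group would not hold a minority")
    owned = Counter(preflist_owners)
    nodes = sorted(owned)
    # Only nodes that own exactly one keyspace partition are eligible.
    eligible = [n for n in nodes if owned[n] == 1]
    if len(eligible) < minority_size:
        raise ValueError("not enough single-partition nodes")
    minority = eligible[:minority_size]
    majority = [n for n in nodes if n not in minority]
    return majority, minority


# 5 keyspace partitions on 4 nodes: dev1 owns two of them.
owners = ["dev1", "dev1", "dev2", "dev3", "dev4"]
majority, minority = split_for_partition(owners, 2)
```

Because each minority node owns exactly one keyspace partition, a
2-node minority holds 2 of the 5 keyspace partitions, and the majority
group holds the remaining 3.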
Add ensemble_basic4, ensemble_sync, and ensemble_interleave tests.
ensemble_sync tests the new AAE-based peer syncing logic. The test
checks various scenarios with different levels of data corruption.
ensemble_interleave tests a specific scenario in which two peers become
corrupted one after the other, so that the second peer becomes
untrusted while the first peer may be syncing with it.