Because list-keys and list-buckets use coverage, we might hit latent
replicas depending on the coverage plan. This gives each call some extra
tries to complete successfully.
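The retry idea can be sketched as follows. This is a language-neutral
illustration in Python, not the riak_test code: with_retries, its
attempts/delay parameters, and the choice to treat any exception as
retryable are all assumptions made for the example.

```python
import time


def with_retries(operation, attempts=3, delay=0.0):
    """Call operation() up to `attempts` times and return the first success.

    Hypothetical helper: every exception is treated as a transient
    failure (for example, a coverage plan that hit a lagging replica).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except Exception as error:
            last_error = error
            time.sleep(delay)  # give the cluster a moment before retrying
    # Every attempt failed: surface the most recent error to the caller.
    raise last_error
```

A caller would wrap the listing call, e.g. with_retries(list_keys,
attempts=5), where list_keys stands in for whatever client call the
test makes.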
Now green when run in sequence:
Test Results:
pb_cipher_suites-bitcask: pass
pb_security-bitcask : pass
---------------------------------------------
0 Tests Failed
2 Tests Passed
That's 100.0% for those keeping score
This changes the test assertion so that it retries fetching the value
from the second cluster until it is the expected value, at which point
the test will either pass if the sibling count is reasonable or fail if
it is too damn high.
Fetch the sink object on each iteration of the wait_until, in case the
entire set of siblings didn't make it across the repl link.
This also gives read-repair a chance to happen, in case the version the
sink wrote didn't make it to all replicas.
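A minimal sketch of the refetch-inside-the-loop pattern, written in
Python for illustration only. wait_until here is a stand-in for the
test helper of the same name, and wait_for_value, fetch, and
max_siblings are hypothetical names, not part of riak_test.

```python
import time


def wait_until(predicate, retries=10, delay=0.0):
    """Poll predicate() until it returns True or the retries run out."""
    for _ in range(retries):
        if predicate():
            return True
        time.sleep(delay)
    return False


def wait_for_value(fetch, expected_value, max_siblings, retries=10, delay=0.0):
    """Refetch on every iteration, then judge the sibling count.

    fetch() is assumed to return the list of sibling values currently
    visible on the sink cluster.
    """
    latest = {"siblings": []}

    def value_arrived():
        # Fetching inside the predicate means each retry sees fresh data
        # and gives read-repair a chance to run on the sink.
        latest["siblings"] = fetch()
        return expected_value in latest["siblings"]

    if not wait_until(value_arrived, retries, delay):
        return False  # the expected value never showed up
    # The value arrived: pass only if the sibling count is reasonable.
    return len(latest["siblings"]) <= max_siblings
```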
Trying to use the repl features before newly started nodes have
riak_repl completely initialized leads to all sorts of nasty crashes and
noise. Frequently it leaves fullsync stuck forever, which makes a lot of
the tests fail.
This also tweaks the AAE fullsync tests to remove assumptions about
failure stats when AAE transient errors occur. The behavior in the
handling of those errors has changed recently with the introduction of
soft exits.
Reduces the probability of a race condition between the calculation of
spiral/histogram metrics and the SNMP stat cache refresh by reducing the
SNMP poll interval to 1 second during test execution.
Don't wait for convergence of the ring, because bucket properties are no
longer stored in the ring; instead, wait until the property changes,
which means the gossip has stabilized.
Add an rt:admin/3 function that accepts a list of options as the third
parameter. Currently the only valid option is return_exit_code. The
rtdev, rtssh, and rt_cs_dev harnesses have been updated to support
this option. If return_exit_code is specified, the return from a
?HARNESS:admin call is a pair with the exit code as the first member
and the command output as the second member. Finally, the
basic_command_line test has been changed to use return_exit_code to
verify the changes.
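The shape of the option can be illustrated with a small Python
analogue. This is not the Erlang harness code: the admin function
below, its argument list, and the use of subprocess are assumptions
chosen only to show the return value changing when return_exit_code
is present.

```python
import subprocess


def admin(command, options=()):
    """Run a command and return its output.

    Hypothetical analogue of an admin call that takes an options list:
    with "return_exit_code" the result is an (exit_code, output) pair,
    otherwise just the output.
    """
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into the captured output
        text=True,
    )
    if "return_exit_code" in options:
        return (result.returncode, result.stdout)
    return result.stdout
```

Keeping the plain-output return as the default means existing callers
are unaffected; only callers that pass the option need to handle the
pair.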
Due to the refactor of the cluster manager/connection manager system to
use OTP behaviors, the raw message method of getting stats has been
removed; stats are now fetched with a call. So that riak_test can check
older clusters as well as newer ones, the function was extended to try
the new method first and then fall back to the old one.
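The new-then-old fallback can be sketched like this. The Python below
is purely illustrative: NewNode, OldNode, call_stats, and
raw_message_stats are hypothetical names, and using
NotImplementedError to signal "this node does not support the call
interface" is an assumption of the example.

```python
class NewNode:
    """Stand-in for a node that supports the call-based stats interface."""

    def call_stats(self):
        return {"source": "call"}

    def raw_message_stats(self):
        raise NotImplementedError("raw message interface was removed")


class OldNode:
    """Stand-in for an older node that only answers raw messages."""

    def call_stats(self):
        raise NotImplementedError("call interface not available")

    def raw_message_stats(self):
        return {"source": "raw_message"}


def get_stats(node):
    """Try the newer interface first, then fall back to the older one."""
    try:
        return node.call_stats()
    except NotImplementedError:
        return node.raw_message_stats()
```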
Add testing of the handoff heartbeat change from the following pull
request: https://github.com/basho/riak_core/pull/560. Add an intercept
module for the riak_core_handoff_sender module to introduce artificial
delay on item visitation during a handoff fold. This delay along with
the changes to the verify_handoff test induces test failure when run
without the heartbeat change. The handoff_receive_timeout is exceeded,
handoff stalls, and the test eventually fails due to timeout. The test
succeeds when run with the heartbeat change.