Using the repl features before riak_repl has fully initialized on
newly started nodes leads to all sorts of nasty crashes and noise.
Frequently it leaves fullsync stuck forever, which makes a lot of
the tests fail.
This also tweaks the AAE fullsync tests to remove assumptions about
failure stats when transient AAE errors occur. The handling of those
errors changed recently with the introduction of soft exits.
* Do not attempt to cancel fullsync if the initial attempt to start
it and wait for completion fails. The observed problem is not that
fullsync starts and fails to complete in time, but that the initial
call to start fullsync does not take effect. Therefore the
cancellation is unnecessary.
* Replace the call to repl_util:wait_for_connection/2 in the node
upgrade process with a call to
replication:wait_until_connection/1. This function is geared towards
v2 replication and should speed up test execution.
As of commit 3044839456, a test that returns anything other than the
prescribed success atom 'pass' is treated as a failure. Change the
replication_upgrade and replication2_upgrade tests, which returned
the result of a call to lists:foreach/2, to return 'pass' instead to
indicate success.
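
A minimal sketch of the return-value issue, assuming a test entry
point named confirm/0; the module name, node names, and check_node/1
are hypothetical stand-ins, not the actual test code. Since
lists:foreach/2 always returns 'ok', a test ending with that call
never returns 'pass' unless it does so explicitly:

```erlang
-module(pass_sketch).
-export([confirm/0]).

%% Hypothetical test entry point illustrating the fix.
confirm() ->
    Nodes = [node_a, node_b],
    %% lists:foreach/2 always returns the atom ok, never pass.
    ok = lists:foreach(fun check_node/1, Nodes),
    %% Return the success atom explicitly.
    pass.

%% Placeholder for the per-node work done by the real tests.
check_node(_Node) ->
    ok.
```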
Prior to Riak 1.4.8, replication registers as a service before
completing all initialization tasks, including establishing realtime
connections to sink clusters. This leads to a race condition in the
replication_upgrade and replication2_upgrade tests where the test may
begin writing data to the source cluster to verify the function of
realtime replication before the most recently upgraded node
establishes a connection to the sink cluster. As a result, the
realtime replication system silently discards the data, and the test
fails because not all of the expected data is replicated to and
readable from the sink cluster. Change the
replication_upgrade and replication2_upgrade tests to explicitly wait
for the realtime connection to be established after each source
cluster node is upgraded before proceeding with the test.
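
The wait-after-each-upgrade approach can be sketched roughly as
follows. This is only an illustration under stated assumptions:
wait_until/3, upgrade_all/3, and the UpgradeFun and ConnectedFun
callbacks are hypothetical names, not the helpers used by the tests.

```erlang
-module(wait_sketch).
-export([wait_until/3, upgrade_all/3]).

%% Poll Fun until it returns true, trying at most Retries times and
%% sleeping DelayMs milliseconds between attempts.
wait_until(Fun, Retries, DelayMs) when Retries > 0 ->
    case Fun() of
        true ->
            ok;
        _ when Retries =:= 1 ->
            {error, timeout};
        _ ->
            timer:sleep(DelayMs),
            wait_until(Fun, Retries - 1, DelayMs)
    end.

%% Upgrade each node in turn and block until its realtime connection
%% is reported as established before moving on to the next node.
%% UpgradeFun and ConnectedFun are hypothetical callbacks.
upgrade_all(Nodes, UpgradeFun, ConnectedFun) ->
    lists:foreach(
      fun(Node) ->
              UpgradeFun(Node),
              ok = wait_until(fun() -> ConnectedFun(Node) end, 10, 100)
      end,
      Nodes),
    pass.
```

Matching on 'ok' inside the loop makes a connection that never comes
up crash the test at the node that caused it, rather than surfacing
later as missing data on the sink cluster.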