Add testing of the handoff heartbeat change from the following pull
request: https://github.com/basho/riak_core/pull/560. Add an intercept
module for the riak_core_handoff_sender module to introduce artificial
delay on item visitation during a handoff fold. This delay, along with
the changes to the verify_handoff test, causes the test to fail when run
without the heartbeat change: the handoff_receive_timeout is exceeded,
handoff stalls, and the test eventually fails due to timeout. The test
succeeds when run with the heartbeat change.
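A minimal sketch of what such a delaying intercept could look like. It
assumes the sender's per-item fold function is visit_item/3 and that the
intercept framework exposes the original code as
riak_core_handoff_sender_orig:visit_item_orig/3. The function name
delayed_visit_item_3 and the 100 ms delay are illustrative and not taken
from the change itself.

```erlang
-module(riak_core_handoff_sender_intercepts).
-export([delayed_visit_item_3/3]).

%% Assumption: the intercept framework renames the intercepted module to
%% <module>_orig and each original function to <function>_orig.
-define(M, riak_core_handoff_sender_orig).

%% Illustrative per-item delay in milliseconds (hypothetical value).
-define(DELAY_MS, 100).

%% Sleep before delegating to the original visit_item/3, so that a fold
%% over many items runs longer than the receiver is willing to wait.
delayed_visit_item_3(K, V, Acc) ->
    timer:sleep(?DELAY_MS),
    ?M:visit_item_orig(K, V, Acc).
```

Under the same assumptions, a test would presumably install it with
something like rt_intercept:add(Node, {riak_core_handoff_sender,
[{{visit_item, 3}, delayed_visit_item_3}]}). The module compiles on its
own but only does useful work inside a node where the intercept has been
loaded.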
You can now choose to save the intercept modules in basho-patches so
that they are loaded on a restart. This is useful for modifying Riak's
behavior at startup time.
Changed intercept to explicitly return `{error, econnrefused}`. Moved
helper functions to `repl_util` and added a new helper to distinguish
between disconnects on `cluster_by_name` and `cluster_by_address`
connections.
Added asserts to all wait_for functions.
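To illustrate the idea, here is a self-contained sketch of a wait_for
style helper that asserts on the result of polling instead of silently
returning it. The module name and both functions are hypothetical
stand-ins and are not the actual helpers touched by this change.

```erlang
-module(wait_for_sketch).
-export([wait_until/3, wait_for_true/1]).
-include_lib("stdlib/include/assert.hrl").

%% Hypothetical polling helper: call Fun up to Retries times, sleeping
%% DelayMs between attempts. Returns ok as soon as Fun() returns true,
%% or {fail, LastResult} once the retries are exhausted.
wait_until(Fun, Retries, DelayMs) when Retries > 0 ->
    case Fun() of
        true ->
            ok;
        Res when Retries =:= 1 ->
            {fail, Res};
        _ ->
            timer:sleep(DelayMs),
            wait_until(Fun, Retries - 1, DelayMs)
    end.

%% Asserting here means a condition that never becomes true fails the
%% test at the point of the wait rather than somewhere later on.
wait_for_true(Fun) ->
    ?assertEqual(ok, wait_until(Fun, 5, 10)).
```

With the assert in place, a caller no longer has to remember to check
the return value of each wait.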
Allow intercept functions passed to rt_intercept:add/2 to be anonymous. In
compiled code they can either be a plain anonymous function, assuming they
don't use any variables from the surrounding context, or they can be a
2-tuple like this:
    {[FreeVar1, ...],
     fun(Arg1, ...) -> ... end}
where [FreeVar1, ...] is the list of free variables to be closed over so
that they can be used within the anonymous function. For making interactive
calls to rt_intercept:add/2 from the Erlang shell, only the anonymous
function form is required, even if it uses free variables, though the
2-tuple form is also acceptable.
For compiled code, support for anonymous intercept functions is implemented
via a parse transform, and so to use anonymous functions the intercept
structure(s) containing them must be defined directly inline as part of the
final argument to rt_intercept:add/2, i.e., they cannot be first assigned
to a variable that is then used within the argument. This is because the
value of such a variable might not be visible to the parse transform.
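A sketch of the 2-tuple form in compiled code, written inline as the
final argument to rt_intercept:add/2. It assumes the intercept spec has
the shape {Module, [{{Function, Arity}, Intercept}]} and that the
original function is reachable as <module>_orig:<function>_orig. The
module name, the helper add_delay_intercept/2, and the choice of
visit_item/3 as the target are illustrative only, and enabling the parse
transform is left to the build configuration.

```erlang
-module(anon_intercept_sketch).
-export([add_delay_intercept/2]).

%% Hypothetical helper: delay each intercepted call on Node by DelayMs.
%% DelayMs is a free variable of the fun, so it is listed in the first
%% element of the 2-tuple. The whole structure is written directly in
%% the call rather than bound to a variable first.
add_delay_intercept(Node, DelayMs) ->
    rt_intercept:add(
      Node,
      {riak_core_handoff_sender,
       [{{visit_item, 3},
         {[DelayMs],
          fun(K, V, Acc) ->
                  timer:sleep(DelayMs),
                  %% Assumption: <module>_orig / <function>_orig naming.
                  riak_core_handoff_sender_orig:visit_item_orig(K, V, Acc)
          end}}]}).
```

The module compiles without the test harness present, but calling
add_delay_intercept/2 requires rt_intercept to be on the code path and a
running node to target.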
Add a description of anonymous function intercepts to the README.
Add a test which ensures that the AAE source worker doesn't deadlock when
waiting for responses from the process which is computing the hashtree
differences.
Unfortunately, this test uses timeouts because as the code currently
stands, I can't figure out a way to make it any cleaner.
Ensure that AAE replication is tested using all possible failure cases
when dealing with the riak_kv_index_hashtrees and failed connections.
First, use intercepts on riak_kv_vnode and riak_kv_index_hashtree to
simulate errors on a per-node basis, starting with the source cluster
and moving to the sink. Simulate ownership transfers as well as locked
and incomplete hashtrees. Verify that partitions generate the correct
error count after a bounded number of retries. Finally, remove all
intercepts and verify that the fullsync completes and all keys have
been migrated between the two clusters.
Verify that we still connect to remote sinks if the leader election
occurs before the ring has converged.
This test uses an intercept to delay the ring events until after the
election.
- Add tests to validate that timeouts are working correctly for
  all variations of list buckets and list keys (streaming and
  non-streaming, with timeouts that are too short and long enough).
- Add an intercept in the right place to simulate delays when large
  numbers of keys/buckets are returned.