Commit Graph

39 Commits

Author SHA1 Message Date
Doug Rohrer
da28931e4e Remove handling of timeout, as old-school Riak connection nodes would also result in a timeout. Try new riak_core_cluster_conn:status() first, then fall back to older 1.4.X style bare send. Lather, rinse, repeat. 2015-04-14 15:13:15 -04:00
Doug Rohrer
c603e8be14 Fix test hang when riak_core_cluster_conn:status failed to respond after 2 milliseconds.
Should resolve test failures with a message similar to:

@riak_core_cluster_conn:handle_info:402 Unmatch message {<20563.30238.10>,status}

in the server logs.
2015-04-13 16:37:43 -04:00
Micah Warren
d1891f69fd Fixed cluster connection determination function.
Due to the refactor for the cluster manager/connection manager system to
use otp behaviors, the raw message method of getting stats has been ousted.
Instead, it uses a call. To let riak_test check older clusters as well
as newer ones, the function was extended to try the new method first and
then fall back to the old one.
2014-09-24 15:42:26 -05:00
Kelly McLaughlin
4b9a77c828 Re-initiate fullsync after a number of failed checks for completion
Re-initiate fullsync after 100 failed checks for completion. The
number of retries of the 'start fullsync and then check for
completion' cycle is configurable using
repl_util:start_and_wait_until_fullsync_complete/4 and defaults to 20
retries. This change is to avoid spurious test failures due to a rare
condition where the rpc call to start fullsync fails to actually
initiate the fullsync. A very similar change to the version of
start_and_wait_until_fullsync_complete in the replication module,
introduced in 0a36f9974c, has been effective
at avoiding this condition for v2 replication tests.
2014-06-19 14:34:56 -06:00
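The retry cycle this commit describes can be sketched roughly as follows. This is a minimal illustration in Python, not the Erlang code from repl_util; the function name, the callback parameters, and the `delay` argument are all hypothetical, and only the defaults (100 checks per attempt, 20 retries) come from the commit message.

```python
import time
from typing import Callable


def start_and_wait_until_complete(
    start: Callable[[], None],
    is_complete: Callable[[], bool],
    checks_per_attempt: int = 100,
    retries: int = 20,
    delay: float = 0.0,
) -> bool:
    """Start an operation, poll for completion, and restart it if polling fails.

    Hypothetical sketch: `start` and `is_complete` stand in for the rpc call
    that starts fullsync and the check for its completion.
    """
    for _ in range(retries):
        # (Re-)initiate the operation; the start call may silently fail.
        start()
        for _ in range(checks_per_attempt):
            if is_complete():
                return True
            if delay:
                time.sleep(delay)
    # Every attempt used up its checks without seeing completion.
    return False
```

Re-issuing the start call after a bounded number of failed checks means a single lost start request costs one polling window rather than failing the whole test.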
Kelly McLaughlin
2f9a3cae4a Update calls to rt:systest_read to handle identical siblings
Update the calls to rt:systest_read in repl_util and
repl_aae_fullsync_util to treat identical siblings resulting from the
use of DVV as a single value.  These changes are specifically to
address failures seen in the repl_aae_fullsync_custom_n and
replication_object_reformat tests, but should be generally useful for
replication tests that use the utility modules and that have
allow_mult set to true.
2014-06-18 14:33:44 -06:00
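The idea of treating identical siblings as a single value can be illustrated with a small sketch. It is written in Python for brevity and is not the rt:systest_read implementation; the function name and the return shapes are assumptions made for this example.

```python
from typing import List, Tuple, Union


def resolve_read(siblings: List[bytes]) -> Tuple[str, Union[bytes, List[bytes]]]:
    """Collapse identical sibling values into one value (hypothetical helper).

    Returns ("ok", value) when every sibling carries the same value, and
    ("siblings", values) when genuinely different values remain.
    """
    if not siblings:
        raise ValueError("no values returned for key")
    unique = set(siblings)
    if len(unique) == 1:
        # All siblings are byte-for-byte identical: treat as a single value.
        return ("ok", next(iter(unique)))
    # Distinct values are a real conflict; report them in a stable order.
    return ("siblings", sorted(unique))
```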
bsparrow435
c89de8dac9 Address PR comments
Changed intercept to explicitly return `{error, econnrefused}`. Moved
helper functions to `repl_util` and added a new helper to distinguish
between disconnects on `cluster_by_name` and `cluster_by_address`
connections.

Added asserts to all wait_for functions.
2014-06-04 19:51:21 -04:00
Christopher Meiklejohn
b1752e0a26 Merge pull request #617 from basho/feature/csm/location-down
Add ability to test the location_down behavior.
2014-05-30 10:49:59 -04:00
Christopher Meiklejohn
6cebbd371a Add ability to test the location_down behavior.
Assert that we properly handle retries of failed partitions, when the remote location happens to be down.
2014-05-28 13:59:23 +02:00
Andrew J. Stone
8c3beedcc8 Make repl_cancel_fullsync more robust
* add repl_util:wait_until_fullsync_started/1
* add repl_util:wait_until_fullsync_stopped/1
* remove timeouts and use above calls to confirm our test is in the
  right state
2014-05-27 17:41:15 -04:00
Engel A. Sanchez
bda1b5c3cf Merge pull request #543 from basho/refactor/one-wait-4-aae-trees
Merge repl and rt versions of wait until AAE trees build
2014-05-12 15:54:36 -04:00
Engel A. Sanchez
31588c5d22 Replacing repl_util AAE wait with rt version 2014-05-12 14:55:59 -04:00
Andrew Thompson
6c4afcbcde Switch all the selfsigned certificates to be generated on demand
Using the make_cert tool we can generate arbitrary certificate chains on
demand, so they never have to be regenerated.
2014-05-09 14:46:52 -04:00
Christopher Meiklejohn
bd0721ec32 Merge pull request #588 from basho/feature/csm/repl-stats-1
Add basic moving target stats test.
2014-04-24 12:00:02 +01:00
Christopher Meiklejohn
09f7f88776 Add basic moving target stats test.
Use this as a platform to start testing reports of missing stats from
replication.
2014-04-22 10:38:33 +01:00
Christopher Meiklejohn
07b91fab36 Refactor test to assert downgrade.
When performing the test of object reformatting through replication,
assert that if we happen to downgrade the format we can still read the
keys which have been replicated.
2014-04-20 16:09:43 +00:00
Joseph Blomstedt
1b7a65d1fc Merge pull request #544 from basho/feature/rtssh
Add rtcloud support; rtssh harness.
2014-03-31 14:54:05 -07:00
Kelly McLaughlin
b106abb87f Address some replication test failures due to cluster race conditions
Change some of the helper functions in the repl_util module to handle
errors more sensibly so that cluster setup race conditions do not
cause unnecessary test failures.
2014-03-27 14:07:05 -06:00
Christopher Meiklejohn
1347ac91c1 Remove harness specific code.
Remove some harness specific code which isn't even really that
valuable at this point.
2014-02-19 22:10:32 +00:00
Kelly McLaughlin
ecc5dfb25c Fix problem with repl_util:wait_until_leader_converge function
The wait_until_leader_converge function could incorrectly return
success if all of the results from the get_leader rpc calls were
either undefined or all returned a badrpc tuple. In either case the
particular result ends up as the sole unique value in a list and the
success condition is verifying that the list is of length 1 regardless
of the value of the member of the list. Change the function to filter
the list of results for values that indicate failure prior to the
success condition checking.
2014-02-19 14:10:32 -07:00
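The bug and the fix described in this commit can be shown with a short sketch. It is Python rather than the Erlang in repl_util, and the representation of results (None for undefined, a ("badrpc", reason) tuple for a failed rpc) is an assumption chosen for the example.

```python
from typing import Hashable, List


def naive_converged(results: List[Hashable]) -> bool:
    # Buggy check: one unique value counts as success, even if that value
    # is None or a badrpc tuple returned by every node.
    return len(set(results)) == 1


def is_failure(result: Hashable) -> bool:
    # Assumed failure markers: None (undefined) or ("badrpc", reason).
    if result is None:
        return True
    return isinstance(result, tuple) and len(result) > 0 and result[0] == "badrpc"


def leaders_converged(results: List[Hashable]) -> bool:
    # Fixed check: drop failure values first, then require one unique leader.
    valid = [r for r in results if not is_failure(r)]
    return len(set(valid)) == 1
```

In this sketch a mix of failures and one agreed leader still counts as converged; the commit message does not say whether the real function tolerates partial failures, so treat that as an open choice.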
Jon Anderson
ad9af013f4 added backward compatibility test for fullsync 2014-02-13 13:51:33 -07:00
Christopher Meiklejohn
494cd2deb5 General code cleanup. 2014-01-16 16:29:24 -05:00
Andrew Thompson
ab59896b24 Fix several bugs in the test refactor, and adjust some assumptions 2014-01-02 16:28:05 -05:00
Christopher Meiklejohn
0964abb5e3 Add an exhaustive test for AAE based replication.
Ensure that AAE replication is tested using all possible failure cases
when dealing with the riak_kv_index_hashtrees and failed connections.

First, use intercepts on riak_kv_vnode and riak_kv_index_hashtree to
ensure that we simulate errors on a per node basis, starting with the
source cluster and moving to the sink.  Simulate ownership transfers,
locked and incomplete hashtrees.  Verify partitions generate the correct
error count, after using a bounded set of retries, and finally remove
all intercepts and verify that the fullsync completes and all keys have
been migrated between the two clusters.
2014-01-02 16:27:43 -05:00
Micah Warren
d36238dec4 Added reduced repl test.
Also moved some code that would make these tests easier to setup into
the repl_util module.
2013-08-29 14:49:41 -05:00
Engel Sanchez
8d957817c8 Merge pull request #299 from basho/mw-fix-repl-rt-cascading
Mw fix repl rt cascading
2013-06-19 12:03:14 -07:00
Chris Tilt
9ad27ac759 Merge pull request #306 from basho/cet-successful-exists-test
Test for fullsync's successful_exists status item
2013-06-14 11:16:08 -07:00
Andrew Thompson
2a95faf27a Can't wait for riak_repl service because we may be building a 1.2 cluster 2013-06-14 13:14:25 -04:00
Chris Tilt
571e92d6ed Test for fullsync's successful_exists status item
* After a fullsync, make sure that the number of successful exits equals the number of partitions, since each partition should have one successful fullsync source process exit.
* We could extend this test to handle the parts of replication2 that have down nodes in the future.
* Added a utility function to get the number of partitions from a node, plus a function to get fullsync status items.
* Test passes with changes from branch cet-successful-exists-fix
2013-06-13 12:57:23 -07:00
Dave Parfitt
81ea083a06 Merge pull request #294 from basho/adt-pull-291
Cherry-pick relevant changes from #291 to master
2013-06-10 14:06:48 -07:00
Micah Warren
4fe8f5ab08 Made is_leader and is_not_leader log messages useful. 2013-06-05 11:44:56 -05:00
Andrew Thompson
a45555ae15 Make sure nodes are ready before trying to join them 2013-06-04 17:13:56 -04:00
Andrew Thompson
a24c091f27 Fix wait_for_connection to tolerate badrpc 2013-06-04 14:03:24 -04:00
Chris Tilt
2415895306 Merge pull request #282 from basho/pdx-aae-fullsync
Riak test for AAE fullsync
2013-05-22 15:28:26 -07:00
Chris Tilt
1a9bb3d883 AAE fullsync works, plus some new AAE r_t utils 2013-04-19 12:57:58 -07:00
Andrew Thompson
84a2856ce3 Add replication2 SSL test, fix wait_for_connection
Wait for connection was considering a connection *trying* to connect as
a valid connection, which is wrong.
2013-04-03 11:44:47 -04:00
Chris Tilt
d80c30025c Remove export all from repl tests; make repl_util API explicit 2013-01-25 10:43:29 -08:00
Chris Tilt
1a61e06634 Moved common repl2 functions to repl_util and added log_to_nodes tracing to replication2_dirty test 2013-01-25 10:12:34 -08:00
Chris Tilt
b0ca6fa2c5 Move log_to_nodes into rt module for general use. 2013-01-24 15:51:33 -08:00
Chris Tilt
a076531785 Started refactor and added cluster-wide logging from tests. 2013-01-24 14:24:16 -08:00