1044 Commits

Author SHA1 Message Date
Jon Anderson
747212678b Merge pull request #676 from basho/bugfix/jra/bug-611
Add repl_bucket_types check that RTQ is drained after bucket type mis-match
2014-08-21 20:52:29 -04:00
Jon Anderson
613bdc29f1 Change rtq drainage check to use dumpq and put it in a function. 2014-08-20 08:07:26 -04:00
Jon Anderson
f36abd590d Add a check to make sure the RTQ queues drain after an unknown bucket
type repl.
2014-08-19 16:36:45 -04:00
Sean Cribbs
b8d8e3026b Verify that datatype stats are updated.
See basho/riak_kv#1017
2014-08-18 15:24:20 -07:00
Kelly McLaughlin
3577f476e6 Updates to bucket property validation test
* Rename the module from validate_nval_etc to bucket_props_validation
* Employ testing of protocol buffers connections in addition to HTTP
2014-08-13 16:32:34 -06:00
Russell Brown
4e7936da17 Riak 2.0 allow_mult defaults to true 2014-08-13 14:15:25 -06:00
Russell Brown
f100468e11 Test for bad bucket property validation 2014-08-13 14:15:25 -06:00
Christopher Meiklejohn
ccc4d403d4 Add yz test for search over maps. 2014-07-29 12:33:16 -04:00
Kelly McLaughlin
ceb24fc3e2 Merge pull request #661 from basho/bugfix/replication-ssl-site-ip-verification
Fix failure of replication_ssl test introduced by 297090d
2014-07-15 15:08:50 -06:00
Kelly McLaughlin
0ab2393559 Change replication SSL ACL tests to avoid certificate expiration
Change the ACL test case in the replication_ssl and replication2_ssl
tests to use certificates generated within the tests instead of
relying on certificates created outside the test that are prone to
expire and cause spurious test failure.

Also change the replication_ssl and replication2_ssl tests to avoid a
cycle of standing up the test clusters and then immediately restarting
them before any test cases execute. This should make the test
execution slightly faster for both test modules.

This commit also changes the tests to be a bit more robust in checking
for cluster state when restarting nodes and removes an unnecessary
five second sleep call in the replication_ssl test.
2014-07-15 12:06:06 -06:00
Kelly McLaughlin
9c5daf0f31 Fix failure of replication_ssl test introduced by 297090d
Change replication_ssl to use the wait_for_site_ips function from the
replication module introduced in
297090ded6 instead of the defunct
verify_site_ips function.
2014-07-14 12:37:32 -06:00
Russell Brown
a212b99a75 Update expected return to match change in riak_pb API
See 2b68a97710
for details.
2014-07-14 17:14:51 +01:00
Joseph Blomstedt
695853cc94 Merge pull request #657 from basho/bugfix/ensemble-interleave-error-failed
Fix ensemble_interleave error condition
2014-07-11 21:42:59 -07:00
Andrew J. Stone
f0643db473 Fix ensemble_sync by allowing {error, <<"failed">>}
Allow {error, <<"failed">>} as an error response in ensemble_sync. Fixes
the test with basho/riak_ensemble#37 and basho/riak_kv#1002
2014-07-11 18:11:30 -04:00
Andrew J. Stone
38bd8399d1 Fix ensemble_interleave error condition
Include {error, <<"failed">>} as allowed failure so that test passes
with changes for basho/riak_ensemble#37 and basho/riak_kv#1002
2014-07-11 17:41:48 -04:00
Kelly McLaughlin
297090ded6 Avoid a race condition in the replication test module
Avoid a race condition in the replication test module when checking
for site IP addresses in the replication status output.  The test
waits for a connection on the leader, but it only queries the
replication status to check for the expected site IP addresses a
single time. Change the test to wait and re-check the status output to
give greater assurance that if the expected site IP addresses are not
present it is due to legitimate failure and not a race condition in
checking the replication status. This change affects the replication
and replication_upgrade tests as well as any other tests that call the
replication:replication function.
2014-07-01 16:36:45 -06:00
Jon Anderson
054c015d10 Merge pull request #651 from basho/feature/jra/verify_listkeys_eqcfsm
Expand verify_listkeys_eqcfsm to track varying buckets and n_vals.
2014-07-01 13:19:41 -04:00
Christopher Meiklejohn
f085f70169 Merge pull request #653 from basho/features/csm/crdt-capability
Prevent autoreconnect problem.
2014-06-30 18:29:05 -04:00
Christopher Meiklejohn
da34719fe3 Prevent autoreconnect problem.
Prevent a situation where the auto-reconnect hasn't triggered yet
causing the result to be an error, instead of ok, on the next operation
after reconnecting.  Force a disconnect and reconnect to make sure the
test is deterministic.
2014-06-30 17:22:20 -04:00
Jon Anderson
7c2d7cc827 Expand verify_listkeys_eqcfsm to track varying buckets and n_vals. 2014-06-26 16:13:58 -04:00
Eric Redmond
0eb2d1c443 Merge pull request #650 from basho/er/yz-ensemble-test
Test that ensemble delete functions in yokozuna
2014-06-23 17:38:09 -07:00
Engel A. Sanchez
3662965705 Merge pull request #649 from basho/feature/ensembles-wait-for-riak-kv
Feature/ensembles wait for riak kv
2014-06-23 14:24:12 -04:00
Eric Redmond
266f9858eb Test the ensemble delete function 2014-06-20 14:44:30 -07:00
Engel A. Sanchez
d32d007f4d Fix service/peer check race
Change to fetching the list of peers first, then checking whether the
riak_kv service is up. If the service is up, then check the peers.
Otherwise it is possible to see the service down, then peers up because
the service went up in the interim.

Also, make the KV vnode delay configurable.
2014-06-20 14:26:24 -04:00
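
The ordering issue described in the commit above can be illustrated with a small simulation. This is only a sketch in Python, not the Erlang test code: `FakeNode`, `racy_check`, and `ordered_check` are hypothetical names, and the sketch assumes (as the related commit 3bf0954253 states) that peers only start once the riak_kv service is up.

```python
class FakeNode:
    """Hypothetical node: the service and its peers come up only after
    `starts_after` observations have been made (simulating elapsed time)."""

    def __init__(self, starts_after):
        self.observations = 0
        self.starts_after = starts_after

    def _started(self):
        # Every observation advances simulated time by one step.
        self.observations += 1
        return self.observations > self.starts_after

    def service_up(self):
        return self._started()

    def peers(self):
        return ["peer1"] if self._started() else []


def racy_check(node):
    """Service first, then peers: may report a violation that never happened."""
    service = node.service_up()
    peers = node.peers()
    return (not service) and bool(peers)


def ordered_check(node):
    """Peers first, then service: a stale peer list cannot look 'too new'."""
    peers = node.peers()
    service = node.service_up()
    return (not service) and bool(peers)


# The service comes up between the first and second observation.
racy_violation = racy_check(FakeNode(starts_after=1))
ordered_violation = ordered_check(FakeNode(starts_after=1))
```

With the service-first ordering the check sees "service down, peers up" even though the invariant was never broken; with the peers-first ordering it does not.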
Engel A. Sanchez
3bf0954253 Test ensemble peers wait for riak_kv service
Now ensemble peers are prevented from starting up until the riak_kv
service is up to avoid nasty races that could even lead to node crashes
as the ensembles frantically query for data that isn't ready.
2014-06-19 23:26:30 -04:00
Kelly McLaughlin
4b9a77c828 Re-initiate fullsync after a number of failed checks for completion
Re-initiate fullsync after 100 failed checks for completion. The
number of retries of the 'start fullsync and then check for
completion' cycle is configurable using
repl_util:start_and_wait_until_fullsync_complete/4 and defaults to 20
retries. This change is to avoid spurious test failures due to a rare
condition where the rpc call to start fullsync fails to actually
initiate the fullsync. A very similar change to the version of
start_and_wait_until_fullsync_complete in the replication module,
introduced in 0a36f9974c, has had good
effect at avoiding this condition for v2 replication tests.
2014-06-19 14:34:56 -06:00
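
The "start fullsync and then check for completion" cycle from the commit above could look roughly like the following. This is an illustrative Python sketch rather than the Erlang code in repl_util: the function name `start_and_wait`, its parameters, and the simulated callbacks are hypothetical, and only the figures of 100 checks and 20 retries are taken from the commit message.

```python
import time


def start_and_wait(start, is_complete, retries=20, checks_per_attempt=100, delay=0.0):
    """Start an operation and poll for completion, re-initiating it if the
    polls run out. Returns the attempt number on which it completed."""
    for attempt in range(1, retries + 1):
        start()  # (re-)initiate the operation
        for _ in range(checks_per_attempt):
            if is_complete():
                return attempt
            time.sleep(delay)
    raise TimeoutError(f"not complete after {retries} attempts")


# Simulated scenario: the first start call is silently lost, so only the
# second start actually leads to completion.
state = {"starts": 0}


def start():
    state["starts"] += 1


def is_complete():
    return state["starts"] >= 2


attempt = start_and_wait(start, is_complete, checks_per_attempt=5)
```

Re-initiating instead of waiting longer matters here because, per the commit message, the failure mode is a start request that never takes effect, so no amount of additional polling would help.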
Kelly McLaughlin
3466aa7c24 Merge branch 'bugfix/fix-repl-object-reformat' 2014-06-18 22:14:35 -06:00
Jon Anderson
c424848bb0 Merge pull request #644 from basho/bugfix/jra/listkeys_eqc_setup
Fix occasional setup errors in verify_listkeys_eqcfsm
2014-06-18 20:29:00 -05:00
Jon Anderson
f3f5e40a36 Removed commented function. 2014-06-18 18:15:30 -05:00
Kelly McLaughlin
f8e10f2f75 Reinstate concurrency in replication_object_reformat test 2014-06-18 15:55:27 -06:00
Kelly McLaughlin
87ee6f5883 Fix replication_object_reformat test failure
Part of the condition checking done in the replication_object_reformat
test is to validate the results of a fullsync using
repl_util:validate_completed_fullsync/6. The way in which the
function is called from the test expects fullsync to complete with 0
error_exit or retry_exit conditions occurring. This requires that the
sink cluster be in a steady state with all partitions available. The test
failed to wait for such conditions to occur and instead relied on
performing a node downgrade asynchronously and waiting for up to 60
seconds for a completion message before continuing with the test. The
test was continually failing after a node was downgraded to `previous`
due to partitions being reported as `down` on that node. To resolve
the issue the node downgrade process is now done in the primary test
process instead of in a separate spawned process. After the version
downgrade is complete, the test now waits for the riak_repl and the
riak_kv services, calls rt:wait_until_nodes_ready/1, calls
rt:wait_until_no_pending_changes/1, and finally waits for the
riak_repl2_fs_node_reserver named process to be registered on the
downgraded node. This process is responsible for handling partition
reservation requests and is key to determining that the new node is
able to handle a fullsync without partition errors.
2014-06-18 15:55:27 -06:00
Kelly McLaughlin
c55e473b97 Merge branch 'feature/update-repl-systest-read-use' 2014-06-18 15:52:48 -06:00
Kelly McLaughlin
2f9a3cae4a Update calls to rt:systest_read to handle identical siblings
Update the calls to rt:systest_read in repl_util and
repl_aae_fullsync_util to treat identical siblings resulting from the
use of DVV as a single value.  These changes are specifically to
address failures seen in the repl_aae_fullsync_custom_n and
replication_object_reformat tests, but should be generally useful for
replication tests that use the utility modules and have
allow_mult set to true.
2014-06-18 14:33:44 -06:00
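
The idea of treating identical siblings as a single value can be sketched as below. This is a hypothetical Python illustration, not how rt:systest_read is implemented; the function name `read_value` and the choice to raise on differing siblings are assumptions made for the example.

```python
def read_value(siblings):
    """Collapse identical siblings into one value.

    Raises ValueError if the siblings are empty or actually differ."""
    distinct = set(siblings)
    if len(distinct) == 1:
        return next(iter(distinct))
    raise ValueError(f"expected one distinct value, got {len(distinct)}")


# Two siblings with the same content are read back as one value.
value = read_value([b"v1", b"v1"])
```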
Andrew J. Stone
7d0301db35 add intercept for riak_kv_ensemble_backend:handle_down/4 in ensemble_vnode_crash 2014-06-17 23:13:44 -04:00
Andrew J. Stone
6c14c7c371 Add test to kill a vnode and vnode proxy
Kill a vnode and its proxy for a given key and ensure that operation
reads succeed afterwards.
2014-06-17 17:57:15 -04:00
Jon Anderson
baf32904af Remove un-used clean up function. 2014-06-17 17:26:23 -04:00
Jon Anderson
8912210036 Re-enable AAE. 2014-06-17 17:04:39 -04:00
Jon Anderson
472241f180 Take cluster set up out of a state and instead put it in the property. 2014-06-17 16:49:20 -04:00
John Burwell
6733c099c8 Merge pull request #636 from basho/bugfix/jsb/start-ensemble-without-aae
Verify Riak Startup when Strong Consistency is Misconfigured
2014-06-16 09:30:33 -04:00
Micah
c96f318f6a Merge pull request #643 from basho/bugfix/mw/better-isolate-pb_security-certs
isolate certs created for the pb_security tests.
2014-06-12 17:30:36 -05:00
Micah Warren
f7631b42c3 pb_cipher_suites test creates certs in its own dir.
Same reason as pb_security and http_security: to keep other tests
from stomping on it.
2014-06-12 17:22:42 -05:00
Micah Warren
f96847beb8 isolate certs created for the pb_security tests.
This should prevent other tests from interfering in its execution
2014-06-12 17:18:15 -05:00
Kelly McLaughlin
0589935931 Fix problems with cert specifications causing replication_ssl to fail
Fix problem with cacertdir specification in replication_ssl test. The
code used to load cert files in v2 replication expects the path specified
by the cacertdir key to be a directory only. With v3 replication the
code used is flexible enough to allow a directory or a file. Also
correct a typo in the certfile path for the SSLConfig1 configuration.
2014-06-12 12:38:58 -06:00
Kelly McLaughlin
5f5c3ac035 Merge branch 'bugfix/replication-upgrade-fixes' 2014-06-12 10:39:53 -06:00
Kelly McLaughlin
21b64526f1 Fix two issues with replication_upgrade test
* Do not attempt to cancel fullsync if the initial attempt to start
  and wait for completion fails. It has not been observed that the
  problem is fullsync starting and not completing in time, but rather
  the issue is that the initial call to start fullsync does not take
  effect. Therefore the cancellation is unnecessary.
* Replace the call to repl_util:wait_for_connection/2 in the node
  upgrade process with a call to
  replication:wait_until_connection/1. This function is geared towards
  v2 replication and should speed up test execution.
2014-06-11 21:53:29 -06:00
Micah
2c5def132c Merge pull request #638 from basho/bugfix/mw/pb_security-using-removed-function
Fixed map crdt creation
2014-06-11 13:50:09 -05:00
Micah Warren
3067209a97 Fixed map crdt creation
riakc_map:add/2 no longer exists, so updated the creation of that key to
use the correct update semantics.
2014-06-11 13:25:13 -05:00
Kelly McLaughlin
0e2b52d8b1 Fix timing issue with jmx_verify test
Replace use of a 40 second sleep in the test_supervision test case
with a wait condition to better handle variances in the time it takes
to progress through 10 retry attempts.
2014-06-11 11:26:45 -06:00
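
Replacing a fixed sleep with a wait condition, as in the commit above, generally follows the pattern below. This is a minimal Python sketch under stated assumptions: `wait_until` here is a hypothetical helper, not the riak_test function, and the simulated condition stands in for "the retry attempts have finished".

```python
import time


def wait_until(condition, retries=10, delay=0.01):
    """Poll `condition` until it returns True or the retries are used up."""
    for _ in range(retries):
        if condition():
            return True
        time.sleep(delay)
    return False


# Simulated condition that becomes true on the third poll.
polls = {"count": 0}


def retries_finished():
    polls["count"] += 1
    return polls["count"] >= 3


result = wait_until(retries_finished)
```

Polling returns as soon as the condition holds, so the test neither waits longer than necessary on a fast run nor fails spuriously on a slow one, up to the retry limit.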
Kelly McLaughlin
0601cd594f Merge branch 'bugfix/replication-upgrade-return-term' 2014-06-10 17:14:31 -06:00
John Burwell
6d8c504dba - Verifies Riak startup behavior when strong consistency is enabled and
AAE is disabled.  (defect https://github.com/basho/riak_kv/issues/959)
- Adds additional console output to reset-current-env to explain
  configuration and steps being executed
- Adds the -n option to the reset-current-env script to specify the
  number of nodes to build.  By default, 5 will be created.
2014-06-10 15:01:10 -04:00