even in the rare and pathological case where the cluster is partitioned before all 3 nodes
have received the update. riakc_flag:disable(F) requires context, which isn’t there in the
new map that would be created on the side of the partition with no data.
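For reference, a minimal sketch of the flag-disable path, assuming the usual riakc_pb_socket / riakc_map API: the context that riakc_flag:disable/1 needs comes from fetching the existing map, which is exactly what the empty side of the partition cannot provide. The field name below is illustrative.

```erlang
%% Sketch only: disabling a flag inside a map with the Erlang client.
%% The causal context travels with the fetched map; a map built
%% client-side on the empty partition has no context to offer.
disable_flag(Pid, BucketAndType, Key) ->
    {ok, Map} = riakc_pb_socket:fetch_type(Pid, BucketAndType, Key),
    Map2 = riakc_map:update({<<"enabled">>, flag},
                            fun(F) -> riakc_flag:disable(F) end,
                            Map),
    riakc_pb_socket:update_type(Pid, BucketAndType, Key,
                                riakc_map:to_op(Map2)).
```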
This test is a little confused in the sources as-is: its output reads as
if the check is based on the number of requests, even though the actual
comparison is against a function of THRESHOLD. I've reverted to the
comparison currently in use, since it looks to me like this test should
really expect ~NUM_REQUESTS processes and a vnode queue pretty close to
THRESHOLD. I'd appreciate review here, though, particularly if anyone
recalls the original intent of these comparison numbers.
Previously we used a sort of fuzzy 'metric' where we expected the
number of successful requests/fsms to be less than some fudge
factor over the overload threshold. This tends to kick up spurious
failures on the test board without offering much more in the way
of assurance about overload's functionality.
This change instead bases test success on the number of requests
only, not the threshold — if some amount of work was shed at all
we consider that a passing test.
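Roughly, the new pass condition amounts to the sketch below (names are illustrative, not the test's actual bindings):

```erlang
%% Illustrative only: pass as long as some work was shed, i.e. fewer
%% request FSMs / queued vnode messages than requests issued.
assert_some_work_shed(NumRequests, NumFsms, VnodeQueueLen) ->
    true = NumFsms < NumRequests,
    true = VnodeQueueLen < NumRequests,
    ok.
```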
In the future we should revisit this and change the request
accounting machinery to just explicitly track denials instead of
fsm processes / vnode queue depth.
In the case that no advanced.config file exists (every case!), rt
would not add any advanced config settings to the conf.
This PR teaches rtdev to create an advanced.config file if none exists
so that tests may set advanced config.
In this case we set ring_size and also the `crdt_mixed_versions` app env.
Still have not completed upgrade and feature-flag switch.
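A sketch of the kind of per-test setup this enables; the app/key placement below (riak_core/ring_creation_size for the ring size, riak_kv for `crdt_mixed_versions`) is an assumption, not lifted from this PR:

```erlang
%% Illustrative: per-test advanced config, now written out by rtdev even
%% when no advanced.config file previously existed.
setup_cluster() ->
    Config = [{riak_core, [{ring_creation_size, 16}]},
              {riak_kv,   [{crdt_mixed_versions, true}]}],
    rt:build_cluster(4, Config).
```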
I changed the versions from atoms to "2.0.2" and "2.0.4"; we can
bikeshed that with the build/test czars on Monday.
Added some useful logging statements to the plain-upgrade test.
Removed unnecessary clean_cluster and systest_read calls.
This test is getting big, and there is still a lot to add
(see comments at the end of the test).
Maybe we should break it out into a few tests; there are also some
open questions.
* Checks that the list of stats keys returned from the HTTP endpoint
is complete, delineating between riak and riak_ee. The test will
fail if the list returned from the HTTP endpoint does not exactly match
the expected list. This behavior acts as a forcing function to ensure
that the expected list is properly maintained as stats are added and
removed (a sketch of this check follows the list).
* Modifies reset-current-env to properly clean dependencies when a
full clean is requested and remove the current directory in the
target test instance.
* Adds logging to verify_riak_stats to explain the additional steps
being performed
* Adds rt:product/1 to determine whether a node is running riak,
riak_ee, or riak_cs
* Adds tools.mk support and eunit scaffolding to rebar.config
* Modifies reset-current-env.sh to remove the current directory in
the target test instance
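A sketch of the exact-match stats check from the first item above (variable and helper names are illustrative):

```erlang
%% Illustrative: the expected key list must match the endpoint's output
%% exactly, so adding or removing a stat forces an update to the list.
verify_stat_keys(Stats, ExpectedKeys) ->
    ActualKeys = lists:sort([K || {K, _V} <- Stats]),
    ExpectedSorted = lists:sort(ExpectedKeys),
    ExpectedSorted = ActualKeys,   %% badmatch (test failure) on mismatch
    ok.
```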
Because list-keys and list-buckets use coverage, we might hit latent
replicas depending on the coverage plan. This gives each call some extra
tries to complete successfully.
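For example, a retry wrapper along these lines (riakc_pb_socket:list_keys/2 and rt:wait_until/1 are existing APIs; the helper name is illustrative):

```erlang
%% Re-run list-keys until the coverage plan lands on replicas that have
%% the expected data.
wait_for_list_keys(Pid, Bucket, ExpectedCount) ->
    rt:wait_until(fun() ->
        {ok, Keys} = riakc_pb_socket:list_keys(Pid, Bucket),
        length(Keys) =:= ExpectedCount
    end).
```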
It was previously possible for the 'minority' network partition to
become the majority network partition by a naive network partitioning
strategy. Previously, when a preference list of 5 keyspace partitions
was created on only four distinct nodes, it became possible for a 2 node
'minority' network partition group to actually have a majority of
keyspace partitions because 2 keyspace partitions were assigned to 1
node in the 'minority' group. This was fixed so that the 'majority'
group now always has a majority of keyspace partitions by preventing
nodes with greater than 1 keyspace partition from becoming part of the
'minority' group.
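The gist of the fix, as a sketch (the preference list is the usual list of {Index, Node} pairs; the helper name is illustrative):

```erlang
%% Only nodes owning exactly one keyspace partition in the preflist are
%% eligible for the 2-node 'minority' side, so the 'majority' side is
%% guaranteed to keep a majority of the keyspace partitions.
eligible_minority_nodes(PrefList) ->
    Counts = lists:foldl(fun({_Idx, Node}, Acc) ->
                             orddict:update_counter(Node, 1, Acc)
                         end, orddict:new(), PrefList),
    [Node || {Node, 1} <- orddict:to_list(Counts)].
```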
Now green when run in sequence:
Test Results:
pb_cipher_suites-bitcask: pass
pb_security-bitcask : pass
---------------------------------------------
0 Tests Failed
2 Tests Passed
That's 100.0% for those keeping score
This changes the test assertion so that it retries fetching the value
from the second cluster until it is the expected value, at which point
the test will either pass if the sibling count is reasonable or fail if
it is too damn high.
Fetch the sink object on each iteration of the wait_until, just in case
the entire set of siblings didn't make it across the repl link.
This also gives read-repair a chance to happen, in case the version the
sink wrote didn't make it to all replicas.
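Concretely, the wait is of the form sketched below (the helper name is illustrative; the rt and riakc calls are existing APIs):

```erlang
%% Re-fetch from the sink until the expected value is among the
%% siblings, giving repl and read-repair time to catch up.
wait_for_sink_value(Pid, Bucket, Key, ExpectedVal) ->
    rt:wait_until(fun() ->
        case riakc_pb_socket:get(Pid, Bucket, Key) of
            {ok, Obj} -> lists:member(ExpectedVal, riakc_obj:get_values(Obj));
            _Other    -> false
        end
    end).
```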
Trying to use the repl features before newly started nodes have
riak_repl completely initialized leads to all sorts of nasty crashes and
noise. Frequently it leaves fullsync stuck forever, which makes a lot of
the tests fail.
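A minimal sketch of the guard this implies, using riak_test's existing service wait (whether the fix uses this exact helper is an assumption):

```erlang
%% Block until riak_kv and riak_repl report ready on every node before
%% driving any repl operations.
wait_for_repl_ready(Nodes) ->
    [rt:wait_for_service(N, [riak_kv, riak_repl]) || N <- Nodes],
    ok.
```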
This also tweaks the AAE fullsync tests to remove assumptions about
failure stats when AAE transient errors occur. The behavior in the
handling of those errors has changed recently with the introduction of
soft exits.
Reduces the probability of a race condition between the calculation of
spiral/histogram metrics and the SNMP stat cache refresh by reducing the
SNMP poll interval to 1 second during test execution.
Don't wait for convergence of the ring, because bucket properties are no
longer stored in the ring; instead, wait until the property changes,
which means the gossip has stabilized.
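A sketch of the property wait, polling each node over rpc (the helper name is illustrative; riak_core_bucket:get_bucket/1 and rt:wait_until/2 are existing APIs):

```erlang
%% Poll bucket properties on each node until the changed property is
%% visible, instead of waiting on ring convergence.
wait_for_bucket_prop(Nodes, Bucket, Prop, Value) ->
    [rt:wait_until(N, fun(Node) ->
         Props = rpc:call(Node, riak_core_bucket, get_bucket, [Bucket]),
         is_list(Props) andalso proplists:get_value(Prop, Props) =:= Value
     end) || N <- Nodes],
    ok.
```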
Add an rt:admin/3 function that accepts a list of options as the third
parameter. Currently the only valid option is return_exit_code. The
rtdev, rtssh, and rt_cs_dev harnesses have been updated to support
this option. If return_exit_code is specified, the return from a
?HARNESS:admin call is a pair with the exit code as the first member
and the command output as the second member. Finally, the
basic_command_line test has been changed to use return_exit_code to
verify the changes.
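Usage then looks roughly like this sketch (the command and assertion are illustrative):

```erlang
%% With return_exit_code, rt:admin/3 returns {ExitCode, Output} instead
%% of just the command output.
check_admin_status(Node) ->
    {ExitCode, _Output} = rt:admin(Node, ["status"], [return_exit_code]),
    0 = ExitCode,
    ok.
```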
Due to the refactor of the cluster manager/connection manager system to
use OTP behaviours, the raw-message method of getting stats has been
ousted; it now uses a call. To allow riak_test to check older clusters
as well as newer ones, the function was extended to try the new method
first and then fall back to the old one.
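In outline, the extended function does something like the sketch below; the registered name and message used here are placeholders, not the cluster manager's real interface:

```erlang
%% Try the post-refactor gen_server call first; fall back to the legacy
%% path when talking to an older node.
get_cluster_mgr_stats(Node) ->
    case rpc:call(Node, gen_server, call, [cluster_mgr_placeholder, status]) of
        {badrpc, _Reason} -> legacy_raw_stats(Node);
        Status            -> Status
    end.

%% Placeholder standing in for the old raw-message stats request.
legacy_raw_stats(_Node) ->
    {error, legacy_path_not_shown}.
```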
Add testing of the handoff heartbeat change from the following pull
request: https://github.com/basho/riak_core/pull/560. Add an intercept
module for the riak_core_handoff_sender module to introduce artificial
delay on item visitation during a handoff fold. This delay along with
the changes to the verify_handoff test induces test failure when run
without the heartbeat change. The handoff_receive_timeout is exceeded,
handoff stalls, and the test eventually fails due to timeout. The test
succeeds when run with the heartbeat change.
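The intercept is along these lines, following riak_test's usual intercept layout; the intercepted function name below is an assumption about the sender's fold callback, not copied from the change:

```erlang
-module(riak_core_handoff_sender_intercepts).
-compile(export_all).
-include("intercept.hrl").
-define(M, riak_core_handoff_sender_orig).

%% Delay each visited item so that, without the heartbeat change, the
%% handoff_receive_timeout is exceeded and handoff stalls.
delayed_visit_item_3(K, V, Acc) ->
    timer:sleep(1000),
    ?M:visit_item_orig(K, V, Acc).
```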
Ensemble_ring_changes tests writing a value, expanding the cluster, then
updating and reading that value after ring expansion has completed. It
also creates a bucket using a bucket type with a different n_val from
the default bucket type. The latter tests basho/riak_kv#1008 and its
corresponding riak_core PR.
Use riak_test_runner:metadata/0 to get the configured backend instead of
defaulting to bitcask. Additionally we use rt:clean_data_dir/2 to safely
remove backend directories.
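A sketch of the pattern; the metadata key and the backend-to-directory mapping are illustrative assumptions:

```erlang
%% Resolve the backend from the runner metadata, defaulting to bitcask,
%% then clean only that backend's data directories.
clean_backend_data(Nodes) ->
    Backend = proplists:get_value(backend, riak_test_runner:metadata(), bitcask),
    rt:clean_data_dir(Nodes, data_dir_for(Backend)).

%% Illustrative mapping from backend name to on-disk directory.
data_dir_for(bitcask)  -> "bitcask";
data_dir_for(eleveldb) -> "leveldb";
data_dir_for(_Other)   -> "".
```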
This is the first iteration of byzantine dataloss tests that show both
recoverable errors and unrecoverable but detectable errors. The test
covers the following scenarios.
* Lose one partition worth of data, but no synctrees and recover.
* Lose all but one partition of ensemble data, but no synctrees and
recover.
* Lose minority of synctrees. Only the peers with the missing
synctrees are restarted. System remains available.
* Loss of majority of synctrees. Majority peers are restarted. System
recovers when they all come back online.
* Loss of majority of synctrees with one node partitioned. All peers
restarted except partitioned one. System does not recover with that
node partitioned. When the partition is healed the system recovers.
* Loss of all data and synctree except on one peer recovers.
* Backing up and restoring old data but not synctrees results in
detected errors. Restoring newer data fixes this.
* Delete all data on all nodes, but not synctrees. This is detected and
  an error is returned to the user.
Change the ACL test case in the replication_ssl and replication2_ssl
tests to use certificates generated within the tests instead of
relying on certificates created outside the test that are prone to
expire and cause spurious test failure.
Also change the replication_ssl and replication2_ssl tests to avoid a
cycle of standing up the test clusters and then immediately restarting
them before any test cases execute. This should make the test
execution slightly faster for both test modules.
This commit also changes the tests to be a bit more robust in checking
for cluster state when restarting nodes and removes an unnecessary
five second sleep call in the replication_ssl test.
Change replication_ssl to use the wait_for_site_ips function from the
replication module introduced in
297090ded6 instead of the defunct
verify_site_ips function.
Avoid a race condition in the replication test module when checking
for site IP addresses in the replication status output. The test
waits for a connection on the leader, but it only queries the
replication status to check for the expected site IP addresses a
single time. Change the test to wait and re-check the status output to
give greater assurance that if the expected site IP addresses are not
present it is due to legitimate failure and not a race condition in
checking the replication status. This change affects the replication
and replication_upgrade tests as well as any other tests that call the
replication:replication function.
Prevent a situation where auto-reconnect hasn't triggered yet, causing
the next operation after reconnecting to return an error instead of ok.
Force a disconnect and reconnect to make sure the test is deterministic.
Change to fetching the list of peers first, then checking whether the
riak_kv service is up; only if the service is up are the peers checked.
Otherwise it is possible to see the service down and then the peers up,
because the service came up in the interim.
Also, make the KV vnode delay configurable.
Now ensemble peers are prevented from starting up until the riak_kv
service is up to avoid nasty races that could even lead to node crashes
as the ensembles frantically query for data that isn't ready.
Re-initiate fullsync after 100 failed checks for completion. The
number of retries of the 'start fullsync and then check for
completion' cycle is configurable using
repl_util:start_and_wait_until_fullsync_complete/4 and defaults to 20
retries. This change is to avoid spurious test failures due to a rare
condition where the rpc call to start fullsync fails to actually
initiate the fullsync. A very similar change to the version of
start_and_wait_until_fullsync_complete in the replication module,
introduced in 0a36f9974c, has had good
effect at avoiding this condition for v2 replication tests.
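The retry shape, reduced to a generic sketch (the real code lives in repl_util:start_and_wait_until_fullsync_complete/4; the function arguments here are placeholders):

```erlang
%% Start fullsync, poll for completion, and re-initiate the whole cycle
%% when the completion check fails, up to Retries times.
fullsync_with_retries(_StartFun, _IsCompleteFun, 0) ->
    {error, retries_exhausted};
fullsync_with_retries(StartFun, IsCompleteFun, Retries) ->
    ok = StartFun(),
    case IsCompleteFun() of
        true  -> ok;
        false -> fullsync_with_retries(StartFun, IsCompleteFun, Retries - 1)
    end.
```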
Part of the condition checking done in the replication_object_reformat
test is to validate the results of a fullsync using
repl_util:validate_completed_fullsync/6. The way in which the
function is called from the test expects fullsync to complete with 0
error_exit or retry_exit conditions occurring. This requires that the
sink cluster be in a steady state with all partitions available. The test
failed to wait for such conditions to occur and instead relied on
performing a node downgrade asynchronously and waiting for up to 60
seconds for a completion message before continuing with the test. The
test was continually failing after a node was downgraded to `previous`
due to partitions being reported as `down` on that node. To resolve
the issue the node downgrade process is now done in the primary test
process instead of in a separate spawned process. After the version
downgrade is complete, the test now waits for the riak_repl and the
riak_kv services, calls rt:wait_until_nodes_ready/1, calls
rt:wait_until_no_pending_changes/1, and finally waits for the
riak_repl2_fs_node_reserver named process to be registered on the
downgraded node. This process is responsible for handling partition
reservation requests and is key to determining that the new node is
able to handle a fullsync without partition errors.
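The final wait is of roughly this shape (the helper name is illustrative):

```erlang
%% Poll the downgraded node until riak_repl2_fs_node_reserver is
%% registered, i.e. the node can accept partition reservations.
wait_for_node_reserver(Node) ->
    rt:wait_until(Node, fun(N) ->
        is_pid(rpc:call(N, erlang, whereis, [riak_repl2_fs_node_reserver]))
    end).
```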
Update the calls to rt:systest_read in repl_util and
repl_aae_fullsync_util to treat identical siblings resulting from the
use of DVV as a single value. These changes are specifically to
address failures seen in the repl_aae_fullsync_custom_n and
replication_object_reformat tests, but should be generally useful for
replication tests that use the utility modules and have
allow_mult set to true.
Fix a problem with the cacertdir specification in the replication_ssl
test. The code used to load cert files in v2 replication expects the
path specified by the cacertdir key to be a directory only. With v3
replication the code used is flexible enough to allow a directory or a
file. Also correct a typo in the certfile path for the SSLConfig1
configuration.
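For reference, the v2 configuration needs a shape like the sketch below; the paths and the ssl_enabled key are illustrative rather than copied from the test:

```erlang
%% cacertdir must name a directory under v2 replication; certfile and
%% keyfile point at the individual PEM files.
ssl_config() ->
    [{riak_repl, [{ssl_enabled, true},
                  {certfile,  "/tmp/certs/site1-cert.pem"},
                  {keyfile,   "/tmp/certs/site1-key.pem"},
                  {cacertdir, "/tmp/certs/cacerts"}]}].
```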
* Do not attempt to cancel fullsync if the initial attempt to start
  and wait for completion fails. The observed problem is not that
  fullsync starts and fails to complete in time, but rather that the
  initial call to start fullsync does not take effect. Therefore the
  cancellation is unnecessary.
* Replace the call to repl_util:wait_for_connection/2 in the node
upgrade process with a call to
replication:wait_until_connection/1. This function is geared towards
v2 replication and should speed up test execution.
Replace use of a 40 second sleep in the test_supervision test case
with a wait condition to better handle variances in the time it takes
to progress through 10 retry attempts.
AAE is disabled. (defect https://github.com/basho/riak_kv/issues/959)
- Adds additional console output to reset-current-env to explain
configuration and steps being executed
- Adds the -n option to the reset-current-env script to specify the
number of nodes to build. By default, 5 will be created.
As of commit 3044839456, tests that
return something other than the prescribed success atom 'pass' to
indicate success are treated as failures. Change the
replication_upgrade and replication2_upgrade tests, which returned the
result of a call to lists:foreach/2, to instead return 'pass' to
indicate success.
Prior to this commit, the various riak_ensemble related tests would
manually enable the consensus system on one-and-only-one node in a
given cluster in order to work around issue basho/riak_core#571.
This commit changes the tests to work properly after the above issue
has been fixed.
In addition to removing the call to riak_ensemble_manager:enable()
that is now handled automatically by Riak, this commit also removes
a few wait_until_stable/2 checks against 1-node clusters. These
checks no longer apply, since Riak is now designed to only enable
the consensus system after the cluster contains at least 3 nodes.
Changed intercept to explicitly return `{error, econnrefused}`. Moved
helper functions to `repl_util` and added a new helper to distinguish
between disconnects on `cluster_by_name` and `cluster_by_address`
connections.
Added asserts to all wait_for functions.
ensemble_remove_node2 uses an intercept to prevent a riak_ensemble
related transition that is necessary for nodes to completely exit and
shutdown after removal. In fact, testing for this scenario is the
entire point of this test, since it is testing logic that was added to
solve basho/riak_core#572 and that logic prevents nodes from exiting
until that transition occurs.
However, even without this new logic, there is an unrelated
riak_ensemble related bug that can trigger a race condition that also
prevents nodes from shutting down.
The good news is that other changes made as part of the solution to
solve basho/riak_core#572 also fix this unrelated bug. Therefore this
commit extends ensemble_remove_node2 to remove the intercept at the
end of the test and verify that the removed nodes do actually end up
exiting as expected. Thus, the test now tests for both the negative
and positive scenarios and serves as a test against future regressions
that stall node removal/shutdown.
Adding a test to verify that a bucket type is visible from a number of
nodes, since active status is reported as soon as the claimant sees it,
but requests to other nodes can still end up hitting the dreaded
{error, no_type}.
Also added a general utility that can be used for bucket type checks and
for general verification of bucket properties across nodes.
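The per-node check in that utility looks roughly like this sketch (the helper name is illustrative; riak_core_bucket_type:get/1 and rt:wait_until/2 are existing APIs):

```erlang
%% The type must be visible from every node, not just the claimant that
%% reported it active; undefined here corresponds to {error, no_type}
%% at the client API level.
wait_until_type_visible(Nodes, Type) ->
    [rt:wait_until(N, fun(Node) ->
         rpc:call(Node, riak_core_bucket_type, get, [Type]) =/= undefined
     end) || N <- Nodes],
    ok.
```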
an undefined bucket type is specified. (defect #875)
- Adds a description of the reset-current-env.sh script and its
usage to README.md
- Corrects a spelling mistake in an information message emitted by
the reset-current-env.sh script
While r16b02-basho5 did not need the cacertfile path put in, r16b03 does.
The test still passes on r16b02-basho5 with the added cacertfile line.
Since there is no harm in putting it in, it is better to include it for
forwards compatibility.