Change some of the helper functions in the repl_util module to handle
errors more sensibly so that cluster setup race conditions do not
cause unnecessary test failures.
* Change group leader for cover_server while generating reports, so the
'includes data from imported files' message can be suppressed.
* Append a phash of the test metadata to each coverdata filename to keep
the filenames unique.
* Remove unused maybe_stop function.
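A minimal sketch of the filename scheme described above. The module name, function name, and the `Name-Hash.coverdata` layout are hypothetical, and `erlang:phash2/1` is assumed as the hash function; the actual change may use a different phash variant and format.

```erlang
-module(coverdata_name).
-export([filename/2]).

%% Build a per-test coverdata filename (hypothetical layout).
%% TestName is an atom or string; TestMetadata is any Erlang term,
%% for example a proplist describing the test run.
filename(TestName, TestMetadata) ->
    %% phash2/1 maps any term to a non-negative integer; equal terms
    %% always hash to the same value, so reruns reuse the same name.
    Hash = erlang:phash2(TestMetadata),
    lists:flatten(io_lib:format("~s-~b.coverdata", [TestName, Hash])).
```

Two runs of the same test with different metadata (say, a different backend) hash differently and so normally get different filenames, while identical metadata always yields the same one.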
To let us see the *combined* coverage of our unit and integration
tests, modify riak_test and the smoke_test runner to capture
coverage data per-test and post it as a giddyup artifact.
To maintain the current riak_test behaviour where the *combined*
coverage is reported on at the end of a run, each test writes its own
.coverdata file and cover is then reset. Once all tests have run, the
coverdata files are all loaded and the total coverage is reported.
Recently Scott was running into an issue running `verify_handoff`
where his old data was not being properly reset when running
`setup_harness`. I noticed we were using `os:cmd` which doesn't check
the exit code of the command. I modified `run_git` to use `cmd` as
well as verify the exit code is 0. This allowed Scott to catch the
real issue which turned out to be a bad path in his config.
While making this modification I noticed a bug in the pipe cleanup
code. The `file:del_dir` call is actually returning `{error, eexist}`
because there is a `bin` directory under each pipe dir which had not
yet been deleted. Rather than spend time writing a recursive delete in
Erlang I changed the code to use `cmd` and to confirm an exit of 0.
I modified `stop_all`, which is used by `setup_harness`, to also use
`cmd` and check exit codes.
Finally, I made sure that `spawn_cmd` flattens the list before passing
it along, as `open_port` wants a string, not an iolist.
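The exit-code checking and flattening described above can be sketched roughly as follows. This is an illustration, not the actual riak_test code: the module name and `run_checked/1` are hypothetical, and `cmd/1` here is a simplified stand-in for the real helper.

```erlang
-module(cmd_sketch).
-export([cmd/1, run_checked/1]).

%% Run a shell command and return {ExitStatus, Output}.
%% Unlike os:cmd/1, this exposes the exit status to the caller.
cmd(Cmd) ->
    %% open_port/2 wants a flat string, so flatten nested char lists first.
    %% (lists:flatten/1 does not convert binaries; this sketch assumes
    %% the command is built from character lists only.)
    Flat = lists:flatten(Cmd),
    Port = open_port({spawn, Flat}, [exit_status, stderr_to_stdout, in]),
    collect(Port, []).

%% Accumulate output chunks until the port reports the exit status.
collect(Port, Acc) ->
    receive
        {Port, {data, Data}} ->
            collect(Port, [Data | Acc]);
        {Port, {exit_status, Status}} ->
            {Status, lists:flatten(lists:reverse(Acc))}
    end.

%% Treat any non-zero exit status as an error instead of ignoring it.
run_checked(Cmd) ->
    case cmd(Cmd) of
        {0, Output} -> {ok, Output};
        {Status, Output} -> {error, {exit_status, Status, Output}}
    end.
```

With a helper of this shape, a failing `git` or `rm -rf` invocation surfaces as an error at the call site rather than being silently swallowed the way it is with `os:cmd`.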
Well, that's not true. They break riak_kv's context operations on Maps.
This change works around that breakage by turning the context off for
the operations in this test. This is temporary: once the context fix
work has been done, we'll change it back.
The heartbeat timeout enforcement was recently updated to be
specified in seconds to match the documentation for that option. The
repl_rt_heartbeat test has been failing ever since because it still
specified the timeout in milliseconds. This change makes the test use
seconds for the heartbeat timeout, which gets the test passing again.
The wait_until_leader_converge function could incorrectly return
success if all of the results from the get_leader rpc calls were
either undefined or all returned a badrpc tuple. In either case the
particular result ends up as the sole unique value in a list, and the
success condition only verifies that the list is of length 1,
regardless of the value of its single member. Change the function to
filter values that indicate failure out of the list of results before
checking the success condition.
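A minimal sketch of the corrected check, assuming the get_leader results have already been gathered into a list. The module and function names are hypothetical, and requiring that no result be a failure is one reasonable reading of the change, not necessarily the exact condition used.

```erlang
-module(leader_converge).
-export([converged/1]).

%% Results holds one get_leader result per node: a leader node name,
%% undefined, or a {badrpc, Reason} tuple.
converged(Results) ->
    %% Separate real leader values from results that indicate failure.
    {Leaders, Failures} = lists:partition(fun is_leader/1, Results),
    %% Converged only if nothing failed and all nodes agree on one leader.
    Failures =:= [] andalso length(lists:usort(Leaders)) =:= 1.

%% A result counts as a leader only if it is not a failure marker.
is_leader(undefined) -> false;
is_leader({badrpc, _}) -> false;
is_leader(_) -> true.
```

With this shape, `[undefined, undefined]` and a list made up entirely of badrpc tuples both fail the check, whereas the old condition accepted them because each collapsed to a single unique value.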
This test verifies that AAE repairs replicas of values without passive
read repairs. This includes missing replicas and replicas with divergent
values. It also repairs entire lost KV partitions and, if trees are
configured to rebuild, recovers from AAE data loss and
corruption.
This version differs from the original 1.4 test only in the handling of
siblings. It does a get before each put for modifications and merges
sibling values by choosing the longest one, as modifications in this
test append bits.
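The sibling merge rule can be illustrated with a short sketch. It assumes the sibling values are binaries; the module and function names are hypothetical.

```erlang
-module(sibling_merge).
-export([longest/1]).

%% Pick the longest value from a non-empty list of sibling binaries.
%% Because modifications in the test only append, the longest sibling
%% is taken to be the most up-to-date one. On a tie, the earliest
%% value in the list wins.
longest([First | Rest]) ->
    lists:foldl(
      fun(Value, Best) ->
              case byte_size(Value) > byte_size(Best) of
                  true -> Value;
                  false -> Best
              end
      end,
      First,
      Rest).
```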