mirror of
https://github.com/empayre/fleet.git
synced 2024-11-06 08:55:24 +00:00
fa89dff66f
The scheduled test run https://github.com/fleetdm/fleet/actions/runs/7764392848 failed with a panic because `TestWindowsMDMEnrollmentPrevented` timed out: ``` 2024-02-03T05:05:26.3041218Z === RUN TestWindowsMDMEnrollmentPrevented 2024-02-03T05:05:26.3044251Z === RUN TestWindowsMDMEnrollmentPrevented/{RenewEnrollmentProfile:false_RotateDiskEncryptionKey:false_NeedsMDMMigration:false_NeedsProgrammaticWindowsMDMEnrollment:true_WindowsMDMDiscoveryEndpoint:http://example.com/_NeedsProgrammaticWindowsMDMUnenrollment:false_PendingScriptExecutionIDs:[]_EnforceBitLockerEncryption:false} 2024-02-03T05:05:26.3047208Z coverage: 2.5% of statements in github.com/fleetdm/fleet/v4/... 2024-02-03T05:05:26.3047963Z panic: test timed out after 1h0m0s 2024-02-03T05:05:26.3048482Z running tests: 2024-02-03T05:05:26.3049005Z TestWindowsMDMEnrollmentPrevented (59m52s) 2024-02-03T05:05:26.3052172Z TestWindowsMDMEnrollmentPrevented/{RenewEnrollmentProfile:false_RotateDiskEncryptionKey:false_NeedsMDMMigration:false_NeedsProgrammaticWindowsMDMEnrollment:true_WindowsMDMDiscoveryEndpoint:http://example.com/_NeedsProgrammaticWindowsMDMUnenrollment:false_PendingScriptExecutionIDs:[]_EnforceBitLockerEncryption:false} (59m52s) [...] 2024-02-03T05:05:26.3068624Z goroutine 69 [chan receive]: 2024-02-03T05:05:26.3069997Z github.com/fleetdm/fleet/v4/orbit/pkg/update.TestWindowsMDMEnrollmentPrevented.func2.1({{0xe3ada3, 0x12}, {0x0, 0x0}, {0xe37311, 0xc}}) 2024-02-03T05:05:26.3072376Z /home/runner/work/fleet/fleet/orbit/pkg/update/notifications_test.go:295 +0x65 2024-02-03T05:05:26.3074514Z github.com/fleetdm/fleet/v4/orbit/pkg/update.(*windowsMDMEnrollmentConfigFetcher).attemptEnrollment(0xc0000f8cf0, {0x0, 0x0, 0x0, 0x1, {0xe3ada3, 0x12}, 0x0, {0x0, 0x0, ...}, ...}) ``` I was able to reproduce locally 1/4th of the times, after putting the following print statements: ```diff if cfg.NeedsProgrammaticWindowsMDMEnrollment { fetcher.execEnrollFn = func(args WindowsMDMEnrollmentArgs) error { - <-chProceed // will be unblocked only when allowed + fmt.Println("fetcher.execEnrollFn A: ", apiCallCount) + <-chProceed // will be unblocked only when allowed + fmt.Println("fetcher.execEnrollFn B: ", apiCallCount) apiCallCount++ // no need for sync, single-threaded call of this func is guaranteed by the fetcher's mutex return apiErr } @@ -301,7 +303,9 @@ func TestWindowsMDMEnrollmentPrevented(t *testing.T) { } } else { fetcher.execUnenrollFn = func(args WindowsMDMEnrollmentArgs) error { - <-chProceed // will be unblocked only when allowed + fmt.Println("fetcher.execUnenrollFn A: ", apiCallCount) + <-chProceed // will be unblocked only when allowed + fmt.Println("fetcher.execUnenrollFn B: ", apiCallCount) apiCallCount++ // no need for sync, single-threaded call of this func is guaranteed by the fetcher's mutex return apiErr } @@ -317,23 +321,33 @@ func TestWindowsMDMEnrollmentPrevented(t *testing.T) { started := make(chan struct{}) go func() { + fmt.Println("before close started") close(started) + fmt.Println("aftre close started") // the first call will block in enroll/unenroll func + fmt.Println("before inner fetchergetconfig") cfg, err := fetcher.GetConfig() + fmt.Println("after inner fetchergetconfig") assertResult(cfg, err) }() + fmt.Println("before started") <-started + fmt.Println("after started") // this call will happen while the first call is blocked in // enroll/unenrollfn, so it won't call the API (won't be able to lock the // mutex). However it will still complete successfully without being // blocked by the other call in progress. + fmt.Println("before first fetchergetconfig") cfg, err := fetcher.GetConfig() + fmt.Println("before first fetchergetconfig") assertResult(cfg, err) // unblock the first call and wait for it to complete + fmt.Println("before close chProceed 1") close(chProceed) + fmt.Println("after close chProceed 2") time.Sleep(100 * time.Millisecond) ``` This is the output I've got every time the test hung: ``` before started before close started aftre close started after started before first fetchergetconfig before inner fetchergetconfig after inner fetchergetconfig fetcher.execEnrollFn A: 0 ``` And this is the output when the tests passed ``` before started before close started aftre close started before inner fetchergetconfig fetcher.execUnenrollFn A: 0 after started before first fetchergetconfig before first fetchergetconfig before close chProceed 1 after close chProceed 2 fetcher.execUnenrollFn B: 0 after inner fetchergetconfig fetcher.execUnenrollFn A: 1 fetcher.execUnenrollFn B: 1 ``` Note how the deadlock occurs when `GetConfig` is called first outside of the goroutine. I added some logic to prevent this, but I'm confident there must be a better way to accomplish the same. cc: @mna you're the king of concurrency, do you have any ideas? |
||
---|---|---|
.. | ||
badgerstore | ||
filestore | ||
config_fetcher.go | ||
disk_encryption.go | ||
execcmd_darwin.go | ||
execcmd_stub.go | ||
execcmd.go | ||
execwinapi_stub.go | ||
execwinapi_windows.go | ||
execwinapi.go | ||
file.go | ||
flag_runner_test.go | ||
flag_runner.go | ||
hash_test.go | ||
hash.go | ||
notifications_test.go | ||
notifications.go | ||
nudge_test.go | ||
nudge.go | ||
options_darwin.go | ||
options_linux.go | ||
options_windows.go | ||
options.go | ||
runner_test.go | ||
runner.go | ||
swift_dialog_test.go | ||
swift_dialog.go | ||
testing_utils.go | ||
update_test.go | ||
update.go |