#16331
Doc updates in a separate PR:
https://github.com/fleetdm/fleet/pull/17214
# Checklist for submitter
- [x] Changes file added for user-visible changes in `changes/` or
`orbit/changes/`.
See [Changes
files](https://fleetdm.com/docs/contributing/committing-changes#changes-files)
for more information.
- [x] Input data is properly validated, `SELECT *` is avoided, SQL
injection is prevented (using placeholders for values in statements)
- [x] Added/updated tests
- [x] Manual QA for all new/changed functionality (smoke-tested locally
with osquery-perf simulating 100 hosts, ran a live query, a saved live
query, stopped naturally and stopped before the end, and again via
fleetctl)
---------
Co-authored-by: Victor Lyuboslavsky <victor@fleetdm.com>
Co-authored-by: Victor Lyuboslavsky <victor.lyuboslavsky@gmail.com>
This could help future users to detect this issue: #10957
It also adds an error log in Fleet that prints the actual error.
The error is displayed if I kill Redis during a live session or if I set
`client-output-buffer-limit` to something real low like `CONFIG SET
"client-output-buffer-limit" "pubsub 100kb 50kb 60"`:
![Screenshot 2023-05-25 at 09 08
08](https://github.com/fleetdm/fleet/assets/2073526/f021a77a-3a22-4b48-8073-bae9c6e21a11)
- [X] Changes file added for user-visible changes in `changes/` or
`orbit/changes/`.
See [Changes
files](https://fleetdm.com/docs/contributing/committing-changes#changes-files)
for more information.
- ~[ ] Documented any API changes (docs/Using-Fleet/REST-API.md or
docs/Contributing/API-for-contributors.md)~
- ~[ ] Documented any permissions changes~
- ~[ ] Input data is properly validated, `SELECT *` is avoided, SQL
injection is prevented (using placeholders for values in statements)~
- [X] Added support on fleet's osquery simulator `cmd/osquery-perf` for
new osquery data ingestion features.
- ~[ ] Added/updated tests~
- [X] Manual QA for all new/changed functionality
- ~For Orbit and Fleet Desktop changes:~
- ~[ ] Manual QA must be performed in the three main OSs, macOS, Windows
and Linux.~
- ~[ ] Auto-update manual QA, from released version of component to new
version (see [tools/tuf/test](../tools/tuf/test/README.md)).~
This was found while working on #10957.
When running a live query, a lot of unused host data is stored in Redis
and sent on every live query result message via websockets. The frontend
and fleetctl just need `id`, `hostname` and `display_name`. (This
becomes worse every time we add new fields to the `Host` struct.)
Sample of one websocket message result when running `SELECT * from
osquery_info;`:
size in `main`: 2234 bytes
```
a["{\"type\":\"result\",\"data\":{\"distributed_query_execution_id\":57,\"host\":
{\"created_at\":\"2023-05-22T12:14:11Z\",\"updated_at\":\"2023-05-23T12:31:51Z\",
\"software_updated_at\":\"0001-01-01T00:00:00Z\",\"id\":106,\"detail_updated_at\":\"2023-05-23T11:50:04Z\",
\"label_updated_at\":\"2023-05-23T11:50:04Z\",\"policy_updated_at\":\"1970-01-02T00:00:00Z\",
\"last_enrolled_at\":\"2023-05-22T12:14:12Z\",
\"seen_time\":\"2023-05-23T09:52:23.876311-03:00\",\"refetch_requested\":false,
\"hostname\":\"lucass-macbook-pro.local\",\"uuid\":\"BD4DFA10-E334-41D9-8136-D2163A8FE588\",\"platform\":\"darwin\",\"osquery_version\":\"5.8.2\",\"os_version\":\"macOS 13.3.1\",\"build\":\"22E261\",\"platform_like\":\"darwin\",\"code_name\":\"\",
\"uptime\":91125000000000,\"memory\":34359738368,\"cpu_type\":\"x86_64h\",\"cpu_subtype\":\"Intel x86-64h Haswell\",\"cpu_brand\":\"Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz\",\"cpu_physical_cores\":4,\"cpu_logical_cores\":8,\"hardware_vendor\":\"Apple Inc.\",\"hardware_model\":\"MacBookPro16,2\",\"hardware_version\":\"1.0\",
\"hardware_serial\":\"0DPQR4HMD1FZ\",
\"computer_name\":\"Lucas’s MacBook Pro\",\"public_ip\":\"\",
\"primary_ip\":\"192.168.0.230\",\"primary_mac\":\"68:2f:67:8e:b6:1f\",
\"distributed_interval\":1,\"config_tls_refresh\":60,\"logger_tls_period\":10,\"team_id\":null,
\"pack_stats\":null,\"team_name\":null,
\"gigs_disk_space_available\":386.23,\"percent_disk_space_available\":40,
\"issues\":{\"total_issues_count\":0,\"failing_policies_count\":0},
\"mdm\":{\"enrollment_status\":null,\"server_url\":null,\"name\":\"\",\"encryption_key_available\":false},
\"status\":\"online\",\"display_text\":\"lucass-macbook-pro.local\",\"display_name\":\"Lucas’s MacBook Pro\"},
\"rows\":[{\"build_distro\":\"10.14\",\"build_platform\":\"darwin\",
\"config_hash\":\"b7ee9363a7c686e76e99ffb122e9c5241a791e69\",\"config_valid\":\"1\",
\"extensions\":\"active\",\"host_display_name\":\"Lucas’s MacBook Pro\",
\"host_hostname\":\"lucass-macbook-pro.local\",\"instance_id\":\"cde5de81-344b-4c76-b1c5-dae964fdd4f2\",\"pid\":\"8370\",\"platform_mask\":\"21\",\"start_time\":\"1684757652\",
\"uuid\":\"BD4DFA10-E334-41D9-8136-D2163A8FE588\",
\"version\":\"5.8.2\",\"watcher\":\"8364\"}],\"error\":null}}"]
```
vs. size of the message result on this branch: 675 bytes
```
a["{\"type\":\"result\",\"data\":{\"distributed_query_execution_id\":59,
\"host\":{\"id\":106,\"hostname\":\"lucass-macbook-pro.local\",
\"display_name\":\"Lucas’s MacBook Pro\"},
\"rows\":[{\"build_distro\":\"10.14\",\"build_platform\":\"darwin\",
\"config_hash\":\"f80dee827635db39077a458243379b3ad63311fd\",
\"config_valid\":\"1\",\"extensions\":\"active\",\"host_display_name\":\"Lucas’s MacBook Pro\",
\"host_hostname\":\"lucass-macbook-pro.local\",
\"instance_id\":\"cde5de81-344b-4c76-b1c5-dae964fdd4f2\",\"pid\":\"8370\",\"platform_mask\":\"21\",
\"start_time\":\"1684757652\",\"uuid\":\"BD4DFA10-E334-41D9-8136-D2163A8FE588\",\"version\":\"5.8.2\",
\"watcher\":\"8364\"}]}}"]
```
Manual tests included running with an old fleetctl running with a new
fleet server, and vice-versa, a new fleetctl running against an old
fleet server.
- [X] Changes file added for user-visible changes in `changes/` or
`orbit/changes/`.
See [Changes
files](https://fleetdm.com/docs/contributing/committing-changes#changes-files)
for more information.
- [X] Documented any API changes (docs/Using-Fleet/REST-API.md or
docs/Contributing/API-for-contributors.md)
- ~[ ] Documented any permissions changes~
- ~[ ] Input data is properly validated, `SELECT *` is avoided, SQL
injection is prevented (using placeholders for values in statements)~
- ~[ ] Added support on fleet's osquery simulator `cmd/osquery-perf` for
new osquery data ingestion features.~
- [X] Added/updated tests
- [X] Manual QA for all new/changed functionality
- ~For Orbit and Fleet Desktop changes:~
- ~[ ] Manual QA must be performed in the three main OSs, macOS, Windows
and Linux.~
- ~[ ] Auto-update manual QA, from released version of component to new
version (see [tools/tuf/test](../tools/tuf/test/README.md)).~
* Fix races in go tests and run with -race on CI
* Fix race in pubsub
* Increase timeout to 15m for go tests
* CI takes forever, try disabling race
* Remove timeout from go tests
Add a relatively minimal set of linters that raise safe and
mostly un-opinionated issues with the code. It runs
automatically on CI via a github action.
* Make receive calls to redis conn thread safe
Also removes REDIS_TEST env var. Redis is lightweight and fast, no need
to skip these tests.
* No need to increase the wait
This should support Redis in both cluster and non-cluster modes.
Updates were made separately to github.com/throttled/throttled to support the slight changes in types.
Co-authored-by: Joseph Macaulay <joseph.macaulay@uber.com>
Co-authored-by: Zach Wasserman <zach@fleetdm.com>
This feature enables a new config option (redis.duplicate_results). When set to true, all Live Query results will be copied to an additional Redis pubsub channel named LQDuplicate
This is useful in a scenario that would involve shipping the Live Query results outside of Fleet, near-realtime.
Improves MySQL test time (on my 2020 MBP) to ~18s from ~125s.
- Use separate databases for each test to allow parallelization.
- Run migrations only once at beginning of tests and then reload
generated schema.
- Add `--innodb-file-per-table=OFF` for ~20% additional speedup.
This change optimizes live queries by pushing the computation of query
targets to the creation time of the query, and efficiently caching the
targets in Redis. This results in a huge performance improvement at both
steady-state, and when running live queries.
- Live queries are stored using a bitfield in Redis, and takes
advantage of bitfield operations to be extremely efficient.
- Only run Redis live query test when REDIS_TEST is set in environment
- Ensure that live queries are only sent to hosts when there is a client
listening for results. Addresses an existing issue in Fleet along with
appropriate cleanup for the refactored live query backend.
- Add toggle to disable live queries in advanced settings
- Add new live query status endpoint (checks for disabled via config and Redis health)
- Update QueryPage UI to use new live query status endpoint
Implements #2140
Notable refactoring:
- Use stdlib "context" in place of "golang.org/x/net/context"
- Go-kit no longer wraps errors, so we remove the unwrap in transport_error.go
- Use MakeHandler when setting up endpoint tests (fixes test bug caught during
this refactoring)
Closes#1411.
Due to the timezone being changed on time.Time's zero value after a roundtrip
serialization, any times being compared in this test must be explicitly set.
A distributed query campaign can be "orphaned" (left in the QueryRunning state)
if the Kolide server restarts while it is running, or other weirdness occurs.
When this happens, no subscribers are waiting to read results written by
osqueryd agents, but the agents continue to receive the query. Previously, this
would cause us to error on ingestion.
The new behavior will instead set the campaign to completed when it detects
that it is orphaned. This should prevent sending queries for which there is no
subscriber.
- New NoSubscriber error interface in pubsub
- Detect NoSubscriber errors and close campaigns
- Tests on pubsub and service methods
Fixes#695
Removed Gorm, replaced it with Sqlx
* Added SQL bundling command to Makfile
* Using go-kit logger
* Added soft delete capability
* Changed SearchLabel to accept a variadic param for optional omit list
instead of array
* Gorm removed
* Refactor table structures to use CURRENT_TIMESTAMP mysql function
* Moved Inmem datastore into it's own package
* Updated README
* Implemented code review suggestions from @zwass
* Removed reference to Gorm from glide.yaml
A new datastore interface is needed for buffering incoming distributed query results to be sent to the client. This PR attempts to define and implement that interface.
It is intended that the ReadChannel() method be used by the goroutine that will push query results down a websocket to the client. Passing the results through this channel will allow that goroutine to perform a select on both the channel and the websocket, in order to properly handle IO.