* WIP
* Send usage analytics
* Improve loggin of cron tasks and fix test
* Implement appconfig method now that we are checking that as well
* Address review comments
* Migrate all mysql tests to the new form
* Only dump sql if MYSQL_TEST is on
* Removing parallel until we get rid of this code
* Move TestMain to an actual _test file
* A little experiment with tmpfs to speed up the db
* Let's make sure the dump.sql file is also in ram
* WIP
* Add get user_roles and apply for a user_roles spec to fleetctl
* Uncomment other tests
* Update test to check output
* Update test with the new struct
* Mock token so that it doesn't pick up the one in the local machine
* Address review comments
* Fix printJSON and printYaml
* Fix merge conflict error
* WIP
* wip
* wip
* Finish implementation
* Address review comments
* Fix flaky test
* WIP
* Add get user_roles and apply for a user_roles spec to fleetctl
* Uncomment other tests
* Update test to check output
* Update test with the new struct
* Mock token so that it doesn't pick up the one in the local machine
* Address review comments
* Fix printJSON and printYaml
* Fix merge conflict error
* If both roles are specified, fail
* Fix test
* Switch arguments around
* Update test with the new rule
* Fix other tests that fell through the cracks
* Add host users
* Add changes file and test removing pull_request from the on test
* Remove users and store the removal timestamp
* Improve test yml to allow for PRs from forks
* Make roles for users mandatory
* Remove nop migration
* Add missing test for wrong role
* Properly validate global and team roles
* Address codacy issues
* Address codacy review
* No need to check for nil
* First approach to diff
* Refactor things for better readability and testing
* Remove draft comment for algorithm
* Format things a bit better
* Remove unused and simplify code a bit
* Refactor for readability and testing
* Add changes file
* Implement new approach based on review comments
* Make sure to only delete from the current host
* Add single uninstall test and fix code
* Improve code based on review
* Refactor error handling for better extensibility and add more scaffolding for specific db errors
* Add integration tests to check errors from mysql are translated properly
* Address review comments
* Add changes file
* Remove username from UI code
* Remove username from tests
* Remove username from database
* Modify server endpoints for removing username
* Implement backend aspects of removing username
* Update API docs
* Add name to fleetctl
- Add enable_analytics column to database.
- Allow enable_analytics to be set via API.
- Add messaging in fleetctl setup.
Note that this defaults to off for existing installations, and defaults
on for newly set up installs.
No collection or sending of analytics yet exists, we are strictly
storing the preference at this time.
Part of #454
This may help with deadlocks on the `label_membership` table. It is not
clear from MySQL documentation whether the order of the records is
significant for locking within a single query. If it is, this should
help the problem. If it is not, this should have no negative impact.
May fix#1146
* #511 refactored update options - new params & ts
* updated server to include agent_options for read and update
* added agent options form to org settings
* #511 finished connecting agent form to server
* #511 fixing api to save/read agent options
* #511 linted
* #511 fixed reading & updating agent options
* #511 api fixes to support agent options
* #511 removed log
* Fix json.RawMessage pointers in tests
Co-authored-by: Zach Wasserman <zach@fleetdm.com>
Reorder migrations from the long-running `teams` branch to ensure that
they can run successfully for deployments upgrading from a pre-4.0
release.
All migrations from the `teams` branch are reordered to take place
_after_ all migrations from the `main` branch, using `20210601` as the
new date, after the latest released `main` branch migration on `20210526`.
Fixes#1058
- Add TeamFilter to relevant host and label methods.
- Pass appropriate filter in service methods.
The dashboard should now show the appropriate hosts for a user's team membership.
- Add `team_id` field to secrets.
- Remove secret `name` and `active` fields (migration deletes inactive secrets).
- Assign hosts to Team based on secret provided.
- Add API for retrieving secrets by Team.
- Accept Teams as a searchable target type for the target selection API.
- Accept Teams for targets in running live queries.
- Refactoring to support these changes.
- Update API documentation.
- Move host `additional` into a separate table.
- Join when that data is needed.
- API change: `/api/v1/fleet/hosts` now returns only the requested
`additional` columns, unless `*` is provided as the sole argument.
Background:
A customer reported that MySQL binlogs grew huge and replication lag
went way up when data was stored in the `additional` column. In this
deployment MySQL was running with ROW replication. This would cause the
entire `additional` data to be copied on each update of the host checkin
time. While switching to STATEMENT or MIXED replication would likely
mitigate the issue, this was not an option in their environment.
Some datastore and service methods would return slices of structs,
rather than slices to pointers of structs (which most methods used).
Make this more consistent.
- Include only hosts that the user has access to in search targets API.
- Add parameter to specify whether `observer` hosts should be included.
- Generate counts based on which hosts user can access.
- Update API doc.
1. use [staticcheck](https://staticcheck.io/) to check the code, and fix some issues.
2. use `go fmt` to format the code.
3. use `go mod tidy` clean the go mod.
Introduces the appropriate cascading for foreign keys on the
scheduled_query_stats table to prevent errors when deleting the
associated packs, scheduled queries, and queries.
Fixes#764Fixes#766
This allows the host details to be refetched on the next check in,
rather than waiting for the normal interval to go by. Associated UI
changes are in-progress.
- Migration and service methods for requesting refetch.
- Expose refetch over API.
- Change detail query logic to respect this flag.
- Allow agent options to be set on per-team basis.
- Move global agent options into app configs.
- Update logic for calculating agent options for hosts.
- Updates to relevant testing.
- Maintain software inventory with detail queries.
- Associated database migrations.
- Feature flagged off by default (see documentation for details to turn on).
- Documentation.
- New test helper for slice element comparisons skipping ID.
Instead of synchronously updating the seen_time column for a host on an update, batch these updates to be written together every 1 second.
This results in a ~33% reduction in MySQL CPU usage in a local test with 4,000 simulated hosts and MySQL running in Docker.
- Migrate old admins to global admins
- Migrate old non-admins to global maintainers
- Remove old admin column
- Give initial user global admin privilege
- Comment out some tests (to be refactored for new permissions model later)
- Reorder migrations post-rebase
- Fix global_role in user payload
- Add teams/roles to invite entities
- Add teams/roles support to invite datastore methods
- Update tests
- Carry over team information from invite when creating user
- Fix issue with built-in labels showing multiple platforms when hosts
are reinstalled with new platform.
- Add Red Hat Linux built-in label.
- Display more labels by default in target selector.
Fixes#546, #553
The AuthenticateHost loading of hosts accidentally dropped IP addresses,
which would cause the IP to be dropped on save under certain scenarios.
Also fixes a potential issue with flapping host additional info.
Fixes#358
The enrollment cooldown period was sometimes causing problems when
osquery (probably unintentionally, see
https://github.com/osquery/osquery/issues/6993) tried to enroll more
than once from the same osqueryd process.
We now set this to default to off and make it configurable. With #417
this feature may be unnecessary for most deployments.
Uses a LIKE clause to search for hosts matching the query against
columns `host_name`, `uuid`, `hardware_serial`, and `primary_ip`.
Introduces the `searchLike` helper to add the appropriate filters to the
SQL query.
On new installations we unintentionally set the enroll secret to empty
string during database migrations. The enroll secret would be reset
during the setup process. This fixes the migration to not create any
enroll secret until the setup process.
The current implementation of FleetDM doesn't support Docker secrets for supplying the MySQL password and JWT key. This PR provides the ability for a file path to read in secrets. The goal of this PR is to avoid storing secrets in a static config or in an environment variable.
Example config for Docker:
```yaml
mysql:
address: mysql:3306
database: fleet
username: fleet
password_path: /run/secrets/mysql-fleetdm-password
redis:
address: redis:6379
server:
address: 0.0.0.0:8080
cert: /run/secrets/fleetdm-tls-cert
key: /run/secrets/fleetdm-tls-key
auth:
jwt_key_path: /run/secrets/fleetdm-jwt-key
filesystem:
status_log_file: /var/log/osquery/status.log
result_log_file: /var/log/osquery/result.log
enable_log_rotation: true
logging:
json: true
```
This adds the option to set up an S3 bucket as the storage backend for file carving (partially solving #111).
It works by using the multipart upload capabilities of S3 to maintain compatibility with the "upload in blocks" protocol that osquery uses. It does this basically replacing the carve_blocks table while still maintaining the metadata in the original place (it would probably be possible to rely completely on S3 by using object tagging at the cost of listing performance). To make this pluggable, I created a new field in the service struct dedicated to the CarveStore which, if no configuration for S3 is set up will be just a reference to the standard datastore, otherwise it will point to the S3 one (effectively this separation will allow in the future to add more backends).
This addresses an issue some users experienced in which performance
problems were encountered when hosts were "competing" for enrollment
using the same osquery host identifier. The issue is addressed by adding
a cooldown period for host enrollment, preventing the same (as judged by
osquery host identifier) host from enrolling more than once per minute.
When users end up in the problematic scenario, they will see quite a bit
of error logs due to this issue. For now that's probably a good thing as
users need to be aware of the lack of visibility. We can explore rate
limiting the logging if that becomes an issue for someone.
Fixes#102
Replace the now-deleted migration
server/datastore/mysql/migrations/data/20181119180000_DeleteSoftDeletedEntities.go
with a new migration containing the same timestamp. This allows Fleet to
see the appropriate migration state for users upgrading from previous
versions without actually modifying the DB.
Fixes#48
This is another error introduced in
https://github.com/kolide/fleet/pull/2327 we did not catch previously
due to insufficient unit test coverage. Test is now added.
- Add endpoints for osquery to register and continue a carve.
- Implement client functionality for retrieving carve details and contents in fleetctl.
- Add documentation on using file carving with Fleet.
Addresses kolide/fleet#1714
Changes in https://github.com/kolide/fleet/pull/2327 broke the MySQL
syntax for listing hosts with online status. This was not caught due to
the lack of a unit test for the functionality. This PR adds a unit test
and fixes the regression.
* Perform migration to delete any entries with `deleted` set, and
subsequently drop columns `deleted` and `deleted_at`.
* Remove `deleted` and `deleted_at` references.
Closes#2146
- Debounce frontend to reduce number of target searches in live query.
- More efficiently calculate label counts in live query and hosts
dashboard. Instead of using the (slow) CountHostsInTargets function,
retrieve the host counts while looking up the labels.
- Optimize targets search query. Removing the nested query retrieves the
same logical result set, but substantially optimizes MySQL CPU usage.
Testing indicates about a 50% reduction in MySQL CPU usage for the
frontend targets search API call after applying this change.
Getting a single host with `fleetctl get host foobar` will look up the
host with the matching hostname, uuid, osquery identifier, or node key,
and provide the full host details along with the labels the host is a
member of.
"Manual" labels can be specified by hostname, allowing users to specify
the membership of a label without having to use a dynamic query. See the
included documentation.
Label membership is now stored in the label_membership table. This is
done in preparation for adding "manual" labels, as previously label
membership was associated directly with label query executions.
Label queries are now all executed at the same time, rather than on
separate intervals. This simplifies the calculation of which distributed
queries need to be run when a host checks in.
This commit takes advantage of the existing pagination APIs in the Fleet
server, and provides additional APIs to support pagination in the web
UI. Doing this dramatically reduces the response sizes for requests from
the UI, and limits the performance impact of UI clients on the Fleet and
MySQL servers.
This change optimizes live queries by pushing the computation of query
targets to the creation time of the query, and efficiently caching the
targets in Redis. This results in a huge performance improvement at both
steady-state, and when running live queries.
- Live queries are stored using a bitfield in Redis, and takes
advantage of bitfield operations to be extremely efficient.
- Only run Redis live query test when REDIS_TEST is set in environment
- Ensure that live queries are only sent to hosts when there is a client
listening for results. Addresses an existing issue in Fleet along with
appropriate cleanup for the refactored live query backend.
Fleet used significant resources storing the full network interface
information for each host. This data was unused, except to get the
IP and MAC of the primary interface. With these changes, only those
pieces of data are stored.
- Calculate and store primary IP and MAC
- Remove transaction for storing full interfaces
- Update targets search to use new IP and MAC columns
- Update frontend to use new new columns
This PR removes unused types, code, DB tables, and associated migrations that are unused since Fleet 2.0.
An existing migration was refactored, and should remain compatible with both existing and new Fleet installations.
Additional information is collected when host details are updated using
the queries specified in the Fleet configuration. This additional
information is then available in the host API responses.
This adds a SQL injection prevention for a case in which we cannot use
parameters in the query.
It is not clear that this was possible to exploit. If it was possible,
it would have required a valid login to the Fleet server.
- Add toggle to disable live queries in advanced settings
- Add new live query status endpoint (checks for disabled via config and Redis health)
- Update QueryPage UI to use new live query status endpoint
Implements #2140
Almost two years ago, we began referring to the project as Fleet, but there are
many occurences of the term "Kolide" throughout the UI and documentation. This
PR attempts to clear up those uses where it is easily achievable.
The term "Kolide" is used throughout the code as well, but modifying this would
be more likely to introduce bugs.
Brings the behavior of the server in line with the documentation, by using the
query name if the scheduled query name is not specified in a pack spec.
Closes#1990
Avoids potential bugs in which soft-deleted entities are returned from database
queries (soft-deletion is now deprecated), but some records may still exist.
Fixes#1956
This should fix the loading of the all hosts page in cases where there are many
hosts and it overwhelms the number of parameters allowed in a prepared
statement. May also make that page load slightly quicker as it removes the
constraint from the query, but should return the same number of results.
Fixes#1939
Packs can be targeted to individual hosts through the UI. This was supported
previously and was broken with refactoring in Fleet 2.0.
There is currently no support in the fleetctl format for targeting individual
hosts, but this could be added at a later date.
Fixes#1878
- Delete duplicate queries in packs created by the UI (because the duplicates
were causing undefined behavior). Now it is not possible to schedule
duplicates in the UI (but is in fleetctl).
- Fix bug in which packs created in UI could not be loaded by fleetctl.
- Add cascading deletes for scheduled_queries when queries are deleted
- Also add cascading deletes for scheduled_queries when packs are deleted
Fixes#1837
Replaces the UI endpoints for creating and modifying labels. These were removed
in #1686 because we thought we were killing the UI.
Now labels can be created and edited in the UI again.
Replaces (and appropriately refactors) a number of endpoints that were removed long ago when we decided to kill the UI with the fleetctl release. We turned out not to do this, and now need to restore these missing endpoints.
This is not a straight up replacement of the existing code because of refactoring to the DB schemas that was also done in the migration.
Most of the replaced code was removed in #1670 and #1686.
Fixes#1811, fixes#1810
With the UI, deleting by ID made sense. With fleetctl, we now want to delete
by name. Transition only the methods used for spec related entities, as others
will be removed soon.
Previously decorators were stored in a separate table. Now they are stored
directly with the config so that they can be modified on a per-platform basis.
Delete now unused decorators code.
- Add new Apply spec methods for queries and packs
- Remove now extraneous datastore/service methods
- Remove import service (unused, and had many dependencies that this breaks)
- Refactor tests as appropriate
This PR adds support for file integrity monitoring. This is done by providing a simplified API that can be used to PATCH/GET FIM configurations. There is also code to build the FIM configuration to send back to osquery. Each PATCH request, if successful, replaces Fleet's existing FIM configuration. For example:
curl -X "PATCH" "https://localhost:8080/api/v1/kolide/fim" \
-H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzZXNzaW9uX2tleSI6IkVhaFhvZWswMGtWSEdaTTNCWndIMnhpYWxkNWZpcVFDR2hEcW1HK2UySmRNOGVFVE1DeTNTaUlFWmhZNUxhdW1ueFZDV2JiR1Bwdm5TKzdyK3NJUzNnPT0ifQ.SDCHAUA1vTuWGjXtcQds2GZLM27HAAiOUhR4WvgvTNY" \
-H "Content-Type: application/json; charset=utf-8" \
-d $'{
"interval": 500,
"file_paths": {
"etc": [
"/etc/%%"
],
"users": [
"/Users/%/Library/%%",
"/Users/%/Documents/%%"
],
"usr": [
"/usr/bin/%%"
]
}
}'
Closes issue #1475
The command line tool that uses this endpoint -> https://github.com/kolide/configimporter
* Added support for atomic imports and dry run imports
* Added code so that imports are idempotent
Closes issue #1456 This PR adds a single sign on option to the login form, exposes single sign on to the end user, and allows an admin user to set single sign on configuration options.
Closes#1502. This PR adds support for SSO to the new user creation process. An admin now has the option to select SSO when creating a new user. When the confirmation form is submitted, the user is automatically authenticated with the IDP, and if successful, is redirected to the Kolide home page. Password authentication, password change and password reset are not allowed for an SSO user.
This PR partially addresses #1456, providing SSO SAML support. The flow of the code is as follows.
A Kolide user attempts to access a protected resource and is directed to log in.
If SSO identity providers (IDP) have been configured by an admin, the user is presented with SSO log in.
The user selects SSO, which invokes a call the InitiateSSO passing the URL of the protected resource that the user was originally trying access. Kolide server loads the IDP metadata and caches it along with the URL. We then build an auth request URL for the IDP which is returned to the front end.
The IDP calls the server, invoking CallbackSSO with the auth response.
We extract the original request id from the response and use it to fetch the cached metadata and the URL. We check the signature of the response, and validate the timestamps. If everything passes we get the user id from the IDP response and use it to create a login session. We then build a page which executes some javascript that will write the token to web local storage, and redirect to the original URL.
I've created a test web page in tools/app/authtest.html that can be used to test and debug new IDP's which also illustrates how a front end would interact with the IDP and the server. This page can be loaded by starting Kolide with the environment variable KOLIDE_TEST_PAGE_PATH to the full path of the page and then accessed at https://localhost:8080/test
Replaces the existing calculation that uses a global online interval. This method was lacking due to the fact that different hosts may have different checkin intervals set.
The new calculation uses `min(distributed_interval, config_tls_refresh) + 30` as the interval. This is calculated with the stored values for each host.
Closes#1321
Partially addresses #1456. This PR provides datastore support for SSO by creating a new entity IdentityProvider. This entity is an abstraction of the SAML IdentityProvider and contains the data needed to perform SAML authentication.
We now track the `config_tls_refresh`, `distributed_interval` and
`logger_tls_period` flag values for each host. Each value is updated by a
detail query agains the `osquery_flags` table, because they may be specified
outside of Kolide. The flags that can be specified within Kolide are also
updated when a config is returned to the host that changes their value.
This will enable us to do a more accurate per-host online status calculation as
discussed in #1419.
Closes issue #1388. The problem here is that previously, the reset button loaded a hard coded list of default options into the component state, instead of the proper behavior which is to reset the options to default values on the back end, and then load them back into the redux store. This PR adds a ResetOptions endpoint on the server, and wires up the UI so that it triggers the endpoint, then loads the default options from the backend server.
Closes issue #1390
There were quite a few places where UPDATES could fail silently because we weren't checking target rows where actually found where we expect them to be. In order to address this problem clientFoundRows was set in the sql driver configuration and checks for UPDATES were added to determine if matched rows were found where we expect them to be.
Push the calculation of target counts into the SQL query, rather than loading
all of the targets and then counting them. This provides a dramatic (>100x)
speedup in loading of the manage packs page when large numbers of hosts are
present.
Closes#1426
On CentOS6 there is a bug in which osquery incorrectly reports an empty string
for platform. This PR fixes our detection of centos in this case.
Fixes#1339
- Set default database character set to utf8mb4
- Convert character sets for each table to utf8mb4
- Use utf8mb4 as charset in connection string
Closes#1268
Improve the mechanism used to calculate whether or not hosts are online.
Previously, hosts were categorized as "online" if they had been seen within the past 30 minutes. To make the "online" status more representative of reality, hosts are marked "online" if the Kolide server has heard from them within two times the lowest polling interval as described by the Kolide-managed osquery configuration. For example, if you've configured osqueryd to check-in with Kolide every 10 seconds, only hosts that Kolide has heard from within the last 20 seconds will be marked "online".
Due to recreating the 'All Hosts' label in #1282, we get inconsistent counts
for hosts that have not checked in since that migration. This seems acceptable
for other labels, but it is important that 'All Hosts' really includes all the
hosts.
This migration adds all the hosts into that label.
Fixes#1329
Ensure that host network interfaces do not disappear when they (unexpectedly)
are returned with no updates from osquery. Add test to verify.
Fixes#1278
In some MySQL configurations, using a GROUP BY that doesn't refer to every
column in the SELECT will throw errors. Replace the use of GROUP BY with SELECT
DISTINCT as this is also more clear as to the intentions of the query.
Fixes#1249
* Change email functionality
* Code review changes for @groob
* Name change per @groob
* Code review changes per @marpaia
Also added addition non-happy path tests to satisfy concerns by @groob
* correctly list packs in response
Using append was adding a default pack response to the list of packs
* handle unique index for packs that exist but are deleted
Changing from the existing method of adding built in labels at server startup.
This new method should be friendlier to long term changes, and falls in line
with the new pattern established for osquery options.
Fixes#702
This PR separates the table migrations from the data population migrations. Table migrations run before data migrations.
Now, we have the ability to create the database tables without populating them with data. This can be useful for running "unit" tests against a MySQL store that doesn't have any pre-populated data. When performing real migrations, or for more "integration" style testing, the data migrations can also be executed.
Note there are some special cases that must be observed with these migrations, and the README is updated to reflect those.
* Initial scaffolding of the host summary endpoint
* inmem datastore implementation of GenerateHostStatusStatistics
* HostSummary docstring
* changing the url of the host summary endpoint
* datastore tests for GenerateHostStatusStatistics
* MySQL datastore implementation of GenerateHostStatusStatistics
* <= and >= to catch exact time edge case
* removing clock interface method
* lowercase error wraps
* removin superfluous whitespace
* use updated_at
* adding a seen_at column to the hosts table
* moving the update of seen_time to the caller
* using db.Get instead of db.Select
This PR adds the `host_ids` and `label_ids` field to the packs HTTP API so that one can operate on the hosts/labels which a pack is scheduled to be executed on. This replaces (and deletes) the `/api/v1/kolide/packs/123/labels/456` API in favor of `PATCH /api/v1/packs/123` and specifying the `label_ids` field. This also allows for bulk operations.
Consider the following API examples:
## Creating a pack with a known set of hosts and labels
The key addition is the `host_ids` and `label_ids` field in both the request and the response.
### Request
```
POST /api/v1/kolide/packs
```
```json
{
"name": "My new pack",
"description": "The newest of the packs",
"host_ids": [1, 2, 3],
"label_ids": [1, 3, 5]
}
```
### Response
```json
{
"pack": {
"id": 123,
"name": "My new pack",
"description": "The newest of the packs",
"platform": "",
"created_by": 1,
"disabled": false,
"query_count": 0,
"total_hosts_count": 5,
"host_ids": [1, 2, 3],
"label_ids": [1, 3, 5]
}
}
```
## Modifying the hosts and/or labels that a pack is scheduled to execute on
### Request
```
PATCH /api/v1/kolide/packs/123
```
```json
{
"host_ids": [1, 2, 3, 4, 5],
"label_ids": [1, 3, 5, 7]
}
```
### Response
```json
{
"pack": {
"id": 123,
"name": "My new pack",
"description": "The newest of the packs",
"platform": "",
"created_by": 1,
"disabled": false,
"query_count": 0,
"total_hosts_count": 5,
"host_ids": [1, 2, 3, 4, 5],
"label_ids": [1, 3, 5, 7]
}
}
```
close#633
with an exposed interface.
Not checking for a specific sentinel error reduces coupling between packages
and allows adding context like the resource ID and resource type.
Operator precedence was causing incorrect results to be returned. The failing
test was missed because the CI results did not appear in Github before merging.
* Moving query attributes from the query object to the pack-query relationship
* some additional tests
* http request parsing test
* QueryOptions in new test_util code
* initial scaffolding of new request structures
* service and datastore
* test outline
* l2 merge conflict scrub
* service tests for scheduled query service
* service and datastore tests
* most endpoints and transports
* order of values are not deterministic with inmem
* transport tests
* rename PackQuery to ScheduledQuery
* removing existing implementation of adding queries to packs
* accounting for the new argument to NewQuery
* fix alignment in sql query
* removing underscore
* add removed to the datastore
* removed differential from the schema
- New datastore method for bulk deletion
- New service method calling this datastore method
- Endpoint, transport and handler connections for service method
Closes#389
- Add saved state to query (to differentiate queries explicitly saved from
those just run as distributed queries)
- Remove unique constraint on query name
Closes#390