# How to use osquery evented tables
Osquery lets you see the state of your computers right now. However, because each query returns a point-in-time snapshot, you have to query a table repeatedly if you want to view data over time. Actively watching and diffing tables can be challenging even with automation, especially for short-lived or high-churn processes.
This is where osquery's evented tables can help. Instead of displaying the point-in-time state for your host, osquery's evented tables store the host's historical data. You can configure osquery to capture certain types of information, which will be stored in the relevant `*_events` table for later analysis.
In this guide, we’ll go through the concepts, considerations, and best practices for setting up evented tables in osquery. We’ll also cover basic information about commonly used evented tables to help you get started.
## How do osquery evented tables work?
Osquery does not generate the events itself. Instead, it reads and formats event data generated by various OS components. For example, on Linux, the audit framework generates and broadcasts process and socket event data. Osquery receives the data, converts it into event rows, and buffers them (in its internal [RocksDB](http://rocksdb.org) store) in the `process_events` and `socket_events` tables to await querying. The data can then be filtered and transformed via SQL and shipped to a log destination with the scheduled query functionality.
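
As a rough illustration of that flow, the sketch below stands in for osquery with Python's built-in `sqlite3` module: a hypothetical list of decoded records is turned into rows of an in-memory `process_events` table and then filtered with SQL. The sample records and the reduced column set (`pid`, `path`, `cmdline`, `time`) are assumptions made for the example, and the real buffer lives in RocksDB rather than in an ordinary SQLite table.

```python
import sqlite3

# Hypothetical decoded records, standing in for what the utility
# (for example, the Linux audit framework) would broadcast.
raw_records = [
    {"pid": 101, "path": "/usr/bin/curl", "cmdline": "curl https://example.com", "time": 1700000000},
    {"pid": 102, "path": "/usr/bin/sed", "cmdline": "sed -n 1p /etc/hosts", "time": 1700000005},
    {"pid": 103, "path": "/usr/bin/ssh", "cmdline": "ssh admin@10.0.0.5", "time": 1700000010},
]


def buffer_events(records):
    """Convert each record into an event row and buffer it in an in-memory table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE process_events (pid INTEGER, path TEXT, cmdline TEXT, time INTEGER)"
    )
    db.executemany(
        "INSERT INTO process_events VALUES (:pid, :path, :cmdline, :time)",
        records,
    )
    return db


def query_events(db, since):
    """Filter the buffered rows with SQL, much as a scheduled query would."""
    cursor = db.execute(
        "SELECT pid, path FROM process_events WHERE time > ? ORDER BY time",
        (since,),
    )
    return cursor.fetchall()


db = buffer_events(raw_records)
rows = query_events(db, since=1700000002)
print(rows)
```

Only the rows newer than the `since` timestamp come back, which mirrors how a query against an evented table reads from the buffer instead of inspecting the host's live state.
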
For the purposes of this article, we'll use the term "utility" to mean the underlying OS component that osquery subscribes to for its various evented tables.
This separation between osquery and the utility means that some evented tables rely on configurations for the utility to determine which events will be generated. At the utility level, you can specify what data is captured. At the osquery level, you can specify what information is ingested, presented, and transmitted. For most evented tables, osquery works great out of the box with the utility’s default configurations, but some use cases may require adjusting the utility configuration.
## How do I know whether a table is evented?
You can tell that an osquery table is evented in two ways:
- The table ends in `_events`.
- In the [osquery schema](https://osquery.io/schema), the table is marked with an "evented" tag near the table name.
## What do I need to consider when configuring evented tables?
### Performance impact
Capturing event data generates performance overhead from both the utility and osquery. If the utility is configured loosely to generate more events, then the utility performs more operations to generate events and osquery parses and stores more events.
Capturing only the events you need will cut down on the amount of work the host needs to do. For example, when monitoring processes, there may be frequent but low-value processes such as `awk` and `sed` which can be ignored, reducing work for the host. Collecting “good enough” data is key in managing performance impact.
Also, consider the impact of the queries you’re using to collect your event data. Queries that narrow results with `WHERE` clauses will be fairly efficient (and minimize data volume), while queries with many `JOIN`s or wildcards (`%`) will use more resources.
Even after considering all of these factors, you may need to give osquery some additional resources. Osquery's watchdog automatically cancels queries if they exceed certain system usage. This can be adjusted with the following flags:
* `--watchdog_memory_limit` changes the maximum memory usage (expressed in MBs).
* `--watchdog_utilization_limit` changes the maximum CPU usage, measured in CPU cycles (defined as the `processes` table's `user_time` and `system_time`). The limit is enforced when usage stays above it for longer than the number of seconds set by `--watchdog_latency_limit`.
* `--watchdog_delay` sets the delay in seconds before CPU and memory usage limits will be enforced (60 seconds by default).
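
These flags are usually kept together in an osquery flagfile. The sketch below renders one from a Python dictionary. The numeric values are placeholders chosen for illustration, not recommendations, and `render_flagfile` is a hypothetical helper, not part of osquery.

```python
def render_flagfile(flags):
    """Render a dict of flag names and values as flagfile text, one flag per line."""
    lines = []
    for name, value in flags.items():
        # Write booleans in the lowercase true/false form used throughout this guide.
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f"--{name}={value}")
    return "\n".join(lines) + "\n"


# Placeholder values for illustration only; tune them for your own hosts.
watchdog_flags = {
    "watchdog_memory_limit": 350,       # MB
    "watchdog_utilization_limit": 30,   # CPU limit, as described above
    "watchdog_latency_limit": 10,       # seconds
    "watchdog_delay": 60,               # seconds
}

flagfile_text = render_flagfile(watchdog_flags)
print(flagfile_text)
```

The resulting text can be saved to a file and passed to osquery with the `--flagfile` option.
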
### Disk usage and data retention
Osquery collects data from the utility, formats it into an event row, and stores it in the evented table for querying. Of course, the more event data is collected, the more disk space it occupies. We recommend that you do not rely on osquery as long-term storage for event data. Instead, regularly schedule the data to be sent to an external destination for future analysis. Osquery has built-in options to automatically clean up the data.
The following osquery flags will help you manage the size of osquery's data:
* `--events_max` sets the maximum number of event rows per evented table to store in the buffer before expiring them with a default value of 50,000.
* `--events_expiry` sets the lifetime of event rows in seconds, with a default value of 3,600 (1 hour). An event only expires if a query has been run against the table after the event was generated. When combined with scheduled queries, this is a handy way to clean out data automatically. Some osquery practitioners set this to `1` so that the data is cleared out immediately when a scheduled query runs.
* `--events_optimize=true` saves the time that this table was last queried and only returns events after that time (enabled by default). This can be overridden in a one-off query by specifying the `time` column in a `WHERE` clause.
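
To make the interaction between these three flags easier to picture, here is a toy model of a single evented table's buffer. It is only a sketch of the behavior described above, not osquery's implementation: the `ToyEventBuffer` class, its method names, and the integer timestamps are all invented for the example.

```python
class ToyEventBuffer:
    """Toy model of one evented table's buffer (not osquery's actual implementation)."""

    def __init__(self, events_max, events_expiry, events_optimize=True):
        self.events_max = events_max
        self.events_expiry = events_expiry
        self.events_optimize = events_optimize
        self.rows = []               # (time, payload) tuples, oldest first
        self.last_query_time = None

    def add(self, time, payload):
        """Buffer a new event, dropping the oldest rows beyond events_max."""
        self.rows.append((time, payload))
        if len(self.rows) > self.events_max:
            self.rows = self.rows[-self.events_max:]

    def query(self, now):
        """Return rows for a query at time `now`, then expire old rows."""
        if self.events_optimize and self.last_query_time is not None:
            # Only return events generated after the previous query.
            result = [row for row in self.rows if row[0] > self.last_query_time]
        else:
            result = list(self.rows)
        self.last_query_time = now
        # Expiry only happens here, when a query runs after the event was stored.
        self.rows = [row for row in self.rows if now - row[0] < self.events_expiry]
        return result


buf = ToyEventBuffer(events_max=3, events_expiry=1)
for t in (10, 11, 12, 13):
    buf.add(t, f"event-{t}")

first = buf.query(now=14)    # the event at t=10 was already dropped by events_max
buf.add(15, "event-15")
second = buf.query(now=16)   # only the event added since the previous query
print(first, second)
```

With the expiry set to `1` in this model, every query empties the buffer after reading it, which is the pattern some practitioners use alongside scheduled queries.
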
You could also reduce data volume by ignoring extraneous data, which can be done at either level. If you configure the utility to ignore it, the data is never generated in the first place, which minimizes resource and disk usage for both the utility and osquery. Alternatively, osquery can filter out the extraneous processes in the `SELECT` statement. Both approaches reduce data volume.
There is always a risk of data loss. Both osquery and the utilities limit their disk and processor usage, which can result in data loss during periods of high system load. Osquery will log when it's dropping events due to high load, though this will not detect when events are dropped by the utility. To address the risk of data loss, you could schedule queries to run more frequently or allocate more resources to osquery or the utility. Of course, this comes at a cost and requires tuning to balance risk and reward.
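
In a plain osquery configuration, running a query more frequently comes down to lowering its `interval` in the `schedule` section. The sketch below builds such a fragment in Python. The query name `process_events_collect`, the selected columns, and the 60-second interval are placeholders for illustration.

```python
import json


def build_schedule(name, query, interval):
    """Return an osquery config fragment containing a single scheduled query."""
    return {"schedule": {name: {"query": query, "interval": interval}}}


config = build_schedule(
    name="process_events_collect",   # hypothetical query name
    query="SELECT pid, path, cmdline, time FROM process_events;",
    interval=60,                     # seconds between runs; placeholder value
)
print(json.dumps(config, indent=2))
```

A shorter interval drains the buffer more often, at the cost of more frequent query executions on the host.
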
### Test for impact
Getting the right setup that balances performance, data volume, and data usefulness for the evented tables requires some trial and error. The best way is to try things out on a progressively larger set of machines. We recommend setting up a canary team on Fleet to test different combinations of configurations.
The `osquery_schedule` table will list all scheduled queries and recent information about their memory usage and execution time. Note that this table does not have visibility into the utility. For lower-level visibility, use the OS-native profilers.
### Useful troubleshooting tools built into osquery
* The `osquery_events` table tells you which evented tables are turned on (`active` column) and the number of events stored (`events` column) per table.
* The `osquery_flags` table tells you the current set of flags for osquery. You can use this to confirm the desired flags are set correctly.
* The `osquery_schedule` table lists all scheduled queries and collects memory and execution time for the latest execution.
* The `--verbose` flag will generate more logs with troubleshooting information.
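
The queries below show one way to read these tables. The Python wrapper around them is a hypothetical convenience that shells out to `osqueryi --json`, and the flag names in the second query are simply the ones discussed earlier in this guide.

```python
import json
import shutil
import subprocess

TROUBLESHOOTING_QUERIES = {
    # Which evented tables are active, and how many events each one holds.
    "evented_tables": "SELECT name, active, events FROM osquery_events;",
    # Confirm that the event-related flags have the values you expect.
    "event_flags": (
        "SELECT name, value FROM osquery_flags "
        "WHERE name IN ('disable_events', 'events_expiry', 'events_max');"
    ),
    # Resource usage of scheduled queries.
    "schedule_stats": (
        "SELECT name, interval, executions, wall_time, average_memory "
        "FROM osquery_schedule;"
    ),
}


def build_command(sql, binary="osqueryi"):
    """Build the argument list for a one-off query with JSON output."""
    return [binary, "--json", sql]


def run_query(sql):
    """Run a query through osqueryi, or return None if osqueryi is not installed."""
    if shutil.which("osqueryi") is None:
        return None
    completed = subprocess.run(
        build_command(sql), capture_output=True, text=True, check=True
    )
    return json.loads(completed.stdout)
```

Keep in mind that `osqueryi` starts its own standalone instance, so the results reflect that shell's flags and schedule, not those of a running `osqueryd`. To inspect a managed host, the same SQL can be run as a live query in Fleet.
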
## How do I turn on an evented table?
To turn on osquery's eventing system, set the flag `--disable_events=false`. Eventing is disabled by default.
Each evented table is turned on by its own flag. For most evented tables, when you turn them on in osquery, osquery will use the default configuration of the utility. The defaults are good enough for most situations.
However, we recommend getting to know the underlying utility to optimize it for your use case. Let's go over the following topics:
1. File integrity monitoring
2. Process auditing
3. YARA scanning
### File integrity monitoring (FIM)
FIM refers to the monitoring of key files or filepaths. FIM enables organizations to audit the history of critical resources, detect intrusions, and apply remediations.
On all three OSs, in the osquery configuration, use the `file_paths` key to specify the files and directories from which osquery should collect `file_events` data. Use the `exclude_paths` key to ignore files and directories that generate too much noise. [Wildcards](https://osquery.readthedocs.io/en/stable/deployment/file-integrity-monitoring/#matching-wildcard-rules) are available in these configuration options. On Linux, there is a further `file_accesses` option, which specifies the file locations where an "access" event should be recorded in addition to created/modified/deleted.
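
The sketch below assembles these three keys into a configuration fragment. The category names and paths are examples only, and the check that every `exclude_paths` and `file_accesses` entry names a category defined in `file_paths` is an assumption of this sketch based on how the osquery FIM docs lay out their example. `build_fim_config` is a hypothetical helper.

```python
import json


def build_fim_config(file_paths, exclude_paths=None, file_accesses=None):
    """Assemble the FIM-related keys of an osquery configuration."""
    exclude_paths = exclude_paths or {}
    file_accesses = file_accesses or []
    # Assumption: excludes and access auditing refer to categories from file_paths.
    unknown = (set(exclude_paths) | set(file_accesses)) - set(file_paths)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    config = {"file_paths": file_paths}
    if exclude_paths:
        config["exclude_paths"] = exclude_paths
    if file_accesses:
        config["file_accesses"] = file_accesses
    return config


fim = build_fim_config(
    # In osquery's wildcard rules, % matches one path level and %% matches recursively.
    file_paths={"etc": ["/etc/%%"], "homes": ["/home/%/.ssh/%%"]},
    exclude_paths={"homes": ["/home/not_to_monitor/.ssh/%%"]},
    file_accesses=["etc"],   # Linux only
)
print(json.dumps(fim, indent=2))
```
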
#### FIM on macOS
To turn `file_events` on for macOS, use the flag `--enable_file_events=true`. The corresponding utility is [FSEvents](https://developer.apple.com/library/archive/documentation/Darwin/Conceptual/FSEvents_ProgGuide/TechnologyOverview/TechnologyOverview.html#//apple_ref/doc/uid/TP40005289-CH3-SW1).
On macOS, there is also an `es_process_file_events` table that uses the [EndpointSecurity](https://developer.apple.com/documentation/endpointsecurity) API. To use it, osquery needs Full Disk Access permission, which can be [granted manually or via MDM](https://osquery.readthedocs.io/en/latest/deployment/process-auditing/#full-disk-access). Turn the table on with the flags `--disable_endpointsecurity=false --disable_endpointsecurity_fim=false`.
`es_process_file_events` records which processes accessed which files, whereas `file_events` does not. However, `es_process_file_events` will generate more data volume because it captures everything by default. Currently, you can configure EndpointSecurity to [ignore certain file paths](https://osquery.readthedocs.io/en/stable/installation/cli-flags/#macos-only-events-control-flags), but there is no way to configure it to only watch certain filepaths.
Due to the data volume, Fleet suggests using `file_events` for macOS, but `es_process_file_events` is an option if you need to know which process accessed a file.
#### FIM on Linux
To turn `file_events` on for Linux, use the flag `--enable_file_events=true`. The corresponding utility is [inotify](https://man7.org/linux/man-pages/man7/inotify.7.html).
Linux has a `process_file_events` table that uses the [audit framework](https://wiki.archlinux.org/title/Audit_framework). To use this table, use the flags `--disable_audit=false --audit_allow_fim_events=true`.
Fleet recommends using the `process_file_events` table since it also includes data for which process accessed which file.
#### FIM on Windows
For Windows, use the `--enable_ntfs_event_publisher=true` flag to turn on `ntfs_journal_events`. The corresponding utility is [NTFS Journal](https://docs.microsoft.com/en-us/windows/win32/fileio/change-journal-records).
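
To keep the per-platform options straight, the flags from the sections above can be collected into a small lookup. This is only a convenience sketch: `FIM_TABLE_FLAGS` and `fim_flags` are hypothetical names, and the flags are the ones listed in this guide.

```python
# Flags for each FIM table, as listed in the platform sections above.
FIM_TABLE_FLAGS = {
    ("macos", "file_events"): ["--enable_file_events=true"],
    ("macos", "es_process_file_events"): [
        "--disable_endpointsecurity=false",
        "--disable_endpointsecurity_fim=false",
    ],
    ("linux", "file_events"): ["--enable_file_events=true"],
    ("linux", "process_file_events"): [
        "--disable_audit=false",
        "--audit_allow_fim_events=true",
    ],
    ("windows", "ntfs_journal_events"): ["--enable_ntfs_event_publisher=true"],
}


def fim_flags(platform, table):
    """Return the flags needed for one FIM table, including the eventing switch."""
    key = (platform.lower(), table)
    if key not in FIM_TABLE_FLAGS:
        raise ValueError(f"no FIM flags known for {table} on {platform}")
    # Eventing itself must be turned on before any evented table works.
    return ["--disable_events=false"] + FIM_TABLE_FLAGS[key]


print(" ".join(fim_flags("linux", "process_file_events")))
```
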
#### Learn more
[Read the osquery FIM docs](https://osquery.readthedocs.io/en/stable/deployment/file-integrity-monitoring/) for more information on file integrity monitoring with osquery.
### Process auditing
Process auditing refers to recording process executions and network, or socket, connections.
#### Process auditing on Linux
On Linux, there are two utilities that enable osquery process auditing: [eBPF](https://ebpf.io/what-is-ebpf) and the [audit framework](https://wiki.archlinux.org/title/Audit_framework).
The choice of utility depends on your situation. Here are some factors to consider:
- Audit has been supported since earlier Linux kernel versions (>2.6) than eBPF (>4.18).
- Only one consumer of audit’s logs is allowed at a time. The `--audit_persist=true` flag sets osquery to retry its connection to the audit logs.
- Audit has limited visibility inside containers.
- The audit table and the eBPF table return slightly different data.
We recommend that you try both and compare results for your use case.
To use the `bpf_process_events` and `bpf_socket_events` tables, use the flag `--enable_bpf_events=true`. See the [instructions on auditing using bpf](https://osquery.readthedocs.io/en/latest/deployment/process-auditing/#linux-process-and-socket-auditing-using-bpf) for more information.
To use `process_events` and `socket_events` with the audit framework, use the flags `--disable_audit=false --audit_allow_process_events=true --audit_allow_sockets=true`. See the [instructions on using audit](https://osquery.readthedocs.io/en/latest/deployment/process-auditing/#linux-process-auditing-using-audit) for more information.
#### Process auditing on macOS
On macOS, there are two utilities that enable osquery process auditing: [OpenBSM](https://github.com/openbsm/openbsm) and [EndpointSecurity](https://developer.apple.com/documentation/endpointsecurity). Fleet recommends using the EndpointSecurity implementation because it's intended to replace OpenBSM, which is deprecated. EndpointSecurity is available starting with macOS 10.15.
To use the `es_process_events` table, use the flag `--disable_endpointsecurity=false`. See the [EndpointSecurity instructions](https://osquery.readthedocs.io/en/latest/deployment/process-auditing/#auditing-processes-with-endpointsecurity) for more information. To use `process_events` and `socket_events` with OpenBSM, see the [OpenBSM instructions](https://osquery.readthedocs.io/en/latest/deployment/process-auditing/#auditing-processes-with-openbsm).
#### Process auditing on Windows
Currently, osquery does not support process auditing for Windows. To learn more about process auditing on Windows, visit [Microsoft's security auditing overview](https://docs.microsoft.com/en-us/windows/security/threat-protection/auditing/security-auditing-overview). Fleet is tracking work to build process auditing for Windows in osquery. [Stay up to date on GitHub](https://github.com/fleetdm/fleet/issues/7732).
### YARA scanning
[YARA](https://virustotal.github.io/yara/) is a malware research and detection tool available on Linux and macOS that allows users to create descriptions of malware families based on patterns of text or binary code. Each file is checked against the YARA rules, and a rule triggers if its specified conditions are met.
Osquery applies pre-specified YARA rules to incoming events in the `file_events` table to populate the `yara_events` table. As such, it requires the following flags:
* `--disable_events=false`
* `--enable_file_events=true`
With the appropriate flags set, specify the appropriate YARA rules in the osquery configuration using the format described in the [YARA configuration doc](https://osquery.readthedocs.io/en/stable/deployment/yara/#yara-configuration).
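
As a minimal sketch of that format, the snippet below builds a `yara` section in which named signature groups point at rule files and FIM categories point at signature groups. The group name, category name, and rule path are placeholders, and `build_yara_config` is a hypothetical helper. Treat the linked configuration doc as the authority on the exact layout.

```python
import json


def build_yara_config(signatures, file_paths):
    """Assemble the `yara` section of an osquery configuration."""
    # Every signature group referenced by a category must be defined.
    for category, groups in file_paths.items():
        missing = [group for group in groups if group not in signatures]
        if missing:
            raise ValueError(f"{category} references undefined groups: {missing}")
    return {"yara": {"signatures": signatures, "file_paths": file_paths}}


yara_config = build_yara_config(
    # Signature group name -> list of YARA rule files (placeholder path).
    signatures={"sig_group_1": ["/etc/osquery/yara/example.yar"]},
    # FIM category (defined under the top-level file_paths key) -> signature groups.
    file_paths={"system_binaries": ["sig_group_1"]},
)
print(json.dumps(yara_config, indent=2))
```

The category used here only produces `yara_events` rows if the same category is also monitored through the top-level `file_paths` key, since the scan is driven by incoming `file_events`.
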
Explore more topics and useful links for YARA:
* [Osquery's YARA scanning guide](https://osquery.readthedocs.io/en/stable/deployment/yara/#continuous-monitoring-using-the-yara_events-table)
* [How to write YARA rules](https://yara.readthedocs.io/en/stable/writingrules.html)
* [Repository of example YARA rules](https://github.com/Yara-Rules/rules)
* [Collection of useful YARA resources by InQuest](https://github.com/InQuest/awesome-yara)
* [YARA performance guidelines](https://github.com/Neo23x0/YARA-Performance-Guidelines/)
### Common event tables
These event tables are available in osquery. We will provide more information for them in other guides.
| Table name | OS | Flags |
| :- | :-- | :-- |
| apparmor_events | Linux | --audit_allow_apparmor_events=true |
| disk_events | macOS | no additional flags needed |
| hardware_events | macOS, Linux | no additional flags needed |
| seccomp_events | Linux | --audit_allow_seccomp_events=true |
| selinux_events | Linux | --audit_allow_selinux_events=true |
| syslog_events | Linux | no additional flags needed |
| user_interaction_events | macOS | --enable_keyboard_events=true --enable_mouse_events=true |
| user_events | Linux | --audit_allow_user_events=true |
| windows_events | Windows | --enable_windows_events_publisher=true |
| powershell_events | Windows | --enable_powershell_events_subscriber=true |