The refactor of config/packs was initiated because event subscribers needed
a method for toggling `::init` based on some configurable option. In the case
of auditd, turning on the support with `--disable_audit=false` used to start
auditing the EXECVE syscall. It was understandable that this would cause
latency based on the number of processes executing per measure of time.
A new `socket_events` table will do the same but for `bind` and `connect`. These
are less-obvious and for now, require a scan of /proc for socket tuples. In the
future this file descriptor to socket tuple will be faster.
There was a bug in the `osquery::Schedule` container object such that,
when the iteration through the schedule occured, pack objects were being
passed by value (copied) instead of passed by reference. Thus, the
discovery query would be executed, the object's cache would be updated,
and then the object would go out of scope and be destructed, thus
leaving the original object without ever having ran the discovery query.
This caused discovery queries to thrash. Bad times.
I added a new test so that we don't regress here as well as const'd a
few functions that should have been const in `osquery::Pack`.
This commit contains the features specified in #1390 as well as a
refactoring of the general osquery configuration code.
The API for the config plugins hasn't changed, although now there's a
`genPack` method that config plugins can implement. If a plugin doesn't
implement `genPack`, then the map<string, string> format cannot be used.
The default config plugin, the filesystem plugin, now implements
`genPack`, so existing query packs code will continue to work as it
always has.
Now many other config plugins can implement custom pack handling for
what makes sense in their context. `genPacks` is not a pure virtual, so
it doesn't have to be implemented in your plugin if you don't want to
use it. Also, more importantly, all config plugins can use the standard
inline pack format if they want to use query packs. Which is awesome.
For more information, refer to #1390, the documentation and the doxygen
comments included with this pull requests, as well as the following
example config which is now supported, regardless of what config plugin
you're using:
```json
{
"options": {
"enable_monitor": "true"
},
"packs": {
"core_os_monitoring": {
"version": "1.4.5",
"discovery": [
"select pid from processes where name like '%osqueryd%';"
],
"queries": {
"kernel_modules": {
"query": "SELECT name, size FROM kernel_modules;",
"interval": 600
},
"system_controls": {
"query": "SELECT * FROM system_controls;",
"interval": 600,
"snapshot": true,
},
"usb_devices": {
"query": "SELECT * FROM usb_devices;",
"interval": 600
}
}
},
"osquery_internal_info": {
"version": "1.4.5",
"discovery": [
"select pid from processes where name like '%osqueryd%';"
],
"queries": {
"info": {
"query": "select i.*, p.resident_size, p.user_time, p.system_time, time.minutes as counter from osquery_info i, processes p, time where p.pid = i.pid;",
"interval": 60,
"snapshot": true
},
"registry": {
"query": "SELECT * FROM osquery_registry;",
"interval": 600,
"snapshot": true
},
"schedule": {
"query": "select name, interval, executions, output_size, wall_time, (user_time/executions) as avg_user_time, (system_time/executions) as avg_system_time, average_memory from osquery_schedule;",
"interval": 60,
"snapshot": true
}
}
}
}
}
```
The `osquery_packs` table was modified to remove the superfluous
columns which could already have been found in `osquery_schedule`. Two
more columns were added in their place, representing stats about pack's
discovery query execution history.
Notably, the internal API for the `osquery::Config` class has changed
rather dramatically as apart of the refactoring. We think this is an
improvement. While strictly adhering to the osquery config plugin
interface will have avoided any compatibility errors, advanced users may
notice compilation errors if they access config data directly. All
internal users of the config have obviously been updated. Yet another
reason to merge your code into mainline; we update it for you when we
refactor!
Previously, `ConfigParserPlugin`s could only maintain an internal derived object called `data_`.
Then parts of the code that knew to use the plugin's data would call `getParsedData` and provide the name of the plugin.
Parser plugins can now request a mutable version of the `ConfigData` using `::mutableConfigData`.
This requires a lock on the `ConfigDataInstance` and must be provided to their mutable accessor.
Acess to a mutable config enables parsers to make modifications to internal config structures like options and the query schedule.
This was causing a crash when executing a query using the yara table
from the command line, because YARA was never initialized properly, so
the thread index was whatever was left on the stack. Eventually YARA
would attempt to set a rule that matches using this thread index and
would explode in flames.
Fix it by moving the initialization to a place that is always called.
1. Rename yara_matches to yara_events.
2. Add support for Config::getParser().
- This returns a ConfigPluginRef, which is the ConfigParser for the
given key.
- Being able to get the parser is useful because the
YARAConfigParserPlugin uses it to store the compiled rules as an
attribute.
3. Finish rename and use ConfigParserPlugin.
- Finish the table rename to yara_events.
- Use the new ConfigParserPlugin interface to parse the YARA
configuration. The file_paths and signatures are stored in the
ConfigParserPlugin named "yara" under the key "yara". The rules are
compiled and stored as a private attribute of the same
ConfigParserPlugin object.
Here is an example config using this new structure:
{
// Description of the YARA feature.
"yara": {
"signatures": {
// Each key is an arbitrary group name to give the signatures listed
"sig_group_1": [ "/Users/wxs/foo.sig", "/Users/wxs//bar.sig" ],
"sig_group_2": [ "/Users/wxs/baz.sig" ]
},
"file_paths": {
// Each key is a key from file_paths
// The value is a list of signature groups to run when an event fires
// These will be watched for and scanned when the event framework
// fire off an event to yara_events table
"system_binaries": [ "sig_group_1" ],
"tmp": [ "sig_group_1", "sig_group_2" ]
}
},
// Paths to watch for filesystem events
"file_paths": {
"system_binaries": [ "/usr/bin/%", "/usr/sbin/%" ],
"tmp": [ "/Users/wxs/tmp/%%" ]
}
}
- Currently the signature file must be an absolute path.
3. Move common YARA code to yara_utils.
- In preparation for the yara table (different from yara_events) I'm
moving the common YARA code into a separate place which is shared
between the two tables.
4. Add yara table.
- This allows you to do things like:
```sql
select * from yara where path="/bin/ls" and sigfile="/tmp/foo.sig";
select * from yara where path="/bin/ls" and sig_group="sig_group_1";
```
- The latter will use the signature grouping from the config.
5. Check for keys not existing.
Currently only for OS X, will port to others soon.
Also need to add tests.
Remove old comment and add loading message.
Implement YARA table for Linux.
Use mask properly.
Use the various masks to specify the kinds of events we are interested
in. This removes the need to do the dirty "DELETED" check when the event
fires.
Make getYARAFiles return a const map.
Switch to LOG(WARNING) and emit error number.
Add vim .swp files to .gitignore.
Add yara_utils.(c|h).
Start to condense common code between the Linux and Darwin YARA tables
into a yara_utils.h. Right now it includes a function to compile rules
and store the results back in the map, indexed by category. It also has
the callback used by YARA when a rule is processed. I can not move much
more than that for the row creation code because the structures used in
the event callback are slightly different.
Include a better error message.
The errors are still printed by the compiler callback, but this will
allow my future work to return a Status from the event initialization to
print a useful message in summary.
Make Subscriber init() return Status.
Each EventSubscriber::init() now returns a Status. If the init() fails
for any reason the EventSubscriber is still stored but the failure is
tracked.
EventSubscribers now have a state member, which represents the current
state of the subscriber. The current supported states are:
uninitialized, running, paused, failed. Currently the only meaningful
ones are running and failed, but I put paused in there as a
forward-looking feature.
Subscriptions now have a subscriber_name member. This is used in
EventPublisherPlugin::fire() as a lookup to get the EventSubscriber and
check the state. If the EventSubscriber is not running the event will
not fire.
Only the EventSubscribers on OS X are using this. I'll do the Linux
implementation next.
Chase the init() changes to Linux.
This brings the Linux YARA table in line with the OS X one.
Require a EventSubscriberID when creating a subscription.
Now that Subscriptions are "tied" to EventSubscribers you must create a
Subscription with the name of the Subscriber it is for. This is because
when the event fires the list of Subscriptions is walked and the name is
used to lookup the EventSubscriber and make sure it is in the running
state.
Fix various tests.
Some tests would fire an event with only a Subscription, which is no
longer a valid thing to do. For these tests an EventSubscription is
created and registered in the EventFactory.
When Subscriptions are created pass the name of the EventSubscriber to
them. In some cases where no event is ever fired it is fine to pass a
bogus name.
Fix inotify tests.
Move a test down so the class is defined and make sure to create an
EventSubscriber and use it properly.
Add support for yara to provision.sh.
Right now this grabs yara 3.3.0 and applies the patch to fix min() and max(),
which is commit fc4696c8b725be1ac099d340359c8d550d116041 in the yara repo.
This has been tested under Ubuntu 14.04 only.
Remove NOMINMAX.
This is no longer necessary after the patch was backported to 3.3.0.
Revert "Add support for yara to provision.sh."
This reverts commit a8bd371498c0979f070adeff23d05571882ac3f1.
Use vendored YARA code in third-party.
This switches to using the YARA code contained in third-party, including
the patch to fix min/max macros.
Fix mismerge.
Remove unused function after merge.
Well, soon to be unused as soon as I fix up the Linux YARA table. ;)
Chase config changes.
Make the Linux YARA table use ConfigDataInstance along with files() and
yaraFiles().