Merge pull request #1178 from theopolis/move_specs

Move specs to a top-level path, add query examples
Teddy Reed 2015-06-03 13:40:32 -07:00
commit be0803adb0
111 changed files with 576 additions and 346 deletions

View File

@ -191,7 +191,7 @@ macro(GET_GENERATION_DEPS BASE_PATH)
file(GLOB TABLE_FILES_TEMPLATES "${BASE_PATH}/osquery/tables/templates/*.in")
set(GENERATION_DEPENDENCIES
"${BASE_PATH}/tools/codegen/*.py"
"${BASE_PATH}/osquery/tables/specs/blacklist"
"${BASE_PATH}/specs/blacklist"
)
list(APPEND GENERATION_DEPENDENCIES ${TABLE_FILES_TEMPLATES})
endmacro()
@ -271,7 +271,7 @@ macro(AMALGAMATE BASE_PATH NAME OUTPUT)
add_custom_command(
OUTPUT "${CMAKE_BINARY_DIR}/generated/${NAME}_amalgamation.cpp"
COMMAND ${PYTHON_EXECUTABLE} "${BASE_PATH}/tools/codegen/amalgamate.py"
"${BASE_PATH}/osquery/tables/" "${CMAKE_BINARY_DIR}/generated" "${NAME}"
"${BASE_PATH}/tools/codegen/" "${CMAKE_BINARY_DIR}/generated" "${NAME}"
DEPENDS ${GENERATED_TARGETS} ${GENERATION_DEPENDENCIES}
WORKING_DIRECTORY "${CMAKE_SOURCE_DIR}"
)

View File

@ -13,10 +13,7 @@ set(CPP-NETLIB_LINK_DIR "${CPP-NETLIB_BUILD_DIR}/libs/network/src")
set(CPP-NETLIB_LIBRARY
"${CPP-NETLIB_LINK_DIR}/libcppnetlib-uri.a"
"${CPP-NETLIB_LINK_DIR}/libcppnetlib-client-connections.a"
"${CPP-NETLIB_LINK_DIR}/libcppnetlib-server-parsers.a"
)
#LOG("Found library dependency ${CPP-NETLIB_LINK_DIR}/libcppnetlib-uri.a")
LOG_LIBRARY(cpp-netlib "${CPP-NETLIB_LINK_DIR}/libcppnetlib-client-connections.a")
LOG_LIBRARY(cpp-netlib "${CPP-NETLIB_LINK_DIR}/libcppnetlib-server-parsers.a")
LOG_LIBRARY(cpp-netlib "${CPP-NETLIB_LINK_DIR}/libcppnetlib-uri.a")
LOG_LIBRARY(cpp-netlib "${CPP-NETLIB_LINK_DIR}/libcppnetlib-client-connections.a")

View File

@ -79,7 +79,7 @@ The build will run each of the support operating system platform/versions and in
Performance-impacting virtual tables are most likely the result of missing features/tooling in osquery. Because of their dependencies on core optimizations there's no harm in including the table generation code in master as long as the table is blacklisted when a non-developer builds the tool suite.
If you are developing latent tables that would be blacklisted please make sure you are relying on a feature with a clear issue and traction. Then add your table name (as it appears in the `.table` spec) to `osquery/tables/specs/blacklist` and adopt:
If you are developing latent tables that would be blacklisted please make sure you are relying on a feature with a clear issue and traction. Then add your table name (as it appears in the `.table` spec) to [`specs/blacklist`](https://github.com/facebook/osquery/blob/master/specs/blacklist) and adopt:
```
$ DISABLE_BLACKLIST=1 make

View File

@ -1,152 +1,153 @@
SQL tables are used to represent abstract operating system concepts, such as running processes.
A table can be used in conjunction with other tables via operations like sub-queries and joins. This allows for a rich data exploration experience. While osquery ships with a default set of tables, osquery provides an API that allows you to create new tables.
You can explore current tables: [https://osquery.io/tables](https://osquery.io/tables). Tables that are up for grabs in terms of development can be found on Github issues using the "virtual tables" + "[up for grabs tag](https://github.com/facebook/osquery/issues?q=is%3Aopen+is%3Aissue+label%3A%22virtual+tables%22)".
## New Table Walkthrough
Let's walk through an exercise where we build a 'time' table. The table will have one row, and that row will have three columns: hour, minutes, and seconds.
Column values (a single row) will be dynamically computed at query time.
**Table specifications**
Under the hood, osquery uses libraries from SQLite core to create "virtual tables". The default API for creating virtual tables is relatively complex. osquery has abstracted this complexity away, allowing you to write a simple table declaration.
To make table-creation simple osquery uses a *table spec* file.
The current specs are organized by operating system in the [osquery/tables/specs](https://github.com/facebook/osquery/tree/master/osquery/tables/specs) source folder.
For our time exercise, a spec file would look like the following:
```python
# Use the table_name function to define a virtual table name.
table_name("time")
# Provide a short "one line" description.
description("Returns the current hour, minutes, and seconds.")
# Define your schema, which accepts a list of Column instances at minimum.
# You may also describe foreign keys and "action" columns (more later).
schema([
# Declare the name, type, and documentation description for each column.
# The supported types are INTEGER, BIGINT, TEXT, DATE, and DATETIME.
Column("hour", INTEGER, "The current hour"),
Column("minutes", INTEGER, "The current minutes past the hour"),
Column("seconds", INTEGER, "The current seconds past the minute"),
])
# Name of the C++ function that implements the table: "gen{TableName}"
implementation("genTime")
```
You can leave the comments out in your production spec. Shoot for simplicity, do NOT go "hard in the paint" and do things like inheritance for Column objects, loops in your table spec, etc.
You might wonder "this syntax looks similar to Python?". Well, it is! The build process actually parses the spec files as Python code and meta-programs necessary C/C++ implementation files.
**Where do I put the spec?**
You may be wondering how osquery handles cross-platform support while still allowing operating-system specific tables. The osquery build process takes care of this by only generating the relevant code based on a directory structure convention.
- Cross-platform: [osquery/tables/specs/](https://github.com/facebook/osquery/tree/master/osquery/tables/specs/)
- Mac OS X: [osquery/tables/specs/darwin/](https://github.com/facebook/osquery/tree/master/osquery/tables/specs/darwin)
- General Linux: [osquery/tables/specs/linux/](https://github.com/facebook/osquery/tree/master/osquery/tables/specs/linux)
- Ubuntu: [osquery/tables/specs/ubuntu/](https://github.com/facebook/osquery/tree/master/osquery/tables/specs/ubuntu)
- CentOS: [osquery/tables/specs/centos/](https://github.com/facebook/osquery/tree/master/osquery/tables/specs/centos)
Note: the CMake build provides custom defines for each platform and platform version.
**Creating your implementation**
As indicated in the spec file, our implementation will be in a function called `genTime`. Since this is a very general table and should compile on all supported operating systems we can place it in *osquery/tables/utility/time.cpp*. The directory *osquery/tables/* contains the set of *specs* and implementation categories. Place implementations in the corresponding category using your best judgement.
Here is that code for *osquery/tables/utility/time.cpp*:
```cpp
// Copyright 2004-present Facebook. All Rights Reserved.
#include <ctime>
#include <osquery/tables.h>
namespace osquery {
namespace tables {
QueryData genTime(QueryContext &context) {
Row r;
QueryData results;
time_t _time = time(0);
struct tm* now = localtime(&_time);
r["hour"] = INTEGER(now->tm_hour);
r["minutes"] = INTEGER(now->tm_min);
r["seconds"] = INTEGER(now->tm_sec);
results.push_back(r);
return results;
}
}
}
```
Key points to remember:
- Your implementation function should be in the `osquery::tables` namespace.
- Your implementation function should accept one `QueryContext&` parameter and return an instance of `QueryData`.
## Using where clauses
The `QueryContext` data type is osquery's abstraction of the underlying SQL engine's query parsing. It is defined in [include/osquery/tables.h](https://github.com/facebook/osquery/blob/master/include/osquery/tables.h).
The most important use of the context is query predicate constraints (e.g., `WHERE col = 'value'`). Some tables MUST have a predicate constraint, others may optionally use the constraints to increase performance.
Examples:
`hash` requires a predicate, since the resultant rows are the hashes of the EQUALS constraint operators (`=`). The table implementation includes:
```cpp
auto paths = context.constraints["path"].getAll(EQUALS);
for (const auto& path_string : paths) {
boost::filesystem::path path = path_string;
[...]
}
```
`processes` optionally uses a predicate. A syscall to list process pids requires few resources. Enumerating "/proc" information and parsing environment/argument uses MANY resources. The table implementation includes:
```cpp
for (auto &pid : pidlist) {
if (!context.constraints["pid"].matches<int>(pid)) {
// Optimize by not searching when a pid is a constraint.
continue;
}
[...]
}
```
## SQL data types
Data types like `QueryData`, `Row`, `DiffResults`, etc. are osquery's built-in data result types. They're all defined in [include/osquery/database/results.h](https://github.com/facebook/osquery/blob/master/include/osquery/database/results.h).
`Row` is just a `typedef` for a `std::map<std::string, std::string>`. That's it. A row of data is just a mapping of strings that represent column names to strings that represent column values. Note that, currently, even if your SQL table type is an `int` and not a `std::string`, we need to cast the ints as strings to comply with the type definition of the `Row` object. They'll be casted back to `int`'s later. This is all handled transparently by osquery's supporting infrastructure as long as you use the macros like `TEXT`, `INTEGER`, `BIGINT`, etc when inserting columns into your row.
`QueryData` is just a `typedef` for a `std::vector<Row>`. Query data is just a list of rows. Simple enough.
To populate the data that will be returned to the user at runtime, your implementation function must generate the data that you'd like to display and populate a `QueryData` list with the appropriate `Row`s. Then, just return the `QueryData`.
In our case, we used system APIs to create a struct of type `tm` which has fields such as `tm_hour`, `tm_min` and `tm_sec` which represent the current time. We can then create our three entries in our `Row` variable: hour, minutes and seconds. Then we push that single row onto the `QueryData` variable and return it. Note that if we wanted our table to have many rows (a more common use-case), we would just push back more `Row` maps onto `results`.
## Building new tables
If you've created a new file, you'll need to make sure that CMake properly builds your code. Open [osquery/tables/CMakeLists.txt](https://github.com/facebook/osquery/blob/master/osquery/tables/CMakeLists.txt). Find the line that defines the library `osquery_tables` and add your file, "utility/time.cpp" to the sources which are compiled by that library.
If your table only works on OS X, find the target called `osquery_tables_darwin` and add your file to that list of sources instead. If your table only works on Linux, find the target called `osquery_tables_linux` and add your implementation file to that list of sources.
Return to the root of the repository and execute `make`. This will generate the appropriate code and link everything properly.
### Testing your table
If your code compiled properly, launch the interactive query console by executing `./build/[darwin|linux]/osquery/osqueryi` and try issuing your new table a command: `SELECT * FROM time;`.
### Getting your query ready for use in osqueryd
You don't have to do anything to make your query work in the osqueryd daemon. All osquery queries work in osqueryd. It's worth noting, however, that osqueryd is a long-running process. If your table leaks memory or uses a lot of systems resources, you will notice poor performance from osqueryd. For more information on ensuring a performant table, see [performance overview](../deployment/performance-safety.md).
When in doubt, use existing open source tables to guide your development.
SQL tables are used to represent abstract operating system concepts, such as running processes.
A table can be used in conjunction with other tables via operations like sub-queries and joins. This allows for a rich data exploration experience. While osquery ships with a default set of tables, osquery provides an API that allows you to create new tables.
You can explore current tables: [https://osquery.io/tables](https://osquery.io/tables). Tables that are up for grabs for development are listed in GitHub issues under the "virtual tables" + "[up for grabs tag](https://github.com/facebook/osquery/issues?q=is%3Aopen+is%3Aissue+label%3A%22virtual+tables%22)".
## New Table Walkthrough
Let's walk through an exercise where we build a 'time' table. The table will have one row, and that row will have three columns: hour, minutes, and seconds.
Column values (a single row) will be dynamically computed at query time.
**Table specifications**
Under the hood, osquery uses libraries from SQLite core to create "virtual tables". The default API for creating virtual tables is relatively complex. osquery has abstracted this complexity away, allowing you to write a simple table declaration.
To make table-creation simple osquery uses a *table spec* file.
The current specs are organized by operating system in the [specs](https://github.com/facebook/osquery/tree/master/specs) source folder.
For our time exercise, a spec file would look like the following:
```python
# This .table file is called a "spec" and is written in Python
# This syntax (several definitions) is defined in /tools/codegen/gentable.py.
table_name("time")
# Provide a short "one line" description, please use punctuation!
description("Returns the current hour, minutes, and seconds.")
# Define your schema, which accepts a list of Column instances at minimum.
# You may also describe foreign keys and "action" columns.
schema([
# Declare the name, type, and documentation description for each column.
# The supported types are INTEGER, BIGINT, TEXT, DATE, and DATETIME.
Column("hour", INTEGER, "The current hour"),
Column("minutes", INTEGER, "The current minutes past the hour"),
Column("seconds", INTEGER, "The current seconds past the minute"),
])
# Use the "@gen{TableName}" to communicate the C++ symbol name.
implementation("genTime")
```
You can leave the comments out in your production spec. Shoot for simplicity: do NOT go "hard in the paint" with things like inheritance for Column objects or loops in your table spec.
You might think "this syntax looks similar to Python". Well, it is! The build process actually parses the spec files as Python code and meta-programs the necessary C/C++ implementation files.
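To make the "spec is Python" idea concrete, here is a minimal sketch of how a spec could be evaluated. It is not the generator's actual code: `load_spec`, the dictionary layout, and the string type markers are assumptions made for illustration. Only the spec vocabulary (`table_name`, `description`, `schema`, `Column`, `implementation`) comes from the spec above.

```python
# Hypothetical sketch: evaluate a table spec as ordinary Python code.
# The real code generator may be organized quite differently.
from collections import namedtuple

Column = namedtuple("Column", ["name", "type", "description"])

# Column type markers; plain strings are enough for this sketch.
INTEGER, BIGINT, TEXT = "INTEGER", "BIGINT", "TEXT"


def load_spec(source):
    """Execute spec source and collect what it declares into a dict."""
    table = {}

    def table_name(name):
        table["name"] = name

    def description(text):
        table["description"] = text

    def schema(columns):
        table["columns"] = list(columns)

    def implementation(symbol):
        table["implementation"] = symbol

    # The spec only sees the names handed to it here (plus builtins).
    namespace = {
        "table_name": table_name,
        "description": description,
        "schema": schema,
        "implementation": implementation,
        "Column": Column,
        "INTEGER": INTEGER,
        "BIGINT": BIGINT,
        "TEXT": TEXT,
    }
    exec(source, namespace)
    return table


TIME_SPEC = '''
table_name("time")
description("Returns the current hour, minutes, and seconds.")
schema([
    Column("hour", INTEGER, "The current hour"),
    Column("minutes", INTEGER, "The current minutes past the hour"),
    Column("seconds", INTEGER, "The current seconds past the minute"),
])
implementation("genTime")
'''

spec = load_spec(TIME_SPEC)
print(spec["name"], [c.name for c in spec["columns"]], spec["implementation"])
```

Because the spec is executed rather than parsed by a custom grammar, anything Python allows would technically work, which is why the advice above is to keep specs declarative.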
**Where do I put the spec?**
You may be wondering how osquery handles cross-platform support while still allowing operating-system specific tables. The osquery build process takes care of this by only generating the relevant code based on a directory structure convention.
- Cross-platform: [specs/](https://github.com/facebook/osquery/tree/master/specs/)
- Mac OS X: [specs/darwin/](https://github.com/facebook/osquery/tree/master/specs/darwin)
- General Linux: [specs/linux/](https://github.com/facebook/osquery/tree/master/specs/linux)
- Ubuntu: [specs/ubuntu/](https://github.com/facebook/osquery/tree/master/specs/ubuntu)
- CentOS: [specs/centos/](https://github.com/facebook/osquery/tree/master/specs/centos)
Note: the CMake build provides custom defines for each platform and platform version.
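The directory convention can be pictured as a lookup from build platform to the spec folders that apply. The sketch below is an assumption for illustration only: the `spec_dirs` helper is hypothetical, and so is the guess that Ubuntu and CentOS builds also pick up the general Linux specs. The build's own logic may combine directories differently.

```python
# Hypothetical mapping from a build platform to platform-specific spec folders.
# Assumption: distro builds also include the general Linux specs.
PLATFORM_DIRS = {
    "darwin": ["darwin"],
    "linux": ["linux"],
    "ubuntu": ["linux", "ubuntu"],
    "centos": ["linux", "centos"],
}


def spec_dirs(platform, root="specs"):
    """Return the spec directories for a platform, cross-platform first."""
    if platform not in PLATFORM_DIRS:
        raise ValueError("unknown platform: " + platform)
    return [root] + [root + "/" + name for name in PLATFORM_DIRS[platform]]


print(spec_dirs("darwin"))
print(spec_dirs("ubuntu"))
```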
**Creating your implementation**
As indicated in the spec file, our implementation will be in a function called `genTime`. Since this is a very general table and should compile on all supported operating systems we can place it in *osquery/tables/utility/time.cpp*. The directory *osquery/tables/* contains the implementation categories. Place implementations in the corresponding category using your best judgement.
Here is the code for *osquery/tables/utility/time.cpp*:
```cpp
// Copyright 2004-present Facebook. All Rights Reserved.
#include <ctime>
#include <osquery/tables.h>
namespace osquery {
namespace tables {
QueryData genTime(QueryContext &context) {
Row r;
QueryData results;
time_t _time = time(0);
struct tm* now = localtime(&_time);
r["hour"] = INTEGER(now->tm_hour);
r["minutes"] = INTEGER(now->tm_min);
r["seconds"] = INTEGER(now->tm_sec);
results.push_back(r);
return results;
}
}
}
```
Key points to remember:
- Your implementation function should be in the `osquery::tables` namespace.
- Your implementation function should accept one `QueryContext&` parameter and return an instance of `QueryData`.
## Using where clauses
The `QueryContext` data type is osquery's abstraction of the underlying SQL engine's query parsing. It is defined in [include/osquery/tables.h](https://github.com/facebook/osquery/blob/master/include/osquery/tables.h).
The most important use of the context is query predicate constraints (e.g., `WHERE col = 'value'`). Some tables MUST have a predicate constraint, others may optionally use the constraints to increase performance.
Examples:
`hash` requires a predicate, since the resulting rows are the hashes of the paths supplied through EQUALS (`=`) constraints. The table implementation includes:
```cpp
auto paths = context.constraints["path"].getAll(EQUALS);
for (const auto& path_string : paths) {
boost::filesystem::path path = path_string;
[...]
}
```
`processes` optionally uses a predicate. A syscall to list process pids requires few resources. Enumerating "/proc" information and parsing environment/argument uses MANY resources. The table implementation includes:
```cpp
for (auto &pid : pidlist) {
if (!context.constraints["pid"].matches<int>(pid)) {
// Optimize by not searching when a pid is a constraint.
continue;
}
[...]
}
```
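The two patterns above (a required predicate and an optional one) can be modeled in a few lines. This is a simplified sketch, not osquery's `QueryContext`: the `ConstraintList` class, its method names, and the two generator functions are hypothetical, and only the EQUALS operator is handled.

```python
EQUALS = "="


class ConstraintList:
    """Hypothetical stand-in for the constraints attached to one column."""

    def __init__(self):
        self._constraints = []  # (operator, value) pairs

    def add(self, op, value):
        self._constraints.append((op, value))

    def get_all(self, op):
        """Return every value constrained with the given operator."""
        return [value for (o, value) in self._constraints if o == op]

    def matches(self, value):
        """True if the column is unconstrained or the value is an EQUALS match."""
        wanted = self.get_all(EQUALS)
        return not wanted or value in wanted


def gen_hash_rows(path_constraints):
    """Required predicate: with no EQUALS paths there is nothing to generate."""
    return [{"path": path} for path in path_constraints.get_all(EQUALS)]


def gen_process_rows(pids, pid_constraints):
    """Optional predicate: skip the expensive work for pids that cannot match."""
    rows = []
    for pid in pids:
        if not pid_constraints.matches(pid):
            continue
        rows.append({"pid": str(pid)})
    return rows


unconstrained = ConstraintList()
only_two = ConstraintList()
only_two.add(EQUALS, 2)

print(gen_process_rows([1, 2, 3], unconstrained))
print(gen_process_rows([1, 2, 3], only_two))
print(gen_hash_rows(ConstraintList()))
```

The design point is that an unconstrained column matches everything, so a table with an optional predicate still returns all rows for a plain `SELECT *`, while a table with a required predicate returns nothing.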
## SQL data types
Data types like `QueryData`, `Row`, `DiffResults`, etc. are osquery's built-in data result types. They're all defined in [include/osquery/database/results.h](https://github.com/facebook/osquery/blob/master/include/osquery/database/results.h).
`Row` is just a `typedef` for a `std::map<std::string, std::string>`. That's it. A row of data is just a mapping of strings that represent column names to strings that represent column values. Note that, currently, even if your SQL column type is an `int` and not a `std::string`, we need to cast the ints as strings to comply with the type definition of the `Row` object. They'll be cast back to `int`s later. This is all handled transparently by osquery's supporting infrastructure as long as you use the macros like `TEXT`, `INTEGER`, `BIGINT`, etc. when inserting columns into your row.
`QueryData` is just a `typedef` for a `std::vector<Row>`. Query data is just a list of rows. Simple enough.
To populate the data that will be returned to the user at runtime, your implementation function must generate the data that you'd like to display and populate a `QueryData` list with the appropriate `Row`s. Then, just return the `QueryData`.
In our case, we used system APIs to create a struct of type `tm` which has fields such as `tm_hour`, `tm_min` and `tm_sec` which represent the current time. We can then create our three entries in our `Row` variable: hour, minutes and seconds. Then we push that single row onto the `QueryData` variable and return it. Note that if we wanted our table to have many rows (a more common use-case), we would just push back more `Row` maps onto `results`.
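The shape of the data may be easier to see outside of C++. The following sketch mirrors `genTime` using a plain dict for a `Row` and a plain list for `QueryData`. The Python `INTEGER` helper and `gen_time` function are illustrative stand-ins, not part of osquery.

```python
import time


def INTEGER(value):
    """Stand-in for the C++ macro: store an integer as its string form."""
    return str(int(value))


def gen_time(now=None):
    """Build a one-row result: a list (QueryData) holding one dict (Row)."""
    if now is None:
        now = time.localtime()
    row = {
        "hour": INTEGER(now.tm_hour),
        "minutes": INTEGER(now.tm_min),
        "seconds": INTEGER(now.tm_sec),
    }
    results = []
    results.append(row)  # a table with many rows would append more dicts
    return results


# A fixed timestamp keeps the example deterministic.
fixed = time.struct_time((2015, 6, 3, 13, 40, 32, 2, 154, 0))
print(gen_time(fixed))
```

Every value in the row is a string, including the numeric ones, which is the same constraint the `Row` typedef imposes.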
## Building new tables
If you've created a new file, you'll need to make sure that CMake properly builds your code. Open [osquery/tables/CMakeLists.txt](https://github.com/facebook/osquery/blob/master/osquery/tables/CMakeLists.txt). Find the line that defines the library `osquery_tables` and add your file, "utility/time.cpp" to the sources which are compiled by that library.
If your table only works on OS X, find the target called `osquery_tables_darwin` and add your file to that list of sources instead. If your table only works on Linux, find the target called `osquery_tables_linux` and add your implementation file to that list of sources.
Return to the root of the repository and execute `make`. This will generate the appropriate code and link everything properly.
### Testing your table
If your code compiled properly, launch the interactive query console by executing `./build/[darwin|linux]/osquery/osqueryi` and try querying your new table: `SELECT * FROM time;`.
### Getting your query ready for use in osqueryd
You don't have to do anything to make your query work in the osqueryd daemon. All osquery queries work in osqueryd. It's worth noting, however, that osqueryd is a long-running process. If your table leaks memory or uses a lot of system resources, you will notice poor performance from osqueryd. For more information on ensuring a performant table, see [performance overview](../deployment/performance-safety.md).
When in doubt, use existing open source tables to guide your development.

View File

@ -90,7 +90,7 @@ if(NOT ENV{SKIP_TABLES})
endif()
# Amalgamate the utility tables needed to compile.
GENERATE_UTILITIES("${CMAKE_SOURCE_DIR}/osquery/tables")
GENERATE_UTILITIES("${CMAKE_SOURCE_DIR}")
AMALGAMATE("${CMAKE_SOURCE_DIR}" "utils" AMALGAMATION_UTILS)
ADD_OSQUERY_LIBRARY_CORE(osquery_amalgamation ${AMALGAMATION_UTILS})
@ -124,7 +124,7 @@ add_dependencies(devel libosquery)
if(NOT OSQUERY_BUILD_SDK_ONLY)
# Generate the osquery additional tables (the non-util).
GENERATE_TABLES("${CMAKE_SOURCE_DIR}/osquery/tables")
GENERATE_TABLES("${CMAKE_SOURCE_DIR}")
AMALGAMATE("${CMAKE_SOURCE_DIR}" "additional" AMALGAMATION)
ADD_OSQUERY_LIBRARY_ADDITIONAL(osquery_additional_amalgamation ${AMALGAMATION})

View File

@ -14,7 +14,7 @@ ADD_OSQUERY_LIBRARY(FALSE osquery_config_plugins
${OSQUERY_CONFIG_PLUGINS}
)
ADD_OSQUERY_LIBRARY(FALSE osquery_config_parsers
ADD_OSQUERY_LIBRARY(TRUE osquery_config_parsers
${OSQUERY_CONFIG_PARSERS}
)

View File

@ -135,5 +135,5 @@ Status QueryPackConfigParserPlugin::update(const ConfigTreeMap& config) {
}
/// Call the simple Query Packs ConfigParserPlugin "packs".
REGISTER(QueryPackConfigParserPlugin, "config_parser", "packs");
REGISTER_INTERNAL(QueryPackConfigParserPlugin, "config_parser", "packs");
}

View File

@ -62,12 +62,12 @@ else()
ADD_OSQUERY_LINK_ADDITIONAL("ip4tc")
endif()
file(GLOB OSQUERY_CROSS_TABLES "[!ue]*/*.cpp")
file(GLOB OSQUERY_CROSS_TABLES "[!ueo]*/*.cpp")
ADD_OSQUERY_LIBRARY_ADDITIONAL(osquery_tables
${OSQUERY_CROSS_TABLES}
)
file(GLOB OSQUERY_CROSS_TABLES_TESTS "[!u]*/tests/*.cpp")
file(GLOB OSQUERY_CROSS_TABLES_TESTS "[!uo]*/tests/*.cpp")
ADD_OSQUERY_TABLE_TEST(${OSQUERY_CROSS_TABLES_TESTS})
file(GLOB OSQUERY_UTILITY_TABLES "utility/*.cpp")
@ -76,11 +76,11 @@ ADD_OSQUERY_LIBRARY_CORE(osquery_tables_utility
)
if(NOT FREEBSD)
file(GLOB OSQUERY_UTILS "utils/*.cpp")
file(GLOB OSQUERY_UTILS "other/*.cpp")
ADD_OSQUERY_LIBRARY_ADDITIONAL(osquery_utils
${OSQUERY_UTILS}
)
file(GLOB OSQUERY_UTILS_TESTS "utils/tests/*.cpp")
file(GLOB OSQUERY_UTILS_TESTS "other/tests/*.cpp")
ADD_OSQUERY_TEST_ADDITIONAL(${OSQUERY_UTILS_TESTS})
endif()

View File

@ -1,89 +0,0 @@
/*
* Copyright (c) 2014, Facebook, Inc.
* All rights reserved.
*
* This source code is licensed under the BSD-style license found in the
* LICENSE file in the root directory of this source tree. An additional grant
* of patent rights can be found in the PATENTS file in the same directory.
*
*/
#include <osquery/config.h>
#include <osquery/core.h>
#include <osquery/registry.h>
#include <osquery/tables.h>
#include <osquery/filesystem.h>
#include "osquery/config/parsers/query_packs.h"
namespace osquery {
namespace tables {
typedef pt::ptree::value_type tree_node;
void genQueryPack(const tree_node& pack, QueryData& results) {
Row r;
// Packs are stored by name and contain configuration data.
r["name"] = pack.first;
r["path"] = pack.second.get("path", "");
// There are optional restrictions on the set of queries applied pack-wide.
auto pack_wide_version = pack.second.get("version", "");
auto pack_wide_platform = pack.second.get("platform", "");
// Iterate through each query in the pack.
for (auto const& query : pack.second.get_child("queries")) {
r["query_name"] = query.first;
r["query"] = query.second.get("query", "");
r["interval"] = INTEGER(query.second.get("interval", 0));
r["description"] = query.second.get("description", "");
r["value"] = query.second.get("value", "");
// Set the version requirement based on the query-specific or pack-wide.
if (query.second.count("version") > 0) {
r["version"] = query.second.get("version", "");
} else {
r["version"] = pack_wide_platform;
}
// Set the platform requirement based on the query-specific or pack-wide.
if (query.second.count("platform") > 0) {
r["platform"] = query.second.get("platform", "");
} else {
r["platform"] = pack_wide_platform;
}
// Adding a prefix to the pack queries to differentiate packs from schedule.
r["scheduled_name"] = "pack_" + r.at("name") + "_" + r.at("query_name");
if (Config::checkScheduledQueryName(r.at("scheduled_name"))) {
r["scheduled"] = INTEGER(1);
} else {
r["scheduled"] = INTEGER(0);
}
results.push_back(r);
}
}
QueryData genOsqueryPacks(QueryContext& context) {
QueryData results;
// Get a lock on the config instance.
ConfigDataInstance config;
// Get the loaded data tree from global JSON configuration.
const auto& packs_parsed_data = config.getParsedData("packs");
// Iterate through all the packs to get each configuration and set of queries.
for (auto const& pack : packs_parsed_data) {
// Make sure the pack data contains queries.
if (pack.second.count("queries") == 0) {
continue;
}
genQueryPack(pack, results);
}
return results;
}
}
}

View File

@ -12,7 +12,7 @@
#include <osquery/filesystem.h>
#include "osquery/tables/utils/yara_utils.h"
#include "osquery/tables/other/yara_utils.h"
namespace osquery {

View File

@ -15,7 +15,7 @@
#include <osquery/tables.h>
#include <osquery/status.h>
#include "osquery/tables/utils/yara_utils.h"
#include "osquery/tables/other/yara_utils.h"
#ifdef CONCAT
#undef CONCAT

View File

@ -14,7 +14,7 @@
#include <osquery/config.h>
#include <osquery/logger.h>
#include "osquery/tables/utils/yara_utils.h"
#include "osquery/tables/other/yara_utils.h"
namespace osquery {

View File

@ -99,6 +99,10 @@ const char* BOM::getPointer(int index, size_t* length) const {
}
const BOMVar* BOM::getVariable(size_t* offset) const {
if (offset == nullptr) {
return nullptr;
}
if (size_ < vars_offset_ + *offset + sizeof(BOMVar)) {
// Offset overflows the variable list.
*offset = 0;
@ -106,6 +110,11 @@ const BOMVar* BOM::getVariable(size_t* offset) const {
}
const BOMVar* var = (BOMVar*)((char*)Vars->list + *offset);
if (var == nullptr) {
*offset = 0;
return nullptr;
}
*offset += sizeof(BOMVar) + var->length;
return var;
}
@ -206,7 +215,7 @@ void genPackageBOM(const std::string& path, QueryData& results) {
for (unsigned i = 0; i < ntohl(bom.Vars->count); i++) {
// Iterate through each BOM variable, a packed set of structures.
auto var = bom.getVariable(&var_offset);
if (var == nullptr || var->name == nullptr) {
if (var == nullptr || (char*)var->name == nullptr) {
break;
}
@ -225,7 +234,7 @@ void genPackageBOM(const std::string& path, QueryData& results) {
const BOMTree* tree = (const BOMTree*)var_data;
auto paths = bom.getPaths(tree->child);
while (paths != nullptr && paths->isLeaf == htons(0)) {
if (paths->indices == nullptr) {
if ((BOMPathIndices*)paths->indices == nullptr) {
break;
}
paths = bom.getPaths(paths->indices[0].index0);

View File

@ -40,7 +40,7 @@ struct BOMBlockTable {
// See header for number of non-null blocks
uint32_t count;
// First entry must always be a null entry
BOMPointer* blockPointers;
BOMPointer blockPointers[];
} __attribute__((packed));
struct BOMTree {
@ -60,12 +60,12 @@ struct BOMTree {
struct BOMVar {
uint32_t index;
uint8_t length;
char* name;
char name[];
} __attribute__((packed));
struct BOMVars {
uint32_t count;
BOMVar* list;
BOMVar list[];
} __attribute__((packed));
struct BOMPathIndices {
@ -80,7 +80,7 @@ struct BOMPaths {
uint16_t count;
uint32_t forward;
uint32_t backward;
BOMPathIndices* indices;
BOMPathIndices indices[];
} __attribute__((packed));
struct BOMPathInfo2 {
@ -98,7 +98,7 @@ struct BOMPathInfo2 {
uint32_t devType;
};
uint32_t linkNameLength;
char* linkName;
char linkName[];
} __attribute__((packed));
struct BOMPathInfo1 {
@ -110,7 +110,7 @@ struct BOMPathInfo1 {
struct BOMFile {
// Parent BOMPathInfo1->id
uint32_t parent;
char* name;
char name[];
} __attribute__((packed));
class BOM {

View File

@ -18,9 +18,78 @@
#include <osquery/tables.h>
#include <osquery/filesystem.h>
#include "osquery/config/parsers/query_packs.h"
namespace osquery {
namespace tables {
typedef pt::ptree::value_type tree_node;
void genQueryPack(const tree_node& pack, QueryData& results) {
Row r;
// Packs are stored by name and contain configuration data.
r["name"] = pack.first;
r["path"] = pack.second.get("path", "");
// There are optional restrictions on the set of queries applied pack-wide.
auto pack_wide_version = pack.second.get("version", "");
auto pack_wide_platform = pack.second.get("platform", "");
// Iterate through each query in the pack.
for (auto const& query : pack.second.get_child("queries")) {
r["query_name"] = query.first;
r["query"] = query.second.get("query", "");
r["interval"] = INTEGER(query.second.get("interval", 0));
r["description"] = query.second.get("description", "");
r["value"] = query.second.get("value", "");
// Set the version requirement based on the query-specific or pack-wide.
if (query.second.count("version") > 0) {
r["version"] = query.second.get("version", "");
} else {
r["version"] = pack_wide_version;
}
// Set the platform requirement based on the query-specific or pack-wide.
if (query.second.count("platform") > 0) {
r["platform"] = query.second.get("platform", "");
} else {
r["platform"] = pack_wide_platform;
}
// Adding a prefix to the pack queries to differentiate packs from schedule.
r["scheduled_name"] = "pack_" + r.at("name") + "_" + r.at("query_name");
if (Config::checkScheduledQueryName(r.at("scheduled_name"))) {
r["scheduled"] = INTEGER(1);
} else {
r["scheduled"] = INTEGER(0);
}
results.push_back(r);
}
}
QueryData genOsqueryPacks(QueryContext& context) {
QueryData results;
// Get a lock on the config instance.
ConfigDataInstance config;
// Get the loaded data tree from global JSON configuration.
const auto& packs_parsed_data = config.getParsedData("packs");
// Iterate through all the packs to get each configuration and set of queries.
for (auto const& pack : packs_parsed_data) {
// Make sure the pack data contains queries.
if (pack.second.count("queries") == 0) {
continue;
}
genQueryPack(pack, results);
}
return results;
}
void genFlag(const std::string& name,
const FlagInfo& flag,
QueryData& results) {

View File

@ -24,3 +24,4 @@ freebsd:yara_events
freebsd:yara
freebsd:system_controls
freebsd:smbios_tables
example

View File

@ -10,3 +10,6 @@ schema([
Column("path", TEXT, "Path of package bom", required=True),
])
implementation("packages@genPackageBOM")
examples([
"select * from package_bom where path = '/var/db/receipts/com.apple.pkg.MobileDevice.bom'"
])

View File

@ -8,6 +8,9 @@ schema([
Column("install_time", INTEGER, "Timestamp of install time"),
Column("installer_name", TEXT, "Name of installer process"),
Column("path", TEXT, "Path of receipt plist",
additional=True),
index=True, additional=True),
])
implementation("packages@genPackageReceipts")
examples([
"select * from package_bom where path = '/var/db/receipts/com.apple.pkg.MobileDevice.bom'"
])

View File

@ -1,11 +1,17 @@
table_name("preferences")
description("OS X defaults and managed preferences.")
schema([
Column("domain", TEXT, "Application ID usually in com.name.product format"),
Column("domain", TEXT, "Application ID usually in com.name.product format",
index=True),
Column("key", TEXT, "Preference top-level key"),
Column("subkey", TEXT, "Intermediate key path, includes lists/dicts"),
Column("value", TEXT, "String value of most CF types"),
Column("forced", INTEGER, "1 if the value is forced/managed, else 0"),
Column("path", TEXT, "(optional) read preferences from a plist"),
Column("path", TEXT, "(optional) read preferences from a plist",
additional=True),
])
implementation("system/darwin/preferences@genOSXPreferences")
examples([
"select * from preferences where domain = 'loginwindow'",
"select * from preferences where path = '/Library/Preferences/loginwindow.plist'"
])

View File

@ -5,4 +5,4 @@ schema([
Column("user_action", TEXT, "Action taken by user after prompted"),
Column("time", TEXT, "Quarantine alert time"),
])
implementation("xprotect@genXProtectReports")
implementation("xprotect@genXProtectReports")

View File

@ -8,4 +8,3 @@ schema([
Column("comment", TEXT, "Optional comment for a service."),
])
implementation("etc_services@genEtcServices")

60
specs/example.table Normal file
View File

@ -0,0 +1,60 @@
# This .table file is called a "spec" and is written in Python.
# This syntax (several definitions) is defined in /tools/codegen/gentable.py.
table_name("example")
# Provide a short "one line" description, please use punctuation!
description("This is an example table spec.")
# Define your schema, which accepts a list of Column instances at minimum.
# You may also describe foreign keys and "action" columns.
schema([
# Declare the name, type, and documentation description for each column.
# The supported types are INTEGER, BIGINT, TEXT, DATE, and DATETIME.
Column("name", TEXT, "Description for name column"),
Column("points", INTEGER, "This is a signed SQLite int column"),
Column("size", BIGINT, "This is a signed SQLite bigint column"),
# More complex tables include columns denoted as "required".
# A required column MUST be present in a query predicate (WHERE clause).
Column("action", TEXT, "Action performed in generation", required=True),
# Tables may optimize their selection using "index" columns.
# The optimization is undefined, but this is a hint to table users that
# JOINing on this column will improve performance.
Column("id", INTEGER, "An index of some sort", index=True),
# Some tables operate using default configurations or OS settings.
# OS X has default paths for .app executions, but .apps exist otherwise.
# Tables may generate additional or different data when using some columns.
# Set the "additional" argument if searching a non-default path.
Column("path", TEXT, "Path of example", additional=True),
# When paths are involved they are usually both additional and an index.
])
# Use the "@gen{TableName}" syntax to communicate the C++ symbol name.
# Event subscriber tables and other more-complex implementations may use
# class-static methods for generation; they use "@ClassName::genTable" syntax.
implementation("@genExample")
# Provide some example queries that stress table use.
# If using actions or indexing, it will be best to include those predicates.
examples([
"select * from example where id = 1",
"select name from example where action = 'new'",
# Examples may be used in documentation and in stress/fuzz testing.
# Including example JOINs on indexes is preferred.
"select e.* from example e, example_environments ee where e.id = ee.id"
])
# Attributes provide help to documentation/API generation tools.
# If an attribute is false, or no attributes apply, do not include 'attributes'.
attributes(
# Set event_subscriber if this table is generated using an EventSubscriber.
event_subscriber=False,
# Set utility if this table should be built into the osquery-SDK (core).
# Utility tables are mostly reserved for osquery meta-information.
utility=False,
# Set kernel_required if an osquery kernel extension/module/driver is needed.
kernel_required=False
)
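
Because the spec directives above are plain Python calls, a loader can evaluate a .table file by supplying those names in the namespace passed to `exec`. The following is a minimal sketch of that idea and is not the actual `tools/codegen/gentable.py` implementation; `load_spec`, `TableSpec`, and the way column options are stored are hypothetical names and choices made only for illustration.

```python
# Hypothetical mini-loader for a .table spec; NOT the real gentable.py.
from dataclasses import dataclass, field

# Column type tokens referenced by specs (modeled here as plain strings).
INTEGER, BIGINT, TEXT, DATE, DATETIME = (
    "INTEGER", "BIGINT", "TEXT", "DATE", "DATETIME")


@dataclass
class Column:
    name: str
    type: str
    description: str = ""
    # Keyword options such as required=True, index=True, additional=True.
    options: dict = field(default_factory=dict)


@dataclass
class TableSpec:
    name: str = ""
    description: str = ""
    columns: list = field(default_factory=list)
    implementation: str = ""
    examples: list = field(default_factory=list)
    attributes: dict = field(default_factory=dict)


def load_spec(source):
    """Evaluate spec source text and return the collected TableSpec."""
    spec = TableSpec()

    def column(name, type, description="", **options):
        return Column(name, type, description, options)

    # Each directive records its argument on the spec being built.
    namespace = {
        "INTEGER": INTEGER, "BIGINT": BIGINT, "TEXT": TEXT,
        "DATE": DATE, "DATETIME": DATETIME,
        "Column": column,
        "table_name": lambda name: setattr(spec, "name", name),
        "description": lambda text: setattr(spec, "description", text),
        "schema": lambda cols: spec.columns.extend(cols),
        "implementation": lambda impl: setattr(spec, "implementation", impl),
        "examples": lambda queries: spec.examples.extend(queries),
        "attributes": lambda **attrs: spec.attributes.update(attrs),
    }
    exec(source, namespace)
    return spec


EXAMPLE_SOURCE = '''
table_name("example")
description("This is an example table spec.")
schema([
    Column("name", TEXT, "Description for name column"),
    Column("action", TEXT, "Action performed in generation", required=True),
    Column("id", INTEGER, "An index of some sort", index=True),
    Column("path", TEXT, "Path of example", additional=True),
])
implementation("@genExample")
examples(["select * from example where id = 1"])
attributes(utility=False)
'''

spec = load_spec(EXAMPLE_SOURCE)
required = [c.name for c in spec.columns if c.options.get("required")]
indexed = [c.name for c in spec.columns if c.options.get("index")]
```

With the options collected per column, a documentation or test generator could, for instance, check that every example query includes a predicate on each column listed in `required`.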

View File

@ -6,3 +6,8 @@ schema([
Column("groupname", TEXT, "Canonical local group name"),
])
implementation("groups@genGroups")
examples([
"select * from groups where gid = 0",
# Group/user_groups is not JOIN optimized
#"select g.groupname, ug.uid from groups g, user_groups ug where g.gid = ug.gid",
])

View File

@ -1,10 +1,14 @@
table_name("process_envs")
description("A key/value table of environment variables for each process.")
schema([
Column("pid", INTEGER, "Process (or thread) ID"),
Column("pid", INTEGER, "Process (or thread) ID", index=True),
Column("key", TEXT, "Environment variable name"),
Column("value", TEXT, "Environment variable value"),
ForeignKey(column="pid", table="processes"),
ForeignKey(column="pid", table="process_open_files"),
])
implementation("system/processes@genProcessEnvs")
examples([
"select * from process_envs where pid = 1",
'''select pe.*
from process_envs pe, (select * from processes limit 10) p
where p.pid = pe.pid;'''
])
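
As a rough illustration of what the JOIN example above selects, the sketch below runs the same query against an in-memory SQLite database. The rows are invented for the demo, and ordinary SQLite tables stand in for osquery's virtual tables, so this shows only the query semantics and says nothing about osquery's index optimization.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in tables with made-up rows; pid 3 has no matching process.
conn.executescript("""
    create table processes (pid integer, name text);
    create table process_envs (pid integer, key text, value text);
    insert into processes values (1, 'init'), (2, 'sshd');
    insert into process_envs values
        (1, 'PATH', '/sbin'), (2, 'LANG', 'C'), (3, 'HOME', '/root');
""")

# Same query text as the spec example.
query = '''select pe.*
  from process_envs pe, (select * from processes limit 10) p
  where p.pid = pe.pid;'''

# Sort so the result order does not depend on the query planner.
rows = sorted(conn.execute(query).fetchall())
```

Only environment rows whose `pid` appears in the limited `processes` subquery are returned, which is why the example bounds the subquery with `limit 10` to keep the join small.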

View File

@ -1,7 +1,7 @@
table_name("process_memory_map")
description("Process memory mapped files and pseudo device/regions.")
schema([
Column("pid", INTEGER, "Process (or thread) ID"),
Column("pid", INTEGER, "Process (or thread) ID", index=True),
Column("start", TEXT, "Virtual start address (hex)"),
Column("end", TEXT, "Virtual end address (hex)"),
Column("permissions", TEXT, "r=read, w=write, x=execute, p=private (cow)"),
@ -12,3 +12,6 @@ schema([
Column("pseudo", INTEGER, "1 if path is a pseudo path, else 0"),
])
implementation("processes@genProcessMemoryMap")
examples([
"select * from process_memory_map where pid = 1",
])

View File

@ -1,8 +1,11 @@
table_name("process_open_files")
description("File descriptors for each process.")
schema([
Column("pid", BIGINT, "Process (or thread) ID"),
Column("pid", BIGINT, "Process (or thread) ID", index=True),
Column("fd", BIGINT, "Process-specific file descriptor number"),
Column("path", TEXT, "Filesystem path of descriptor"),
])
implementation("system/process_open_files@genOpenFiles")
examples([
"select * from process_open_files where pid = 1",
])

View File

@ -1,7 +1,7 @@
table_name("process_open_sockets")
description("Processes which have open network sockets on the system.")
schema([
Column("pid", INTEGER, "Process (or thread) ID"),
Column("pid", INTEGER, "Process (or thread) ID", index=True),
Column("socket", INTEGER, "Socket descriptor number"),
Column("family", INTEGER, "Network protocol (IPv4, IPv6)"),
Column("protocol", INTEGER, "Transport protocol (TCP/UDP)"),
@ -11,4 +11,6 @@ schema([
Column("remote_port", INTEGER, "Socket remote port"),
])
implementation("system/process_open_sockets@genOpenSockets")
examples([
"select * from process_open_sockets where pid = 1",
])

View File

@ -1,7 +1,7 @@
table_name("processes")
description("All running processes on the host system.")
schema([
Column("pid", INTEGER, "Process (or thread) ID"),
Column("pid", INTEGER, "Process (or thread) ID", index=True),
Column("name", TEXT, "The process path or shorthand argv[0]"),
Column("path", TEXT, "Path to executed binary"),
Column("cmdline", TEXT, "Complete argv"),
@ -21,3 +21,6 @@ schema([
Column("parent", INTEGER, "Process parent's PID"),
])
implementation("system/processes@genProcesses")
examples([
"select * from processes where pid = 1",
])

View File

@ -4,10 +4,15 @@ schema([
Column("uid", BIGINT, "User ID"),
Column("gid", BIGINT, "Group ID (unsigned)"),
Column("uid_signed", BIGINT, "User ID as int64 signed (Apple)"),
Column("gid_signed", BIGINT, "Group ID as int64 signed (Apple)"),
Column("gid_signed", BIGINT, "Default group ID as int64 signed (Apple)"),
Column("username", TEXT),
Column("description", TEXT, "Optional user description"),
Column("directory", TEXT, "User's home directory"),
Column("shell", TEXT, "User's configured default shell"),
])
implementation("users@genUsers")
examples([
"select * from users where uid = 1000",
"select * from users where username = 'root'",
"select count(*) from users u, user_groups ug where u.uid = ug.uid",
])

View File

@ -24,3 +24,8 @@ schema([
])
attributes(utility=True)
implementation("utility/file@genFile")
examples([
"select * from file where path = '/etc/passwd'",
"select * from file where directory = '/etc/'",
"select * from file where pattern = '/etc/%'",
])

View File

@ -9,3 +9,7 @@ schema([
])
attributes(utility=True)
implementation("utility/hash@genHash")
examples([
"select * from hash where path = '/etc/passwd'",
"select * from hash where directory = '/etc/'",
])

Some files were not shown because too many files have changed in this diff