Add code optimizing the boolean expressions in the abstract syntax tree
before generating output using the backend.
The main idea behind optimizing the AST is that less repeated terms is
generally better for backend performance. This is especially relevant
to backends that do not perform any query language optimization down
the road, such as those that generate code.
The following optimizations are currently performed:
- Removal of empty OR(), AND()
- OR(X), AND(X) => X
- OR(X, X, ...), AND(X, X, ...) => OR(X, ...), AND(X, ...)
- OR(X, OR(Y)) => OR(X, Y)
- OR(AND(X, ...), AND(X, ...)) => AND(X, OR(AND(...), AND(...)))
- NOT(NOT(X)) => X
A common example for when these suboptimal rules actually occur in
practice is when a rule has multiple alternative detections that are
OR'ed together in the condition, and all of the detections include a
common element, such as the same EventID.
This implementation is not optimized for performance and will perform
poorly on very large expressions.
Breaking change: Instead of feeding the output class with the results,
they are now returned as strings (*Backend.generate()) or list
(SigmaCollectionParser.generate()). Users of the library must now take
care of the output to the terminal, files or wherever Sigma rules should
be pushed to.