Picireny Hierarchical Delta Debugging Framework
Picireny is a Python 3 implementation of the Hierarchical Delta Debugging (HDD for short) algorithm, adapted to use ANTLR v4 for parsing both the input and the grammar(s) describing the format of the input. It relies on picire to provide the implementation of the core Delta Debugging algorithm along with various tweaks like parallelization. Just like the picire framework, picireny can be used either as a command line tool or as a library.
Both Hierarchical Delta Debugging and Delta Debugging automatically reduce “interesting” tests while keeping their “interesting” behaviour. (E.g., “interestingness” may mean failure-inducing input to a system-under-test.) However, HDD is an improvement that tries to investigate fewer test cases during the reduction process by making use of knowledge of the structure of the input.
The tool (and the algorithm) works iteratively in several ways. As a first step, it splits up the input into tokens and organizes them in a tree structure as defined by a grammar. Then, iteratively, it invokes Delta Debugging on each level of the tree from top to bottom, and DD is an iterative process itself, too. Finally, the nodes kept in the tree are “unparsed” to yield a reduced but still “interesting” output.
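The level-by-level loop described above can be illustrated with a minimal sketch. This is not picireny's actual code or API: the names `Node`, `ddmin`, and `hdd` are hypothetical, the tree is built by hand instead of by an ANTLR parser, the Delta Debugging routine is a simplified variant, and removed nodes are unparsed as empty strings (picireny substitutes minimal replacement strings instead).

```python
class Node:
    """Hypothetical tree node: leaves carry token text, inner nodes carry children."""
    def __init__(self, text="", children=None):
        self.text = text
        self.children = children or []
        self.removed = False


def unparse(node):
    """Concatenate the token text of all nodes that are still kept."""
    if node.removed:
        return ""  # assumption: empty replacement instead of a minimal string
    if not node.children:
        return node.text
    return "".join(unparse(child) for child in node.children)


def ddmin(items, test):
    """Simplified Delta Debugging: shrink items while test(items) stays True.

    Assumes test(items) is True for the initial list.
    """
    n = 2
    while len(items) >= 2:
        chunk = -(-len(items) // n)  # ceiling division
        starts = range(0, len(items), chunk)
        candidates = [items[s:s + chunk] for s in starts]              # subsets
        candidates += [items[:s] + items[s + chunk:] for s in starts]  # complements
        for candidate in candidates:
            if test(candidate):
                items, n = candidate, 2  # keep the smaller set, restart coarse
                break
        else:
            if chunk == 1:
                break  # finest granularity reached, nothing more to remove
            n = min(2 * n, len(items))
    return items


def nodes_at_level(root, depth):
    """Collect the kept nodes that are exactly `depth` edges below the root."""
    level = [root]
    for _ in range(depth):
        level = [c for node in level if not node.removed for c in node.children]
    return [node for node in level if not node.removed]


def hdd(root, is_interesting):
    """Run ddmin on each tree level from top to bottom, then unparse the result."""
    depth = 0
    while True:
        level = nodes_at_level(root, depth)
        if not level:
            break

        def test(kept, level=level):
            kept_ids = {id(node) for node in kept}
            for node in level:
                node.removed = id(node) not in kept_ids
            return is_interesting(unparse(root))

        test(ddmin(level, test))  # re-apply the final configuration
        depth += 1
    return unparse(root)


# Hand-built example tree: three statements, each made of tokens.
tree = Node(children=[
    Node(children=[Node("a=1"), Node(";")]),
    Node(children=[Node("crash()"), Node(";")]),
    Node(children=[Node("b=2"), Node(";")]),
])
result = hdd(tree, lambda text: "crash()" in text)
print(result)
```

In this toy setting the statement level is reduced first, so whole statements are dropped before any individual token is tested, which is where the saving over plain Delta Debugging comes from.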
The quick way:
pip install picireny
Alternatively, by cloning the project and running setuptools:
python setup.py install
picireny uses the same CLI as picire and hence accepts the same options. On top of the inherited ones, picireny accepts several further arguments:
- --grammar (optional): List of grammars describing the input format. (You can write them by hand or simply download them from the ANTLR v4 grammars repository.)
- --start (optional): Name of the start rule (optionally prefixed with a grammar name) as [grammarname:]rulename.
- --replacements (optional): JSON file containing rule names and minimal replacement strings (otherwise these are calculated automatically) (see schema).
- --format (optional): JSON file describing the input format (see schema and example). This descriptor can incorporate all of the above properties (--grammar, --start and --replacements), along with island grammar definitions. If both --format and the aforementioned arguments are present, the arguments override the corresponding values of the format file.
- --antlr (optional): Path to the ANTLR tool jar.
- --parser (optional): Language of the generated parser. Currently ‘python’ (default) and ‘java’ targets (faster, but needs JDK) are supported.
Note: although all the arguments are optional, the grammar files and the start rule of the top-level parser must be defined with some combination of the --format, --grammar, and --start arguments.
Example usage to reduce an HTML file:
picireny --input=<path/to/the/input.html> --test=<path/to/the/tester> \
    --grammar "HTMLLexer.g4 HTMLParser.g4" --start htmlDocument \
    --parallel --subset-iterator=skip --complement-iterator=backward
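The tester passed via --test decides whether a candidate input is still “interesting”. The following is a minimal sketch of such a script, assuming the picire convention that the tester is an executable that receives the path of the candidate test case as its only argument and exits with 0 when the input is interesting and with a non-zero code otherwise. The `<marquee` marker is purely hypothetical; a real tester would typically run the system-under-test on the file and check for the failure.

```python
#!/usr/bin/env python3
import sys


def is_interesting(path, marker="<marquee"):
    """Return True if the candidate file still contains the (hypothetical) marker."""
    with open(path, encoding="utf-8", errors="replace") as candidate:
        return marker in candidate.read()


def main(path):
    """Map the verdict to an exit code: 0 = interesting, 1 = not interesting."""
    return 0 if is_interesting(path) else 1


# Only act when invoked with exactly one argument (the candidate path).
if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(main(sys.argv[1]))
```

The script has to be marked executable before its path is given as the --test argument.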
picireny was tested on:
- Linux (Ubuntu 14.04 / 15.10 / 16.04)
- Mac OS X (El Capitan 10.11 / Sierra 10.12)
- Windows (Server 2012 R2)
Copyright and Licensing