Module: Argument Parser
- Boost: http://www.boost.org/doc/libs/1_58_0/doc/html/program_options.html
- follow POSIX guidelines: http://bioportal.weizmann.ac.il/course/prog2/tutorial/essential/attributes/_posix.html
- a modern c++ standalone arg-parser: https://github.com/adishavit/argh
- https://www.gnu.org/software/libc/manual/html_node/Argument-Syntax.html
There are two ways to handle command line parsing and bridge the developer-user communication. Both have advantages and disadvantages:
Store the options/parameters/flags/.. specified by the developer in a list. When parsing starts, we already know the names, types and properties of every option and can dissect the command line piece by piece by looking for expected options.
Advantages:
- Knowing the options/.. beforehand can resolve ambiguities and helps to throw helpful errors. E.g. -i hello could be the flag -i followed by the argument/positional hello, or it could be an option-value pair. The developer is thus not restricted to setting up the parser in a certain order.
Disadvantages:
- Storing options is difficult with respect to type handling. SeqAn2 defines extra types like SEQAN::ARGUMENT_TYPE::INTEGER, which is a lot of overhead, limits the types, and requires maintenance. A post-C++11 solution would be to store the options/.., which are templatized by the option type, in a vector of std::any or via std::variant (both C++17). This gets tedious for a lot of functionality, like transforming the command line string into the value type, because I would always need to ask for the type.
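The type-handling overhead of the first approach can be illustrated with a small sketch. It assumes C++17; the names option, any_option, assign and parse_pair are hypothetical and not part of any existing parser. Note how every supported type has to be listed in the variant and how the string conversion has to ask for the type again:

```cpp
#include <stdexcept>
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

// Hypothetical option record, templatized by the value type.
template <typename value_t>
struct option
{
    char short_id;
    value_t * target; // where the parsed value is written
};

// Every supported value type has to be listed here explicitly.
using any_option = std::variant<option<int>, option<double>, option<std::string>>;

// Convert the raw string; the variant must be visited to recover the type.
inline void assign(any_option & opt, std::string const & raw)
{
    std::visit([&raw](auto & o)
    {
        using value_t = std::remove_pointer_t<decltype(o.target)>;
        if constexpr (std::is_same_v<value_t, int>)
            *o.target = std::stoi(raw);
        else if constexpr (std::is_same_v<value_t, double>)
            *o.target = std::stod(raw);
        else
            *o.target = raw;
    }, opt);
}

// Look up a stored option by its short id and assign the raw value to it.
inline void parse_pair(std::vector<any_option> & options, char id, std::string const & raw)
{
    for (auto & opt : options)
    {
        bool const match = std::visit([id](auto const & o) { return o.short_id == id; }, opt);
        if (match)
        {
            assign(opt, raw);
            return;
        }
    }
    throw std::invalid_argument{std::string{"unknown option -"} + id};
}
```

Supporting a further value type means touching both the variant and the conversion code, which is the maintenance cost described above.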
Parse the command line first and store the user-defined options/flags/parameters/.. in a list. Now check every option specified by the developer directly for existence.
Advantages:
- Since the developer specified options are retrieved directly I can have functions working on any type, because I don't need to store them.
Disadvantages:
- I need to take care of ambiguities, e.g. -i hello as flag+argument vs. option-value pair, by storing both possibilities and restricting the developer to set up the parser in a strict order, e.g. first specify all flags, then all options.
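A minimal sketch of the second approach, assuming the command line has already been split into a token list. The names token_list, get_flag and get_option are hypothetical. Because the value is written directly into a typed variable, no typed option storage is needed, but the interpretation of -i hello depends on which function the developer calls:

```cpp
#include <algorithm>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

using token_list = std::vector<std::string>;

// Removes "-<id>" from the tokens and reports whether it was present.
inline bool get_flag(token_list & tokens, char id)
{
    auto it = std::find(tokens.begin(), tokens.end(), std::string{'-', id});
    if (it == tokens.end())
        return false;
    tokens.erase(it);
    return true;
}

// Removes "-<id> <value>" from the tokens and converts the value to value_t.
template <typename value_t>
bool get_option(token_list & tokens, char id, value_t & value)
{
    auto it = std::find(tokens.begin(), tokens.end(), std::string{'-', id});
    if (it == tokens.end())
        return false; // not given, value keeps its previous content
    if (it + 1 == tokens.end())
        throw std::invalid_argument{"missing value for option"};
    std::istringstream stream{*(it + 1)};
    if (!(stream >> value))
        throw std::invalid_argument{"value has the wrong type"};
    tokens.erase(it, it + 2);
    return true;
}
```

Calling get_option on the tokens -i hello consumes both tokens, whereas get_flag consumes only -i and leaves hello behind as a positional argument.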
To avoid the disadvantages mentioned above, our current approach combines the two. We do not want to store the options in a list but parse them directly, yet we also want to allow the developer to specify options and arguments in any order.
// positional option
// in order, required (except if last one is a list)
add_positional_option(value_type & save_place, // e.g. myOptions.outputFile
std::string const & description, // temporary string for help page output
option_spec const & spec, // (optional) default, advanced, hidden
functor const & validator // (optional) functor satisfying the validator concept
);
// non-valued option (existence is true/false)
add_flag(bool & save_place, // e.g. myOptions.fastMode
char const short_id, // -f
std::string const & long_id, // --fast
std::string const & description // "fast mode is switched on"
);
// valued option
add_option(value_type & save_place, // e.g. myOptions.outputFile
char const short_id, // -o
std::string const & long_id, // --output
std::string const & description, // "give the name of an output file."
option_spec const & spec, // (optional) default, advanced, hidden
functor const & validator // (optional) functor satisfying the validator concept
);
save_place is an existing variable that also defines the type of the argument. It can be set to a default value when it is defined, outside of the add_* function.
validator is a functor that verifies the argument. It can be user-specified, but there shall be predefined validators for integral ranges (taking a pair of integral values) and file extensions (taking a container of std::string).
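One conceivable way to get typed, direct parsing without a typed option list is to let each add_* call store a closure that captures save_place by reference and runs later. The sketch below is an assumption for illustration, not the documented implementation: the class mini_parser is hypothetical, and its add_flag and add_option take only save_place and a short id (long ids, descriptions, specs and validators are omitted).

```cpp
#include <algorithm>
#include <functional>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical parser that defers typed parsing via type-erased closures.
class mini_parser
{
public:
    explicit mini_parser(std::vector<std::string> tokens) : tokens_{std::move(tokens)} {}

    void add_flag(bool & save_place, char short_id)
    {
        flag_calls_.push_back([this, &save_place, short_id]
        {
            auto it = this->locate(short_id);
            save_place = (it != tokens_.end());
            if (save_place)
                tokens_.erase(it);
        });
    }

    template <typename value_t>
    void add_option(value_t & save_place, char short_id)
    {
        option_calls_.push_back([this, &save_place, short_id]
        {
            auto it = this->locate(short_id);
            if (it == tokens_.end())
                return; // not given: save_place keeps its default value
            if (it + 1 == tokens_.end())
                throw std::invalid_argument{"missing value for option"};
            std::istringstream stream{*(it + 1)};
            if (!(stream >> save_place))
                throw std::invalid_argument{"value has the wrong type"};
            tokens_.erase(it, it + 2);
        });
    }

    // Options run before flags so that an option value is never mistaken for a flag.
    // The tokens that remain afterwards are the positional arguments.
    std::vector<std::string> parse()
    {
        for (auto & call : option_calls_)
            call();
        for (auto & call : flag_calls_)
            call();
        return tokens_;
    }

private:
    std::vector<std::string>::iterator locate(char id)
    {
        return std::find(tokens_.begin(), tokens_.end(), std::string{'-', id});
    }

    std::vector<std::string> tokens_;
    std::vector<std::function<void()>> flag_calls_;
    std::vector<std::function<void()>> option_calls_;
};
```

In this sketch the order of the add_* calls does not matter, because nothing is parsed until parse() is called. The closures capture this, so such a parser object must not be copied or moved after options have been added.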
Legal ways to specify flags (options without value):
Allowed:
-i (simple, one char short id)
-fi (clustered, one char short id)
--iter (simple, multi char long id, no cluster allowed)
Legal ways to specify options:
Allowed:
-i 5 (short id, space separation)
-i5 (short id, no space separation)
--iter 5 (long id, space separation)
--iter=5 (long id, equality sign separation)
Not allowed:
-i=5 (short id, equality sign separation)
--iter5 (long id, no space separation)
-fi 5 (where f is flag)
-fi5 (where f is flag)
-fi=5 (where f is flag)
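The rules above can be expressed as a small token classifier. This is a sketch under assumptions: id_table, token_kind and classify are hypothetical names, and the special tokens "-" and "--" are not interpreted here.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical tables of the ids the developer registered (without dashes).
struct id_table
{
    std::set<char> short_flags, short_options;
    std::set<std::string> long_flags, long_options;
};

enum class token_kind
{
    flag,           // -i, -fi, --iter
    value_attached, // -i5, --iter=5
    value_next,     // -i 5, --iter 5 (the value is the following token)
    invalid,        // -i=5, --iter5, -fi5, ...
    other           // positional arguments and the special tokens "-" and "--"
};

inline token_kind classify(std::string const & token, id_table const & ids)
{
    if (token.size() > 2 && token.compare(0, 2, "--") == 0) // long id
    {
        auto const eq = token.find('=');
        std::string const id = token.substr(2, eq == std::string::npos ? eq : eq - 2);
        if (eq != std::string::npos)
            return ids.long_options.count(id) ? token_kind::value_attached : token_kind::invalid;
        if (ids.long_flags.count(id))
            return token_kind::flag;
        return ids.long_options.count(id) ? token_kind::value_next : token_kind::invalid;
    }
    if (token.size() >= 2 && token[0] == '-' && token[1] != '-') // short id(s)
    {
        if (ids.short_options.count(token[1]))
        {
            if (token.size() == 2)
                return token_kind::value_next;
            return token[2] == '=' ? token_kind::invalid : token_kind::value_attached;
        }
        for (std::size_t i = 1; i < token.size(); ++i) // cluster: every character must be a flag
            if (!ids.short_flags.count(token[i]))
                return token_kind::invalid;
        return token_kind::flag;
    }
    return token_kind::other;
}
```

A token such as --iter5 is rejected simply because iter5 is not a registered long id, and -fi5 is rejected because a cluster may contain flags only.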
Validators are designed as functors which throw if a value does not pass validation and thus terminate the program. They are not semiregular since default construction is not possible right now. The reason for this decision was that there is no obvious use case for default constructing a validator and changing its state later on, as most validators are passed directly as rvalues to the add_option(myint, ..., range_validator{0,10}) call. If default construction is implemented later on, it would be preferable that a default-constructed validator always accepts every value (type dependent).
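A minimal sketch of such a validator, assuming C++17 class template argument deduction so that range_validator{0,10} works as written above. The member names and the choice of std::invalid_argument are assumptions, not the actual implementation:

```cpp
#include <stdexcept>
#include <string>

// Sketch of a range validator functor: throws if the value is outside [min, max].
template <typename value_t>
class range_validator
{
public:
    range_validator() = delete; // not default constructible, hence not semiregular

    range_validator(value_t min, value_t max) : min_{min}, max_{max} {}

    void operator()(value_t const & value) const
    {
        if (value < min_ || value > max_)
            throw std::invalid_argument{"value " + std::to_string(value) + " is not in range ["
                                        + std::to_string(min_) + "," + std::to_string(max_) + "]"};
    }

private:
    value_t min_;
    value_t max_;
};
```

If the exception is not caught anywhere, the program terminates, which matches the behaviour described above.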
- Naming convention: parameter option argument positional anonymous...
- Should the parser be put into its own repo?
Wishlist from h-2:
- also support enums as TValue and offer a validator that accepts the enum labels as valid strings. This will be a little tricky to implement, but one could check how e.g. Cereal gets string values from enum labels.
- the input_file_validator (and output_file_validator and directory...) shall check via http://en.cppreference.com/w/cpp/filesystem that the file/directory exists / is readable / writeable et cetera.
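The file check from the wishlist could look roughly like the following sketch, which assumes C++17 std::filesystem. Only existence and readability are checked here. The exception type and messages are assumptions, and any constructor arguments a real input_file_validator might take (for example a list of extensions) are left out:

```cpp
#include <filesystem>
#include <fstream>
#include <stdexcept>

// Sketch: throws if the given path is not an existing, readable regular file.
struct input_file_validator
{
    void operator()(std::filesystem::path const & file) const
    {
        if (!std::filesystem::exists(file))
            throw std::invalid_argument{"file does not exist: " + file.string()};
        if (!std::filesystem::is_regular_file(file))
            throw std::invalid_argument{"not a regular file: " + file.string()};

        // Opening the file is a simple, portable readability test.
        std::ifstream stream{file};
        if (!stream.is_open())
            throw std::invalid_argument{"file is not readable: " + file.string()};
    }
};
```

An output_file_validator could follow the same pattern and check instead whether the file can be opened for writing.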
Questions:
- rename option to parameter?
Check if the following is supported: (taken from seqan2 feature request issue #533)
Martin Frith notes that there are some deficiencies with the current state of the argument parser:
- It requires space between an option and its argument.
- It doesn't allow "-" as a positional argument. "-" is often used to indicate stdin and stdout.
- It doesn't seem to allow a list of zero or more positional arguments. (It does allow a list of one or more positional arguments.) It's standard to allow a list of zero or more file-names, with zero meaning "read stdin".
- It doesn't recognize "--" meaning "end of options" (http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap12.html). This is useful to allow negative numbers as positional arguments.
- It requires argv to be const, whereas, strictly speaking, argv should not be const (http://en.wikipedia.org/wiki/Main_function). And it's not possible to convert non-const argv to const (http://www.parashift.com/c++-faq/constptrptr-conversion.html).
- (Minor): It's not possible to display a fake default value in the help message. This might be useful if the default value is logically -INF, but as an implementation detail is actually INT_MIN.
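Some of these points can be checked against a small sketch. The names split_result, split_command_line and looks_like_option are hypothetical. The function takes argv as char const * const *, a parameter type to which a non-const char ** converts implicitly in C++. It treats "--" as the end of options, and the helper regards a lone "-" as a positional argument:

```cpp
#include <string>
#include <vector>

struct split_result
{
    std::vector<std::string> to_interpret;    // tokens before "--": options, values, positionals
    std::vector<std::string> positional_only; // tokens after "--": always positional
};

// A lone "-" (stdin/stdout) is not an option; "-5" is, unless it follows "--".
inline bool looks_like_option(std::string const & token)
{
    return token.size() >= 2 && token[0] == '-';
}

// argv[0] (the program name) is skipped.
inline split_result split_command_line(int argc, char const * const * argv)
{
    split_result result;
    bool options_ended = false;
    for (int i = 1; i < argc; ++i)
    {
        std::string const token{argv[i]};
        if (!options_ended && token == "--")
            options_ended = true; // the terminator itself is dropped
        else if (options_ended)
            result.positional_only.push_back(token);
        else
            result.to_interpret.push_back(token);
    }
    return result;
}
```

With the command line prog -v - -- -5, the tokens -v and - remain to be interpreted, while -5 is kept as a positional argument even though it starts with a dash.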