SingleSel1 | SingleSel2 |
---|---|
echo '<input type="radio" name="Sex" value="F" />' | tee /tmp/cascadia.xml | cascadia -i -o -c 'input[name=Sex][value=F]' |
cascadia -i /tmp/cascadia.xml -o -c 'input[name=Sex][value=F]' |
The Go Cascadia package implements CSS selectors for HTML. This is its command-line tool, which started as a thin wrapper around that package but is growing into a better way to test CSS selectors without writing Go code.
Its output has two modes, non-block selection mode and block selection mode, depending on whether the --piece
parameter is given on the command line.
For details about the concept of blocks and pieces, check out andrew-d/goscrape (in fact, cascadia
was initially developed just for it, so that I wouldn't need to tweak Go code, then build and run it, just to test out block and piece selectors). Here is an excerpt:
- Inside each page, there's 1 or more blocks - some logical method of splitting up a page into subcomponents.
- Inside each block, you define some number of pieces of data that you wish to extract. Each piece consists of a name, a selector, and what data to extract from the current block.
This all sounds rather complicated, but in practice it's quite simple. See the next section for details.
In summary,
- The non-block selection mode will output the selection as HTML source by default
  - but if the `-t`, or `--text`, cli option is provided, the non-block selection mode will output as text instead.
    - By default, such text output will have its leading and trailing white space trimmed.
    - However, if the `-R`, or `--Raw`, cli option is provided, no trimming will be done.
- The block selection mode will output HTML as text in a `tsv`/`csv` table form by default
  - if the `--piece` selection is prefixed with `RAW:`, then that specific block selection will output in HTML instead. See the following for details.
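As a rough illustration of the output options summarized above, the following sketch runs one selector three times: with the default HTML output, with `-t`, and with `-t -R`. The sample file `/tmp/cascadia-text.html` and its contents are made up for this example and are not part of the project. No output is shown, since the exact formatting may differ between versions.

```shell
# Hypothetical sample input: a paragraph whose text is padded with spaces.
printf '%s\n' '<p class="note">   hello world   </p>' > /tmp/cascadia-text.html

if command -v cascadia >/dev/null 2>&1; then
  # Default: the selection is printed as HTML source.
  cascadia -i /tmp/cascadia-text.html -o -c 'p.note'

  # -t / --text: the selection is printed as text, trimmed by default.
  cascadia -i /tmp/cascadia-text.html -o -c 'p.note' -t

  # -t together with -R / --Raw: the text is printed without trimming.
  cascadia -i /tmp/cascadia-text.html -o -c 'p.note' -t -R
else
  # Keep the sketch harmless on machines without the tool.
  echo "cascadia is not installed; skipping the demo" >&2
fi
```

Comparing the second and third runs should make the effect of the trimming visible, because the padding around `hello world` is only kept in the last one.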
All three `-i -o -c` options are required. By default it reads from `stdin` and outputs to `stdout`:
```shell
$ {{shell .SingleSel1}}
```
Either the input or the output can be followed by a file name:
```shell
$ {{shell .SingleSel2}}

$ cascadia -i /tmp/cascadia.xml -c 'input[name=Sex][value=F]' -o /tmp/out.html
1 elements for 'input[name=Sex][value=F]':

$ cat /tmp/out.html
<input type="radio" name="Sex" value="F"/>
```
Other options can be applied as well:
```shell
# using --wrap-html
$ cascadia -i /tmp/cascadia.xml -c 'input[name=Sex][value=F]' -o /tmp/out.html -w
1 elements for 'input[name=Sex][value=F]':

$ cat /tmp/out.html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<base href="">
</head>
<body>
<input type="radio" name="Sex" value="F"/>
</body>

# using --wrap-html with --style
$ cascadia -i /tmp/cascadia.xml -c 'input[name=Sex][value=F]' -o /tmp/out.html -w -y '<link rel="stylesheet" href="styles.css">'
1 elements for 'input[name=Sex][value=F]':

$ cat /tmp/out.html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<base href="">
<link rel="stylesheet" href="styles.css">
</head>
<body>
<input type="radio" name="Sex" value="F"/>
</body>
```
- For more on using the `--style` option, check out "adding styles".
- For more examples, check out the wiki, which includes, but is not limited to,