1 parent 656cc14, commit 22b68e4
Showing 5 changed files with 69 additions and 33 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,10 +1,23 @@ | ||
To use Kaleidoscope, first import its package, | ||
Kaleidoscope is included in the `kaleidoscope` package, and exported to the `soundness` package. | ||
|
||
To use Kaleidoscope alone, you can include the import, | ||
```scala | ||
import kaleidoscope.* | ||
``` | ||
or to use it with other [Soundness](https://github.com/propensive/soundness/) libraries, include: | ||
```scala | ||
import soundness.* | ||
``` | ||
|
||
and you can then use a Kaleidoscope regular expression—a string prefixed with | ||
the letter `r`—anywhere you can use a pattern in Scala. For example, | ||
> Note that Kaleidoscope uses the `Text` type from | ||
> [Anticipation](https://github.com/propensive/anticipation) and the `Optional` | ||
> type from [Vacuous](https://github.com/propensive/vacuous/). These offer some | ||
> advantages, but they can be easily converted: `Text#s` converts a `Text` to a | ||
> `String` and `Optional#option` converts an `Optional` value to its equivalent | ||
> `Option`. The necessary imports are shown in the examples. | |
You can then use a Kaleidoscope regular expression—a string prefixed with | ||
the letter `r`—anywhere you can pattern match against a string in Scala. For example, | ||
```scala | ||
import anticipation.Text | ||
|
||
|
@@ -31,7 +44,7 @@ with the exception that a capturing group (enclosed within `(` and `)`) may be | |
bound to an identifier by placing it, like an interpolated string substitution, | ||
immediately prior to the capturing group, as `$identifier` or `${identifier}`. | ||
|
||
Here is an example: | ||
Here is an example of using a pattern match against filenames: | ||
```scala | ||
enum FileType: | ||
case Image(text: Text) | ||
|
@@ -42,29 +55,41 @@ def identify(path: Text): FileType = path match | |
case r"/styles/$styles(.*)" => FileType.Stylesheet(styles) | ||
``` | ||
|
||
Alternatively, this can be extracted directly in a `val` definition, like so: | ||
Alternatively, as with patterns in general, this can be extracted directly in a | ||
`val` definition. | ||
|
||
Here is an example of matching an email address: | ||
```scala | ||
val r"^[a-z0-9._%+-]+@$domain([a-z0-9.-]+\.$tld([a-z]{2,6}))$$" = | ||
"[email protected]": @unchecked | ||
``` | ||
In the REPL, this would bind the following values: | ||
|
||
The `@unchecked` annotation ascribed to the result is standard Scala, and | ||
acknowledges to the compiler that the match is _partial_ and may fail at | ||
runtime. | ||
|
||
If you try this example in the Scala REPL, it will bind the following values: | |
``` | ||
> domain: Text = t"example.com" | ||
> tld: Text = t"com" | ||
``` | ||
|
||
In addition, the syntax of the regular expressionwill be checked at compile-time, and any | ||
issues will be reported then. | ||
In addition, the syntax of the regular expression will be checked at | ||
compile-time, and any issues will be reported then. | ||
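For comparison, here is a minimal sketch of the same email extraction written with only `scala.util.matching.Regex` from the standard library. The names `emailPattern` and `domainAndTld`, and the sample address in the usage note, are invented for illustration and are not part of Kaleidoscope. In this form the groups are bound by position rather than by name, the extracted values are `String`s rather than `Text`, and a malformed pattern would only be reported when the `Regex` is constructed at runtime.

```scala
import scala.util.matching.Regex

// The same pattern as in the example above, but with plain positional groups.
// Triple quotes keep the backslash literal; `$` needs no doubling outside an interpolator.
val emailPattern: Regex = """^[a-z0-9._%+-]+@([a-z0-9.-]+\.([a-z]{2,6}))$""".r

// Hypothetical helper: returns (domain, tld), or None when the input does not match.
def domainAndTld(address: String): Option[(String, String)] = address match
  case emailPattern(domain, tld) => Some((domain, tld))
  case _                         => None
```

Calling `domainAndTld("user@example.com")` yields `Some(("example.com", "com"))`, the same two values that the Kaleidoscope pattern binds as `domain` and `tld`.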
|
||
### Repeated and optional capture groups | ||
|
||
A normal, unitary capturing group will extract into a `Text` value. But if a capturing group has | ||
a repetition suffix, such as `*` or `+`, then the extracted type will be a `List[Text]`. This also | ||
applies to repetition ranges, such as `{3}`, `{2,}` or `{1,9}`. Note that `{1}` will still extract | ||
a `Text` value. | ||
A normal, _unitary_ capturing group, like `domain` and `tld` above, will | ||
extract into `Text` values. But if a capturing group has a repetition suffix, | ||
such as `*` or `+`, then the extracted type will be a `List[Text]`. This also | ||
applies to repetition ranges, such as `{3}`, `{2,}` or `{1,9}`. | ||
|
||
Note that `{1}` will still extract a `Text` value. The type is determined | ||
statically from the pattern, and not dynamically from the runtime scrutinee. | ||
|
||
A capture group may be marked as optional, meaning it can appear either zero or one times. This | ||
will extract a value with the type `Option[Text]`. | ||
A capture group may be marked as optional, meaning it can appear either zero or | ||
one times. This will extract a value with the type `Optional[Text]`; that is, | ||
if it is present it will be a `Text` value, and if not, it will be `Unset`. | |
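To see why typed extraction of repeated and optional groups is useful, it may help to look at how the standard library behaves. The sketch below uses only `scala.util.matching.Regex`; the patterns and the helper names `lastRepetition` and `schemeOf` are assumptions made up for this illustration. With the standard library, a repeated group yields only its last repetition, and an optional group that did not take part in the match is bound to `null`. This is the information that Kaleidoscope, as described above, exposes as `List[Text]` and `Optional[Text]` instead.

```scala
import scala.util.matching.Regex

// A repeated group `(...)*` followed by a final group.
val listPattern: Regex = """([a-z]+, )*and ([a-z]+)""".r

// Hypothetical helper: the repeated group keeps only its LAST repetition,
// and is null (wrapped as None here) when it repeats zero times.
def lastRepetition(input: String): Option[(Option[String], String)] = input match
  case listPattern(init, last) => Some((Option(init), last))
  case _                       => None

// An optional group `(...)?` followed by the rest of the input.
val schemePattern: Regex = """([a-z]+:)?(.*)""".r

// Hypothetical helper: an unmatched optional group is bound to null,
// so it is wrapped in Option to make the absence explicit.
def schemeOf(input: String): Option[String] = input match
  case schemePattern(found, _) => Option(found)
  case _                       => None
```

For the input `"parsley, sage, rosemary, and thyme"`, `lastRepetition` returns only `"rosemary, "` for the repeated group; the earlier repetitions are lost.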
|
||
For example, see how `init` is extracted as a `List[Text]`, below: | ||
```scala | ||
|
@@ -80,9 +105,15 @@ def parseList(): List[Text] = "parsley, sage, rosemary, and thyme" match | |
|
||
Note that inside an extractor pattern string, whether it is single- (`r"..."`) | ||
or triple-quoted (`r"""..."""`), special characters, notably `\`, do not need | ||
to be escaped, with the exception of `$` which should be written as `$$`. It is | ||
still necessary, however, to follow the regular expression escaping rules, for | ||
example, an extractor matching a single opening parenthesis would be written as | ||
`r"\("` or `r"""\("""`. | ||
to be escaped, with the exception of `$`, which should be written as `$$`. | |
|
||
It is still necessary, however, to follow the regular expression escaping | |
rules. For example, an extractor matching a single opening parenthesis would be | |
written as `r"\("` or `r"""\("""`. | |
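The two levels of escaping described above (string-level and regex-level) can be seen with the standard library alone. The following sketch does not use Kaleidoscope; it relies on Scala's built-in `raw` interpolator, which, like the rules stated above for `r"..."`, leaves `\` untouched and uses `$$` for a literal dollar sign. All value and function names are hypothetical.

```scala
// In an ordinary string literal, the backslash must itself be escaped.
val escaped: String = "\\("

// With the standard `raw` interpolator, or in triple quotes, it is written once.
val rawSingle: String = raw"\("
val tripleQuoted: String = """\("""

// Inside an interpolated string, a literal dollar sign is written as `$$`.
val digitsOnly: String = raw"^[0-9]+$$"

// The regex-level escape `\(` is still required to match a literal parenthesis.
def matchesOpenParen(input: String): Boolean = rawSingle.r.matches(input)
```

All three spellings produce the same two-character string, a backslash followed by `(`, which the regular expression engine then reads as an escaped parenthesis.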
|
||
## Globs | ||
|
||
Globs offer a simplified and limited form of regular expression. You can use | ||
these in exactly the same way as a standard regular expression, using the | |
`g"..."` interpolator instead. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
- pattern match strings against regular expressions | ||
- regular expressions can be written inline in patterns | ||
- extraction of capturing groups in patterns | ||
- typed extraction (into `List`s or `Option`s) of variable-length capturing groups | ||
- static verification of regular expression syntax | ||
- regular expressions can be written inline in patterns, anywhere a string could match | ||
- direct extraction of capturing groups in patterns | ||
- typed extraction (into `List`s or [Vacuous](https://github.com/propensive/vacuous/) `Optional`s) of variable-length capturing groups | ||
- static checking of regular expression syntax | ||
- simpler "glob" syntax is also provided |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,9 +1,14 @@ | ||
Kaleidoscope is a small library to make pattern matching against strings more pleasant. Regular | ||
expressions can be written directly in patterns, and capturing groups bound directly to variables, | ||
typed according to the group's repetition. Here is an example: | ||
```amok scala | ||
Kaleidoscope is a small library to make pattern matching against strings more | ||
pleasant. Regular expressions can be written directly in patterns, and | ||
capturing groups bound directly to variables, typed according to the group's | ||
repetition. Here is an example: | ||
```scala | ||
case class Email(user: Text, domain: Text) | ||
|
||
email match | ||
  case r"$user([^@]+)@$domain(.*)" => Email(user, domain)
``` | ||
``` | ||
|
||
Strings are widely used to carry complex data, when it's wiser to use | ||
structured objects. Kaleidoscope makes it easier to move away from strings. | ||
|