Updated documentation
propensive committed May 9, 2024
1 parent 656cc14 commit 22b68e4
Showing 5 changed files with 69 additions and 33 deletions.
67 changes: 49 additions & 18 deletions doc/basics.md
@@ -1,10 +1,23 @@
To use Kaleidoscope, first import its package,
Kaleidoscope is included in the `kaleidoscope` package, and exported to the `soundness` package.

To use Kaleidoscope alone, you can include the import,
```scala
import kaleidoscope.*
```
or to use it with other [Soundness](https://github.com/propensive/soundness/) libraries, include:
```scala
import soundness.*
```

and you can then use a Kaleidoscope regular expression—a string prefixed with
the letter `r`—anywhere you can use a pattern in Scala. For example,
> Note that Kaleidoscope uses the `Text` type from
> [Anticipation](https://github.com/propensive/anticipation) and the `Optional`
> type from [Vacuous](https://github.com/propensive/vacuous/). These offer some
> advantages, but they can be easily converted: `Text#s` converts a `Text` to a
> `String` and `Optional#option` converts an `Optional` value to its equivalent
> `Option`. The necessary imports are shown in the examples.
You can then use a Kaleidoscope regular expression—a string prefixed with
the letter `r`—anywhere you can pattern match against a string in Scala. For example,
```scala
import anticipation.Text

@@ -31,7 +44,7 @@ with the exception that a capturing group (enclosed within `(` and `)`) may be
bound to an identifier by placing it, like an interpolated string substitution,
immediately prior to the capturing group, as `$identifier` or `${identifier}`.

Here is an example:
Here is an example of using a pattern match against filenames:
```scala
enum FileType:
  case Image(text: Text)
@@ -42,29 +55,41 @@ def identify(path: Text): FileType = path match
  case r"/styles/$styles(.*)" => FileType.Stylesheet(styles)
```

Alternatively, this can be extracted directly in a `val` definition, like so:
Alternatively, as with patterns in general, this can be extracted directly in a
`val` definition.

Here is an example of matching an email address:
```scala
val r"^[a-z0-9._%+-]+@$domain([a-z0-9.-]+\.$tld([a-z]{2,6}))$$" =
"[email protected]": @unchecked
```
In the REPL, this would bind the following values:

The `@unchecked` annotation ascribed to the result is standard Scala, and
acknowledges to the compiler that the match is _partial_ and may fail at
runtime.

If you try this example in the Scala REPL, it would bind the following values:
```
> domain: Text = t"example.com"
> tld: Text = t"com"
```

In addition, the syntax of the regular expressionwill be checked at compile-time, and any
issues will be reported then.
In addition, the syntax of the regular expression will be checked at
compile-time, and any issues will be reported then.
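
For comparison, here is a minimal sketch of the same kind of `val` extraction written with only the Scala standard library's `scala.util.matching.Regex`, not Kaleidoscope. The names `EmailPattern` and `domainAndTld` and the address `user@example.com` are assumptions made for illustration. In this form the groups are bound by position rather than by name, the values are `String`s rather than `Text`, and a malformed pattern would only be reported at runtime.

```scala
import scala.util.matching.Regex

// Standard-library analogue (not Kaleidoscope): capturing groups are positional.
val EmailPattern: Regex = """^[a-z0-9._%+-]+@([a-z0-9.-]+\.([a-z]{2,6}))$""".r

// Hypothetical helper: returns the domain and top-level domain of an address.
def domainAndTld(address: String): (String, String) =
  // `@unchecked` acknowledges that the match is partial and may fail at runtime.
  val EmailPattern(domain, tld) = address: @unchecked
  (domain, tld)
```

Calling `domainAndTld` with an address that does not match the pattern throws a `MatchError`, which is the runtime failure that `@unchecked` acknowledges.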

### Repeated and optional capture groups

A normal, unitary capturing group will extract into a `Text` value. But if a capturing group has
a repetition suffix, such as `*` or `+`, then the extracted type will be a `List[Text]`. This also
applies to repetition ranges, such as `{3}`, `{2,}` or `{1,9}`. Note that `{1}` will still extract
a `Text` value.
A normal, _unitary_ capturing group, like `domain` and `tld` above, will
extract into `Text` values. But if a capturing group has a repetition suffix,
such as `*` or `+`, then the extracted type will be a `List[Text]`. This also
applies to repetition ranges, such as `{3}`, `{2,}` or `{1,9}`.

Note that `{1}` will still extract a `Text` value. The type is determined
statically from the pattern, and not dynamically from the runtime scrutinee.

A capture group may be marked as optional, meaning it can appear either zero or one times. This
will extract a value with the type `Option[Text]`.
A capture group may be marked as optional, meaning it can appear either zero or
one times. This will extract a value with the type `Optional[Text]`; that is,
if it is present it will be a `Text` value, and if not, it will be `Unset`.
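
The rules above can be summarized as a small decision table. The following is a sketch under stated assumptions, not Kaleidoscope's implementation: `Quantifier`, `Extracted` and `extractedType` are hypothetical definitions that only loosely echo the `Quantifier` name visible in `kaleidoscope.Regex.scala`. It shows that the choice depends only on the quantifier written in the pattern, never on the text being matched.

```scala
// Hypothetical model of a capturing group's repetition suffix.
enum Quantifier:
  case Exactly(n: Int)              // no suffix is Exactly(1); `{3}` is Exactly(3)
  case AtLeast(n: Int)              // `*` is AtLeast(0); `+` is AtLeast(1); `{2,}` is AtLeast(2)
  case Between(min: Int, max: Int)  // `?` is Between(0, 1); `{1,9}` is Between(1, 9)

// Hypothetical stand-ins for the three extracted types described in the text.
enum Extracted:
  case Single  // Text
  case Maybe   // Optional[Text]
  case Many    // List[Text]

// The extracted type is decided from the quantifier alone.
def extractedType(quantifier: Quantifier): Extracted = quantifier match
  case Quantifier.Exactly(1)    => Extracted.Single
  case Quantifier.Between(0, 1) => Extracted.Maybe
  case _                        => Extracted.Many  // assumption: all other cases are lists
```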

For example, see how `init` is extracted as a `List[Text]`, below:
```scala
@@ -80,9 +105,15 @@ def parseList(): List[Text] = "parsley, sage, rosemary, and thyme" match

Note that inside an extractor pattern string, whether it is single- (`r"..."`)
or triple-quoted (`r"""..."""`), special characters, notably `\`, do not need
to be escaped, with the exception of `$` which should be written as `$$`. It is
still necessary, however, to follow the regular expression escaping rules, for
example, an extractor matching a single opening parenthesis would be written as
`r"\("` or `r"""\("""`.
to be escaped, with the exception of `$` which should be written as `$$`.

It is still necessary, however, to follow the regular expression escaping
rules, for example, an extractor matching a single opening parenthesis would be
written as `r"\("` or `r"""\("""`.
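
The distinction between string escaping and regular expression escaping can be seen with the standard library alone. This sketch uses `scala.util.matching.Regex` rather than Kaleidoscope, and the value names are assumptions made for illustration. A triple-quoted string leaves `\` untouched, but the pattern still needs `\(` to match a literal parenthesis.

```scala
import scala.util.Try

// Standard library only: no string-level escaping is needed for `\` here,
// but the regular expression still requires `\(` for a literal `(`.
val openParen = """\(""".r

val matchesParen: Boolean = openParen.matches("(")

// An unescaped `(` opens a group that is never closed, so compiling it fails.
val unescapedFails: Boolean = Try("(".r).isFailure
```

With the standard library, the malformed pattern is only detected when the `Regex` is constructed at runtime.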

## Globs

Globs offer a simplified and limited form of regular expression. You can use
these in exactly the same way as a standard regular expression, using the
`g"..."` interpolator instead.
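
As a general illustration of what a glob is, the sketch below uses the JDK's own glob support through `java.nio.file.PathMatcher`. This is not Kaleidoscope's `g"..."` interpolator, and the exact glob syntax Kaleidoscope accepts may differ. The value names and file names are assumptions made for illustration.

```scala
import java.nio.file.{FileSystems, Paths}

// JDK glob syntax (not Kaleidoscope's): `*` matches any run of characters
// within a single path segment.
val pngGlob = FileSystems.getDefault.getPathMatcher("glob:*.png")

val matchesPng: Boolean = pngGlob.matches(Paths.get("logo.png"))
val matchesCss: Boolean = pngGlob.matches(Paths.get("main.css"))
```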

8 changes: 4 additions & 4 deletions doc/features.md
@@ -1,6 +1,6 @@
- pattern match strings against regular expressions
- regular expressions can be written inline in patterns
- extraction of capturing groups in patterns
- typed extraction (into `List`s or `Option`s) of variable-length capturing groups
- static verification of regular expression syntax
- regular expressions can be written inline in patterns, anywhere a string could match
- direct extraction of capturing groups in patterns
- typed extraction (into `List`s or [Vacuous](https://github.com/propensive/vacuous/) `Optional`s) of variable-length capturing groups
- static checking of regular expression syntax
- simpler "glob" syntax is also provided
15 changes: 10 additions & 5 deletions doc/intro.md
@@ -1,9 +1,14 @@
Kaleidoscope is a small library to make pattern matching against strings more pleasant. Regular
expressions can be written directly in patterns, and capturing groups bound directly to variables,
typed according to the group's repetition. Here is an example:
```amok scala
Kaleidoscope is a small library to make pattern matching against strings more
pleasant. Regular expressions can be written directly in patterns, and
capturing groups bound directly to variables, typed according to the group's
repetition. Here is an example:
```scala
case class Email(user: Text, domain: Text)

email match
  case r"$user([^@]+)@$domain(.*)" => Email(user, domain)
```
```

Strings are widely used to carry complex data, when it's wiser to use
structured objects. Kaleidoscope makes it easier to move away from strings.

2 changes: 1 addition & 1 deletion fury
@@ -17,4 +17,4 @@ project kaleidoscope
include kaleidoscope/core probably/cli larceny/plugin
sources src/test
main kaleidoscope.Tests
coverage kaleidoscope/core
# coverage kaleidoscope/core
10 changes: 5 additions & 5 deletions src/core/kaleidoscope.Regex.scala
@@ -30,6 +30,7 @@ import RegexError.Reason.*

object Regex:
private val cache: ConcurrentHashMap[String, Pattern] = ConcurrentHashMap()

enum Greed:
case Greedy, Reluctant, Possessive

@@ -172,17 +173,17 @@ object Regex:
val mainGroup = group(0, Nil, true)

def check(groups: List[Group], canCapture: Boolean): Unit =
groups.foreach: group =>
groups.each: group =>
if !canCapture && group.capture then abort(RegexError(Uncapturable))
check(group.groups, canCapture && group.quantifier.unitary)

check(mainGroup.groups, true)

Regex(text, mainGroup.groups)

def makePattern
(pattern: Text, todo: List[Regex.Group], last: Int, text: Text, end: Int, index: Int)
def makePattern(pattern: Text, todo: List[Group], last: Int, text: Text, end: Int, index: Int)
: (Int, Text) =

todo match
case Nil =>
(index, (text.s+pattern.s.substring(last, end).nn).tt)
@@ -229,8 +230,7 @@ case class Regex(pattern: Text, groups: List[Regex.Group]):
val submatcher = compiled.matcher(matchedText).nn
var submatches: List[Text] = Nil

while submatcher.find()
do submatches ::= submatcher.toMatchResult.nn.group(0).nn.tt
while submatcher.find() do submatches ::= submatcher.toMatchResult.nn.group(0).nn.tt

if group.quantifier == Regex.Quantifier.Between(0, 1)
then submatches.prim :: matches