Updated documentation

propensive · May 9, 2024 · 22b68e4
1 parent 656cc14
commit 22b68e4
Showing 5 changed files with 69 additions and 33 deletions.
diff --git a/doc/basics.md b/doc/basics.md
@@ -1,10 +1,23 @@
-To use Kaleidoscope, first import its package,
+Kaleidoscope is included in the `kaleidoscope` package, and exported to the `soundness` package.
+
+To use Kaleidoscope alone, you can include the import,
 ```scala
 import kaleidoscope.*
 ```
+or to use it with other [Soundness](https://github.com/propensive/soundness/) libraries, include:
+```scala
+import soundness.*
+```
 
-and you can then use a Kaleidoscope regular expression—a string prefixed with
-the letter `r`—anywhere you can use a pattern in Scala. For example,
+> Note that Kaleidoscope uses the `Text` type from
+> [Anticipation](https://github.com/propensive/anticipation) and the `Optional`
+> type from [Vacuous](https://github.com/propensive/vacuous/). These offer some
+> advantages, but they can be easily converted: `Text#s` converts a `Text` to a
+> `String` and `Optional#option` converts an `Optional` value to its equivalent
+> `Option`. The necessary imports are shown in the examples.
+
+You can then use a Kaleidoscope regular expression—a string prefixed with
+the letter `r`—anywhere you can pattern match against a string in Scala. For example,
 ```scala
 import anticipation.Text
 
@@ -31,7 +44,7 @@ with the exception that a capturing group (enclosed within `(` and `)`) may be
 bound to an identifier by placing it, like an interpolated string substitution,
 immediately prior to the capturing group, as `$identifier` or `${identifier}`.
 
-Here is an example:
+Here is an example of using a pattern match against filenames:
 ```scala
 enum FileType:
   case Image(text: Text)
@@ -42,29 +55,41 @@ def identify(path: Text): FileType = path match
   case r"/styles/$styles(.*)" => FileType.Stylesheet(styles)
 ```
 
-Alternatively, this can be extracted directly in a `val` definition, like so:
+Alternatively, as with patterns in general, this can be extracted directly in a
+`val` definition.
+
+Here is an example of matching an email address:
 ```scala
 val r"^[a-z0-9._%+-]+@$domain([a-z0-9.-]+\.$tld([a-z]{2,6}))$$" =
   "[email protected]": @unchecked
 ```
-In the REPL, this would bind the following values:
+
+The `@unchecked` annotation ascribed to the result is standard Scala, and
+acknowledges to the compiler that the match is _partial_ and may fail at
+runtime.
+
+If you try this example in the Scala REPL, it will bind the following values:
 ```
 > domain: Text = t"example.com"
 > tld: Text = t"com"
 ```
 
-In addition, the syntax of the regular expressionwill be checked at compile-time, and any
-issues will be reported then.
+In addition, the syntax of the regular expression will be checked at
+compile-time, and any issues will be reported then.
 
 ### Repeated and optional capture groups
 
-A normal, unitary capturing group will extract into a `Text` value. But if a capturing group has
-a repetition suffix, such as `*` or `+`, then the extracted type will be a `List[Text]`. This also
-applies to repetition ranges, such as `{3}`, `{2,}` or `{1,9}`. Note that `{1}` will still extract
-a `Text` value.
+A normal, _unitary_ capturing group, like `domain` and `tld` above, will
+extract into `Text` values. But if a capturing group has a repetition suffix,
+such as `*` or `+`, then the extracted type will be a `List[Text]`. This also
+applies to repetition ranges, such as `{3}`, `{2,}` or `{1,9}`.
+
+Note that `{1}` will still extract a `Text` value. The type is determined
+statically from the pattern, and not dynamically from the runtime scrutinee.
 
-A capture group may be marked as optional, meaning it can appear either zero or one times. This
-will extract a value with the type `Option[Text]`.
+A capture group may be marked as optional, meaning it can appear either zero or
+one times. This will extract a value with the type `Optional[Text]`; that is,
+if it is present, it will be a `Text` value, and if not, it will be `Unset`.
 
 For example, see how `init` is extracted as a `List[Text]`, below:
 ```scala
@@ -80,9 +105,15 @@ def parseList(): List[Text] = "parsley, sage, rosemary, and thyme" match
 
 Note that inside an extractor pattern string, whether it is single- (`r"..."`)
 or triple-quoted (`r"""..."""`), special characters, notably `\`, do not need
-to be escaped, with the exception of `$` which should be written as `$$`. It is
-still necessary, however, to follow the regular expression escaping rules, for
-example, an extractor matching a single opening parenthesis would be written as
-`r"\("` or `r"""\("""`.
+to be escaped, with the exception of `$` which should be written as `$$`.
+
+It is still necessary, however, to follow the regular expression escaping
+rules; for example, an extractor matching a single opening parenthesis would be
+written as `r"\("` or `r"""\("""`.
+
+## Globs
 
+Globs offer a simplified and limited form of regular expression. You can use
+these in exactly the same way as a standard regular expression, using the
+`g"..."` interpolator instead.
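For comparison with the email example in this file, here is a minimal sketch of the same extraction using only the standard library's `scala.util.matching.Regex`. It is not Kaleidoscope's API: groups are bound by position rather than by `$identifier`, the values are `String` rather than `Text`, and the pattern is not checked at compile time. The names `EmailPattern` and `splitDomain`, and the address used below, are hypothetical.

```scala
import scala.util.matching.Regex

// Standard-library analogue of the documented pattern (an illustration only).
// The user part is not captured; the two groups are the domain and the TLD.
val EmailPattern: Regex = """[a-z0-9._%+-]+@([a-z0-9.-]+\.([a-z]{2,6}))""".r

def splitDomain(address: String): (String, String) =
  // `@unchecked` acknowledges that the match is partial: an address that
  // does not match the whole pattern throws a MatchError at runtime.
  val EmailPattern(domain, tld) = address: @unchecked
  (domain, tld)
```

Calling `splitDomain("someone@example.com")` should return `("example.com", "com")`, mirroring the `domain` and `tld` bindings described in the documentation.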
 
diff --git a/doc/features.md b/doc/features.md
@@ -1,6 +1,6 @@
 - pattern match strings against regular expressions
-- regular expressions can be written inline in patterns
-- extraction of capturing groups in patterns
-- typed extraction (into `List`s or `Option`s) of variable-length capturing groups
-- static verification of regular expression syntax
+- regular expressions can be written inline in patterns, anywhere a string could match
+- direct extraction of capturing groups in patterns
+- typed extraction (into `List`s or [Vacuous](https://github.com/propensive/vacuous/) `Optional`s) of variable-length capturing groups
+- static checking of regular expression syntax
 - simpler "glob" syntax is also provided
diff --git a/doc/intro.md b/doc/intro.md
@@ -1,9 +1,14 @@
-Kaleidoscope is a small library to make pattern matching against strings more pleasant. Regular
-expressions can be written directly in patterns, and capturing groups bound directly to variables,
-typed according to the group's repetition. Here is an example:
-```amok scala
+Kaleidoscope is a small library to make pattern matching against strings more
+pleasant. Regular expressions can be written directly in patterns, and
+capturing groups bound directly to variables, typed according to the group's
+repetition. Here is an example:
+```scala
 case class Email(user: Text, domain: Text)
 
 email match
   case r"$user([^@]+)@$domain(.*)" => Email(user, domain)
-```
+```
+
+Strings are widely used to carry complex data, when it's wiser to use
+structured objects. Kaleidoscope makes it easier to move away from strings.
+
diff --git a/fury b/fury
@@ -17,4 +17,4 @@ project kaleidoscope
     include   kaleidoscope/core probably/cli larceny/plugin
     sources   src/test
     main      kaleidoscope.Tests
-    coverage  kaleidoscope/core
+    # coverage  kaleidoscope/core
diff --git a/src/core/kaleidoscope.Regex.scala b/src/core/kaleidoscope.Regex.scala
@@ -30,6 +30,7 @@ import RegexError.Reason.*
 
 object Regex:
   private val cache: ConcurrentHashMap[String, Pattern] = ConcurrentHashMap()
+
   enum Greed:
     case Greedy, Reluctant, Possessive
 
@@ -172,17 +173,17 @@ object Regex:
     val mainGroup = group(0, Nil, true)
 
     def check(groups: List[Group], canCapture: Boolean): Unit =
-      groups.foreach: group =>
+      groups.each: group =>
         if !canCapture && group.capture then abort(RegexError(Uncapturable))
         check(group.groups, canCapture && group.quantifier.unitary)
 
     check(mainGroup.groups, true)
 
     Regex(text, mainGroup.groups)
 
-  def makePattern
-      (pattern: Text, todo: List[Regex.Group], last: Int, text: Text, end: Int, index: Int)
+  def makePattern(pattern: Text, todo: List[Group], last: Int, text: Text, end: Int, index: Int)
           : (Int, Text) =
+
     todo match
       case Nil =>
         (index, (text.s+pattern.s.substring(last, end).nn).tt)
@@ -229,8 +230,7 @@ case class Regex(pattern: Text, groups: List[Regex.Group]):
                 val submatcher = compiled.matcher(matchedText).nn
                 var submatches: List[Text] = Nil
 
-                while submatcher.find()
-                do submatches ::= submatcher.toMatchResult.nn.group(0).nn.tt
+                while submatcher.find() do submatches ::= submatcher.toMatchResult.nn.group(0).nn.tt
 
                 if group.quantifier == Regex.Quantifier.Between(0, 1)
                 then submatches.prim :: matches
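The `while submatcher.find()` loop in the last hunk can be tried in isolation. The following is a minimal sketch using plain `java.util.regex`, with `String` in place of `Text`; the function name `submatchesOf` is hypothetical, and explicit nulls are assumed to be off, so the `.nn` calls are omitted.

```scala
import java.util.regex.Pattern

// Collects every match of `regex` in `input`, in the style of the loop above.
def submatchesOf(regex: String, input: String): List[String] =
  val matcher = Pattern.compile(regex).matcher(input)
  var found: List[String] = Nil

  // `::=` prepends, so the most recent match ends up at the head of the list.
  while matcher.find() do found ::= matcher.group(0)

  found
```

Because each match is prepended, `submatchesOf("[a-z]+", "sage rosemary thyme")` yields the matches in reverse order of occurrence. Whether the surrounding Kaleidoscope code reorders the list afterwards is not visible in this hunk.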