From a38fdcc3342bd00c07dd5a0a75094548666b60d8 Mon Sep 17 00:00:00 2001
From: Raimo Niskanen <raimo@erlang.org>
Date: Fri, 6 Oct 2023 16:00:40 +0200
Subject: [PATCH] Improve examples and describe `~c` and `~C` sigils

---
 eeps/eep-0066.md | 69 +++++++++++++++++++++++++++++++++++-------------
 1 file changed, 51 insertions(+), 18 deletions(-)

diff --git a/eeps/eep-0066.md b/eeps/eep-0066.md
index cf160d9..009a62d 100644
--- a/eeps/eep-0066.md
+++ b/eeps/eep-0066.md
@@ -188,7 +188,7 @@ shall be interpreted.  The suggested Sigil Types are:
   Creates an Erlang `unicode:unicode_binary()`, handling
   escape characters in the string content.  How other features
   like string interpolation would work is still an open question.
-  
+
   Escape characters and other features are the same regardless
   of which [String Delimiters][] that are used.
 
@@ -196,20 +196,38 @@ shall be interpreted.  The suggested Sigil Types are:
 
   Creates an Erlang `unicode:unicode_binary()`, with verbatim
   string content in that only the [end delimiter][] character
-  can be escaped with a «`\`» character.  How other features
-  like string interpolation would work is still an open question.
+  can be escaped with a «`\`» character.
+
+  It does not matter which [String Delimiters][] are used,
+  except that between triple-quote delimiters according to
+  [EEP 64][] there is no end delimiter character to escape.
+
+* «`c`»: [charlist in Elixir][4].
+
+  Creates an Erlang `string()`, handling escape characters
+  in the string content.  How other features like
+  string interpolation would work is still an open question.
 
   Escape characters and other features are the same regardless
-  of which [String Delimiters][] that are used, except that
-  between triple-quote delimiters according to [EEP 64][]
-  there is no end delimiter character to escape.
+  of which [String Delimiters][] are used.
+
+* «`C`»: [charlist in Elixir][4], verbatim.
+
+  Creates an Erlang `string()`, with verbatim string content
+  in that only the [end delimiter][] character can be escaped
+  with a «`\`» character.
+
+  It does not matter which [String Delimiters][] are used,
+  except that between triple-quote delimiters according to
+  [EEP 64][] there is no end delimiter character to escape.
 
 * «`r`»: regular expression.
 
-  Creates a term `{RE::unicode:charlist(),Flags::[unicode:latin1_char()]}`
+  Creates a term `{re,RE::unicode:charlist(),Flags::[unicode:latin1_char()]}`
   that is an uncompiled regular expression with compile flags,
-  suitable for functions in the `re` module.  The `RE` element is
-  the [String Content][], and the `Flags` element is the [Sigil Suffix][].
+  suitable for (yet to be implemented) functions in the `re` module.
+  The `RE` element is the [String Content][], and the `Flags` element
+  is the [Sigil Suffix][].
 
   See the [Regular Expressions][] section about the reasoning
   behind this proposed term type.
@@ -221,11 +239,15 @@ shall be interpreted.  The suggested Sigil Types are:
   there is no end delimiter character to escape.
 
   The main advantage of a regular expression [Sigil][] is to avoid
-  the additional escaping of «'\'» that regular erlang strings add.
+  the additional escaping of «`\`» that regular Erlang strings add.
+
+  Today: `re:run(Subject, "^\\s*\"[a-z]+\\\\\\d+\"", [caseless,unicode])`
 
-  Today: `re:run(Subject, "^[ \\t]*\\[a-z]*\\\\s+", [caseless,unicode])`
+  Sigil: `re:run(Subject, ~r'^\s*"[a-z]+\\\d+"'iu)`
 
-  Sigil: `re:run(Subject, ~r"^[ \t]*\[a-z]*\\s+"iu)`
+  Other advantages are possible tool and library integration features,
+  such as making the `re` module recognize this tuple format
+  and having the code loader pre-compile such regular expressions.
 
 ### String Delimiters
 
@@ -276,19 +298,24 @@ the [String Content][] when it sees the Sigil Suffix.
 
 ### Regular Expressions
 
+A regular expression sigil «`~r"expression"flags`» should
+be translated to something useful for tools/libraries.
+There are at least two ways: [uncompiled regular expressions][]
+or [compiled regular expressions][].
+
 #### Uncompiled Regular Expression
 
-The value of a regular expression [Sigil][] is a 2-tuple
-with the uncompiled regular expression and its compile flags
-(in the guise of a sequence of character flags).
+The value of a regular expression [Sigil][] is chosen
+to be a tuple `{re,RE,Flags}`.
 
 With this representation, the `re` module can be augmented
-with functions that accept this tuple format.  These functions
+with functions that accept this tuple format, which bundles
+a regular expression with compile flags.  These functions
 are `re:compile/1,2`, `re:replace/3,4` `re:run/2,3`,
-and `re:split/2,3`.  Translation of the compile flag characters
+and `re:split/2,3`.  Translation of the `Flags` characters
 into `re:compile_option()`s should be done by these functions.
 
-Example:
+Example of calling a yet-to-be-implemented `re:run/3`:
 
     1> re:run("ABC123", ~r"abc\d+"i, [{capture,first,list}]).
     {match,["ABC123"]}
@@ -422,6 +449,12 @@ more tokenizer rewriting.
 [Regular Expressions]:  #regular-expressions
                         "Regular Expressions"
 
+[uncompiled regular expressions]:       #uncompiled-regular-expression
+                                        "Uncompiled Regular Expression"
+
+[compiled regular expressions]:         #compiled-regular-expressions
+                                        "Compiled Regular Expressions"
+
 Copyright
 =========