Improve examples and describe ~c and ~C sigils
RaimoNiskanen committed Oct 6, 2023
1 parent 45e4b16 commit a38fdcc
Showing 1 changed file with 51 additions and 18 deletions.
69 changes: 51 additions & 18 deletions eeps/eep-0066.md
@@ -188,28 +188,46 @@ shall be interpreted. The suggested Sigil Types are:
Creates an Erlang `unicode:unicode_binary()`, handling
escape characters in the string content. How other features
like string interpolation would work is still an open question.

Escape characters and other features are the same regardless
of which [String Delimiters][] that are used.

* «`S`»: [string in Elixir][4], verbatim.

Creates an Erlang `unicode:unicode_binary()`, with verbatim
string content in that only the [end delimiter][] character
can be escaped with a «`\`» character. How other features
like string interpolation would work is still an open question.
can be escaped with a «`\`» character.

Which [String Delimiters][] that are used does not matter,
except that between triple-quote delimiters according to
[EEP 64][] there is no end delimiter character to escape.

* «`c`»: [charlist in Elixir][4].

Creates an Erlang `string()`, handling escape characters
in the string content. How other features like
string interpolation would work is still an open question.

Escape characters and other features are the same regardless
of which [String Delimiters][] that are used, except that
between triple-quote delimiters according to [EEP 64][]
there is no end delimiter character to escape.
of which [String Delimiters][] that are used.

* «`C`»: [charlist in Elixir][4], verbatim.

Creates an Erlang `string()`, with verbatim string content
in that only the [end delimiter][] character can be escaped
with a «`\`» character.

Which [String Delimiters][] that are used does not matter,
except that between triple-quote delimiters according to
[EEP 64][] there is no end delimiter character to escape.

* «`r`»: regular expression.

Creates a term `{RE::unicode:charlist(),Flags::[unicode:latin1_char()]}`
Creates a term `{re,RE::unicode:charlist(),Flags::[unicode:latin1_char()]}`
that is an uncompiled regular expression with compile flags,
suitable for functions in the `re` module. The `RE` element is
the [String Content][], and the `Flags` element is the [Sigil Suffix][].
suitable for (yet to be implemented) functions in the `re` module.
The `RE` element is the [String Content][], and the `Flags` element
is the [Sigil Suffix][].

See the [Regular Expressions][] section about the reasoning
behind this proposed term type.
@@ -221,11 +239,15 @@ shall be interpreted. The suggested Sigil Types are:
there is no end delimiter character to escape.

The main advantage of a regular expression [Sigil][] is to avoid
the additional escaping of «'\'» that regular erlang strings add.
the additional escaping of «`\`» that regular erlang strings add.

Today: `re:run(Subject, "^\\s*\"[a-z]+\\\\\\d+\"", [caseless,unicode])`

Today: `re:run(Subject, "^[ \\t]*\\[a-z]*\\\\s+", [caseless,unicode])`
Sigil: `re:run(Subject, ~r'^\s*"[a-z]+\\\d+"'iu)`

Sigil: `re:run(Subject, ~r"^[ \t]*\[a-z]*\\s+"iu)`
Other advantages are possible tools and library integration features
such as making the `re` module recognize this tuple format,
and having the code loader pre-compile them.
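
The escaping difference can be checked in today's syntax. The following is a minimal sketch, not part of the EEP: the module name `re_escape_sketch` and its two functions are hypothetical, and it uses only the existing `re:run/3` with an ordinary string literal. It shows that the doubly escaped literal from the "Today" line holds exactly the 17 characters that the sigil form would let the author write directly.

```erlang
-module(re_escape_sketch).
-export([pattern/0, demo/0]).

%% The "Today" literal: every regex backslash is doubled and every
%% double quote is escaped, because the string literal consumes one
%% level of escaping before the `re` module sees the pattern.
%% The resulting list holds the 17 characters  ^\s*"[a-z]+\\\d+"
pattern() ->
    "^\\s*\"[a-z]+\\\\\\d+\"".

%% Runs the pattern against a subject made of two spaces followed by
%% "abc\7" (including the double quotes and one literal backslash).
demo() ->
    Subject = "  \"abc\\7\"",
    re:run(Subject, pattern(), [caseless, unicode, {capture, first, list}]).
```

Calling `length(re_escape_sketch:pattern())` gives the character count of the pattern as the `re` module receives it, which is the same text that appears between the delimiters in the sigil form.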

### String Delimiters

@@ -276,19 +298,24 @@ the [String Content][] when it sees the Sigil Suffix.

### Regular Expressions

A regular expression sigil «`~r"expression"flags`» should
be translated to something useful for tools/libraries.
There are at least two ways: [uncompiled regular expressions][],
or [compiled regular expressions][].

#### Uncompiled Regular Expression

The value of a regular expression [Sigil][] is a 2-tuple
with the uncompiled regular expression and its compile flags
(in the guise of a sequence of character flags).
The value of a regular expression [Sigil][] is chosen
to be a tuple `{re,RE,Flags}`.

With this representation, the `re` module can be augmented
with functions that accept this tuple format. These functions
with functions that accept this tuple format that bundles
a regular expression with compile flags. These functions
are `re:compile/1,2`, `re:replace/3,4`, `re:run/2,3`,
and `re:split/2,3`. Translation of the compile flag characters
and `re:split/2,3`. Translation of the `Flags`' characters
into `re:compile_option()`s should be done by these functions.

Example:
Example of calling a yet to be implemented `re:run/3`:

1> re:run("ABC123", ~r"abc\d+"i, [{capture,first,list}]).
{match,["ABC123"]}
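
How such a function might translate the `Flags` characters can be sketched with the existing `re:run/3`. This is only an illustration under stated assumptions: the module `re_sigil_sketch` and its functions are hypothetical, the tuple is written out by hand because the `~r` sigil is not assumed to be available, and only the `i` and `u` mappings follow from the `iu` versus `[caseless,unicode]` example in this text. The `m`, `s` and `x` mappings are assumptions borrowed from common regular expression flag letters.

```erlang
-module(re_sigil_sketch).
-export([run/3, flags_to_options/1]).

%% Translate the sigil suffix characters into re:compile_option() atoms.
flags_to_options(Flags) ->
    [flag_to_option(F) || F <- Flags].

flag_to_option($i) -> caseless;    % from the example: i <-> caseless
flag_to_option($u) -> unicode;     % from the example: u <-> unicode
flag_to_option($m) -> multiline;   % assumption, not stated in the EEP text
flag_to_option($s) -> dotall;      % assumption, not stated in the EEP text
flag_to_option($x) -> extended;    % assumption, not stated in the EEP text
flag_to_option(F)  -> erlang:error({bad_re_flag, F}).

%% Hypothetical wrapper: unpack the {re,RE,Flags} tuple and pass the
%% translated compile options along with the caller's run options.
run(Subject, {re, RE, Flags}, Options) ->
    re:run(Subject, RE, flags_to_options(Flags) ++ Options).
```

With this sketch, `re_sigil_sketch:run("ABC123", {re, "abc\\d+", "i"}, [{capture,first,list}])` returns `{match,["ABC123"]}`, the same result as the shell example above.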
@@ -422,6 +449,12 @@ more tokenizer rewriting.
[Regular Expressions]: #regular-expressions
"Regular Expressions"

[uncompiled regular expressions]: #uncompiled-regular-expression
"Uncompiled Regular Expressions"

[compiled regular expressions]: #compiled-regular-expressions
"Compiled Regular Expressions"

Copyright
=========
