WIP draft of function-composition concept
Colin Leach committed Oct 26, 2024
1 parent c8ca094 commit 93e655c
Showing 4 changed files with 163 additions and 0 deletions.
7 changes: 7 additions & 0 deletions concepts.wip/function-composition/.meta/config.json
@@ -0,0 +1,7 @@
{
  "authors": [
    "colinleach"
  ],
  "contributors": [],
  "blurb": "Julia supports composing multiple functions into one, and piping data through a chain of functions."
}
149 changes: 149 additions & 0 deletions concepts.wip/function-composition/about.md
@@ -0,0 +1,149 @@
# About

Julia encourages programmers to put as much code as possible inside functions that can be JIT-compiled, and creating many small functions is, by design, performant.

That tends to leave many small, simple functions, which need to be combined to carry out non-trivial tasks.

One obvious approach is to nest function calls.
The following example is very contrived, but illustrates the point.

```julia-repl
julia> first.(titlecase.(reverse.(["my", "test", "strings"])))
3-element Vector{Char}:
'Y': ASCII/Unicode U+0059 (category Lu: Letter, uppercase)
'T': ASCII/Unicode U+0054 (category Lu: Letter, uppercase)
'S': ASCII/Unicode U+0053 (category Lu: Letter, uppercase)
```

The disadvantage of this approach is that readability drops rapidly as nesting gets deeper.

We need a simpler and more flexible approach.

## Composition

This is the technique beloved of mathematicians, and Julia copies the mathematical syntax.

An arbitrary number of functions can be [`composed`][comp] together with `∘` operators (entered as `\circ` then tab).
The result can be used as a single function.

```julia-repl
julia> compfunc = first ∘ titlecase ∘ reverse
first ∘ titlecase ∘ reverse
julia> compfunc.(["my", "test", "strings"])
3-element Vector{Char}:
'Y': ASCII/Unicode U+0059 (category Lu: Letter, uppercase)
'T': ASCII/Unicode U+0054 (category Lu: Letter, uppercase)
'S': ASCII/Unicode U+0053 (category Lu: Letter, uppercase)
# alternative syntax, giving the same result
julia> (first ∘ titlecase ∘ reverse).(["my", "test", "strings"])
```

A couple of points to note:

- The component functions appear in the same order as when nesting, and are applied right to left: `(f ∘ g)(x)` is `f(g(x))`.
- Broadcasting is awkward to apply while composing, but is straightforward when calling the composed function.
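
To make the ordering concrete, here is a minimal sketch using two hypothetical helper functions, `double` and `increment`, which are not part of the original example and are defined here purely for illustration.

```julia
# Hypothetical helpers, defined only for this illustration
double(x) = 2x
increment(x) = x + 1

# (f ∘ g)(x) is f(g(x)): the rightmost function is applied first
double_after_increment = double ∘ increment
increment_after_double = increment ∘ double

r1 = double_after_increment(3)   # double(increment(3)) == 8
r2 = increment_after_double(3)   # increment(double(3)) == 7

# Broadcasting is applied when calling, using the usual dot syntax
r3 = double_after_increment.([1, 2, 3])   # [4, 6, 8]
```

Swapping the operands of `∘` changes the result, so the order matters whenever the functions do not commute.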

## Pipelining

An alternative might be thought of as the _programmers'_ approach, rather than the _mathematicians'_.

[`Pipelines`][comp] have long been used in Unix shell scripts, and more recently became popular in mainstream programming languages (F# is sometimes credited with pioneering their adoption).

The basic concept is to start with some data, then pipe it through a sequence of functions to get the result.

The pipe operator is `|>` (as in F# and recent versions of R), though Julia also has a broadcast version `.|>`.

```julia-repl
julia> ["my", "test", "strings"] .|> reverse .|> titlecase .|> first
3-element Vector{Char}:
'Y': ASCII/Unicode U+0059 (category Lu: Letter, uppercase)
'T': ASCII/Unicode U+0054 (category Lu: Letter, uppercase)
'S': ASCII/Unicode U+0053 (category Lu: Letter, uppercase)
```

Execution is now strictly left-to-right, with the output of each function flowing in the direction of the arrow to become the input for the next function.
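
The two pipe operators can be mixed. The following is a small sketch, with variable names chosen only for illustration, that uses `.|>` to apply a function to each element and `|>` to pass the whole collection on.

```julia
words = ["my", "test", "strings"]

# .|> applies length to each element in turn
lengths = words .|> length        # [2, 4, 7]

# |> passes the whole vector to sum
total = lengths |> sum            # 13

# the nested equivalent, for comparison
same_total = sum(length.(words))
```

Splitting the pipeline into named steps like this is optional, but it can make it easier to see which stages work element-wise and which work on the whole collection.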

## Limitations, workarounds, and other options

It is no coincidence that the functions used to illustrate composition and pipelining all take a _single_ argument.

Some functional languages pipe the _first_ argument into a function but allow others to be included.

In contrast, Julia only expects function _names_ (or something equivalent) in a pipeline, without any additional arguments.

There are technical reasons for this, related to the fact that [`currying`][currying] is not a standard part of the language design.
Readers unfamiliar with currying can simply accept that this limitation is not a careless oversight, and is unlikely to change in future Julia versions.

### Workarounds

We need single-argument functions that do whatever is needed.
Fortunately, defining new functions in Julia is easy.

Most simply, we could use an [`anonymous function`][anonymous-function].
For example, if we have a single input string and we want to split on underscores:

```julia-repl
julia> "my_test_strings" |> (s -> split(s, '_'))
3-element Vector{SubString{String}}:
"my"
"test"
"strings"
```

That vector could then be piped to other functions, as before.

Enclosing the anonymous function in parentheses is optional in this case, but more generally is a useful way to reduce ambiguity.

Equally, we could create a named function, earlier in the program, and reuse it as needed.
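
As a sketch of the named-function approach, the example below defines a hypothetical `split_on_underscore` helper (the name is an assumption, chosen for illustration) and reuses it in a longer pipeline.

```julia
# Hypothetical named helper wrapping the two-argument call to split
split_on_underscore(s) = split(s, '_')

# split the whole string, then take the first character of each part
initials = "my_test_strings" |> split_on_underscore .|> first   # ['m', 't', 's']

# the resulting vector can be piped further, for example into join
joined = initials |> join   # "mts"
```

The helper behaves exactly like the anonymous function, but has a name that documents its purpose and can be reused elsewhere in the program.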

[`Closures`][closures] are beyond the scope of this Concept, but anyone familiar with them from other languages will recognise that they offer a more flexible way to create single-argument functions.

```julia-repl
julia> function makesplit(sep)
           fs(str) = split(str, sep)
           fs
       end
makesplit (generic function with 1 method)
julia> f_us = makesplit('_')
(::var"#fs#32"{Char}) (generic function with 1 method)
julia> "my_test_strings" |> f_us
3-element Vector{SubString{String}}:
"my"
"test"
"strings"
# alternatively:
julia> "my_test_strings" |> makesplit('_')
3-element Vector{SubString{String}}:
"my"
"test"
"strings"
```

Once `makesplit()` is defined, it can be used to work with any separator.
Note that `makesplit('_')` is a _function call_ that evaluates to another function, which in turn receives input from the pipe.

If this seems confusing, that is normal at first (but it becomes clearer with practice).
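
To illustrate reuse with different separators, here is a short sketch that repeats the `makesplit` definition from the REPL session above. The input strings and the variable names are assumptions, chosen only for illustration.

```julia
# Same definition as in the REPL session above
function makesplit(sep)
    fs(str) = split(str, sep)
    return fs
end

# Each call returns a new single-argument function with its own separator
split_commas = makesplit(',')
split_dashes = makesplit('-')

fields = "a,b,c" |> split_commas          # ["a", "b", "c"]
pieces = "2024-10-26" |> split_dashes     # ["2024", "10", "26"]

# The returned closure is an ordinary function, so it can also be composed
count_fields = length ∘ makesplit(',')
n = count_fields("a,b,c")                 # 3
```

Each closure remembers the separator it was created with, which is what allows it to be used wherever a single-argument function is expected.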

### Other options

There has been a long discussion about making pipes more versatile in base Julia, but the various suggestions are mutually incompatible and no agreement has been reached.

Meanwhile, users have taken the usual approach of creating various installable packages that address specific needs.
None will work within Exercism, but take a look at these if you are interested:

- [`Chain.jl`][chain]
- [`Underscores.jl`][underscores]
- [`DataPipes.jl`][datapipes]

[comp]: https://docs.julialang.org/en/v1/manual/functions/#Function-composition-and-piping
[chain]: https://github.com/jkrumbiegel/Chain.jl
[underscores]: https://c42f.github.io/Underscores.jl/stable/
[datapipes]: https://github.com/JuliaAPlavin/DataPipes.jl
[closures]: https://en.wikipedia.org/wiki/Closure_(computer_programming)
[currying]: https://en.wikipedia.org/wiki/Currying
[anonymous-function]: https://docs.julialang.org/en/v1/manual/functions/#man-anonymous-functions
1 change: 1 addition & 0 deletions concepts.wip/function-composition/introduction.md
@@ -0,0 +1 @@
# Introduction
6 changes: 6 additions & 0 deletions concepts.wip/function-composition/links.json
@@ -0,0 +1,6 @@
[
  {
    "url": "https://docs.julialang.org/en/v1/manual/functions/#Function-composition-and-piping",
    "description": "Manual section on function composition and piping."
  }
]
