RFC: allow splatting in hvcat syntax #39249

simeonschaub · 2021-01-14T09:36:45Z

This changes the lowering of hvcat syntax to properly support splatting, which also simplifies lowering slightly. I was a bit worried about introducing another function call here in terms of type inference/constant prop, but in my initial microbenchmarks I don't see any regression:
master:

julia> @time [1 2 3 4; 5 6 7 8; 9 10 11 12; 13 14 15 16]
  0.075053 seconds (36.30 k allocations: 2.334 MiB, 99.99% compilation time)
4×4 Matrix{Int64}:
  1   2   3   4
  5   6   7   8
  9  10  11  12
 13  14  15  16

julia> @time [1 2 3 4; 5 6 7 8; 9 10 11 12; 13 14 15 16]
  0.000008 seconds (2 allocations: 352 bytes)
4×4 Matrix{Int64}:
  1   2   3   4
  5   6   7   8
  9  10  11  12
 13  14  15  16

julia> @time [[1, 2] [3 4; 5 6]; [7 8] 9]
  0.966912 seconds (964.36 k allocations: 57.004 MiB, 4.15% gc time, 99.93% compilation time)
3×3 Matrix{Int64}:
 1  3  4
 2  5  6
 7  8  9

julia> @time [[1, 2] [3 4; 5 6]; [7 8] 9]
  0.000108 seconds (49 allocations: 2.453 KiB)
3×3 Matrix{Int64}:
 1  3  4
 2  5  6
 7  8  9

This PR:

julia> @time [1 2 3 4; 5 6 7 8; 9 10 11 12; 13 14 15 16]
  0.064680 seconds (36.30 k allocations: 2.334 MiB, 99.99% compilation time)
4×4 Matrix{Int64}:
  1   2   3   4
  5   6   7   8
  9  10  11  12
 13  14  15  16

julia> @time [1 2 3 4; 5 6 7 8; 9 10 11 12; 13 14 15 16]
  0.000010 seconds (2 allocations: 352 bytes)
4×4 Matrix{Int64}:
  1   2   3   4
  5   6   7   8
  9  10  11  12
 13  14  15  16

julia> @time [[1, 2] [3 4; 5 6]; [7 8] 9]
  0.847476 seconds (963.75 k allocations: 56.888 MiB, 3.57% gc time, 99.94% compilation time)
3×3 Matrix{Int64}:
 1  3  4
 2  5  6
 7  8  9

julia> @time [[1, 2] [3 4; 5 6]; [7 8] 9]
  0.000080 seconds (49 allocations: 2.453 KiB)
3×3 Matrix{Int64}:
 1  3  4
 2  5  6
 7  8  9

fixes #38844
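
For readers less familiar with the syntax under discussion, here is a minimal sketch of what it means at the call level. The variable names are illustrative only, and the splatted line assumes a Julia build that includes this PR. A bracketed literal with spaces and semicolons is equivalent to a call to hvcat that takes a tuple of per-row element counts followed by the elements in row-major order:

```julia
# The literal and the explicit hvcat call build the same matrix.
literal  = [1 2 3; 4 5 6]
explicit = hvcat((3, 3), 1, 2, 3, 4, 5, 6)   # (3, 3): three elements in each of two rows

# With splatting, the row lengths are only known at run time,
# so lowering can no longer emit a literal tuple of counts.
a = [1, 2, 3]
b = [4, 5, 6]
splatted = [a... b...; b... a...]   # assumes a Julia build that includes this PR
```

The rest of the thread is about how to compute that tuple of row lengths when some elements are splatted.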

JeffBezanson · 2021-01-14T21:34:02Z

I wonder if it would be better just to wrap t... as permutedims([t...]). I think that would work? Might help avoid pathological tuple cases.

simeonschaub · 2021-01-15T09:14:41Z

Do you mean during lowering? That would always be allocating though and wouldn't work for tuples of arrays, for example. Could you clarify what you mean by the pathological tuple cases here?

JeffBezanson · 2021-01-15T17:33:31Z

Oh right, it would actually just be hcat(t...). The worry is that tuple_cat is O(n^2), and that this approach will generate lots of new tuple types.
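
A small illustration of the tuple-type half of this concern, offered as a sketch only: tuple_cat is a helper from this PR and is not reproduced here. Every distinct tuple length and element-type combination is its own concrete type, so code that builds rows as tuples gets specialized separately for each row shape it encounters:

```julia
# Tuples of different lengths are different concrete types.
t2 = (1, 2)
t3 = (1, 2, 3)
mixed = (1, 2.0, "three")

same_type = typeof(t2) == typeof(t3)        # false: NTuple{2,Int} vs NTuple{3,Int}
t2_type   = typeof(t2)                      # NTuple{2, Int}
mixed_ok  = typeof(mixed) == Tuple{Int, Float64, String}
```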

simeonschaub · 2021-01-15T18:19:38Z

Would something like a fallback method for Any16 for tuple_cat that avoids this overspecialization by building up an intermediate Vector{Any} instead address your worries? How worried are we actually about scaling here? Are people really using this for constructing huge arrays?

simeonschaub · 2021-01-25T01:03:31Z

I have no clue what is going on here:

> (expand-forms '(vcat (row a (|...| b)) (row c d)))
(call (top hvcat_rows) (call (core _apply_iterate) (top iterate) (core tuple)
			    (call (core tuple) a) b)
      (call (core tuple) c d))

> (expand-forms '(vcat (row a (|...| b)) (row c d)))
(call (top hvcat_rows) (call (core _apply_iterate) (top iterate) (core tuple)
			    (call (core tuple) a) b)
      (call (core tuple) c d) (call (core _apply_iterate) (top iterate) (core
  tuple)
				   (call (core tuple) a) b)
      (call (core tuple) c d))

> (expand-forms '(vcat (row a (|...| b)) (row c d)))
(call (top hvcat_rows) (call (core _apply_iterate) (top iterate) (core tuple)
			    (call (core tuple) a) b)
      (call (core tuple) c d) (call (core _apply_iterate) (top iterate) (core
  tuple)
				   (call (core tuple) a) b)
      (call (core tuple) c d) (call (core _apply_iterate) (top iterate) (core
  tuple)
				   (call (core tuple) a) b)
      (call (core tuple) c d))

It's probably just a dumb mistake I made, but I am running out of ideas. Weirdly enough, it works just fine with typed_vcat. @JeffBezanson Any idea why this could happen?

BioTurboNick · 2021-01-25T01:54:59Z

Just saw you're working on this! Cool, looking at it myself.

Based on the checks, it looks like your PR is not compatible with this line from base\uuid.jl, line 84:
[36:-1:25; 23:-1:20; 18:-1:15; 13:-1:10; 8:-1:1], so I can't build your PR right now.

simeonschaub · 2021-01-25T09:25:43Z

Yes, that's the problem I described above. I am increasingly suspecting this might actually be a femtolisp bug. Perhaps something similar to JeffBezanson/femtolisp#43?

BioTurboNick · 2021-01-25T13:47:37Z

Ah! I wasn't clear on what you were showing. Is it that repeated execution of the same line is causing different output?

simeonschaub · 2021-01-25T13:59:09Z

Exactly. This is taken from Julia's flisp REPL.

BioTurboNick · 2021-01-25T14:05:50Z

Aside: What files need to be loaded to prepare the flisp REPL, and how do I do that? (That is, flisp equivalent of include/using)

BioTurboNick · 2021-01-25T23:27:18Z

I think I found a parser solution that doesn't require new functions.

a = [1,2,3]
b = [4,5,6]
[a... b...; b... a...]

It's effectively doing this: hvcat((sum((length(a),length(b))), sum((length(a), length(b)))), a..., b..., b..., a...)

@btime gives 702.098 ns (7 allocations: 448 bytes), and it doesn't have any impact on the standard hvcats (4 allocations: 288 bytes).

Wondering if there's a way to get rid of the extra allocations. How does it compare to yours?

   'vcat
   (lambda (e)
     (let ((a (cdr e)))
       (if (any assignment? a)
           (error (string "misplaced assignment statement in \"" (deparse e) "\"")))
       (if (has-parameters? a)
           (error "unexpected semicolon in array expression"))
       (expand-forms
         (if (any (lambda (x)
                   (and (pair? x) (eq? (car x) 'row)))
                   a)
           ;; convert nested hcat inside vcat to hvcat
           (let ((rows (map (lambda (x)
                              (if (and (pair? x) (eq? (car x) 'row))
                                (cdr x)
                                (list x)))
                              a)))
             (let ((lengths (map (lambda (r)
                                   `(call (top sum) (tuple
                                     ,.(map (lambda (x)
                                            (if (vararg? x)
                                              `(call (top length) ,.(cdr x))
                                              1))
                                          r))))
                                 rows)))
               `(call (top hvcat)
                      (tuple ,.lengths)
                      ,.(apply append rows))))
             `(call (top vcat) ,@a)))))

simeonschaub · 2021-01-25T23:30:49Z

The problem with that is that not all iterators define length. Since splatting should work with any iterator, I don't think we can do this.

BioTurboNick · 2021-01-25T23:36:27Z

Though you'd need iterators to have matching numbers of elements for an hvcat operation to work at all... Is there an example of where this would be an issue?

simeonschaub · 2021-01-25T23:47:14Z

I will admit, this is a little contrived for hvcat, but I am sure there are other examples as well:

julia> write("foo", "foo\nbar")
7

julia> itr = eachline("foo")
Base.EachLine{IOStream}(IOStream(<file foo>), Base.var"#323#324"{IOStream}(IOStream(<file foo>)), false)

julia> string(itr...)
"foobar"

julia> length(itr)
ERROR: MethodError: no method matching length(::Base.EachLine{IOStream})
Closest candidates are:
  length(::BitSet) at bitset.jl:365
  length(::LibGit2.GitStatus) at /buildworker/worker/package_linuxaarch64/build/usr/share/julia/stdlib/v1.5/LibGit2/src/status.jl:21
  length(::Base.Iterators.Flatten{Tuple{}}) at iterators.jl:1061
  ...
Stacktrace:
 [1] top-level scope at REPL[17]:1

BioTurboNick · 2021-01-26T06:49:25Z

So I don't know why it's a problem, but Julia/flisp doesn't like that (top vcat) etc. are being assigned to variables like that. If I place them explicitly inside the call, it builds fine.

(Also I don't know what this does, but I think the last line should have ,@a instead of ,.a ? At least that's what it is in the original.)

BioTurboNick · 2021-01-26T08:26:20Z

I basically adapted your solution into flisp, and it looks like it is a bit more efficient. Let me know what you think.

a = [1,2,3]
b = [4,5,6]
@btime [a... b...; b... a...]
#=
  848.611 ns (6 allocations: 480 bytes)
2×6 Matrix{Int64}:
 1  2  3  4  5  6
 4  5  6  1  2  3
=#

@btime Base.hvcat_rows(tuple(a..., b...), tuple(b..., a...))
#=
  1.043 μs (8 allocations: 704 bytes)
2×6 Matrix{Int64}:
 1  2  3  4  5  6
 4  5  6  1  2  3
=#

One reason I'm interested in a solution that doesn't require extra methods is so I can adapt it to my N-dimensional syntax PR. Checked it with your eachline example to ensure it worked on length-less iterators.

(let ((a (cdr e)))
    (if (any assignment? a)
        (error (string "misplaced assignment statement in \"" (deparse e) "\"")))
    (if (has-parameters? a)
        (error "unexpected semicolon in array expression"))
    (expand-forms
      (if (any (lambda (x)
                (and (pair? x) (eq? (car x) 'row)))
                a)
        ;; convert nested hcat inside vcat to hvcat
        (let ((rows (map (lambda (x)
                          (if (and (pair? x) (eq? (car x) 'row))
                            (cdr x)
                            (list x)))
                          a)))
          ;; in case there is splatting inside `hvcat`, collect each row as a tuple for length determination
          (let ((has-vararg (any (lambda (row) (any vararg? row)) rows)))
            (let ((lengths (cond (has-vararg (map (lambda (x) `(call (top length) ,x))
                                                  (map (lambda (x) `(tuple ,.x)) rows)))
                                 (else       (map length rows)))))
              `(call (top hvcat)
                     (tuple ,.lengths)
                     ,.(apply append rows)))))
        `(call (top vcat) ,@a)))))

simeonschaub · 2021-01-26T09:22:53Z

Hmm, I think the problem with that approach is that you are iterating the splatted iterator twice, which is a problem if the iterator is stateful. I am actually surprised this works for eachline, does it actually return the correct result?
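
A minimal sketch of the hazard described here, using Iterators.Stateful as a stand-in for any stateful iterator (the thread's own example uses eachline on a file). Splatting the same object twice does not yield the same elements twice:

```julia
# Iterators.Stateful remembers its position, much like an open file handle.
itr = Iterators.Stateful([10, 20, 30])

first_pass  = (itr...,)   # consumes every element
second_pass = (itr...,)   # nothing is left to iterate

# A lowering that splats `itr` once to measure the row and again to build it
# would therefore see three elements the first time and none the second.
```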

BioTurboNick · 2021-01-26T17:01:42Z

Seems to, but let's see what it's doing under the hood. Maybe some problem is getting obscured?

write("foo", "foo\nbar")
[eachline("foo")... 1; 3 eachline("foo")...]
#=
2×3 Matrix{Any}:
  "foo"  "bar"  1
 3       "foo"   "bar"
=#

Lowered:

CodeInfo(
1 ─ %1 = Core._apply_iterate(Base.iterate, Base.promote_eltypeof, xs)
│   %2 = Core.tuple(%1, rows)
│   %3 = Core._apply_iterate(Base.iterate, Base.typed_hvcat, %2, xs)
└──      return %3
)

I rewrote it to explicitly do the splatting only once and the result was the same.

simeonschaub · 2021-01-26T17:18:59Z

Ah, I see. What happens if you try this instead though?

write("foo", "foo\nbar")
itr = eachline("foo")
[itr... 1; 3 "foo" "bar"]

We always want to be careful in lowering that we don't accidentally call the same function twice.

BioTurboNick · 2021-01-26T17:29:02Z

Ah, yep, there's the issue. That's disappointing.

BioTurboNick · 2021-01-26T20:28:00Z

For my next attempt to help: I think you can get rid of tuple_cat and shave some allocations? It does work with the stateful iterator.

hvcat_rows(rows::Tuple...) = hvcat(map(length, rows), (rows...)...)
typed_hvcat_rows(T::Type, rows::Tuple...) = typed_hvcat(T, map(length, rows), (rows...)...)
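
To make the nested splat concrete, here is a sketch using a hypothetical stand-in named rows_to_hvcat, so it does not depend on whether Base.hvcat_rows exists in a given Julia build. Each row arrives as one tuple, map(length, rows) produces the row-length tuple, and (rows...)... flattens the tuple of tuples into positional arguments:

```julia
# Hypothetical stand-in with the same body as the method above; not the Base definition.
rows_to_hvcat(rows::Tuple...) = hvcat(map(length, rows), (rows...)...)

a = [1, 2, 3]
b = [4, 5, 6]

# Each row is collected into a tuple exactly once.
row1 = (a..., b...)
row2 = (b..., a...)
m = rows_to_hvcat(row1, row2)

# A stateful iterator is consumed a single time, when its row tuple is built.
itr = Iterators.Stateful([7, 8])
s = rows_to_hvcat((itr..., 9), (1, 2, 3))
```

Because the length is taken from the already-built tuple rather than from the iterator, this works for iterators without a length method and for stateful ones.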

simeonschaub · 2021-01-26T20:36:04Z

Wait, what!?! My mind just got blown!!! 🤯 I would never have thought of nesting splatting like this, but you are right, this does work!

BioTurboNick · 2021-01-26T21:04:20Z

It might be possible for hvcat_rows to accept all hvcats without additional overhead, but I'm not familiar with the pathological tuple cases that Jeff mentioned or how that could come into play. I've tried a couple examples and btime looks the same. Then we just have one path instead of two.

JeffBezanson · 2021-01-26T21:24:11Z

src/julia-syntax.scm

+               ;; in case there is splatting inside `hvcat`, collect each row as a
+               ;; separate tuple and pass those to `hvcat_rows` instead (ref #38844)
+               (if (any (lambda (row) (any vararg? row)) rows)
+                   `(call ,.hvcat_rows ,.(map (lambda (x) `(tuple ,.x)) rows))


JeffBezanson · 2021-01-26

Use ,@ everywhere instead of ,..

simeonschaub · 2021-01-28

Are the differences between the two explained anywhere? I always thought they were the same. I typically turn to the Racket manual for stuff like this, but they seem to only have ,@ for unquote-splicing.

JeffBezanson · 2021-02-05

They do the same thing except ,. is mutating, so it's a bit of an archaic micro-optimization.

simeonschaub · 2021-02-05

Ah, I see. So that's probably why it caused problems when dealing with multiple splicing interpolations.

simeonschaub · 2021-02-05T17:13:30Z

This should be GTG from my side, if we like this approach.

simeonschaub added the arrays and compiler:lowering labels Jan 14, 2021

simeonschaub force-pushed the sds/hvcat_splat branch from ac66ca0 to 510049b Compare January 25, 2021 00:57

JeffBezanson reviewed Jan 26, 2021

simeonschaub added 3 commits January 28, 2021 20:05

allow splatting in hvcat syntax

5e0d760

use ,@ instead of ,.

feeb198

use @BioTurboNick's cool splatting trick

c13569a

simeonschaub force-pushed the sds/hvcat_splat branch from 510049b to c13569a Compare January 28, 2021 19:28

This was referenced Jan 29, 2021

Added error when splatting inside hvcat syntax #38862

Closed

Syntax for multidimensional arrays #33697

Merged

JeffBezanson merged commit 7d34b0d into master Feb 5, 2021

JeffBezanson deleted the sds/hvcat_splat branch February 5, 2021 17:29

ElOceanografo pushed a commit to ElOceanografo/julia that referenced this pull request May 4, 2021

allow splatting in hvcat syntax (JuliaLang#39249)

5df02d8

antoine-levitt pushed a commit to antoine-levitt/julia that referenced this pull request May 9, 2021

allow splatting in hvcat syntax (JuliaLang#39249)

bd9ecfe
