How should I declare StructArray field type in type-stable manner? #262

Open
sairus7 opened this issue Feb 8, 2023 · 8 comments

Comments

@sairus7

sairus7 commented Feb 8, 2023

Many types are inferred as Any in this example:

struct Foo
    x::StructVector{NamedTuple{(:a, :b), Tuple{Int64, Float64}}}
end

f = Foo(StructVector(a=[1,2,3], b=[0.1, 0.2, 0.3]))

function first_prod(f::Foo)
    f.x.a[1] * f.x.b[1]
end

@code_typed first_prod(f)

outputs:

CodeInfo(
1 ─ %1 = Base.getfield(f, :x)::StructVector{NamedTuple{(:a, :b), Tuple{Int64, Float64}}}
│   %2 = StructArrays.getfield(%1, :components)::Union{Tuple, NamedTuple}
│   %3 = StructArrays.getfield(%2, :a)::Any
│   %4 = Base.getindex(%3, 1)::Any
│   %5 = Base.getfield(f, :x)::StructVector{NamedTuple{(:a, :b), Tuple{Int64, Float64}}}
│   %6 = StructArrays.getfield(%5, :components)::Union{Tuple, NamedTuple}
│   %7 = StructArrays.getfield(%6, :b)::Any
│   %8 = Base.getindex(%7, 1)::Any
│   %9 = (%4 * %8)::Any
└──      return %9
) => Any
@sairus7
Author

sairus7 commented Feb 9, 2023

Another question: how should I declare an empty StructVector when I know only the element type (a NamedTuple)?
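
One possible answer, as a minimal sketch: it assumes the `undef` constructor that appears later in this thread for a custom struct also accepts a NamedTuple element type. `Row` is a hypothetical alias introduced only for this example.

```julia
using StructArrays

# Hypothetical alias for the known element (row) type.
const Row = @NamedTuple{a::Int64, b::Float64}

# Assumption: the `undef` constructor allocates one empty Vector per field.
s = StructVector{Row}(undef, 0)

# Rows can be appended afterwards; each field goes to its own column.
push!(s, (a = 1, b = 0.1))
```

The resulting `typeof(s)` is the full concrete type, so it can also be used to check what a field annotation would have to look like.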

@jishnub
Member

jishnub commented Feb 28, 2023

The issue is that

julia> StructVector{NamedTuple{(:a, :b), Tuple{Int64, Float64}}} |> Base.isconcretetype
false

Can you parameterize the struct instead?

julia> struct Foo{T}
           x::T
       end

julia> f = Foo(StructVector(a=[1,2,3], b=[0.1, 0.2, 0.3]));

julia> function first_prod(f::Foo)
           f.x.a[1] * f.x.b[1]
       end;

julia> @code_typed first_prod(f)
CodeInfo(
1 ─ %1  = Base.getfield(f, :x)::StructVector{NamedTuple{(:a, :b), Tuple{Int64, Float64}}, NamedTuple{(:a, :b), Tuple{Vector{Int64}, Vector{Float64}}}, Int64}
│   %2  = StructArrays.getfield(%1, :components)::NamedTuple{(:a, :b), Tuple{Vector{Int64}, Vector{Float64}}}
│   %3  = StructArrays.getfield(%2, :a)::Vector{Int64}
│   %4  = Base.arrayref(true, %3, 1)::Int64
│   %5  = Base.getfield(f, :x)::StructVector{NamedTuple{(:a, :b), Tuple{Int64, Float64}}, NamedTuple{(:a, :b), Tuple{Vector{Int64}, Vector{Float64}}}, Int64}
│   %6  = StructArrays.getfield(%5, :components)::NamedTuple{(:a, :b), Tuple{Vector{Int64}, Vector{Float64}}}
│   %7  = StructArrays.getfield(%6, :b)::Vector{Float64}
│   %8  = Base.arrayref(true, %7, 1)::Float64
│   %9  = Base.sitofp(Float64, %4)::Float64
│   %10 = Base.mul_float(%9, %8)::Float64
└──       return %10
) => Float64
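
If the field should still be restricted to StructVectors with a known element type, the type parameter can carry an upper bound. The following is only a sketch of that idea; `BoundedFoo` and `Row` are hypothetical names that do not appear elsewhere in this thread.

```julia
using StructArrays

# Hypothetical alias for the element (row) type.
const Row = @NamedTuple{a::Int64, b::Float64}

# T becomes the full concrete array type at construction time, but it must
# be some StructVector whose element type is Row.
struct BoundedFoo{T<:StructVector{Row}}
    x::T
end

first_prod(f::BoundedFoo) = f.x.a[1] * f.x.b[1]

f = BoundedFoo(StructVector(a = [1, 2, 3], b = [0.1, 0.2, 0.3]))
first_prod(f)
```

The field type is concrete for every instance, while the struct definition itself never spells out the component types.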

@sairus7
Author

sairus7 commented Apr 17, 2023

Sorry @jishnub, I missed your answer. Yes, it looks like this happens because the concrete type depends on a derived type parameter for the tuple of vectors. The question is: how can I automate this, perhaps with a macro in the function signature?

julia> StructVector{NamedTuple{(:a, :b), Tuple{Int64, Float64}}} |> Base.isconcretetype
false

julia> s = StructVector(a=[1,2,3], b=[0.1, 0.2, 0.3])
3-element StructArray(::Vector{Int64}, ::Vector{Float64}) with eltype NamedTuple{(:a, :b), Tuple{Int64, Float64}}:
 (a = 1, b = 0.1)
 (a = 2, b = 0.2)
 (a = 3, b = 0.3)

julia> T = typeof(s)
StructVector{NamedTuple{(:a, :b), Tuple{Int64, Float64}}, NamedTuple{(:a, :b), Tuple{Vector{Int64}, Vector{Float64}}}, Int64} (alias for StructArray{NamedTuple{(:a, :b), Tuple{Int64, Float64}}, 1, NamedTuple{(:a, :b), Tuple{Array{Int64, 1}, Array{Float64, 1}}}, Int64})

julia> isconcretetype(T)
true

Otherwise I have to write a long boilerplate type signature:

struct Foo2
    x::StructVector{NamedTuple{(:a, :b), Tuple{Int64, Float64}}, NamedTuple{(:a, :b), Tuple{Vector{Int64}, Vector{Float64}}}, Int64}
end

f = Foo2(StructVector(a=[1,2,3], b=[0.1, 0.2, 0.3]))

function first_prod(f::Foo2)
    f.x.a[1] * f.x.b[1]
end

@code_typed first_prod(f)

to get type-stable output:

CodeInfo(
1 ─ %1  = Base.getfield(f, :x)::StructVector{NamedTuple{(:a, :b), Tuple{Int64, Float64}}, NamedTuple{(:a, :b), Tuple{Vector{Int64}, Vector{Float64}}}, Int64}
│   %2  = StructArrays.getfield(%1, :components)::NamedTuple{(:a, :b), Tuple{Vector{Int64}, Vector{Float64}}}
│   %3  = StructArrays.getfield(%2, :a)::Vector{Int64}
│   %4  = Base.arrayref(true, %3, 1)::Int64
│   %5  = Base.getfield(f, :x)::StructVector{NamedTuple{(:a, :b), Tuple{Int64, Float64}}, NamedTuple{(:a, :b), Tuple{Vector{Int64}, Vector{Float64}}}, Int64}
│   %6  = StructArrays.getfield(%5, :components)::NamedTuple{(:a, :b), Tuple{Vector{Int64}, Vector{Float64}}}
│   %7  = StructArrays.getfield(%6, :b)::Vector{Float64}
│   %8  = Base.arrayref(true, %7, 1)::Float64
│   %9  = Base.sitofp(Float64, %4)::Float64
│   %10 = Base.mul_float(%9, %8)::Float64
└──       return %10
) => Float64

Actually, there is already a macro for NamedTuple signatures, so I can simplify part of this:

StructVector{@NamedTuple{a::Int64, b::Float64}, @NamedTuple{a::Vector{Int64}, b::Vector{Float64}}, Int64}

but that is not ideal either...

@sairus7
Author

sairus7 commented Apr 17, 2023

I think a macro like @StructVector{a::Int64, b::Float64} or @StructVector{Bar} that unrolls those dependent types would come in handy here.

@sairus7
Author

sairus7 commented Apr 18, 2023

(updated)
So, I've tried to write one based on the @NamedTuple code, but it looks like it works only for NamedTuple eltypes, not for a custom struct. The problem is, I need to eval struct type to get its fields, but I don't have access to it at parse time. Any ideas?

module My

using StructArrays

macro StructVector(ex)
    Meta.isexpr(ex, :braces) || Meta.isexpr(ex, :block) ||
        throw(ArgumentError("@StructVector expects {...} or begin...end"))
    decls = filter(e -> !(e isa LineNumberNode), ex.args)
    if length(decls) == 1 && decls[1] isa Symbol 
        @show decls[1]
        T = Core.eval(__module__, decls[1])
        vars = [QuoteNode(e) for e in fieldnames(T)]
        types = [esc(e) for e in fieldtypes(T)]
        vtypes = [:(Vector{$(t)}) for t in types]
        @show vars, types, vtypes
        return :(StructVector{$T, NamedTuple{($(vars...),), Tuple{$(vtypes...)}}, Int64})
    else 
        all(e -> e isa Symbol || Meta.isexpr(e, :(::)), decls) ||
            throw(ArgumentError("@StructVector must contain a sequence of name or name::type expressions"))
        vars = [QuoteNode(e isa Symbol ? e : e.args[1]) for e in decls]
        types = [esc(e isa Symbol ? :Any : e.args[2]) for e in decls]
        vtypes = [:(Vector{$(t)}) for t in types]
        @show vars, types, vtypes
        return :(StructVector{NamedTuple{($(vars...),), Tuple{$(types...)}}, NamedTuple{($(vars...),), Tuple{$(vtypes...)}}, Int64})
    end
end

# works for named tuples
struct Bar
    x::@StructVector{a::Int64,b::Float64}
    function Bar()
        new(StructVector(a = Int64[], b = Float64[]))
    end
end
first_prod(f::Bar) = f.x.a[1] * f.x.b[1]

# does not work for structs
struct MyStruct
    a::Int
    b::Float64
end
struct Baz
    x::@StructVector{MyStruct}
    function Baz()
        new(StructVector{MyStruct}(undef, 0))
    end
end
first_prod(f::Baz) = f.x.a[1] * f.x.b[1]

end

f = My.Bar()

f = My.Baz() # error

@sairus7
Author

sairus7 commented Apr 19, 2023

It looks like the problem was on my side, with stale REPL state: the code above works well in a fresh REPL.

@aplavin
Member

aplavin commented Apr 19, 2023

> The problem is, I need to eval struct type to get its fields, but I don't have access to it at parse time. Any ideas?

A @generated function should help here: it can access the type and generate arbitrary code. You don't even need any new macros; the following is possible to implement:

@generated StructVectorT(elT) = ...

StructVectorT(Tuple{Int, String})
StructVectorT(@NamedTuple{a::Int, b::String})
StructVectorT(MyStruct)

But... is that really needed? What's wrong with @jishnub's suggestion to just parameterize the struct? It gives strictly more flexibility: you can add new fields to the NamedTuple arbitrarily, or use another array type with the same or a similar interface.
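
For illustration only, here is a sketch of such a helper that swaps the @generated function for a plain function: it asks the default `undef` constructor for an empty array and takes its type. `structvector_type` is a hypothetical name, not part of StructArrays, and the result assumes the default Vector-backed layout.

```julia
using StructArrays

# Hypothetical helper (not a StructArrays API): the concrete type of the
# StructVector that the default `undef` constructor builds for eltype T.
structvector_type(::Type{T}) where {T} = typeof(StructVector{T}(undef, 0))

struct MyStruct
    a::Int
    b::Float64
end

struct Baz
    # The annotation is evaluated once, when Baz is defined.
    x::structvector_type(MyStruct)
    Baz() = new(StructVector{MyStruct}(undef, 0))
end

first_prod(f::Baz) = f.x.a[1] * f.x.b[1]
```

This pins the field to Vector columns, which is exactly the loss of flexibility that the parameterized struct avoids.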

@sairus7
Author

sairus7 commented Apr 19, 2023

This macro returns a type declaration that can be mixed up with functions typically used for object initialization.

Actually, I am also considering this short form for function input arguments, as discussed here: https://discourse.julialang.org/t/functions-with-static-table-like-inputs-and-outputs/97471

I want to restrict it to StructArray to guarantee that it runs only on a collection which supports both row-wise and column-wise access, so I replace AbstractVector with StructVector.

I agree with you that having incomplete parametrized types may be a better approach here.
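
For the function-argument case, a short sketch of why the incomplete form is enough: Julia specializes a method on the concrete type of the argument it receives, so a non-concrete annotation in a signature only restricts dispatch. `ABTable` is a hypothetical alias used only here.

```julia
using StructArrays

# Hypothetical alias: any StructVector whose rows are (a::Int64, b::Float64).
# This type is not concrete, which is fine in a method signature.
const ABTable = StructVector{@NamedTuple{a::Int64, b::Float64}}

first_prod(t::ABTable) = t.a[1] * t.b[1]

s = StructVector(a = [1, 2, 3], b = [0.1, 0.2, 0.3])
first_prod(s)
```

Only struct fields need the full concrete type (or a type parameter) to be type-stable; argument annotations like this one do not.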
