llvm.noalias attribute on Memref arguments is not properly lowered #309
I think this is a crucial issue - thanks for bringing this up. I wish you actually had a test case upstream in test/Target/llvmir.mlir exercising the propagation of llvm.noalias as you show below; the issue would have been exposed earlier.
Using the llvm noalias function parameter attribute was really a simple solution to get away with. :) This now requires a systematic solution, which I think was inevitable anyway. I think alias.scope metadata will now be needed on the generated loads/stores, with self-referencing domain scopes. As a first step, the name and meaning of llvm.noalias could be changed to convey propagation via indirection: llvm.noalias -> llvm.nomemrefalias.

I tend to think the desired goal is easier/cleaner to accomplish in a pass on the LLVM dialect: the pass would process the nomemrefalias attribute and mark loads/stores with alias.scope metadata, discarding the nomemrefalias attribute at the end. OTOH, trying to do this during the std-to-llvm dialect lowering would require introducing state to consistently/uniquely assign scope domain IDs during store/load op lowering by tracking the origin of each memref descriptor value.

~ Uday
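For concreteness, here is a minimal, hand-written sketch (not the output of any existing pass; the function name and metadata numbering are made up) of the scoped-alias metadata such a pass would have to emit for a function with two non-aliasing memref arguments:

```llvm
; Self-referencing scope domain and one scope per memref argument.
!0 = !{!0}        ; alias scope domain
!1 = !{!1, !0}    ; scope for the first argument
!2 = !{!2, !0}    ; scope for the second argument
!3 = !{!1}        ; scope list: { first argument }
!4 = !{!2}        ; scope list: { second argument }

define void @copy(float* %A, float* %B) {
  ; Each access names its own scope and declares no aliasing with the other.
  %v = load float, float* %A, align 4, !alias.scope !3, !noalias !4
  store float %v, float* %B, align 4, !alias.scope !4, !noalias !3
  ret void
}
```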
Thanks, Uday! Yes, it was a temporary (but very convenient) solution :). I agree it's very unfortunate that we didn't have test coverage for that in MLIR. This has become a blocker in terms of performance for us.

Your proposed approach sounds reasonable to me. I was wondering if we currently have support to represent metadata in the LLVM-IR dialect in some way. Otherwise, translating "markers" (attributes?) on LLVM-IR dialect loads/stores to proper alias domains using metadata in LLVM IR might not be as straightforward a 1:1 mapping as the rest of the translation. Any thoughts on this?

I'm going to be OOO for a couple of days. I could give this a thought when I'm back if nobody has started already. However, I think I'll need some help.

Thanks,
Diego
I don't think we have examples of representing metadata like LLVM scope domains in MLIR. Also, just translating op attributes to LLVM instruction metadata won't be sufficient here, since the alias metadata has out-of-line definitions, as shown below, that we may want printed in a similar way in the LLVM dialect:

; scope domain
!0 = !{!0}
!1 = !{!1}
; scope list
!2 = !{!0, !1}

%0 = load float, float* %c, align 4, !alias.scope !5

In fact, if we want something that looks close to the printed LLVM IR (out-of-line definitions for the domains), a possible approach with MLIR would be to use 0-d constant affine maps. AFAIK, they are the only thing (besides integer sets) that has out-of-line definitions/IR printing, and all the infrastructure needed is already in the IR. 0-d constant affine maps could be used to define scope domains, and the maps could be referred to in the attributes on the loads/stores:

// alias scope domain
#0 = () -> (0)

%0 = llvm.load ... {alias_domain = #0} :

This would look nearly isomorphic to the resulting LLVM IR. To define a list of domains:

#2 = () -> (0, 1)

The key issue with using affine maps in the LLVM dialect would appear to be that it goes against a future plan to move the affine expr/map infrastructure into the affine dialect to make core MLIR leaner. But one could argue that a use case like this is a motivation to just keep it in the IR the way it is; otherwise, we'd need another structure anyway, plus the associated infrastructure, to model such "global" metadata, and it could be representationally less powerful if it's designed only for the "constant integers defining domains" case. It'd be good to consider now what other sorts of LLVM metadata/attributes we aren't able to model well at the moment.

The above still doesn't address the 1:1 mapping question you have, but it allows use outside the LLVM dialect as well (since the alias domain information is likely populated in higher-level dialects). The other approach could be to just use a list of integers as an attribute instead of a map, but then you won't get out-of-line printing from the IR. In either case, mlir-translate would have to generate the domains based on the alias_domain attributes. @joker-eph and @ftynse will have useful feedback on this.
I can help with this. ~ Uday
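A sketch of the translation step Uday mentions (my own illustration; the alias_domain attribute name and the scope numbering are assumptions, not an existing mlir-translate feature):

```mlir
// MLIR (LLVM dialect), with the proposed out-of-line domain definition:
#0 = () -> (0)
%0 = llvm.load %p {alias_domain = #0} : !llvm<"float*">

// mlir-translate would then materialize the domain as self-referencing
// scope metadata and attach a scope list to the load, roughly:
//   !0 = !{!0}            ; domain generated from #0
//   !1 = !{!1, !0}        ; scope in that domain
//   !2 = !{!1}            ; scope list
//   %0 = load float, float* %p, align 4, !alias.scope !2
```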
Hi Diego, here are a few of the tradeoffs involved:
Bottom line: I think you guys properly identified a reasonable solution that is general enough, and IMO it is the right path forward. Irrespective of making aliasing work on the current struct descriptors, it is also reasonable to implement a working solution, controlled by a flag, for point 2 in the following cases:
However, I think the previous approach (i.e. partial specializations and passing only exactly the dynamic sizes and strides) is premature optimization with a high code-complexity cost. I would still avoid it in the absence of strong supporting performance data.
There was another person on the mailing list interested in memref lowering schemes other than our current descriptors, so I think providing an easy way to plug into the std->llvm lowering is a reasonable short-term solution. This would allow one to implement point 2 from Nicolas's comment above as a lowering option. Note that the dialect conversion framework actually supports one-to-many rewrites of function arguments, so it is technically possible to unpack the descriptor into individual arguments, and potentially rewrite the allocated+aligned pair of aliasing pointers into an allocated+alignment_offset pair of a non-aliasing pointer and an integer; see the sketch below. However, using multiple arguments to a function quickly gets into ABI territory if you want to interact with C code. The longer-term solution would be to introduce an alias model into MLIR proper, and I would be happy to see an RFC going in that direction.
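A purely illustrative sketch of what such an unpacked signature could look like for a 1-D dynamically sized memref (the function name, argument names, and argument order are made up, not the output of any existing pass):

```mlir
// Hypothetical unpacked form: one non-aliasing data pointer plus an alignment
// offset instead of the aliasing allocated/aligned pointer pair, followed by
// the offset, size, and stride as plain integers.
llvm.func @unpacked(%data: !llvm<"float*">, %align_offset: !llvm<"i64">,
                    %offset: !llvm<"i64">, %size0: !llvm<"i64">,
                    %stride0: !llvm<"i64">) {
  llvm.return
}
```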
There's no direct support for LLVM metadata, but using attributes to model individual pieces of metadata on a need-driven basis is the way to go IMO.
Any attributes are allowed in attribute aliases (https://github.com/tensorflow/mlir/blob/master/g3doc/LangRef.md#attribute-value-aliases); in particular, array and dictionary attributes work perfectly fine and are exercised by linalg/structured ops. So something like

#scope0 = [0,1]
#scope1 = [2,3]
#scope3 = [#scope0, #scope1]

llvm.func @foo() {
  // ...
  llvm.load %0 {llvm.alias_scope = #scope3} : !llvm<"float*">
}

could work just fine. The problem with this approach (and with affine maps) is that it could create arbitrarily nested structures, potentially with repeating values inside, rather than becoming a flattened integer set. If we were to model this properly, I would suggest introducing a separate attribute type that has set-like behavior, with unpacking, on top of a list. There was a discussion on the mailing list about introducing leading keywords in attribute syntax to disambiguate between affine maps and function types, so something like
Thank you all for the feedback! I think we all agree that introducing an alias model is the long-term solution, regardless of this particular issue. Let me investigate a bit more how difficult introducing custom memref lowering schemes would be.
291a309 implemented a separation between memory-related conversion patterns and the remaining patterns. From there, it should suffice to derive from LLVMTypeConverter, override its convertType() to lower MemRefType differently while deferring to the base class for the other types, and provide the lowering patterns for the std operations listed in populateStdToLLVMMemoryConversionPattern. This should unblock you in the short term through local changes in your pipeline. We remain open to discussing upstream solutions.
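As a rough illustration of the derivation Alex describes, a custom type converter could look roughly like the sketch below. This is only a sketch against the tensorflow/mlir-era API (names such as convertType, hasStaticShape, and getPointerTo are assumed to match that revision and may differ in later MLIR versions):

```cpp
#include "mlir/Conversion/StandardToLLVM/ConvertStandardToLLVM.h"
#include "mlir/Dialect/LLVMIR/LLVMDialect.h"
#include "mlir/IR/StandardTypes.h"

// Lower memrefs with static shape to a bare pointer to the element type;
// defer every other type to the default LLVMTypeConverter behavior.
class BarePtrTypeConverter : public mlir::LLVMTypeConverter {
public:
  using LLVMTypeConverter::LLVMTypeConverter;

  mlir::Type convertType(mlir::Type type) override {
    if (auto memRefTy = type.dyn_cast<mlir::MemRefType>()) {
      // Dynamic shapes still need a descriptor; signal failure for them.
      if (!memRefTy.hasStaticShape())
        return {};
      auto elementTy = convertType(memRefTy.getElementType())
                           .dyn_cast_or_null<mlir::LLVM::LLVMType>();
      if (!elementTy)
        return {};
      return elementTy.getPointerTo(memRefTy.getMemorySpace());
    }
    return LLVMTypeConverter::convertType(type);
  }
};
```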
On Thu, Dec 12, 2019 at 1:54 PM Uday Bondhugula wrote:
> Your proposed approach sounds reasonable to me. I was wondering if we currently have support to represent metadata in LLVM-IR dialect in some way. Otherwise, translating "markers" (attributes?) on LLVM-IR dialect loads/stores to proper alias domains using metadata in LLVM-IR might not be as straightforward a 1:1 mapping as the rest of the translation. Any thoughts on this?
>
> I don't think we have examples of representing metadata like LLVM scope domains in MLIR. Also, just translating op attributes to LLVM instruction metadata won't be sufficient here since the alias metadata would have out-of-line definitions, as shown below, that we may want to be printed in a similar way with the LLVM dialect.
>
> ; scope domain
> !0 = !{!0}
> !1 = !{!1}
> ; scope list
> !2 = !{!0, !1}
>
> %0 = load float, float* %c, align 4, !alias.scope !5
>
> In fact, if we want something that looks close to the printed LLVM IR (out-of-line definitions for the domains), a possible approach with MLIR would be to use 0-d constant affine maps. AFAIK, they are the only thing (besides integer sets) which have out-of-line definitions/IR printing. All the infra needed is also already in the IR. 0-d constant affine maps could be used to define scoped domains and the maps could be referred to in the attributes on the load/store's.
>
> // alias scope domain
> #0 = () -> (0)
>
> %0 = llvm.load ... {alias_domain = #0} :
>
> This would look nearly isomorphic to the resulting LLVM IR. To define a list of domains:
>
> #2 = () -> (0, 1)
>
> The key issue with using affine maps in the LLVM dialect would appear to be that it goes against a future plan to move the affine expr/map infrastructure into the affine dialect to make core MLIR leaner. But one could argue that a use case like this is a motivation to just keep it in the IR the way it is; otherwise, we'd anyway need another structure and the associated infra to model such "global" metadata and it could be representationally less powerful if it's just designed for the "constant integers defining domains" case. It'd be good to consider now what sort of other LLVM metadata/attributes we aren't able to model well atm.
Using affine maps to represent a list of integers seems like quite a big hammer to me; it isn't clear why we would do this instead of using attributes, for example.

But beyond this: in LLVM the aliasing scopes and domains aren't "integers", they are just unique abstract entities, and you don't manipulate a domain with a number: you just create a metadata node and the infrastructure keeps it unique. The integer in the textual printout is just the standard metadata printing in LLVM (similarly to how SSA value numbers don't exist in the in-memory IR and are just an artifact of serialization).

Also, we have a way to represent "global" (module-level, really) information: operations.

I don't know what the best representation for this is, but we can explore a few things.

--
Mehdi
> The above still doesn't address the 1:1 mapping question you have, but it allows use outside the LLVM dialect as well (since the alias domain information is likely populated in higher level dialects). The other approach could be to just use a list of integers as an attribute instead of a map. But you won't get out-of-line printing from the IR. In either case, mlir-translate would have to generate the domains based on alias_domain attributes.
>
> @joker-eph and @ftynse will have useful feedback on this.
>
> I'm going to be OOO for a couple of days. I could give this a thought when I'm back if nobody has started already. However, I think I'll need some help.
>
> I can help with this.
>
> ~ Uday
>
> Thanks,
> Diego
This PR introduces the infrastructure to provide a custom Std-to-LLVM lowering for the MemRef type:

* The MemRefDescriptor class is turned into an abstract API that defines the methods needed to perform the custom lowering.
* The existing MemRefDescriptor class is renamed to DefaultMemRefDescriptor. It provides the default struct-lowering implementation (NFC).
* TestCustomMemRefLLVMLowering.cpp implements a custom MemRef descriptor lowering and an LLVM type converter with the basic functionality to lower the MemRef type to a plain pointer to the element type.
* convert-memref-ops.mlir is split into `convert-static-memref-ops.mlir` and `convert-dynamic-memref-ops.mlir` so that TestCustomMemRefLLVMLowering.cpp can be tested on all the static MemRef tests available.

Related discussion: tensorflow#309
I implemented my own LLVMTypeConverter with a custom lowering for the MemRef type. However, that was not enough, since the Std-to-LLVM conversion uses the MemRefDescriptor class all over the place, which assumes that the MemRef type is lowered to the struct descriptor and generates the corresponding insertion and extraction operations to/from the struct. To overcome this issue, I introduced some changes to be able to provide a custom MemRefDescriptor as well (#337). This PR also introduces a test with a custom MemRef descriptor lowering that lowers MemRef types to plain pointers. This seems to work nicely. However, it's not enough to work around the aliasing problem. Currently, due to ABI requirements,
However, I found a simple workaround for now. I locally overrode

What do you think? PS: Commit 291a309 doesn't seem to be needed atm.
That's why I separated the memref-related patterns into a separate
that is, implement new patterns that are aware of a different lowering scheme and use them instead of the existing ones. A different lowering pattern indeed removes your problem with ABI requirements in the function signature. I think using different patterns for different lowering schemes is the right solution, achieving better separation of concerns. It is also significantly easier to implement than custom ABI support. Why are you not happy with it?

I did not read #337 in detail, but it goes in the wrong direction IMO. It essentially makes the current descriptor more standard and requires any other implementation to comply with its API. There may be cases where the
Sorry, I saw this comment after replying to the PR.
Basically, what I said in the PR. Of course, we can always create our own LLVM lowering, but that has a high maintainability and code-duplication cost. A custom LLVM lowering won't have control over the Std ops reaching the lowering, so we would have to be catching up with upstream every time something changes in the memref-related ops in the Std dialect or new ops are added.
That's because I didn't want to change the existing API, to minimize changes. I hoped the API could be turned into something more generic in subsequent patches. Maybe a good trade-off could be hosting these alternative LLVM patterns, lowering MemRef to plain pointers, upstream. If I understood correctly, there are more people interested in this. That would reduce the maintenance burden, and maybe some code reuse across lowerings is possible. WDYT?
LLVM lowering isn't a monolithic pass and it shouldn't be, so I would argue you are not creating an entirely new lowering, just changing the part that you want to be handled differently. As for the duplication cost, given how different the lowering scheme is, how much code would actually be duplicated? Most of the code in the current patterns handles dynamic sizes and strides, and it's essentially dead if you don't have them. If there is something reusable indeed, it can be factored out into helper functions or something like pattern factories.
It should be possible to
If we push the "being generic" part a couple of steps further (e.g., don't rely on the specific descriptor API, having allocated/aligned pointers, etc.), we will essentially get the default pattern along the lines of
at which point, having any sort of dispatch to actual implementations from the "default" pattern looks like overhead compared to just having different patterns.
This would shift the burden onto whoever updates the core, not necessarily reduce it. But I understand the concern about getting out of sync. So I'm fine with upstreaming an additional lowering for memrefs that goes to bare pointers when possible and fails otherwise. IMO, this can still be implemented as separate patterns producing simple code, sharing implementation helpers where reasonable. The choice between lowerings can be controlled by an option in the pass, similarly to what we do for malloc/alloca.
Summary: This patch introduces an alternative calling convention for MemRef function arguments in the LLVM dialect. It converts MemRef function arguments to LLVM bare pointers to the MemRef element type instead of creating a MemRef descriptor. Bare pointers are then promoted to MemRef descriptors at the beginning of the function. This calling convention is only enabled with a flag.

This is a stepping stone towards having an alternative and simpler lowering for MemRefs when dynamic shapes are not needed. It can also be used to temporarily overcome the issue with passing the 'noalias' attribute for MemRef arguments, discussed in [1, 2], since we can now convert:

func @check_noalias(%static : memref<2xf32> {llvm.noalias = true}) {
  return
}

into:

llvm.func @check_noalias(%arg0: !llvm<"float*"> {llvm.noalias = true}) {
  %0 = llvm.mlir.undef ...
  %1 = llvm.insertvalue %arg0, %0[0] ...
  %2 = llvm.insertvalue %arg0, %1[1] ...
  ...
  llvm.return
}

Related discussion:
[1] tensorflow/mlir#309
[2] tensorflow/mlir#337

WIP: I plan to move all the tests with only static shapes from convert-memref-ops.mlir to an independent file so that we can also have coverage for those tests with this alternative calling convention.

Reviewers: ftynse, bondhugula, nicolasvasilache

Subscribers: jholewinski, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, herhut, aartbik, liufengdb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72802
We've been using the 'llvm.noalias' attribute on Memrefs to propagate no-alias information about function arguments from nGraph to LLVM. This enabled some LLVM optimizations, like vectorization, and prevented unnecessary runtime checks on pointers, which were crucial for performance. Unfortunately, 'llvm.noalias' is not properly lowered after the latest changes to the Memref type. We are observing significant performance drops because of this.
Initially, the Memref descriptor in LLVM was just a pointer to the allocated data for that Memref, so 'llvm.noalias' was propagated to that pointer. However, currently the Memref descriptor in LLVM is a struct, as follows:
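For a rank-1 memref of f32, the descriptor corresponds roughly to the following LLVM struct (a sketch based on the layout described in this thread; the type name is made up for illustration):

```llvm
; { allocatedPtr, alignedPtr, offset, sizes[1], strides[1] }
%memref_1d_f32 = type { float*, float*, i64, [1 x i64], [1 x i64] }
```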
and 'llvm.noalias' is propagated to the pointer to the Memref descriptor (the struct) instead of to the actual pointers to the data (allocatedPtr and alignedPtr).
Example:
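A minimal sketch of the problem, assuming a 1-D static memref argument (the function name and the exact descriptor type are illustrative, not taken from the original report):

```mlir
// Before lowering: the no-alias hint is attached to the memref argument.
func @f(%arg0: memref<2xf32> {llvm.noalias = true}) {
  return
}

// After Std-to-LLVM lowering, the attribute ends up on the descriptor
// argument rather than on the allocatedPtr/alignedPtr fields inside it.
llvm.func @f(%arg0: !llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }*"> {llvm.noalias = true}) {
  llvm.return
}
```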
We would need the no-alias information to be propagated to allocatedPtr/alignedPtr in the Memref descriptor. I'm not sure what the best way to do that would be, since noalias is a function argument attribute that cannot be applied to struct fields, and allocatedPtr and alignedPtr do alias now. We would probably have to use alias scopes, but I'm not very familiar with that. Any thoughts?

Thanks,
Diego