Add streaming-compatible SVE variant to VFABI mangling
From the point of view of vector libraries, it is convenient to treat
SVE and streaming-compatible SVE as separate vector variants. This is
because existing optimised SVE routines may not be compatible with
streaming mode, for instance where they use SVE instructions which are
illegal in streaming mode.

This patch adds the ISA marker 'c' for streaming-compatible SVE.
Existing mappings from scalar to SVE symbols should all still make
sense with streaming compatibility enabled, except that if the region
being vectorised may have streaming enabled, the 'c' variant should be
used rather than 's'.

At present, for library purposes, we are only interested in reaching a
consensus about what to name the routines, rather than extending
OpenMP and the VFABI to actually facilitate autovectorisation.
However, please let me know if there is anything that I have left
ambiguous or need to add.
joeramsay committed Oct 17, 2024
1 parent 2fa789a commit c12a8e1
Showing 1 changed file with 37 additions and 13 deletions.
50 changes: 37 additions & 13 deletions vfabia64/vfabia64.rst
@@ -942,6 +942,13 @@ undefined.
Zn.b [msb] ... 0x??????03 0x??????02 0x??????01 0x??????00 [lsb]
Zn.s [msb] ... 0x00000003 0x00000002 0x00000001 0x00000000 [lsb]

Streaming compatibility
^^^^^^^^^^^^^^^^^^^^^^^

If targeting SVE from a streaming or streaming-compatible region,
calls should be emitted to the streaming-compatible SVE variant rather
than the plain SVE variant (differentiated by mangling, as below).
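
The selection rule above can be sketched in C. This is only an
illustration under stated assumptions: the ``target_isa`` enum and the
``pick_isa_marker`` helper are hypothetical names, not part of the VFABI
or of any compiler. The rule in this section only covers SVE targets, so
the Advanced SIMD branch simply returns the existing ``n`` marker.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the vector ISA the vectoriser is targeting. */
enum target_isa { TARGET_ADVSIMD, TARGET_SVE };

/* Returns the <isa> letter to use in the mangled vector function name.
 * may_be_streaming is true when the calling region is streaming or
 * streaming-compatible, i.e. streaming mode may be enabled at the call. */
static char pick_isa_marker(enum target_isa isa, bool may_be_streaming)
{
    if (isa == TARGET_ADVSIMD)
        return 'n';                      /* not affected by the rule above */
    return may_be_streaming ? 'c' : 's'; /* 'c' replaces 's' when streaming is possible */
}
```

With this sketch, an SVE target in a non-streaming region maps to ``s``,
and the same target in a streaming or streaming-compatible region maps
to ``c``.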

Vector function name mangling
-----------------------------

@@ -983,6 +990,7 @@ Name mangling grammar for vector functions.

<isa> := "n" (Advanced SIMD)
       | "s" (SVE)
       | "c" (Streaming-compatible SVE)

<mask> := "N" (No Mask)
        | "M" (Mask)
@@ -1195,6 +1203,19 @@ Note that the ``svbool_t`` parameter is described in `SVE masking`_.
svfloat32_t _ZGVsM8vv_bar(svfloat64_t vx, svfloat64_t vy,
svbool_t vmask);
Streaming-compatible SVE Examples
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The use of ``#pragma omp declare simd`` with ``f``, ``g`` and ``foo``
in a streaming or streaming-compatible region will also generate:

* ``svfloat32_t _ZGVcMxv_f(svfloat64_t, svbool_t) __arm_streaming_compatible``
  streaming-compatible VLA signature for the vector version of ``f``;
* ``svfloat64_t _ZGVcMxv_g(svfloat32_t, svbool_t) __arm_streaming_compatible``
  streaming-compatible VLA signature for the vector version of ``g``;
* ``svint16_t _ZGVcMxvvv_foo(svint64_t, svint32_t, svint8_t, svbool_t) __arm_streaming_compatible``
  streaming-compatible VLA signature for the vector version of ``foo``.
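
The mangled names in the list above follow the layout
``_ZGV<isa><mask><vlen><parameters>_<name>``. A minimal C sketch of
composing such a name is shown below. The helper ``vfabi_mangle`` is a
hypothetical name introduced only for illustration, and the layout is
inferred from the examples in this section rather than taken from a
compiler implementation.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes "_ZGV<isa><mask><vlen><parameters>_<name>"
 * into out. Like snprintf, it returns the length the full name needs,
 * so a result >= out_size means the name was truncated. */
static int vfabi_mangle(char *out, size_t out_size, char isa, char mask,
                        const char *vlen, const char *params,
                        const char *scalar_name)
{
    assert(out && vlen && params && scalar_name);
    return snprintf(out, out_size, "_ZGV%c%c%s%s_%s",
                    isa, mask, vlen, params, scalar_name);
}
```

For example, calling ``vfabi_mangle`` with ``'c'``, ``'M'``, ``"x"``,
``"vvv"`` and ``"foo"`` produces ``_ZGVcMxvvv_foo``, the name in the
third item of the list.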

Linear parameters examples
--------------------------

@@ -1364,17 +1385,19 @@ AArch64 Variant Traits

.. table:: AArch64 traits for OpenMP contexts.

+------------------+-----------------------+-------------------------+
|Trait set |Trait value |Notes |
+==================+=======================+=========================+
|``device`` |``isa("simd")`` |Advanced SIMD call. |
+------------------+-----------------------+-------------------------+
|``device`` |``isa("sve")`` |SVE call. |
+------------------+-----------------------+-------------------------+
|``device`` |``arch("march-list")`` |Used to match |
| | |``-march=march-list`` |
| | |from the compiler. |
+------------------+-----------------------+-------------------------+
+------------------+-----------------------+-------------------------------+
|Trait set |Trait value |Notes |
+==================+=======================+===============================+
|``device`` |``isa("simd")`` |Advanced SIMD call. |
+------------------+-----------------------+-------------------------------+
|``device`` |``isa("sve")`` |SVE call. |
+------------------+-----------------------+-------------------------------+
|``device`` |``isa("sc_sve")`` |Streaming-compatible SVE call. |
+------------------+-----------------------+-------------------------------+
|``device`` |``arch("march-list")`` |Used to match |
| | |``-march=march-list`` |
| | |from the compiler. |
+------------------+-----------------------+-------------------------------+

The scalar function ``f`` that is decorated with a ``declare
variant`` directive with a ``simd`` trait in the ``construct`` set is
@@ -1391,8 +1414,9 @@ mapped to the vector function ``F`` according to the following rules:

1. ``isa("simd")`` targets Advanced SIMD function signatures.
2. ``isa("sve")`` targets SVE function signatures.
3. Either ``isa("simd")`` or ``isa("sve")`` must be specified.
4. The ``arch`` traits of the ``device`` set is optional, and it
3. ``isa("sc_sve")`` targets streaming-compatible SVE function signatures.
4. One of ``isa("simd")``, ``isa("sve")`` or ``isa("sc_sve")`` must be specified.
5. The ``arch`` traits of the ``device`` set is optional, and it
accepts any value that can be passed to the compiler via the
command line option ``-march``.
