A Python package to compile multiple NumPy einsum operations into one. While this package produces strings which are compatible with the default np.einsum, it actually uses opt_einsum, as this is significantly faster.
The package is available on PyPI:
pip install einsum_pipe
Given two arrays:
A = np.random.rand(32, 32, 10, 5)
B = np.random.rand(32, 32, 10, 5)
We frequently need to run multiple reshape/transpose/products/trace/etc., such as:
C = np.einsum('ij...,kl...->ikjl...', A, B)
D = C.reshape([2, ]*20 + [10, 5])
E = D.transpose([2, 3, 4, 5, 6, 7, 8, 9, 12, 13, 14,
                 15, 16, 17, 18, 19, 0, 1, 10, 11, 20, 21])
F = E.reshape([256, 256, 4, 4, 10, 5])
X = np.trace(F)
This creates multiple intermediate arrays, some of which can be large. Instead, it is possible to combine multiple np.einsum operations into one. By carefully modifying the input shape, it is even possible to do this when the intermediate data is reshaped during the process, provided the shapes are all compatible. The previous example can instead be performed in a single np.einsum step:
X = einsum_pipe(
    'ik...,jl...->ijkl...',
    [2, ]*20 + [10, 5],
    'abcde fghij klmno pqrst...->cde fghij mno pqrst ab kl...',
    [256, 256, 4, 4, 10, 5],
    'ii...',
    A, B
)
Internally, this calculates compatible input shapes, (4, 8, 4, 8, 50) and (32, 32, 50), and a combined set of np.einsum subscripts, "ebdbc,aac->edc". A and B are reshaped (which is frequently free), the single np.einsum (or opt_einsum.contract in practice) operation is run, and the output is reshaped back to the expected output shape.
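The equivalence described above can be checked with plain NumPy. The following is a minimal sketch that does not call einsum_pipe at all: it runs the step-by-step version and then the single np.einsum call with the reshaped inputs and the combined subscripts quoted above. As an assumption made only to keep the intermediate arrays small, the trailing dimensions are (2, 3) instead of (10, 5), so the merged trailing axis has size 6 rather than 50.

```python
import numpy as np

rng = np.random.default_rng(0)
# Assumption: trailing dimensions (2, 3) instead of (10, 5) to limit memory use.
A = rng.random((32, 32, 2, 3))
B = rng.random((32, 32, 2, 3))

# Step-by-step version with explicit intermediate arrays.
C = np.einsum('ij...,kl...->ikjl...', A, B)
D = C.reshape([2] * 20 + [2, 3])
E = D.transpose([2, 3, 4, 5, 6, 7, 8, 9, 12, 13, 14,
                 15, 16, 17, 18, 19, 0, 1, 10, 11, 20, 21])
F = E.reshape([256, 256, 4, 4, 2, 3])
X_steps = np.trace(F)

# Single-step version: reshape the inputs, run one einsum, reshape the output.
A2 = A.reshape(4, 8, 4, 8, 6)
B2 = B.reshape(32, 32, 6)
X_single = np.einsum('ebdbc,aac->edc', A2, B2).reshape(4, 4, 2, 3)

print(np.allclose(X_steps, X_single))
```

The single-step version never builds the (32, 32, 32, 32, ...) intermediate, which is where the memory saving comes from.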
You can find further examples in the "tests" folder.
The syntax is based on NumPy's einsum, with the addition of allowing multiple subscripts and defining the shapes of the intermediate arrays. The input arrays can be put at the end, as shown, or next to the subscript definitions. In this example, only two arrays are used at the start of the pipe; however, you can add more arrays at later stages. The output of the previous step is always considered the first input of the subsequent step.
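To illustrate what chaining steps means, here is a small NumPy-only sketch in which the output of one einsum step is the first input of the next, and a third array joins at the second step. The merged subscript string is written by hand for this example; it is not claimed to be the exact string einsum_pipe would generate.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((3, 4))
B = rng.random((4, 5))
C = rng.random((5, 6))

# Two separate steps: the output of the first is the first input of the second.
AB = np.einsum('ij,jk->ik', A, B)
two_steps = np.einsum('ik,kl->il', AB, C)

# The same computation written by hand as a single einsum call.
one_step = np.einsum('ij,jk,kl->il', A, B, C)

print(np.allclose(two_steps, one_step))
```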
Shapes are compatible if each dimension is the product of some subsequence of a matching shape (of the previous output). For example, (32, 32) and (4, 256) are compatible, since both can be built from the shape (4, 8, 4, 8): (4*8, 4*8) and (4, 8*4*8). On the other hand, (2, 3) and (3, 2) aren't directly compatible since they don't share divisors.
Note that transposition of axes also causes the transposition of the compatible shape, so while [(3, 2), 'ij->ij', (2, 3)] isn't valid, [(3, 2), 'ij->ji', (2, 3)] is.
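The compatibility rule can be sketched in a few lines of pure Python. The helper below, common_refinement, is a hypothetical name introduced for this illustration and is not part of the package. It returns the coarsest shape from which both inputs can be built by merging adjacent axes, or None if no such shape exists. It ignores transposition, so it only covers the direct case.

```python
from math import prod


def common_refinement(shape_a, shape_b):
    """Coarsest shape that both inputs can be built from, or None (hypothetical helper)."""
    if prod(shape_a) != prod(shape_b):
        return None
    a, b = list(shape_a), list(shape_b)
    out = []
    while a and b:
        if a[0] == b[0]:
            # Both axes end at the same point: emit the axis and advance both.
            out.append(a.pop(0))
            b.pop(0)
        elif a[0] < b[0]:
            # The axis of a must divide the axis of b, which is then split.
            if b[0] % a[0]:
                return None
            size = a.pop(0)
            out.append(size)
            b[0] //= size
        else:
            if a[0] % b[0]:
                return None
            size = b.pop(0)
            out.append(size)
            a[0] //= size
    # Anything left over can only be size-1 axes, since the totals match.
    out.extend(a or b)
    return tuple(out)


print(common_refinement((32, 32), (4, 256)))  # (4, 8, 32)
print(common_refinement((2, 3), (3, 2)))      # None
```

Note that (4, 8, 32) is coarser than the (4, 8, 4, 8) used in the text; both work, since (4*8, 32) and (4, 8*32) reproduce the two shapes.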
If a series of steps is incompatible, einsum_pipe will reduce it down to the fewest number of steps possible and optimise for the smallest intermediate array size. This isn't guaranteed to be the absolute optimum since calling this function recursively could reduce it further, but this probably isn't worth it.
In order to merge multiple subscript steps with different intermediate shapes, the input arrays must be reshaped to be compatible with all steps. However, after merging multiple subscripts, certain complex shapes may be eliminated. While it makes no difference to the performance of the operations, the actual subscript string passed to np.einsum can be unnecessarily long. This may even be an issue if there are more axes than available letters.
einsum_pipe includes the simplify argument to deal with such cases. This can be set to False to disable simplification or "max" to reduce the length of the subscripts as much as possible. However, "max" isn't always advisable, as merging smaller axes into a larger axis can force an array copy during the initial reshape if the input array has been transposed. Splitting an axis should never cause a problem. The default argument (True) simplifies the subscript as much as possible while maintaining the splits from the original input arrays. If your inputs are contiguous, you can safely use "max".
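The difference between splitting and merging axes on a transposed array can be seen directly in NumPy. This small sketch uses np.shares_memory to tell whether a reshape returned a view or had to copy; the array names are arbitrary.

```python
import numpy as np

base = np.arange(24).reshape(4, 6)
t = base.T  # shape (6, 4), a non-contiguous view of base

# Splitting an axis of the transposed array: NumPy can return a view.
split = t.reshape(2, 3, 4)

# Merging the two axes of the transposed array: NumPy has to copy.
merged = t.reshape(24)

print(np.shares_memory(split, base))   # True
print(np.shares_memory(merged, base))  # False
```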
NumPy's documentation on einsum lists some operations that can be implemented using np.einsum. Some of these have been implemented here in the ops submodule. These are just convenience functions to generate the correct subscripts for np.einsum; they generally produce a string. They can be used as part of einsum_pipe operations:
from einsum_pipe import ops
X = einsum_pipe(
    ops.inner(),
    ops.transpose((1, 0)),
    ops.diag(),
    'a->',
    A, B
)
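For reference, these are the plain np.einsum subscripts that NumPy's documentation associates with the same kinds of operations. This sketch uses NumPy only; it does not show what the ops functions return, which may differ (for example to handle extra axes).

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.random(4)
v = rng.random(4)
M = rng.random((4, 4))

inner = np.einsum('i,i', u, v)        # inner product, like np.inner(u, v)
transposed = np.einsum('ij->ji', M)   # transpose, like M.T
diagonal = np.einsum('ii->i', M)      # diagonal, like np.diag(M)
total = np.einsum('a->', diagonal)    # sum over the remaining axis

print(np.isclose(total, np.trace(M)))
```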
More operations may be added in future. As part of this implementation, einsum_pipe also supports "lazy" arguments: functions passed as arguments which will be called during parsing with the list of available input shapes, to then produce the subscript string or a reshape operation. Note this is still run during "compilation", not when running with np.einsum.
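A lazy argument could look roughly like the sketch below. This is an illustration under assumptions: the function name swap_last_two is hypothetical, and it assumes the callable receives the list of input shapes as a single positional argument, which the description above suggests but does not spell out. The function is exercised on its own here, without einsum_pipe.

```python
import string


def swap_last_two(shapes):
    """Hypothetical lazy argument: swap the last two axes of the first input."""
    ndim = len(shapes[0])
    if ndim < 2:
        raise ValueError("need at least two axes to swap")
    letters = string.ascii_lowercase[:ndim]
    # Build e.g. 'abcd->abdc' from the number of axes of the first input.
    return f"{letters}->{letters[:-2]}{letters[-1]}{letters[-2]}"


print(swap_last_two([(32, 32, 10, 5)]))  # abcd->abdc
```

Because the string is derived from the shapes at parse time, the same function would work for inputs with any number of axes.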