I was looking for a drop-in replacement for torch's `TransformerEncoder`:
```python
nn.TransformerEncoder(
    nn.TransformerEncoderLayer(
        d_model = hidden_dim,
        nhead = nhead,
        dim_feedforward = dim_feedforward,
        batch_first = True
    ),
    num_layers = num_layers
)
```
And while this repo does offer a `LocalTransformer` (#10), the implementation expects a discrete input and a language modeling objective.
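For context, the existing `LocalTransformer` is used along these lines (the constructor/forward arguments here are recalled from the repo's README, so treat the exact names and values as approximate), which is why it can't consume continuous embeddings directly:

```python
import torch
from local_attention import LocalTransformer

# assumed approximate usage of the existing class: it embeds discrete token ids
# internally and is trained with a language modeling loss
model = LocalTransformer(
    num_tokens = 256,
    dim = 512,
    depth = 6,
    max_seq_len = 8192,
    causal = True,
    local_attn_window_size = 256
)

tokens = torch.randint(0, 256, (1, 8192))   # token ids, not float embeddings
loss = model(tokens, return_loss = True)    # language modeling objective
```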
Would you be up for including a `LocalTransformerLayer` and then using it in the `LocalTransformer`? (pseudocode below)
```python
# (pseudocode) imports and the __init__ signature are filled in from the body;
# the helper module paths are assumed to match this repo's transformer module
import torch.nn as nn
from local_attention import LocalMHA
from local_attention.transformer import FeedForward, DynamicPositionBias, exists

class LocalTransformerLayer(nn.Module):
    def __init__(
        self,
        dim,
        dim_head = 64,
        heads = 8,
        causal = True,
        local_attn_window_size = 512,
        attn_dropout = 0.,
        ff_mult = 4,
        ff_dropout = 0.,
        use_xpos = False,
        xpos_scale_base = None,
        use_dynamic_pos_bias = False,
        **kwargs
    ):
        super().__init__()
        self.local_attn_window_size = local_attn_window_size

        self.attn = LocalMHA(
            dim = dim,
            dim_head = dim_head,
            heads = heads,
            dropout = attn_dropout,
            causal = causal,
            window_size = local_attn_window_size,
            use_xpos = use_xpos,
            xpos_scale_base = xpos_scale_base,
            use_rotary_pos_emb = not use_dynamic_pos_bias,
            prenorm = True,
            **kwargs
        )

        self.ff = FeedForward(dim = dim, mult = ff_mult, dropout = ff_dropout)

        self.dynamic_pos_bias = None
        if use_dynamic_pos_bias:
            self.dynamic_pos_bias = DynamicPositionBias(dim = dim // 2, heads = heads)

    def forward(self, x, mask = None):
        # dynamic positional bias, computed from the window size
        attn_bias = None
        if exists(self.dynamic_pos_bias):
            w = self.local_attn_window_size
            attn_bias = self.dynamic_pos_bias(w, w * 2)

        # residual around local attention, then feedforward
        x = self.attn(x, mask = mask, attn_bias = attn_bias) + x
        return self.ff(x) + x
```
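For the encoder use case above, a stack of such layers could then stand in for `nn.TransformerEncoder`. A rough sketch, using the layer above with made-up hyperparameters (`causal = False` for a bidirectional encoder is an assumption about what `LocalMHA` supports), not a tested implementation:

```python
import torch
import torch.nn as nn

# hypothetical stack of the proposed LocalTransformerLayer modules, used like
# nn.TransformerEncoder over continuous (batch, seq, dim) embeddings
encoder = nn.Sequential(*[
    LocalTransformerLayer(
        dim = 512,
        heads = 8,
        causal = False,               # non-causal, encoder-style local attention
        local_attn_window_size = 128
    )
    for _ in range(6)
])

x = torch.randn(2, 1024, 512)   # float embeddings, batch_first layout
out = encoder(x)                # same shape as the input
```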