Hip Refactor #1359
Conversation
… into repo-refactor
Finish implementing `compiler`
Reviewable status: 0 of 13 files reviewed, 11 unresolved discussions (waiting on @Bob-Chen222)
docs/doxygen/Doxyfile
line 884 at r1 (raw file):
INPUT += $(FF_HOME)/python
INPUT += $(FF_HOME)/src
INPUT += $(FF_HOME)/lib/substitutions/include
What is this change for?
lib/kernels/src/hip/layer_norm_kernels.cpp
line 27 at r1 (raw file):
constexpr int kColwiseReduceTileSize = 32;
LayerNormPerDeviceState::LayerNormPerDeviceState(
Actually, you can get rid of this constructor in both the hip and cuda kernels. (Also remove the `LayerNormPerDeviceState` class, only keep the struct in `layer_norm_kernels.h`)
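For illustration, a minimal sketch of what the remaining plain struct might look like. The field names and the `PerDeviceFFHandle` stand-in are assumptions pieced together from the snippets quoted in this thread, not the actual header:
Code snippet:
#include <cstdint>

struct PerDeviceFFHandle {}; // hypothetical stand-in for the real handle type

// hedged sketch of the aggregate kept in layer_norm_kernels.h
struct LayerNormPerDeviceState {
  PerDeviceFFHandle handle;
  bool elementwise_affine;
  int64_t effective_batch_size;
  int64_t effective_num_elements;
  float eps;
  float *mean; // buffer names below are assumptions
  float *rstd;
  float *scale;
  float *bias;
};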
lib/kernels/src/hip/layer_norm_kernels.cpp
line 57 at r1 (raw file):
int64_t effective_num_elements_, float eps_) {
  elementwise_affine = elementwise_affine_;
Need to have a datatype for these.
Code snippet:
float* mean = allocator.allocate(...);
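A slightly fuller hedged sketch of what that could look like, assuming `allocator.allocate` returns `void *` (the `static_cast` and the second buffer are illustrative, not the actual API):
Code snippet:
// typed pointers instead of untyped allocations
float *mean = static_cast<float *>(
    allocator.allocate(sizeof(float) * effective_batch_size));
float *rstd = static_cast<float *>(
    allocator.allocate(sizeof(float) * effective_batch_size));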
lib/kernels/src/hip/layer_norm_kernels.cpp
line 67 at r1 (raw file):
scale = allocator.allocate(sizeof(float) * effective_batch_size);
bias = allocator.allocate(sizeof(float) * effective_batch_size);
LayerNormPerDeviceState per_device_state =
Here, just list-initialize `LayerNormPerDeviceState` (both hip and cuda)
Code snippet:
LayerNormPerDeviceState per_device_state = {handle, elementwise_affine, ...};
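Spelled out with the fields discussed in this thread (hedged; the exact member order must match the final struct definition):
Code snippet:
// aggregate initialization in declaration order
LayerNormPerDeviceState per_device_state = {handle,
                                            elementwise_affine,
                                            effective_batch_size,
                                            effective_num_elements,
                                            eps,
                                            mean,
                                            rstd,
                                            scale,
                                            bias};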
lib/kernels/src/hip/layer_norm_kernels.cpp
line 85 at r1 (raw file):
struct ForwardKernel {
  void operator()(hipStream_t stream, LayerNormPerDeviceState const *m,
Use reference instead of pointer (same for rest of file)
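For illustration, a hedged sketch of the changed parameter (the remaining parameters are truncated in the quote above, so they are elided here as well):
Code snippet:
struct ForwardKernel {
  // was: LayerNormPerDeviceState const *m  (accessed as m->field)
  // now: pass the state by reference and access it as m.field
  void operator()(hipStream_t stream, LayerNormPerDeviceState const &m, ...);
};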
lib/kernels/src/hip/linear_kernels.cpp
line 25 at r1 (raw file):
namespace Linear {
// what's the float * one_ptr
You can get rid of this comment
lib/kernels/src/hip/linear_kernels.cpp
line 90 at r1 (raw file):
void forward_kernel(hipStream_t stream, LinearPerDeviceState const *m,
Change to a reference (so use `m.xxx` instead of `m->xxx` in the rest of the code)
Code snippet:
LinearPerDeviceState const &m,
lib/kernels/src/hip/linear_kernels.cpp
line 152 at r1 (raw file):
HIPBLAS_GEMM_DEFAULT));
}
if (use_activation(m->activation)) {
`use_activation` needs to be defined and implemented.
Code snippet:
bool use_activation(ActiMode mode) {
  switch (mode) {
    // the real activation modes all fall through to the same answer
    case AC_MODE_RELU:
    case AC_MODE_SIGMOID:
    case AC_MODE_TANH:
      return true;
    case AC_MODE_NONE:
      return false;
    default:
      assert(0); // unknown activation mode
      break;
  }
  return false;
}
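A hedged sketch of the call site once `m` is also switched to a reference, per the earlier comments:
Code snippet:
if (use_activation(m.activation)) {
  // launch the matching activation kernel on `stream` here
}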
lib/kernels/src/hip/linear_kernels.cpp
line 182 at r1 (raw file):
void backward_kernel(hipStream_t stream, LinearPerDeviceState const *m,
Same here. Use a reference instead of a pointer
lib/substitutions/TUTORIAL.md
line 1 at r1 (raw file):
## Tutorial of substitution lib with simple example
Why is this file included?
lib/substitutions/include/substitutions/attribute_expr.h
line 10 at r1 (raw file):
enum class ConstraintType { EQUAL };
/**
I'm guessing you may have merged a docs PR with the kernel refactor. We should probably separate these out.
Hi @reyna-abhyankar, I believe everything is fixed now! The redundant changes from the substitutions library have been removed, and all of the requests you made have been addressed.
Reviewable status: 0 of 16 files reviewed, 11 unresolved discussions (waiting on @reyna-abhyankar)
lib/kernels/src/hip/layer_norm_kernels.cpp
line 27 at r1 (raw file):
Previously, reyna-abhyankar (Reyna Abhyankar) wrote…
Actually, you can get rid of this constructor in both the hip and cuda kernels. (Also remove the `LayerNormPerDeviceState` class, only keep the struct in `layer_norm_kernels.h`)
Done.
lib/kernels/src/hip/layer_norm_kernels.cpp
line 57 at r1 (raw file):
Previously, reyna-abhyankar (Reyna Abhyankar) wrote…
Need to have a datatype for these.
Done.
lib/kernels/src/hip/layer_norm_kernels.cpp
line 67 at r1 (raw file):
Previously, reyna-abhyankar (Reyna Abhyankar) wrote…
Here, just list-initialize `LayerNormPerDeviceState` (both hip and cuda)
Done.
lib/kernels/src/hip/layer_norm_kernels.cpp
line 85 at r1 (raw file):
Previously, reyna-abhyankar (Reyna Abhyankar) wrote…
Use reference instead of pointer (same for rest of file)
Done.
lib/kernels/src/hip/linear_kernels.cpp
line 25 at r1 (raw file):
Previously, reyna-abhyankar (Reyna Abhyankar) wrote…
You can get rid of this comment
Done.
lib/kernels/src/hip/linear_kernels.cpp
line 90 at r1 (raw file):
Previously, reyna-abhyankar (Reyna Abhyankar) wrote…
Change to a reference (so use `m.xxx` instead of `m->xxx` in the rest of the code)
Done.
lib/kernels/src/hip/linear_kernels.cpp
line 152 at r1 (raw file):
Previously, reyna-abhyankar (Reyna Abhyankar) wrote…
`use_activation` needs to be defined and implemented.
Done.
lib/kernels/src/hip/linear_kernels.cpp
line 182 at r1 (raw file):
Previously, reyna-abhyankar (Reyna Abhyankar) wrote…
Same here. Use a reference instead of a pointer
Done.
docs/doxygen/Doxyfile
line 884 at r1 (raw file):
Previously, reyna-abhyankar (Reyna Abhyankar) wrote…
What is this change for?
Done.
lib/substitutions/TUTORIAL.md
line 1 at r1 (raw file):
Previously, reyna-abhyankar (Reyna Abhyankar) wrote…
Why is this file included?
Done.
lib/substitutions/include/substitutions/attribute_expr.h
line 10 at r1 (raw file):
Previously, reyna-abhyankar (Reyna Abhyankar) wrote…
I'm guessing you may have merged a docs PR with the kernel refactor. We should probably separate these out.
Done.
Reviewable status: 0 of 16 files reviewed, all discussions resolved
Description of changes:
Hi @lockshaw Colin and @reyna-abhyankar Reyna,
This PR aligns the HIP implementations of the `linear` and `layer_norm` kernels with their CUDA versions. If everything works fine, I can proceed to change the other HIP kernels.