Diagonal computation is too inefficient #46

Open
davidscn opened this issue Aug 10, 2021 · 0 comments
Labels
deal.II Related to deal.II

Comments

@davidscn
Owner

When running 3D cases, roughly 55%-60% of the run time is spent on the diagonal computation, which is even more than the linear solver requires (32% + 15% for GMG). The likely reason is that we call the local_vmult action once per DoF on each cell, instead of once per cell as usual.
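
To make the cost argument concrete, here is a minimal sketch outside deal.II. It assumes a toy 1D problem with linear elements; `CellOperator`, `vmult` and `compute_diagonal` are hypothetical stand-ins, not the code of this project. It only counts cell operations: one per cell for an operator application, but one per local DoF per cell when the diagonal is assembled from unit vectors.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a matrix-free cell operation: applies the
// stiffness matrix (1/h) * [[1, -1], [-1, 1]] of a 1D linear element in
// place and counts how often it is called.
struct CellOperator
{
  double      h       = 1.0;
  std::size_t n_calls = 0;

  void apply(std::array<double, 2> &u)
  {
    ++n_calls;
    const double flux = (u[0] - u[1]) / h;
    u                 = {flux, -flux};
  }
};

// Operator application: a single cell operation per cell.
// src and dst hold one entry per node, i.e. n_cells + 1 entries.
void vmult(CellOperator              &op,
           const std::vector<double> &src,
           std::vector<double>       &dst)
{
  assert(src.size() == dst.size() && src.size() >= 2);
  std::fill(dst.begin(), dst.end(), 0.0);
  for (std::size_t cell = 0; cell + 1 < src.size(); ++cell)
    {
      std::array<double, 2> u = {src[cell], src[cell + 1]};
      op.apply(u);
      dst[cell] += u[0];
      dst[cell + 1] += u[1];
    }
}

// Diagonal via unit vectors: one cell operation per local DoF per cell.
std::vector<double> compute_diagonal(CellOperator &op, std::size_t n_cells)
{
  std::vector<double> diagonal(n_cells + 1, 0.0);
  for (std::size_t cell = 0; cell < n_cells; ++cell)
    for (std::size_t i = 0; i < 2; ++i)
      {
        std::array<double, 2> e = {0.0, 0.0};
        e[i]                    = 1.0; // i-th local unit vector
        op.apply(e);
        diagonal[cell + i] += e[i]; // keep only the diagonal entry
      }
  return diagonal;
}
```

With `dofs_per_cell` local DoFs, the diagonal therefore costs about `dofs_per_cell` operator applications, which in 3D grows quickly with the polynomial degree.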

Replacing the MatrixFreeTools implementation with the simplest solution I am aware of (possible because I don't have any hanging nodes) reduces this to 45% of the total time, which is still too much in my opinion:

  data->template cell_loop<VectorType, int>(
    [this](const auto &data, auto &dst, const auto &, const auto &cell_range) {
      // Initialize data structures
      const VectorizedArrayType one  = make_vectorized_array<Number>(1.);
      const VectorizedArrayType zero = make_vectorized_array<Number>(0.);
      FECellIntegrator          phi(data);
      AlignedVector<VectorizedArrayType> local_diagonal_vector(
        phi.dofs_per_cell);
      for (unsigned int cell = cell_range.first; cell < cell_range.second;
           ++cell)
        {
          phi.reinit(cell);
          // Loop over all DoFs and set dof values to zero everywhere but i-th
          // DoF. With this input (instead of read_dof_values()) we do the
          // action and store the result in a diagonal vector
          for (unsigned int i = 0; i < phi.dofs_per_cell; ++i)
            {
              for (unsigned int j = 0; j < phi.dofs_per_cell; ++j)
                phi.begin_dof_values()[j] = zero;

              phi.begin_dof_values()[i] = one;
              do_operation_on_cell(phi);
              local_diagonal_vector[i] = phi.begin_dof_values()[i];
            }

          // Cannot handle hanging nodes
          for (unsigned int i = 0; i < phi.dofs_per_cell; ++i)
            phi.begin_dof_values()[i] = local_diagonal_vector[i];
          phi.distribute_local_to_global(dst);
        }
    },
    inverse_diagonal_vector,
    dummy);

Maybe I can get rid of the innermost call to do_operation_on_cell, i.e., pull it out of the loop here. This requires some investigation.
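
One possible direction, sketched under a strong assumption: if the cell operation can be written as A = B^T D B, with B holding shape function values (or gradients) at the quadrature points and D a diagonal matrix of coefficients times quadrature weights, then A_ii = sum_q B_qi^2 D_q and no operator application per DoF is needed. Whether do_operation_on_cell has this structure is not clear from the code above; `CellData` and the functions below are hypothetical and only compare the two approaches on a dense toy cell.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical cell data: B is a row-major n_q x n_dofs matrix,
// D holds one scalar weight per quadrature point.
struct CellData
{
  std::size_t         n_q;
  std::size_t         n_dofs;
  std::vector<double> B; // B[q * n_dofs + i]
  std::vector<double> D; // D[q]
};

// Reference cell operation u <- B^T D B u (evaluate, scale, integrate).
void apply_cell_operator(const CellData &c, std::vector<double> &u)
{
  assert(u.size() == c.n_dofs);
  std::vector<double> at_q(c.n_q, 0.0);
  for (std::size_t q = 0; q < c.n_q; ++q)
    {
      for (std::size_t i = 0; i < c.n_dofs; ++i)
        at_q[q] += c.B[q * c.n_dofs + i] * u[i];
      at_q[q] *= c.D[q];
    }
  for (std::size_t i = 0; i < c.n_dofs; ++i)
    {
      u[i] = 0.0;
      for (std::size_t q = 0; q < c.n_q; ++q)
        u[i] += c.B[q * c.n_dofs + i] * at_q[q];
    }
}

// Unit-vector approach: one full cell operation per local DoF.
std::vector<double> diagonal_by_unit_vectors(const CellData &c)
{
  std::vector<double> diagonal(c.n_dofs, 0.0);
  for (std::size_t i = 0; i < c.n_dofs; ++i)
    {
      std::vector<double> e(c.n_dofs, 0.0);
      e[i] = 1.0;
      apply_cell_operator(c, e);
      diagonal[i] = e[i];
    }
  return diagonal;
}

// Direct approach: A_ii = sum_q B_qi^2 * D_q, no cell operation at all.
std::vector<double> diagonal_direct(const CellData &c)
{
  std::vector<double> diagonal(c.n_dofs, 0.0);
  for (std::size_t q = 0; q < c.n_q; ++q)
    for (std::size_t i = 0; i < c.n_dofs; ++i)
      {
        const double b = c.B[q * c.n_dofs + i];
        diagonal[i] += b * b * c.D[q];
      }
  return diagonal;
}
```

If the coefficient at each quadrature point is a tensor rather than a scalar, the same idea still applies per quadrature point, but the formula above would need to be adapted.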

@davidscn davidscn added the deal.II Related to deal.II label Aug 10, 2021