You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
When running cases in 3D, ~55%-60% of the time is spent on the diagonal computation, which is even more than the linear solver (32% + 15% for GMG) requires. The reason is probably that we call the local_vmult action per dof on each cell instead of each cell, as usual.
Replacing the MatrixFreeTools implementation (I don't have any hanging nodes) by the simplest possible solution I am aware of boils it down to 45% of the total time, which is still too much much in my opinion
data->template cell_loop<VectorType, int>(
[this](constauto &data, auto &dst, constauto &, constauto &cell_range) {
// Initialize data structuresconst VectorizedArrayType one = make_vectorized_array<Number>(1.);
const VectorizedArrayType zero = make_vectorized_array<Number>(0.);
FECellIntegrator phi(data);
AlignedVector<VectorizedArrayType> local_diagonal_vector(
phi.dofs_cell;
for (unsignedint cell = cell_range.first; cell < cell_range.second;
++cell)
{
phi.reinit(cell);
// Loop over all DoFs and set dof values to zero everywhere but i-th// DoF. With this input (instead of read_dof_values()) we do the// action and store the result in a diagonal vectorfor (unsignedint i = 0; i < phi.dofs_per_cell; ++i)
{
for (unsignedint j = 0; j < phi.dofs_per_cell; ++j)
phi.begin_dof_values()[j] = zero;
phi.begin_dof_values()[i] = one;
do_operation_on_cell(phi);
local_diagonal_vector[i] = phi.begin_dof_values()[i];
}
// Cannot handle hanging nodesfor (unsignedint i = 0; i < phi.dofs_per_cell; ++i)
phi.begin_dof_values()[i] = local_diagonal_vector[i];
phi.distribute_local_to_global(dst);
}
},
inverse_diagonal_vector,
dummy);
Maybe I can get rid of the most inner call to do_operation_on_cell, i.e., pull it out of the loop here. This requires some investigations.
The text was updated successfully, but these errors were encountered:
When running cases in 3D,
~55%-60%
of the time is spent on the diagonal computation, which is even more than the linear solver (32%
+15%
for GMG) requires. The reason is probably that we call thelocal_vmult
action per dof on each cell instead of each cell, as usual.Replacing the
MatrixFreeTools
implementation (I don't have any hanging nodes) by the simplest possible solution I am aware of boils it down to45%
of the total time, which is still too much much in my opinionMaybe I can get rid of the most inner call to
do_operation_on_cell
, i.e., pull it out of the loop here. This requires some investigations.The text was updated successfully, but these errors were encountered: