Refactor: RuntimeTransaction Instruction Processing #3490

Closed
wants to merge 5 commits into from

@apfitzge apfitzge commented Nov 5, 2024

Problem

  • Currently RuntimeTransaction loops over the instructions multiple times, each time doing minor processing to cache metadata
  • It would be more efficient to do a single loop over the instructions
  • Previously this work was not done because we did not want to duplicate code OR run into inefficiencies when we only wanted a single instruction-detail item

Summary of Changes

  • Top-level ProgramIdFlag to assign labels to program keys
  • Convert from top-level to "inner" flags when needed
  • Trait for processing instructions individually and getting output
  • Generic cached processor for caching converted flags
  • Top-level processor to do all processing in a single loop
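
The changes listed above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the PR's code: `Instruction`, `classify`, `SignatureCounter`, and `ComputeBudgetCounter` are hypothetical stand-ins, program ids are plain strings rather than `Pubkey`s, and the trait is simplified (no generic flag parameter, `Output` instead of `OUTPUT`, `String` instead of `TransactionError`).

```rust
/// Top-level label assigned to a program key (variants are illustrative).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ProgramIdFlag {
    NoMatch,
    ComputeBudget,
    Secp256k1,
    Ed25519,
}

/// Hypothetical stand-in for an instruction; real code would use `Pubkey`
/// and `SVMInstruction`.
pub struct Instruction {
    pub program_id: &'static str,
    pub data: Vec<u8>,
}

/// Classify a program id exactly once per instruction (made-up id strings).
pub fn classify(program_id: &str) -> ProgramIdFlag {
    match program_id {
        "compute_budget" => ProgramIdFlag::ComputeBudget,
        "secp256k1" => ProgramIdFlag::Secp256k1,
        "ed25519" => ProgramIdFlag::Ed25519,
        _ => ProgramIdFlag::NoMatch,
    }
}

/// Simplified version of the processing trait discussed in this PR.
pub trait InstructionProcessor {
    type Output;
    fn process_instruction(
        &mut self,
        flag: ProgramIdFlag,
        index: usize,
        ix: &Instruction,
    ) -> Result<(), String>;
    fn finalize(self) -> Self::Output;
}

#[derive(Default)]
pub struct SignatureCounter {
    num_signatures: u64,
}

impl InstructionProcessor for SignatureCounter {
    type Output = u64;
    fn process_instruction(&mut self, flag: ProgramIdFlag, _index: usize, ix: &Instruction) -> Result<(), String> {
        if matches!(flag, ProgramIdFlag::Secp256k1 | ProgramIdFlag::Ed25519) {
            // Assumption for this sketch: the first data byte is the signature count.
            self.num_signatures += u64::from(ix.data.first().copied().unwrap_or(0));
        }
        Ok(())
    }
    fn finalize(self) -> Self::Output {
        self.num_signatures
    }
}

#[derive(Default)]
pub struct ComputeBudgetCounter {
    count: usize,
}

impl InstructionProcessor for ComputeBudgetCounter {
    type Output = usize;
    fn process_instruction(&mut self, flag: ProgramIdFlag, _index: usize, _ix: &Instruction) -> Result<(), String> {
        if flag == ProgramIdFlag::ComputeBudget {
            self.count += 1;
        }
        Ok(())
    }
    fn finalize(self) -> Self::Output {
        self.count
    }
}

/// A pair of processors is itself a processor, so both run in the same loop.
impl<A: InstructionProcessor, B: InstructionProcessor> InstructionProcessor for (A, B) {
    type Output = (A::Output, B::Output);
    fn process_instruction(&mut self, flag: ProgramIdFlag, index: usize, ix: &Instruction) -> Result<(), String> {
        self.0.process_instruction(flag, index, ix)?;
        self.1.process_instruction(flag, index, ix)
    }
    fn finalize(self) -> Self::Output {
        (self.0.finalize(), self.1.finalize())
    }
}

/// Single pass: classify each program id once, then hand the flag to the processor.
pub fn process_instructions<P: InstructionProcessor>(
    mut processor: P,
    instructions: &[Instruction],
) -> Result<P::Output, String> {
    for (index, ix) in instructions.iter().enumerate() {
        let flag = classify(ix.program_id);
        processor.process_instruction(flag, index, ix)?;
    }
    Ok(processor.finalize())
}
```

Passing `(SignatureCounter::default(), ComputeBudgetCounter::default())` to `process_instructions` yields both results from one traversal; adding a third kind of detail would mean another processor and a wider tuple impl.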

Fixes #

@apfitzge apfitzge added the noCI Suppress CI on this Pull Request label Nov 5, 2024
.program_instructions_iter()
.map(|(program_id, ix)| (program_id, SVMInstruction::from(ix))),
);
let (precompile_signature_details, compute_budget_instruction_details) =
Author

@tao-stones this is the top-level processing function: a single loop over the instructions that we can use to get all the details we want. Very easy to add more to this!

@tao-stones tao-stones left a comment

First impression is neat but perhaps overengineered. Would you give the Visitor design pattern another thought? To me Visitor fits the needs nicely: we have a sequence of data, each item of which needs to be processed independently by different parties (CB, Sig, Builtin, ...), optionally updating some shared dataspace/cache sequentially.
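
For readers unfamiliar with the suggestion, a Visitor-style version might look roughly like the sketch below. It is only an illustration under stated assumptions: `Instruction`, `InstructionVisitor`, `ComputeBudgetVisitor`, `SignatureVisitor`, and `visit_instructions` are hypothetical names, and program ids are plain strings rather than `Pubkey`s. Note that each visitor inspects the program id on its own.

```rust
/// Hypothetical stand-in for an instruction (not the real SVM type).
pub struct Instruction {
    pub program_id: &'static str,
    pub data: Vec<u8>,
}

/// Each interested party implements this and is shown every instruction.
pub trait InstructionVisitor {
    fn visit(&mut self, index: usize, ix: &Instruction);
}

#[derive(Default)]
pub struct ComputeBudgetVisitor {
    pub count: usize,
}

impl InstructionVisitor for ComputeBudgetVisitor {
    fn visit(&mut self, _index: usize, ix: &Instruction) {
        // The visitor checks the program id itself.
        if ix.program_id == "compute_budget" {
            self.count += 1;
        }
    }
}

#[derive(Default)]
pub struct SignatureVisitor {
    pub num_signatures: u64,
}

impl InstructionVisitor for SignatureVisitor {
    fn visit(&mut self, _index: usize, ix: &Instruction) {
        // A second, independent check of the same program id.
        if ix.program_id == "secp256k1" || ix.program_id == "ed25519" {
            // Assumption for this sketch: the first data byte is the signature count.
            self.num_signatures += u64::from(ix.data.first().copied().unwrap_or(0));
        }
    }
}

/// One loop over the instructions; every visitor sees every instruction.
pub fn visit_instructions(
    instructions: &[Instruction],
    visitors: &mut [&mut dyn InstructionVisitor],
) {
    for (index, ix) in instructions.iter().enumerate() {
        for visitor in visitors.iter_mut() {
            visitor.visit(index, ix);
        }
    }
}
```

The appeal is that adding a new party only requires a new `InstructionVisitor` impl; the cost is that program-id matching is repeated inside every visitor.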

impl InstructionProcessor<ComputeBudgetFlag> for ComputeBudgetInstructionDetails {
type OUTPUT = Self;

#[inline]

is this function too large for inlining?

) -> Result<(), TransactionError>;

/// Finalize the processor and return the output.
fn finalize(self) -> Self::OUTPUT;

Any way to avoid this additional step?

Author

Yeah, I don't think this is necessary, actually.

type OUTPUT;

/// Process instructions individually given a `flag` that is derived from
/// the program id.

Is Flag necessary? Can we use the Visitor pattern, where each "type" implements process_instruction?

T: for<'k> From<&'k ProgramIdFlag>,
P: InstructionProcessor<T>,
{
#[inline]

this inline-ness is also unconventional

Author

it can be significantly faster, particularly for the individual processors instead of the "all together" one.

P: InstructionProcessor<T>,
{
#[inline]
pub fn process_instructions<'b>(

Clever, but a bit over-engineered to me. The function fits nicely with the Visitor design pattern; I personally prefer Visitor where possible, for its readability and maintainability.

@@ -17,6 +22,36 @@ pub fn process_compute_budget_instructions<'a>(
.sanitize_and_convert_to_compute_budget_limits()
}

type RuntimeTransactionInstructionsProcessor =
(PrecompileSignatureDetails, ComputeBudgetInstructionDetails);

As it is, instructionS_processor isn't very scalable.

compute_budget_details.process_instruction(&flag.into(), instruction_index, instruction)
}

fn finalize(self) -> Self::OUTPUT {

it doesn't really do anything tangible, maybe we can remove this extra step in processing

}

/// Self translation for [ProgramIdFlag].
impl From<&ProgramIdFlag> for ProgramIdFlag {

Why is this needed?

Author

conforming with my generic code. May be able to do some sort of better generic in the code, but copying a u8 for now seems fine to me.
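
A small sketch of why the self-translation is needed under a bound like `T: for<'k> From<&'k ProgramIdFlag>`: the standard library's blanket `impl<T> From<T> for T` covers conversion from an owned value, not from a reference, so the top-level flag does not satisfy the bound without an explicit impl. The enum variants (other than `NoMatch`) and the `convert_flag` helper below are assumptions made for illustration, not code from this PR.

```rust
/// Top-level flag (variants other than `NoMatch` are illustrative).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ProgramIdFlag {
    NoMatch,
    ComputeBudget,
    Secp256k1,
    Ed25519,
}

/// "Inner" flag used by a signature-details processor (variants assumed).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SignatureDetailsFlag {
    NoMatch,
    Secp256k1,
    Ed25519,
}

/// Translate the top-level flag into the inner flag.
impl From<&ProgramIdFlag> for SignatureDetailsFlag {
    fn from(flag: &ProgramIdFlag) -> Self {
        match flag {
            ProgramIdFlag::Secp256k1 => Self::Secp256k1,
            ProgramIdFlag::Ed25519 => Self::Ed25519,
            ProgramIdFlag::NoMatch | ProgramIdFlag::ComputeBudget => Self::NoMatch,
        }
    }
}

/// Self translation: required because `From<&ProgramIdFlag>` is not provided
/// automatically for `ProgramIdFlag` itself. It is just a cheap copy.
impl From<&ProgramIdFlag> for ProgramIdFlag {
    fn from(flag: &ProgramIdFlag) -> Self {
        *flag
    }
}

/// Hypothetical generic helper using the same bound as the generic code.
pub fn convert_flag<T>(flag: &ProgramIdFlag) -> T
where
    T: for<'k> From<&'k ProgramIdFlag>,
{
    T::from(flag)
}
```

With both impls in place, `convert_flag::<ProgramIdFlag>` and `convert_flag::<SignatureDetailsFlag>` go through the same generic path.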

/// Program flag for determining how to process instructions for finding the
/// signature details.
pub enum SignatureDetailsFlag {
NoMatch,

Having to specify NoMatch at both levels is a redundancy that does not exist with Visitor.

Author

I plan to write some benchmarks for this, but Visitor is inefficient because we need to re-check the key for each visitor.

If we add more stuff here, like builtins (which is really my goal right now for cost-model stuff), then we end up doing more work with the Visitor pattern.
Let's consider when we add BUILTINs.
I already know from my compute_budget and precompile_details whether an instruction is one of those or not.
If it is one of them, then I don't need to do a hash-table lookup; I already know the cost, because I've already identified the key.

Same even as is. If I see that something is a compute-budget key, I don't need to check whether it's secp or ed25519; it isn't, because it's compute-budget.
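
The argument above can be illustrated with a rough sketch in which a one-time classification lets the cost step skip the hash-table lookup for already-identified keys. Everything here is hypothetical: the program-id strings, the `classify` and `total_cost` helpers, and the cost constants (placeholder values, not real costs).

```rust
use std::collections::HashMap;

/// Top-level flag (variants are illustrative).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ProgramIdFlag {
    NoMatch,
    ComputeBudget,
    Secp256k1,
    Ed25519,
}

// Placeholder values for illustration only; not real costs.
const COMPUTE_BUDGET_COST: u64 = 10;
const PRECOMPILE_COST: u64 = 20;

/// Classify a program id once (made-up id strings).
pub fn classify(program_id: &str) -> ProgramIdFlag {
    match program_id {
        "compute_budget" => ProgramIdFlag::ComputeBudget,
        "secp256k1" => ProgramIdFlag::Secp256k1,
        "ed25519" => ProgramIdFlag::Ed25519,
        _ => ProgramIdFlag::NoMatch,
    }
}

/// Returns (total cost, number of hash-table lookups performed).
/// Only program ids that were not already identified fall back to the table.
pub fn total_cost(program_ids: &[&str], other_builtins: &HashMap<&str, u64>) -> (u64, usize) {
    let mut total = 0;
    let mut lookups = 0;
    for &program_id in program_ids {
        total += match classify(program_id) {
            // Key already identified: the cost is known without a lookup.
            ProgramIdFlag::ComputeBudget => COMPUTE_BUDGET_COST,
            ProgramIdFlag::Secp256k1 | ProgramIdFlag::Ed25519 => PRECOMPILE_COST,
            // Unknown so far: consult the table of other builtins.
            ProgramIdFlag::NoMatch => {
                lookups += 1;
                other_builtins.get(program_id).copied().unwrap_or(0)
            }
        };
    }
    (total, lookups)
}
```

In a Visitor arrangement where a separate cost visitor has no access to the earlier classification, every instruction would go through the table lookup (or a repeated key comparison) regardless of what the other visitors had already determined.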

Author

apfitzge commented Nov 7, 2024

First impression is neat but perhaps overengineered. Would you give the Visitor design pattern another thought? To me Visitor fits the needs nicely: we have a sequence of data, each item of which needs to be processed independently by different parties (CB, Sig, Builtin, ...), optionally updating some shared dataspace/cache sequentially.

Appreciate the feedback, but I will have to disagree with one of your descriptions here: "processed independently".
See my comments above. The identification/flagging of program ids is not independent for each processor/visitor.

Author

Closing this; decided to take a more incremental approach: add in the individual processors we need, then show the improvement with something like this later on.

@apfitzge apfitzge closed this Nov 12, 2024