
feat(stats): Arbitrum charts #1154

Merged: 17 commits, Dec 18, 2024
Conversation

@bragov4ik (Contributor) commented Dec 13, 2024

#1079

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced a new environment variable STATS__ENABLE_ALL_ARBITRUM for enabling Arbitrum-specific charts.
    • Added new metrics: total_operational_txns, new_operational_txns, and operational_txns_growth for enhanced transaction tracking.
    • Implemented a graphical user interface for visualizing cron schedules.
    • Expanded operational transaction tracking capabilities with new modules and charts, including NewOperationalTxns and OperationalTxnsGrowth.
    • Introduced a new update group for operational transactions and enhanced scheduling capabilities.
  • Bug Fixes

    • Enhanced error handling and logging for operational transaction settings.
  • Documentation

    • Updated README to clarify configuration options and new environment variable, including usage instructions for the new GUI tool.
  • Tests

    • Added new assertions to existing tests for operational transaction metrics.
  • Chores

    • Updated dependencies in requirements.txt to support new GUI functionalities.

@bragov4ik bragov4ik requested a review from sevenzing December 13, 2024 15:42

coderabbitai bot commented Dec 13, 2024

Warning

Rate limit exceeded

@bragov4ik has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 16 minutes and 21 seconds before requesting another review.


📥 Commits

Reviewing files that changed from the base of the PR and between cff2137 and 68df67b.

📒 Files selected for processing (1)
  • stats/stats/src/charts/counters/total_operational_txns.rs (1 hunks)

Walkthrough

The pull request introduces several enhancements to the stats module, primarily focused on operational transactions and their visualization. A new environment variable, STATS__ENABLE_ALL_ARBITRUM, documented in stats/README.md, lets users conveniently enable Arbitrum-specific charts. The charts.json, layout.json, and update_groups.json files are updated with new metrics and schedules for operational transactions: total_operational_txns, new_operational_txns, and operational_txns_growth.

Additionally, a new Python script, find_free_timeslot.py, provides a graphical user interface for visualizing cron schedules. The Rust files are modified to accommodate new update groups and settings for operational transactions, new modules and type aliases are introduced for better data handling, and tests are expanded to verify the new functionality. Overall, the changes improve the tracking and reporting of operational transactions without altering existing behavior.

Possibly related PRs

Poem

🐰 In the meadow where data flows,
New charts bloom, as the rabbit knows.
Operational tales in colors bright,
With Arbitrum's magic, we take flight!
Metrics dance, in joyful play,
Hopping along the data way! 🌼📊




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (12)
stats/stats/src/data_source/kinds/data_manipulation/resolutions/average.rs (1)

Line range hint 89-134: Consider enhancing error logging for debugging

The implementation looks solid with proper handling of edge cases and weighted averages. However, the warning message for missing weights could be more detailed to aid debugging.

Consider enhancing the warning message:

-                    EitherOrBoth::Left(v) => tracing::warn!(
-                        value = v,
-                        timespan =? h_res,
-                        "average value for higher res timespan is present while weight is not (weight is zero).\
-                         this is very likely incorrect, please investigate.",
-                    ),
+                    EitherOrBoth::Left(v) => tracing::warn!(
+                        value = v,
+                        timespan =? h_res,
+                        current_l_res =? current_l_res,
+                        "average value for higher res timespan is present while weight is not (weight is zero).\
+                         This might indicate data inconsistency in the time range. Please investigate.",
+                    ),
stats/stats/src/charts/lines/operational_txns_growth.rs (3)

25-27: Consider using consistent naming conventions for chart names.

The chart name "operationalTxnsGrowth" might be more consistent with Rust naming conventions if written in snake_case as "operational_txns_growth". This can improve readability and maintain consistency across the codebase.


51-69: Add documentation comments to public type aliases for clarity.

Adding /// documentation comments to the public type aliases like OperationalTxnsGrowth, OperationalTxnsGrowthWeekly, etc., will help users understand their purpose and how to use them. This is especially helpful for complex type definitions.


77-78: Avoid ignoring tests by using test configurations or mocks.

The test update_operational_txns_growth is currently ignored due to the need for a database connection. Consider configuring a test database environment or using mocks to enable this test to run automatically during testing. This ensures the test suite remains comprehensive and issues are caught early.

stats/stats/src/charts/lines/new_operational_txns.rs (1)

119-121: Enable the ignored test to ensure functionality is verified

The test update_new_operational_txns is currently marked as ignored due to database dependencies. Consider setting up a test database environment or using mocks to enable this test. Running the test will help verify that the new functionality works as intended and maintain code reliability.

stats/config/utils/find_free_timeslot.py (2)

247-250: Remove leftover fix comments for cleaner code

The comments # Fix 1 and # Fix 2 on lines 247-250 indicate temporary fixes and should be removed to clean up the code. Inline comments should be used to explain the code's purpose rather than referencing fixes.

Apply this diff to remove unnecessary comments:

-    # Fix 1: Check if we have a tasks tag before accessing index 3
    tasks_tag = next((tag for tag in tags if tag.startswith('tasks_')), None)
    if tasks_tag:
        tasks = tasks_tag[6:].split('/')  # Change separator to '/'
-        # Fix 2: Change separator to '/'

236-258: Optimize on_hover method for improved performance

The on_hover method currently scans for overlapping items on every mouse movement, which may cause performance issues with a large number of tasks. Consider optimizing by caching the computations or limiting the updates to significant movements or when the hovered time slot changes.

stats/stats/src/charts/counters/total_operational_txns.rs (1)

1-18: Add documentation for the calculation logic

Please add documentation explaining:

  1. What defines an operational transaction
  2. Why total_txns - total_blocks equals operational transactions
  3. How timespan mismatches are handled

Also applies to: 39-67
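The documentation being requested can be anchored to the arithmetic itself. The following is a minimal sketch of that subtraction with the doc comment the review asks for; the function name, the saturating behavior, and the one-system-transaction-per-block rationale are illustrative assumptions, not the crate's actual implementation:

```rust
/// Operational transactions exclude block-producing system transactions.
/// Assuming each block carries exactly one system transaction (as on
/// Arbitrum-style chains), `total_txns - total_blocks` yields the
/// user-initiated ("operational") count. Saturating subtraction guards
/// against a transient indexing state where blocks run ahead of txns.
fn total_operational_txns(total_txns: u64, total_blocks: u64) -> u64 {
    total_txns.saturating_sub(total_blocks)
}

fn main() {
    // Mirrors the test case cited later in this review: 48 - 13 = 35.
    assert_eq!(total_operational_txns(48, 13), 35);
    // Saturation: a temporary mismatch yields 0 rather than a panic.
    assert_eq!(total_operational_txns(10, 12), 0);
}
```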

stats/stats/src/update_groups.rs (1)

117-124: Consider separating operational transaction metrics into a dedicated group.

Currently, operational transaction metrics are mixed with block metrics in NewBlocksGroup. Consider creating a separate update group for operational transaction metrics to:

  • Maintain better separation of concerns
  • Allow for independent update scheduling
  • Improve maintainability
 construct_update_group!(NewBlocksGroup {
     charts: [
         NewBlocks,
         NewBlocksWeekly,
         NewBlocksMonthly,
         NewBlocksYearly,
+    ]
+});
+
+construct_update_group!(OperationalTxnsGroup {
+    charts: [
         NewOperationalTxns,
         NewOperationalTxnsWeekly,
         NewOperationalTxnsMonthly,
         NewOperationalTxnsYearly,
         OperationalTxnsGrowth,
         OperationalTxnsGrowthWeekly,
         OperationalTxnsGrowthMonthly,
         OperationalTxnsGrowthYearly
     ]
 });
stats/stats-server/src/settings.rs (1)

131-159: Add test coverage for handle_enable_all_arbitrum.

The function implementation looks good, but it would benefit from test coverage similar to the existing test for handle_disable_internal_transactions.

Would you like me to help generate a test case for this function?

stats/stats/src/data_processing.rs (2)

88-139: Consider adding input validation.

While the implementation is efficient, consider adding validation to ensure input vectors are sorted. This would prevent potential runtime errors if unsorted data is provided.

 pub fn zip_same_timespan<T, LeftValue, RightValue>(
     left: Vec<TimespanValue<T, LeftValue>>,
     right: Vec<TimespanValue<T, RightValue>>,
 ) -> Vec<(T, EitherOrBoth<LeftValue, RightValue>)>
 where
     T: Ord,
 {
+    // Validate input vectors are sorted
+    debug_assert!(left.windows(2).all(|w| w[0].timespan <= w[1].timespan), 
+        "Left vector must be sorted");
+    debug_assert!(right.windows(2).all(|w| w[0].timespan <= w[1].timespan), 
+        "Right vector must be sorted");
+
     let mut left = left.into_iter().peekable();
     let mut right = right.into_iter().peekable();

366-432: Consider adding an edge case test.

The test coverage is good, but consider adding a test case for adjacent dates to verify the handling of consecutive timespan values.

     #[test]
     fn zip_same_timespan_works() {
         // ... existing test cases ...
+
+        // Test adjacent dates
+        assert_eq!(
+            zip_same_timespan(
+                vec![d_v("2024-07-01", "1L"), d_v("2024-07-02", "2L")],
+                vec![d_v("2024-07-02", "2R"), d_v("2024-07-03", "3R")],
+            ),
+            vec![
+                (d("2024-07-01"), EitherOrBoth::Left("1L".to_string())),
+                (d("2024-07-02"), EitherOrBoth::Both("2L".to_string(), "2R".to_string())),
+                (d("2024-07-03"), EitherOrBoth::Right("3R".to_string())),
+            ]
+        );
     }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 559b25a and af6fc97.

📒 Files selected for processing (20)
  • stats/README.md (1 hunks)
  • stats/config/charts.json (2 hunks)
  • stats/config/layout.json (2 hunks)
  • stats/config/update_groups.json (1 hunks)
  • stats/config/utils/find_free_timeslot.py (1 hunks)
  • stats/config/utils/requirements.txt (1 hunks)
  • stats/stats-server/src/runtime_setup.rs (2 hunks)
  • stats/stats-server/src/server.rs (2 hunks)
  • stats/stats-server/src/settings.rs (5 hunks)
  • stats/stats-server/tests/it/lines.rs (1 hunks)
  • stats/stats/src/charts/counters/mod.rs (2 hunks)
  • stats/stats/src/charts/counters/total_blocks.rs (2 hunks)
  • stats/stats/src/charts/counters/total_operational_txns.rs (1 hunks)
  • stats/stats/src/charts/counters/total_txns.rs (2 hunks)
  • stats/stats/src/charts/lines/mod.rs (2 hunks)
  • stats/stats/src/charts/lines/new_operational_txns.rs (1 hunks)
  • stats/stats/src/charts/lines/operational_txns_growth.rs (1 hunks)
  • stats/stats/src/data_processing.rs (3 hunks)
  • stats/stats/src/data_source/kinds/data_manipulation/resolutions/average.rs (2 hunks)
  • stats/stats/src/update_groups.rs (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • stats/config/utils/requirements.txt
🔇 Additional comments (26)
stats/stats/src/data_source/kinds/data_manipulation/resolutions/average.rs (2)

Line range hint 2-17: LGTM: Clean import reorganization

The refactoring to move zip_same_timespan to the data_processing module improves code organization by centralizing common data processing utilities.


Line range hint 146-315: LGTM: Comprehensive test coverage

The test suite thoroughly covers:

  • Basic functionality with weekly_average_from_works
  • Range queries with weekly_average_source_queries_correct_range
  • Edge cases with missing data in average_weekly_works_with_missing_avg and average_weekly_works_with_missing_weight
stats/stats/src/charts/counters/mod.rs (4)

11-11: LGTM!

The addition of the total_operational_txns module is correctly implemented.


26-26: LGTM!

The export of TotalBlocksInt alongside TotalBlocks is appropriate and enhances usability.


30-30: LGTM!

The export of TotalOperationalTxns is correctly added.


32-32: LGTM!

The export of TotalTxnsInt alongside TotalTxns is appropriate and consistent with other exports.

stats/config/layout.json (2)

15-15: LGTM!

The addition of "total_operational_txns" to the counters_order array is properly formatted.


40-41: LGTM!

The inclusion of "new_operational_txns" and "operational_txns_growth" in the "Transactions" category enhances the metrics and is correctly implemented.

stats/config/update_groups.json (1)

10-10: Verify schedule timing and potential conflicts

The new schedule runs at 5 minutes past 1:00 AM daily. Let's verify:

  1. If this timing conflicts with other jobs
  2. If daily updates are sufficient for operational transactions tracking
✅ Verification successful

Let's analyze the content of the find_free_timeslot.py script to understand how it validates schedules.


Based on the analysis of the schedules and the find_free_timeslot.py implementation, I can now provide a final response:

Schedule timing is appropriate with no conflicts

The new schedule for total_operational_txns_group running at 5 minutes past 1:00 AM is well-positioned because:

  • It doesn't overlap with any high-frequency jobs (closest ones are average_block_time_group at 10-minute intervals)
  • The 1 AM timeslot has minimal concurrent jobs, with only completed_txns_group (every 3 hours) potentially overlapping occasionally
  • Daily updates for operational transactions are sufficient as this appears to be an aggregation metric
  • The timing allows for processing during off-peak hours
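The overlap check described above can be approximated without a GUI. This sketch handles only a simplified subset of cron syntax (plain numbers, `*`, and `*/n` steps in the minute and hour fields); the real schedules in update_groups.json may use richer expressions, so this is an illustration of the conflict test, not a replacement for find_free_timeslot.py:

```rust
/// Expand the minute and hour fields of a simplified cron expression
/// ("M H * * *") into the sorted set of minutes-of-day at which it fires.
fn fire_minutes(expr: &str) -> Vec<u32> {
    let fields: Vec<&str> = expr.split_whitespace().collect();
    let expand = |field: &str, max: u32| -> Vec<u32> {
        if let Some(step) = field.strip_prefix("*/") {
            let step: u32 = step.parse().unwrap();
            (0..max).step_by(step as usize).collect()
        } else if field == "*" {
            (0..max).collect()
        } else {
            vec![field.parse().unwrap()]
        }
    };
    let minutes = expand(fields[0], 60);
    let hours = expand(fields[1], 24);
    let mut out = Vec::new();
    for h in &hours {
        for m in &minutes {
            out.push(h * 60 + m);
        }
    }
    out.sort();
    out
}

/// Two schedules conflict if they ever fire in the same minute of the day.
fn conflicts(a: &str, b: &str) -> bool {
    let (fa, fb) = (fire_minutes(a), fire_minutes(b));
    fa.iter().any(|m| fb.contains(m))
}
```

For example, the new `5 1 * * *` slot does not collide with a `0 */3 * * *` job, since the latter fires only on the hour.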
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Analyze cron schedule conflicts
# Use the find_free_timeslot.py script mentioned in the AI summary to visualize schedules

# Find the script
fd -t f "find_free_timeslot.py"

# Extract all schedules to check for conflicts
jq -r '.schedules | to_entries | .[] | "\(.key): \(.value)"' stats/config/update_groups.json

Length of output: 2057


Script:

#!/bin/bash
# Check the content of the find_free_timeslot script
cat stats/config/utils/find_free_timeslot.py

Length of output: 10822

stats/stats/src/charts/counters/total_operational_txns.rs (1)

49-66: Validate operational transactions calculation logic

The current implementation assumes operational transactions = total transactions - total blocks. This assumption needs validation:

  1. Is this calculation always accurate?
  2. Could there be edge cases where this doesn't hold true?
✅ Verification successful

Based on the search results from all our queries, I can now generate the final response:

Calculation of operational transactions appears to be a consistent pattern

The codebase shows a consistent pattern of calculating operational transactions as (total_txns - total_blocks) across multiple components:

  • The same calculation is used in both total and new operational transaction metrics
  • There's an existing test case that validates this logic (48 - 13 = 35)
  • The pattern is used across different time resolutions (daily, weekly, monthly, yearly)
  • The implementation includes proper overflow checks and timespan validation

While the calculation logic lacks explicit documentation, its widespread and consistent usage throughout the codebase, along with test coverage, suggests this is the intended business logic rather than an incorrect assumption.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Search for related documentation or tests validating this calculation

# Search for documentation about operational transactions
rg -i "operational.*transaction" --type md

# Search for related test cases
rg -i "operational.*transaction.*test" --type rust

Length of output: 97


Script:

#!/bin/bash
# Let's broaden our search to understand the context better

# Search for any mentions of operational transactions in rust files
rg -i "operational.*txn" --type rust -A 3

# Look for the test module in the same file
rg "#\[cfg\(test\)\]" stats/stats/src/charts/counters/total_operational_txns.rs -A 10

# Search for documentation comments in the file
rg "///" stats/stats/src/charts/counters/total_operational_txns.rs -A 2

# Look for any related struct or trait definitions
ast-grep --pattern 'struct Properties {
  $$$
}'

Length of output: 12800


Script:

#!/bin/bash
# Let's look for the implementation of NewOperationalTxns to verify if the same calculation logic is used consistently

# Search for the Calculate implementation in new_operational_txns.rs
rg "impl.*Calculate.*for" stats/stats/src/charts/lines/new_operational_txns.rs -A 10

# Look for any SQL queries or database-related code that might validate this calculation
rg -i "select.*operational.*txn" --type rust

# Check for any documentation about the calculation in the parent module
rg "///" stats/stats/src/charts/mod.rs -A 2
rg "///" stats/stats/src/charts/counters/mod.rs -A 2

Length of output: 230

stats/stats/src/charts/counters/total_txns.rs (1)

110-110: Verify integer overflow handling in MapParseTo

The conversion from String to i64 could fail for large transaction counts. Please verify:

  1. How MapParseTo handles parsing errors
  2. Whether i64 is sufficient for the expected transaction volumes
✅ Verification successful

Let me search for the MapParseTo trait implementation to understand how it handles parsing errors.


Let me try a different search pattern to find the MapParseTo implementation.


Let me search for the ParseToFunction implementation to understand how parsing errors are handled.


Based on the findings, I can now provide a final response about the integer overflow and error handling concerns:

MapParseTo safely handles parsing errors

The implementation of MapParseTo properly handles parsing errors by:

  1. Using Rust's built-in parse() for i64 conversion which safely handles overflow
  2. Converting any parsing errors into ChartError::Internal with descriptive messages
  3. Propagating errors through the Result type rather than panicking

The i64 type is sufficient for transaction counts as it can handle up to 9 quintillion transactions (9,223,372,036,854,775,807), which is well beyond realistic blockchain volumes.
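The error-mapping pattern described above can be sketched directly. `ChartError` here is a stand-in for the crate's real error type; the point is that `str::parse::<i64>` already rejects overflow and malformed input, and the failure is converted into a descriptive error rather than a panic:

```rust
/// Stand-in for the crate's internal error type.
#[derive(Debug)]
enum ChartError {
    Internal(String),
}

/// Parse a string count into i64, surfacing overflow or malformed
/// input as ChartError::Internal instead of panicking.
fn parse_count(raw: &str) -> Result<i64, ChartError> {
    raw.parse::<i64>()
        .map_err(|e| ChartError::Internal(format!("failed to parse '{raw}': {e}")))
}
```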

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Search for MapParseTo implementation and usage

# Find MapParseTo implementation
ast-grep --pattern 'impl$$$MapParseTo$$$'

# Search for other String to i64 conversions
rg 'MapParseTo.*i64' --type rust

Length of output: 3467


Script:

#!/bin/bash
# Find MapParseTo trait definition and implementation
ast-grep --pattern 'trait MapParseTo'

# Search for parse_to function implementation
rg 'fn parse_to' --type rust -A 5

# Look for error handling patterns around MapParseTo
rg 'impl.*MapParseTo' --type rust -A 10

Length of output: 600


Script:

#!/bin/bash
# Search for MapParseTo struct/trait definition and implementation
rg "MapParseTo" --type rust -B 2 -A 10

# Look for any error handling related to parsing numbers
rg "parse.*i64" --type rust -B 2 -A 5

# Search for files that might contain the MapParseTo implementation
fd MapParseTo

Length of output: 67250


Script:

#!/bin/bash
# Search for ParseToFunction implementation and error handling
rg "impl.*ParseToFunction" --type rust -B 2 -A 10

# Look for any parse error handling patterns
rg "parse.*error" --type rust -B 2 -A 5

# Search for FromStr implementation that might be used by ParseToFunction
rg "impl.*FromStr" --type rust -B 2 -A 10

Length of output: 20384

stats/stats/src/charts/lines/mod.rs (2)

19-19: LGTM! Module additions follow consistent patterns.

The new modules new_operational_txns and operational_txns_growth are well-organized and follow the established module structure.

Also applies to: 22-22


93-96: LGTM! Public exports maintain consistency.

The exports for both new modules follow the established pattern of exposing variants for different time resolutions (Monthly, Weekly, Yearly).

Also applies to: 103-106

stats/stats/src/update_groups.rs (1)

26-30: LGTM! Well-documented addition to singleton groups.

The addition of TotalOperationalTxns is well-placed and the comments clearly explain the update frequency rationale.

stats/stats-server/src/server.rs (2)

9-9: LGTM: New settings handler import

The import of handle_enable_all_arbitrum aligns with the PR's objective of adding Arbitrum chart support.


69-69: Verify the order of chart configuration modifications

The new Arbitrum handler is called before the internal transactions handler. While this order seems logical, we should verify that there are no dependencies between these operations.

✅ Verification successful

Let me analyze the implementation of both handlers to understand if there are any dependencies between them.


Let me try a different approach to check the implementations.


The order of handlers is safe and independent

After analyzing the implementation of both handlers, I can confirm that they operate on different, non-overlapping sets of charts and settings:

  • handle_enable_all_arbitrum enables specific Arbitrum-related charts (NewOperationalTxns, OperationalTxnsGrowth, TotalOperationalTxns)
  • handle_disable_internal_transactions disables internal transaction ratio check and different charts (NewContracts, LastNewContracts, ContractsGrowth)

Both handlers modify independent configuration settings and can be executed in any order without affecting each other's functionality.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for any potential ordering dependencies between handlers
rg -A 5 "handle_enable_all_arbitrum|handle_disable_internal_transactions" --type rust

Length of output: 2276


Script:

#!/bin/bash
# Check the implementation of both handlers to understand their effects
ast-grep --pattern 'fn handle_enable_all_arbitrum($_) {
  $$$
}'

ast-grep --pattern 'fn handle_disable_internal_transactions($_) {
  $$$
}'

Length of output: 143


Script:

#!/bin/bash
# Get the content of the settings.rs file which contains both handler implementations
cat stats/stats-server/src/settings.rs

Length of output: 11273

stats/stats/src/charts/counters/total_blocks.rs (1)

102-102: LGTM: Clean type alias addition

The TotalBlocksInt type alias provides a clear mapping to i64 and follows Rust type system best practices. This addition supports the new operational transaction calculations without affecting existing functionality.

stats/config/charts.json (2)

41-45: LGTM: Well-structured counter addition

The new total_operational_txns counter is well-defined with a clear title and description. Being disabled by default is a good safety measure.


118-127: LGTM: Comprehensive operational transaction metrics

The new line charts new_operational_txns and operational_txns_growth provide a complete view of operational transaction metrics:

  • Current period metrics via new_operational_txns
  • Historical growth tracking via operational_txns_growth
  • Both disabled by default for safe deployment
stats/stats-server/src/settings.rs (2)

11-12: LGTM!

The new imports for operational transactions types are properly organized.


40-41: LGTM!

The new configuration field is well-documented and follows the established pattern for feature flags.

stats/README.md (1)

95-95: LGTM!

The new environment variable is well-documented and follows the existing documentation format.

stats/stats/src/data_processing.rs (2)

2-2: LGTM!

The new imports are appropriate for the added functionality.

Also applies to: 10-10


143-143: LGTM!

The test module imports are properly organized.

Also applies to: 148-148

stats/stats-server/src/runtime_setup.rs (2)

264-264: LGTM: Addition of TotalOperationalTxnsGroup

The addition of TotalOperationalTxnsGroup to the "actual singletons" section is well-placed and follows the existing pattern.


355-360: Please clarify the TODO comment about "other operational txns"

The comment // todo: other operational txns suggests there might be additional operational transaction types to be handled. Could you provide more details about what other operational transactions are planned?

Let's check for any related TODOs or operational transaction types:

@bragov4ik
Contributor Author

note: will need to enable the env on staging

@bragov4ik bragov4ik force-pushed the bragov4ik/stats-arbitrum-charts branch from af6fc97 to f21d40b Compare December 13, 2024 15:54

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (8)
stats/stats/src/data_source/kinds/data_manipulation/resolutions/average.rs (1)

146-148: Consider adding concurrent access test.

Since this is a data processing component that might be accessed concurrently, consider adding a test case that verifies thread safety of the implementation.

Example test case:

#[tokio::test]
async fn test_concurrent_access() {
    let handles: Vec<_> = (0..10)
        .map(|_| {
            tokio::spawn(async {
                // Use existing test setup and query data concurrently
                // Assert results are consistent
            })
        })
        .collect();
    
    for handle in handles {
        handle.await.unwrap();
    }
}
stats/stats/src/data_processing.rs (2)

102-139: Consider pre-allocating result vector for better performance.

While the implementation is correct and handles all cases properly, consider pre-allocating the result vector with an estimated capacity to reduce reallocations when processing large vectors.

-    let mut result = vec![];
+    let mut result = Vec::with_capacity(left.len().max(right.len()));

366-432: Consider adding more edge cases to test suite.

While the current test coverage is good, consider adding these edge cases:

  1. Vectors with duplicate dates
  2. Very large vectors to verify performance
  3. Dates in reverse order to verify sorting requirements

Example test case for duplicate dates:

assert_eq!(
    zip_same_timespan(
        vec![
            d_v("2024-07-08", "8L"),
            d_v("2024-07-08", "8L_2"),  // duplicate date
        ],
        vec![
            d_v("2024-07-08", "8R"),
            d_v("2024-07-08", "8R_2"),  // duplicate date
        ]
    ),
    vec![
        (
            d("2024-07-08"),
            EitherOrBoth::Both("8L".to_string(), "8R".to_string())
        ),
        (
            d("2024-07-08"),
            EitherOrBoth::Both("8L_2".to_string(), "8R_2".to_string())
        ),
    ]
);
stats/stats/src/charts/lines/operational_txns_growth.rs (1)

71-98: Consider enhancing test coverage

While the basic test is good, consider:

  1. Adding unit tests that don't require a database to test the transformation logic
  2. Including edge cases:
    • Data gaps to verify FillPrevious behavior
    • Boundary conditions (e.g., zero or negative values)
  3. Adding at least one test for each resolution type with a smaller dataset

The comment about tests becoming "excruciatingly slow" suggests a potential performance issue that should be investigated.
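To illustrate the first suggestion, a database-free unit test of the delta-to-cumulative step could look like the sketch below. The `cumulative_growth` helper is hypothetical, standing in for whatever pure function the chart's transformation logic is factored into; the point is only that the accumulation semantics can be asserted without a database:

```rust
// Hypothetical pure helper mirroring a delta-to-cumulative transformation.
fn cumulative_growth(new_txns: &[i64]) -> Vec<i64> {
    new_txns
        .iter()
        .scan(0i64, |acc, &v| {
            *acc += v;
            Some(*acc)
        })
        .collect()
}

fn main() {
    // Gaps represented as zeros: a missing day adds no new transactions,
    // so the cumulative value carries forward (FillPrevious-like behavior).
    assert_eq!(cumulative_growth(&[5, 0, 3]), vec![5, 5, 8]);
    // Negative deltas (e.g. corrections) still accumulate correctly.
    assert_eq!(cumulative_growth(&[10, -2]), vec![10, 8]);
    println!("ok");
}
```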

stats/stats/src/charts/lines/new_operational_txns.rs (1)

119-141: Consider enhancing test implementation

The test update_new_operational_txns is marked with #[ignore] and uses hardcoded data. To improve test coverage and reliability:

  • Implement mock data or use a test database setup to allow the test to run automatically.
  • Expand test cases to include edge scenarios such as timespan mismatches or cases where blocks > txns.
  • Address the comment about tests becoming slow by optimizing test data or using parameterized tests.
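For the timespan-mismatch and blocks > txns cases, the per-point computation can be exercised in isolation. The sketch below is a simplified stand-in (using `Option` instead of `EitherOrBoth`) for the chart's subtraction with zero defaults, as in `data.or(0, 0)`; the helper name is hypothetical:

```rust
// Hypothetical stand-in for the per-timespan computation:
// operational = txns - blocks, with a missing side defaulting to zero.
fn operational(txns: Option<i64>, blocks: Option<i64>) -> i64 {
    txns.unwrap_or(0) - blocks.unwrap_or(0)
}

fn main() {
    assert_eq!(operational(Some(100), Some(10)), 90);
    // Timespan mismatch: only one side has data for this date.
    assert_eq!(operational(Some(7), None), 7);
    // blocks > txns edge case produces a negative value; a test should
    // pin down whether that is acceptable or must be clamped.
    assert_eq!(operational(None, Some(3)), -3);
    println!("ok");
}
```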
stats/stats-server/src/settings.rs (2)

40-41: Add documentation for the new setting.

The new enable_all_arbitrum field lacks documentation. Consider adding a doc comment explaining its purpose and impact.

    pub disable_internal_transactions: bool,
+   /// Enable arbitrum-specific charts. When enabled, charts related to operational
+   /// transactions will be activated in the configuration.
    pub enable_all_arbitrum: bool,

131-159: Consider refactoring the enable/disable chart functions.

The handle_enable_all_arbitrum function shares similar patterns with handle_disable_internal_transactions. Consider extracting the common logic into a helper function to reduce code duplication.

+ fn toggle_chart_settings(
+     charts: &mut config::charts::Config<AllChartSettings>,
+     chart_keys: &[&str],
+     enabled: bool,
+     context: &str,
+ ) {
+     for key in chart_keys {
+         let settings = match (
+             charts.lines.get_mut(key),
+             charts.counters.get_mut(key),
+         ) {
+             (Some(settings), _) => settings,
+             (_, Some(settings)) => settings,
+             _ => {
+                 warn!(
+                     "Could not {} chart {}: chart not found in settings. \
+                     This should not be a problem for running the service.",
+                     context, key
+                 );
+                 continue;
+             }
+         };
+         settings.enabled = enabled;
+     }
+ }

  pub fn handle_enable_all_arbitrum(
      enable_all_arbitrum: bool,
      charts: &mut config::charts::Config<AllChartSettings>,
  ) {
      if enable_all_arbitrum {
-         for enable_key in [
-             NewOperationalTxns::key().name(),
-             OperationalTxnsGrowth::key().name(),
-             TotalOperationalTxns::key().name(),
-         ] {
-             let settings = match (
-                 charts.lines.get_mut(enable_key),
-                 charts.counters.get_mut(enable_key),
-             ) {
-                 (Some(settings), _) => settings,
-                 (_, Some(settings)) => settings,
-                 _ => {
-                     warn!(
-                         "Could not enable arbitrum-specific chart {}: chart not found in settings. \
-                         This should not be a problem for running the service.",
-                         enable_key
-                     );
-                     continue;
-                 }
-             };
-             settings.enabled = true;
-         }
+         toggle_chart_settings(
+             charts,
+             &[
+                 NewOperationalTxns::key().name(),
+                 OperationalTxnsGrowth::key().name(),
+                 TotalOperationalTxns::key().name(),
+             ],
+             true,
+             "enable arbitrum-specific",
+         );
      }
  }
stats/stats-server/src/runtime_setup.rs (1)

355-360: Remove TODO comment and complete the implementation.

The comment "todo: other operational txns" suggests incomplete implementation. Consider removing the TODO comment and implementing the remaining operational transaction configurations.

Would you like me to help implement the remaining operational transaction configurations or create a GitHub issue to track this task?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between af6fc97 and f21d40b.

📒 Files selected for processing (20)
  • stats/README.md (1 hunks)
  • stats/config/charts.json (2 hunks)
  • stats/config/layout.json (2 hunks)
  • stats/config/update_groups.json (1 hunks)
  • stats/config/utils/find_free_timeslot.py (1 hunks)
  • stats/config/utils/requirements.txt (1 hunks)
  • stats/stats-server/src/runtime_setup.rs (2 hunks)
  • stats/stats-server/src/server.rs (2 hunks)
  • stats/stats-server/src/settings.rs (5 hunks)
  • stats/stats-server/tests/it/lines.rs (1 hunks)
  • stats/stats/src/charts/counters/mod.rs (2 hunks)
  • stats/stats/src/charts/counters/total_blocks.rs (2 hunks)
  • stats/stats/src/charts/counters/total_operational_txns.rs (1 hunks)
  • stats/stats/src/charts/counters/total_txns.rs (2 hunks)
  • stats/stats/src/charts/lines/mod.rs (2 hunks)
  • stats/stats/src/charts/lines/new_operational_txns.rs (1 hunks)
  • stats/stats/src/charts/lines/operational_txns_growth.rs (1 hunks)
  • stats/stats/src/data_processing.rs (3 hunks)
  • stats/stats/src/data_source/kinds/data_manipulation/resolutions/average.rs (2 hunks)
  • stats/stats/src/update_groups.rs (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (11)
  • stats/config/utils/requirements.txt
  • stats/config/update_groups.json
  • stats/stats-server/tests/it/lines.rs
  • stats/README.md
  • stats/config/charts.json
  • stats/stats/src/charts/counters/total_txns.rs
  • stats/stats/src/charts/counters/total_blocks.rs
  • stats/config/layout.json
  • stats/stats/src/charts/lines/mod.rs
  • stats/stats/src/charts/counters/mod.rs
  • stats/config/utils/find_free_timeslot.py
🧰 Additional context used
📓 Learnings (1)
stats/stats/src/charts/lines/new_operational_txns.rs (1)
Learnt from: bragov4ik
PR: blockscout/blockscout-rs#1154
File: stats/stats/src/charts/lines/new_operational_txns.rs:71-71
Timestamp: 2024-12-13T15:52:06.550Z
Learning: In `stats/stats/src/charts/lines/new_operational_txns.rs`, the variable `data` from `zip_same_timespan` is of type `EitherOrBoth<i64, i64>` and not `Option`, so methods like `unwrap_or` are not applicable.
🔇 Additional comments (17)
stats/stats/src/data_source/kinds/data_manipulation/resolutions/average.rs (3)

10-10: LGTM! Good refactoring to reduce code duplication.

The import of zip_same_timespan from the data_processing module follows the DRY principle by reusing common functionality instead of maintaining a local implementation.


Line range hint 89-117: LGTM! Robust implementation with proper error handling.

The weighted average calculation:

  • Correctly handles missing data points
  • Properly validates timespan boundaries
  • Includes appropriate warning logs for data inconsistencies
  • Handles edge cases like zero weights
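As a minimal sketch of the zero-weight edge case mentioned above (the function and its signature are illustrative, not the actual implementation in average.rs):

```rust
// Weighted average over (value, weight) points, returning None when the
// total weight is zero so callers must handle the degenerate case.
fn weighted_average(points: &[(f64, f64)]) -> Option<f64> {
    let total_weight: f64 = points.iter().map(|(_, w)| w).sum();
    if total_weight == 0.0 {
        return None;
    }
    let weighted_sum: f64 = points.iter().map(|(v, w)| v * w).sum();
    Some(weighted_sum / total_weight)
}

fn main() {
    // (2*1 + 4*3) / (1 + 3) = 14 / 4 = 3.5
    assert_eq!(weighted_average(&[(2.0, 1.0), (4.0, 3.0)]), Some(3.5));
    // Missing data and all-zero weights both yield None instead of NaN.
    assert_eq!(weighted_average(&[]), None);
    assert_eq!(weighted_average(&[(5.0, 0.0)]), None);
    println!("ok");
}
```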

146-148: LGTM! Comprehensive test coverage.

The test suite thoroughly covers core functionality, edge cases, and error conditions with clear test cases and assertions.

stats/stats/src/data_processing.rs (2)

2-2: LGTM! Well-documented function signature with appropriate type constraints.

The imports and function signature are well-structured, with clear documentation explaining the behavior for all possible cases.

Also applies to: 10-10, 88-101


95-98: Verify all callers are updated after function move.

Since this function was moved from average.rs, let's verify all callers are updated to use the new location.

✅ Verification successful

All callers are properly updated to use the new location

The verification shows that:

  1. There are no remaining references to the old location (average.zip_same_timespan)
  2. The function is now used in:
    • charts/lines/new_operational_txns.rs
    • data_source/kinds/data_manipulation/resolutions/average.rs

All callers are correctly importing and using the function from its new location in data_processing.rs.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any remaining references to the old function location
# and verify all callers are updated.

echo "Checking for any remaining references to the old location..."
rg -l "average\.zip_same_timespan"

echo "Finding all current callers of zip_same_timespan..."
rg -l "data_processing\.zip_same_timespan"

Length of output: 325


Script:

#!/bin/bash
# Let's try a different approach by searching for the function name directly
# and examining the context to understand the usage

echo "Searching for all occurrences of zip_same_timespan function..."
rg "zip_same_timespan" -A 2 -B 2

echo -e "\nChecking imports in all Rust files..."
rg "use.*data_processing" stats/

Length of output: 4031

stats/stats/src/charts/lines/operational_txns_growth.rs (4)

1-20: LGTM! Well-organized imports

The imports are well-structured and properly grouped between internal and external dependencies.


22-40: LGTM! Well-designed chart properties

The Properties implementation is well-designed with appropriate choices for:

  • Chart type (Line) which is suitable for showing growth trends
  • Missing date policy (FillPrevious) which maintains continuity in growth metrics

42-49: LGTM! Good use of macro for resolution properties

The macro usage effectively creates consistent properties across different time resolutions while maintaining DRY principles.


51-69: Verify the rationale for batch sizes

The implementation uses specific batch sizes:

  • 30 weeks for weekly resolution
  • 36 months for monthly resolution
  • 30 years for yearly resolution

Please verify if these batch sizes are optimal for your use case and document the rationale.

✅ Verification successful

Let me search for the definition and documentation of these batch parameters.


Let me check the test file to see if it provides any insights into the batch size rationale.


Based on the codebase analysis, I can now provide a final response about the batch sizes:

Batch sizes are consistently used across all chart types

The batch sizes are defined as global constants and consistently used across all chart types for data aggregation:

  • 30 weeks for weekly resolution (approximately 7 months)
  • 36 months for monthly resolution (3 years)
  • 30 years for yearly resolution

These batch sizes appear to be carefully chosen to provide:

  1. Reasonable data windows for each time resolution
  2. Consistent data loading patterns across all chart types
  3. Balanced trade-off between data granularity and query performance
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any documentation or similar batch size patterns
rg -A 2 "Batch(30Weeks|36Months|30Years)" --type rust

Length of output: 25293


Script:

#!/bin/bash
# Search for any documentation or comments about these batch parameters
rg -B 2 -A 2 "Batch(30Weeks|36Months|30Years)" --type rust --type md --type txt

# Also check if there are any test files that might explain the rationale
fd -e rs "test" -x rg -l "Batch(30Weeks|36Months|30Years)" {}

Length of output: 47334


Script:

#!/bin/bash
# Check the test file content for any documentation or test cases related to batch sizes
cat stats/stats/src/data_source/tests.rs

# Also check if there are any comments in the parameters module that might explain the rationale
cat stats/stats/src/data_source/kinds/local_db/parameters/update/batching/parameters/mod.rs

Length of output: 13488

stats/stats/src/charts/lines/new_operational_txns.rs (1)

71-71: Confirmed correct use of or method

The use of data.or(0, 0) is appropriate for handling EitherOrBoth<i64, i64>, effectively providing default values when either side is missing.

stats/stats/src/charts/counters/total_operational_txns.rs (1)

77-83: Enhance test coverage and handling

The test update_total_operational_txns is currently ignored and validates only a single scenario. To ensure robustness:

  • Remove the #[ignore] attribute if possible by setting up the necessary test environment.
  • Include additional test cases covering edge conditions, such as when blocks > transactions, overflow scenarios, and timespan mismatches.
  • Ensure that potential errors are gracefully handled and tested.
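The blocks > transactions and underflow conditions can be pinned down with a small pure test. This sketch assumes the counter is effectively total_txns minus total_blocks and shows one possible guard (`saturating_sub`); whether clamping at zero is the desired behavior is exactly what such a test should decide:

```rust
// Hypothetical total = total_txns - total_blocks, clamped at zero to
// guard against underflow when blocks outnumber transactions.
fn total_operational(total_txns: u64, total_blocks: u64) -> u64 {
    total_txns.saturating_sub(total_blocks)
}

fn main() {
    assert_eq!(total_operational(1000, 100), 900);
    // blocks > txns: saturating subtraction clamps at zero
    // instead of panicking or wrapping.
    assert_eq!(total_operational(5, 10), 0);
    // Extreme values do not overflow.
    assert_eq!(total_operational(u64::MAX, 0), u64::MAX);
    println!("ok");
}
```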
stats/stats/src/update_groups.rs (2)

29-29: Addition of TotalOperationalTxns is appropriate

Including TotalOperationalTxns in singleton_groups! aligns with the tracking of operational transaction metrics.


117-125: Inclusion of new operational transaction charts is well-structured

Adding the operational transaction charts to the NewBlocksGroup enhances metric tracking and is correctly implemented.

stats/stats-server/src/server.rs (1)

69-69: Integration of handle_enable_all_arbitrum is correct

The call to handle_enable_all_arbitrum appropriately integrates the new Arbitrum-specific settings into the charts configuration.

stats/stats-server/src/settings.rs (2)

11-12: LGTM: Import changes are correctly organized.

The new imports for operational transaction types (TotalOperationalTxns, NewOperationalTxns, OperationalTxnsGrowth) are properly added.


89-89: LGTM: Default implementation is correct.

The enable_all_arbitrum field is properly initialized to false in the default implementation, maintaining a safe default state.

stats/stats-server/src/runtime_setup.rs (1)

264-264: LGTM: Update group properly added.

The TotalOperationalTxnsGroup is correctly added to the list of update groups.

@bragov4ik bragov4ik self-assigned this Dec 13, 2024
@bragov4ik bragov4ik linked an issue Dec 13, 2024 that may be closed by this pull request
4 tasks
@bragov4ik bragov4ik requested a review from sevenzing December 17, 2024 13:02
@bragov4ik bragov4ik force-pushed the bragov4ik/stats-arbitrum-charts branch from cff2137 to 68df67b Compare December 17, 2024 13:44
@sevenzing
Contributor

sevenzing commented Dec 17, 2024

lgtm.

ship it, roll it out, RUST quality

@bragov4ik bragov4ik merged commit 4ad678d into main Dec 18, 2024
6 checks passed
@bragov4ik bragov4ik deleted the bragov4ik/stats-arbitrum-charts branch December 18, 2024 06:04
Development

Successfully merging this pull request may close these issues.

stats: add arbitrum specific counters and charts