Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implemented an rpc to get the number of addresses with positive balance #1496

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

levonpetrosyan93
Copy link
Contributor

Iterates through the whale address index to calculate the total number of addresses with a positive balance.

Copy link

coderabbitai bot commented Nov 17, 2024

Walkthrough

The changes introduce a new RPC command, getAddressNumWBalance, which retrieves the count of unique addresses with a positive balance. This command is integrated into the existing RPC framework, with corresponding updates to the vRPCConvertParams array for parameter conversion and modifications to the CBlockTreeDB class for database querying. Additionally, the getinfoex function is updated to include total coin supply information. Overall, the modifications expand the RPC server's functionality without altering existing structures or error handling.

Changes

File Change Summary
src/rpc/client.cpp Added method: { "getAddressNumWBalance", 0} in vRPCConvertParams.
src/rpc/misc.cpp Added method: UniValue getAddressNumWBalance(const JSONRPCRequest& request).
Updated method: UniValue getinfoex(const JSONRPCRequest& request) (added "moneysupply").
Added command: getAddressNumWBalance to RPC commands.
src/rpc/server.cpp Added command: getAddressNumWBalance in CRPCTable, marked as not safe for concurrent access.
src/rpc/server.h Added declaration: extern UniValue getAddressNumWBalance(const JSONRPCRequest &request);.
src/txdb.cpp Added method: size_t CBlockTreeDB::findAddressNumWBalance() to count unique addresses with balance.
src/txdb.h Added method declaration: size_t findAddressNumWBalance(); in CBlockTreeDB.

Poem

In the burrow deep, where the data flows,
A new command hops in, as the balance grows.
With addresses counted, both near and far,
Our RPC shines bright, like a twinkling star.
So let’s celebrate, with a joyful cheer,
For every balance found, brings us all near! 🐇✨


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (5)
src/rpc/server.h (1)

209-210: LGTM! Consider adding a brief comment.

The function declaration is well-placed among other address-related RPCs and follows the established patterns. Consider adding a brief comment describing the purpose of this RPC, similar to other functions in this file.

Add this comment above the declaration:

+/** Get the total number of addresses with positive balance */
 extern UniValue getAddressNumWBalance(const JSONRPCRequest &request);
src/rpc/client.cpp (1)

143-143: Follow consistent method naming convention

While the functionality is correct, the method name getAddressNumWBalance uses mixed case, while other RPC methods in the file use lowercase (e.g., getaddressbalance, getaddressdeltas). Consider renaming to getaddressnumwbalance for consistency.

src/rpc/server.cpp (1)

334-335: Follow consistent naming convention for RPC commands.

The new RPC command getAddressNumWBalance uses mixed case, while other address-related commands like getaddressmempool, getaddressutxos, etc., use lowercase. Consider renaming it to getaddressnumwbalance for consistency.

Apply this diff:

-    { "addressindex",       "getAddressNumWBalance",  &getAddressNumWBalance,  false },
+    { "addressindex",       "getaddressnumwbalance",  &getaddressnumwbalance,  false },
src/txdb.cpp (1)

316-316: Consider adding documentation for the new method

Add documentation explaining the purpose, complexity, and memory requirements of the method.

+/**
+ * Find the number of addresses with positive balance.
+ * 
+ * @note This operation requires scanning the entire address index and may be memory intensive
+ * @note Only considers payToPubKeyHash and payToExchangeAddress types
+ * @return The number of addresses with positive balance
+ */
 size_t CBlockTreeDB::findAddressNumWBalance() {
src/rpc/misc.cpp (1)

1014-1020: Enhance RPC help documentation

The help text should be expanded to include:

  • Return value format and units
  • Example usage with bitcoin-cli and JSON-RPC
  • Clear documentation of the -addressindex requirement
 throw std::runtime_error(
         "getAddressNumWBalance\n"
-        "Gives the number of addresses which has positive balance."
+        "Returns the number of addresses with positive balance.\n"
+        "\nResult:\n"
+        "n    (numeric) Number of addresses with balance > 0\n"
+        "\nNote: Requires -addressindex enabled\n"
+        "\nExamples:\n"
+        + HelpExampleCli("getAddressNumWBalance", "") 
+        + HelpExampleRpc("getAddressNumWBalance", "")
 );
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between a09abf0 and 40c941c.

📒 Files selected for processing (6)
  • src/rpc/client.cpp (1 hunks)
  • src/rpc/misc.cpp (1 hunks)
  • src/rpc/server.cpp (1 hunks)
  • src/rpc/server.h (1 hunks)
  • src/txdb.cpp (1 hunks)
  • src/txdb.h (1 hunks)
🔇 Additional comments (7)
src/txdb.h (2)

145-145: Method placement and integration looks good.

The new method is well-placed among other address-related operations and follows the class's established patterns.


145-145: Verify the implementation's impact on database performance.

Since this operation requires scanning the address index, let's verify its usage pattern to ensure it won't impact system performance.

✅ Verification successful

Implementation has acceptable performance characteristics

The method findAddressNumWBalance() is implemented with reasonable performance considerations:

  • Uses an iterator to scan the address index efficiently
  • Maintains a hash map to track unique addresses
  • Only processes specific address types (payToPubKeyHash and payToExchangeAddress)
  • Used in a non-critical RPC endpoint (getanonymityset)

The implementation follows similar patterns to other address index operations in the codebase and includes interruption points for long-running operations.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check how the new method is used in the codebase

# Check if the method is called in performance-critical paths
rg -l "findAddressNumWBalance"

# Check for any rate limiting or caching mechanisms in RPC layer
rg "getAddressNumWBalance.*rate" 

# Look for similar operations to understand performance patterns
ast-grep --pattern 'CBlockTreeDB::Read$$$Index'

Length of output: 322


Script:

#!/bin/bash
# Let's check the implementation and usage context

# Check the implementation in txdb.cpp
rg -A 10 "findAddressNumWBalance" src/txdb.cpp

# Check how it's used in RPC
rg -A 10 "findAddressNumWBalance" src/rpc/misc.cpp

# Look for similar address index operations
rg "ReadAddressIndex|WriteAddressIndex" -A 3

Length of output: 2887

src/rpc/client.cpp (1)

143-143: Verify RPC method implementation

The parameter conversion is correctly configured. Let's verify that the corresponding RPC method is properly implemented.

✅ Verification successful

RPC method implementation verified successfully

The getAddressNumWBalance RPC method is properly implemented in src/rpc/misc.cpp. It:

  • Takes no parameters
  • Requires -addressindex to be enabled
  • Returns the count of addresses with positive balance via pblocktree->findAddressNumWBalance()
  • Is correctly registered in the RPC server

The parameter count of 0 in the client configuration matches the implementation.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the RPC method implementation exists

# Check for the RPC method declaration
echo "Checking for RPC method declaration..."
rg "getAddressNumWBalance" --type cpp

# Check for the RPC method registration
echo "Checking for RPC method registration..."
rg "getAddressNumWBalance.*CRPCCommand" --type cpp

Length of output: 652


Script:

#!/bin/bash
# Let's check the implementation details of getAddressNumWBalance
echo "Checking implementation of getAddressNumWBalance..."
rg -A 10 "UniValue getAddressNumWBalance\(const JSONRPCRequest& request\)" --type cpp

Length of output: 844


Script:

#!/bin/bash
# Let's see the complete implementation
echo "Checking complete implementation of getAddressNumWBalance..."
rg -A 20 "UniValue getAddressNumWBalance\(const JSONRPCRequest& request\)" --type cpp

Length of output: 1395

src/rpc/server.cpp (1)

334-335: Verify the implementation of the new RPC command.

Let's ensure the implementation function exists and is properly declared.

✅ Verification successful

Let me check the actual implementation and declaration to verify they match properly.


The RPC command implementation is properly declared and defined

The verification confirms:

  • Function is correctly declared in src/rpc/server.h
  • Implementation exists in src/rpc/misc.cpp with proper signature
  • Function includes help text and parameter validation
  • Appropriate checks for address indexing feature
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if the implementation and declaration of getAddressNumWBalance exists

# Check for function declaration in header files
echo "Checking for function declaration..."
rg -l "UniValue.*getAddressNumWBalance.*JSONRPCRequest" --type cpp

# Check for function implementation
echo "Checking for function implementation..."
ast-grep --pattern 'UniValue getAddressNumWBalance(const JSONRPCRequest& $req) { $$$ }'

Length of output: 364


Script:

#!/bin/bash
# Check the declaration in server.h
echo "Declaration in server.h:"
rg "getAddressNumWBalance" src/rpc/server.h -A 2

# Check the implementation in misc.cpp
echo -e "\nImplementation in misc.cpp:"
rg "getAddressNumWBalance" src/rpc/misc.cpp -A 5

Length of output: 755

src/txdb.cpp (2)

323-323: Consider adding support for additional address types

The current implementation only handles payToPubKeyHash and payToExchangeAddress types. Consider whether other address types should be included in the count.

#!/bin/bash
# Description: Check for all address types used in the codebase

# Search for AddressType usage
ast-grep --pattern 'AddressType::$_'

319-319: Verify database cursor initialization

Ensure that the database cursor is properly initialized before use.

✅ Verification successful

Based on the search results, I can now generate the final response as we have sufficient information about cursor initialization patterns in the codebase.

Database cursor initialization is properly handled

The cursor initialization at this location follows the established pattern used throughout the codebase:

  • It uses boost::scoped_ptr<CDBIterator> for automatic resource management
  • The cursor is properly initialized via NewIterator()
  • Validity checks are consistently performed using pcursor->Valid()
  • The code includes proper interruption points for thread safety
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for proper cursor initialization and error handling patterns in the codebase

# Search for similar cursor initialization patterns
rg -A 5 "boost::scoped_ptr<CDBIterator>" src/

# Search for error handling patterns with cursors
rg "pcursor.*Valid.*" src/

Length of output: 2876

src/rpc/misc.cpp (1)

1021-1023: LGTM! Proper validation of addressindex requirement

The code correctly checks for the address index requirement and provides a clear error message about reindexing.

src/txdb.h Show resolved Hide resolved
src/txdb.cpp Outdated Show resolved Hide resolved
src/rpc/misc.cpp Show resolved Hide resolved
src/txdb.cpp Outdated
if (it != addrMap.end()) {
it->second += nValue;
if (it->second <= 0)
addrMap.erase(it);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's less that a zero due to the previous vulnerability the result would be incorrect because we erased non-zero amount from the map.

@gevorgvoskanyan
Copy link

Iterates through the whale address index to calculate the total number of addresses with a positive balance.

whale or whole?

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (1)
src/txdb.cpp (1)

328-328: Clarify comment about non-zero values

The comment "Only process non-zero values" is misleading because we're actually accumulating all values and checking for positive balances later.

Update the comment to better reflect the logic:

-                if (pcursor->GetValue(nValue) && nValue != 0) { // Only process non-zero values
+                if (pcursor->GetValue(nValue)) { // Accumulate all values for final balance calculation
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 40c941c and 3bfe8ac.

📒 Files selected for processing (1)
  • src/txdb.cpp (1 hunks)
🧰 Additional context used
🪛 cppcheck (2.10-2)
src/txdb.cpp

[performance] 344-344: Variable 'type' is assigned in constructor body. Consider performing initialization in initialization list.

(useInitializationList)

src/txdb.cpp Outdated Show resolved Hide resolved
src/txdb.cpp Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 3bfe8ac and 8859752.

📒 Files selected for processing (1)
  • src/txdb.cpp (1 hunks)
🔇 Additional comments (2)
src/txdb.cpp (2)

316-341: ⚠️ Potential issue

Critical: Fix double value retrieval and add error handling

The current implementation has several critical issues:

  1. The GetValue call appears twice (implicit in line 324 and explicit in line 326)
  2. Missing error handling for database operations
  3. Potential memory issues with large datasets

Apply this fix for the immediate issues:

 size_t CBlockTreeDB::findAddressNumWBalance() {
     boost::scoped_ptr<CDBIterator> pcursor(NewIterator());
+    if (!pcursor) {
+        return error("findAddressNumWBalance: failed to create iterator");
+    }
     pcursor->SeekToFirst();
     std::unordered_map<uint160, CAmount> addrMap;
     while (pcursor->Valid()) {
         boost::this_thread::interruption_point();
         std::pair<char,CAddressIndexKey> key;
         if (pcursor->GetKey(key) && key.first == DB_ADDRESSINDEX && (key.second.type == AddressType::payToPubKeyHash || key.second.type == AddressType::payToExchangeAddress)) {
             CAmount nValue;
-            // Retrieve the associated value
-            if (pcursor->GetValue(nValue) && nValue != 0) { // Only process non-zero values
+            if (pcursor->GetValue(nValue)) {
+                if (nValue != 0) { // Only process non-zero values
                 addrMap[key.second.hashBytes] += nValue; // Accumulate balance for the address
+                }
             }
         }
         pcursor->Next();
     }

Likely invalid or redundant comment.


316-341: 🛠️ Refactor suggestion

Optimize memory usage and improve performance

The current implementation loads all addresses into memory, which could be problematic for large blockchains. Consider implementing batching to manage memory usage.

Consider this optimized implementation:

 size_t CBlockTreeDB::findAddressNumWBalance() {
+    const size_t BATCH_SIZE = 1000000; // Process 1M addresses at a time
+    size_t totalAddresses = 0;
     boost::scoped_ptr<CDBIterator> pcursor(NewIterator());
+    if (!pcursor) {
+        return error("findAddressNumWBalance: failed to create iterator");
+    }
     pcursor->SeekToFirst();
-    std::unordered_map<uint160, CAmount> addrMap;
+    std::unordered_map<uint160, CAmount> currentBatch;
+    size_t processedRecords = 0;
     while (pcursor->Valid()) {
         boost::this_thread::interruption_point();
-        std::pair<char,CAddressIndexKey> key;
-        if (pcursor->GetKey(key) && key.first == DB_ADDRESSINDEX && (key.second.type == AddressType::payToPubKeyHash || key.second.type == AddressType::payToExchangeAddress)) {
-            CAmount nValue;
-            if (pcursor->GetValue(nValue)) {
-                CAmount nValue;
-                if (pcursor->GetValue(nValue) && nValue != 0) {
-                    addrMap[key.second.hashBytes] += nValue;
+        try {
+            std::pair<char,CAddressIndexKey> key;
+            if (pcursor->GetKey(key) && key.first == DB_ADDRESSINDEX && 
+                (key.second.type == AddressType::payToPubKeyHash || 
+                 key.second.type == AddressType::payToExchangeAddress)) {
+                CAmount nValue;
+                if (pcursor->GetValue(nValue)) {
+                    if (nValue != 0) {
+                        currentBatch[key.second.hashBytes] += nValue;
+                    }
                 }
             }
+            pcursor->Next();
+            processedRecords++;
+            
+            // Process in batches to manage memory
+            if (processedRecords % BATCH_SIZE == 0) {
+                for (const auto& addr : currentBatch) {
+                    if (addr.second > 0) {
+                        totalAddresses++;
+                    }
+                }
+                LogPrint("db", "Processed %zu records, found %zu addresses with positive balance\n", 
+                    processedRecords, totalAddresses);
+                currentBatch.clear();
+            }
+        } catch (const std::exception& e) {
+            return error("findAddressNumWBalance: %s", e.what());
         }
-        pcursor->Next();
     }

-    size_t counter = 0;
-    for (auto& itr : addrMap) {
-        if (itr.second > 0) {
-            ++counter;
+    // Process remaining addresses in the last batch
+    for (const auto& addr : currentBatch) {
+        if (addr.second > 0) {
+            totalAddresses++;
         }
     }

-    return counter;
+    return totalAddresses;
 }

Likely invalid or redundant comment.

Comment on lines +316 to +341
size_t CBlockTreeDB::findAddressNumWBalance() {
boost::scoped_ptr<CDBIterator> pcursor(NewIterator());
pcursor->SeekToFirst();
std::unordered_map<uint160, CAmount> addrMap;
while (pcursor->Valid()) {
boost::this_thread::interruption_point();
std::pair<char,CAddressIndexKey> key;
if (pcursor->GetKey(key) && key.first == DB_ADDRESSINDEX && (key.second.type == AddressType::payToPubKeyHash || key.second.type == AddressType::payToExchangeAddress)) {
CAmount nValue;
// Retrieve the associated value
if (pcursor->GetValue(nValue) && nValue != 0) { // Only process non-zero values
addrMap[key.second.hashBytes] += nValue; // Accumulate balance for the address
}
}
pcursor->Next();
}

size_t counter = 0;
for (auto& itr : addrMap) {
if (itr.second > 0) {
++counter;
}
}

return counter;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Database error handling needs improvement

The verification reveals several concerns with error handling in the findAddressNumWBalance implementation:

  • Unlike other similar methods in the codebase (e.g., GetAddressIndex, GetAddressUnspent), this method doesn't return a boolean and lacks error reporting when GetValue fails
  • Other database operations in the file use the error() function to report failures, but this method silently continues on errors
  • The method should follow the established pattern of returning false with an error message when database operations fail

Key locations that need attention:

  • Add error handling for GetValue operation similar to line 252: return error("failed to get address unspent value")
  • Consider changing the return type to bool and returning false on database errors
  • Add error handling for database iterator initialization
🔗 Analysis chain

Verify database operations and error conditions

The implementation should be tested with various database states and error conditions.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the database operations and potential error conditions

# Test: Check if the method handles database errors gracefully
rg -A 5 "error\(.*findAddressNumWBalance.*\)"

# Test: Check for similar implementations that might provide insights
ast-grep --pattern 'boost::scoped_ptr<CDBIterator> pcursor(NewIterator());'

# Test: Check for other address balance calculations that might need similar optimizations
rg -A 10 "GetValue.*nValue.*addrMap"

Length of output: 710


Script:

#!/bin/bash
# Let's analyze error handling patterns in similar database operations

# Check how other similar methods handle database errors
rg -A 10 "pcursor->GetKey|pcursor->GetValue" src/txdb.cpp

# Look for error handling patterns in the database operations
rg -A 5 "catch\s*\(.*\)" src/txdb.cpp

# Check if there are any database error logging mechanisms
rg "error|LogPrint.*DB" src/txdb.cpp

# Look for similar iterator usage patterns
ast-grep --pattern 'while (pcursor->Valid()) {
    $$$
    if (pcursor->GetKey($_) && $_) {
        $$$
    }
    $$$
}'

Length of output: 5974

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants