Some quality of life fixes for diagnostics and to improve fault finding for sync issues #1829

Open
wants to merge 4 commits into
base: main
3 changes: 3 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
@@ -19,6 +19,7 @@ license = "MIT OR Apache-2.0"
# Makes flamegraphs more readable.
# https://doc.rust-lang.org/cargo/reference/manifest.html#the-profile-sections
debug = true
# Set to false for profiling data.
lto = "thin"

[profile.release-stripped]
5 changes: 5 additions & 0 deletions scripts/dhat_zq2
@@ -0,0 +1,5 @@
#! /bin/bash
# Use this as the value of ZQ2_SCRIPT to run the binary under valgrind's DHAT heap profiler
# You will also want to set [profile.release] debug = 1 in Cargo.toml
# Args passed are (binary) (rest)
valgrind --tool=dhat "$@"
3 changes: 3 additions & 0 deletions scripts/perf_zq2
@@ -0,0 +1,3 @@
#! /bin/bash
# Use this as the value of PERF_SCRIPT to run perf
perf record --call-graph dwarf -- "$@"
5 changes: 4 additions & 1 deletion z2/Cargo.toml
@@ -42,6 +42,7 @@ git2 = "0.18.3"
hex = "0.4.3"
home = "0.5.9"
indicatif = "0.17.9"
hyper = {version = "1.5.0", features = [ "client" ] }
itertools = "0.13.0"
jsonrpsee = {version = "0.22.4", features = ["client"]}
k256 = "0.13.4"
@@ -66,9 +67,11 @@ sha3 = "0.10.8"
tempfile = "3.14.0"
tera = "1.19.1"
thiserror = "2.0.3"
-tokio = {version = "1.41.1", features = ["macros", "rt-multi-thread", "sync", "io-std", "io-util", "process", "fs"]}
+tokio = {version = "1.41.1", features = ["macros", "rt-multi-thread", "sync", "io-std", "io-util", "process", "fs", "time"]}
tokio-stream = "0.1.16"
toml = "0.8.19"
tower = "0.5.1"
tower-http = "0.6.1"
tracing = "0.1.40"
tracing-subscriber = "0.3.18"
url = "2.5.3"
35 changes: 35 additions & 0 deletions z2/docs/testing.md
@@ -0,0 +1,35 @@
# Testing

z2 now includes a small test framework. After starting a z2 network in the context directory `/tmp/a`, you can run:

```sh
./scripts/z2 test /tmp/a partition 0:0/30000 1-2:2000/30000 2-5:1-3:1000/20000
```

This tells the system to call the admin API to partition the network:

* Node 0 is told to talk only to itself from t=0ms to t=30000ms.
* Nodes 1 and 2 are told to talk only to nodes 1 and 2 from t=2000ms to t=30000ms.
* Nodes 2-5 are told to talk to nodes 1-3 from t=1000ms to t=20000ms.

We do this by calling `admin_whitelist` at the appropriate times. The code is in `testing.rs`.
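
To make the argument format concrete, here is a minimal sketch of how a partition argument of the form `nodes[:peers]:start/end` could be parsed. It is an illustration only and is not taken from `testing.rs`: the names `Partition`, `num`, `parse_nodes` and `parse_partition` are hypothetical, and the comma-separated node list form is assumed from the `graphs` command's `1-2,3` argument.

```rust
use std::collections::BTreeSet;

/// One partition instruction: `nodes` may talk only to `peers`
/// between `start_ms` and `end_ms`. (Hypothetical type.)
#[derive(Debug, PartialEq)]
struct Partition {
    nodes: BTreeSet<u64>,
    peers: BTreeSet<u64>,
    start_ms: u64,
    end_ms: u64,
}

/// Parse a single non-negative integer, reporting the offending text on error.
fn num(s: &str) -> Result<u64, String> {
    s.trim()
        .parse::<u64>()
        .map_err(|e| format!("bad number '{}': {}", s, e))
}

/// Parse a node list such as `0`, `1-2` or `1-2,3` into a set of node indices.
/// Ranges are inclusive at both ends.
fn parse_nodes(spec: &str) -> Result<BTreeSet<u64>, String> {
    let mut out = BTreeSet::new();
    for item in spec.split(',') {
        match item.split_once('-') {
            Some((lo, hi)) => out.extend(num(lo)?..=num(hi)?),
            None => {
                out.insert(num(item)?);
            }
        }
    }
    Ok(out)
}

/// Parse `nodes:start/end` or `nodes:peers:start/end`.
/// When `peers` is omitted, the nodes may talk only to each other.
fn parse_partition(spec: &str) -> Result<Partition, String> {
    let fields: Vec<&str> = spec.split(':').collect();
    let (nodes, peers, window) = match fields.as_slice() {
        [nodes, window] => (*nodes, *nodes, *window),
        [nodes, peers, window] => (*nodes, *peers, *window),
        _ => return Err(format!("expected nodes[:peers]:start/end, got '{}'", spec)),
    };
    let (start, end) = window
        .split_once('/')
        .ok_or_else(|| format!("missing '/' in '{}'", window))?;
    Ok(Partition {
        nodes: parse_nodes(nodes)?,
        peers: parse_nodes(peers)?,
        start_ms: num(start)?,
        end_ms: num(end)?,
    })
}

fn main() {
    // The three arguments from the example command above.
    for arg in ["0:0/30000", "1-2:2000/30000", "2-5:1-3:1000/20000"] {
        println!("{:?}", parse_partition(arg));
    }
}
```

Each parsed `Partition` would then correspond to one or more `admin_whitelist` calls scheduled at `start_ms` and `end_ms`; how the real code schedules them is not shown here.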

You can also see what the nodes think of the chain:

```sh
./scripts/z2 test /tmp/a graphs xc viewmin-viewmax 1-2,3
```

* `/tmp/a` - the context directory.
* `graphs` - draw graphs.
* `xc` - the name prefix; files are named `/tmp/xc<node_number>.dot`.
* `viewmin-viewmax` - the views to draw, inclusive (see below).
* `1-2,3` - the nodes to draw graphs for.

`viewmax == 0` means "the latest view", so `viewmin > 0, viewmax == 0` means "the last `viewmin` views".
Otherwise, `viewmin` and `viewmax` give the range of views to visualise.
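
The rules above can be sketched as a small function. This is an illustration under stated assumptions, not the code in `testing.rs`: the name `resolve_views` and the `latest` parameter are hypothetical, and treating `viewmin == 0, viewmax == 0` as "only the latest view" is a guess at a case the text does not spell out.

```rust
use std::ops::RangeInclusive;

/// Turn the `viewmin-viewmax` argument into a concrete inclusive range of views,
/// given the latest view known to the node. (Hypothetical helper.)
fn resolve_views(viewmin: u64, viewmax: u64, latest: u64) -> RangeInclusive<u64> {
    if viewmax == 0 {
        // "The last `viewmin` views", ending at the latest view.
        // Assumption: viewmin == 0 is treated as a count of one view.
        let count = viewmin.max(1);
        // Clamp at view 0 if fewer than `count` views exist.
        let first = latest.saturating_sub(count - 1);
        first..=latest
    } else {
        // An explicit range of views.
        viewmin..=viewmax
    }
}

fn main() {
    // With the latest view at 100:
    println!("{:?}", resolve_views(5, 0, 100)); // the last 5 views
    println!("{:?}", resolve_views(10, 20, 100)); // an explicit range
}
```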

Chrome seems to be the best way to view SVGs these days, so we
convert the dotfiles written by the `admin` API to SVG with `dot`
(which you need to have installed) and then output the URLs.

17 changes: 17 additions & 0 deletions z2/resources/chain-specs/zq2-richard.toml
@@ -0,0 +1,17 @@
p2p_port = 3333
bootstrap_address = [ "12D3KooWACUuqbMYRddTh34HejKg8i1QyuPJoffWVecYotCi8FzZ", "/ip4/34.87.179.185/udp/3333/quic-v1" ]

[[nodes]]
eth_chain_id = 33469
allowed_timestamp_skew = { secs = 60, nanos = 0 }
data_dir = "/data"
consensus.genesis_accounts = [ ["0xed4Ec243b08456404F37CFA9a09DFdF6a52137F1", "20_800_000_000_000_000_000_000_000_000" ] ]
consensus.genesis_deposits = [ ["a81a31aaf946111bbe9a958cd9b8bd85d277b8b7c64fc67f579696dbcb6a460a96d4f70e0187064cda83a74b32b1f81f", "12D3KooWACUuqbMYRddTh34HejKg8i1QyuPJoffWVecYotCi8FzZ", "100_000_000_000_000_000_000_000_000", "0xed4Ec243b08456404F37CFA9a09DFdF6a52137F1", "0xed4Ec243b08456404F37CFA9a09DFdF6a52137F1"] ]

# Reward parameters
consensus.rewards_per_hour = "51_000_000_000_000_000_000_000"
consensus.blocks_per_hour = 3600
consensus.minimum_stake = "10_000_000_000_000_000_000_000_000"
# Gas parameters
consensus.eth_block_gas_limit = 84000000
consensus.gas_price = "4_761_904_800_000"
66 changes: 61 additions & 5 deletions z2/src/bin/z2.rs
@@ -37,6 +37,7 @@ enum Commands {
Perf(PerfStruct),
#[clap(subcommand)]
/// Group of subcommands to deploy and configure a Zilliqa 2 network
/// If you define the environment variable ZQ2_API_URL, we will use it in preference to the default API url for this network.
Deployer(DeployerCommands),
#[clap(subcommand)]
/// Convert Zilliqa 1 to Zilliqa 2 persistence
@@ -57,6 +58,8 @@ enum Commands {
Nodes(NodesStruct),
/// Start a node and join it to a network
JoinNode(JoinNodeStruct),
/// Run various tests on the chain
Test(TestStruct),
}

#[derive(Subcommand, Debug)]
@@ -104,6 +107,8 @@ enum DeployerCommands {
GeneratePrivateKeys(DeployerGenerateActionsArgs),
/// Generate the genesis key. --force to replace if already existing
GenerateGenesisKey(DeployerGenerateGenesisArgs),
/// Get info
Info(DeployerInfoArgs),
}

#[derive(Args, Debug)]
@@ -144,6 +149,9 @@ pub struct DeployerInstallArgs {
/// gsutil URI of the persistence file, e.g. gs://my-bucket/my-file
#[clap(long)]
persistence_url: Option<String>,
/// Machines to install
#[clap(long, num_args= 0..)]
machines: Vec<String>,
}

#[derive(Args, Debug)]
@@ -156,6 +164,9 @@ pub struct DeployerUpgradeArgs {
/// Define the number of nodes to process in parallel. Default: 1
#[clap(long)]
max_parallel: Option<usize>,
/// Machines to upgrade
#[clap(long, num_args= 0..)]
machines: Vec<String>,
}

#[derive(Args, Debug)]
@@ -224,6 +235,11 @@ pub struct DeployerGenerateGenesisArgs {
force: bool,
}

#[derive(Args, Debug)]
pub struct DeployerInfoArgs {
config_file: String,
}

#[derive(Subcommand, Debug)]
enum ConverterCommands {
/// Convert Zilliqa 1 to Zilliqa 2 persistence format.
@@ -300,6 +316,23 @@ struct DocStruct {
api_url: Option<String>,
}

#[derive(Args, Debug)]
struct TestStruct {
config_dir: String,
#[clap(long)]
#[clap(default_value = "warn")]
log_level: LogLevel,

#[clap(long)]
debug_modules: Vec<String>,

#[clap(long)]
trace_modules: Vec<String>,

#[arg(trailing_var_arg = true, hide = true)]
rest: Vec<String>,
}

// See https://jwodder.github.io/kbits/posts/clap-bool-negate/
#[derive(Args, Debug)]
struct RunStruct {
@@ -743,6 +776,7 @@ async fn main() -> Result<()> {
arg.select,
arg.max_parallel,
arg.persistence_url.clone(),
&arg.machines,
)
.await
.map_err(|err| {
@@ -756,11 +790,16 @@ async fn main() -> Result<()> {
"Provide a configuration file. [--config-file] mandatory argument"
)
})?;
-plumbing::run_deployer_upgrade(&config_file, arg.select, arg.max_parallel)
-.await
-.map_err(|err| {
-anyhow::anyhow!("Failed to run deployer upgrade command: {}", err)
-})?;
+plumbing::run_deployer_upgrade(
+&config_file,
+arg.select,
+arg.max_parallel,
+&arg.machines,
+)
+.await
+.map_err(|err| {
+anyhow::anyhow!("Failed to run deployer upgrade command: {}", err)
+})?;
Ok(())
}
DeployerCommands::GetConfigFile(ref arg) => {
@@ -917,6 +956,14 @@ async fn main() -> Result<()> {
})?;
Ok(())
}
DeployerCommands::Info(ref arg) => {
plumbing::run_deployer_info(&arg.config_file)
.await
.map_err(|err| {
anyhow::anyhow!("Failed to run deployer info command: {}", err)
})?;
Ok(())
}
},
Commands::Converter(converter_command) => match &converter_command {
ConverterCommands::Convert(ref arg) => {
@@ -1067,5 +1114,14 @@ async fn main() -> Result<()> {
.await?;
Ok(())
}
Commands::Test(ref arg) => {
let log_spec = utils::compute_log_string(
&arg.log_level.to_string(),
&arg.debug_modules,
&arg.trace_modules,
)?;
plumbing::test(&arg.config_dir, &base_dir, &log_spec, false, &arg.rest).await?;
Ok(())
}
}
}
2 changes: 1 addition & 1 deletion z2/src/chain.rs
@@ -47,7 +47,7 @@ impl Chain {

pub fn get_toml_contents(chain_name: &str) -> Result<&'static str> {
match chain_name {
-"zq2-richard" => Err(anyhow!("Configuration file for {} not found", chain_name)),
+"zq2-richard" => Ok(include_str!("../resources/chain-specs/zq2-richard.toml")),
"zq2-uccbtest" => Ok(include_str!("../resources/chain-specs/zq2-uccbtest.toml")),
"zq2-infratest" => Err(anyhow!("Configuration file for {} not found", chain_name)),
"zq2-perftest" => Ok(include_str!("../resources/chain-specs/zq2-perftest.toml")),
10 changes: 5 additions & 5 deletions z2/src/chain/config.rs
@@ -8,11 +8,11 @@ use crate::github;

#[derive(Debug, Clone, Deserialize, Serialize)]
pub struct NetworkConfig {
-pub(super) name: String,
-pub(super) eth_chain_id: u64,
-pub(super) project_id: String,
-pub(super) roles: Vec<NodeRole>,
-pub(super) versions: HashMap<String, String>,
+pub name: String,
+pub eth_chain_id: u64,
+pub project_id: String,
+pub roles: Vec<NodeRole>,
+pub versions: HashMap<String, String>,
}

impl NetworkConfig {
7 changes: 4 additions & 3 deletions z2/src/chain/node.rs
@@ -111,7 +111,7 @@ impl fmt::Display for NodeRole {
}
}

-#[derive(Clone, Debug)]
+#[derive(Clone, Serialize, Deserialize, Debug)]
pub struct Machine {
pub project_id: String,
pub zone: String,
@@ -386,7 +386,8 @@ impl ChainNode {
private_key.value().await?
} else {
return Err(anyhow!(
-"Found multiple private keys for the instance {}",
+"Found {} private keys for the instance {}",
+private_keys.len(),
&self.machine.name
));
};
@@ -764,7 +765,7 @@ pub async fn retrieve_secret_by_role(
.await
}

-async fn retrieve_secret_by_node_name(
+pub async fn retrieve_secret_by_node_name(
chain_name: &str,
project_id: &str,
node_name: &str,