Skip to content

Commit

Permalink
fix: real_s3 flakyness with rust tests (#5386)
Browse files Browse the repository at this point in the history
Fixes #5072. See proof from
#5072 (comment).
Turns out multiple threads can get the same nanoseconds since epoch, so
switch to using millis (for finding the prefix later on) and randomness
via `thread_rng` (protect against adversial ci runners).

Also changes the "per test looking alike" prefix to more "general"
prefix.
  • Loading branch information
koivunej authored Sep 26, 2023
1 parent 3322b6c commit 16f0622
Show file tree
Hide file tree
Showing 3 changed files with 14 additions and 3 deletions.
1 change: 1 addition & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions libs/remote_storage/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -29,3 +29,4 @@ workspace_hack.workspace = true
[dev-dependencies]
tempfile.workspace = true
test-context.workspace = true
rand.workspace = true
15 changes: 12 additions & 3 deletions libs/remote_storage/tests/test_real_s3.rs
Original file line number Diff line number Diff line change
Expand Up @@ -378,21 +378,30 @@ impl AsyncTestContext for MaybeEnabledS3WithSimpleTestBlobs {
fn create_s3_client(
max_keys_per_list_response: Option<i32>,
) -> anyhow::Result<Arc<GenericRemoteStorage>> {
use rand::Rng;

let remote_storage_s3_bucket = env::var("REMOTE_STORAGE_S3_BUCKET")
.context("`REMOTE_STORAGE_S3_BUCKET` env var is not set, but real S3 tests are enabled")?;
let remote_storage_s3_region = env::var("REMOTE_STORAGE_S3_REGION")
.context("`REMOTE_STORAGE_S3_REGION` env var is not set, but real S3 tests are enabled")?;
let random_prefix_part = std::time::SystemTime::now()

// due to how time works, we've had test runners use the same nanos as bucket prefixes.
// millis is just a debugging aid for easier finding the prefix later.
let millis = std::time::SystemTime::now()
.duration_since(UNIX_EPOCH)
.context("random s3 test prefix part calculation")?
.as_nanos();
.as_millis();

// because nanos can be the same for two threads so can millis, add randomness
let random = rand::thread_rng().gen::<u32>();

let remote_storage_config = RemoteStorageConfig {
max_concurrent_syncs: NonZeroUsize::new(100).unwrap(),
max_sync_errors: NonZeroU32::new(5).unwrap(),
storage: RemoteStorageKind::AwsS3(S3Config {
bucket_name: remote_storage_s3_bucket,
bucket_region: remote_storage_s3_region,
prefix_in_bucket: Some(format!("pagination_should_work_test_{random_prefix_part}/")),
prefix_in_bucket: Some(format!("test_{millis}_{random:08x}/")),
endpoint: None,
concurrency_limit: NonZeroUsize::new(100).unwrap(),
max_keys_per_list_response,
Expand Down

1 comment on commit 16f0622

@github-actions
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2564 tests run: 2441 passed, 1 failed, 122 skipped (full report)


Failures on Postgres 15

  • test_delete_timeline_client_hangup: debug
# Run all failed tests locally:
scripts/pytest -vv -n $(nproc) -k "test_delete_timeline_client_hangup[debug-pg15]"
Flaky tests (3)

Postgres 16

Postgres 14

  • test_get_tenant_size_with_multiple_branches: release

Test coverage report is not available

The comment gets automatically updated with the latest test results
16f0622 at 2023-09-26T15:40:35.181Z :recycle:

Please sign in to comment.