add async walredo mode (disabled-by-default, opt-in via config) #6548

Merged
Changes from 10 commits (75 commits total)
2c1652a
WIP: async walredo
problame Jan 27, 2024
a93be15
remove wal_redo_timeout
problame Jan 31, 2024
8012b80
some cleanup work
problame Jan 31, 2024
2736f61
error handling
problame Jan 31, 2024
639ed3c
clippy + compile errors
problame Jan 31, 2024
a29ac8b
clippy (again?)
problame Jan 31, 2024
4160d40
cfg(testing) still needs io::Write
problame Jan 31, 2024
70b37cf
WIP poison
problame Jan 31, 2024
b1b8ca3
working impl
problame Jan 31, 2024
9641374
move `poison` to `utils` and document
problame Jan 31, 2024
0cf5619
Merge remote-tracking branch 'origin/main' into problame/integrate-to…
problame Mar 13, 2024
cd6d9ab
WIP: throughput-oriented walredo benchmark
problame Mar 15, 2024
f31f2e9
finish benchmark impl (switch to criterion)
problame Mar 20, 2024
c853c61
replace bench_walredo with my impl
problame Mar 20, 2024
f038304
minimize diff
problame Mar 20, 2024
80de856
Merge branch 'main' into problame/integrate-tokio-epoll-uring/benchma…
problame Mar 20, 2024
48b22bd
walredo: better benchmark
problame Mar 20, 2024
929423c
add i3en.3xlarge reference numbers
problame Mar 20, 2024
081af38
Merge branch 'main' into problame/async-walredo/better-benchmark
problame Mar 20, 2024
a37d713
Merge branch 'main' into problame/async-walredo/better-benchmark
problame Mar 20, 2024
8677136
Merge branch 'problame/async-walredo/better-benchmark' into problame/…
problame Mar 20, 2024
15cfa7b
apply review suggestions
problame Mar 21, 2024
db3333e
yield after ever redo execution
problame Mar 21, 2024
d6c4562
update numbers (the yield makes a big difference, who would have thun…
problame Mar 21, 2024
c6a74bd
Merge branch 'problame/async-walredo/better-benchmark' into problame/…
problame Mar 21, 2024
a21409b
measure results
problame Mar 21, 2024
b2f5b84
cargo fmt
problame Mar 21, 2024
e669b6d
Merge branch 'problame/async-walredo/better-benchmark' into problame/…
problame Mar 21, 2024
86b0df9
apply review suggestion https://github.com/neondatabase/neon/pull/719…
problame Mar 21, 2024
3a5994b
Merge branch 'main' into problame/integrate-tokio-epoll-uring/benchma…
problame Mar 21, 2024
3dfc7de
use chrono::DateTime for Poisoned errors
problame Mar 21, 2024
655d3b6
audit for cancellation-safety
problame Mar 21, 2024
cca66e5
HACK: restore old impl, make runtime configurable (how to: reconfigur…
problame Mar 22, 2024
67a7abc
make the default process kind runtime-configurable, and switch to sync
problame Apr 3, 2024
c77ce7c
Merge remote-tracking branch 'origin/main' into problame/integrate-to…
problame Apr 3, 2024
31d4d1e
env_config from PR #6125
problame Apr 5, 2024
43cf9d1
env_config improvements
problame Apr 5, 2024
dc03f7a
pageserver: ability to use a single runtime
problame Apr 5, 2024
3779854
rename "single runtime" to "one runtime", allow configuring current_t…
problame Apr 5, 2024
5cf45df
remove env_config::Bool
problame Apr 5, 2024
740efb0
cleanup
problame Apr 5, 2024
6b820bb
fixup env var value parsing
problame Apr 5, 2024
70fb7e3
metric, useful for rollout / analyzing grafana metrics
problame Apr 5, 2024
edd7f69
make current_thread mode work
problame Apr 5, 2024
871a3ca
change thread name
problame Apr 5, 2024
dc8e318
fix copy-pasta
problame Apr 5, 2024
aa5439c
Merge remote-tracking branch 'origin/main' into problame/configurable…
problame Apr 8, 2024
5efadde
Merge remote-tracking branch 'origin/problame/configurable-one-runtim…
problame Apr 8, 2024
b72891d
Revert "make the default process kind runtime-configurable, and switc…
problame Apr 8, 2024
c38b3e6
Revert "HACK: restore old impl, make runtime configurable (how to: re…
problame Apr 8, 2024
d8a9266
tokio-test not necessary
problame Apr 8, 2024
ffef90f
Merge remote-tracking branch 'origin/main' into problame/integrate-to…
problame Apr 8, 2024
4ef2fb2
bring back wal_redo_timeout
problame Apr 8, 2024
bea2e12
Revert "Revert "HACK: restore old impl, make runtime configurable (ho…
problame Apr 8, 2024
f489a10
fixup: re-apply bring-back of wal_redo_timeout changes after file mov…
problame Apr 8, 2024
825c0e3
Revert "Revert "make the default process kind runtime-configurable, a…
problame Apr 8, 2024
c28ed6a
HACK: set walredo process kind metric on startup
problame Apr 9, 2024
845f2ea
adjust bench for both sync and async benchmarking
problame Apr 9, 2024
6f236e8
benchmark numbers
problame Apr 9, 2024
99c20c5
cleanups around metric
problame Apr 12, 2024
883a071
expose kind in tenant status
problame Apr 12, 2024
644f7f9
add failing test to ensure walredo config option works
problame Apr 12, 2024
237f27a
also assert metric is set
problame Apr 12, 2024
4a26245
remove runtime reconfiguration capability + assert a bit more (can pa…
problame Apr 12, 2024
f334235
address https://github.com/neondatabase/neon/pull/6548#discussion_r15…
problame Apr 15, 2024
005dcbd
simplify around ProcessKind enum type; addresses https://github.com/n…
problame Apr 15, 2024
18c4b35
indentation
problame Apr 15, 2024
df5feb7
fixup(005dcbd6a89f06db2577edfb51d3aea0f287d491): bench_walredo
problame Apr 15, 2024
b6e168b
fixup(4a26245d993a840ec36942e4ebab476e6d8524aa): sometimes bench runs…
problame Apr 15, 2024
fb11c39
rerun benchmark
problame Apr 15, 2024
b311615
also level = DEBUG the process_std
problame Apr 15, 2024
bd53ab8
rerun benches
problame Apr 15, 2024
989de61
undo level = DEBUG and re-run benchmarks
problame Apr 15, 2024
4fd26c2
fixup: empty line
problame Apr 15, 2024
cecc9bc
Merge branch 'main' into problame/integrate-tokio-epoll-uring/benchma…
problame Apr 15, 2024
14 changes: 14 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions Cargo.toml
@@ -157,6 +157,7 @@ tokio-postgres-rustls = "0.10.0"
tokio-rustls = "0.24"
tokio-stream = "0.1"
tokio-tar = "0.3"
+tokio-test = "0.4.3"
tokio-util = { version = "0.7.10", features = ["io", "rt"] }
toml = "0.7"
toml_edit = "0.19"
1 change: 1 addition & 0 deletions libs/utils/Cargo.toml
@@ -66,6 +66,7 @@ criterion.workspace = true
hex-literal.workspace = true
camino-tempfile.workspace = true
serde_assert.workspace = true
+tokio-test.workspace = true

[[bench]]
name = "benchmarks"
2 changes: 2 additions & 0 deletions libs/utils/src/lib.rs
@@ -87,6 +87,8 @@ pub mod failpoint_support;

pub mod yielding_loop;

+pub mod poison;
+
/// This is a shortcut to embed git sha into binaries and avoid copying the same build script to all packages
///
/// we have several cases:
120 changes: 120 additions & 0 deletions libs/utils/src/poison.rs
@@ -0,0 +1,120 @@
//! Protect a piece of state from reuse after it is left in an inconsistent state.
//!
//! # Example
//!
//! ```
//! # tokio_test::block_on(async {
//! use utils::poison::Poison;
//! use std::time::Duration;
//!
//! struct State {
//!     clean: bool,
//! }
//! let state = tokio::sync::Mutex::new(Poison::new("mystate", State { clean: true }));
//!
//! let mut mutex_guard = state.lock().await;
//! let mut poison_guard = mutex_guard.check_and_arm()?;
//! let state = poison_guard.data_mut();
//! state.clean = false;
//! // If we get cancelled at this await point, subsequent check_and_arm() calls will fail.
//! tokio::time::sleep(Duration::from_secs(10)).await;
//! state.clean = true;
//! poison_guard.disarm();
//! # Ok::<(), utils::poison::Error>(())
//! # });
//! ```

use std::time::Instant;

use tracing::warn;

pub struct Poison<T> {
    what: &'static str,
    state: State,
    data: T,
}

#[derive(Clone, Copy)]
enum State {
    Clean,
    Armed,
    Poisoned { at: Instant },
}

impl<T> Poison<T> {
    /// We log `what` at `warn!` level if the [`Guard`] gets dropped without being [`Guard::disarm`]ed.
    pub fn new(what: &'static str, data: T) -> Self {
        Self {
            what,
            state: State::Clean,
            data,
        }
    }

    /// Check for poisoning and return a [`Guard`] that provides access to the wrapped state.
    pub fn check_and_arm(&mut self) -> Result<Guard<T>, Error> {
        match self.state {
            State::Clean => {
                self.state = State::Armed;
                Ok(Guard(self))
            }
            State::Armed => unreachable!("transient state"),
            State::Poisoned { at } => Err(Error::Poisoned {
                what: self.what,
                at,
            }),
        }
    }
}

/// Use [`Self::data`] and [`Self::data_mut`] to access the wrapped state.
/// Once modifications are done, use [`Self::disarm`].
/// If [`Guard`] gets dropped instead of calling [`Self::disarm`], the state is poisoned
/// and subsequent calls to [`Poison::check_and_arm`] will fail with an error.
pub struct Guard<'a, T>(&'a mut Poison<T>);

impl<'a, T> Guard<'a, T> {
    pub fn data(&self) -> &T {
        &self.0.data
    }
    pub fn data_mut(&mut self) -> &mut T {
        &mut self.0.data
    }

    pub fn disarm(self) {
        match self.0.state {
            State::Clean => unreachable!("we set it to Armed in check_and_arm()"),
            State::Armed => {
                self.0.state = State::Clean;
            }
            State::Poisoned { at } => {
                unreachable!("we fail check_and_arm() if it's in that state: {at:?}")
            }
        }
    }
}

impl<'a, T> Drop for Guard<'a, T> {
    fn drop(&mut self) {
        match self.0.state {
            State::Clean => {
                // set by disarm()
            }
            State::Armed => {
                // still armed => poison it
                let at = Instant::now();
                self.0.state = State::Poisoned { at };
                warn!(at=?at, "poisoning {}", self.0.what);
            }
            State::Poisoned { at } => {
                unreachable!("we fail check_and_arm() if it's in that state: {at:?}")
            }
        }
    }
}

#[derive(thiserror::Error, Debug)]
pub enum Error {
    #[error("poisoned at {at:?}: {what}")]
    Poisoned { what: &'static str, at: Instant },
}
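
Reviewer note: the poisoning behaviour described in the module docs can be reproduced without an async runtime. The following is a minimal, std-only sketch, not the PR's code. It drops `tracing` and `thiserror`, replaces `Error::Poisoned` with a plain `Poisoned` struct, and adds a hypothetical helper `push_pair` whose `fail_midway` flag stands in for a cancelled future or an early return.

```rust
use std::time::Instant;

#[derive(Clone, Copy)]
enum State {
    Clean,
    Armed,
    Poisoned { at: Instant },
}

pub struct Poison<T> {
    what: &'static str,
    state: State,
    data: T,
}

/// Simplified stand-in for the PR's `Error::Poisoned` (assumption: no `thiserror`).
#[derive(Debug)]
pub struct Poisoned {
    pub what: &'static str,
    pub at: Instant,
}

impl<T> Poison<T> {
    pub fn new(what: &'static str, data: T) -> Self {
        Self { what, state: State::Clean, data }
    }

    pub fn check_and_arm(&mut self) -> Result<Guard<'_, T>, Poisoned> {
        match self.state {
            State::Clean => {
                self.state = State::Armed;
                Ok(Guard(self))
            }
            State::Armed => unreachable!("transient state"),
            State::Poisoned { at } => Err(Poisoned { what: self.what, at }),
        }
    }
}

pub struct Guard<'a, T>(&'a mut Poison<T>);

impl<'a, T> Guard<'a, T> {
    pub fn data_mut(&mut self) -> &mut T {
        &mut self.0.data
    }

    pub fn disarm(self) {
        // `Drop` runs right after this and sees `Clean`, so nothing gets poisoned.
        self.0.state = State::Clean;
    }
}

impl<'a, T> Drop for Guard<'a, T> {
    fn drop(&mut self) {
        // Still armed means the caller never reached `disarm()`.
        if let State::Armed = self.0.state {
            self.0.state = State::Poisoned { at: Instant::now() };
        }
    }
}

/// Hypothetical two-step update; `fail_midway` simulates cancellation or an error.
pub fn push_pair(p: &mut Poison<Vec<u8>>, fail_midway: bool) -> Result<(), String> {
    let mut guard = p
        .check_and_arm()
        .map_err(|e| format!("{} is poisoned", e.what))?;
    guard.data_mut().push(1);
    if fail_midway {
        // Early return: the guard is dropped while armed, which poisons `p`.
        return Err("interrupted".to_string());
    }
    guard.data_mut().push(2);
    guard.disarm();
    Ok(())
}
```

Calling `push_pair(&mut p, true)` once leaves `p` poisoned, so every later `check_and_arm()` returns `Err`, including a subsequent `push_pair(&mut p, false)`.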
19 changes: 0 additions & 19 deletions pageserver/src/config.rs
@@ -57,7 +57,6 @@ pub mod defaults {
pub use storage_broker::DEFAULT_ENDPOINT as BROKER_DEFAULT_ENDPOINT;

pub const DEFAULT_WAIT_LSN_TIMEOUT: &str = "60 s";
-pub const DEFAULT_WAL_REDO_TIMEOUT: &str = "60 s";

pub const DEFAULT_SUPERUSER: &str = "cloud_admin";

@@ -94,7 +93,6 @@ pub mod defaults {
#listen_http_addr = '{DEFAULT_HTTP_LISTEN_ADDR}'

#wait_lsn_timeout = '{DEFAULT_WAIT_LSN_TIMEOUT}'
-#wal_redo_timeout = '{DEFAULT_WAL_REDO_TIMEOUT}'

#page_cache_size = {DEFAULT_PAGE_CACHE_SIZE}
#max_file_descriptors = {DEFAULT_MAX_FILE_DESCRIPTORS}
@@ -162,8 +160,6 @@ pub struct PageServerConf {

// Timeout when waiting for WAL receiver to catch up to an LSN given in a GetPage@LSN call.
pub wait_lsn_timeout: Duration,
-// How long to wait for WAL redo to complete.
-pub wal_redo_timeout: Duration,

pub superuser: String,

@@ -291,7 +287,6 @@ struct PageServerConfigBuilder {
availability_zone: BuilderValue<Option<String>>,

wait_lsn_timeout: BuilderValue<Duration>,
-wal_redo_timeout: BuilderValue<Duration>,

superuser: BuilderValue<String>,

@@ -354,8 +349,6 @@ impl Default for PageServerConfigBuilder {
availability_zone: Set(None),
wait_lsn_timeout: Set(humantime::parse_duration(DEFAULT_WAIT_LSN_TIMEOUT)
.expect("cannot parse default wait lsn timeout")),
-wal_redo_timeout: Set(humantime::parse_duration(DEFAULT_WAL_REDO_TIMEOUT)
-.expect("cannot parse default wal redo timeout")),
superuser: Set(DEFAULT_SUPERUSER.to_string()),
page_cache_size: Set(DEFAULT_PAGE_CACHE_SIZE),
max_file_descriptors: Set(DEFAULT_MAX_FILE_DESCRIPTORS),
@@ -440,10 +433,6 @@ impl PageServerConfigBuilder {
self.wait_lsn_timeout = BuilderValue::Set(wait_lsn_timeout)
}

-pub fn wal_redo_timeout(&mut self, wal_redo_timeout: Duration) {
-self.wal_redo_timeout = BuilderValue::Set(wal_redo_timeout)
-}
-
pub fn superuser(&mut self, superuser: String) {
self.superuser = BuilderValue::Set(superuser)
}
@@ -601,9 +590,6 @@ impl PageServerConfigBuilder {
wait_lsn_timeout: self
.wait_lsn_timeout
.ok_or(anyhow!("missing wait_lsn_timeout"))?,
-wal_redo_timeout: self
-.wal_redo_timeout
-.ok_or(anyhow!("missing wal_redo_timeout"))?,
superuser: self.superuser.ok_or(anyhow!("missing superuser"))?,
page_cache_size: self
.page_cache_size
@@ -860,7 +846,6 @@ impl PageServerConf {
"listen_http_addr" => builder.listen_http_addr(parse_toml_string(key, item)?),
"availability_zone" => builder.availability_zone(Some(parse_toml_string(key, item)?)),
"wait_lsn_timeout" => builder.wait_lsn_timeout(parse_toml_duration(key, item)?),
-"wal_redo_timeout" => builder.wal_redo_timeout(parse_toml_duration(key, item)?),
"initial_superuser_name" => builder.superuser(parse_toml_string(key, item)?),
"page_cache_size" => builder.page_cache_size(parse_toml_u64(key, item)? as usize),
"max_file_descriptors" => {
@@ -978,7 +963,6 @@ impl PageServerConf {
PageServerConf {
id: NodeId(0),
wait_lsn_timeout: Duration::from_secs(60),
-wal_redo_timeout: Duration::from_secs(60),
page_cache_size: defaults::DEFAULT_PAGE_CACHE_SIZE,
max_file_descriptors: defaults::DEFAULT_MAX_FILE_DESCRIPTORS,
listen_pg_addr: defaults::DEFAULT_PG_LISTEN_ADDR.to_string(),
@@ -1164,7 +1148,6 @@ listen_pg_addr = '127.0.0.1:64000'
listen_http_addr = '127.0.0.1:9898'

wait_lsn_timeout = '111 s'
-wal_redo_timeout = '111 s'

page_cache_size = 444
max_file_descriptors = 333
@@ -1205,7 +1188,6 @@ background_task_maximum_delay = '334 s'
listen_http_addr: defaults::DEFAULT_HTTP_LISTEN_ADDR.to_string(),
availability_zone: None,
wait_lsn_timeout: humantime::parse_duration(defaults::DEFAULT_WAIT_LSN_TIMEOUT)?,
-wal_redo_timeout: humantime::parse_duration(defaults::DEFAULT_WAL_REDO_TIMEOUT)?,
superuser: defaults::DEFAULT_SUPERUSER.to_string(),
page_cache_size: defaults::DEFAULT_PAGE_CACHE_SIZE,
max_file_descriptors: defaults::DEFAULT_MAX_FILE_DESCRIPTORS,
@@ -1279,7 +1261,6 @@ background_task_maximum_delay = '334 s'
listen_http_addr: "127.0.0.1:9898".to_string(),
availability_zone: None,
wait_lsn_timeout: Duration::from_secs(111),
-wal_redo_timeout: Duration::from_secs(111),
superuser: "zzzz".to_string(),
page_cache_size: 444,
max_file_descriptors: 333,
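
Reviewer note: the `config.rs` hunks show one option wired through several places: the default constant, the builder field, the setter, the `ok_or` check when building, and the key match in the TOML parser. Below is a condensed, std-only sketch of that wiring under stated assumptions. `NotSet`, `Conf`, `ConfBuilder`, `parse_secs` and `apply` are hypothetical names, `String` errors stand in for `anyhow`, and rejecting unknown keys is an assumption of the sketch rather than something this diff confirms.

```rust
use std::time::Duration;

/// Only `Set` and `ok_or` appear in the diff; `NotSet` is an assumed variant.
pub enum BuilderValue<T> {
    Set(T),
    NotSet,
}

impl<T> BuilderValue<T> {
    pub fn ok_or(self, err: String) -> Result<T, String> {
        match self {
            BuilderValue::Set(v) => Ok(v),
            BuilderValue::NotSet => Err(err),
        }
    }
}

#[derive(Debug, PartialEq)]
pub struct Conf {
    pub wait_lsn_timeout: Duration,
}

pub struct ConfBuilder {
    wait_lsn_timeout: BuilderValue<Duration>,
}

impl Default for ConfBuilder {
    fn default() -> Self {
        Self {
            // Mirrors DEFAULT_WAIT_LSN_TIMEOUT = "60 s".
            wait_lsn_timeout: BuilderValue::Set(Duration::from_secs(60)),
        }
    }
}

impl ConfBuilder {
    pub fn wait_lsn_timeout(&mut self, v: Duration) {
        self.wait_lsn_timeout = BuilderValue::Set(v)
    }

    pub fn build(self) -> Result<Conf, String> {
        Ok(Conf {
            wait_lsn_timeout: self
                .wait_lsn_timeout
                .ok_or("missing wait_lsn_timeout".to_string())?,
        })
    }
}

/// Tiny stand-in for `humantime::parse_duration`; accepts only "<n> s".
pub fn parse_secs(value: &str) -> Result<Duration, String> {
    let digits = value
        .strip_suffix('s')
        .ok_or_else(|| format!("expected '<n> s', got '{value}'"))?
        .trim();
    digits
        .parse::<u64>()
        .map(Duration::from_secs)
        .map_err(|e| format!("bad duration '{value}': {e}"))
}

/// Applies one key/value pair; with the `wal_redo_timeout` arm gone,
/// that key falls through to the catch-all arm.
pub fn apply(builder: &mut ConfBuilder, key: &str, value: &str) -> Result<(), String> {
    match key {
        "wait_lsn_timeout" => builder.wait_lsn_timeout(parse_secs(value)?),
        other => return Err(format!("unrecognized option '{other}'")),
    }
    Ok(())
}
```

If the real parser also rejects unknown keys, config files that still set `wal_redo_timeout` would need updating alongside this change.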