sql: use WAL mode and set pragmas correctly #98
LGTM.
A long comment, potentially in the wrong issue, follows.
busy_timeout doesn't always work as we expect with deferred transactions (the default).
Here are a few examples that all start with the same setup. The prefix X: means the statement runs on DB connection X.
setup
A: create table T(t text);
A: create table R(r text);
A: insert into T values ('a'), ('b');
A: insert into R values ('a'), ('b');
A: begin;
A: update T set t='x' where t='a';
B: PRAGMA busy_timeout=5000;
select same table
Succeeds (as expected because WAL allows concurrent read+writes).
B: begin;
B: select * from T;
a
b
B: commit;
update different table
Blocks for 5 seconds (the busy_timeout) and then returns an error.
B: begin;
B: update R set r='x' where r='a';
# Blocks for 5 seconds
Runtime error: database is locked (5)
select+update different table
Immediately returns an error; the busy_timeout that was set is not respected. I can see this case happening with the Replace() method (we're doing both read+write), e.g. a new Pods event comes in and, at the same time, Namespaces resyncs.
Note: retrying the update will keep returning an error even after A's transaction is committed. Yes, even if A updated a different table.
B: begin;
B: select * from R;
a
b
B: update R set r='x' where r='a';
Runtime error: database is locked (5)
B: update R set r='x' where r='a';
Runtime error: database is locked (5)
A: commit;
B: update R set r='x' where r='a';
Runtime error: database is locked (5)
select+update with immediate
Contrast the above scenario with immediate mode transactions. BEGIN IMMEDIATE tries to acquire the write lock right away instead of on the first write statement. Once it succeeds, the rest of the transaction is guaranteed not to fail with a database is locked error.
B: begin immediate;
# Blocks for 5 seconds waiting for commit
Runtime error: database is locked (5)
A: commit;
B: begin immediate;
B: select * from R;
a
b
B: update R set r='x' where r='a';
B: commit;
Should we move the retry logic elsewhere rather than removing it OR use immediate transactions OR use a single connection specifically for writes?
@tomleb I am very glad you brought this up as it gave me the opportunity to learn things I did not know. "select+update different table" behavior seemed obscure/broken to me for a few days, until I found a tangentially related article that made it clear why it behaves that way:
Fundamentally Back to our options:
Unless you disagree or see flaws in the reasoning above, I'd start by trying to go with option 3 in a separate PR and see how it goes, as it seems to minimize the complexity of the finished thing.
@moio thanks for letting me know about this article, it's got really good information! I'm okay with you trying out option 3. I'm tempted to try options 2 & 4 together myself and see how that turns out (inspired by the rqlite codebase). Naively, I'm thinking the interface is already defined for us, so we know what needs read-only and what needs read-write. Though of course, as the code changes, I suspect some non-mock test would be able to find issues here, such as trying to write to a database using a read-only connection.
🤦 I wrote option 3 while meaning option 4, as I added a point mid-editing 😮‍💨 Are you still OK with that? Unrelated: I absolutely do not oppose you going down the option 2 route. In my eyes, once that is done and correctly enqueues write transactions, adding option 4 on top is not even needed. Let me know what you decide in the end.
Yes, definitely.
Heh you're right, I hadn't thought of that. Once we limit writing to a single connection, deferred mode cannot fail with database is locked.
I played a bit with option 4, and I'm seeing some weird behavior when opening the DB with immediate set as the default transaction mode:

package main

import (
	"context"
	"database/sql"
	"log"
	"os"
	"os/signal"
	"syscall"
	"time"

	_ "modernc.org/sqlite"
)

func main() {
	if err := mainErr(); err != nil {
		log.Fatal(err)
	}
}

func mainErr() error {
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()

	// cache=shared is necessary for in-memory DB
	rwDB, err := sql.Open("sqlite", "file::memory:?cache=shared&_pragma=journal_mode=wal&_txlock=immediate&_pragma=busy_timeout=5000")
	if err != nil {
		return err
	}
	rwDB.SetMaxOpenConns(2)

	// First connection: open a transaction and hold it for 10 seconds
	conn, err := rwDB.Conn(ctx)
	if err != nil {
		return err
	}
	tx, err := conn.BeginTx(ctx, nil)
	if err != nil {
		return err
	}

	go func() {
		// Second connection: try to start a transaction while the first is open
		conn2, err2 := rwDB.Conn(ctx)
		if err2 != nil {
			log.Fatal(err2)
		}
		// BeginTx will fail with database is locked error even though
		// we set immediate transaction as default
		_, err2 = conn2.BeginTx(ctx, nil)
		// Running BEGIN IMMEDIATE manually gives us the behavior we
		// want: busy_timeout is respected
		//_, err2 = conn2.ExecContext(ctx, "BEGIN IMMEDIATE;")
		if err2 != nil {
			log.Fatal(err2)
		}
	}()

	time.Sleep(10 * time.Second)
	return tx.Commit()
}
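The manual BEGIN IMMEDIATE workaround mentioned in the program's comments could be wrapped in a small helper along these lines. This is only a sketch under stated assumptions: withImmediate, execer and fakeConn are hypothetical names, and the fake connection merely records statements so the example runs without a SQLite driver. A real *sql.Conn satisfies the execer interface through its ExecContext method.

```go
package main

import (
	"context"
	"database/sql"
	"errors"
	"fmt"
)

// execer is the subset of *sql.Conn this sketch needs (hypothetical name).
type execer interface {
	ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
}

// withImmediate (hypothetical helper) runs fn between a manually issued
// BEGIN IMMEDIATE and COMMIT on one connection, rolling back if fn fails.
func withImmediate(ctx context.Context, c execer, fn func(execer) error) error {
	if _, err := c.ExecContext(ctx, "BEGIN IMMEDIATE"); err != nil {
		return fmt.Errorf("begin immediate: %w", err)
	}
	if err := fn(c); err != nil {
		if _, rbErr := c.ExecContext(ctx, "ROLLBACK"); rbErr != nil {
			return fmt.Errorf("%v (rollback also failed: %v)", err, rbErr)
		}
		return err
	}
	_, err := c.ExecContext(ctx, "COMMIT")
	return err
}

// fakeConn records statements instead of talking to a database.
type fakeConn struct{ log []string }

func (f *fakeConn) ExecContext(_ context.Context, q string, _ ...any) (sql.Result, error) {
	f.log = append(f.log, q)
	return nil, nil
}

// demo returns the statements issued for a succeeding or failing body.
func demo(fail bool) []string {
	ctx := context.Background()
	c := &fakeConn{}
	_ = withImmediate(ctx, c, func(e execer) error {
		if fail {
			return errors.New("boom")
		}
		_, err := e.ExecContext(ctx, "update R set r='x' where r='a'")
		return err
	})
	return c.log
}

func main() {
	fmt.Println(demo(false))
	fmt.Println(demo(true))
}
```

All statements must go through the same connection, which is why the helper takes a single execer rather than a pool: with a *sql.DB, consecutive ExecContext calls may land on different connections.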
modernc's implementation of SQLite uses a slightly different syntax
Signed-off-by: Silvio Moioli <[email protected]>
120 seconds, or two consecutive minutes of uninterrupted contention, should be more than sufficient, and anything exceeding this value should be reported as an error.
Signed-off-by: Silvio Moioli <[email protected]>
With busy_timeout properly set, SQLite will take care of the error internally. Moreover, with WAL mode properly set, the error should almost never happen to begin with.
https://www.sqlite.org/wal.html#sometimes_queries_return_sqlite_busy_in_wal_mode
Signed-off-by: Silvio Moioli <[email protected]>
LGTM.
I'm still not sure whether I prefer having 2 separate connection pools (1 read-write, 1 read-only) or using begin immediate. Immediate transaction mode has a footgun (imo) with modernc.org/sqlite (and github.com/mattn/go-sqlite3): busy_timeout is NOT respected when using cache=shared and executing BeginTx, but it IS respected when using cache=shared and executing ExecContext("begin immediate"). We're not using cache=shared, but if we ever want to use :memory: databases for tests, we will have to use it, which will lead to very different behavior between tests and production.
As I understand it, 2 separate connection pools would avoid this problem since the read-write connection would be blocked by the application instead.
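The idea of blocking writers in the application can be sketched as follows. This is an illustration only: writeGate and simulate are hypothetical names, and a plain mutex stands in for the read-write pool. With database/sql, a similar effect is commonly obtained by giving the read-write *sql.DB a limit of one connection via SetMaxOpenConns(1), so that callers wait for the connection instead of contending inside SQLite.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// writeGate (hypothetical) serializes write transactions in the application,
// so that at most one of them reaches SQLite at any time.
type writeGate struct{ mu sync.Mutex }

// Do runs fn while holding the gate. In a real setup fn would begin, run
// and commit one transaction on the read-write connection.
func (g *writeGate) Do(fn func() error) error {
	g.mu.Lock()
	defer g.mu.Unlock()
	return fn()
}

// simulate starts n concurrent writers and reports the highest number of
// writers observed inside the gate at once, plus how many completed.
func simulate(n int) (maxInside, done int64) {
	var g writeGate
	var inside, peak, completed int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = g.Do(func() error {
				cur := atomic.AddInt64(&inside, 1)
				if cur > atomic.LoadInt64(&peak) {
					atomic.StoreInt64(&peak, cur)
				}
				completed++ // only one writer is inside the gate at a time
				atomic.AddInt64(&inside, -1)
				return nil
			})
		}()
	}
	wg.Wait()
	return peak, completed
}

func main() {
	peak, done := simulate(50)
	fmt.Println("max concurrent writers:", peak, "completed:", done)
}
```

Because waiting happens in Go rather than in SQLite's busy handler, the behavior does not depend on busy_timeout or on the cache mode, which is the property the comment above is after.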
I am not exactly sure this is actually the case - as BeginTx with the change in https://github.com/rancher/lasso/pull/98/files#diff-28f84babc5eeebb6a449367653bf97f8670e2816ffd8f898eb592dd3819285b0R236 actually calls In any case, I would not recommend going down the
Yes, should but once you actually test it they end up not being equivalent.
Fair, though we might end up having to be able to set the path to the DB to avoid having multiple tests. Anyway, nothing to change here, I'm good with the current solution 👍
Approving since this has the two required approvers.
This includes various fixes to the UI Server-Side Pagination experimental feature:
https://ranchermanager.docs.rancher.com/how-to-guides/advanced-user-guides/enable-experimental-features/ui-server-side-pagination
- rancher/lasso#86
- rancher/lasso#90
- rancher/lasso#85
- rancher/lasso#92
- rancher/lasso#79
- rancher/lasso#97
- rancher/lasso#94
- rancher/lasso#98
- rancher/lasso#108
- rancher/lasso#99
It also includes improvements to code collecting metrics:
- rancher/lasso#89
- rancher/lasso#95
And to test code:
- rancher/lasso#100
- rancher/lasso#101
Signed-off-by: Silvio Moioli <[email protected]>
This PR:
- removes the cache=shared option, as sql: drop the attachdriver mechanism #97 removed the need for it
- uses BEGIN on read-only transactions and BEGIN IMMEDIATE for transactions expected to write
The last point is necessary in order to avoid SQLITE_BUSY (5) errors in the presence of multiple concurrent writing transactions. Without it, busy_timeout would not be sufficient; see the discussion in this PR.
Best reviewed commit-by-commit.
Benchmark notes
Results from a quick and rough benchmark on my dev environment look good - performance is on par or superior compared to the situation without this PR.
This test lists the first page of ConfigMaps with a varying number of ConfigMaps: 0 up to 6000. Also, a number of ConfigMaps are changed (added and deleted) every second: 0 up to 400.
Results are the p(95) of the whole request time:
(Grey is without this PR, green is with this PR. Lower is better)
Note 1: the test at 6000 ConfigMaps failed with a SQLite error after 100 changes per second: this PR also fixes that bug 😇
Note 2: results are also robustly better than running with Vai off (no SQLite pagination):
(Grey is without this PR, green is with this PR, red is with Vai off. Lower is better)
To replicate use:
https://gist.github.com/moio/20a43fa406432cec5fc8aee0394d3e1f
With these scripts from the dartboard project:
https://github.com/rancher/dartboard/blob/main/k6/api_benchmark.js
https://github.com/rancher/dartboard/blob/main/k6/create_k8s_resources.js
https://github.com/rancher/dartboard/blob/main/k6/change_config_maps.js