etcd database corruption in microshift bundles leading to microshift service not starting #823

Open
anjannath opened this issue Nov 22, 2023 · 0 comments

When the etcd database is corrupted, microshift.service is unable to start. etcd does not recover from a corrupted database on its own, so it also fails to start, and MicroShift stays unusable until the CRC VM is deleted and a new one is created.

Snippet from the journalctl -u microshift output:

Nov 21 04:26:07 api.crc.testing microshift[2486]: etcd W1121 04:26:07.587521    2486 logging.go:59] [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport faile>
Nov 21 04:26:07 api.crc.testing microshift[2486]:   "Addr": "localhost:2379",
Nov 21 04:26:07 api.crc.testing microshift[2486]:   "ServerName": "localhost",
Nov 21 04:26:07 api.crc.testing microshift[2486]:   "Attributes": null,
Nov 21 04:26:07 api.crc.testing microshift[2486]:   "BalancerAttributes": null,
Nov 21 04:26:07 api.crc.testing microshift[2486]:   "Type": 0,
Nov 21 04:26:07 api.crc.testing microshift[2486]:   "Metadata": null
Nov 21 04:26:07 api.crc.testing microshift[2486]: }. Err: connection error: desc = "transport: Error while dialing dial tcp [::1]:2379: connect: connection refused"
Nov 21 04:26:07 api.crc.testing microshift[2486]: etcd W1121 04:26:07.650006    2486 etcd.go:110] etcd failed waiting on process to finish: exit status 2
Nov 21 04:26:07 api.crc.testing microshift[2486]: etcd I1121 04:26:07.650022    2486 etcd.go:112] etcd process quit: exit status 2
Nov 21 04:26:07 api.crc.testing microshift[2486]: etcd W1121 04:26:07.650025    2486 etcd.go:115] microshift-etcd process terminated prematurely, restarting MicroShift

etcd error when running microshift-etcd manually inside the MicroShift VM:

bash-5.1# systemd-run --uid=root --scope --unit microshift-etcd --property Before=microshift.service --property BindsTo=microshift.service /usr/bin/microshift-etcd run
Running scope as unit: microshift-etcd.scope
{"level":"warn","ts":"2023-11-21T06:48:48.426586-0500","caller":"embed/config.go:677","msg":"Running http and grpc server on single port. This is not recommended for production."}
{"level":"info","ts":"2023-11-21T06:48:48.426702-0500","caller":"embed/etcd.go:127","msg":"configuring peer listeners","listen-peer-urls":["https://localhost:2380/"]}
{"level":"info","ts":"2023-11-21T06:48:48.426722-0500","caller":"embed/etcd.go:497","msg":"starting with peer TLS","tls-info":"cert = /var/lib/microshift/certs/etcd-signer/etcd-peer/peer.crt, key = /var/lib/microshift/certs/etcd-signer/etcd-peer/peer.key, client-cert=, client-key=, trusted-ca = /var/lib/microshift/certs/etcd-signer/ca.crt, client-cert-auth = false, crl-file = ","cipher-suites":["TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384","TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384","TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256","TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256","TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305","TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305"]}
{"level":"info","ts":"2023-11-21T06:48:48.427615-0500","caller":"embed/etcd.go:135","msg":"configuring client listeners","listen-client-urls":["https://localhost:2379/"]}
{"level":"info","ts":"2023-11-21T06:48:48.427783-0500","caller":"embed/etcd.go:310","msg":"starting an etcd server","etcd-version":"3.5.9","git-sha":"Not provided (use ./build instead of go build)","go-version":"go1.20.10 X:strictfipsruntime","go-os":"linux","go-arch":"arm64","max-cpu-set":2,"max-cpu-available":2,"member-initialized":true,"name":"api.crc.testing","data-dir":"/var/lib/microshift/etcd","wal-dir":"","wal-dir-dedicated":"","member-dir":"/var/lib/microshift/etcd/member","force-new-cluster":false,"heartbeat-interval":"100ms","election-timeout":"1s","initial-election-tick-advance":true,"snapshot-count":100000,"max-wals":5,"max-snapshots":5,"snapshot-catchup-entries":5000,"initial-advertise-peer-urls":["https://localhost:2380/"],"listen-peer-urls":["https://localhost:2380/"],"advertise-client-urls":["https://localhost:2379/"],"listen-client-urls":["https://localhost:2379/"],"listen-metrics-urls":["https://localhost:2381/"],"cors":["*"],"host-whitelist":["*"],"initial-cluster":"","initial-cluster-state":"new","initial-cluster-token":"","quota-backend-bytes":8589934592,"max-request-bytes":1572864,"max-concurrent-streams":4294967295,"pre-vote":true,"initial-corrupt-check":false,"corrupt-check-time-interval":"0s","compact-check-time-enabled":false,"compact-check-time-interval":"1m0s","auto-compaction-mode":"","auto-compaction-retention":"0s","auto-compaction-interval":"0s","discovery-url":"","discovery-proxy":"","downgrade-check-interval":"5s","max-learners":1}
panic: freepages: failed to get all reachable pages (page 4: multiple references (stack: [192 785 37 4]))

goroutine 142 [running]:
go.etcd.io/bbolt.(*DB).freepages.func2()
	/builddir/build/BUILD/microshift-4.14.1/etcd/vendor/go.etcd.io/bbolt/db.go:1178 +0x8c
created by go.etcd.io/bbolt.(*DB).freepages
	/builddir/build/BUILD/microshift-4.14.1/etcd/vendor/go.etcd.io/bbolt/db.go:1176 +0x13c
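
To help tell this failure apart from other startup problems, here is a minimal sketch that scans log lines for the two signatures seen above: the "terminated prematurely" message from MicroShift and the bbolt freepages panic from etcd. The function name `classify` and the result keys are hypothetical, chosen only for illustration. The match strings are copied from the logs in this report and may differ in other MicroShift or etcd versions.

```python
import re

# Signatures copied from the logs in this report. They are assumptions about
# what to look for, not a stable interface of MicroShift or bbolt.
ETCD_TERMINATED = "microshift-etcd process terminated prematurely"
BBOLT_PANIC = re.compile(
    r"panic: freepages: failed to get all reachable pages \((.*)\)\s*$"
)


def classify(lines):
    """Report which of the two failure signatures appear in the given lines.

    Returns a dict with:
      etcd_terminated: True if MicroShift logged that microshift-etcd quit early.
      bbolt_panic: the detail text of the bbolt freepages panic, or None.
    """
    result = {"etcd_terminated": False, "bbolt_panic": None}
    for line in lines:
        if ETCD_TERMINATED in line:
            result["etcd_terminated"] = True
        match = BBOLT_PANIC.search(line)
        if match:
            # Keep the text inside the outer parentheses, e.g. the page number
            # and the reference stack reported by bbolt.
            result["bbolt_panic"] = match.group(1)
    return result
```

The lines could come from the saved output of journalctl -u microshift or from the manual microshift-etcd run. Seeing both signatures together suggests the restart loop is caused by the corrupted database rather than by a networking or certificate problem.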