VTGate crashes with the following message when a specific query is sent to it:
fatal error: concurrent map writes
We observed it in production (see the 4 attached stacktraces) every time a specific set of queries was performed, and we can reproduce it consistently in our dev environment. We found a workaround: these queries crash VTGate when they are run in OLAP mode, but not when they are run in OLTP mode (we don't know why the workload setting matters).
We moved our platform to Vitess last night and were happy to use the workaround to get everything online and stable, so we have not had time to track down which specific query causes the crash. Hopefully the stacktraces are enough for you to work with. If not, we'll try to narrow it down further next week.
Binary Version
# vtgate --version
vtgate version Version: 21.0.1 (Git revision 3d4f41db2fbc32611c7d2ea2af3dc68b9d962415 branch 'HEAD') built on Tue Dec 3 05:39:35 UTC 2024 by runner@fv-az2029-313 using go1.23.3 linux/amd64
# Installed via the deb package
Operating System and Environment details
# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 24.04.1 LTS
Release: 24.04
Codename: noble
# uname -sr
Linux 6.8.0-1019-aws
# uname -m
x86_64
First off, thank you for the bug report! I have looked at the attached goroutine dumps, and I can confirm that this issue and #17410 have the same underlying cause. I have found the problem and am working on the fix. We will backport the fix as well, since panics are severe enough to warrant it.
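For readers unfamiliar with this class of error: the Go runtime aborts the process with `fatal error: concurrent map writes` when it detects two goroutines writing to the same map without synchronization. The sketch below is only a generic illustration of guarding a shared map with a mutex. It is not the VTGate code path or the actual fix, which this thread does not describe, and the names `queryStats`, `record`, `get`, and `recordConcurrently` are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// queryStats is a hypothetical type for illustration only; it is not
// taken from Vitess. The mutex guards every access to the counts map.
type queryStats struct {
	mu     sync.Mutex
	counts map[string]int
}

func newQueryStats() *queryStats {
	return &queryStats{counts: make(map[string]int)}
}

// record increments the counter for key while holding the lock.
// Without the lock, concurrent calls could make the runtime abort
// with "fatal error: concurrent map writes".
func (s *queryStats) record(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.counts[key]++
}

// get reads the counter for key while holding the lock.
func (s *queryStats) get(key string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.counts[key]
}

// recordConcurrently starts `workers` goroutines that each call
// record once, then waits for all of them to finish.
func recordConcurrently(s *queryStats, key string, workers int) {
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.record(key)
		}()
	}
	wg.Wait()
}

func main() {
	s := newQueryStats()
	recordConcurrently(s, "select", 100)
	fmt.Println(s.get("select")) // prints 100
}
```

Removing the `Lock`/`Unlock` calls from `record` makes the writes race. Running under `go run -race` reports such a race even on runs where the runtime's own best-effort check does not fire.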
Overview of the Issue
Potentially related to: #17410
Reproduction Steps
Log Fragments
https://ytec.nl/media/4a0c2556-81f7-4203-aecc-abd6bb82808e/2024-12-18_server_b1_crash_01_map_writes.txt
https://ytec.nl/media/4a0c2556-81f7-4203-aecc-abd6bb82808e/2024-12-18_server_c2_crash_01_map_writes.txt
https://ytec.nl/media/4a0c2556-81f7-4203-aecc-abd6bb82808e/2024-12-18_server_c2_crash_02_map_writes.txt
(The next file contains the stacktrace twice, interleaved. This is how I recovered it from the server.)
https://ytec.nl/media/4a0c2556-81f7-4203-aecc-abd6bb82808e/2024-12-18_server_c1_crash_01_map_writes.txt