Binary Version
# vtgate --version
vtgate version Version: 21.0.1 (Git revision 3d4f41db2fbc32611c7d2ea2af3dc68b9d962415 branch 'HEAD') built on Tue Dec 3 05:39:35 UTC 2024 by runner@fv-az2029-313 using go1.23.3 linux/amd64
# Installed via the deb package
Operating System and Environment details
# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 24.04.1 LTS
Release: 24.04
Codename: noble
# uname -sr
Linux 6.8.0-1019-aws
# uname -m
x86_64
Overview of the Issue
VTGate crashes when a specific query is sent to it, with the following message:
fatal error: concurrent map iteration and map write
We observed the crash in production every time a specific set of queries was run (see the 6 attached stack traces), and we can reproduce it consistently in our dev environment. We found a workaround: the queries crash VTGate when they run in OLAP mode, but not when they run in OLTP mode. We don't know why the workload setting matters.
Potentially related to: #17411
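For context on the error class: the Go runtime raises this fatal error when it detects one goroutine ranging over a map while another writes to it, and because it is a runtime fatal error rather than a panic, `recover` cannot catch it and the whole process exits. The sketch below is not Vitess code. `safeCounts`, `Add`, `Sum`, and `run` are hypothetical names used only to illustrate the usual remedy of guarding both the iteration and the writes with a `sync.RWMutex`. Whether the VTGate bug has this shape is an assumption we have not verified.

```go
package main

import (
	"fmt"
	"sync"
)

// safeCounts is a hypothetical type (not from Vitess): a map whose reads
// and writes are serialized through a sync.RWMutex.
type safeCounts struct {
	mu sync.RWMutex
	m  map[string]int
}

func newSafeCounts() *safeCounts {
	return &safeCounts{m: make(map[string]int)}
}

// Add takes the write lock, so it can never overlap with a Sum in progress.
func (s *safeCounts) Add(key string, n int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[key] += n
}

// Sum iterates under the read lock. Without the lock, a concurrent Add
// could make the runtime abort with
// "fatal error: concurrent map iteration and map write".
func (s *safeCounts) Sum() int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	total := 0
	for _, v := range s.m {
		total += v
	}
	return total
}

// run starts `writers` goroutines that each perform `perWriter` increments
// while one reader iterates over the map concurrently, then returns the
// final total.
func run(writers, perWriter int) int {
	s := newSafeCounts()
	var wg sync.WaitGroup
	for w := 0; w < writers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for i := 0; i < perWriter; i++ {
				s.Add(fmt.Sprintf("key-%d-%d", id, i%16), 1)
			}
		}(w)
	}
	wg.Add(1)
	go func() {
		defer wg.Done()
		for i := 0; i < perWriter; i++ {
			_ = s.Sum() // concurrent iteration, safe because of RLock
		}
	}()
	wg.Wait()
	return s.Sum()
}

func main() {
	fmt.Println(run(4, 1000)) // 4 writers x 1000 increments each: prints 4000
}
```

Removing the `RLock`/`RUnlock` pair from `Sum` turns this into the unsafe pattern. Running such a variant with `go run -race` typically reports the data race even on runs where the runtime's own best-effort check does not fire.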
Reproduction Steps
We moved our platform to Vitess last night and were happy to have the workaround to get everything online and stable, so we have not yet had time to track down which specific query causes the crash. Hopefully the stack traces are enough for you to work with. If not, we'll try to narrow it down further next week.
Log Fragments
https://ytec.nl/media/4a0c2556-81f7-4203-aecc-abd6bb82808e/2024-12-18_server_b1_crash_01_map_iteration.txt
https://ytec.nl/media/4a0c2556-81f7-4203-aecc-abd6bb82808e/2024-12-18_server_b1_crash_02_map_iteration.txt
https://ytec.nl/media/4a0c2556-81f7-4203-aecc-abd6bb82808e/2024-12-18_server_b1_crash_03_map_iteration.txt
https://ytec.nl/media/4a0c2556-81f7-4203-aecc-abd6bb82808e/2024-12-18_server_b2_crash_01_map_iteration.txt
https://ytec.nl/media/4a0c2556-81f7-4203-aecc-abd6bb82808e/2024-12-18_server_c1_crash_01_map_iteration.txt
https://ytec.nl/media/4a0c2556-81f7-4203-aecc-abd6bb82808e/2024-12-18_server_c1_crash_02_map_iteration.txt