segfault when degraded cluster restores #78

Open
cc32d9 opened this issue Apr 14, 2023 · 3 comments
Comments

@cc32d9

cc32d9 commented Apr 14, 2023

I will collect more data when my current test finishes, but this is what I observed with 2.16.2b:

A cluster of 4 machines running the latest 5.2 release candidate. The keyspace has replication factor 3. The writer pushes about 30k inserts per second, with the consistency level set to QUORUM.

While the client was stopped, I stopped one of the servers. Then I started the client (all 4 servers are configured as cluster contact points). The client complained a bit about failed connections but kept chugging along, since we still had enough replicas for the quorum.

Then I started the server that had been stopped, and the client segfaulted immediately, as soon as the server started accepting connections.
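
For anyone checking the "enough replicas for the quorum" part, here is a minimal sketch of the arithmetic. It assumes the usual definition of QUORUM as floor(RF / 2) + 1 replicas and takes the worst case, where every stopped node held a replica of the partition being written. The function names are made up for illustration and are not part of the driver API.

```cpp
#include <cassert>

// Replicas that must acknowledge a QUORUM request for a given
// replication factor: floor(RF / 2) + 1.
inline int quorum(int replication_factor) {
    assert(replication_factor > 0);
    return replication_factor / 2 + 1;
}

// Worst-case number of live replicas for any partition when
// `nodes_down` nodes are offline (assumes every down node held a replica).
inline int min_live_replicas(int replication_factor, int nodes_down) {
    assert(nodes_down >= 0);
    if (nodes_down >= replication_factor) {
        return 0;
    }
    return replication_factor - nodes_down;
}

// True if QUORUM requests can still succeed for every partition.
inline bool quorum_reachable(int replication_factor, int nodes_down) {
    return min_live_replicas(replication_factor, nodes_down)
           >= quorum(replication_factor);
}
```

With RF 3 and one server stopped, `quorum(3)` is 2 and at least 2 replicas of every partition remain, so `quorum_reachable(3, 1)` holds. That matches the client continuing to write while the cluster was degraded.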

@cc32d9
Author

cc32d9 commented Apr 14, 2023

The writer in question: https://github.com/EOSChronicleProject/chronos/blob/main/writer/exp_chronos_plugin.cpp

    cass_cluster_set_local_port_range(cluster, 49152, 65535);
    cass_cluster_set_core_connections_per_host(cluster, scylla_conn_per_host);
    cass_cluster_set_request_timeout(cluster, 100000);
    cass_cluster_set_num_threads_io(cluster, scylla_io_threads);
    cass_cluster_set_queue_size_io(cluster, 1048576);

scylla_conn_per_host is set to 1, and scylla_io_threads is set to 4.

@cc32d9
Author

cc32d9 commented Apr 14, 2023

[48556796.397656] chronos-writer[1981192]: segfault at 8 ip 00007f60800c5af8 sp 00007f5c777fa9d0 error 4 in libscylla-cpp-driver.so.2.16.2-b[7f607ffd2000+293000]
[48556796.397677] Code: 01 00 00 4c 8d 2c d0 0f 85 4e 02 00 00 49 8b 6d 08 4d 8b 75 00 49 39 ee 0f 84 ac 00 00 00 4d 89 f4 49 83 c6 08 4c 39 f5 74 48 <49> 8b 3e e8 00 54 08 00 84 c0 75 eb 49 8b 3c 24 e8 f3 53 08 00 84
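
A rough sketch of how the kernel line above could be turned into something a symbolizer can use. It assumes the value in brackets is the start address and size of the mapping that contains the faulting instruction, and that this mapping begins at the library's load base; if it is a later segment, the offset would need adjusting. The error field is decoded using the x86 page-fault error-code bits. The struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Fields copied from a kernel "segfault at ... ip ... in lib[base+size]" line.
struct SegfaultRecord {
    std::uint64_t ip;        // faulting instruction pointer
    std::uint64_t map_base;  // start of the mapping shown in brackets
    std::uint64_t map_size;  // size of that mapping
};

// Offset of the faulting instruction relative to the start of the mapping.
inline std::uint64_t module_offset(const SegfaultRecord& r) {
    assert(r.ip >= r.map_base && r.ip - r.map_base < r.map_size);
    return r.ip - r.map_base;
}

// x86 page-fault error code, low three bits.
struct PageFaultError {
    bool protection_violation;  // bit 0: 0 = page not present, 1 = protection fault
    bool write;                 // bit 1: 0 = read access, 1 = write access
    bool user_mode;             // bit 2: 1 = fault happened in user mode
};

inline PageFaultError decode_error(unsigned code) {
    PageFaultError e;
    e.protection_violation = (code & 0x1u) != 0;
    e.write = (code & 0x2u) != 0;
    e.user_mode = (code & 0x4u) != 0;
    return e;
}
```

For the log above, `module_offset({0x00007f60800c5af8, 0x7f607ffd2000, 0x293000})` gives 0xf3af8, which could be passed to a tool such as addr2line together with a build of the library that has debug symbols. Error 4 decodes to a user-mode read of a page that is not present, and a fault address of 8 would be consistent with reading a field at offset 8 through a null pointer, though that is only a guess without a backtrace.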

cc32d9 changed the title from "segfault when partially alive cluster restores" to "segfault when degraded cluster restores" on Apr 26, 2023
@cc32d9
Author

cc32d9 commented Apr 26, 2023

@jul-stas ^^
