Skip to content

Commit

Permalink
Connection pools now always use utf8mb4
Browse files Browse the repository at this point in the history
Fixes error-prone behavior for connection pools, where connections would use the server's default character set.

close #187
  • Loading branch information
anarthal committed Mar 4, 2024
1 parent 21bca1f commit f98427c
Show file tree
Hide file tree
Showing 18 changed files with 433 additions and 166 deletions.
39 changes: 17 additions & 22 deletions doc/qbk/21_connection_pool.qbk
Original file line number Diff line number Diff line change
Expand Up @@ -91,9 +91,12 @@ create temporary tables, start transactions, or set session variables. When usin
pooled connections, session state can be problematic: if not reset properly,
state from a previous operation may affect subsequent ones.

After you return a connection to the pool, it uses [refmem any_connection async_reset_connection]
to wipe session state before the connection can be obtained again. This will deallocate
prepared statements, rollback uncommited transactions and clear variables.
After you return a connection to the pool, the equivalent of
[refmem any_connection async_reset_connection] and [refmemunq any_connection async_set_character_set]
are used to wipe session state before
the connection can be obtained again. This will deallocate
prepared statements, rollback uncommitted transactions, clear variables and restore the connection's
character set to `utf8mb4`.
In particular, you don't need to call [refmem any_connection async_close_statement]
to deallocate statements.

Expand All @@ -109,28 +112,20 @@ returned, so it does not affect latency. If you're not sure if an operation
affects state or not, assume it does.


[heading Character sets]
[heading Character set]

[warning
The current implementation, together with unintuitive MySQL defaults,
can yield to surprising behavior.
]

When using pooled connections, [*all connections use the server's default character set].
This is because [refmem any_connection async_reset_connection] discards the character set
options specified during connection establishment.

You can obtain this character set by running:
Pooled connections always use `utf8mb4` as its character set. When connections
are reset, the equivalent of [refmem any_connection async_set_character_set] is used
to restore the character set to `utf8mb4` (recall that raw [refmemunq any_connection async_reset_connection]
will wipe character set data).

[!teletype]
```
"SELECT @@global.character_set_client, @@global.character_set_results;"
```
Pooled connections always know the character set they're using.
This means that [refmem any_connection format_opts] and [refmemunq any_connection current_character_set]
always succeed.

MySQL v8.0+ defaults to `utf8mb4`, but older MySQL and MariaDB servers default to
`latin1`, which is not usually what you want. Issue a __SET_NAMES__ statement
after you get a connection to change it. [@https://github.com/boostorg/mysql/issues/187 This issue]
tracks solving this limitation.
We recommend to always stick to `utf8mb4`. If you really need to use any other character set,
use [refmemunq any_connection async_set_character_set] on your connection after it's been retrieved
from the pool.


[heading Connection lifecycle]
Expand Down
4 changes: 4 additions & 0 deletions include/boost/mysql/any_connection.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -117,6 +117,10 @@ class any_connection
{
detail::connection_impl impl_;

#ifndef BOOST_MYSQL_DOXYGEN
friend struct detail::access;
#endif

BOOST_MYSQL_DECL
static std::unique_ptr<detail::any_stream> create_stream(
asio::any_io_executor ex,
Expand Down
1 change: 1 addition & 0 deletions include/boost/mysql/detail/algo_params.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -114,6 +114,7 @@ struct ping_algo_params
struct reset_connection_algo_params
{
diagnostics* diag;
character_set charset; // set a non-empty character set to pipeline a SET NAMES with the reset request

using result_type = void;
};
Expand Down
22 changes: 20 additions & 2 deletions include/boost/mysql/detail/connection_impl.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -468,9 +468,27 @@ class connection_impl
ping_algo_params make_params_ping(diagnostics& diag) const noexcept { return {&diag}; }

// Reset connection
reset_connection_algo_params make_params_reset_connection(diagnostics& diag) const noexcept
reset_connection_algo_params make_params_reset_connection(
diagnostics& diag,
const character_set& charset = {}
) const noexcept
{
return {&diag};
return {&diag, charset};
}

// Reset connection and issue a SET NAMES, using a pipeline.
// Used internally by connection_pool.
template <BOOST_ASIO_COMPLETION_TOKEN_FOR(void(::boost::mysql::error_code)) CompletionToken>
auto async_reset_with_charset(const character_set& charset, CompletionToken&& token)
BOOST_MYSQL_RETURN_TYPE(decltype(std::declval<connection_impl&>().async_run(
std::declval<reset_connection_algo_params>(),
std::forward<CompletionToken>(token)
)))
{
return async_run(
make_params_reset_connection(shared_diag(), charset),
std::forward<CompletionToken>(token)
);
}

// Quit connection
Expand Down
3 changes: 1 addition & 2 deletions include/boost/mysql/detail/connection_pool_fwd.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -8,8 +8,6 @@
#ifndef BOOST_MYSQL_DETAIL_CONNECTION_POOL_FWD_HPP
#define BOOST_MYSQL_DETAIL_CONNECTION_POOL_FWD_HPP

#include <boost/mysql/any_connection.hpp>

#include <boost/mysql/detail/config.hpp>

#include <memory>
Expand All @@ -18,6 +16,7 @@ namespace boost {
namespace mysql {

class pooled_connection;
class any_connection;

namespace detail {

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@
#define BOOST_MYSQL_IMPL_INTERNAL_CONNECTION_POOL_CONNECTION_NODE_HPP

#include <boost/mysql/any_connection.hpp>
#include <boost/mysql/character_set.hpp>
#include <boost/mysql/client_errc.hpp>
#include <boost/mysql/diagnostics.hpp>
#include <boost/mysql/error_code.hpp>
Expand Down Expand Up @@ -146,7 +147,7 @@ class basic_connection_node : public intrusive::list_base_hook<>,
break;
case next_connection_action::reset:
run_with_timeout(
node_.conn_.async_reset_connection(asio::deferred),
access::get_impl(node_.conn_).async_reset_with_charset(utf8mb4_charset, asio::deferred),
node_.timer_,
node_.params_->ping_timeout,
std::move(self)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -80,7 +80,6 @@ inline internal_pool_params make_internal_pool_params(pool_params&& params)
connect_prms.username = std::move(params.username);
connect_prms.password = std::move(params.password);
connect_prms.database = std::move(params.database);
connect_prms.connection_collation = 0; // use the server's default collation
connect_prms.ssl = params.ssl;
connect_prms.multi_queries = params.multi_queries;

Expand Down
2 changes: 1 addition & 1 deletion include/boost/mysql/impl/internal/protocol/protocol.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -100,7 +100,7 @@ struct reset_connection_command
};

// Deserializes a response that may be an OK or an error packet.
// Applicable for ping and reset connection.
// Applicable for commands like ping and reset connection.
// If the response is an OK packet, sets backslash_escapes according to the
// OK packet's server status flags
BOOST_ATTRIBUTE_NODISCARD BOOST_MYSQL_DECL error_code deserialize_ok_response(
Expand Down
9 changes: 2 additions & 7 deletions include/boost/mysql/impl/internal/sansio/close_statement.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -58,18 +58,13 @@ class close_statement_algo : public sansio_algorithm, asio::coroutine
ping_command{},
ping_seqnum_
);
BOOST_ASIO_CORO_YIELD return next_action::write(next_action::write_args_t{{}, false});
BOOST_ASIO_CORO_YIELD return next_action::write({});

// Read ping response
BOOST_ASIO_CORO_YIELD return read(ping_seqnum_);

// Process the OK packet
return deserialize_ok_response(
st_->reader.message(),
st_->flavor,
*diag_,
st_->backslash_escapes
);
return st_->deserialize_ok(*diag_);
}

return next_action();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@

#include <boost/mysql/impl/internal/protocol/capabilities.hpp>
#include <boost/mysql/impl/internal/protocol/db_flavor.hpp>
#include <boost/mysql/impl/internal/protocol/protocol.hpp>
#include <boost/mysql/impl/internal/sansio/message_reader.hpp>
#include <boost/mysql/impl/internal/sansio/message_writer.hpp>

Expand Down Expand Up @@ -93,6 +94,12 @@ struct connection_state_data
backslash_escapes = true;
current_charset = character_set{};
}

// Reads an OK packet from the reader. This operation is repeated in several places.
error_code deserialize_ok(diagnostics& diag) noexcept
{
return deserialize_ok_response(reader.message(), flavor, diag, backslash_escapes);
}
};

} // namespace detail
Expand Down
7 changes: 1 addition & 6 deletions include/boost/mysql/impl/internal/sansio/ping.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -52,12 +52,7 @@ class ping_algo : public sansio_algorithm, asio::coroutine
BOOST_ASIO_CORO_YIELD return read(seqnum_);

// Process the OK packet
return deserialize_ok_response(
st_->reader.message(),
st_->flavor,
*diag_,
st_->backslash_escapes
);
return st_->deserialize_ok(*diag_);
}

return next_action();
Expand Down
82 changes: 69 additions & 13 deletions include/boost/mysql/impl/internal/sansio/reset_connection.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -14,8 +14,10 @@

#include <boost/mysql/detail/algo_params.hpp>

#include <boost/mysql/impl/internal/protocol/protocol.hpp>
#include <boost/mysql/impl/internal/sansio/connection_state_data.hpp>
#include <boost/mysql/impl/internal/sansio/sansio_algorithm.hpp>
#include <boost/mysql/impl/internal/sansio/set_character_set.hpp>

#include <boost/asio/coroutine.hpp>

Expand All @@ -26,11 +28,44 @@ namespace detail {
class reset_connection_algo : public sansio_algorithm, asio::coroutine
{
diagnostics* diag_;
std::uint8_t seqnum_{0};
character_set charset_;
std::uint8_t reset_seqnum_{0};
std::uint8_t set_names_seqnum_{0};
error_code stored_ec_;

// true if we need to pipeline a SET NAMES with the reset request
bool has_charset() const noexcept { return !charset_.name.empty(); }

next_action compose_request()
{
if (has_charset())
{
// Compose the SET NAMES statement
auto query = compose_set_names(charset_);
if (query.has_error())
return query.error();

// Compose the pipeline
st_->writer.prepare_pipelined_write(
reset_connection_command{},
reset_seqnum_,
query_command{query.value()},
set_names_seqnum_
);

// Success
return next_action::write({});
}
else
{
// Just compose the reset connection request
return write(reset_connection_command{}, reset_seqnum_);
}
}

public:
reset_connection_algo(connection_state_data& st, reset_connection_algo_params params) noexcept
: sansio_algorithm(st), diag_(params.diag)
: sansio_algorithm(st), diag_(params.diag), charset_(params.charset)
{
}

Expand All @@ -45,20 +80,41 @@ class reset_connection_algo : public sansio_algorithm, asio::coroutine
diag_->clear();

// Send the request
BOOST_ASIO_CORO_YIELD return write(reset_connection_command(), seqnum_);
BOOST_ASIO_CORO_YIELD return compose_request();

// Read the response
BOOST_ASIO_CORO_YIELD return read(seqnum_);
// Read the reset response
BOOST_ASIO_CORO_YIELD return read(reset_seqnum_);

// Verify it's what we expected
ec = deserialize_ok_response(st_->reader.message(), st_->flavor, *diag_, st_->backslash_escapes);
if (ec)
return ec;

// Reset was successful. Resetting changes the connection's character set
// to the server's default, which is an unknown value that doesn't have to match
// what was specified in handshake. As a safety measure, clear the current charset
st_->current_charset = character_set{};
stored_ec_ = st_->deserialize_ok(*diag_);

if (!stored_ec_)
{
// Reset was successful. Resetting changes the connection's character set
// to the server's default, which is an unknown value that doesn't have to match
// what was specified in handshake. As a safety measure, clear the current charset
st_->current_charset = character_set{};
}

if (has_charset())
{
// We issued a SET NAMES too, read its response
BOOST_ASIO_CORO_YIELD return read(set_names_seqnum_);

// Verify it's what we expected. Don't overwrite diagnostics if reset failed
ec = st_->deserialize_ok(stored_ec_ ? st_->shared_diag : *diag_);
if (!ec)
{
// Set the character set to the new known value
st_->current_charset = charset_;
}

// Set the return value if there is no error code already stored
if (!stored_ec_)
stored_ec_ = ec;
}

return stored_ec_;
}

return next_action();
Expand Down
39 changes: 20 additions & 19 deletions include/boost/mysql/impl/internal/sansio/set_character_set.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -29,25 +29,31 @@ namespace boost {
namespace mysql {
namespace detail {

// Securely compose a SET NAMES statement. This function
// is also used by the pipelined version of reset_connection
inline system::result<std::string> compose_set_names(const character_set& charset)
{
// The character set should have a non-empty name
BOOST_ASSERT(!charset.name.empty());

// For security, if the character set has non-ascii characters in it name, reject it.
format_context ctx(format_options{ascii_charset, true});
ctx.append_raw("SET NAMES ").append_value(charset.name);
return std::move(ctx).get();
}

class set_character_set_algo : public sansio_algorithm, asio::coroutine
{
diagnostics* diag_;
character_set charset_;
std::uint8_t seqnum_{0};

static error_code compose_query(const character_set& charset, std::string& output)
next_action compose_request()
{
// The character set should have a non-empty name
BOOST_ASSERT(!charset.name.empty());

// For security, if the character set has non-ascii characters in it name, reject it.
format_context ctx(format_options{ascii_charset, true});
ctx.append_raw("SET NAMES ").append_value(charset.name);
auto res = std::move(ctx).get();
if (res.has_error())
return res.error();
output = std::move(res).value();
return error_code();
auto q = compose_set_names(charset_);
if (q.has_error())
return q.error();
return write(query_command{q.value()}, seqnum_);
}

public:
Expand All @@ -61,26 +67,21 @@ class set_character_set_algo : public sansio_algorithm, asio::coroutine
if (ec)
return ec;

std::string query;

// SET NAMES never returns rows. Using execute requires us to allocate
// a results object, which we can avoid by simply sending the query and reading the OK response.
BOOST_ASIO_CORO_REENTER(*this)
{
// Setup
diag_->clear();
ec = compose_query(charset_, query);
if (ec)
return ec;

// Send the execution request
BOOST_ASIO_CORO_YIELD return write(query_command{query}, seqnum_);
BOOST_ASIO_CORO_YIELD return compose_request();

// Read the response
BOOST_ASIO_CORO_YIELD return read(seqnum_);

// Verify it's what we expected
ec = deserialize_ok_response(st_->reader.message(), st_->flavor, *diag_, st_->backslash_escapes);
ec = st_->deserialize_ok(*diag_);
if (ec)
return ec;

Expand Down
Loading

0 comments on commit f98427c

Please sign in to comment.