A native PostgresSQL driver / client for Zig. Supports LISTEN.
See or run example/main.zig for a number of examples.
This library does not support TLS / sslmode - Zig's TLS support is currently incomplete/buggy.
- Add pg.zig as a dependency in your
build.zig.zon
:
zig fetch --save git+https://github.com/karlseguin/pg.zig#master
- In your
build.zig
, add thepg
module as a dependency you your program:
const pg = b.dependency("pg", .{
.target = target,
.optimize = optimize,
});
// the executable from your call to b.addExecutable(...)
exe.root_module.addImport("pg", pg.module("pg"));
var pool = try pg.Pool.init(allocator, .{
.size = 5,
.connect = .{
.port = 5432,
.host = "127.0.0.1",
},
.auth = .{
.username = "postgres",
.database = "postgres",
.password = "root_pw",
.timeout = 10_000,
}
});
defer pool.deinit();
var result = try pool.query("select id, name from users where power > $1", .{9000});
defer result.deinit();
while (try result.next()) |row| {
const id = row.get(i32, 0);
// this is only valid until the next call to next(), deinit() or drain()
const name = row.get([]u8, 1);
}
The pool keeps a configured number of database connection open. The acquire()
method is used to retrieve a connection from the pool. The pool may start one background thread to attempt to reconnect disconnected connections (or connections which are in an invalid state).
Initializes a connection pool. Pool options are:
size
- Number of connections to maintain. Defaults to10
auth
: - See Conn.authconnect
: - See the Conn.opentimeout
- The amount of time, in milliseconds, to wait for a connection to be available whenacquire()
is called.
Initializes a connection pool using a std.Uri.
const uri = try std.Uri.parse("postgresql://username:password@localhost:5432/database_name");
const pool = try pg.Pool.initUri(allocator, uri, 5, 10_000);
defer pool.deinit();
Returns a *Conn for the connection pool. Returns an error.Timeout
if the connection cannot be acquired (i.e. if the pool remains empty) for the timeout
configuration passed to init
.
const conn = try pool.acquire();
defer pool.release(conn);
_ = try conn.exec("...", .{...});
Releases the conection back into the pool. Calling pool.release(conn)
is the same as calling conn.release()
.
Returns a new Listener. This function creates a new connection, it does not use/acquire a connection from the pool. It is a convenience function for cases which have already setup a pool (with the connection and authentication configuration) and want to create a listening connection using those settings.
For single-query operations, the pool offers wrappers around the connection's exec
, query
, queryOpts
, row
and rowOpts
methods. These are convenience methods.
pool.exec
acquires, executes and releases the connection.
pool.query
and pool.queryOpts
acquire and execute the query. The connection is automatically returned to the pool when result.deinit()
is called. Note that this is a special behavior of pool.query
. When the result comes explicitly from a conn.query
, result.deinit()
does not automatically release the connection back into the pool.
pool.row
and pool.rowOpts
acquire and execute the query. The connection is automatically returned to the pool when row.deinit()
is called. Note that this is a special behavior of pool.row
. When the result comes explicitly from a conn.row
, row.deinit()
does not automatically release the connection back into the pool.
Opens a connection, or returns an error. Prefer creating connections through the pool. Connection options are:
host
- Defaults to"127.0.0.1"
port
- Defaults to5432
write_buffer
- Size of the write buffer, used when sending messages to the server. Will temporarily allocate more space as needed. If you're writing large SQL or have large parameters (e.g. long text values), making this larger might improve performance a little. Defaults to2048
, cannot be less than128
.read_buffer
- Size of the read buffer, used when reading data from the server. Will temporarily allocate more space as needed. Given most apps are going to be reading rows of data, this can have large impact on performance. Defaults to4096
.result_state_size
- EachResult
(retrieved via a call toquery
) carries metadata about the data (e.g. the type of each column). For results with less than or equal toresult_state_size
columns, a staticstate
container is used. Queries with more columns require a dynamic allocation. Defaults to32
.
Closes the connection and releases its resources. This method should not be used when the connection comes from the pool.
Authentications the request. Prefer creating connections through the pool. Auth options are:
username
: Defaults to"postgres"
password
: Defaults tonull
database
: Defaults tonull
timeout
: Defaults to10_000
(milliseconds)application_name
: Defaults tonull
params
: Defaults tonull
. Anstd.StringHashMap([]const u8)
Releases the connection back to the pool. The pool might decide to close the connection and open a new one.
Executes the query with arguments, returns the number of rows affected, or null. Should not be used with a query that returns rows.
Executes the query with arguments, returns Result. deinit
, and possibly drain
, must be called on the returned result
.
Same as query
but takes options:
timeout: ?u32
- This is not reliable and should probably not be used. Currently it simply puts a recv socket timeout. On timeout, the connection will likely no longer be valid (which the pool will detect and handle when the connection is released) and the underlying query will likely still execute. Defaults tonull
column_names: bool
- Whether or not theresult.column_names
should be populated. When true, this requires memory allocation (duping the column names). Defaults tofalse
allocator
- The allocator to use for any allocations needed when executing the query and reading the results. Whennull
this will default to the connection's allocator. If you were executing a query in a web-request and each web-request had its own arena tied to the lifetime of the request, it might make sense to use that arena. Defaults tonull
.release_conn: bool
- Whether or not to callconn.release()
whenresult.deinit()
is called. Useful for writing a function that acquires a connection from aPool
and returns aResult
. Whenquery
orrow
are called from aPool
this is forced totrue
. Otherwise, defaults tofalse
.
Executes the query with arguments, returns a single row. Returns an error if the query returns more than one row. Returns null
if the query returns no row. deinit
must be called on the returned QueryRow
.
Same as row
but takes the same options as queryOpts
Creates a Stmt. It is generally better to use query
, row
or exec
,
Same as prepare
but takes the same options as queryOpts
Calls _ = try execOpts("begin", .{}, .{})
Calls _ = try execOpts("commit", .{}, .{})
Calls _ = try execOpts("rollback", .{}, .{})
The conn.query
and conn.queryOpts
methods return a pg.Result
which is used to read rows and values.
number_of_columns: usize
- Number of columns in the resultcolumn_names: [][]const u8
- Names of the column, empty unless the query was executed with thecolumn_names = true
option.
Releases resources associated with the result.
If you do not iterate through the result until next
returns null
, you must call drain
.
Why can't deinit
handle this? If deinit
also drained, you'd have to handle a possible error in deinit
and you can't try
in a defer. Thus, this is done to provide better ergonomics for the normal case - the normal case being where next
is called until it returns null
. In these cases, just defer result.deinit()
.
Iterates to the next row of the result, or returns null if there are no more rows.
Returns the index of the column with the given name. This is only valid when the query is executed with the column_names = true
option.
Returns a Mapper which can be used to create a T for each row. Mapping from column to field is done by name. This is an optimized version of row.to when iterating through multiple rows with the {.map = .name}
.
See row.to and Mapper for more information.
The row
represents a single row from a result. Any non-primitive value that you get from the row
are valid only until the next call to next
, deinit
or drain
.
Only advance usage will need access to the row fields:
oids: []i32
- The PG OID value for each column in the row. Seeresult.number_of_columns
for the length of this slice. Might be useful if you're trying to read a non-natively supported type.values: []Value
- The underlying byte value for each column in the row. Seeresult.number_of_columns
for the length of this slice. Might be useful if you're trying to read a non-natively supported type. Has two fields,is_null: bool
anddata: []const u8
.
Gets a value from the row at the specified column index (0-based). Type mapping is strict. For example, you cannot use i32
to read an smallint
column.
For any supported type, you can use an optional instead. Therefore, if you use row.get(i16, 0)
the return type is i16
. If you use row.get(?i16, 0)
the return type is ?i16
. If you use a non-optional type for a null value, you'll get a failed assertion in Debug
and ReleaseSafe
, and undefined behavior in ReleaseFast
, ReleaseSmall
or if you set pg_assert = false
.
u8
-char
i16
-smallint
i32
-int
i64
- Depends on the underlying column type. Atimestamp(tz)
will be converted to microseconds since unix epoch. Otherwise, abigint
.f32
-float4
f64
- Depends on the underlying column type. Anumeric
will be converted to anf64
. Otherwise, afloat
.bool
-bool
[]const u8
- Returns the raw underlying data. Can be used for any column type to get the PG-encoded value. Fortext
andbytea
columns, this will be the expected value. Fornumeric
, this will be a text representation of the number. ForUUID
this will be a 16-byte slice (usepg.uuidToHex [36]u8
if you want a hex-encoded UUID). ForJSON
andJSONB
this will be the serialized JSON value.[]u8
- Same as []const u8 but returns a mutable value.pg.Numeric
- See numeric sectionpg.Cidr
- See CIDR/INET section
Same as get
but uses the column name rather than its position. Only valid when the column_names = true
option is passed to queryOpts
.
This relies on calling result.columnIndex
which iterates through result.column_names
fields. In some cases, this is more efficient than StringHashMap
lookup, in others, it is worse. For performance-sensitive code, prefer using get
, or cache the column index in a local variables outside of the next()
loop:
const id_idx = result.columnIndex("id").?
while (try result.next()) |row| {
// row.get(i32, id_idx)
}
Use row.get(pg.Iterator(i32))
to return an Iterator over an array column. Supported array types are:
u8
-char[]
i16
-smallint[]
i32
-int[]
i64
-bigint[]
ortimestamp(tz)[]
(seeget
)f32
-float4
f64
-float8
bool
-bool[]
[]const u8
- More strict thanget([]u8)
). Supports:text[]
,char(n)[]
,bytea[]
,uuid[]
,json[]
andjsonb[]
[]u8
- Same as[]const u8
but returns mutable value.pg.Numeric
- See numeric sectionpg.Cidr
- See CIDR/INET section
Gets a Record by column position.
Gets an Record by column name. See getCol for performance notes.
Populates and returns a T
.
opts
values are:
dupe
- Duplicate string columns using the internal arena. When set totrue
non-scalar values are valid untildeinit
is called on therow
/result
. Defaults tofalse
allocator
- Allocator to use to duplicate non-scalar values (i.e. strings). It is the caller's responsible to free any non-scalar values from their structure. Defaults tonull
.map
-.ordinal
or.name
, defaults to.ordinal
Setting allocator
implies dupe
, but uses the specified allocator rather than the internal arena. By default (when dupe
is false
and allocator
is null
), non-scalar values (i.e. strings) are only valid until the next call to next()
or drain()
or deinit()
.
When .map = .ordinal
, the default, the order of the field names must match the order of the columns.
When .map = .name
, the query must be executed with the {.column_names = true}
option. Columns with no field equivalent are ignored. Fields with no column equivalent are set to their default value; if they do not have a default value the function will return error.FieldColumnMismatch
. If you're going to use this in a loop with a result
, consider using a Mapper to avoid the name->index lookup on each iteration.
Slice fields can either be mapped to a pg.Iterator(T)
or a slice. When mapped to a slice
, an allocator MUST be provided. When mapping to an array of strings (i.e. [][]const u8), the values are duped, and thus both the values and the slice itself must be freed. When mapping to a slice of primitives (i.e. []i32) the slice must be freed. When mapping to an pg.Iterator(T)
with a custom allocator (.{.allocator = allocator}
), the iterator must be freed by calling iteartor.deinit(allocator)
. Whether you're mapping to an pg.Iterator(T)
or a slice, I Strongly suggest you use an ArenaAllocator.
A QueryRow
is returned from a call to conn.row
or conn.rowOpts
and wraps both a Result
and a Row.
It exposes the same methods as Row
as well as deinit
, which must be called once the QueryRow
is no longer needed. This is a rare case where deinit()
can fail. In most cases, you can simply throw away the error (because failure is extremely rare and, if the connection came from a pool, it should repair itself).
The iterator returned from row.get(pg.Iterator(T), col)
can be iterated using the next() ?T
call:
var names = row.get(pg.Iterator([]const u8), 0);
while (names.next()) |name| {
...
}
len
- the number of values in the iterator
Allocates a slice and populates it with all values.
If the slice is a []u8
or []const u8
, the string is also duplicated. It is the responsibility of the caller to free the string values AND the slice.
Fill into
with values of the iterator. into
can be smaller than it.len
, in which case only into.len
values will be filled. This can be a bit faster than calling next()
multiple times. Values are not duplicated; they are only valid until the next iterations.
Returned by row.record(col)
for fetching a PostgreSQL record-type, for example from this query:
select row('over', 9000)
In many cases, PostgreSQL will mark the inner-types as "unknown", which is likely to cause assertion failures in this library. The solution is to type each value:
select row('over'::text, 9000::int)
number_of_columns
- the number of columns in the record
Gets the next column in the record. This behaves similarly row.get with the same supported types for T
, including nullables.
A mapper is used to iterate through a result and turn a row into an instance of T
. When converting a single row, or using ordinal mapping, prefer using row.to. The mapper is an optimization over row.to
with the {.map = .name}
option which only has to do the name -> index lookup once.
To use a mapper, the {.column_names = true}
option must be passed to the query/row function.
const User = struct {
id: i32,
name: []const u8,
};
///...
var result = try conn.queryOpts("select id, name from users", .{}, .{.column_names = true});
defer result.deinit();
var mapper = result.mapper(User, .{});
while (try mapper.next()) |user| {
// use: user.id and user.name
}
A column with no matching field is ignored. A field with no matching column is set to its default fault. If no default value is defined, mapper.next()
will return error.FieldColumnMismatch
.
The 2nd argument to result.mapper
is an option:
dupe
- Duplicate string columns using the internal arena. When set totrue
non-scalar values are valid untildeinit
is called on therow
/result
. Defaults tofalse
allocator
- Allocator to use to duplicate non-scalar values (i.e. strings). It is the caller's responsible to free any non-scalar values from their structure. Defaults tonull
.
Setting allocator
implies dupe
, but uses the specified allocator rather than the internal arena. By default (when dupe
is false
and allocator
is null
), non-scalar values (i.e. strings) are only valid until the next call to next()
or drain()
or deinit()
.
For most queries, you should use the conn.query(...)
, conn.row(...)
or conn.exec(...)
methods. For queries with parameters, these methods look like:
var stmt = try Stmt.init(conn, opts)
errdefer stmt.deinit();
try stmt.prepare(sql);
inline for (parameters) |param| {
try stmt.bind(param);
}
return stmt.execute();
You can create a statement directly using conn.prepare(sql)
or conn.prepareOpts(sql, ConnQueryOpts{...})
and call stmt.bind(value: anytype)
and execute()
directly.
The main reason to do this is to have more flexibility in binding parameters (e.g. such as when creating dynanmic SQL where all the parameters aren't fixed at compile-time).
Note that stmt.deinit()
should only be called if stmt.execute()
is not called or returns an error. Once stmt.execute()
returns a Result, stmt
should be considered invalid. As we can see in the above example, stmt.deinit()
is only called on errdefer
.
When you read a value, such as row.get(i32, 0)
, the library assumes you know what you're doing and that column 0 really is a non-null 32-bit integer. row.get
doesn't return an error union. There are some assertions, but these are disabled in ReleaseFast and ReleaseSmall. You can also disable these assertions in Debug/ReleaseSafe by placing pub const pg_assert = false;
in your root, (e.g. main.zig
):
const std = @import("std");
...
pub const pg_assert = false;
pub fm main() !void {
...
}
Conversely, when binding a value to an SQL parameter, the library is a little more generous. For example, an u64
will bind to an i32
provided the value is within range.
This is particularly relevant for types which are expressed as []u8
. For example a UUID can be a raw binary [16]u8
or a hex-encoded [36]u8
. Where possible (e.g. UUID, MacAddr, MacAddr8), the library will support binding either the raw binary data or text-representation. When reading, the raw binary value is always returned.
Strongly consider using pg.Pool
rather than using pg.Conn
directly. The pool will attempt to reconnect disconnected connections or connections which are in an invalid state. Until more real world testing is done, you should assume that connections will get into invalid states.
Zig errorsets do not support arbitrary payloads. This is problematic in a database driver where most applications probably care about the details of an error. The library takes a simple approach. If error.PG
is returned, conn.err
should be set and will contains a PG error object:
_ = conn.exec("drop table x", .{}) catch |err| {
if (err == error.PG) {
if (conn.err) |pge| {
std.log.err("PG {s}\n", .{pge.message});
}
}
return err;
};
In the above snippet, it's possible to skip the if (err == error.PG)
check, but in that case conn.err
could be set from some previous command (conn.err
is always reset when acquired from the pool).
If error.PG
is returned from a non-connection object, like a query result, the associated connection will have its conn.err
set. In other words, conn.err
is the only thing you ever have to check.
A PG error always exposes the following fields:
code: []const u8
- https://www.postgresql.org/docs/current/errcodes-appendix.htmlmessage: []const u8
severity: []const u8
And optionally (depending on the error and the version of the server):
column: ?[]const u8 = null
constraint: ?[]const u8 = null
data_type_name: ?[]const u8 = null
detail: ?[]const u8 = null
file: ?[]const u8 = null
hint: ?[]const u8 = null
internal_position: ?[]const u8 = null
internal_query: ?[]const u8 = null
line: ?[]const u8 = null
position: ?[]const u8 = null
routine: ?[]const u8 = null
schema: ?[]const u8 = null
severity2: ?[]const u8 = null
table: ?[]const u8 = null
where: ?[]const u8 = null
The isUnique() bool
method can be called on the error to determine whether or not the error was a unique violation (i.e. error code 23505
).
All implementations have to deal with things like: how to support unsigned integers, given that PostgreSQL only has signed integers. Or, how to support UUIDs when the language has no UUID type. This section documents the exact behavior.
Multi-dimensional arrays aren't supported. The array lower bound is always 0 (or 1 in PG)
No surprises, arrays supported.
When reading a char[]
, it's tempting to use row.get([]u8, 0)
, but this is incorrect. A char[]
is an array, and thus row.get(pg.Iterator(u8), 0
) must be used.
When binding an integer, the library will coerce the Zig value to the parameter type, as long as it fits. Thus, a u64
can be bound to a smallint
, if the value fits, else an error will be returned.
Array binding is strict. For example, an []i16
must be used for a smallint[]
parameter. The only exception is that the unsigned variant, e.g. []u16
can be used provided all values fit.
When reading a column, you must use the correct type.
When binding, @floatCast
is used based on the SQL parameter type. Array binding is strict. When reading a value, you must use the correct type.
Until standard support comes to Zig (either in the stdlib or a de facto standard library), numeric support is half-baked. When binding a value to a parameter, you can use a f32, f64, comptime_float or string. The same applies to binding to a numeric array.
You can get(pg.Numeric, $COL)
to return a pg.Numeric
. The pg.Numeric
type only has 2 useful methods: toFloat
and toString
. You can also use num.estimatedStringLen
to get the max size of the string representation:
const numeric = row.get(pg.Numeric, 0);
var buf = allocator.alloc(u8, numeric.estimatedStringLen());
defer allocator.free(buf)
const str = numeric.toString(&buf);
Using row.get(f64, 0)
on a numeric is the same as row.get(pg.Numeric, 0).toFloat()
.
You should consider simply casting the numeric to ::double
or ::text
within SQL in order to rely on PostgreSQL's own robust numeric to float/text conversion.
However, pg.Numeric
has fields for the underlying wire-format of the numeric value. So if you require precision and the text representation isn't sufficient, you can parse the fields directly. types/numeric.zig
is relatively well documented and tries to explain the fields. Note that any non-primitive fields, e.g. the digits: []u8
, is only valid until the next call to result.next
, result.deinit
, result.drain
or row.deinit
.
When a []u8
is bound to a UUID column, it must either be a 16-byte slice, or a valid 36-byte hex-encoded UUID. Arrays behave the same.
When reading a uuid
column with []u8
a 16-byte slice will be returned. Use the pg.uuidToHex() ![36]u8
helper if you need it hex-encoded.
You can bind a string value to a cidr
, inet
, cidr[]
or inet[]
parameter.
When reading a value, via row.get
or row.iterator
you should use pg.Cidr
. It exposes 3 fields:
address: []u8
- Will be a 4 or 16 byte slice depending on the familyfamily: Family
- An enum, eitherFamily.v4
ofFamily.v6
netmask: u8
- The network mask
You can bind a []u8
to either a macaddr
or a macaddr8
. These can be either binary representation (6-bytes for macaddr
or 8 bytes for macaddr8
) or a text-representation supported by PostgreSQL. This works, like UUID, because there's no ambiguity in the length. The same applied for array variants - it's even possible to mix and match formats within the array.
When reading a value, via row.get
or row.iterator
using []u8
, the binary representation is always returned.
When you bind an i64
to a timestamp(tz) parameter, the value is assumed to be the number of microseconds since unix epoch (e.g. std.time.microTimestamp()
). Array binding works the same. You can also bind a string, which will pass the string as-is and depend on PostgreSQL to do the conversion. This is true for arrays as well.
When reading a timestamp
column with i64
, the number of microseconds since unix epoch will be returned
When binding a value to a JSON or JSONB parameter, you can either supply a serialized value (i.e. []u8
) or a struct which will be serialized using std.json.stringify
.
When binding to an array of JSON or JSONB, automatic serialization is not support and thus an array of serialized values must be provided.
When reading a JSON
or JSONB
column with []u8
, the serialized JSON will be returned.
You can create a pg.Listener
either from an existing Pool
or directly.
Creating a new Listener directly is a lot like creating a new connection. See Conn.open and Conn.auth.
// see the Conn.ConnectOpts
var listener = try pg.Listener.open(allocator, .{
.host = "127.0.0.1",
.port = 5432,
});
defer listener.deinit();
try listener.auth(.{
.username = "leto",
.password = "ghanima",
.database = "caladan",
});
// add 1 or more channels to listen to
try listener.listen("chan_1");
try listener.listen("chan_2");
// .next() blocks until there's a notification or an error
while (listener.next()) |notification| {
std.debug.print("Channel: {s}\nPayload: {s}", .{notification.channel, notification.payload});
}
// The error handling is explained, sorry about this API. Zig error payloads plz
switch (listener.err.?) {
.pg => |pg| std.debug.print("{s}\n", .{pg.message}),
.err => |err| std.debug.print("{s}\n", .{@errorName(err)}),
}
When using the pool, a new connection/session is created. It does not use a connection from the pool. This is merely a convenience function if you're also using normal connections through a pool.
var listener = try pool.newListener();
defer listener.deinit();
// listen to 1 or more channels
try listener.listen("chan_1");
// same as above
A listener will not automatically reconnect on error/disconnect. The pub/sub nature of LISTEN/NOTIFY mean that delivery is at-most-once and auto-reconnecting can hide that fact. Put the above code in a while (true) {...}
loop.
The handling of errors isn't great. Blame Zig's lack of error payloads and the awkwardness of using try
within a while
condition.
listener.next()
can only return null
on error. When null
is returned, listener.err
will be non-null. Unlike the Conn
this is a tagged union that can either be err
for a normal Zig error (e.g. error.ConnectionResetByPeer) or pg
a detailed PostgresSQL error.
A few basic metrics are collected using metrics.zig, a prometheus-compatible library. These can be written to an std.io.Writer
using try pg.writeMetrics(writer)
. As an example using httpz:
pub fn metrics(_: *httpz.Request, res: *httpz.Response) !void {
const writer = res.writer();
try pg.writeMetrics(writer);
// also write out the httpz metrics
try httpz.writeMetrics(writer);
}
The metrics are:
pg_queries
- counts the number of queriespg_pool_empty
- counts how often the pool is emptypg_pool_dirty
- counts how often a connection is released back into the pool in an unclean state (thus requiring the connection to be closed and the pool to re-open another connection). This could indicate that results aren't being fully drained (either by callingnext()
untilnull
is returned or explicitly calling thedrain()
method)pg_alloc_params
- counts the number of parameter states that were allocated. This indicates that your queries have more parameters thanresult_state_size
. If this happens often, consider increasingresult_state_size
.pg_alloc_columns
- counts the number of columns states that were allocated. This indicates that your queries are returning more columns thanresult_state_size
. If this happens often, consider increasingresult_state_size
.pg_alloc_reader
- counts the number of bytes allocated while reading messages from PostgreSQL. This generally happens as a result of large result (e.g. selecting large text fields). Controlled by theread_buffer
configuration option.
Launch the Postgres database with the provided Docker Compose configuration:
cd tests/
docker compose up
Run tests:
zig build test