How to change serialization level on Amazon Redshift
January 20, 2023 in Devops

By default, AWS Redshift has table-level serializable isolation. The goal is to ensure that two parallel queries cannot affect one another, so you don't need to care about concurrency in the context of data consistency. This means that if one query is running a SELECT from a table, another query happens to update the same table, and the update might impact the SELECT, Redshift will reject the query with a "Serializable isolation violation" error. So now you do have to care about concurrency, and either retry or redesign your queries. This isn't new, of course. What's different about Redshift is that isolation is enforced per table, while in, say, PostgreSQL it's per row. So you can run into a situation where rewriting the queries safely is impossible, and at a certain level of load it's also impossible to avoid concurrent queries.
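To make this concrete, here is a minimal sketch of the classic pattern that triggers the error: two interleaved transactions that each read one table and write the other (tab1 and tab2 are hypothetical names), so that no serial ordering of the two is possible:

-- Session 1:
BEGIN;
SELECT * FROM tab1;
INSERT INTO tab2 VALUES (1);

-- Session 2, started before session 1 commits:
BEGIN;
SELECT * FROM tab2;
INSERT INTO tab1 VALUES (1);

-- Whichever session commits second is rejected with an error along the
-- lines of: ERROR: 1023 ... Serializable isolation violation on table ...
COMMIT;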
To mitigate this, in May 2022 Redshift rolled out a new feature: snapshot isolation. In effect, snapshot isolation brings Redshift down to row-level isolation. If you have a Redshift table with a large throughput of both reads and writes, this is the setting to enable.
Enabling snapshot isolation is quick, but it requires a database free of other connections. As you can imagine, that’s hard to ensure, especially in an environment big enough to already have problems with concurrency!
1. Log in as a superuser (master or other); this can be done through psql, or through the Redshift Query editor.

2. Isolation level is set per database. Set the connection limit to 0 for the database you're altering it for:

   ALTER DATABASE yourdb CONNECTION LIMIT 0;

3. Forcibly disconnect all clients; this can't be done in batch, so you'll need to do a little manual scripting. First, get the list of process IDs of clients connected to your database:

   SELECT process FROM STV_SESSIONS WHERE db_name='yourdb' AND process != pg_backend_pid();

4. Take the list of process IDs and run a

   SELECT PG_TERMINATE_BACKEND(every_process_id);

   query for every one of them (I suggest making a quick script with a multi-cursor editor like VS Code, or generating the statements in SQL, as in the sketch after this list). You can be sure that no new clients can connect in the meantime because of the connection limit.

5. Double-check the list of clients using the query from step 3.

6. Now, alter the isolation level; this is an instant operation:

   ALTER DATABASE yourdb ISOLATION LEVEL SNAPSHOT;

7. And finally, allow clients back in (UNLIMITED is the default connection non-limit, but if you had a specific limit set, make sure to restore it):

   ALTER DATABASE yourdb CONNECTION LIMIT UNLIMITED;
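A small trick for step 4: instead of copying process IDs around by hand, you can have SQL generate the termination statements for you. A minimal sketch, assuming your database is named yourdb as above; run it, then execute each row of its output:

-- Builds one PG_TERMINATE_BACKEND statement per foreign session.
SELECT 'SELECT PG_TERMINATE_BACKEND(' || process::text || ');' AS kill_stmt
FROM STV_SESSIONS
WHERE db_name = 'yourdb'
  AND process != pg_backend_pid();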
That's it! With instructions ready at hand, the whole operation should take no more than five minutes, although, as I understand it, downtime is not completely avoidable.