MySQL isolation levels and how they work

In the early 1980’s, computer scientists Andreas Reuter and Theo Harder coined the term ACID to describe a set of properties related to database transactions designed to keep data stored reliably and with integrity.

Most (if not all) modern database systems are built around ACID compliance. By adhering to these fundamentals, businesses can confidently trust the data within their database, whether it’s for a small project management app, or a large banking system. Isolation levels, as well as the related concepts, are cornerstones that enable MySQL to fulfill ACID guarantees.

In this article, we’ll break down how multiple clients can work with a single database and maintaining data consistency by using isolation levels.

What is a MySQL isolation level?

A MySQL isolation level is one of four modes that can be set on a MySQL session that controls how transactions should behave when executing concurrently.

This concept relates directly to the Isolation requirement of ACID compliant databases, which states that transactions should be executed in a way that does not affect other transactions. Depending on the isolation level set on a session, various locking mechanisms are used to manage which transactions have access to what data at any given time. This also determines the different inconsistencies (known as violations) that can surface when running multiple transactions concurrently.

With that brief explanation on what a MySQL isolation level is and what benefit it provides, let’s further explore the different isolation levels, transactions, violations, locks, and how they all relate to each other.

What is ACID compliance?

Before we dive deeper into how MySQL keeps data consistent, it’s worth briefly explaining the concepts that make up ACID.

Atomicity

Atomicity means that any number of SQL operations can be executed as a single unit of work known as a transaction.

Transactions are used in MySQL to execute one or more statements as a group, satisfying the Atomicity requirement of an ACID compliant database. When statements are executed as part of a transaction, no data within the database is actually changed until the transaction is committed. Alternatively, if something goes wrong within the transaction, you are given the opportunity to roll back any changes, which can prevent issues with the underlying data if committed.

By default, MySQL has autocommit turned on, which means that individual statements issued are automatically committed without you having to manually start a transaction.

Consistency

Consistency ensures that the database can confidently be moved from one state to another, avoiding any adverse effects or corruption of the data. This is done by enforcing things like constraints, cascading effects, triggers, etc.

Isolation

Isolation states that transactions are executed independently and in a controlled and ordered way since many clients and threads are often connected to a single database and executing transactions concurrently. MySQL isolation levels relate directly to this part of ACID compliance.

Durability

Durability simply means that once a transaction completes, the data can survive system failures and outages by storing the data on persistent storage. Memcached, for example, stores all of its data in memory, so it would fail this requirement.

Example: A music store

The remainder of this article will use a common example of an online store that specializes in selling units of an antique audio medium known as compact discs, or “CDs”.

The schema will be relatively simple with four entities:

products will contain the available stock of a given CD, including the name, artist, and quantity available.
unit_status will list the quantity of CDs for any product that are currently available, on hold, or pending shipment.
customer_transactions will contain records of when sales are made.
customers will list all of the customers that have purchased CDs.

Here is a diagram of what the database will look like:

Isolation violations

Violations are various read phenomena that can occur when transactions are executed concurrently, and are specifically what isolation levels seek to prevent. Depending on the selected MySQL isolation level (each of which will be covered later in this article), one or more violations may be permitted in the name of database performance and query consistency requirements.

Let’s take a look at the three common violations you may encounter while building around your database.

Dirty read

A dirty read occurs when a query within one transaction returns inconsistent data because it may read a new version of the data has not yet been committed to the database based on another concurrent transaction.

To demonstrate this, let’s say that two customers are interested in purchasing the same CD and two units have a status of available. One customer purchases both units, which prompts the system to move the status of two units to sold_not_shipped before reducing the quantity available. At the same time, the other customer checks the inventory to see if the CD is available, and the system returns that there are still two units available.

The transaction used to update the status for the CD is as follows:

SQL

start transaction;
update unit_status set quantity = quantity + 2 where product_id = 20 and status = 'sold_not_shipped';
update unit_status set quantity = quantity - 2 where product_id = 20 and status = 'available';
commit;

Assuming the following query is executed during the above transaction just before commit; is called, the result would be 0 even though the transaction is not finished yet.

SQL

select quantity from unit_status where product_id = 20 and status = 'available';

Here is a timeline of what the above transactions would look like that would result in the customer seeing a potentially unexpected result for quantity available:

At first glance, this may not appear to be an issue since the database reflects that the values of the two rows in the unit_status table after both UPDATES occured. However, if that transaction issued a rollback; instead of commit;, the data returned would be inaccurate since the changes were reverted after the customer was shown that no units were available.

Non-repeatable reads

Non-repeatable reads occur when a transaction with multiple select queries reads the same rows within that transaction, but the data within those rows is different between selects since another transaction has modified the data in that time.

For this example, the store owner is looking to read the number of units available for a given CD across two select queries, however they receive a different result between those two statements since it occurs at the same time that a customer is purchasing the CD.

Here are the queries used to get the quantity available:

SQL

start transaction;


-- Get the current available quantity of the CD, will return 15
select quantity from unit_status where product_id = 20 and status = 'available';


-- The same query will now return 13 (see the transaction below)
select quantity from unit_status where product_id = 20 and status = 'available';


commit;

And here is the same transaction used in the previous example to update the inventory of a given CD when it is purchased:

SQL

start transaction;


-- Move the units from available to sold_not_shipped
update unit_status set quantity = quantity + 2 where product_id = 20 and status = 'sold_not_shipped';
update unit_status set quantity = quantity - 2 where product_id = 20 and status = 'available';


commit;

Because these two transactions are executed simultaneously, they overlap each other, resulting in the second statement in the first transaction returning a different value.

Phantom reads

Phantom reads occur when the actual rows returned between select statements in a transaction differ because another transaction has inserted rows before the first can complete.

Using the CD store example, lets say that the owner wants to know the count of all products in the database. The following transaction is used to get that information:

SQL

start transaction;


-- This will return 100.
select count(*) from products;


-- Add another one for good measure, or for demo purposes, you decide.
-- Either way this would return 101.
select count(*) from products;


commit

At the same time, an employee receives a shipment a new CD, which needs to be added to the database. The following transaction is used to add the new units:

SQL

start transaction;


insert into products set album = 'Rage Against The Machine', artist = 'The Battle Of Los Angeles', release_year = 1999, cost = 1500;


commit;

Finally, here is what the timeline would look like to cause this phenomena:

Because of the newly inserted records in unit_status table, the summation of the quantity available would be different between the two select statements.

Locks and how they relate to isolation

Transaction isolation is enforced by using various types of locks, or flags that can be set on rows and tables to prevent data from being read or modified.

MySQL will use one of two types of locks depending on the isolation level: shared or exclusive locks.

Shared locks

A shared lock is one that can be created by transactions for ensuring the data they are reading doesn’t change. When a shared lock is created on a row, that row is still readable by other transactions. Other transactions, however, cannot modify the data within that row. Any number of transactions can create a shared lock on a row and all locks will need to be released before the data can be modified.

Exclusive locks

When a transaction creates an exclusive lock on a row, only that transaction can read or write the data. If another transaction attempts to read or write data to that row, it will be prevented from doing so. This is especially useful during transactions where you expect data to be updated to prevent some of the violations outlined above.

Gap locking

Gap locking is a special type of lock that is used by MySQL to prevent phantom reads. Gap locking will use criteria in where clauses to lock the space around the read data, preventing rows from being inserted that may alter the query if its run a second time.

To demonstrate this, let’s use the music store example and assume the owner wants to run a sale on any CDs released in 1999:

SQL

start transaction;
select * from products where release_year = 1999 for update;
update products set cost = 800 where release_year = 1999;
commit;

If an employee decides to add a new CD to the products table that was also released in 1999, MySQL would make that transaction wait until the previous one had completed before proceeding:

SQL

insert into products (album, artist, release_year, cost)
	values ('Rage Against The Machine', 'The Battle Of Los Angeles', 1999, 1500);

This prevents the second transaction from inserting a new row that would alter the results of the first transaction.

Controlling locks

It’s worth mentioning that select statements can actually be altered to indicate that MySQL should place a specific type of lock on a row. By adding the for share keywords on the end of a select statement, MySQL will create a shared lock on that row. And using the for update keywords on the end of a select statement creates an exclusive lock.

Here is a modified version of the “dirty read” example from above that uses a locking read. This approach would prevent another transaction from reading the data before it was updated:

SQL

start transaction;
select * from products where id = 20 for update;
update products set cost = 800 where id = 20;
commit;

The four MySQL isolation levels

Now that we’ve covered much of the prerequisite knowledge, let’s talk about how MySQL adheres to the Isolation requirement with various Transaction Isolation Levels.

Isolation levels instruct the database engine on how to manage multiple transactions being performed concurrently, and what violations are possible.

Isolation levels are set per session and each database has a default mode that will be used by every session if it is not changed.

Let’s explore the four isolation levels and what violations they each prevent and allow.

Read uncommitted

Read uncommitted is the lowest isolation level, effectively allowing all violations. This is because transactions with this isolation level will always read the latest version of a row thats been modified by any transaction, whether it's been committed or not (more on row versioning in a bit). This mode should be used when performance takes priority over data consistency.

Deviating from the example outlined above, one situation where this might fit well is in a social media app when finding the number of likes a popular post has. Many social media apps shorten their numbers after a certain point (ie; 1k over 1,000). If a client needs to know how many likes the post has, returning an approximation over the exact value is usually acceptible.

Read committed

Read committed is the step above read uncommitted and prevents dirty reads.

In this mode, each row modified will create a version of that row which is tagged by the transaction ID. Subsequent operations on that row will use the version belonging to that transaction. Each select within that transaction will create and use a fresh snapshot (a version of the read rows at that point in time). This means that if a row is modified and has been comitted by another transaction, the next select will create a new snapshot which now represents the latest committed version of that data.

The key difference between read committed and read uncommitted is that read committed will always use the latest comitted row, whereas read uncommitted will the latest version of that row even if it hasn't yet been committed.

This mode will still allow non-repeatable read and phantom read violations.

Repeatable read

Repeatable read is the default MySQL isolation level used for all connections unless configured otherwise.

With InnoDB's MVCC (Multi Version Concurrency Control), a row may have multiple versions at any given time, depending on open transactions. In repeatable read level, queries will use a consistent snapshot of the data, pinning the reads to a single transaction ID throughout the transaction. Compare that with read committed where reads always pick the latest committed version of any row, and where subsequent reads may return different results.

With locking reads, the method that MySQL uses to lock the data depends on whether a unique index is used.

If an index can be used based on the where condition, MySQL will only lock the necessary rows that match the query. If an index is NOT used and the table is scanned, MySQL will lock all of the rows it reads regardless if they match the where, as well as perform gap locking to prevent inserts that may alter the data if the query is run multiple times.

Using the repeatable read isolation level prevents dirty reads, non-repeatble reads, and phantom reads.

Serializable

Setting the isolation level to Serializable also prevents all violations, but it has the most performance impact on your MySQL server. This mode works exactly like Repeatable read, but implicitly creates a shared lock on all select statements whether you use for share or not. Due to the excessive locks used, there is a greater risk of deadlocks occuring.

How to set a MySQL isolation level

Setting a specific MySQL isolation level is relatively straightforward, and can be done on a specific session or on the entire server.

The following statements will set the isolation level on the current session. The SESSION keyword is optional if you run it prior to starting a transaction, but will error out if you try:

SQL

set transaction isolation level read uncommitted;
set session transaction isolation level read committed; -- 'session' is optional

If you have the CONNECTION_ADMIN user permissions, you can also use the following command to set the default isolation level across the server. This only affects future sessions, and any session is open to change its own isolation level regardless of what the server has it set to:

SQL

set global transaction isolation level serializable;

Finally, isolation levels can be configured prior to MySQL starting, by using the --transaction-isolation=READ-UNCOMMITED or in the configuration file:

[mysqld]
transaction-isolation = REPEATABLE-READ

Conclusion

As you can see, MySQL uses a combination of several different concepts to ensure that it’s ACID compliant, making it a reliable and trustworthy database. Many of these concepts revolve around transaction isolation and the various levels available, allowing you to fine tune the balance between safety and performance!

Share

Next post