The evolution of a highly adaptive system.
Posted by Chris Subritzky - 07 June 2016
One of the key architectural decisions to be made when designing a transactional system is the trade-off between concurrency and accuracy. The decision is not always straightforward, yet it plays a key part in the success or failure of the system in the eyes of its stakeholders and key users. Too little accuracy and people won’t trust the system. Too little concurrency and the system will eventually grind to a halt.
In a database, accuracy is maintained by locking, a process whereby only one user can interact with a piece of data at a time. By preventing multiple people from interacting with the same piece of data we can be sure that we are not going to end up with inconsistent data. Take a bank account balance update as a simple example:
| Opening balance | Transaction amount | Closing balance |
| --- | --- | --- |
| $100 | +$50 | $150 |
The closing balance is calculated from the opening balance and the transaction amount. If two transactions occurred at the exact same time and both used the same opening balance, the closing balance after the second transaction would be incorrect. It is therefore of the utmost importance that only one transaction alters the balance of the bank account at a time. By locking the account balance, the first transaction prevents the second from accessing the data until the first has completely finished, ensuring the closing balance is correct at all times.
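The lost-update problem above can be sketched in a few lines using Python’s built-in sqlite3 module (a stand-in for whatever database the real system would use). Two interleaved updates both read the same opening balance, so one deposit vanishes; making the read-modify-write a single atomic statement lets the database serialise the updates.

```python
import sqlite3

# A minimal sketch of the lost-update problem. The table and amounts are
# illustrative, not taken from any real banking system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")

# Both transactions read the same opening balance of 100...
opening_a = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
opening_b = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]

# ...then each writes back opening + its own amount, so the first deposit
# is silently overwritten by the second.
conn.execute("UPDATE account SET balance = ? WHERE id = 1", (opening_a + 50,))
conn.execute("UPDATE account SET balance = ? WHERE id = 1", (opening_b + 30,))
wrong = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(wrong)  # 130, not the correct 180

# The fix: make the read-modify-write atomic, so the database serialises
# the two updates and each one sees the other's closing balance.
conn.execute("UPDATE account SET balance = 100 WHERE id = 1")  # reset
conn.execute("UPDATE account SET balance = balance + 50 WHERE id = 1")
conn.execute("UPDATE account SET balance = balance + 30 WHERE id = 1")
right = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(right)  # 180
```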
There is, however, a price to pay for locking, and that price is concurrency. As the volume and complexity of transactions in the system grows, so does the chance that two transactions will require the same piece of data, increasing the likelihood that blocking will occur. Blocking is bad from a user perspective as it introduces tangible delays that can grow to the point where the system becomes unresponsive. Blocking can be removed by telling the system to ignore any locks when reading data, but this increases the risk of the system returning incorrect or inconsistent data. In technical terms, telling the system whether to respect or ignore locks is known as setting the isolation level.
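Blocking itself is easy to observe with sqlite3, assuming SQLite’s default locking behaviour: one connection holds the write lock, and a second connection with a short busy timeout gives up with "database is locked" instead of waiting indefinitely.

```python
import sqlite3
import tempfile
import os

# A sketch of two writers blocking each other. isolation_level=None puts
# the connections in autocommit mode so we can manage transactions by hand.
path = os.path.join(tempfile.mkdtemp(), "bank.db")
writer1 = sqlite3.connect(path, isolation_level=None)
writer1.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
writer1.execute("INSERT INTO account VALUES (1, 100)")

writer1.execute("BEGIN IMMEDIATE")  # take the write lock and hold it
writer1.execute("UPDATE account SET balance = balance + 50 WHERE id = 1")

# The second writer will wait at most 0.1 s for locks before giving up.
writer2 = sqlite3.connect(path, timeout=0.1, isolation_level=None)
try:
    writer2.execute("BEGIN IMMEDIATE")        # blocked by writer1
    blocked = False
except sqlite3.OperationalError:              # "database is locked"
    blocked = True

writer1.execute("COMMIT")                     # release the lock
writer2.execute("BEGIN IMMEDIATE")            # now succeeds
writer2.execute("COMMIT")
print(blocked)  # True
```

In a busy system the equivalent of that 0.1-second timeout is the delay users actually feel while their request queues behind someone else’s lock.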
The question a system architect asks is how important it is that any given piece of data in the system is accurate. Undoubtedly, when recording transactions against a bank account you’ll want to use locking to ensure that everything is 100% accurate. But how about when reading that data? If an account holder went to an ATM and requested their account balance while these two transactions were being processed, the system architect would have to decide whether to:

- make the ATM wait until the transactions have finished, guaranteeing an accurate balance at the cost of a delay; or
- ignore the locks and return a balance immediately, accepting that it may not reflect the transactions still in flight.
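The ATM read can be sketched the same way, assuming SQLite’s default journal mode, where an exclusive writer locks out readers. The `last_known` cache and `atm_balance` helper below are illustrative names, not part of any real banking API: one call waits only briefly and falls back to a stale cached balance; the other waits for the lock and returns the accurate figure.

```python
import sqlite3
import tempfile
import os

# A sketch of the two choices for the ATM balance read.
path = os.path.join(tempfile.mkdtemp(), "bank.db")
teller = sqlite3.connect(path, isolation_level=None)
teller.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
teller.execute("INSERT INTO account VALUES (1, 100)")
last_known = 100  # balance cached before the update began (illustrative)

teller.execute("BEGIN EXCLUSIVE")  # a deposit is mid-flight; readers are locked out
teller.execute("UPDATE account SET balance = balance + 50 WHERE id = 1")

def atm_balance(wait_seconds):
    """Return (balance, accurate): fall back to the cache if locked out."""
    atm = sqlite3.connect(path, timeout=wait_seconds)
    try:
        row = atm.execute("SELECT balance FROM account WHERE id = 1").fetchone()
        return row[0], True
    except sqlite3.OperationalError:  # "database is locked"
        return last_known, False

fast = atm_balance(0.05)     # don't wait: instant, but possibly stale
print(fast)                  # (100, False)
teller.execute("COMMIT")     # the deposit finishes
accurate = atm_balance(5.0)  # willing to wait: lock is free, read succeeds
print(accurate)              # (150, True)
```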
It is this balancing act between accuracy and concurrency that can make the difference between a system that is a success and one that is deemed a failure. Determining which isolation level to use comes down to a combination of experience and an understanding of what the system aims to do. Different isolation levels can be applied to different screens and processes within the same system, so with careful planning it is possible to design a system that is accurate where it matters while still allowing enough concurrency to remain usable as it scales.
Typically, when determining an isolation level for a given screen or process, the architect will ask:

- How important is it that the data shown here is completely accurate?
- What is the impact of showing data that is slightly out of date?
- How likely is it that another process will be updating this data at the same time?
There are many variables that come into play when designing a transactional system that is both robust and scalable. To get the best results, the system architect needs a technical understanding of how the database handles concurrent requests, together with a strong understanding of each business process, so that an appropriate isolation level can be chosen for each one.