[Music] [Music] in today’s class we are going to look at transaction processing concepts any doubts in this particular class you can email your doubts to the email address that is shown in the slide here DJ ROM at iit m dot AC dot in now we will look at what is the meaning of a transaction in databases a transaction is essentially an atomic unit of access which is either completely executed or not executed at all typically databases store data of interest and we have applications trying to access the database and modify the data that is stored in the database now one example is a banking database where you wish to transfer funds or withdraw amount from your account in the bank database or you want to debit or credit certain amounts are transfer amounts from one account into another account when the database is operated upon by the applications we have certain instructions that will be executed and these instructions are not just normal instructions as executed in an operating system but they need to obey something more than the normal instructions what we’re going to see is we’re going to look at the instructions and then see how these instructions need to be taken care of or what properties need to be enforced on these instructions which we call as transactions now there are some examples of typical examples of transaction any enterprises use databases to store information about the state now any occurrence of a real-world event that changes the enterprise state requires the execution of a program that changes the database state in a corresponding way for example balance must be updated when you deposit into your banking database when you withdraw an amount it basically need to update again your account by modifying the balance correctly so typically what we say is a transaction is a program that accesses the database in response to real-world events they’re basically going to modify the state of the database transaction essentially modifies the state of the database now here is a very simple example of a database transaction as you can see here it’s a debit transaction which tries to withdraw a particular amount from a banking database here we show that there is a debit and account number and a debit amount is given to the debit transaction now the begin transaction shows that this is a start of the transaction execution of the transaction now there are three instructions which are part of this transaction the first one is read the account number and the balance that is there in the database in their banking database now as part of the read instruction we’re going to see later what are all the other instructions that need to be executed when this read instruction has to be executed by the transaction now after this read instruction succeeds you will have even account balance in the variable called balance and now this balance has to be updated correspondingly with the amount that is going to be withdrawn from the account as you can see later after the withdrawal amount the new balance has to be computed by changing the balance and then the new balance has to be written onto the database back and finally you signal that this transaction is finished by giving an end transaction now what we basically see here is the three instructions id-1 id2 and ID three shown in the slide here together constitute what we call as a transaction now normal programs we don’t basically distinguish by grouping instructions together and saying that they together constitute a transaction now we will go further deep analyze why these instructions put together will be called as transaction in this particular case another example could be a credit transaction as you can see here again we have credit transaction being supplied with an account number and the amount that you would like to credit as in the other case you have here a begin transaction and the end transaction signalling that all the instructions in between constitute together a transaction now as done in the earlier case the read instruction or read instruction is going to fetch the balance from your bank database given the account number that particular account number the balance will be read from the back-end database now the new balance is computed by adding the amount that you are crediting into the account and now you have to write back the new balance back into the database now these are very simple examples of what is a debit transaction and a credit transaction in the case of a banking database now we can see how these transactions in in reality operate on the database for example it is shown here in terms of the credit process which is shown in the slide shows the credit part of the transaction which we saw earlier now if you look at the debit out of the transaction you can see that the debate process executes the id-1 id2 ID three instructions shown earlier as shown in this figure the back-end database which is stored on the disk contains the information or the data relating to the bank customers now the credit or debit transaction needs to access the database and retrieve the information and correspondingly modify this information and after modification they need to write the information back onto the database okay now let us understand this little more carefully here now we have as part of the debit transaction ID 1 ID 2 and ID 3 ID 1 is a read instruction now this read instruction has to go to the back-end database management system which takes care of actually now finding out the corresponding data on the disk and move the data back to the local variables of the process debit process that is now after the ID 1 is executed the balance variable will have correct data relating to the current balance that is there in the account number now since it’s a debit transaction the balance is going to be updated here by withdrawing the amount from the current balance and the new value will be computed which will be stored on the balance variable now at the end of the transaction debit transaction the value has to be written back onto the back-end database and that what signals the end of the transaction now at the beginning of the transaction the value is read from the bank database and at the end of the transaction the new value is written back onto the database this is what we mean by a transaction a transaction essentially is reading some data from the database processing that data according to the semantics of the transaction and the new changed values are being written back onto the database at the end of the transaction the same thing actually happens in the case of the credit transaction that is shown here except that the New Balance is now added with the amount supplied now we’re going to see when the transactions are operating on the database we need to have certain properties that needs to be enforced on these instructions so that the state of the database is in a consistent fashion okay now let us understand what are those properties and what happens really when the transactions are executing on the database now here is the case where we are explaining what happens when the transactions have to read data from the database as shown earlier in the figure we have to first find the address of the disk block that contains the data item X X in this particular case can be a balance it can be an account number so it’s basically the data that needs to be fetched from the back-end database now once you have actually found the disk block you have to copy the disk block information into a buffer in the main memory that is the local variable that is shown in the figure earlier now once the item has been copied into the buffer the value of the disk block is copied into the buffer then you have to copy that value into the item X to show that the program variable named X now contains the value that is fixed from the disk block this is what the read operation which is shown in the transaction signifies now we have correspondingly right operation which shows how the right is executed in this particular case you can see that find the address of the disk block that contains X and now in this particular case you not only fetch the value of the data you go to modify the data and then the updated value is going to be written back onto the buffer as shown here you can understand here the last step is different from the earlier read operation store the updated block from the buffer back to the disk that is the extra instruction that’s going to be executed in this particular case now essentially read and write are the two operations that are going to be used by the database transactions and we have seen how the read and the write fits the required information for the database affects the required information for the transaction from the database that’s what was explained right now this is similar to a normal process which would have read the value from a back-end disk file this is essentially same concept as of a process trying to open a file and read the information from a file what more has to be done here is it is not just a set of file operations that are being performed by a process but we also need to enforce certain conditions on these instructions so that the database state is always maintained consistently and that is what we mean by a transaction and the transaction has to enforce these properties on the instructions it is executing one of the things that will happen for a normal processes when failures occur the process can leave the files that it has hole opened in a inconsistent state for example if you understand the operating system and open if your process opens files and the operating system crashes due to power failure or other reasons what really happens is the files that are open could be left in an inconsistent fashion similarly if a file is being operated simultaneously by more than one user again the chance of corruption exists on the file because there is no guarantees on how the file is being accessed simultaneously by multiple users and this has to be prevented in a database because the data that is being stored in a database needs to be in a consistent fashion always that is to be maintained for example if you consider a banking database whatever happens you you wish that you should not lose your money that you’re depositing into the bank if the bank comes in tells there is a power failure and we have last year 1 million rupees that you have deposited you’re going to I know say that this is quite unacceptable and you want the bank to ensure that whatever happens the data that is stored in the banking database is consistent all the time this is an essential difference between database and file systems file systems could be they may not be any guarantees associated with file systems in operating systems whereas when you take the database transactions there is certain level of guarantee that is given to you regarding the state of the database at the end of the execution of the transactions now let us look at what kind of failures can happen in a system and what are the consequence of those failures there could be different kinds of failures we have actually listed a few failures here and we will start discussing them in more detail as we go along now you can see here the first kind of failure that can happen is a hardware or a software error in the computer during transaction executing this means it’s possible that there is a problem in the hardware or the software for example it could be an operating system bug or it could be a hardware bug that could have made the computer fail which means that when you are executing the set of instruction that is shown earlier as a transaction the failure can occur okay the other kind of failure that can occur is internal to the transaction failure caused by an operation in the transaction for example you are actually dividing by a 0 divided by 0 will cause cause the program to crash right so this is another kind of error that can happen in the system the other kind of errors is conditions that cause a cancellation of a transaction for example data needed for the transaction not phone you are trying to transfer funds from one account to another account then you recognize that the other account doesn’t exist this will result in the transaction to be aborted because the fund transfer is not happening correctly the account into which you should be transferring the funds is not found in this particular case the other important issue is concurrency control enforcement which is to be done when multiple transactions are simultaneously operating on the database for example if you really see how people can operate with the accounts it’s possible that – you could be withdrawing some amount of money using your ATM ATM machine but at the same time they could be another transaction trying to transfer funds from your account to another account when this happens it’s possible that the state of the database could be corrupted unless there is some kind of a concurrency control that is enforced to ensure that the system is in a consistent state we’re going to look at during the subsequent lectures in detail how the concurrency control is enforced by the database management system on transactions so that the database is in a consistent fashion now it’s also possible because of the concurrency control that is enforced a transaction is aborted because the transaction has started executing and the concurrency control mechanism found that the transaction cannot proceed any more then it may also abort the transaction so this is another reason why a transaction could fail there could be other reasons as well something like loss of data in the disk blocks during a transaction due to let it let us say the disc ahead has crashed so then it’s possible that you’re not able to retrieve the data correctly from the disk this is disk failure this occurs when the disk hard disk has failed there could be other catastrophic reasons for you when you deal with the databases things like power failure v and other kinds of catastrophes like at quake which could destroy the data and they are beyond the human control now one of the things you should realize is all these kinds of failures are possible and in the event of these failures the database still should ensure that the data that is stored in the database is consistent and it is available by other means that is you’re able to retrieve the data back even when failures of this nature occur now part of this lecture will explore how you can handle this situation when failures occur when transactions are executing now what we are going to look at it is we will intuitively understand the concept of transaction to start with and we will say what can happen to a transaction in the event of failures we in fact looked at several kinds of failures starting from transactional error to a disk failure to a power failure to a more catastrophic failure so what we would like to see is what happens if these failures occur when transactions are in progress imagine you are withdrawing money from your bank account and the power fails now what happens is your bank account still shows correct balance or is it going to show that you have already withdrawn when you have not taken your amount what is the state in which your bank database will be left when the failure occurs when you are withdrawing money from the account now here is a case where it is shown more precisely to say what kind of scenarios can prevail and how those scenarios have to be addressed look at the scenario one what happens if the credit transaction fails after executing I see one and before executing I see two remember I see one is a read balance instruction and I see two years when it is actually modifying that balance locally it still has not written that value back on to the database because a right has to be done at the end of it I see three has to be done to write the balance back on to the backend database now what happens if the credit transaction fails after executing I see one but before executing I see two now in a normal scenario if you don’t really take care of this situation it is possible that the database is left in an unknown and undetermined aesthetic condition when the failure occurs but you have to actually prevent this from happening by saying that you will bring back the database to a consistent fashion if a failure occurs in this particular case you have to ensure that all the instructions I see one I see two and I see three are either executed or not executed at all this is a very important property that needs to be insured for database transactions we are going to see this property in more detail this property is called the atomicity property of the transactions that is all the instructions put together have to be executed either in full or none of them should be executed at all in fact if you carefully look at the initial example where we had preceded the three instructions I see one I see two and I see three with the begin transaction and an end transaction all the instructions between the begin and end have to obey this property called the atomicity property that either all the instructions are executed in full or none of them are executed this is what we see a scenario one what can happen if the credit transaction fails after executing I see 1b and before executing I see two let us move to scenario to what happens in scenario 2 if the credit and debit transaction execute simultaneously what are the likely things that can happen I in fact shown a case where I see one is executed then followed by ID one is executed then I see two is executed then ID 2 is executed then I see 3 and followed by ID 3 if you carefully look at the way it was written here both the credit and the debit intervened debit instructions have been interleaved I see one is basically a read account number and the balance ID one is also a read instruction on the database except that this is a debit instruction the earlier is a credit instruction now this will also read the balance in the account number now if we say that both are operating on the same account number they are raiding the balance at the same time now imagine I see one has the same balance the value that is currently let us say the account has a balance of 500 rupees in your account then both I see one and ID one read the value as 500 now imagine that you are depositing 200 rupees and withdrawing 100 rupees now if I see 2 will say that 500 plus 200 which is actually 700 and I see 3 will try writing 700 back into the database whereas ID 2 will try to reduce the balance from 500 by 400 and I did 3 will write the value as 400 now we can see of last some amount in the process because the credit amount is completely lost because both the credit and the debit transactions or simultaneously operating and only that debit is shown here the credit is lost the credit that is done into the database is lost in this particular case so this is what we see a scenario to when in transactions operate concurrently on the database items it’s possible that the database State is left in an inconsistent fashion as shown in this particular example now we have to prevent this from happening and this is what we call as a consistency property of the transaction now what we mean by consistency here is when the transactions are operating concurrently simultaneously you need to enforce the condition that the transactions in effect have executed one after the other rather than simultaneously this is some sense you need to prevent if there is conflict between transactions they operating simultaneously on the database items has to be prevented and this is achieved by what we earlier called as concurrency control mechanisms so we need concurrency control mechanisms for making sure that the database when it is operated upon simultaneously by multiple transactions is not left in an inconsistent state this is what we see as a scenario 2 and scenario 2 gives the property of consistency where a scenario 1 gives the property of atomicity to the transaction now let us move on to the third scenario scenario 3 where basically it is possible for one transaction to see the values of the other transaction before it’s actually finished it’s full operations now that’s what is actually stated here what happens if the result of the credit transaction are visible to debit transaction before it is actually written on to the database what does this mean this is elaborated further by saying that I see two rights the balance in the earlier case if you’re actually depositing 200 rupees when your initial balance is 500 I see two will write a value of 700 now the debit can read this value of 700 even before the credit has actually committed its values to the back-end database now you can the debit can now go and then withdraw the money from the new balance even before it is written back onto the back-end database if this happens the results of one transaction are visible in this particular case the credit transaction are visible to the debit transaction before it is actually finished execution now this results in what we call as an isolation property because for some reason if the credit transaction fails later okay for various other reasons if the credit transaction fails and if its results are already visible for the debit transaction you need to abort the debit transaction also because it has read the values of a transaction that is aborted this is what we mean by causing cascading aborts if a transaction values or results are available for some other transaction before it is committed it could lead to cascading aborts to avoid this what we have to do is we have to enforce the property called isolation isolation ensures that the transaction results the values that a transaction or the values which the transaction has changed the values of the data items which a transaction has changed are not available for other transactions till the transaction is actually committed actually concurrency control protocols and commit protocols which go together ensure the consistency of the database in the presence of multiple transactions executing simultaneously on the database now let us move on to the fourth scenario which leads to the final property fourth property of the transaction now this scenario 4 tells what happens if the database server crashes before the change data is written onto a stable storage one could imagine several situations where the database values have been written that transactions have committed but their final values have not been written on to the database for various reasons now whatever happens after the transaction says it has committed its value should be preserved the value that the transaction has changed should never be altered after the transaction has committed so you have no way of saying that the results of the transaction is lost after it has committed this is what we mean by the property of durability all results of the committed transactions are preserved after that point once the transaction has committed so you have to guarantee this in spite of any other kind of failure that may happen to your database these four properties are very important properties when we talk about transactions now just to repeat these properties atomicity ensures which we have actually derived from scenario one ensures that all the instructions of a transaction are executed in full or none so the first question of some part of the instruction being executed and some part of the transaction instruction not being executed doesn’t arise at all because you ensure that all the instructions of the transaction are executed in full or none the second property which we discussed is a consistency property when multiple transactions are accessing data simultaneously the access is protected through concurrency control mechanisms to ensure that the updates which are done by the concurrently executing transactions are not lost on the database this is what we actually mean by the property of consistency and we also mentioned that consistency is ensured in database management systems by using a set of concurrency control protocols and we are going to study these concurrency control protocols in depth during these lectures now the third property is isolation the isolation property ensures that the results of one transaction will not be visible to the other transaction till the transaction commits this ensures that there are no problems relating to partial results being available for other transactions we also mentioned that when this happens cascading aborts takes place when one transaction results have been read by other transaction and the earlier transaction has to be aborted and to prevent this cascading aborts we enforce the property of isolation on the transactions okay the fourth property is the durability property and it says that the effects of the committed transactions are not last after commitment for example you’ve deposited some mo into your bank and you want to ensure that it’s never last after you have actually deposited the money into the bank it will never be lost that’s basically the durability property now all these four properties put together are nicely known as the acid properties of the transaction as shown here it’s a summarization of the four properties that we have been so far discussing a stands for atomicity C stands for consistency i stands for isolation and D stands for durability so these four properties put together are called as the acid properties of the transactions a normal process will not obey these acid properties whereas the transactions in the data base context will obey these acid properties now one of the things that we are going to look at through these lectures is see how these acid properties are realized by the database management system when we are actually executing transactions in the database now we will also further elaborate these acid properties little more formally by actually taking what happens and how these acid properties are ensured now as you can see here in the case of the debit transaction all the instructions starting from the begin transaction to the end transaction will be executed in full or none which actually means that id-1 id2 id3 have to be executed in full now one of the things that we are going to do is when there is a begin transaction we record the state of the database now whatever happens after the transaction starts executing if there is a failure you ensure that you get back to that state by restoring the state to the original state if the instructions are not executed in full for example if your original balance starting balance is 500 rupees and for some reason the debit transaction cannot be executed all the instructions restore the balance back to 500 this is what we mean by undoing a transaction the transaction all the instructions which were executed partially till completion of the transaction are rolled back which means all those instructions will be nullified you actually roll back on those transactions so that the effect of those instructions is nullified so this is what we mean by atomicity property we ensure that either all the instructions are either executed together or none of them are executed okay now again to stress again what really we were talking about consistency in case of both debit transaction and credit transaction access the balance data simultaneously will protect them through the concurrency control mechanisms a simple mechanism that we are going to use is we log the database items and allow only transactions which acquired these locks to change the values of those data items and only when the transaction releases the locks on those data items other transactions will be allowed to use those data items this is a very simple technique there are more sophisticated techniques that can be used for enforcing concurrency control mechanisms but this is what we we would like to do to ensure that the consistency property is enforced or realized on the database now when you go to the isolation property you’re going to look at either debit or credit transaction results will not be available unless they are committed in one way these transactions have to hold on to these locks when they required and should not release those locks for other transactions till they have committed to ensure the values that they’re modified are not available to other transactions till they’re finished execution till they reach the state of end transaction which means that now they have committed their values and after that only those results will be visible for other transactions now there are several ways in which we can ensure the durability property the durability property will ensure that you know you have backups sufficient backups we have written all your locks committed transaction logs and there are various ways in which the effects of these are preserved to make sure that all the committed transaction values can always be obtained by using the backups and the transaction logs I’m going to look at this property and how this is realized in detail as we go along now we come into more details of what really happens with transactions for this actually will introduce certain terminology to start with the idea is to get more formal with the transaction concepts see them in more detail as we progress so far we have been looking at the properties very intuitively trying to understand them in a very intuitive fashion now we will try to understand the concepts in a more formal way now there are two tails in which the transactions can enter into one is a committed which actually means when the transaction has completely executed all its instructions it can enter into a commit state which actually means that all the reads and writes of the data items it has actually read can now be written back and they have been safely written back on to the database in which case we say that the transaction has committed itself now for some reason transaction has started executing but it cannot commit the values of the data items that it has changed which it has read from the database then we say that the database has entered the state of abort which actually means that all the effects of the transaction will be nullified and database state will be left when the transaction start executed that’s equivalent to saying that I actually had a begin transaction the state of the database when I started executing this transaction which is the beginning instruction and I actually keep the state back to that initial state when the transaction started executing that’s called the abort State so we have commit and abort a transaction could be either committing or aborting when it says it has committed it is writing all the values that it has read the unchanged back on to the database when it says it is aborting it is not committing any of the values that it has changed yeah so to emphasis the transactions are not just ordinary programs ordinary program instructions all our discussion today highlights the fact that additional requirements are placed on transactions to ensure that these acid properties as some atomicity consistency and isolation are realized with the transactions now here is a case where we can quickly see with respect to commit and abort what really happens for the atomicity property now atomicity property says that a real-world event either happens or does not happen now if you take the case where a person either books or does not book his tickets which actually means that when he’s actually booked his tickets the transaction has committed the values when he says he didn’t book the ticket it means that the transaction has aborted and again to give the state the thing that it is not true of ordinary programs a hardware or a software failure could leave files partially updated which is not the case in the case of transactions when you say I have booked my ticket it means that you have booked your ticket that means that transaction booking tickets has committed and when it says it does not book the ticket it means that it has not committed it is aborted now coming to the consistency property you have to ensure that the set of integrity constraints that are specified by the transaction are all enforced when the transaction has executed now what we mean by this is transaction designer must ensure that assuming the database was in a state that is satisfied all integrity constraints when execution of a transaction got started then when the transaction is actually completed execution you need to ensure that all the integrity constraints are once again satisfied in a simple way consistency can be ensured by saying that the transaction execution results in a serializable execution that is the transactions executed as if all the operations have been executed in a serial fashion we go to look at that particular property right now in the next few minutes in a little more detail and see what does it mean by consistency preservation and isolation essentially will avoid cascaded robots as explained earlier here is a simple case where isolation was given in a more detailed fashion takes it relates to when multiple transactions execute concurrently and you won’t actually ensure that the final execution does preserve the consistency by ensuring that one transaction values are not read by the other by other transaction it has finished now we will see this property of concurrency and isolation together by taking a simple example as given in this particular case as explained in this diagram as you can see this here t1 has two operations op1 and op1 to and t2 has two operations operation 2 1 and 2 2 now it’s possible that these sequence of operations can be interleaved in multiple ways on your database as you can see here one possible sequences operation 1 1 executed first of transaction 1 then operation to one of transaction 2 then operation to 2 is executed of transaction 2 and operation 1 2 of the first transaction is executed as you can see here from the execution sequence of this we may not be able to say that transaction t1 completed all its operations before transaction 2 is executed but on the other hand if there is a way you can ensure that all the instructions of transaction 1 are finished before transaction 2 it is equivalent to saying that the set of instructions that are executed or all in a serial fashion now one of the requirements of operation 1 1 and operation 1 2 to be serializable in this context is they are not operating on the same account it’s possible that two people are withdrawing money from two different accounts since there is no conflict in this particular case between the two operations it does it doesn’t really matter even if 1 2 is executed later we can always say interchange the operations since there is no conflict in this particular case and rewrite the set of operations as if they have been executed as Row 1 2 whoa one go one two and then go to one and war to 2 when operations don’t conflict it is possible for us to interchange operations and then ensure what we see the property of serializability that is all the operations of t1 have finished before t2 okay that is what we mean by serialize ability or in other sense what we’re going to say is all the operations of t1 since they’re finished t1 precedes t2 in terms of how the operations have been executed on the database this is called serializability however let us say that the operation 1 1 and operation 2 1 conflict with each other in the sense that they access the same database item in this particular case we can say that they’re accessing the same account and the same balance and they’re trying to modify the same balance that they are not just reading but writing values modified values on to the database in which case we say there is a conflict now operations will conflict if they operate on the same data item and one of them is right that is what we mean by conflicting operations when transactions conflict we need to serialize the transactions and this is what we mean by conflict serializability the conflicting operations should be executed in such a way that we know that the conflicting operations are executed in a serial fashion which actually means that we can ensure that the operations on the database in this particular case if 1 1 and Rho 2 1 have conflicting they have been executed one after the other and that decides how the transactions proceed with each other now let us see the following scenario where on Row 1 1 and war 2 1 which are two conflicting operations we say that t1 preceded a t2 and let us say Rho 1 2 and what – 2 are also conflicting now let us say as far as those operations are concerned T to proceed Stephen now this is a this is a scenario that will result in the transactions t1 and CT to not being serializable because as far as the conflicting operations were 1 1 and war 2 1 are concerned the t1 is preceding t2 and in the case of war 1 tooth and war 2 2 which are again conflicting the transactions are proceeding in the other direction t2 is proceeding t1 so we can’t say as far as the conflicting operations are concerned t1 is executed before t2 in one case t1 executed before t2 in the other case t2 is executed before t1 this is a very interesting thing which we are going to study in detail in the next lecture we see how transactions need to preserve the property of conflict serializability wooly when transactions execute and their serializable conflict serializable we say that the database is the transactions have executed and the correct fashion on the database we go to further study this property in tomorrow’s lecture in detail as concurrency control mechanisms the concurrency control mechanisms are expected to provide the property of conflict serializability they ensure that when transactions are executing concurrently we can serialize them in a the transactions are serializable and that’s a property that is ensured by concurrency control mechanisms we go to study the concurrency control mechanisms in tell in tomorrow’s lecture okay [Music] you [Music]
Related posts
-
C# Tutorial For Beginners – Learn C# Basics in 1 Hour
Hi! Thank you for taking my C# tutorial for beginners. Let me quickly give you an... -
WordPress Tutorial: Build Your Affiliate Marketing Sales Funnel
Hi Kevin Barham now and in this shorttutorial I’m gonna show you how to build an... -
Python OOP Tutorial 1: Classes and Instances
Hey, everybody. How’s going in this streaks of videos? We’ll be learning how to create and...