Expert Oracle Database Architecture, Third Edition (2014)

Chapter 8. Transactions

Transactions are one of the features that set databases apart from file systems. In a file system, if you are in the middle of writing a file and the operating system crashes, that file will probably be corrupted, though there are “journaled” file systems and the like that may be able to recover your file to some point in time. However, if you need to keep two files synchronized, such a system won’t help—if you update one file and the system fails before you finish updating the second, your files won’t be synchronized.

This is the main purpose of transactions—they take the database from one consistent state to the next. That is their function. When you commit work in the database, you are assured that either all of your changes, or none of them, have been saved. Furthermore, you are assured that the various rules and checks that protect data integrity were implemented.

In the previous chapter, we discussed transactions in terms of concurrency control and how, as a result of Oracle’s multiversioning, read-consistent model, Oracle transactions can provide consistent data every time, under highly concurrent data access conditions. Transactions in Oracle exhibit all of the required ACID characteristics:

·     Atomicity: Either all of a transaction happens or none of it happens.

·     Consistency: A transaction takes the database from one consistent state to the next.

·     Isolation: The effects of a transaction may not be visible to other transactions until the transaction has committed.

·     Durability: Once the transaction is committed, it is permanent.

In particular, we discussed how Oracle obtains consistency and isolation in the previous chapter. Here we’ll focus most of our attention on the concept of atomicity and how that is applied in Oracle.

In this chapter, we’ll discuss the implications of atomicity and how it affects statements in Oracle. We’ll cover transaction control statements such as COMMIT, SAVEPOINT, and ROLLBACK, and we’ll discuss how integrity constraints are enforced in a transaction. We’ll also look at why you may have some bad transaction habits if you’ve been developing in other databases. We’ll look at distributed transactions and the two-phase commit (2PC). Lastly, we’ll examine autonomous transactions, what they are, and the role they play.

Transaction Control Statements

You don’t need a “begin transaction” statement in Oracle. A transaction implicitly begins with the first statement that modifies data (the first statement that gets a TX lock). You can explicitly begin a transaction using SET TRANSACTION or the DBMS_TRANSACTION package, but it is not a necessary step, unlike in some other databases. Issuing either a COMMIT or ROLLBACK statement explicitly ends a transaction.

Image Note  Not all ROLLBACK statements are created equal. It should be noted that a ROLLBACK TO SAVEPOINT command will not end a transaction! Only a full, proper ROLLBACK will.

You should always explicitly terminate your transactions with a COMMIT or ROLLBACK; otherwise, the tool or environment you’re using will pick one or the other for you. If you exit your SQL*Plus session normally, without committing or rolling back, SQL*Plus assumes you wish to commit your work and it does so. If you just exit from a Pro*C program, on the other hand, an implicit rollback takes place. Never rely on implicit behavior, as it could change in the future. Always explicitly COMMIT or ROLLBACK your transactions.

Image Note  As an example of something changing in the future, SQL*Plus in Oracle 11g Release 2 and above contains a setting, “exitcommit.” This setting controls whether SQL*Plus issues a COMMIT or ROLLBACK upon exit. So when you use 11g Release 2, the default behavior that has been in place since SQL*Plus was invented may well be different!

Transactions are atomic in Oracle, meaning that either every statement that comprises the transaction is committed (made permanent) or all of the statements are rolled back. This protection is extended to individual statements as well. Either a statement entirely succeeds or the statement is entirely rolled back. Note that I said the “statement” is rolled back. The failure of one statement does not cause previously executed statements to be automatically rolled back. Their work is preserved and must either be committed or rolled back by you. Before we get into the details of exactly what it means for a statement and transaction to be atomic, let’s take a look at the various transaction control statements available to us:

·     COMMIT: To use this statement’s simplest form, you just issue COMMIT. You could be more verbose and say COMMIT WORK, but the two are equivalent. A COMMIT ends your transaction and makes any changes permanent (durable). There are extensions to the COMMITstatement used in distributed transactions that allow you to label a COMMIT (label a transaction) with some meaningful comment and force the commit of an in-doubt distributed transaction. There are also extensions that allow you to perform an asynchronous commit—a commit that actually breaks the durability concept. We’ll take a look at this in a bit and see when it might be appropriate to use.

·     ROLLBACK: To use this statement’s simplest form, you just issue ROLLBACK. Again, you could be more verbose and say ROLLBACK WORK, but the two are equivalent. A rollback ends your transaction and undoes any uncommitted changes. It does this by reading information stored in the rollback/undo segments (going forward I’ll refer to these exclusively as undo segments, the favored terminology for Oracle 10g and later) and restoring the database blocks to the state they were before your transaction began.

·     SAVEPOINT: A SAVEPOINT allows you to create a marked point within a transaction. You may have multiple SAVEPOINTs within a single transaction.

·     ROLLBACK TO <SAVEPOINT>: This statement is used with the SAVEPOINT command. You can roll back your transaction to that marked point without rolling back any of the work that preceded it. So, you could issue two UPDATE statements, followed by aSAVEPOINT and then two DELETE statements. If an error or some sort of exceptional condition occurs during execution of the DELETE statements, and you catch that exception and issue the ROLLBACK TO SAVEPOINT command, the transaction will roll back to the named SAVEPOINT, undoing any work performed by the DELETEs but leaving the work performed by the UPDATE statements intact.

·     SET TRANSACTION: This statement allows you to set various transaction attributes, such as the transaction’s isolation level and whether it is read-only or read-write. You can also use this statement to instruct the transaction to use a specific undo segment when using manual undo management, but this is not recommended. We’ll discuss manual and automatic undo management in more detail in Chapter 9.

That’s it—there are no other transaction control statements. The most frequently used control statements are COMMIT and ROLLBACK. The SAVEPOINT statement has a somewhat special purpose. Internally, Oracle uses it frequently; in fact Oracle uses it every time you execute any SQL or PL/SQL statement, and you may find some use for it in your applications as well.

Atomicity

Now we’re ready to see what’s meant by statement, procedure, and transaction atomicity.

Statement-Level Atomicity

Consider the following statement:

Insert into t values ( 1 );

It seems fairly clear that if the statement were to fail due to a constraint violation, the row would not be inserted. However, consider the following example, where an INSERT or DELETE on table T fires a trigger that adjusts the CNT column in table T2 appropriately:

EODA@ORA12CR1> create table t2 ( cnt int );
Table created.

EODA@ORA12CR1> insert into t2 values ( 0 );
1 row created.

EODA@ORA12CR1> commit;
Commit complete.

EODA@ORA12CR1> create table t ( x int check ( x>0 ) );
Table created.

EODA@ORA12CR1> create trigger t_trigger
  2  before insert or delete on t for each row
  3  begin
  4     if ( inserting ) then
  5          update t2 set cnt = cnt +1;
  6     else
  7          update t2 set cnt = cnt -1;
  8     end if;
  9     dbms_output.put_line( 'I fired and updated '  ||
 10                                     sql%rowcount || ' rows' );
 11  end;
 12  /
Trigger created.

In this situation, it is less clear what should happen. If the error occurs after the trigger has fired, should the effects of the trigger persist, or not? That is, if the trigger fired and updated T2, but the row was not inserted into T, what should the outcome be? Clearly the answer is that we don’t want the CNT column in T2 to be incremented if a row is not actually inserted into T. Fortunately in Oracle, the original statement from the client—INSERT INTO T, in this case—either entirely succeeds or entirely fails. This statement is atomic. We can confirm this, as follows:

EODA@ORA12CR1> set serveroutput on
EODA@ORA12CR1> insert into t values (1);
I fired and updated 1 rows

1 row created.

EODA@ORA12CR1> insert into t values(-1);
I fired and updated 1 rows
insert into t values(-1)
*
ERROR at line 1:
ORA-02290: check constraint (EODA.SYS_C0061484) violated

EODA@ORA12CR1> select * from t2;

       CNT
----------
         1

Image Note  When using SQL*Plus from Oracle9i Release 2 and before, in order to see that the trigger fired, you need to add a line of code, EXEC NULL, after the second INSERT. This is because SQL*Plus does not retrieve and display the DBMS_OUTPUT information after a failed DML statement in those releases. In Oracle 10g and above it does.

So, one row was successfully inserted into T and we duly received the message I fired and updated 1 rows. The next INSERT statement violates the integrity constraint we have on T. The DBMS_OUTPUT message appeared—the trigger on T in fact did fire and we have evidence of that. The trigger performed its updates of T2 successfully. We might expect T2 to have a value of 2 now, but we see it has a value of 1. Oracle made the original INSERT atomic—the original INSERT INTO T is the statement, and any side effects of that original INSERT INTO T are considered part of that statement.

Oracle achieves this statement-level atomicity by silently wrapping a SAVEPOINT around each of our calls to the database. The preceding two INSERTs were really treated like this:

Savepoint statement1;
   Insert into t values ( 1 );
If error then rollback to statement1;
Savepoint statement2;
   Insert into t values ( -1 );
If error then rollback to statement2;

For programmers used to Sybase or SQL Server, this may be confusing at first. In those databases exactly the opposite is true. The triggers in those systems execute independently of the firing statement. If they encounter an error, the triggers must explicitly roll back their own work and then raise another error to roll back the triggering statement. Otherwise, the work done by a trigger could persist even if the triggering statement, or some other part of the statement, ultimately fails.

In Oracle, this statement-level atomicity extends as deep as it needs to. In the preceding example, if the INSERT INTO T fires a trigger that updates another table, and that table has a trigger that deletes from another table (and so on, and so on), either all of the work succeeds or none of it does. You don’t need to code anything special to ensure this; it’s just the way it works.

Procedure-Level Atomicity

It is interesting to note that Oracle considers PL/SQL blocks to be statements as well. Consider the following stored procedure and reset of the example tables:

EODA@ORA12CR1> create or replace procedure p
  2  as
  3  begin
  4          insert into t values ( 1 );
  5          insert into t values (-1 );
  6  end;
  7  /
Procedure created.

EODA@ORA12CR1> delete from t;
0 rows deleted.

EODA@ORA12CR1> update t2 set cnt = 0;
1 row updated.

EODA@ORA12CR1> commit;
Commit complete.

EODA@ORA12CR1> select * from t;
no rows selected

EODA@ORA12CR1> select * from t2;

       CNT
----------
         0

So, we have a procedure we know will fail, and the second INSERT will always fail in this case. Let’s see what happens if we run that stored procedure:

EODA@ORA12CR1> begin
  2      p;
  3  end;
  4  /
I fired and updated 1 rows
I fired and updated 1 rows
begin
*
ERROR at line 1:
ORA-02290: check constraint (EODA.SYS_C0061484) violated
ORA-06512: at "EODA.P", line 5
ORA-06512: at line 2

EODA@ORA12CR1> select * from t;
no rows selected

EODA@ORA12CR1> select * from t2;

       CNT
----------
         0

As you can see, Oracle treated the stored procedure call as an atomic statement. The client submitted a block of code—BEGIN P; END;—and Oracle wrapped a SAVEPOINT around it. Since P failed, Oracle restored the database back to the point right before it was called.

Image Note  The preceding behavior—statement-level atomicity—relies on the PL/SQL routine not performing any commits or rollbacks. It is my opinion that COMMIT and ROLLBACK should not be used in general in PL/SQL; the invoker of the PL/SQL stored procedure is the only one that knows when a transaction is complete. It is a bad programming practice to issue a COMMIT or ROLLBACK in your developed PL/SQL routines.

Now, if we submit a slightly different block, we will get entirely different results:

EODA@ORA12CR1> begin
  2      p;
  3  exception
  4      when others then
  5          dbms_output.put_line( 'Error!!!! ' || sqlerrm );
  6  end;
  7  /
I fired and updated 1 rows
I fired and updated 1 rows
I fired and updated 1 rows
Error!!!! ORA-02290: check constraint (EODA.SYS_C0061484) violated
PL/SQL procedure successfully completed.

EODA@ORA12CR1> select * from t;

         X
----------
         1

EODA@ORA12CR1> select * from t2;

       CNT
----------
         1

EODA@ORA12CR1> rollback;
Rollback complete.

Here, we ran a block of code that ignored any and all errors, and the difference in outcome is huge. Whereas the first call to P effected no changes, this time the first INSERT succeeds and the CNT column in T2 is incremented accordingly.

Oracle considered the “statement” to be the block that the client submitted. This statement succeeded by catching and ignoring the error itself, so the If error then rollback... didn’t come into effect and Oracle didn’t roll back to the SAVEPOINT after execution. Hence, the partial work performed by P was preserved. The reason this partial work is preserved in the first place is that we have statement-level atomicity within P: each statement in P is atomic. P becomes the client of Oracle when it submits its two INSERT statements. Each INSERT either succeeds or fails entirely. This is evidenced by the fact that we can see that the trigger on T fired twice and updated T2 twice, yet the count in T2 reflects only one UPDATE. The second INSERT executed in P had an implicit SAVEPOINT wrapped around it.

THE “WHEN OTHERS” CLAUSE

I consider virtually all code that contains a WHEN OTHERS exception handler that does not also include a RAISE or RAISE_APPLICATION_ERROR to re-raise the exception to be a bug. It silently ignores the error and it changes the transaction semantics. Catching WHEN OTHERSand translating the exception into an old-fashioned return code changes the way the database is supposed to behave.

In fact, when Oracle 11g Release 1 was still on the drawing board, I was permitted to submit three requests for new features in PL/SQL. I jumped at the chance, and my first suggestion was simply “remove the WHEN OTHERS clause from the language.”  My reasoning was simple: the most common cause of developer-introduced bugs I see—the most common cause—is a WHEN OTHERS not followed by a RAISE or RAISE_APPLICATION_ERROR. I felt the world would be a safer place without this language feature. The PL/SQL implementation team could not do this, of course, but they did the next best thing. They made it so that PL/SQL will generate a compiler warning if you have a WHEN OTHERS that is not followed by a RAISE or RAISE_APPLICATION_ERROR call. For example:

EODA@ORA12CR1> alter session set
  2  PLSQL_Warnings = 'enable:all'
  3  /
Session altered.

EODA@ORA12CR1> create or replace procedure some_proc( p_str in varchar2 )
  2  as
  3  begin
  4      dbms_output.put_line( p_str );
  5  exception
  6    when others
  7    then
  8      -- call some log_error() routine
  9      null;
 10  end;
 11  /
SP2-0804: Procedure created with compilation warnings

EODA@ORA12CR1> show errors procedure some_proc
Errors for PROCEDURE P:

LINE/COL ERROR
-------- -----------------------------------------------------------------
1/1      PLW-05018: unit SOME_PROC omitted optional AUTHID clause; default
         value DEFINER used

6/10     PLW-06009: procedure "SOME_PROC" OTHERS handler does not end in
         RAISE or RAISE_APPLICATION_ERROR

So, if you include WHEN OTHERS in your code and it is not followed by a RAISE or RAISE_APPLICATION_ERROR, be aware that you are almost certainly looking at a bug in your developed code, a bug placed there by you.

The difference between the two blocks of code, one with a WHEN OTHERS exception block and one without, is subtle, and something you must consider in your applications. Adding an exception handler to a block of PL/SQL code can radically change its behavior. A different way to code this—one that restores the statement-level atomicity to the entire PL/SQL block—is as follows:

EODA@ORA12CR1> begin
  2      savepoint sp;
  3      p;
  4  exception
  5      when others then
  6          rollback to sp;
  7          dbms_output.put_line( 'Error!!!! ' || sqlerrm );
  8  end;
  9  /
I fired and updated 1 rows
I fired and updated 1 rows
Error!!!! ORA-02290: check constraint (EODA.SYS_C0061484) violated
PL/SQL procedure successfully completed.

EODA@ORA12CR1> select * from t;
no rows selected

EODA@ORA12CR1> select * from t2;

       CNT
----------
         0

Image Caution  The preceding code represents an exceedingly bad practice. In general, you should neither catch a WHEN OTHERS nor explicitly code what Oracle already provides as far as transaction semantics is concerned.

Here, by mimicking the work Oracle normally does for us with the SAVEPOINT, we are able to restore the original behavior while still catching and “ignoring” the error. I provide this example for illustration only; this is an exceedingly bad coding practice.

Transaction-Level Atomicity

The entire goal of a transaction, a set of SQL statements executed together as a unit of work, is to take the database from one consistent state to another consistent state. To accomplish this goal, transactions are atomic as well—the entire set of successful work performed by a transaction is either entirely committed and made permanent or rolled back and undone. Just like a statement, the transaction is an atomic unit of work. Upon receipt of “success” from the database after committing a transaction, you know that all of the work performed by the transaction has been made persistent.

DDL and Atomicity

It is worth noting that there is a certain class of statements in Oracle that are atomic—but only at the statement level. Data Definition Language (DDL) statements are implemented in a manner such that:

1.    They begin by committing any outstanding work, ending any transaction you might already have in place.

2.    They perform the DDL operation, such as a CREATE TABLE.

3.    They commit the DDL operation if it was successful, or roll back the DDL operation otherwise.

This means that any time you issue a DDL statement such as CREATE, ALTER and so on, you must expect your existing transaction to be immediately committed and the subsequent DDL command to be performed and either committed and made durable or rolled back in the event of any error. DDL does not break the ACID concepts in any way, but the fact that it commits is something you definitely need to be aware of.

Durability

Normally, when a transaction is committed, its changes are permanent; you can rely on those changes being in the database, even if the database crashed the instant after the commit completed. This is not true, however, in two specific cases:

·     You use the WRITE extensions (available in Oracle 10g Release 2 and above) available in the COMMIT statement.

·     You issue COMMITs in a nondistributed (accesses only a single database, no database links) PL/SQL block of code.

We’ll look at each in turn.

WRITE Extensions to COMMIT

Starting with Oracle 10g Release 2 and above, you may add a WRITE clause to your COMMIT statements. The WRITE clause allows the commit to either WAIT for the redo you generated to be written to disk (the default) or NOWAIT—to not wait—for the redo to be written. The NOWAIToption is the capability—a capability that must be used carefully, with forethought, and with understanding of exactly what it means.

Normally, a COMMIT is a synchronous process. Your application invokes COMMIT and then your application waits for the entire COMMIT processing to be complete (what that entails exactly will be covered in detail in Chapter 9). This is the behavior of COMMIT in all the database releases before Oracle 10g Release 2 and is the default behavior in Oracle 10g Release 2 and above.

In current releases of the database, instead of waiting for the commit to complete, which may take measurable time since a commit involves a physical write—a physical I/O—to the redo log files stored on disk, you may have the commit performed in the background, without waiting for it. That comes with the side-effect that your commit is no longer assured to be durable. That is, your application may get a response back from the database that the asynchronous commit you submitted was received, other sessions may be able to see your changes, but later find that the transaction you thought was committed was not. This situation will occur only in very rare cases and will always involve a serious failure of the hardware or software. It requires the database to be shutdown abnormally in order for an asynchronous commit to not be durable, meaning the database instance or computer the database instance is running on would have to suffer a complete failure.

So, if transactions are meant to be durable, what is the potential use of a feature that might make them possibly not durable? Raw performance. When you issue a COMMIT in your application, you are asking the LGWR process to take the redo you’ve generated and ensure that it is written to the online redo log files. Performing physical I/O, which this process involves, is measurably slow; it takes a long time, relatively speaking, to write data to disk. So, a COMMIT may well take longer than the DML statements in the transaction itself! If you make the COMMIT asynchronous, you remove the need to wait for that physical I/O in the client application, perhaps making the client application appear faster—especially if it does lots of COMMITs.

This might suggest that you’d want to use this COMMIT WRITE NOWAIT all of the time—after all isn’t performance the most important thing in the world? No, it is not. Most of the time, you need the durability achieved by default with COMMIT. When you COMMIT and report back to an end user “we have committed,” you need to be sure that the change is permanent. It will be recorded in the database even if the database/hardware failed right after the COMMIT. If you report to an end user that “Order 12352 has been placed,” you need to make sure that Order 12352 was truly placed and persistent. So, for most every application, the default COMMIT WRITE WAIT is the only correct option (note that you only need say COMMIT—the default setting is WRITE WAIT).

When would you want to use this capability to commit without waiting then? Three scenarios come to mind:

·     A custom data load program. It must be custom, since it will have additional logic to deal with the fact that a commit might not persist a system failure.

·     An application that processes a live data feed of some sort, say a stock quote feed from the stock markets that inserts massive amounts of time-sensitive information into the database. If the database goes offline, the data stream keeps on going and the data generated during the system failure will never be processed (Nasdaq does not shut down because your database crashed, after all!). That this data is not processed is OK, because the stock data is so time-sensitive, after a few seconds it would be overwritten by new data anyway.

·     An application that implements its own “queuing” mechanism, for example one that has data in a table with a PROCESSED_FLAG column. As new data arrives, it is inserted with a value of PROCESSED_FLAG='N' (unprocessed). Another routine is tasked with reading the PROCESSED_FLAG='N' records, performing some small, fast transaction and updating the PROCESSED_FLAG='N' to 'Y'. If it commits but that commit is later undone (by a system failure), it is OK because the application that processes these records will just process the record again—it is “restartable.”

If you look at these application categories, you’ll notice that all three of them are background, noninteractive applications. They do not interact with a human being directly. Any application that does interact with a person—that reports to the person “Commit complete”—should use the synchronous commit. Asynchronous commits are not a tuning device for your online customer-facing applications. Asynchronous commits are applicable only to batch-oriented applications, those that are automatically restartable upon failure. Interactive applications are not restartable automatically upon failure—a human being must redo the transaction. Therefore, you have another flag that tells you whether this capability can be considered—do you have a batch application or an interactive one? Unless it is batch-oriented, synchronous commit is the way to go.

So, outside of those three categories of batch applications, this capability—COMMIT WRITE NOWAIT—should probably not be used. If you do use it, you need to ask yourself what would happen if your application is told commit processed, but later, the commit is undone. You need to be able to answer that question and come to the conclusion that it will be OK if that happens. If you can’t answer that question, or if a committed change being lost would have serious repercussions, you should not use the asynchronous commit capability.

COMMITS in a Nondistributed PL/SQL Block

Since PL/SQL was first introduced in version 6 of Oracle, it has been transparently using an asynchronous commit. That approach has worked because all PL/SQL is like a batch program in a way—the end user does not know the outcome of the procedure until it is completely finished. That’s also why this asynchronous commit is used only in nondistributed PL/SQL blocks of code; if we involve more than one database, then there are two things—two databases—relying on the commit being durable. When two databases are relying on the commit being durable, we have to utilize synchronous protocols or a change might be committed in one database but not the other.

Image Note  Of course, pipelined PL/SQL functions deviate from “normal” PL/SQL functions. In normal PL/SQL functions, the outcome is not known until the end of the stored procedure call. Pipelined functions in general are able to return data to a client long before they complete (they return “chunks” of data to the client, a bit at a time). But since pipelined functions are called from SELECT statements and would not be committing anyway, they do not come into play in this discussion.

Therefore, PL/SQL was developed to utilize an asynchronous commit, allowing the COMMIT statement in PL/SQL to not have to wait for the physical I/O to complete (avoiding the “log file sync” wait). That does not mean that you can’t rely on a PL/SQL routine that commits and returns control to your application to not be durable with respect to its changes—PL/SQL will wait for the redo it generated to be written to disk before returning to the client application—but it will only wait once, right before it returns.

Image Note  The following example demonstrates a bad practice—one that I call “slow-by-slow processing” or “row-by-row processing,” as row-by-row is synonymous with slow-by-slow in a relational database. It is meant just to illustrate how PL/SQL processes a COMMIT statement.

First let’s create table T:

EODA@ORA12CR1> create table t
  2  as
  3  select *
  4    from all_objects
  5   where 1=0
  6  /
Table created.

Now consider this PL/SQL procedure:

EODA@ORA12CR1> create or replace procedure p
  2  as
  3  begin
  4      for x in ( select * from all_objects )
  5      loop
  6          insert into t values X;
  7          commit;
  8      end loop;
  9  end;
 10  /

Procedure created.

That PL/SQL code reads a record at a time from ALL_OBJECTS, inserts the record into table T and commits each record as it is inserted. Logically, that code is the same as this:

EODA@ORA12CR1> create or replace procedure p
  2  as
  3  begin
  4      for x in ( select * from all_objects )
  5      loop
  6          insert into t values X;
  7          commit write NOWAIT;
  8      end loop;
  9
 10      -- make internal call here to ensure
 11      -- redo was written by LGWR
 12  end;
 13  /

Procedure created.

So, the commits performed in the routine are done with WRITE NOWAIT and before the PL/SQL block of code returns to the client application, PL/SQL makes sure that the last bit of redo it generated was safely recorded to disk—making the PL/SQL block of code and its changes durable.

In Chapter 11, we’ll see the salient effects of this feature of PL/SQL when measuring the performance of reverse key indexes. If you’d like to see how PL/SQL performs in the manner described earlier, skip there for a moment to review the reverse key index benchmark.

Integrity Constraints and Transactions

It is interesting to note exactly when integrity constraints are checked. By default, integrity constraints are checked after the entire SQL statement has been processed. There are also deferrable constraints that permit the validation of integrity constraints to be postponed until either the application requests they be validated by issuing a SET CONSTRAINTS ALL IMMEDIATE command or upon issuing a COMMIT.

IMMEDIATE Constraints

For the first part of this discussion, we’ll assume that constraints are in IMMEDIATE mode, which is the norm. In this case, the integrity constraints are checked immediately after the entire SQL statement has been processed. Note that I used the term “SQL statement,” not just “statement.” If I have many SQL statements in a PL/SQL stored procedure, each SQL statement will have its integrity constraints validated immediately after its individual execution, not after the stored procedure completes.

So, why are constraints validated after the SQL statement executes? Why not during? This is because it is very natural for a single statement to make individual rows in a table momentarily inconsistent. Taking a look at the partial work by a statement would result in Oracle rejecting the results, even if the end result would be OK. For example, suppose we have a table like this:

EODA@ORA12CR1> create table t ( x int unique );
Table created.

EODA@ORA12CR1> insert into t values ( 1 );
1 row created.

EODA@ORA12CR1> insert into t values ( 2 );
1 row created.

EODA@ORA12CR1> commit;
Commit complete.

And we want to execute a multiple-row UPDATE:

EODA@ORA12CR1> update t set x=x-1;
2 rows updated.

If Oracle checked the constraint after each row was updated, on any given day we would stand a 50-50 chance of having the UPDATE fail. The rows in T are accessed in some order, and if Oracle updated the X=1 row first, we would momentarily have a duplicate value for X and it would reject the UPDATE. Since Oracle waits patiently until the end of the statement, the statement succeeds because by the time it is done, there are no duplicates.

DEFERRABLE Constraints and Cascading Updates

Starting with Oracle 8.0, we also have the ability to defer constraint checking, which can be quite advantageous for various operations. The one that immediately jumps to mind is the requirement to cascade an UPDATE of a primary key to the child keys. Many people claim you should never need to do this—that primary keys are immutable (I am one of those people), but many others persist in their desire to have a cascading UPDATE. Deferrable constraints make this possible.

Image Note  It is considered an extremely bad practice to perform update cascades to modify a primary key. It violates the intent of the primary key. If you have to do it once to correct bad information, that’s one thing, but if you find you are constantly doing it as part of your application, you will want to go back and rethink that process—you have chosen the wrong attributes to be the key!

In early releases of Oracle, it was possible to do a CASCADE UPDATE, but doing so involved a tremendous amount of work and had certain limitations. With deferrable constraints, it becomes almost trivial. The code could look like this:

EODA@ORA12CR1> create table parent
  2  ( pk  int primary key )
  3  /
Table created.

EODA@ORA12CR1> create table child
  2  ( fk  constraint child_fk_parent
  3        references parent(pk)
  4        deferrable
  5        initially immediate
  6  )
  7  /
Table created.

EODA@ORA12CR1> insert into parent values ( 1 );
1 row created.

EODA@ORA12CR1> insert into child values ( 1 );
1 row created.

We have a parent table, PARENT, and a child table, CHILD. Table CHILD references table PARENT, and the constraint used to enforce that rule is called CHILD_FK_PARENT (child foreign key to parent). This constraint was created as DEFERRABLE, but it is set to INITIALLY IMMEDIATE. This means we can defer that constraint until COMMIT or to some other time. By default, however, it will be validated at the statement level. This is the most common use of the deferrable constraints. Most existing applications won’t check for constraint violations on aCOMMIT statement, and it is best not to surprise them with that. As defined, table CHILD behaves in the fashion tables always have, but it gives us the ability to explicitly change its behavior. Now let’s try some DML on the tables and see what happens:

EODA@ORA12CR1> update parent set pk = 2;
update parent set pk = 2
*
ERROR at line 1:
ORA-02292: integrity constraint (EODA.CHILD_FK_PARENT) violated - child record found

Since the constraint is in IMMEDIATE mode, this UPDATE fails. We’ll change the mode and try again:

EODA@ORA12CR1> set constraint child_fk_parent deferred;
Constraint set.

EODA@ORA12CR1> update parent set pk = 2;
1 row updated.

Now it succeeds. For illustration purposes, I’ll show how to check a deferred constraint explicitly before committing, to see if the modifications we made are in agreement with the business rules (in other words, to check that the constraint isn’t currently being violated). It’s a good idea to do this before committing or releasing control to some other part of the program (which may not be expecting the deferred constraints):

EODA@ORA12CR1> set constraint child_fk_parent immediate;
set constraint child_fk_parent immediate
*
ERROR at line 1:
ORA-02291: integrity constraint (EODA.CHILD_FK_PARENT) violated - parent key not found

It fails and returns an error immediately as expected, since we knew that the constraint had been violated. The UPDATE to PARENT was not rolled back (that would violate the statement-level atomicity); it is still outstanding. Also note that our transaction is still working with theCHILD_FK_PARENT constraint deferred because the SET CONSTRAINT command failed. Let’s continue now by cascading the UPDATE to CHILD:

EODA@ORA12CR1> update child set fk = 2;
1 row updated.

EODA@ORA12CR1> set constraint child_fk_parent immediate;
Constraint set.

EODA@ORA12CR1> commit;
Commit complete.

And that’s the way it works. Note that to defer a constraint, you must create it that way—you have to drop and re-create the constraint to change it from nondeferrable to deferrable. That might lead you to believe that you should create all of your constraints as “deferrable initially immediate,” just in case you wanted to defer them at some point. In general, that is not true. You want to allow constraints to be deferred only if you have a real need to do so. By creating deferred constraints, you introduce differences in the physical implementation (in the structure of your data) that might not be obvious. For example, if you create a deferrable UNIQUE or PRIMARY KEY constraint, the index that Oracle creates to support the enforcement of that constraint will be a non-unique index. Normally, you expect a unique index to enforce a unique constraint, but since you have specified that the constraint could temporarily be ignored, it can’t use that unique index. Other subtle changes will be observed, for example, with NOT NULL constraints. In Chapter 11, we’ll see how an index on a NOT NULL column can be used in many cases where a similar index on a NULL column can’t be. If you allow your NOT NULL constraints to be deferrable, the optimizer will start treating the column as if it supports NULLs—because it in fact does support NULLs during your transaction. For example, suppose you have a table with the following columns and data:

EODA@ORA12CR1> create table t
  2  ( x int constraint x_not_null not null deferrable,
  3    y int constraint y_not_null not null,
  4    z varchar2(30)
  5  );
Table created.

EODA@ORA12CR1> insert into t(x,y,z)
  2  select rownum, rownum, rpad('x',30,'x')
  3    from all_users;
45 rows created.

EODA@ORA12CR1> exec dbms_stats.gather_table_stats( user, 'T' );
PL/SQL procedure successfully completed.

In this example, column X is created such that when you COMMIT, X will not be null. However, during your transaction X is allowed to be null since the constraint is deferrable. Column Y, on the other hand, is always NOT NULL. Let’s say you were to index column Y:

EODA@ORA12CR1> create index t_idx on t(y);
Index created.

And you then ran a query that could make use of this index on Y—but only if Y is NOT NULL, as in following query:

EODA@ORA12CR1> explain plan for select count(*) from t;
Explained.

EODA@ORA12CR1> select * from table(dbms_xplan.display(null,null,'BASIC'));

----------------------------------
| Id  | Operation        | Name  |
----------------------------------
|   0 | SELECT STATEMENT |       |
|   1 |  SORT AGGREGATE  |       |
|   2 |   INDEX FULL SCAN| T_IDX |
----------------------------------

You would be happy to see the optimizer chose to use the small index on Y to count the rows rather than to full-scan the entire table T. However, let’s say that you drop that index and index column X instead:

EODA@ORA12CR1> drop index t_idx;
Index dropped.

EODA@ORA12CR1> create index t_idx on t(x);
Index created.

And you then ran the query to count the rows once more, you would discover that the database does not, in fact cannot, use your index:

EODA@ORA12CR1> explain plan for select count(*) from t;
Explained.

EODA@ORA12CR1> select * from table(dbms_xplan.display(null,null,'BASIC'));

-----------------------------------
| Id  | Operation          | Name |
-----------------------------------
|   0 | SELECT STATEMENT   |      |
|   1 |  SORT AGGREGATE    |      |
|   2 |   TABLE ACCESS FULL| T    |
-----------------------------------

It full-scanned the table. It had to full-scan the table in order to count the rows. This is due to the fact that in an Oracle B*Tree index, index key entries that are entirely null are not made. That is, the index will not contain an entry for any row in the table T, such that all of the columns in the index are null. Since X is allowed to be null temporarily, the optimizer has to assume that X might be null and therefore would not be in the index on X. Hence a count returned from the index might be different (wrong) from a count against the table.

We can see that if X had a nondeferrable constraint placed on it, this limitation is removed; that is, column X is in fact as good as column Y if the NOT NULL constraint is not deferrable:

EODA@ORA12CR1> alter table t drop constraint x_not_null;
Table altered.

EODA@ORA12CR1> alter table t modify x constraint x_not_null not null;
Table altered.

EODA@ORA12CR1> explain plan for select count(*) from t;
Explained.

EODA@ORA12CR1> select * from table(dbms_xplan.display(null,null,'BASIC'));
----------------------------------
| Id  | Operation        | Name  |
----------------------------------
|   0 | SELECT STATEMENT |       |
|   1 |  SORT AGGREGATE  |       |
|   2 |   INDEX FULL SCAN| T_IDX |
----------------------------------

So, the bottom line is, only use deferrable constraints where you have an identified need to use them. They introduce subtle side effects that could cause differences in your physical implementation (non-unique vs. unique indexes) or in your query plans—as just demonstrated!

Bad Transaction Habits

Many developers have some bad habits when it comes to transactions. I see this frequently with developers who have worked with a database that “supports” but does not “promote” the use of transactions. For example, in Informix (by default), Sybase, and SQL Server, you must explicitlyBEGIN a transaction; otherwise, each individual statement is a transaction all by itself. In a similar manner to the way in which Oracle wraps a SAVEPOINT around discrete statements, these databases wrap a BEGIN WORK/COMMIT or ROLLBACK around each statement. This is because, in these databases, locks are precious resources, and readers block writers and vice versa. In an attempt to increase concurrency, these databases want you to make the transaction as short as possible—sometimes at the expense of data integrity.

Oracle takes the opposite approach. Transactions are always implicit, and there is no way to have an “autocommit” unless an application implements it (see the “Using Autocommit” section later in this chapter for more details). In Oracle, every transaction should be committed when it must and never before. Transactions should be as large as they need to be. Issues such as locks, blocking, and so on should not really be considered the driving forces behind transaction size—data integrity is the driving force behind the size of your transaction. Locks are not a scarce resource, and there are no contention issues between concurrent readers and writers of data. This allows you to have robust transactions in the database. These transactions do not have to be short in duration—they should be exactly as long as they need to be (but no longer). Transactions are not for the convenience of the computer and its software; they are to protect your data.

Committing in a Loop

Faced with the task of updating many rows, most programmers will try to figure out some procedural way to do it in a loop, so that they can commit every so many rows. I’ve heard two (false!) reasons for doing it this way:

·     It is faster and more efficient to frequently commit lots of small transactions than it is to process and commit one big transaction.

·     We don’t have enough undo space.

Both of these reasons are misguided. Furthermore, committing too frequently leaves you prone to the danger of leaving your database in an “unknown” state should your update fail halfway through. It requires complex logic to write a process that is smoothly restartable in the event of failure. By far the best option is to commit only as frequently as your business processes dictate and to size your undo segments accordingly.

Let’s take a look at these issues in more detail.

Performance Implications

It is generally not faster to commit frequently—it is almost always faster to do the work in a single SQL statement. By way of a small example, say we have a table, T, with lots of rows, and we want to update a column value for every row in that table. We’ll use this to set up such a table (run these four setup steps before each of the following three cases):

EODA@ORA12CR1> drop table t;
Table dropped.

EODA@ORA12CR1> create table t as select * from all_objects;
Table created.

EODA@ORA12CR1> exec dbms_stats.gather_table_stats( user, 'T' );
PL/SQL procedure successfully completed.

EODA@ORA12CR1> variable n number

Well, when we go to update, we could simply do it in a single UPDATE statement, like this:

EODA@ORA12CR1> exec :n := dbms_utility.get_cpu_time;
PL/SQL procedure successfully completed.

EODA@ORA12CR1> update t set object_name = lower(object_name);
72614 rows updated.

EODA@ORA12CR1> exec dbms_output.put_line((dbms_utility.get_cpu_time-:n)|| ' cpu hsecs...' );
49 cpu hsecs...

Many people, for whatever reason, feel compelled to do it like this—slow-by-slow/row-by-row—in order to have a commit every N records:

EODA@ORA12CR1> exec :n := dbms_utility.get_cpu_time;
PL/SQL procedure successfully completed.

EODA@ORA12CR1> begin
  2     for x in ( select rowid rid, object_name, rownum r
  3                  from t )
  4     loop
  5          update t
  6             set object_name = lower(x.object_name)
  7           where rowid = x.rid;
  8          if ( mod(x.r,100) = 0 ) then
  9             commit;
 10          end if;
 11     end loop;
 12     commit;
 13  end;
 14  /
PL/SQL procedure successfully completed.

EODA@ORA12CR1> exec dbms_output.put_line((dbms_utility.get_cpu_time-:n)||' cpu hsecs...' );
275 cpu hsecs...

In this simple example, it is many times slower to loop in order to commit frequently. If you can do it in a single SQL statement, do it that way, as it is almost certainly faster. Even if we “optimize” the procedural code, using bulk processing for the updates (as follows), it is in fact much faster, but still much slower than it could be.

EODA@ORA12CR1> exec :n := dbms_utility.get_cpu_time;
PL/SQL procedure successfully completed.

EODA@ORA12CR1> declare
  2      type ridArray is table of rowid;
  3      type vcArray is table of t.object_name%type;
  4
  5      l_rids  ridArray;
  6      l_names vcArray;
  7
  8      cursor c is select rowid, object_name from t;
  9  begin
 10      open c;
 11      loop
 12          fetch c bulk collect into l_rids, l_names LIMIT 100;
 13          forall i in 1 .. l_rids.count
 14              update t
 15                 set object_name = lower(l_names(i))
 16               where rowid = l_rids(i);
 17          commit;
 18          exit when c%notfound;
 19      end loop;
 20      close c;
 21  end;
 22  /

PL/SQL procedure successfully completed.

EODA@ORA12CR1> exec dbms_output.put_line((dbms_utility.get_cpu_time-:n)||' cpu hsecs...' );
67 cpu hsecs...

PL/SQL procedure successfully completed.

Not only that, but you should notice that the code is getting more and more complex. From the sheer simplicity of a single UPDATE statement, to procedural code, to even more complex procedural code—we are going in the wrong direction! Furthermore (yes, there is more to complain about), the preceding procedural code is not done yet. It doesn’t deal with “what happens when we fail” (not if we but rather when we). What happens if this code gets halfway done and then the system fails? How do you restart the procedural code with a commit? You’d have to add yet more code so you knew where to pick up and continue processing. With the single UPDATE statement, we just reissue the UPDATE. We know that it will entirely succeed or entirely fail; there will not be partial work to worry about. We visit this point more in the section “Restartable Processes Require Complex Logic.”

Now, just to supply a counterpoint to this discussion, recall in Chapter 7 when we discussed the concept of write consistency and how an UPDATE statement, for example, could be made to restart. In the event that the preceding UPDATE statement was to be performed against a subset of the rows (it had a WHERE clause, and other users were modifying the columns this UPDATE was using in the WHERE clause), then there would be a case either for using a series of smaller transactions rather than one large transaction or for locking the table prior to performing the mass update. The goal here would be to reduce the opportunity for restarts to occur.

If we were to UPDATE the vast majority of the rows in the table, that would lead us toward using the LOCK TABLE command. In my experience, however, these sorts of large mass updates or mass deletes (the only statement types really that would be subject to the restart) are done in isolation. That large, one-time bulk update or the purge of old data generally is not done during a period of high activity. Indeed, the purge of data should not be affected by this at all, since you would typically use some date field to locate the information to purge, and other applications would not modify this data.

Snapshot Too Old Error

Let’s now look at the second reason developers are tempted to commit updates in a procedural loop, which arises from their (misguided) attempts to use a “limited resource” (undo segments) sparingly. This is a configuration issue; you need to ensure that you have enough undo space to size your transactions correctly. Committing in a loop, apart from generally being slower, is also the most common cause of the dreaded ORA-01555 error. Let’s look at this in more detail.

As you will appreciate after reading Chapters 1 and 7, Oracle’s multiversioning model uses undo segment data to reconstruct blocks as they appeared at the beginning of your statement or transaction (depending on the isolation mode). If the necessary undo information no longer exists, you will receive an ORA-01555: snapshot too old error message and your query will not complete. So, if you are modifying the table that you are reading (as in the previous example), you are generating undo information required for your query. Your UPDATE generates undo information that your query will probably be making use of to get the read-consistent view of the data it needs to update. If you commit, you are allowing the system to reuse the undo segment space you just filled up. If it does reuse the undo, wiping out old undo data that your query subsequently needs, you are in big trouble. Your SELECT will fail and your UPDATE will stop partway through. You have a partly finished logical transaction and probably no good way to restart it (more about this in a moment).

Let’s see this concept in action with a small demonstration. In a small test database, I set up a table:

EODA@ORA12CR1> create table t as select * from all_objects;
Table created.

EODA@ORA12CR1> create index t_idx on t(object_name);
Index created.

EODA@ORA12CR1> exec dbms_stats.gather_table_stats( user, 'T', cascade=>true );
PL/SQL procedure successfully completed.

I then created a very small undo tablespace and altered the system to use it. Note that by setting AUTOEXTEND off, I have limited the size of all UNDO to be 10MB or less in this system:

EODA@ORA12CR1> create undo tablespace undo_small
  2    datafile '/u01/dbfile/ORA12CR1/undo_small.dbf'
  3    size 10m reuse
  4    autoextend off
  5  /

Tablespace created.
EODA@ORA12CR1> alter system set undo_tablespace = undo_small;
System altered.

Now, with only the small undo tablespace in use, I ran this block of code to do the UPDATE:

EODA@ORA12CR1> begin
  2      for x in ( select /*+ INDEX(t t_idx) */ rowid rid, object_name, rownum r
  3                   from t
  4                  where object_name > ' ' )
  5      loop
  6          update t
  7             set object_name = lower(x.object_name)
  8           where rowid = x.rid;
  9          if ( mod(x.r,100) = 0 ) then
 10             commit;
 11          end if;
 12     end loop;
 13     commit;
 14  end;
 15  /
begin
*
ERROR at line 1:
ORA-01555: snapshot too old: rollback segment number  with name "" too small
ORA-06512: at line 2

I get the error. I should point out that I added an index hint to the query and a WHERE clause to make sure I was reading the table randomly (together, they caused the cost-based optimizer to read the table “sorted” by the index key). When we process a table via an index, we tend to read a block for a single row, and then the next row we want will be on a different block. Ultimately, we will process all of the rows on block 1, just not all at the same time. Block 1 might hold, say, the data for all rows with OBJECT_NAMEs starting with the letters A, M, N, Q, and Z. So we would hit the block many times, since we are reading the data sorted by OBJECT_NAME and presumably many OBJECT_NAMEs start with letters between A and M. Since we are committing frequently and reusing undo space, we eventually revisit a block where we can simply no longer roll back to the point our query began, and at that point we get the error.

This was a very artificial example just to show how it happens in a reliable manner. My UPDATE statement was generating undo. I had a very small undo tablespace to play with (10MB). I wrapped around in my undo segments many times, since they are used in a circular fashion. Every time I committed, I allowed Oracle to overwrite the undo data I generated. Eventually, I needed some piece of data I had generated, but it no longer existed and I received the ORA-01555 error.

You would be right to point out that in this case, if I had not committed on line 10, I would have received the following error:

begin
*
ERROR at line 1:
ORA-30036: unable to extend segment by 8 in undo tablespace 'UNDO_SMALL'
ORA-06512: at line 6

The major differences between the two errors are as follows:

·     The ORA-01555 example left my update in a totally unknown state. Some of the work had been done; some had not.

·     There is absolutely nothing I can do to avoid the ORA-01555 error, given that I committed in the cursor FOR loop.

·     The ORA-30036 error can be avoided by allocating appropriate resources in the system. This error is avoidable by correct sizing; the first error is not. Further, even if I don’t avoid this error, at least the update is rolled back and the database is left in a known, consistent state—not halfway through some large update.

The bottom line here is that you can’t “save” on undo space by committing frequently—you need that undo. I was in a single-user system when I received the ORA-01555 error. It takes only one session to cause that error, and many times even in real life it is a single session causing its own ORA-01555 errors. Developers and DBAs need to work together to size these segments adequately for the jobs that need to be done. There can be no short-changing here. You must discover, through analysis of your system, what your biggest transactions are and size appropriately for them. The dynamic performance view V$UNDOSTAT can be very useful to monitor the amount of undo you are generating and the duration of your longest running queries. Many people consider things like temp, undo, and redo as overhead—things to allocate as little storage to as possible. This is reminiscent of a problem the computer industry had on January 1, 2000, which was all caused by trying to save 2 bytes in a date field. These components of the database are not overhead, but rather are key components of the system. They must be sized appropriately (not too big and not too small).

Image Note  Speaking of UNDO segments being too small, make sure to set your undo tablespace back to your regular one after running these examples, otherwise you’ll be hitting ORA-30036 errors for the rest of the book!

Restartable Processes Require Complex Logic

The most serious problem with the “commit before the logical transaction is over” approach is the fact that it frequently leaves your database in an unknown state if the UPDATE fails halfway through. Unless you planned for this ahead of time, it is very hard to restart the failed process, allowing it to pick up where it left off. For example, say we were not applying the LOWER() function to the column, as in the previous example, but rather some other function of the column, such as this:

last_ddl_time = last_ddl_time + 1;

If we halted the UPDATE loop partway through, how would we restart it? We could not just rerun it, as we would end up adding 2 to some dates, and 1 to others. If we fail again, we would add 3 to some, 2 to others, 1 to the rest, and so on. We need yet more complex logic—some way to “partition” the data. For example, we could process every OBJECT_NAME that starts with A, and then B, and so on:

EODA@ORA12CR1> create table to_do
  2  as
  3  select distinct substr( object_name, 1,1 ) first_char
  4    from T
  5  /
Table created.

EODA@ORA12CR1> begin
  2          for x in ( select * from to_do )
  3          loop
  4              update t set last_ddl_time = last_ddl_time+1
  5               where object_name like x.first_char || '%';
  6
  7              dbms_output.put_line( sql%rowcount || ' rows updated' );
  8              delete from to_do where first_char = x.first_char;
  9
 10              commit;
 11          end loop;
 12  end;
 13  /
238 rows updated
5730 rows updated
1428 rows updated
...
262 rows updated
1687 rows updated
PL/SQL procedure successfully completed.

Now, we could restart this process if it fails, since we would not process any object name that had already been processed successfully. The problem with this approach, however, is that unless we have some attribute that evenly partitions the data, we will end up having a very wide distribution of rows. The second UPDATE did more work than all of the others combined. Additionally, if other sessions are accessing this table and modifying the data, they might update the OBJECT_NAME field as well. Suppose that some other session updates the object named Z to be A,after we already processed the As. We would miss that record. Furthermore, this is a very inefficient process compared to UPDATE T SET LAST_DDL_TIME = LAST_DDL_TIME+1. We are probably using an index to read every row in the table, or we are full-scanning it n times, both of which are undesirable. There are so many bad things to be said about this approach.

Image Note  In Chapter 14, we’ll visit a feature available starting with Oracle 11g Release 2—the DBMS_PARALLEL_EXECUTE package. There we’ll revisit this restartable approach and deal with the non-uniform update patterns as well.

The best approach is the one I advocated at the beginning of Chapter 1: do it simply. If it can be done in SQL, do it in SQL. What can’t be done in SQL, do in PL/SQL. Do it using the least amount of code you can. Have sufficient resources allocated. Always think about what happens in the event of an error. So many times, I’ve seen people code update loops that worked great on the test data but then failed halfway through when applied to the real data. Then they are really stuck, as they have no idea where the loop stopped processing. It’s a lot easier to size undo correctly than to write a restartable program. If you have truly large tables that need to be updated, you should be using partitions (more on that in Chapter 10), which you can update each individually. You can even use parallel DML to perform the update, or in Oracle11g Release 2 and above, theDBMS_PARALLEL_EXECUTE package.

Using Autocommit

My final words on bad transaction habits concern the one that arises from using the popular programming APIs ODBC and JDBC. These APIs “autocommit” by default. Consider the following statements, which transfer $1,000 from a checking account to a savings account:

update accounts set balance = balance - 1000 where account_id = 123;
update accounts set balance = balance + 1000 where account_id = 456;

If your program is using ODBC or JDBC when you submit these statements, they (silently) inject a commit after each UPDATE. Consider the impact of this if the system fails after the first UPDATE and before the second. You’ve just lost $1,000!

I can sort of understand why ODBC does this. The developers of SQL Server designed ODBC, and this database demands that you use very short transactions due to its concurrency model (writes block reads, reads block writes, and locks are a scarce resource). What I can’t understand is how this got carried over into JDBC, an API that is supposed to support “the enterprise.” It is my belief that the very next line of code after opening a connection in JDBC should always be this:

Connection conn = DriverManager.getConnection
              ("jdbc:oracle:oci:@database","scott","tiger");

conn.setAutoCommit (false);

This returns control over the transaction back to you, the developer, which is where it belongs. You can then safely code your account transfer transaction and commit it after both statements have succeeded. Lack of knowledge of your API can be deadly in this case. I’ve seen more than one developer unaware of this autocommit “feature” get into big trouble with his application when an error occurred.

Distributed Transactions

One of the really nice features of Oracle is its ability to transparently handle distributed transactions. I can update data in many different databases in the scope of a single transaction. When I commit, either I commit the updates in all of the instances or I commit none of them (they will all be rolled back). I need no extra code to achieve this; I simply “commit.”

A key to distributed transactions in Oracle is the database link. A database link is a database object that describes how to log into another instance from your instance. However, the purpose of this section is not to cover the syntax of the database link command (it is fully documented in the Oracle Database SQL Language Reference manual), but rather to expose you to its very existence. Once you have a database link set up, accessing remote objects is as easy as this:

select * from T@another_database;

This would select from the table T in the database instance defined by the database link ANOTHER_DATABASE. Typically, you would “hide” the fact that T is a remote table by creating a view of it, or a synonym. For example, I can issue the following and then access T as if it were a local table:

create synonym T for T@another_database;

Now that I have this database link set up and can read some tables, I am also able to modify them (assuming I have the appropriate privileges, of course). Performing a distributed transaction is now no different from a local transaction. All I would do is this:

update local_table set x = 5;
update remote_table@another_database set y = 10;
commit;

That’s it. Oracle will commit either in both databases or in neither. It uses a two-phase commit protocol (2PC) to do this. 2PC is a distributed protocol that allows for a modification that affects many disparate databases to be committed atomically. It attempts to close the window for distributed failure as much as possible before committing. In a 2PC between many databases, one of the databases—typically the one the client is logged into initially—will be the coordinator for the distributed transaction. This one site will ask the other sites if they are ready to commit. In effect, this site will go to the other sites and ask them to be prepared to commit. Each of the other sites reports back its “prepared state” as YES or NO. If any one of the sites votes NO, the entire transaction is rolled back. If all sites vote YES, the site coordinator broadcasts a message to make the commit permanent on each of the sites.

This limits the window in which a serious error could occur. Prior to the “voting” on the 2PC, any distributed error would result in all of the sites rolling back. There would be no doubt as to the outcome of the transaction. After the order to commit or rollback, there again is no doubt as to the outcome of the distributed transaction. It is only during the very short window when the coordinator is collecting the votes that the outcome might be in doubt, after a failure.

Assume, for example, we have three sites participating in the transaction with Site 1 being the coordinator. Site 1 has asked Site 2 to prepare to commit, and Site 2 has done so. Site 1 then asks Site 3 to prepare to commit, and it does so. At this point, Site 1 is the only site that knows the outcome of the transaction, and it is now responsible for broadcasting the outcome to the other sites. If an error occurs right now—the network fails, Site 1 loses power, whatever—Sites 2 and 3 will be left hanging. They will have what is known as an in-doubt distributed transaction. The 2PC protocol attempts to close the window of error as much as possible, but it can’t close it entirely. Sites 2 and 3 must keep that transaction open, awaiting notification of the outcome from Site 1.

If you recall from the architecture discussion in Chapter 5, it is the function of the RECO process to resolve this issue. This is also where COMMIT and ROLLBACK with the FORCE option come into play. If the cause of the problem was a network failure between Sites 1, 2, and 3, then the DBAs at Sites 2 and 3 could actually call the DBA at Site 1, ask him for the outcome, and apply the commit or rollback manually, as appropriate.

There are some, but not many, limitations to what you can do in a distributed transaction, and they are reasonable (to me, anyway, they seem reasonable). The big ones are as follows:

·     You can’t issue a COMMIT over a database link. That is, you can’t issue a COMMIT@remote_site. You may commit only from the site that initiated the transaction.

·     You can’t do DDL over a database link. This is a direct result of the preceding issue. DDL commits. You can’t commit from any site other than the initiating site, hence you can’t do DDL over a database link.

·     You can’t issue a SAVEPOINT over a database link. In short, you can’t issue any transaction control statements over a database link. All transaction control is inherited from the session that opened the database link in the first place; you can’t have different transaction controls in place in the distributed instances in your transaction.

The lack of transaction control over a database link is reasonable, since the initiating site is the only one that has a list of everyone involved in the transaction. If in our three-site configuration, Site 2 attempted to commit, it would have no way of knowing that Site 3 was involved. In Oracle, only Site 1 can issue the commit command. At that point, it is permissible for Site 1 to delegate responsibility for distributed transaction control to another site.

We can influence which site will be the actual commit site by setting the COMMIT_POINT_STRENGTH (a parameter) of the site. A COMMIT_POINT_STRENGTH associates a relative level of importance to a server in a distributed transaction. The more important the server (the more available the data needs to be), the more probable that it will coordinate the distributed transaction. You might want to do this if you need to perform a distributed transaction between your production machine and a test machine. Since the transaction coordinator is never in doubt as to the outcome of a transaction, it’s best if the production machine coordinated the distributed transaction. You don’t care so much if your test machine has some open transactions and locked resources. You certainly do care if your production machine does.

The inability to do DDL over a database link is actually not so bad at all. First, DDL is rare. You do it once at installation or during an upgrade. Production systems don’t do DDL (well, they shouldn’t do DDL). Second, there is a method to do DDL over a database link, in a fashion, using the job queue facility, DBMS_JOB or, in Oracle 10g and higher, the scheduler package, DBMS_SCHEDULER. Instead of trying to do DDL over the link, you use the link to schedule a remote job to be executed as soon as you commit. In that fashion, the job runs on the remote machine, is not a distributed transaction, and can do the DDL. In fact, this is the method by which the Oracle Replication Services perform distributed DDL to do schema replication.

Autonomous Transactions

Autonomous transactions allow you to create a “transaction within a transaction” that will commit or roll back changes independently of its parent transaction. They allow you to suspend the currently executing transaction, start a new one, do some work, and commit or roll back—all without affecting the currently executing transaction state. Autonomous transactions provide a new method of controlling transactions in PL/SQL and may be used in

·     Top-level anonymous blocks

·     Local (a procedure in a procedure), stand-alone, or packaged functions and procedures

·     Methods of object types

·     Database triggers

Before we take a look at how autonomous transactions work, I’d like to emphasize that this type of transaction is a powerful and therefore dangerous tool when used improperly. The true need for an autonomous transaction is very rare indeed. I would be very suspicious of any code that makes use of them—that code would get extra examination. It is far too easy to accidentally introduce logical data integrity issues into a system using them. In the sections that follow, we’ll discuss when they may safely be used after seeing how they work.

How Autonomous Transactions Work

The best way to demonstrate the actions and consequences of an autonomous transaction is by example. We’ll create a simple table to hold a message:

EODA@ORA12CR1> create table t ( msg varchar2(25) );
Table created.

Next, we’ll create two procedures, each of which simply INSERTs its name into the message table and commits. However, one of these procedures is a normal procedure and the other is coded as an autonomous transaction. We’ll use these objects to show what work persists (is committed) in the database under various circumstances.

First, here’s the AUTONOMOUS_INSERT procedure:

EODA@ORA12CR1> create or replace procedure Autonomous_Insert
  2  as
  3          pragma autonomous_transaction;
  4  begin
  5          insert into t values ( 'Autonomous Insert' );
  6          commit;
  7  end;
  8  /
Procedure created.

Note the use of the pragma AUTONOMOUS_TRANSACTION. This directive tells the database that this procedure, when executed, is to be executed as a new autonomous transaction, independent from its parent transaction.

Image Note  A pragma is simply a compiler directive, a method to instruct the compiler to perform some compilation option. Other pragmas are available. Refer to the Oracle Database PL/SQL Language Reference manual; you’ll find a list of them in its index.

And here’s the “normal” NONAUTONOMOUS_INSERT procedure:

EODA@ORA12CR1> create or replace procedure NonAutonomous_Insert
  2  as
  3  begin
  4          insert into t values ( 'NonAutonomous Insert' );
  5          commit;
  6  end;
  7  /
Procedure created.

Now let’s observe the behavior of the nonautonomous transaction in an anonymous block of PL/SQL code:

EODA@ORA12CR1> begin
  2          insert into t values ( 'Anonymous Block' );
  3          NonAutonomous_Insert;
  4          rollback;
  5  end;
  6  /
PL/SQL procedure successfully completed.

EODA@ORA12CR1> select * from t;

MSG
-------------------------
Anonymous Block
NonAutonomous Insert

As you can see, the work performed by the anonymous block, its INSERT, was committed by the NONAUTONOMOUS_INSERT procedure. Both rows of data were committed, so the ROLLBACK command had nothing to roll back. Compare this to the behavior of the autonomous transaction procedure:

EODA@ORA12CR1> delete from t;
2 rows deleted.

EODA@ORA12CR1> commit;
Commit complete.

EODA@ORA12CR1> begin
  2          insert into t values ( 'Anonymous Block' );
  3          Autonomous_Insert;
  4          rollback;
  5  end;
  6  /
PL/SQL procedure successfully completed.

EODA@ORA12CR1> select * from t;

MSG
-------------------------
Autonomous Insert

Here, only the work done by and committed in the autonomous transaction persists. The INSERT done in the anonymous block was rolled back by the ROLLBACK statement on line 4. The autonomous transaction procedure’s COMMIT has no effect on the parent transaction started in the anonymous block. In a nutshell, this captures the essence of autonomous transactions and what they do.

To summarize, if you COMMIT inside a “normal” procedure, it will make durable not only its own work but also any outstanding work performed in that session. However, a COMMIT performed in a procedure with an autonomous transaction will make durable only that procedure’s work.

When to Use Autonomous Transactions

The Oracle database has supported autonomous transactions internally for quite a while. We see them all of the time in the form of recursive SQL. For example, a recursive transaction may be performed when selecting from a sequence, in order for you to increment the sequence immediately in the SYS.SEQ$ table. The update of the SYS.SEQ$ table in support of your sequence is immediately committed and visible to other transactions, but your transaction is not yet committed. Additionally, if you roll back your transaction, the increment to the sequence remains in place; it is not rolled back with your transaction, as it has already been committed. Space management, auditing, and other internal operations are performed in a similar recursive fashion.

This feature has now been exposed for all to use. However, I have found that the legitimate real-world use of autonomous transactions is very limited. Time after time, I see them used as a workaround to such problems as a mutating table constraint in a trigger. This almost always leads to data integrity issues, however, since the cause of the mutating table is an attempt to read the table upon which the trigger is firing. Well, by using an autonomous transaction you can query the table, but you are querying the table now without being able to see your changes (which is what the mutating table constraint was trying to do in the first place; the table is in the middle of a modification, so query results would be inconsistent). Any decisions you make based on a query from that trigger would be questionable—you are reading “old” data at that point in time.

A potentially valid use for an autonomous transaction is in custom auditing, but I stress the words “potentially valid.” There are more efficient ways to audit information in the database than via a custom-written trigger. For example, you can use the DBMS_FGA package or just the AUDITcommand itself.

A question that application developers often pose to me is, “How can I log errors in my PL/SQL routines in a manner that will persist, even when my PL/SQL routines’ work is rolled back?” Earlier, we described how PL/SQL statements are atomic—they either completely succeed or completely fail. If we logged an error in our PL/SQL routines, by default our logged error information would roll back when Oracle rolled back our statement. Autonomous transactions allow us to change that behavior, to have our error logging information persist even while the rest of the partial work is rolled back.

Let’s start by setting up a simple error logging table to use; we’ll record the timestamp of the error, the error message, and the PL/SQL error stack (for pinpointing where the error emanated from):

EODA@ORA12CR1> create table error_log
  2  ( ts   timestamp,
  3    err1 clob,
  4    err2 clob )
  5  /
Table created.

Now we need the PL/SQL routine to log errors into this table. We can use this small example:

EODA@ORA12CR1> create or replace
  2  procedure log_error
  3  ( p_err1 in varchar2, p_err2 in varchar2 )
  4  as
  5      pragma autonomous_transaction;
  6  begin
  7      insert into error_log( ts, err1, err2 )
  8      values ( systimestamp, p_err1, p_err2 );
  9      commit;
 10  end;
 11  /
Procedure created.

The “magic” of this routine is on line 5 where we used the pragma autonomous_transaction directive to inform PL/SQL that we want this subroutine to start a new transaction, perform some work in it, and commit it—without affecting any other transaction currently in process. The COMMIT on line 9 can affect only the SQL performed by this LOG_ERROR procedure.

Now let’s test it out. To make it interesting, we’ll create a couple of procedures that will call each other:

EODA@ORA12CR1> create table t ( x int check (x>0) );
Table created.

EODA@ORA12CR1> create or replace procedure p1( p_n in number )
  2  as
  3  begin
  4      -- some code here
  5      insert into t (x) values ( p_n );
  6  end;
  7  /
Procedure created.

EODA@ORA12CR1> create or replace procedure p2( p_n in number )
  2  as
  3  begin
  4      -- code
  5      -- code
  6      p1(p_n);
  7  end;
  8  /
Procedure created.

And then we’ll invoke those routines from an anonymous block:

EODA@ORA12CR1> begin
  2      p2( 1 );
  3      p2( 2 );
  4      p2( -1);
  5  exception
  6      when others
  7      then
  8          log_error( sqlerrm, dbms_utility.format_error_backtrace );
  9          RAISE;
 10  end;
 11  /
begin
*
ERROR at line 1:
ORA-02290: check constraint (EODA.SYS_C0061527) violated
ORA-06512: at line 9

Now, we can see the code failed (you want that error returned, hence the RAISE on line 9). We can verify that Oracle undid our work (we know that the first two calls to procedure P2 succeeded; the values 1 and 2 are successfully inserted into our table T):

EODA@ORA12CR1> select * from t;
no rows selected

But we can also verify that our error log information has persisted, and in fact is committed:

EODA@ORA12CR1> rollback;
Rollback complete.

EODA@ORA12CR1> select * from error_log;

TS
---------------------------------------------------------------------------
ERR1
--------------------------------------------------------------------------------
ERR2
--------------------------------------------------------------------------------
09-MAY-14 05.11.03.818918 PM
ORA-02290: check constraint (EODA.SYS_C00204351) violated
ORA-06512: at "EODA.P1", line 5
ORA-06512: at "EODA.P2", line 6
ORA-06512: at line 4

In my experience, that is the only truly valid use of an autonomous transaction—to log errors or informational messages in a manner that can be committed independently of the parent transaction.

Summary

In this chapter, we looked at many aspects of transaction management in Oracle. Transactions are among the major features that set a database apart from a file system. Understanding how they work and how to use them is necessary to implement applications correctly in any database. Understanding that in Oracle all statements are atomic (including their side effects) and that this atomicity is extended to stored procedures is crucial. We saw how the placement of a WHEN OTHERS exception handler in a PL/SQL block could radically affect what changes took place in the database. As database developers, having a good understanding of how transactions work is crucial.

We took a look at the somewhat complex interaction between integrity constraints (unique keys, check constraints, and the like) and transactions in Oracle. We discussed how Oracle typically processes integrity constraints immediately after a statement executes, but that we can defer this constraint validation until the end of the transaction if we wish. This feature is key in implementing complex multitable updates when the tables being modified are all dependent on each other—the cascading update is an example of that.

We moved on to consider some of the bad transaction habits that people tend to pick up from working with databases that “support” rather than “promote” the use of transactions. We looked at the cardinal rule of transactions: they should be as short as they can be but as long as they need to be. Data integrity drives the transaction size—that is a key concept to take away from this chapter. The only things that should drive the size of your transactions are the business rules that govern your system. Not undo space, not locks—business rules.

We covered distributed transactions and how they differ from single database transactions. We explored the limitations imposed upon us in a distributed transaction and discussed why they exist. Before you build a distributed system, you need to understand these limitations. What works in a single instance might not work in a distributed database.

The chapter closed with a look at autonomous transactions and covered what they are and, more important, when they should and should not be used. I would like to emphasize once again that the legitimate real-world use of autonomous transactions is exceedingly rare. If you find them to be a feature you are using constantly, you’ll want to take a long, hard look at why.