Database Administrator

ORA-1555 Snapshot Too Old



Applies to:
________________________________________________________________________________


Oracle Database - Enterprise Edition - Version 9.0.1.0 to 12.1.0.1 [Release 9.0.1 to 12.1]


Information in this document applies to any platform.



Goal:
________________________________________________________________________________


A clear, step by step, method to diagnose an ORA-1555 issue using either

   UNDO_MANAGEMENT = AUTO ... Automatic Undo Management (AUM)

OR

   UNDO_MANAGEMENT = MANUAL ... Manual Undo (Rollback)




Solution:
_________________________________________________________________________________


There are many good notes on ORA-1555 causes and solutions.  In order to be able to resolve a problem, we must first understand the problem.


We will examine an extended example of the most common cause of the ORA-1555.


EXAMPLE:


     Time 1: Session #1 starts a query on table A
     Time 2: Session #2 updates row X in table A
     Time 3: The query from session #1 arrives at row X and discovers the last updated time
                    (based on the SCN number) of the row to be later than Time 1 ... so the UNDO or
                     LOB segments are used to get the read consistent view (before image) of the row
                     and then the query proceeds
     Time 4: Session #2 updates row Y in table A and then COMMITs (thus making it possible
                   for this transaction slot to be overwritten)
     Time 5: Session #2 updates row Z in table B and COMMITs ... either due to space pressure
                   (using AUM) or bad luck (using rollback) in the read consistent view for the update
                   of row Y in table A at time 4 ... is overwritten (we wont examine why the overwrite
                   occurs at this point)
     Time 6: The query from session #1 arrives at row Y and and discovers the last updated time
                    (based on the SCN number) of the row to be later than Time 1 ... so the UNDO or
                    LOB segments are examined to find the read consistent view ... BUT ... the
                    transaction slot containing the view ... was overwritten at time 5 ... so no read
                    consistent view is available ... so an ORA-1555 occurs


NOTE ... UNDO is a generic term that can refer to either UNDO (AUM) or Rollback segments 


=====================================================


How can AUM allow a read consistent view of a transaction slot for a committed row to be overwritten?

1) The transaction slot for the row has expired. 


 


 This means that the current time minus the commit time of the row is greater than the
   UNDO_RETENTION.  
Once a transaction slot of a committed row is 'expired', it is available for
   reuse.  Often we get customers who ask:


   Why do my queries sometimes run so much longer than my UNDO_RETENTION before
   encountering an ORA-1555 and other times it occurs soon after?


   The answer is all left up to chance and how much pressure the undo tablespace is under.

2) The transaction slot for the row has NOT expired and yet was still overwritten.

   This condition occurs under two conditions:

      a) The undo tablespace has become full and must 'steal' transaction slots from committed, unexpired rows

      b) A bug

3) The LOB segment read consistent copy of the LOB is no longer available 

   Investigate how the LOB column is configured, in-row or out-of-row.   In-row LOBs should be utilizing normal UNDO algorithms in the UNDO tablespace.   Out-of-row LOBs use the information below.

   Read consistency with LOBs is controlled by 2 methods

      a) Use of PCTVERSION (old method)

          Specify the maximum percentage of overall LOB storage space used for maintaining old versions of the LOB. The default value is 10, meaning that older versions of the LOB data are not overwritten until they consume 10% of the overall LOB storage space.
         The downfall to this method is customers who do frequent updates or deletes of tables with LOBs often exceed even 100% PCTVERSION ...

      b) Use of RETENTION (current method for use with AUM)

          Use this clause to indicate that Oracle Database should retain old versions of this LOB column.
          Oracle Database uses the value of the UNDO_RETENTION initialization parameter to  determine the amount of committed undo data to retain in the database.
          This method uses the same expiration as undo segments
          If an ORA-1555 occurs with a LOB segment. then this means that either:


            PCTVERSION was exceeded and the read consistent view of the LOB was overwritten


            OR 


            the LOB has exceeded RETENTION and the row LOB was overwritten sometime during
            the query that generated the ORA-1555

     New information


     LOB problems with ORA-1555 can be very difficult to fix.  Updates to out-of-row LOBs
     should be minimized.   Deletes and inserts have much less impact to UNDO operations than
     updates.   PCTVERSION is the preferred approach with 10.2.x and AUM.  However,
     100% will mean that lots of disk space is required to keep up with UNDO "copies" of LOB
      information in environments with frequent LOB updates.

   In some cases it has helped to move the LOB column to a  tablespace where the Segment
   Space Management is manual.

NOTE: transaction slots for UNCOMMITTED rows cannot be overwritten    


=====================================================


Now that we understand why the ORA-1555 occurs and some aspects about how they can occur we need to examine the following:

How can we logically determine and resolve what has occurred to cause the ORA-1555?

1) Determine if UNDO_MANAGEMENT is MANUAL or AUTO

   If set to MANUAL, it is best to move to AUM.  If it is not feasible to switch to AUM
   to attempt to tune around the ORA-1555 using V$ROLLSTAT

   If set to AUTO, proceed to #2

2) Gather the basic data 

   a) Acquire both the error message from the user / client ... and the message in the alert log

      User / Client session example:

         ORA-01555: snapshot too old: rollback segment number 9 with name "_SYSSMU1$" too small

      Alert log example

         ORA-01555 caused by SQL statement below (Query Duration=9999 sec, SCN:0x000.008a7c2d)

   b) Determine the QUERY DURATION from the message in the alert log

      From our example above ... this would be 9999

   c) Determine the undo segment name from the user / client message

      From our example above ... this would be _SYSSMU1$

   d) Determine the UNDO_RETENTION of the undo tablespace

      show parameter undo_retention

3) Determine if the ORA-1555 is occurring with an UNDO or a LOB segment 

   If the undo segment name is null ...

          ORA-01555: snapshot too old: rollback segment number  with name "" too small

   or the undo segment is unknown

         ORA-01555: snapshot too old: rollback segment number # with name "???" too small

   then this means this is a read consistent failure on a LOB segment

   If the segment_name or the undo segment is known the error is occurring with an UNDO segment
=====================================================




What to do if an ORA-1555 is occurring with an UNDO segment 
------------------------------------------------------------------------------------------



1) QUERY DURATION > UNDO_RETENTION 

   There are no guarantees that read consistency can be maintained after the transaction slot for the
   committed row has expired (exceeded UNDO_RETENTION)

   Why would one think that the transaction slot's time has exceeded UNDO_RETENTION?

      Lets answer this with an example

         If UNDO_RETENTION = 900 seconds ... but our QUERY DURATION is 2000 seconds ...

              This says that our query has most likely encountered a row that was committed more than 900
               seconds ago ... and has been overwritten as we KNOW that the transaction slot being examined
               no longer matches the row we are looking for

      The reason we say "most likely" is that it is possible that an unexpired committed transaction slot was
      overwritten due to either space pressure on the undo segment or this is a bug

SOLUTION:

   
The best solution is to tune the query can to reduce its duration. If that cannot be done then increase 

   UNDO_RETENTION
 based on QUERY DURATION to allow it to protect the committed
   transaction slots for a longer period of time

   NOTE 1: Increasing UNDO_RETENTION requires additional space in the UNDO tablespace.  Make
   sure to accommodate for this space.  One method of doing this is to set AUTOEXTEND on one or
   more of the UNDO tablespace datafiles for a period of time to allow for the increased space. Once the
   size has stabilized, AUTOEXTEND can be removed.

       See the solution for #2 below for more options


   NOTE 2: It is NOT recommended to have a mix of AUTOEXTENSIBLE and NON-AUTOEXTENSIBLE files within an UNDO tablespace as this can cause TUNED_UNDORETENTION  to miscalculate



2) QUERY DURATION <= UNDO_RETENTION 

   This case is most often due to the UNDO tablespace becoming full sometime during the time when the
   query was running

   How do we tell if the UNDO tablespace has become full during the query? 

   Examine V$UNDOSTAT.UNXPSTEALCNT for the period while the query that generated the
   ORA-1555 occurred.

   This column shows committed transaction slots that have not exceeded UNDO_RETENTION but
   were overwritten due to space pressure on the undo tablespace (IE became full).

   If UNEXPSTEACNT > 0 for the time period during which the query was running then this shows
   that the undo tablespace was too small to be able to maintain UNDO_RETENTION.  Unexpired
   blocks were over written, thus ending read consistency for those blocks for that time period.


set pagesize 25
set linesize 120

select inst_id,
          to_char(begin_time,'MM/DD/YYYY HH24:MI') begin_time,
          UNXPSTEALCNT "# Unexpired|Stolen",
          EXPSTEALCNT "# Expired|Reused",
          SSOLDERRCNT "ORA-1555|Error",
          NOSPACEERRCNT "Out-Of-space|Error",
          MAXQUERYLEN "Max Query|Length"
from gv$undostat
where begin_time between
     to_date('<start time of the ORA-1555 query>','MM/DD/YYYY HH24:MI:SS')
and
     to_date('<time when ORA-1555 occured>','MM/DD/YYYY HH24:MI:SS')
order by inst_id, begin_time;



   NOTE: Start time of the query can be determined by subtracting the query duration from


 the timestamp of the ORA-1555


EXAMPLE:

Tue May 26 16:16:57 2009
ORA-01555 caused by SQL statement below (SQL ID: 54yn3n36w24ft, Query Duration=922 sec, SCN: 0x0007.8a55f4e3)


922 seconds is 15 min 22 seconds

So the start time of the query would be May 26 16:01:35



   If after researching, you find no existing bugs seem to match your case, then additional info will be
   needed to file a bug.
   Starting with release 9.2.0.4 an event, 10442, can be set to dump all the necessary diagnostics
   information to one trace file. NOTE: Event 10442 does not trigger diagnostics in cases where the
   ORA-1555 error is associated with out-of-row LOB undo.
   The event can be enabled using the following settting in init.ora




    event="10442 trace name context forever, level 10"
   or
   Alter system set events '10442 trace name context forever , level 10';
   Reproduce the ORA-1555 error with a new connection to trigger the trace diagnostics file.



What to do if an ORA-1555 is occurring with a LOB segment
---------------------------------------------------------------------------------------------

1) Often the ORA-1555 with the null undo segment is accompanied by an ORA-22924

   Error: ORA-22924
   Text: snapshot too old
---------------------------------------------------------------------------
   Cause: The version of the LOB value needed for the consistent read was already overwritten by anothe writer.

2) Determine if PCTVERSION or RETENTION is used

   Examine DBA_LOBS.PCTVERSION and DBA_LOBS.RETENTION
   (or investigate through the data dictionary--see below)

   NOTE: Only one of these may be set, so this will determine which method of undo management is used for this LOB

   There is a bug with these two columns prior to 10.2 and as such the data dictionary must be examined to determine which of these is being used



If PCTVERSION is used
------------------------------------

   This method of LOB undo management is akin to use of rollback segments and as such there is little control to tune undo usage.

   The only parameter we have control over with this method is PCTVERSION itself.

SOLUTION:

   Increase PCTVERSION


ALTER TABLE **table name** MODIFY LOB (**lob column name**) (PCTVERSION 100)



   If the parameter is at 100% then the only solution will be to move to use of RETENTION


ALTER TABLE **table name** MODIFY LOB (**lob column name**) (RETENTION)



If RETENTION is used
---------------------------------
   You may have RETENTION in place, but it may not be working as you expect.

1)
 QUERY DURATION> UNDO_RETENTION

   There are no guarantees that read consistency can be maintained after the old lob segment for
   the committed row has expired (exceeded UNDO_RETENTION)

SOLUTION:

   The best solution is to tune the query to reduce its duration. If that cannot be done then
   increase UNDO_RETENTION based on QUERY DURATION to allow it to protect the
   committed transaction slots for a longer period of time.  A change to UNDO_RETENTION
   does not automatically change RETENTION.  See 
Note 563470.1
 Lob retention not
   changing when undo_retention is changed.

   NOTE 1: Increasing UNDO_RETENTION requires additional space in the LOB segments
                   tablespace.  Make sure to accommodate for this.  One method of doing this is to set
                   AUTOEXTEND on one of more of the lob segments tablespace datafiles.  There may or
                   may not be a stabilization of size like the UNDO tablespace.  If it does stabilize then the
                   AUTOEXTEND can be removed.  See the solution for #2 below for more options.

   NOTE 2: RETENTION for LOBs does not function the same as it does for undo.  LOB segments
                   will not automatically extend to allow for retention of read consistent copies of a lob.  As
                   such, at times like these, PCTVERSION 100 is a better solution.

2) QUERY DURATION <= UNDO_RETENTION

 
  This case is most often due to the LOB segments tablespace becoming full sometime during the time
   when the query was running.

   How do we tell if the LOB segment tablespace has become full during the query?

   Unfortunately there is no easy way to do this.  V$UNDOSTAT does not contain info about the LOB
   segments.  There may be a way to use Enterprise Manager and examine the high watermark
   (HWM) of the tablespace in question.

SOLUTION:

Provide more space in the LOB segments tablespace so that transaction slots do not have to be
stolen.  This can be done by:

         * Using AUTOEXTEND on an existing LOB segments tablespace datafile
         * Increase the size of an existing LOB segments tablespace datafile
         * Add a datafile to the LOB segments tablespace


References:
_________________________________________________________________________________
Oracle
[ID 467872.1]