Tuesday, May 11, 2010

7 GREAT ******* DBA TRICKS

Writing DBA scripts :
Seven SQL and SQL*Plus tricks which work wonders



Whether you like it or not and however strongly tool providers - Oracle included - try to sell their wares, SQL*Plus remains and is likely to remain for a while, the tool of choice for DBAs. You obviously needn't be a SQL expert to be a good DBA - although it doesn't hurt but, expert or not, when you want to write usable SQL*Plus scripts there are a number of tricks worth knowing..

Here are seven of them which ,if you don't already know them, will probably save you a lot of hassle.


If in order to be a good DBA all you had to do was to type SELECT * from the ad hoc DBA_something view, check a few things and then execute a single DROP, ALTER or whatever to fix them, life would be easy. Unfortunately, most of the time you need to run a sequence of statements, which moreover changes over time. Therefore, one of the favorite tricks of experienced DBAs is to write SQL scripts which generate other SQL scripts which do the job.

Let's take an example to illustrate most of the SQL*Plus tricks you need.

Say that you need a script to execute an on-line backup of your database (we are assuming your database is running in the required ARCHIVELOG mode).

Basically, what you need to do for each database is to execute

ALTER TABLESPACE xxx BEGIN BACKUP;
then backup the file(s) associated with the tablespace, then execute

ALTER TABLESPACE xxx END BACKUP;

The brute force method would be to write a first SQL script in which the BEGIN BACKUP is hard-coded for each tablespace, some kind of operating-system script which deals with the file backup and a second SQL script for the END BACKUP part.

Very, very bad method indeed, chiefly because things evolve. It's highly likely that you will add files to your database and quite possibly you will have new tablespaces too, which means that you will have to maintain all your scripts. Are you ready to bet you (or, for that matter, the guy who looks after your database while you are on vacation) won't forget? And your initial backup script may run smoothly every day without telling you you're omitting some files. One of the irritating things with backup scripts is that it's only when you're in really dire straits and trying to recover your production database, that you realize that they have been out of date and therefore totally useless for months. This is typically a case where you MUST use a SQL script to generate what is needed; for the Oracle data dictionary contains all the information you need, and is ALWAYS up-to-date.

To generate the 'ALTER TABLESPACE' statements, no better place to look than DBA_TABLESPACES. You can easily type :

select 'alter tablespace ' || tablespace_name || chr(10) ||
'begin backup;'
from dba_tablespaces
where status <> 'INVALID';

(INVALID refers to tablespaces which were dropped).


Trick #1 : chr(10) is quite a useful thing to concatenate in strings, as it inserts a 'newline' character.

It's not really required here but it's very useful with lines which would otherwise be much too long and generally speaking improves the legibility of the scripts you generate.


To get the up-to-date list of files which constitute your database, you can refer to DBA_DATAFILES. Let's assume that our underlying system is Unix and that we want to copy files to the /backup directory, you can write something such as :

select 'cp ' || f.file_name || ' /backup'
from dba_datafiles f;

Trick #2 : WATCH THE LINE LENGTH!

SQL*Plus has the bad habit of 'wrapping' long lines (slicing them, in effect), which can generate invalid operating system commands. Don't forget that a full file name can be up to 255 (give or take one) characters in length and so can the name of the backup directory. If you really want to be safe, you will have to insert at the beginning of your script :

set linesize 600

This is the first but certainly not the last SQL*Plus command we meet.


We are beginning to see the generation of the required commands taking shape but we're certainly not done yet. For one thing, backup procedures are among the most sensitive procedures in operations. You must handle errors, either SQL errors or operating system errors - what about some change of protections preventing you from copying where you want, or a full file system ?


Trick #3 : Use WHENEVER OSERROR EXIT

and WHENEVER SQLERROR EXIT SQL.SQLCODE and end your SQL procedures with EXIT 0.

This will enable you to test return codes from SQL*Plus in an operating system script and will allow you to handle errors properly.


Now let's improve our script. We could of course generate and run BEGIN BACKUP on all the tablespaces, generate and run the copy of all the files, and then generate and run END BACKUP on all the tablespaces. This is fairly crude and (let's leave out the details) not very good in terms of database management. What should be done is for each tablespace

step 1 : execute BEGIN BACKUP

step 2 : copy the files associated with the tablespace

step 3 : execute END BACKUP


The difficulty is in ordering correctly a set of unrelated commands. In that case we need two more tricks :

Trick #4 : Use UNIONs to fetch commands generated by independent queries

and

Trick #5 : Use dummy, unprinted columns for ordering.


Here's how it works :

We want to order by tablespace, so we return the tablespace name but we use the COLUMN SQL*Plus comment to make it invisible in the output. We also want the steps to be returned in a given order, so we add a constant column, also invisible, which we shall use as a minor sorting key (note that in a UNION you need to give aliases to the columns in the first select list only):


column tablespace_name noprint

column step noprint
column text format A550
select tablespace_name,
'step_1' step,
'alter tablespace ' || tablespace_name || chr(10) ||
'begin backup;' text
from dba_tablespaces
where status <> 'INVALID'
union
select tablespace_name,
'step_2',
'cp ' || file_name || ' /backup'
from dba_datafiles
union
select tablespace_name,
'step_3',
'alter tablespace ' || tablespace_name || chr(10) ||
'end backup;'
from dba_tablespaces
where status <> 'INVALID'
order by 1, 2;

This will generate what we want... except that we can't spool it directly because of headers, feedback information on the number of rows returned, etc...


Trick #6 : Use the three required SQL*Plus commands needed to remove unwanted output!

Those three commands are :

set pagesize 0 -- Removes breaks on pages AND headings
set feedback off -- Removes the message about how many lines
-- were selected
set recsep off -- Removes unwanted blank lines.
With these settings, you can write your script in the following way :

spool do_backup.sql

spool off
set feedback on
@do_backup

to generate and immediately run a backup procedure which is always guaranteed to be up-to-date.


Well, we have what we want, so what about the seventh promised trick?

Don't forget that when you run something under SQL*Plus you may run several 'profile' files : glogin.sql which you Unix DBAs will find under $ORACLE_HOME/sqlplus/admin and possibly a login.sql file in the current directory. You don't know what is or ever will be in these scripts. So :


Trick #7 : code defensively and unset any potentially troublesome setting, even if the default is just right.

For instance :

set pause off
set echo off
set verify off -- If you are using parameters
set scan off
You now know most of the techniques you need to write truly useful scripts; then it's a matter of imagination, knowledge of the data dictionary and, of course, SQL skills. Beware that hot-backup as it is introduced here is still a prototype for tutorial purposes. If you are interested by the subject, a complete version named hotbackup.sql is freely available on this site, www.oriolecorp.com. The adventurous can also have a look at other SQL scripts, some of them are in the 'North Face' category...


The mythical rollback segment All you ever wanted to know about exp/imp




Best regards,


Rafi.

ORA-01555:SNAPSHOT TOO OLD ERROR BY TOM(IMPORTANT INTERVIEW QUESTION)

Tom,
Would you tell me what snapshot too old error. When does it happen? What's the possible
causes? How to fix it?

Thank you very much.

Jane


and we said...
I think support note covers this topic very well:

ORA-01555 "Snapshot too old" - Detailed Explanation
===================================================

Overview
~~~~~~~~

This article will discuss the circumstances under which a query can return the Oracle
error ORA-01555 "snapshot too old (rollback segment too small)". The article will then
proceed to discuss actions that can be taken to avoid the error and finally will provide
some simple PL/SQL scripts that illustrate the issues discussed.

Terminology
~~~~~~~~~~~

It is assumed that the reader is familiar with standard Oracle terminology such as
'rollback segment' and 'SCN'. If not, the reader should first read the Oracle Server
Concepts manual and related Oracle documentation.

In addition to this, two key concepts are briefly covered below which help in the
understanding of ORA-01555:

1. READ CONSISTENCY:
====================

This is documented in the Oracle Server Concepts manual and so will not be discussed
further. However, for the purposes of this article this should be read and understood if
not understood already.

Oracle Server has the ability to have multi-version read consistency which is invaluable
to you because it guarantees that you are seeing a consistent view of the data (no 'dirty
reads').


2. DELAYED BLOCK CLEANOUT:
==========================

This is best illustrated with an example: Consider a transaction that updates a million
row table. This obviously visits a large number of database blocks to make the change to
the data. When the user commits the transaction Oracle does NOT go back and revisit these
blocks to make the change permanent. It is left for the next transaction that visits any
block affected by the update to 'tidy up' the block (hence the term 'delayed block
cleanout').

Whenever Oracle changes a database block (index, table, cluster) it stores a pointer in
the header of the data block which identifies the rollback segment used to hold the
rollback information for the changes made by the transaction. (This is required if the
user later elects to not commit the changes and wishes to 'undo' the changes made.)

Upon commit, the database simply marks the relevant rollback segment header entry as
committed. Now, when one of the changed blocks is revisited Oracle examines the header of
the data block which indicates that it has been changed at some point. The database needs
to confirm whether the change has been committed or whether it is currently uncommitted.
To do this, Oracle determines the rollback segment used for the previous transaction
(from the block's header) and then determines whether the rollback header indicates
whether it has been committed or not.

If it is found that the block is committed then the header of the data block is updated
so that subsequent accesses to the block do not incur this processing.

This behaviour is illustrated in a very simplified way below. Here we walk through the
stages involved in updating a data block.

STAGE 1 - No changes made

Description: This is the starting point. At the top of the
data block we have an area used to link active
transactions to a rollback
segment (the 'tx' part), and the rollback segment
header has a table that stores information upon
all the latest transactions
that have used that rollback segment.

In our example, we have two active transaction
slots (01 and 02)
and the next free slot is slot 03. (Since we are
free to overwrite committed transactions.)

Data Block 500 Rollback Segment Header 5
+----+--------------+ +----------------------+---------+
| tx | None | | transaction entry 01 |ACTIVE |
+----+--------------+ | transaction entry 02 |ACTIVE |
| row 1 | | transaction entry 03 |COMMITTED|
| row 2 | | transaction entry 04 |COMMITTED|
| ... .. | | ... ... .. | ... |
| row n | | transaction entry nn |COMMITTED|
+-------------------+ +--------------------------------+

STAGE 2 - Row 2 is updated

Description: We have now updated row 2 of block 500. Note that
the data block header is updated to point to the
rollback segment 5, transaction
slot 3 (5.3) and that it is marked uncommitted
(Active).

Data Block 500 Rollback Segment Header 5
+----+--------------+ +----------------------+---------+
| tx |5.3uncommitted|-+ | transaction entry 01 |ACTIVE |
+----+--------------+ | | transaction entry 02 |ACTIVE |
| row 1 | +-->| transaction entry 03 |ACTIVE |
| row 2 *changed* | | transaction entry 04 |COMMITTED|
| ... .. | | ... ... .. | ... |
| row n | | transaction entry nn |COMMITTED|
+------------------+ +--------------------------------+

STAGE 3 - The user issues a commit

Description: Next the user hits commit. Note that all that
this does is it
updates the rollback segment header's
corresponding transaction
slot as committed. It does *nothing* to the data
block.

Data Block 500 Rollback Segment Header 5
+----+--------------+ +----------------------+---------+
| tx |5.3uncommitted|--+ | transaction entry 01 |ACTIVE |
+----+--------------+ | | transaction entry 02 |ACTIVE |
| row 1 | +--->| transaction entry 03 |COMMITTED|
| row 2 *changed* | | transaction entry 04 |COMMITTED|
| ... .. | | ... ... .. | ... |
| row n | | transaction entry nn |COMMITTED|
+------------------+ +--------------------------------+

STAGE 4 - Another user selects data block 500

Description: Some time later another user (or the same user)
revisits data block 500. We can see that there
is an uncommitted change in the
data block according to the data block's header.

Oracle then uses the data block header to look up
the corresponding rollback segment transaction
table slot, sees that it has been committed, and
changes data block 500 to reflect the
true state of the datablock. (i.e. it performs
delayed cleanout).

Data Block 500 Rollback Segment Header 5
+----+--------------+ +----------------------+---------+
| tx | None | | transaction entry 01 |ACTIVE |
+----+--------------+ | transaction entry 02 |ACTIVE |
| row 1 | | transaction entry 03 |COMMITTED|
| row 2 | | transaction entry 04 |COMMITTED|
| ... .. | | ... ... .. | ... |
| row n | | transaction entry nn |COMMITTED|
+------------------+ +--------------------------------+


ORA-01555 Explanation
~~~~~~~~~~~~~~~~~~~~~

There are two fundamental causes of the error ORA-01555 that are a result of Oracle
trying to attain a 'read consistent' image. These are :

o The rollback information itself is overwritten so that Oracle is unable to rollback
the (committed) transaction entries to attain a sufficiently old enough version of the
block.

o The transaction slot in the rollback segment's transaction table (stored in the
rollback segment's header) is overwritten, and Oracle cannot rollback the transaction
header sufficiently to derive the original rollback segment transaction slot.

Both of these situations are discussed below with the series of steps that cause the
ORA-01555. In the steps, reference is made to 'QENV'. 'QENV' is short for 'Query
Environment', which can be thought of as the environment that existed when a query is
first started and to which Oracle is trying to attain a read consistent image. Associated
with this environment is the SCN
(System Change Number) at that time and hence, QENV 50 is the query environment with SCN
50.

CASE 1 - ROLLBACK OVERWRITTEN

This breaks down into two cases: another session overwriting the rollback that the
current session requires or the case where the current session overwrites the rollback
information that it requires. The latter is discussed in this article because this is
usually the harder one to understand.

Steps:

1. Session 1 starts query at time T1 and QENV 50

2. Session 1 selects block B1 during this query

3. Session 1 updates the block at SCN 51

4. Session 1 does some other work that generates rollback information.

5. Session 1 commits the changes made in steps '3' and '4'.
(Now other transactions are free to overwrite this rollback information)

6. Session 1 revisits the same block B1 (perhaps for a different row).

Now, Oracle can see from the block's header that it has been changed and it is
later than the required QENV (which was 50). Therefore we need to get an image of the
block as of this QENV.

If an old enough version of the block can be found in the buffer cache then we
will use this, otherwise we need to rollback the current block to generate another
version of the block as at the required QENV.

It is under this condition that Oracle may not be able to get the required
rollback information because Session 1's changes have generated rollback information that
has overwritten it and returns the ORA-1555 error.

CASE 2 - ROLLBACK TRANSACTION SLOT OVERWRITTEN

1. Session 1 starts query at time T1 and QENV 50

2. Session 1 selects block B1 during this query

3. Session 1 updates the block at SCN 51

4. Session 1 commits the changes
(Now other transactions are free to overwrite this rollback information)

5. A session (Session 1, another session or a number of other sessions) then use the
same rollback segment for a series of committed transactions.

These transactions each consume a slot in the rollback segment transaction table
such that it eventually wraps around (the slots are written to in a circular fashion) and
overwrites all the slots. Note that Oracle is free to reuse these slots since all
transactions are committed.

6. Session 1's query then visits a block that has been changed since the initial QENV
was established. Oracle therefore needs to derive an image of the block as at that point
in time.

Next Oracle attempts to lookup the rollback segment header's transaction slot
pointed to by the top of the data block. It then realises that this has been overwritten
and attempts to rollback the changes made to the rollback segment header to get the
original transaction slot entry.

If it cannot rollback the rollback segment transaction table sufficiently it will
return ORA-1555 since Oracle can no longer derive the required version of the data block.


It is also possible to encounter a variant of the transaction slot being overwritten
when using block cleanout. This is briefly described below :

Session 1 starts a query at QENV 50. After this another process updates the blocks that
Session 1 will require. When Session 1 encounters these blocks it determines that the
blocks have changed and have not yet been cleaned out (via delayed block cleanout).
Session 1 must determine whether the rows in the block existed at QENV 50, were
subsequently changed,

In order to do this, Oracle must look at the relevant rollback segment transaction table
slot to determine the committed SCN. If this SCN is after the QENV then Oracle must try
to construct an older version of the block and if it is before then the block just needs
clean out to be good enough for the QENV.

If the transaction slot has been overwritten and the transaction table cannot be rolled
back to a sufficiently old enough version then Oracle cannot derive the block image and
will return ORA-1555.

(Note: Normally Oracle can use an algorithm for determining a block's SCN during block
cleanout even when the rollback segment slot has been overwritten. But in this case
Oracle cannot guarantee that the version of the block has not changed since the start of
the query).

Solutions
~~~~~~~~~

This section lists some of the solutions that can be used to avoid the ORA-01555 problems
discussed in this article. It addresses the cases where rollback segment information is
overwritten by the same session and when the rollback segment transaction table entry is
overwritten.

It is worth highlighting that if a single session experiences the ORA-01555 and it is not
one of the special cases listed at the end of this article, then the session must be
using an Oracle extension whereby fetches across commits are tolerated. This does not
follow the ANSI model and in the rare cases where
ORA-01555 is returned one of the solutions below must be used.

CASE 1 - ROLLBACK OVERWRITTEN

1. Increase size of rollback segment which will reduce the likelihood of overwriting
rollback information that is needed.

2. Reduce the number of commits (same reason as 1).

3. Run the processing against a range of data rather than the whole table. (Same
reason as 1).

4. Add additional rollback segments. This will allow the updates etc. to be spread
across more rollback segments thereby reducing the chances of overwriting required
rollback information.

5. If fetching across commits, the code can be changed so that this is not done.

6. Ensure that the outer select does not revisit the same block at different times
during the processing. This can be achieved by :

- Using a full table scan rather than an index lookup
- Introducing a dummy sort so that we retrieve all the data, sort it and then
sequentially visit these data blocks.

CASE 2 - ROLLBACK TRANSACTION SLOT OVERWRITTEN

1. Use any of the methods outlined above except for '6'. This will allow transactions
to spread their work across multiple rollback segments therefore reducing the likelihood
or rollback segment transaction table slots being consumed.

2. If it is suspected that the block cleanout variant is the cause, then force block
cleanout to occur prior to the transaction that returns the ORA-1555. This can be
achieved by issuing the following in SQL*Plus, SQL*DBA or Server Manager :

alter session set optimizer_goal = rule;
select count(*) from table_name;

If indexes are being accessed then the problem may be an index block and clean out
can be forced by ensuring that all the index is traversed. Eg, if the index is on a
numeric column with a minimum value of 25 then the following query will force cleanout of
the index :

select index_column from table_name where index_column > 24;

Examples
~~~~~~~~

Listed below are some PL/SQL examples that can be used to illustrate the ORA-1555 cases
given above. Before these PL/SQL examples will return this error the database must be
configured as follows :

o Use a small buffer cache (db_block_buffers).

REASON: You do not want the session executing the script to be able to find old
versions of the block in the buffer cache which can be used to satisfy a block visit
without requiring the rollback information.

o Use one rollback segment other than SYSTEM.

REASON: You need to ensure that the work being done is generating rollback
information that will overwrite the rollback information required.

o Ensure that the rollback segment is small.

REASON: See the reason for using one rollback segment.

ROLLBACK OVERWRITTEN

rem * 1555_a.sql -
rem * Example of getting ora-1555 "Snapshot too old" by
rem * session overwriting the rollback information required
rem * by the same session.

drop table bigemp;
create table bigemp (a number, b varchar2(30), done char(1));

drop table dummy1;
create table dummy1 (a varchar2(200));

rem * Populate the example tables.
begin
for i in 1..4000 loop
insert into bigemp values (mod(i,20), to_char(i), 'N');
if mod(i,100) = 0 then
insert into dummy1 values ('ssssssssssss');
commit;
end if;
end loop;
commit;
end;
/

rem * Ensure that table is 'cleaned out'.
select count(*) from bigemp;

declare
-- Must use a predicate so that we revisit a changed block at a different
-- time.

-- If another tx is updating the table then we may not need the predicate
cursor c1 is select rowid, bigemp.* from bigemp where a < 20;

begin
for c1rec in c1 loop

update dummy1 set a = 'aaaaaaaa';
update dummy1 set a = 'bbbbbbbb';
update dummy1 set a = 'cccccccc';
update bigemp set done='Y' where c1rec.rowid = rowid;
commit;
end loop;
end;
/

ROLLBACK TRANSACTION SLOT OVERWRITTEN

rem * 1555_b.sql - Example of getting ora-1555 "Snapshot too old" by
rem * overwriting the transaction slot in the rollback
rem * segment header. This just uses one session.

drop table bigemp;
create table bigemp (a number, b varchar2(30), done char(1));

rem * Populate demo table.
begin
for i in 1..200 loop
insert into bigemp values (mod(i,20), to_char(i), 'N');
if mod(i,100) = 0 then
commit;
end if;
end loop;
commit;
end;
/

drop table mydual;
create table mydual (a number);
insert into mydual values (1);
commit;

rem * Cleanout demo table.
select count(*) from bigemp;

declare

cursor c1 is select * from bigemp;

begin

-- The following update is required to illustrate the problem if block
-- cleanout has been done on 'bigemp'. If the cleanout (above) is commented
-- out then the update and commit statements can be commented and the
-- script will fail with ORA-1555 for the block cleanout variant.
update bigemp set b = 'aaaaa';
commit;

for c1rec in c1 loop
for i in 1..20 loop
update mydual set a=a;
commit;
end loop;
end loop;
end;
/

Special Cases
~~~~~~~~~~~~~

There are other special cases that may result in an ORA-01555. These are given below but
are rare and so not discussed in this article :

o Trusted Oracle can return this if configured in OS MAC mode. Decreasing
LOG_CHECKPOINT_INTERVAL on the secondary database may overcome the problem.

o If a query visits a data block that has been changed by using the Oracle discrete
transaction facility then it will return ORA-01555.

o It is feasible that a rollback segment created with the OPTIMAL clause maycause a
query to return ORA-01555 if it has shrunk during the life of the query causing rollback
segment information required to generate consistent read versions of blocks to be lost.

Summary
~~~~~~~

This article has discussed the reasons behind the error ORA-01555 "Snapshot too old", has
provided a list of possible methods to avoid the error when it is encountered, and has
provided simple PL/SQL scripts that illustrate the cases discussed.