Recovery when the first block of a datafile containing the datafile header becomes corrupt

Here some lines explain about the header of a block. Normally we know have seen block corrupt but what if when the first block of a datafile containing the datafile header becomes corrupt. The header of a block contains information like the type of segment the block belongs to and the SCN of the last transaction that modified that block. For a more advanced description. The v$ views are actually structures stored in RAM. The v$datafile view contains details about each database, but the v$datafile_header contains important space management information that applies to all data files within the header. Now have some knowledge about those steps which are used by DBA to perform a recovery when the first block of a datafile containing the datafile header becomes corrupt.

Let’s simulate this scenario using the following dd command:

[oracle@localhost trace]$ dd if=/dev/zero of=/home/oracle/app/oracle/oradata/orcl/marcov01.dbf bs=8k conv=notrunc seek=1 count=1
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000432187 seconds, 19.0 MB/s

To check and have a completed list of corrupt blocks as we have already seen we should issue a backup validate command:

RMAN> backup validate tablespace marcov;
Starting backup at 19-03-2013 23:07:11
using channel ORA_DISK_1
RMAN-06169: could not read file header for datafile 11 error reason 7
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of backup command at 03/19/2013 23:07:11
RMAN-06056: could not access datafile 11

The V$DATABASE_BLOCK_CORRUPTION view as usual should contain data about which blocks are corrupt, but it doesn’t show any rows:

SQL> select * from v$database_block_corruption;

no rows selected

What does it happen if I try to recover that first block ? It seems to work fine…

RMAN> recover datafile 11 block 1;

Starting recover at 19-03-2013 23:08:09
using channel ORA_DISK_1
starting media recovery
media recovery complete, elapsed time: 00:00:01

Finished recover at 19-03-2013 23:08:10

… but trying again a backup validate command I obtain the same error.

RMAN> backup validate tablespace marcov;

Starting backup at 19-03-2013 23:08:30
using channel ORA_DISK_1
RMAN-06169: could not read file header for datafile 11 error reason 7
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of backup command at 03/19/2013 23:08:30
RMAN-06056: could not access datafile 11

Ok, it’s time to perform a more familiar restore/recover tablespace operation.
So a typical sequence is put that tablespace offline, restore it, recover it and then put it back online:

RMAN> sql 'alter tablespace marcov offline immediate';

sql statement: alter tablespace marcov offline immediate

RMAN> restore tablespace marcov;

Starting restore at 19-03-2013 23:13:30
using channel ORA_DISK_1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 03/19/2013 23:13:30
ORA-01135: file 11 accessed for DML/query is offline
ORA-01110: data file 11: '/home/oracle/app/oracle/oradata/orcl/marcov01.dbf'
RMAN-06019: could not translate tablespace name "MARCOV"

Oops… the familiar restore/recover sequence is not working at all.
And I can’t even put the tablespace online again.

RMAN> sql 'alter tablespace marcov online';

sql statement: alter tablespace marcov online
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of sql command on default channel at 03/19/2013 23:19:30
RMAN-11003: failure during parse/execution of SQL statement: alter tablespace marcov online
ORA-01122: database file 11 failed verification check
ORA-01110: data file 11: '/home/oracle/app/oracle/oradata/orcl/marcov01.dbf'
ORA-01210: data file header is media corrupt

It seems the familiar restore/recover operation cannot start because of the corrupt data block number 1.
So how can I proceed ? I put again the tablespace offline …

RMAN> sql 'alter tablespace marcov offline immediate';

sql statement: alter tablespace marcov offline immediate

… and simply removed the entire datafile with the corrupt data block number 1.

[oracle@localhost trace]$ rm /home/oracle/app/oracle/oradata/orcl/marcov01.dbf

Without the entire datafile, the restore tablespace command is able to complete successfully.

RMAN> restore tablespace marcov;

Starting restore at 19-03-2013 23:20:29
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00011 to /home/oracle/app/oracle/oradata/orcl/marcov01.dbf
channel ORA_DISK_1: reading from backup piece /home/oracle/app/oracle/flash_recovery_area/ORCL/backupset/2013_03_18/o1_mf_nnndf_TAG20130318T224952_8nhz414b_.bkp
channel ORA_DISK_1: piece handle=/home/oracle/app/oracle/flash_recovery_area/ORCL/backupset/2013_03_18/o1_mf_nnndf_TAG20130318T224952_8nhz414b_.bkp tag=TAG20130318T224952
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:02
Finished restore at 19-03-2013 23:20:32

Also the recover tablespace command is able to finish its operation.

RMAN> recover tablespace marcov;

Starting recover at 19-03-2013 23:20:45
using channel ORA_DISK_1

starting media recovery
media recovery complete, elapsed time: 00:00:00

Finished recover at 19-03-2013 23:20:47

The tablespace should be now available again to all the users.

RMAN> sql 'alter tablespace marcov online';

sql statement: alter tablespace marcov online

Ok… It’s not true and I know why.

During the restore operation RMAN simply used the first available backup piece: o1_mf_nnndf_TAG20130318T224952_8nhz414b_.bkp. But that backup piece contains a datafile with few corrupt data blocks because I forced to skip them during the backup operation. You can verify what I’m saying here in one of my previous posts.

SQL> select count(*) from marcov.t1;
select count(*) from marcov.t1
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 11, block # 10)
ORA-01110: data file 11: '/home/oracle/app/oracle/oradata/orcl/marcov01.dbf'

Let’s update the V$DATABASE_BLOCK_CORRUPTION view issuing a backup validate command.

RMAN> backup validate tablespace marcov;

Starting backup at 19-03-2013 23:21:53
using channel ORA_DISK_1
channel ORA_DISK_1: starting compressed full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00011 name=/home/oracle/app/oracle/oradata/orcl/marcov01.dbf
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
11 OK 6 553 768 14877710
File Name: /home/oracle/app/oracle/oradata/orcl/marcov01.dbf
Block Type Blocks Failing Blocks Processed
---------- -------------- ----------------
Data 0 196
Index 0 0
Other 0 19

Finished backup at 19-03-2013 23:21:56

As you can see datafile number 11 has few corrupt blocks.

SQL> select * from v$database_block_corruption;

FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO
---------- ---------- ---------- ------------------ ---------
11 8 1 14877710 CORRUPT
11 7 1 0 CORRUPT
11 10 1 0 CORRUPT
11 3 3 0 CORRUPT

Using the recover corruption list we are able to correct the content of those corrupt blocks.
Just note also how RMAN “failover to the previous backup” (skipping the already mentioned o1_mf_nnndf_TAG20130318T224952_8nhz414b_.bkp backup piece and also the next available one, o1_mf_nnndf_TAG20130318T090132_8ngglx76_.bkp) because using the current backup piece is not able to perform and complete the recovery operation.

RMAN> recover corruption list;

Starting recover at 19-03-2013 23:22:36
using channel ORA_DISK_1

channel ORA_DISK_1: restoring block(s)
channel ORA_DISK_1: specifying block(s) to restore from backup set
restoring blocks of datafile 00011
channel ORA_DISK_1: reading from backup piece /home/oracle/app/oracle/flash_recovery_area/ORCL/backupset/2013_03_18/o1_mf_nnndf_TAG20130318T224952_8nhz414b_.bkp
channel ORA_DISK_1: piece handle=/home/oracle/app/oracle/flash_recovery_area/ORCL/backupset/2013_03_18/o1_mf_nnndf_TAG20130318T224952_8nhz414b_.bkp tag=TAG20130318T224952
channel ORA_DISK_1: restored block(s) from backup piece 1
channel ORA_DISK_1: block restore complete, elapsed time: 00:00:00
failover to previous backup

channel ORA_DISK_1: restoring block(s)
channel ORA_DISK_1: specifying block(s) to restore from backup set
restoring blocks of datafile 00011
channel ORA_DISK_1: reading from backup piece /home/oracle/app/oracle/flash_recovery_area/ORCL/backupset/2013_03_18/o1_mf_nnndf_TAG20130318T090132_8ngglx76_.bkp
channel ORA_DISK_1: piece handle=/home/oracle/app/oracle/flash_recovery_area/ORCL/backupset/2013_03_18/o1_mf_nnndf_TAG20130318T090132_8ngglx76_.bkp tag=TAG20130318T090132
channel ORA_DISK_1: restored block(s) from backup piece 1
channel ORA_DISK_1: block restore complete, elapsed time: 00:00:01
failover to previous backup

channel ORA_DISK_1: restoring block(s)
channel ORA_DISK_1: specifying block(s) to restore from backup set
restoring blocks of datafile 00011
channel ORA_DISK_1: reading from backup piece /home/oracle/app/oracle/flash_recovery_area/ORCL/backupset/2013_03_18/o1_mf_nnndf_TAG20130318T072836_8ng94o0k_.bkp
channel ORA_DISK_1: piece handle=/home/oracle/app/oracle/flash_recovery_area/ORCL/backupset/2013_03_18/o1_mf_nnndf_TAG20130318T072836_8ng94o0k_.bkp tag=TAG20130318T072836
channel ORA_DISK_1: restored block(s) from backup piece 1
channel ORA_DISK_1: block restore complete, elapsed time: 00:00:01

starting media recovery
media recovery complete, elapsed time: 00:00:04

Finished recover at 19-03-2013 23:22:43

RMAN after looking into o1_mf_nnndf_TAG20130318T072836_8ng94o0k_.bkp backup piece completes the recovery operation and the T1 table is now really available to all the users.

SQL> select count(*) from marcov.t1;

COUNT(*)
----------
1000

Thank you for giving your valuable time to read the above information.

Source

For More Detail , You can join us follow:

LinkedIn Group: Oracle Cloud DBAAS

Facebook Page: OracleHelp

Recovery when the first block of a datafile containing the datafile header becomes corrupt

Like this:

About The Author

Anuradha Mudgal

Leave a Reply Cancel reply

Oracle Help

Follow Us

Share this:

Like this:

About The Author

Related Articles

Leave a Reply Cancel reply

Tags

Follow Us