Thursday, December 12, 2013

Problem with "with clause" queries after upgrade to 11.2.0.3

We identified this issue after upgrading the database from 11.2.0.2 to 11.2.0.3, when the performance of the entire database was impacted. This was a data warehouse database, and there were lots of queries generated from OBIEE using the "with clause". To work around the issue quickly, we switched the optimizer_features_enable parameter back to 11.2.0.2. Then came identifying the issue with the 11.2.0.3 optimizer. We ran a 10053 trace on the problem query with optimizer_features_enable set to 11.2.0.3 and again with 11.2.0.2, and compared the two traces. It turned out that with 11.2.0.3 the CBQT (cost based query transformation) was failing its validity checks, causing the issue. Later it was identified to be a problem with bug fix 11740670. To solve the issue we did the following:

Alter system set "_fix_control"='11740670:OFF';
Alter system set optimizer_features_enable='11.2.0.3';
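For reference, a 10053 trace for a single query can be generated as sketched below (run the problem query in the same session; the trace file is written to the user dump destination):

alter session set tracefile_identifier='10053_trace';
alter session set events '10053 trace name context forever, level 1';
-- run the problem query here
alter session set events '10053 trace name context off';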


11.2.0.3 optimizer (10053 trace excerpt)

Query transformations (QT)
**************************
CBQT: copy not possible on query block SEL$10 (#0) because linked to with clause
CBQT bypassed for query block SEL$10 (#0): Cannot copy query block.
CBQT: Validity checks failed for 98utkpdkpq0af.


11.2.0.2 optimizer (10053 trace excerpt)

Query transformations (QT)
**************************
JF: Checking validity of join factorization for query block SEL$1 (#0)
JF: Bypassed: not a UNION or UNION-ALL query block.

ST: not valid since new CBQT star transformation parameter is FALSE
TE: Checking validity of table expansion for query block SEL$1 (#0)
TE: Bypassed: table expansion disabled.
JF: Checking validity of join factorization for query block SEL$4 (#0)
JF: Bypassed: not a UNION or UNION-ALL query block.
ST: not valid since new CBQT star transformation parameter is FALSE
TE: Checking validity of table expansion for query block SEL$4 (#0)
TE: Bypassed: table expansion disabled.
CBQT: Validity checks passed for 98utkpdkpq0af.

Wednesday, December 11, 2013

ORA-15150: instance lock mode 'EXCLUSIVE' conflicts with other ASM instance(s)



This is a new standalone server using ASM and CRS. The database is not clustered, so the cluster_database parameter and RAC options are turned off for both the DB and ASM, as expected.


SQL> startup nomount

ORA-15150: instance lock mode 'EXCLUSIVE' conflicts with other ASM instance(s)

We were able to start the database using the "srvctl start database -d" command, but not using sqlplus.

The cause of the problem was some parameter in the spfile used by the DB instance when we were using sqlplus (I suspected it to be instance_number).

To resolve the issue we did the following.
Since the DB was able to start using the srvctl command, we started it that way and created a pfile from the spfile. We then used this pfile to start the instance using sqlplus.
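As a sketch of those steps (the pfile path below is just an example):

SQL> create pfile='/tmp/initdb.ora' from spfile;
-- edit /tmp/initdb.ora and remove or correct the suspect parameter
SQL> shutdown immediate
SQL> startup pfile='/tmp/initdb.ora'
-- once the instance starts cleanly, recreate the spfile
SQL> create spfile from pfile='/tmp/initdb.ora';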

In a cluster environment this issue can also be caused by the following:
  • The cluster_database parameter is set to false in ASM/database
  • The RAC option is not turned on. To turn on the RAC option, use the following Oracle note:

How to Check Whether Oracle Binary/Instance is RAC Enabled and Relink Oracle Binary in RAC (Doc ID 284785.1)  

Tuesday, December 10, 2013

Unidirectional Goldengate replication 12c on same host and same pluggable database

After the release of GoldenGate 12c we hear two terms in GG replication: integrated and classic replication. In short, integrated replication is applicable to 12c databases, and classic is for older database versions like 11.2.0.4. I have configured a simple setup to test the replication.
Integrated replicat is tightly integrated with the Oracle database. Some features of Oracle Streams are also carried over, which is why we see the apply process, which comes from Streams, in the integrated replicat.

Environment:
Container database: cdbfs
Pluggable database: pdb1
Replication between schema src to schema tgt on the same pluggable database pdb1 


1) First we need to configure a GoldenGate capture user and assign it the necessary privileges. This user has to be created in the root (container) database.

CREATE USER c##ggadm IDENTIFIED BY oracle CONTAINER=ALL;


grant dba to c##ggadm container=all;

exec dbms_goldengate_auth.grant_admin_privileges('c##ggadm','capture', grant_optional_privilege=>'*') 


2) Log in to SQL*Plus as a user with ALTER SYSTEM privilege.

Issue the following command to determine whether the database is in supplemental logging mode and in forced logging mode. If the result is YES for both queries, the database meets the Oracle GoldenGate requirement.

SELECT supplemental_log_data_min, force_logging FROM v$database;

If the result is NO for either or both properties, continue with these steps to enable them as needed:

SQL> ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
SQL> ALTER DATABASE FORCE LOGGING;

Issue the following command to verify that these properties are now enabled.

SELECT supplemental_log_data_min, force_logging FROM v$database;

The output of the query must be YES for both properties.

Switch the log files.

SQL> ALTER SYSTEM SWITCH LOGFILE;


3) Adjust the streams pool size as per your requirement.

alter system set streams_pool_size=25m;

4) Create the extract process. One extract process can capture changes from multiple pluggable databases, so you can specify multiple PDBs with the help of the "sourcecatalog" parameter in the param file. I have specified pdb1.
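For example, if the same extract were also to capture from a second PDB (pdb2 and the schema hr below are hypothetical), the parameter file could simply repeat the SOURCECATALOG clause:

SOURCECATALOG pdb1
TABLE src.*;
SOURCECATALOG pdb2
TABLE hr.*;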

GGSCI (localhost.localdomain) 4> add extract ext1,tranlog,begin now
EXTRACT added.

  • Register the extract with the pluggable database. This is new in 12c, and you will have to log in to the container database first to do that:
dblogin userid c##ggadm password oracle

GGSCI (localhost.localdomain) 16> REGISTER EXTRACT ext1 database CONTAINER (pdb1)

Extract EXT1 successfully registered with database at SCN 2170346.

5) Add the extract trail and modify the parameter file.

GGSCI (localhost.localdomain) 7> add exttrail /home/oracle/gg/dirdat/ex,extract ext1
EXTTRAIL added.

GGSCI (localhost.localdomain) 9> edit param ext1

extract EXT1
USERID C##ggadm@cdb_fs, PASSWORD oracle
LOGALLSUPCOLS
UPDATERECORDFORMAT COMPACT
DDL INCLUDE MAPPED SOURCECATALOG pdb1
exttrail /home/oracle/gg/dirdat/ex
SOURCECATALOG pdb1
TABLE SRC.*;

6) Add the pump process and modify the parameter file.

 ADD EXTRACT EPUMP, EXTTRAILSOURCE /home/oracle/gg/dirdat/ex, begin now

ADD RMTTRAIL /home/oracle/gg/dirdat/rt, EXTRACT EPUMP

EXTRACT EPUMP
PASSTHRU
RMTHOST localhost,MGRPORT 7809
RMTTRAIL /home/oracle/gg/dirdat/rt
SOURCECATALOG pdb1
TABLE SRC.*;

7) Add the replicat process. Log in to the database first, then add the checkpoint table. Note that the replicat reads the trail written by the pump (/home/oracle/gg/dirdat/rt).

dblogin userid c##ggadm password oracle

add checkpointtable c##ggadm.ctable

ADD REPLICAT REP1, EXTTRAIL /home/oracle/gg/dirdat/rt, CHECKPOINTTABLE c##ggadm.ctable

register replicat rep1 database

edit param rep1

REPLICAT REP1
USERID C##ggadm@pdb1, PASSWORD oracle
ASSUMETARGETDEFS
DISCARDFILE /home/oracle/gg/dirrpt/rep1.dsc,APPEND
DDLOPTIONS REPORT
HANDLECOLLISIONS
APPLYNOOPUPDATES
DDL
SOURCECATALOG pdb1
MAP SRC.*, TARGET TGT.*;


GGSCI (localhost.localdomain) 10> start *


GGSCI (localhost.localdomain) 11> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     EPUMP       00:00:00      00:00:09
EXTRACT     RUNNING     EXT1        00:00:06      00:00:05
REPLICAT    RUNNING     REP1        00:00:00      00:00:09


8) Testing

I am creating a table t1 in the src schema and populating it with some values.

[oracle@localhost gg]$ sqlplus src/oracle@pdb1

SQL*Plus: Release 12.1.0.1.0 Production on Tue Dec 10 09:10:57 2013

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Last Successful login time: Wed Dec 04 2013 15:26:51 -07:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> create table t1(c1 number);

Table created.

SQL> insert into t1 values('&val');
Enter value for val: 2
old   1: insert into t1 values('&val')
new   1: insert into t1 values('2')

1 row created.

SQL> /
Enter value for val: 3
old   1: insert into t1 values('&val')
new   1: insert into t1 values('3')

1 row created.

SQL> /
Enter value for val: 4
old   1: insert into t1 values('&val')
new   1: insert into t1 values('4')

1 row created.

SQL> commit;

The same table gets populated into the tgt schema as shown below.

[oracle@localhost ~]$ sqlplus tgt/oracle@pdb1

SQL*Plus: Release 12.1.0.1.0 Production on Tue Dec 10 09:12:12 2013

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> select table_name from user_tables;

TABLE_NAME
--------------------------------------------------------------------------------
T1


SQL> select * from t1;

        C1
----------
         2
         3
         4


Tuesday, August 13, 2013

ORA-00600: internal error code, arguments: [sorput_1], [], [], [], [], [], [], []

ORA-00060 and ORA-600 [sorput_1] errors in the alert log were causing a lot of TM locks, and the database hung.

The cause was identified as a bitmap index causing the TM locks.

Resolution: drop the bitmap index.

Thursday, August 8, 2013

11gR2 :Getting CRS-2718 while trying to relocate the instance resource


The customer has a single-instance database on a 2-node cluster, running one instance at a time on the given 2 nodes. He wants to relocate the instance from node 2 to node 1 but gets the following error while relocating.

8:21:34 $ $CRS_HOME/bin/crsctl relocate resource ora.ABC.inst -n host11  -f
CRS-2718: Server 'host1' is not a hosting member of resource 'ora.ABC.inst'
CRS-4000: Command Relocate failed, or completed with errors.

On examining the output of "crsctl stat resource ora.ABC.inst -f", I found that HOSTING_MEMBERS was set to one node only:

HOSTING_MEMBERS=HOST1


Resolution

The resolution was to modify the HOSTING_MEMBERS string to a space-delimited list of host names:

crsctl modify resource ora.ABC.inst -attr "HOSTING_MEMBERS='HOST1 HOST2'"

Monday, August 5, 2013

11gR2 :Query running fast on one node of RAC database while running very slow on the other 2 nodes

I came across this very interesting and unique problem recently. A query running on one of the RAC instances (version 11.2.0.3) was performing well, whereas on the other 2 instances it was performing pretty badly. The query was returning results in sub-seconds on one node, whereas on the other 2 nodes it was taking a long time (20-30 secs). I took a 10046 trace from both instances and created tkprof reports. From the tkprof output, on node 1 disk reads were 0, whereas on node 2 there were 167006 disk reads. That means the query was hitting the buffer cache on node 1 all the time, whereas on nodes 2 and 3 it was doing physical reads every time, no matter how many times the query was run.

Following points to note

  • The execution plan didn't change; it remained the same across all the instances.
  • Stats had not been collected for a long time, but that was another issue which we addressed later.
  • The queries were doing full table scans (the filter predicate was the most popular value, so it was obvious for the queries to go for FTS; the FTS itself was not the issue).
  • db_cache_size is the same on all 3 instances.

What changed?

The database was bounced in rolling fashion after some maintenance.

We ran the query in question multiple times in the hope that the data would get cached if we kept running it again and again, but that did not help. It was very strange behavior.

After analyzing the data for a long time, I noticed the "events waited on" section in the tkprof report, which seemed a little out of place. There were a lot of waits on "direct path read" on instances 2 and 3, whereas there were no such wait events on node 1.

If you do some research on the "direct path read" wait event, you will learn that the query results were skipping the buffer cache and being loaded directly into the PGA. Since the query results never went through the buffer cache, they were never cached, and the query did physical reads each time it was executed.
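This can also be confirmed from the database itself by checking the session-level wait events (a sketch; the sid value 123 below is a placeholder for the actual session id):

select event, total_waits, time_waited
from   v$session_event
where  sid = 123
and    event like 'direct path read%';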

Now the question arises: how come the query results got cached on node 1 but not on the other 2? It is still a mystery, but I used the following event to turn off direct path reads on all the nodes, and it resolved the issue.

alter system set events '10949 trace name context forever' ;

Interestingly, this issue didn't happen at the time of the upgrade from 10.2 to 11.2, but months later, after a database bounce.

The trade-off here is that you need to have db_cache_size large enough to accommodate the results of all frequently executed queries; otherwise they will age out of the cache. In case you are using AMM, you need to set a minimum size for db_cache_size along with the parameters memory_target and memory_max_target.
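For example, assuming AMM with hypothetical sizes, a minimum buffer cache can be guaranteed like this (with memory_target set, an explicit db_cache_size acts as a floor rather than a fixed size):

alter system set memory_target=8g scope=spfile;
alter system set db_cache_size=2g scope=spfile;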


If you want to create a tkprof report, use the following procedure.

  • First create the 10046 trace:
alter session set tracefile_identifier='10046_trace';
alter session set timed_statistics = true;  
alter session set statistics_level=all;  
alter session set max_dump_file_size = unlimited;  
alter session set events '10046 trace name context forever,level 12'; 
 -- Execute the queries or operations to be traced here --  
select * from dual;  
exit;

If the session is not exited then the trace can be disabled using:
alter session set events '10046 trace name context off';
  
  • Generate tkprof reports from the raw SQL traces in the USER_DUMP_DEST:

    tkprof "10046_trcfile" "output_filename" sort=execpu,exedsk explain=user/password waits=yes


Please feel free to contact me in case of any questions or suggestions.

               

Friday, March 29, 2013

ORA-15564: contents of the replay directory provided to the workload replay client do not match with the replay directory provided to the database server

This can happen for many different reasons. Some of the reasons I have noticed while running a DB replay are:

1) During connection remapping, you are not providing the same connection string/service name to the replay client.
2) The replay directory is not accessible from all the instances.
3) The replay workload is not pre-processed, or it was pre-processed on a different version of the database than the target database.
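For reference, the replay client is started with the wrc utility, pointing at the same replay directory the server uses (the credentials and path below are examples):

wrc system/password mode=replay replaydir=/u01/app/oracle/replay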

ORA-00600: internal error code, arguments: [kdBlkCheckError], [75], [409240], [18018], [], [], [], [], [], [], [], []

I was working on Real Application Testing on an Exadata box, and we faced the following error while running the test. The change made to the database:


  • We flashed back the database to guaranteed restore point after doing a Database replay.


ORA-00600: internal error code, arguments: [kdBlkCheckError], [75], [409240], [18018], [], [], [], [], [], [], [], []

On examining the alert log I noticed the following error. It points to corruption in tablespace ABC. There is no segment name or segment owner reported, so my initial thought was that the corrupt block was actually free space in this locally managed tablespace. I ran dbverify, RMAN validate and some other tests to identify the location of this corrupt block, but all in vain. Then I looked at the trace file to find more details of the corruption. The corruption actually happened at the time of the flashback, and it was in the bitmap free space. The workaround was to restart the flashback and rebuild the bitmap segment.


Corrupt Block Found
         TSN = 12, TSNAME = ABC
         RFN = 75, BLK = 409240, RDBA = -998346247
         OBJN = 2, OBJD = -1, OBJECT = , SUBOBJECT =
         SEGMENT OWNER = , SEGMENT TYPE =


  • The ORA-600 error arguments are explained as below:

arg[a]  --- File #
arg[b]  --- Block #
arg[c]  --- Application

So in case you are looking for a solution to other ORA-600 issues, you should search using ORA-600 and arg[c].

Workaround

1)
srvctl stop instance  -d rac -i rac1
srvctl start instance -d rac -i rac1 -o MOUNT
sqlplus / as sysdba
 Alter Database Flashback OFF;
 Alter Database Flashback ON;
 exit
srvctl stop instance  -d rac -i rac1
srvctl start database -d rac

2) EXECUTE DBMS_SPACE_ADMIN.TABLESPACE_REBUILD_BITMAPS('ABC');

Monday, February 18, 2013

Start dbconsole for multiple databases on same host

If you want to start the dbconsole for multiple databases on the same host, the parameters ORACLE_UNQNAME and ORACLE_HOSTNAME need to be set along with ORACLE_SID for each particular database.


[oracle@localhost ~]$ export ORACLE_SID=migrate
[oracle@localhost ~]$ export ORACLE_HOSTNAME=localhost
[oracle@localhost ~]$ export ORACLE_UNQNAME=migrate


[oracle@localhost ~]$ emctl start dbconsole
Oracle Enterprise Manager 11g Database Control Release 11.2.0.1.0
Copyright (c) 1996, 2009 Oracle Corporation.  All rights reserved.
https://localhost:5501/em/console/aboutApplication
Starting Oracle Enterprise Manager 11g Database Control ................................. started.
------------------------------------------------------------------
Logs are generated in directory /u01/app/oracle/product/11.1.0/db_1/localhost_migrate/sysman/log



Thursday, February 7, 2013

ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"

This error can happen for many different reasons. When it happens you may need to think about what has changed in your environment. It was on my testing machine that I started getting this error, and I was wondering what could have gone wrong. I searched on the internet but was not able to find anything matching my error and environment. I am using a standalone 11.2 machine on OEL5 with the Oracle Restart feature enabled. From 11.2 onwards, ASM is part of the grid infrastructure, and a diskgroup resource is always there under the clusterware. This diskgroup houses the most important file for the ASM instance, i.e. the SPFILE. This diskgroup resource (ora.DATA.dg) was not coming up, and it complained of ORA-15063 when I tried to start it manually.



[oracle@localhost ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  OFFLINE      localhost
ora.LISTENER.lsnr
               ONLINE  ONLINE       localhost
ora.LISTENER_GRID.lsnr
               ONLINE  ONLINE       localhost
ora.asm
               ONLINE  OFFLINE      localhost                Instance Shutdown
ora.eons
               OFFLINE OFFLINE      localhost
ora.ons
               OFFLINE OFFLINE      localhost
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        ONLINE  ONLINE       localhost
ora.diskmon
      1        ONLINE  ONLINE       localhost
ora.migrate.db
      1        ONLINE  OFFLINE                               Instance Shutdown
ora.orcl.db
      1        OFFLINE OFFLINE
ora.orcl.test.svc
      1        OFFLINE OFFLINE
ora.stby.db
      1        OFFLINE OFFLINE



From the above output I can see that the resource ora.DATA.dg, which is the CRS resource of my diskgroup DATA, is offline. When I tried to start it manually, I got the following error.



[oracle@localhost ~]$ crsctl start resource ora.DATA.dg
CRS-2672: Attempting to start 'ora.asm' on 'localhost'
ORA-01012: not logged on
CRS-2676: Start of 'ora.asm' on 'localhost' succeeded
CRS-2672: Attempting to start 'ora.DATA.dg' on 'localhost'
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"
CRS-2674: Start of 'ora.DATA.dg' on 'localhost' failed
CRS-2679: Attempting to clean 'ora.DATA.dg' on 'localhost'
CRS-2681: Clean of 'ora.DATA.dg' on 'localhost' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'localhost'
CRS-2677: Stop of 'ora.asm' on 'localhost' succeeded
CRS-4000: Command Start failed, or completed with errors.




Clearly from the above output I can see that I am not able to access the diskgroup DATA, and this error suggests there could be some issue with the disks, like corruption or missing metadata, but that's not true in my case. This error message is misleading.

When I try to start the ASM instance manually, I am able to do so, but with an error that no spfile was found.



[oracle@localhost ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.1.0 Production on Thu Feb 7 02:05:37 2013

Copyright (c) 1982, 2009, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORA-00099: warning: no parameter file specified for ASM instance
ASM instance started

Total System Global Area  283930624 bytes
Fixed Size                  2212656 bytes
Variable Size             256552144 bytes
ASM Cache                  25165824 bytes
ORA-15110: no diskgroups mounted


SQL> show parameter asm_

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskgroups                       string
asm_diskstring                       string
asm_power_limit                      integer     1
asm_preferred_read_failure_groups    string
SQL>


So I needed to investigate my ASM resource further, and I looked at the ASM resource profile. Suddenly I remembered that I had removed and re-added the ASM resource a day before. Following is the output of the ASM resource profile, and you can clearly see that SPFILE and ASM_DISKSTRING are blank. The point is that I didn't specify the ASM disk string at the time of adding the ASM resource, and that was my mistake.




[oracle@localhost ~]$ crsctl stat resource ora.asm -f
NAME=ora.asm
TYPE=ora.asm.type
STATE=ONLINE
TARGET=ONLINE
ACL=owner:oracle:rwx,pgrp:oinstall:rwx,other::r--
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
ALIAS_NAME=
ASM_DISKSTRING=              <<<<<<<<<<<<<<
AUTO_START=restore
CHECK_INTERVAL=1
CHECK_TIMEOUT=600
CREATION_SEED=83
CURRENT_RCOUNT=0
DEFAULT_TEMPLATE=PROPERTY(RESOURCE_CLASS=asm) ELEMENT(INSTANCE_NAME= %GEN_USR_ORA_INST_NAME%)
DEGREE=1
DESCRIPTION=Oracle ASM resource
ENABLED=1
FAILURE_COUNT=0
FAILURE_HISTORY=
GEN_USR_ORA_INST_NAME=+ASM
ID=ora.asm
INCARNATION=0
LAST_FAULT=0
LAST_RESTART=0
LAST_SERVER=
LOAD=1
LOGGING_LEVEL=1
NLS_LANG=
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
PROFILE_CHANGE_TEMPLATE=
RESTART_ATTEMPTS=5
SCRIPT_TIMEOUT=60
SPFILE=    <<<<<<<<<<<<<
START_DEPENDENCIES=hard(ora.cssd) weak(ora.LISTENER.lsnr)
START_TIMEOUT=900
STATE_CHANGE_TEMPLATE=
STATE_CHANGE_VERS=0
STATE_DETAILS=
STOP_DEPENDENCIES=hard(ora.cssd)
STOP_TIMEOUT=600
UPTIME_THRESHOLD=1d
USR_ORA_ENV=
USR_ORA_INST_NAME=+ASM
USR_ORA_OPEN_MODE=mount
USR_ORA_OPI=false
USR_ORA_STOP_MODE=immediate
VERSION=11.2.0.1.0



Solution:  

The solution was to modify, or remove and re-add, the ASM resource with the correct asm_diskstring path. Everything came online after that. You can modify the resource in case you don't want to remove it:

srvctl modify asm -d /dev/oracleasm/disks

and then stop and start ASM.

OR

[oracle@localhost ~]$ srvctl stop asm
[oracle@localhost disks]$ srvctl remove asm
[oracle@localhost disks]$  srvctl add asm -d /dev/oracleasm/disks
[oracle@localhost disks]$ srvctl start asm


[oracle@localhost disks]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       localhost
ora.LISTENER.lsnr
               ONLINE  ONLINE       localhost
ora.LISTENER_GRID.lsnr
               ONLINE  ONLINE       localhost
ora.asm
               ONLINE  ONLINE       localhost                Started
ora.eons
               OFFLINE OFFLINE      localhost
ora.ons
               OFFLINE OFFLINE      localhost
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        ONLINE  ONLINE       localhost
ora.diskmon
      1        ONLINE  ONLINE       localhost
ora.migrate.db
      1        ONLINE  OFFLINE                               Instance Shutdown
ora.orcl.db
      1        OFFLINE OFFLINE
ora.orcl.test.svc
      1        OFFLINE OFFLINE
ora.stby.db
      1        OFFLINE OFFLINE