Thursday, December 12, 2013

Problem with "with clause" queries after upgrade to 11.2.0.3

We hit this issue when we upgraded the database from 11.2.0.2 to 11.2.0.3: the performance of the entire database was impacted. This was a data warehouse database, and there were lots of queries generated from OBIEE using the "with clause". To work around the issue quickly we switched the optimizer_features_enable parameter back to 11.2.0.2. Then came identifying the issue with the 11.2.0.3 optimizer. We ran a 10053 trace on the problem query, which was not performing well, with optimizer_features_enable set to 11.2.0.3 and then to 11.2.0.2, and compared the two traces. It was found that with 11.2.0.3 the CBQT (cost based query transformation) was not succeeding, which caused the issue. It was later identified as an issue with bug fix 11740670. To solve the issue we did the following:

Alter system set "_fix_control"='11740670:OFF';
Alter system set optimizer_features_enable=11.2.0.3;
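To confirm that the fix control actually took effect after the change, you can query v$system_fix_control; a hedged sketch (column names as in 11.2):

```sql
-- Verify that bug fix 11740670 is now disabled (VALUE = 0)
SELECT bugno, value, description
FROM   v$system_fix_control
WHERE  bugno = 11740670;
```

A VALUE of 0 indicates the fix is off for new sessions system-wide.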


10053 trace with the 11.2.0.3 optimizer:

Query transformations (QT)
**************************
CBQT: copy not possible on query block SEL$10 (#0) because linked to with clause
CBQT bypassed for query block SEL$10 (#0): Cannot copy query block.
CBQT: Validity checks failed for 98utkpdkpq0af.


10053 trace with the 11.2.0.2 optimizer:

Query transformations (QT)
**************************
JF: Checking validity of join factorization for query block SEL$1 (#0)
JF: Bypassed: not a UNION or UNION-ALL query block.

ST: not valid since new CBQT star transformation parameter is FALSE
TE: Checking validity of table expansion for query block SEL$1 (#0)
TE: Bypassed: table expansion disabled.
JF: Checking validity of join factorization for query block SEL$4 (#0)
JF: Bypassed: not a UNION or UNION-ALL query block.
ST: not valid since new CBQT star transformation parameter is FALSE
TE: Checking validity of table expansion for query block SEL$4 (#0)
TE: Bypassed: table expansion disabled.
CBQT: Validity checks passed for 98utkpdkpq0af.

Wednesday, December 11, 2013

ORA-15150: instance lock mode 'EXCLUSIVE' conflicts with other ASM instance(s)



This is a new standalone server using ASM and CRS. The database is not clustered, so cluster_database and the RAC option are turned off for both the DB and ASM, as expected.


SQL> startup nomount

ORA-15150: instance lock mode 'EXCLUSIVE' conflicts with other ASM instance(s)

We were able to start the database using the "srvctl start database -d" command but not using sqlplus.

The cause of the problem was some parameter in the spfile used by the DB instance when starting via sqlplus (I suspected it to be instance_number).

To resolve the issue we did the following.
Since the DB was able to start using the srvctl command, we started it that way and created a pfile from the spfile. We then used this pfile to start the instance using sqlplus.
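The recovery steps above can be sketched as follows (the pfile path is hypothetical):

```sql
-- With the instance started via srvctl, dump the spfile to a pfile
CREATE PFILE='/tmp/initORCL.ora' FROM SPFILE;

-- Shut down, review/fix the suspect parameter in the pfile,
-- then start the instance from sqlplus using the pfile
SHUTDOWN IMMEDIATE
STARTUP PFILE='/tmp/initORCL.ora'

-- Once the instance starts cleanly, recreate the spfile from the fixed pfile
CREATE SPFILE FROM PFILE='/tmp/initORCL.ora';
```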

In a cluster environment this issue can also be caused by the following:
  • The cluster_database parameter is set to false in the ASM/database instance
  • The RAC option is not turned on. To turn on the RAC option, use the following Oracle note:

How to Check Whether Oracle Binary/Instance is RAC Enabled and Relink Oracle Binary in RAC (Doc ID 284785.1)  

Tuesday, December 10, 2013

Unidirectional Goldengate replication 12c on same host and same pluggable database

After the release of GoldenGate 12c we hear two terms in GG replication: integrated and classic replication. In short, integrated replicat is applicable to 12c databases, while classic is for older database versions such as 11.2.0.4. I have configured a simple setup to test the replication.
Integrated replicat is tightly integrated with the Oracle database. Some features of Oracle Streams were carried over as well, which is why we see the apply process, which comes from Streams, in the integrated replicat.

Environment:
Container database:cdbfs
Pluggable database:pdb1
Replication between schema src to schema tgt on the same pluggable database pdb1 


1) First we need to configure a GoldenGate capture user and assign it the necessary privileges. This user has to be created in the root (container) database.

CREATE USER c##ggadm IDENTIFIED BY oracle container= ALL;


grant dba to c##ggadm container=all;

exec dbms_goldengate_auth.grant_admin_privileges('c##ggadm','capture', grant_optional_privilege=>'*') 


2) Log in to SQL*Plus as a user with ALTER SYSTEM privilege.

Issue the following command to determine whether the database is in supplemental logging mode and in forced logging mode. If the result is YES for both queries, the database meets the Oracle GoldenGate requirement.

SELECT supplemental_log_data_min, force_logging FROM v$database;

If the result is NO for either or both properties, continue with these steps to enable them as needed:

SQL> ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
SQL> ALTER DATABASE FORCE LOGGING;

Issue the following command to verify that these properties are now enabled.

SELECT supplemental_log_data_min, force_logging FROM v$database;

The output of the query must be YES for both properties.

Switch the log files.

SQL> ALTER SYSTEM SWITCH LOGFILE;


3) Adjust the streams pool size as per the requirement.

alter system set streams_pool_size=25m;

4) Create the extract process. One extract process can capture changes from multiple pluggable databases, so you can specify multiple PDBs with the help of the "sourcecatalog" parameter in the param file. I have specified pdb1.

GGSCI (localhost.localdomain) 4> add extract ext1,tranlog,begin now
EXTRACT added.

  • Register the extract with the pluggable database. This is new in 12c, and you will have to log in to the container database first to do that:
dblogin userid c##ggadm password oracle

GGSCI (localhost.localdomain) 16> REGISTER EXTRACT ext1 database CONTAINER (pdb1)

Extract EXT1 successfully registered with database at SCN 2170346.

5) Add the extract trail and modify the parameter file.

GGSCI (localhost.localdomain) 7> add exttrail /home/oracle/gg/dirdat/ex,extract ext1
EXTTRAIL added.

GGSCI (localhost.localdomain) 9> edit param ext1

extract EXT1
USERID C##ggadm@cdb_fs, PASSWORD oracle
LOGALLSUPCOLS
UPDATERECORDFORMAT COMPACT
DDL INCLUDE MAPPED SOURCECATALOG pdb1
exttrail /home/oracle/gg/dirdat/ex
SOURCECATALOG pdb1
TABLE SRC.*;

6) Add the pump process and modify the parameter file.

 ADD EXTRACT EPUMP, EXTTRAILSOURCE /home/oracle/gg/dirdat/ex, begin now

ADD RMTTRAIL /home/oracle/gg/dirdat/rt, EXTRACT EPUMP

EXTRACT EPUMP
PASSTHRU
RMTHOST localhost,MGRPORT 7809
RMTTRAIL /home/oracle/gg/dirdat/rt
SOURCECATALOG pdb1
TABLE SRC.*;

7) Add the replicat process

dblogin userid c##ggadm password oracle

add checkpointtable pdb1.c##ggadm.ctable

ADD REPLICAT REP1, EXTTRAIL /home/oracle/gg/dirdat/rt, CHECKPOINTTABLE c##ggadm.ctable

register replicat rep1 database

edit param rep1

REPLICAT REP1
USERID C##ggadm@pdb1, PASSWORD oracle
ASSUMETARGETDEFS
DISCARDFILE /home/oracle/gg/dirrpt/rep1.dsc,APPEND
DDLOPTIONS REPORT
HANDLECOLLISIONS
APPLYNOOPUPDATES
DDL
SOURCECATALOG pdb1
MAP SRC.*, TARGET TGT.*;


GGSCI (localhost.localdomain) 10> start *


GGSCI (localhost.localdomain) 11> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     EPUMP       00:00:00      00:00:09
EXTRACT     RUNNING     EXT1        00:00:06      00:00:05
REPLICAT    RUNNING     REP1        00:00:00      00:00:09


8) Testing

I am creating a table t1 in the src schema and populating it with some values.

[oracle@localhost gg]$ sqlplus src/oracle@pdb1

SQL*Plus: Release 12.1.0.1.0 Production on Tue Dec 10 09:10:57 2013

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Last Successful login time: Wed Dec 04 2013 15:26:51 -07:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> create table t1(c1 number);

Table created.

SQL> insert into t1 values('&val');
Enter value for val: 2
old   1: insert into t1 values('&val')
new   1: insert into t1 values('2')

1 row created.

SQL> /
Enter value for val: 3
old   1: insert into t1 values('&val')
new   1: insert into t1 values('3')

1 row created.

SQL> /
Enter value for val: 4
old   1: insert into t1 values('&val')
new   1: insert into t1 values('4')

1 row created.

SQL> commit;

The same table gets populated into the tgt schema as shown below.

[oracle@localhost ~]$ sqlplus tgt/oracle@pdb1

SQL*Plus: Release 12.1.0.1.0 Production on Tue Dec 10 09:12:12 2013

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> select table_name from user_tables;

TABLE_NAME
--------------------------------------------------------------------------------
T1


SQL> select * from t1;

        C1
----------
         2
         3
         4
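To confirm that the rows were applied through the replicat (rather than, say, inserted directly into the target schema), GGSCI can report apply statistics; a hedged example:

```text
GGSCI> stats replicat rep1, total
```

The output should show insert counts for the mapped SRC.* to TGT.* tables matching the rows inserted on the source.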


Tuesday, August 13, 2013

ORA-00600: internal error code, arguments: [sorput_1], [], [], [], [], [], [], []

ORA-00060 and ORA-00600 [sorput_1] errors in the alert log, causing lots of TM locks and a hung database.

The cause was identified as a bitmap index causing TM lock contention.

Resolution: drop the bitmap index.
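A minimal sketch of the resolution; the schema, table, and index names below are hypothetical, and recreating the index as a b-tree is optional:

```sql
-- Identify the offending bitmap index first (table name is an example)
SELECT owner, index_name, table_name
FROM   dba_indexes
WHERE  index_type = 'BITMAP'
AND    table_name = 'ORDERS';

-- Drop the bitmap index causing the TM lock contention
DROP INDEX app.bmx_orders_status;

-- If the column still needs indexing on a DML-heavy table, a regular
-- b-tree index avoids bitmap locking behavior
CREATE INDEX app.ix_orders_status ON app.orders(status);
```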

Thursday, August 8, 2013

11gR2 :Getting CRS-2718 while trying to relocate the instance resource


The customer has a single-instance database on a 2-node cluster; it runs one instance at a time on the given 2 nodes. He wants to relocate the instance from node 2 to node 1 but gets the following error while relocating.

8:21:34 $ $CRS_HOME/bin/crsctl relocate resource ora.ABC.inst -n host11  -f
CRS-2718: Server 'host1' is not a hosting member of resource 'ora.ABC.inst'
CRS-4000: Command Relocate failed, or completed with errors.

On examining the output of "crsctl stat resource ora.ABC.inst -f", I found that HOSTING_MEMBERS was set to a single node only:

HOSTING_MEMBERS=HOST1


Resolution

The resolution was to modify the HOSTING_MEMBERS attribute to a space-delimited list of host names:

crsctl modify resource ora.ABC.inst -attr "HOSTING_MEMBERS='HOST1 HOST2'"

Monday, August 5, 2013

11gR2 :Query running fast on one node of RAC database while running very slow on the other 2 nodes

I came across this very interesting and unique problem recently. A query running on one instance of a RAC database (version 11.2.0.3) was performing well, whereas on the other 2 instances it was performing pretty badly. The query was returning results in sub-seconds on one node, whereas on the other 2 nodes it was taking a long time (20-30 secs). I took a 10046 trace from both instances and created tkprof reports. From the tkprof output, on node 1 disk reads were 0, whereas on node 2 there were 167006 disk reads. That means the query was hitting the cache on node 1 all the time, whereas on nodes 2 and 3 it was doing physical reads every time, no matter how many times the query was run.

Points to note:

  • The execution plan didn't change; it remained the same across all the instances.
  • Stats had not been collected for a long time, but that was another issue which we addressed later.
  • The queries were doing full table scans (the filter predicate was the most popular value, so it was obvious for the queries to go for FTS; FTS itself was not the issue).
  • db_cache_size is the same on all 3 instances.

What changed?

The database was bounced in rolling fashion after some maintenance.

We tried running the query in question multiple times in the hope that the data would get cached if we kept running it again and again, but that was not helping. It was very strange behavior.

After analyzing the data for a long time, I noticed the "events waited on" section in the tkprof output, which seemed a little out of place. There were a lot of waits on "direct path read" on instances 2 and 3, whereas there were no such wait events on node 1.

If you do some research on the "direct path read" wait event, you will learn that the query results were skipping the buffer cache and being loaded directly into the PGA. Since the blocks were never going through the buffer cache, they were never cached, and the query was doing physical reads each time it was executed.

Now the question arises: how come the query results got cached on node 1 but not on the other 2? It is still a mystery, but I used the following event to turn off direct path reads on all the nodes, and it resolved the issue.

alter system set events '10949 trace name context forever' ;
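Two hedged alternatives for the same effect (underscore parameters should only be changed under Oracle Support guidance):

```sql
-- Make the event persistent across restarts via the spfile
ALTER SYSTEM SET EVENT='10949 trace name context forever' SCOPE=SPFILE;

-- Or disable serial direct path reads via the hidden parameter
ALTER SYSTEM SET "_serial_direct_read"=NEVER;
```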

Interestingly, this issue didn't happen at the time of the upgrade from 10.2 to 11.2 but months later, after a database bounce.

The trade-off here is that you need to have db_cache_size large enough to accommodate the results of all frequently executed queries; otherwise they will age out of the cache. If you are using AMM, you need to set a minimum size for db_cache_size along with the memory_target and memory_max_target parameters.
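A minimal sketch of the AMM setup described above; the sizes are hypothetical and must be tuned for your workload:

```sql
-- With AMM, memory_target governs the total; an explicit db_cache_size
-- acts as a floor for the buffer cache rather than a fixed size
ALTER SYSTEM SET memory_max_target=8G SCOPE=SPFILE;
ALTER SYSTEM SET memory_target=8G SCOPE=SPFILE;
ALTER SYSTEM SET db_cache_size=4G SCOPE=SPFILE;
```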


If you want to create the tkprof report, use the following procedure.

  • First create the 10046 trace using following
alter session set tracefile_identifier='10046_trace';
alter session set timed_statistics = true;  
alter session set statistics_level=all;  
alter session set max_dump_file_size = unlimited;  
alter session set events '10046 trace name context forever,level 12'; 
 -- Execute the queries or operations to be traced here --  
select * from dual;  
exit;

If the session is not exited then the trace can be disabled using:
alter session set events '10046 trace name context off';
  
  • Generate tkprof reports from the raw SQL traces in the USER_DUMP_DEST:

    tkprof 10046_trcfile output_filename sort=execpu,exedsk explain=user/password waits=yes


Please feel free to contact me in case there are any questions or suggestions.

               

Friday, March 29, 2013

ORA-15564: contents of the replay directory provided to the workload replay client do not match with the replay directory provided to the database server

This can happen for many different reasons. Some of the reasons I have noticed while running a DB Replay are:

1) During connection remapping, you are not providing the same connection string/service name to the replay client.
2) The replay directory is not accessible from all the instances.
3) The replay workload was not pre-processed, or it was pre-processed on a different database version than the target database.
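For reasons 2 and 3, a quick sanity check before starting the replay clients is to point WRC at the replay directory; a hedged example (the directory path is hypothetical):

```text
# Estimates how many replay clients/hosts the workload needs, and fails
# early if the directory contents are unreadable or not pre-processed
wrc mode=calibrate replaydir=/shared/replay

# Then start the actual replay client against the same directory
wrc system/oracle mode=replay replaydir=/shared/replay
```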