Thursday, December 12, 2013

Problem with "with clause" queries after upgrade to 11.2.0.3

We identified this issue after upgrading the database from 11.2.0.2 to 11.2.0.3, when the performance of the entire database was impacted. This was a data warehouse database, and there were lots of queries generated from OBIEE using the "with clause". To work around the issue quickly, we switched the optimizer_features_enable parameter back to 11.2.0.2. Then came identifying the issue with the 11.2.0.3 optimizer. We ran a 10053 trace on the problem query with optimizer_features_enable set to 11.2.0.3 and again with 11.2.0.2, and compared the two traces. It turned out that with 11.2.0.3 the CBQT (cost based query transformation) was failing its validity checks, causing the issue. Later it was identified to be a problem with bug fix 11740670. To solve the issue we did the following:

Alter system set "_fix_control"='11740670:OFF';
Alter system set optimizer_features_enable='11.2.0.3';
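For reference, a 10053 trace for a single query can be generated as sketched below (run the problem query in the same session; the trace file is written to the user dump destination):

alter session set tracefile_identifier='10053_trace';
alter session set events '10053 trace name context forever, level 1';
-- run the problem query here
alter session set events '10053 trace name context off';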


11.2.0.3 optimizer (10053 trace excerpt)

Query transformations (QT)
**************************
CBQT: copy not possible on query block SEL$10 (#0) because linked to with clause
CBQT bypassed for query block SEL$10 (#0): Cannot copy query block.
CBQT: Validity checks failed for 98utkpdkpq0af.


11.2.0.2 optimizer (10053 trace excerpt)

Query transformations (QT)
**************************
JF: Checking validity of join factorization for query block SEL$1 (#0)
JF: Bypassed: not a UNION or UNION-ALL query block.

ST: not valid since new CBQT star transformation parameter is FALSE
TE: Checking validity of table expansion for query block SEL$1 (#0)
TE: Bypassed: table expansion disabled.
JF: Checking validity of join factorization for query block SEL$4 (#0)
JF: Bypassed: not a UNION or UNION-ALL query block.
ST: not valid since new CBQT star transformation parameter is FALSE
TE: Checking validity of table expansion for query block SEL$4 (#0)
TE: Bypassed: table expansion disabled.
CBQT: Validity checks passed for 98utkpdkpq0af.

Wednesday, December 11, 2013

ORA-15150: instance lock mode 'EXCLUSIVE' conflicts with other ASM instance(s)



This is a new standalone server using ASM and CRS. The database is not clustered, so the cluster_database parameter and RAC options are turned off for both the DB and ASM, as expected.


SQL> startup nomount

ORA-15150: instance lock mode 'EXCLUSIVE' conflicts with other ASM instance(s)

We were able to start the database using the "srvctl start database -d" command, but not using sqlplus.

The cause of the problem was some parameter in the spfile used by the DB instance when we were using sqlplus (I suspected it to be instance_number).

To resolve the issue we did the following.
Since the DB was able to start using the srvctl command, we started it that way and created a pfile from the spfile. We then used this pfile to start the instance using sqlplus.
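As a sketch of those steps (the pfile path below is just an example):

SQL> create pfile='/tmp/initdb.ora' from spfile;
-- edit /tmp/initdb.ora and remove or correct the suspect parameter
SQL> shutdown immediate
SQL> startup pfile='/tmp/initdb.ora'
-- once the instance starts cleanly, recreate the spfile
SQL> create spfile from pfile='/tmp/initdb.ora';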

In a cluster environment this issue can also be caused by the following:
  • The cluster_database parameter is set to false in ASM/database
  • The RAC option is not turned on. To turn on the RAC option, use the following Oracle note:

How to Check Whether Oracle Binary/Instance is RAC Enabled and Relink Oracle Binary in RAC (Doc ID 284785.1)  

Tuesday, December 10, 2013

Unidirectional Goldengate replication 12c on same host and same pluggable database

After the release of GoldenGate 12c we hear two terms in GG replication: integrated and classic replication. In short, integrated replication is applicable to 12c databases, and classic is for older database versions like 11.2.0.4. I have configured a simple setup to test the replication.
Integrated replicat is tightly integrated with the Oracle database. Some features of Oracle Streams are also carried over, which is why we see the apply process, which comes from Streams, in the integrated replicat.

Environment:
Container database: cdbfs
Pluggable database: pdb1
Replication between schema src to schema tgt on the same pluggable database pdb1 


1) First we need to configure a GoldenGate capture user and assign it the necessary privileges. This user has to be created in the root (container) database.

CREATE USER c##ggadm IDENTIFIED BY oracle CONTAINER=ALL;


grant dba to c##ggadm container=all;

exec dbms_goldengate_auth.grant_admin_privileges('c##ggadm','capture', grant_optional_privilege=>'*') 


2) Log in to SQL*Plus as a user with ALTER SYSTEM privilege.

Issue the following command to determine whether the database is in supplemental logging mode and in forced logging mode. If the result is YES for both queries, the database meets the Oracle GoldenGate requirement.

SELECT supplemental_log_data_min, force_logging FROM v$database;

If the result is NO for either or both properties, continue with these steps to enable them as needed:

SQL> ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
SQL> ALTER DATABASE FORCE LOGGING;

Issue the following command to verify that these properties are now enabled.

SELECT supplemental_log_data_min, force_logging FROM v$database;

The output of the query must be YES for both properties.

Switch the log files.

SQL> ALTER SYSTEM SWITCH LOGFILE;


3) Adjust the streams pool size as per your requirement.

alter system set streams_pool_size=25m;

4) Create the extract process. One extract process can capture changes from multiple pluggable databases, so you can specify multiple PDBs with the help of the "sourcecatalog" parameter in the param file. I have specified pdb1.
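For example, if the same extract were also to capture from a second PDB (pdb2 and the schema hr below are hypothetical), the parameter file could simply repeat the SOURCECATALOG clause:

SOURCECATALOG pdb1
TABLE src.*;
SOURCECATALOG pdb2
TABLE hr.*;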

GGSCI (localhost.localdomain) 4> add extract ext1,tranlog,begin now
EXTRACT added.

  • Register the extract with the pluggable database. This is new in 12c, and you will have to log in to the container database first to do that:
dblogin userid c##ggadm password oracle

GGSCI (localhost.localdomain) 16> REGISTER EXTRACT ext1 database CONTAINER (pdb1)

Extract EXT1 successfully registered with database at SCN 2170346.

5) Add the extract trail and modify the parameter file.

GGSCI (localhost.localdomain) 7> add exttrail /home/oracle/gg/dirdat/ex,extract ext1
EXTTRAIL added.

GGSCI (localhost.localdomain) 9> edit param ext1

extract EXT1
USERID C##ggadm@cdb_fs, PASSWORD oracle
LOGALLSUPCOLS
UPDATERECORDFORMAT COMPACT
DDL INCLUDE MAPPED SOURCECATALOG pdb1
exttrail /home/oracle/gg/dirdat/ex
SOURCECATALOG pdb1
TABLE SRC.*;

6) Add the pump process and modify the parameter file.

 ADD EXTRACT EPUMP, EXTTRAILSOURCE /home/oracle/gg/dirdat/ex, begin now

ADD RMTTRAIL /home/oracle/gg/dirdat/rt, EXTRACT EPUMP

EXTRACT EPUMP
PASSTHRU
RMTHOST localhost,MGRPORT 7809
RMTTRAIL /home/oracle/gg/dirdat/rt
SOURCECATALOG pdb1
TABLE SRC.*;

7) Add the replicat process. Log in to the database first, then add the checkpoint table. Note that the replicat reads the trail written by the pump (/home/oracle/gg/dirdat/rt).

dblogin userid c##ggadm password oracle

add checkpointtable c##ggadm.ctable

ADD REPLICAT REP1, EXTTRAIL /home/oracle/gg/dirdat/rt, CHECKPOINTTABLE c##ggadm.ctable

register replicat rep1 database

edit param rep1

REPLICAT REP1
USERID C##ggadm@pdb1, PASSWORD oracle
ASSUMETARGETDEFS
DISCARDFILE /home/oracle/gg/dirrpt/rep1.dsc,APPEND
DDLOPTIONS REPORT
HANDLECOLLISIONS
APPLYNOOPUPDATES
DDL
SOURCECATALOG pdb1
MAP SRC.*, TARGET TGT.*;


GGSCI (localhost.localdomain) 10> start *


GGSCI (localhost.localdomain) 11> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     EPUMP       00:00:00      00:00:09
EXTRACT     RUNNING     EXT1        00:00:06      00:00:05
REPLICAT    RUNNING     REP1        00:00:00      00:00:09


8) Testing

I am creating a table t1 in the src schema and populating it with some values.

[oracle@localhost gg]$ sqlplus src/oracle@pdb1

SQL*Plus: Release 12.1.0.1.0 Production on Tue Dec 10 09:10:57 2013

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Last Successful login time: Wed Dec 04 2013 15:26:51 -07:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> create table t1(c1 number);

Table created.

SQL> insert into t1 values('&val');
Enter value for val: 2
old   1: insert into t1 values('&val')
new   1: insert into t1 values('2')

1 row created.

SQL> /
Enter value for val: 3
old   1: insert into t1 values('&val')
new   1: insert into t1 values('3')

1 row created.

SQL> /
Enter value for val: 4
old   1: insert into t1 values('&val')
new   1: insert into t1 values('4')

1 row created.

SQL> commit;

The same table gets populated into the tgt schema as shown below.

[oracle@localhost ~]$ sqlplus tgt/oracle@pdb1

SQL*Plus: Release 12.1.0.1.0 Production on Tue Dec 10 09:12:12 2013

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> select table_name from user_tables;

TABLE_NAME
--------------------------------------------------------------------------------
T1


SQL> select * from t1;

        C1
----------
         2
         3
         4


Tuesday, August 13, 2013

ORA-00600: internal error code, arguments: [sorput_1], [], [], [], [], [], [], []

ORA-00060 and ORA-600 [sorput_1] errors in the alert log were causing a lot of TM locks, and the database hung.

The cause was identified as a bitmap index causing the TM locks.

Resolution: drop the bitmap index.

Thursday, August 8, 2013

11gR2 :Getting CRS-2718 while trying to relocate the instance resource


The customer has a single-instance database on a 2-node cluster, running one instance at a time on the given 2 nodes. He wants to relocate the instance from node 2 to node 1 but gets the following error while relocating.

8:21:34 $ $CRS_HOME/bin/crsctl relocate resource ora.ABC.inst -n host11  -f
CRS-2718: Server 'host1' is not a hosting member of resource 'ora.ABC.inst'
CRS-4000: Command Relocate failed, or completed with errors.

On examining the output of "crsctl stat resource ora.ABC.inst -f", I found that HOSTING_MEMBERS was set to one node only:

HOSTING_MEMBERS=HOST1


Resolution

The resolution was to modify the HOSTING_MEMBERS string to a space-delimited list of host names:

crsctl modify resource ora.ABC.inst -attr "HOSTING_MEMBERS='HOST1 HOST2'"

Monday, August 5, 2013

11gR2 :Query running fast on one node of RAC database while running very slow on the other 2 nodes

I came across this very interesting and unique problem recently. A query running on one of the RAC instances (version 11.2.0.3) was performing well, whereas on the other 2 instances it was performing pretty badly. The query was returning results in sub-seconds on one node, whereas on the other 2 nodes it was taking a long time (20-30 secs). I took a 10046 trace from both instances and created tkprof reports. From the tkprof output, on node 1 disk reads were 0, whereas on node 2 there were 167006 disk reads. That means the query was hitting the buffer cache on node 1 all the time, whereas on nodes 2 and 3 it was doing physical reads every time, no matter how many times the query was run.

Following points to note

  • The execution plan didn't change; it remained the same across all the instances.
  • Stats had not been collected for a long time, but that was another issue which we addressed later.
  • The queries were doing full table scans (the filter predicate was the most popular value, so it was obvious for the queries to go for FTS; the FTS itself was not the issue).
  • db_cache_size is the same on all 3 instances.

What changed?

The database was bounced in rolling fashion after some maintenance.

We ran the query in question multiple times in the hope that the data would get cached if we kept running it again and again, but that did not help. It was very strange behavior.

After analyzing the data for a long time, I noticed the "events waited on" section in the tkprof report, which seemed a little out of place. There were a lot of waits on "direct path read" on instances 2 and 3, whereas there were no such wait events on node 1.

If you do some research on the "direct path read" wait event, you will learn that the query results were skipping the buffer cache and being loaded directly into the PGA. Since the query results never went through the buffer cache, they were never cached, and the query did physical reads each time it was executed.
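This can also be confirmed from the database itself by checking the session-level wait events (a sketch; the sid value 123 below is a placeholder for the actual session id):

select event, total_waits, time_waited
from   v$session_event
where  sid = 123
and    event like 'direct path read%';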

Now the question arises: how come the query results got cached on node 1 but not on the other 2? It is still a mystery, but I used the following event to turn off direct path reads on all the nodes, and it resolved the issue.

alter system set events '10949 trace name context forever' ;

Interestingly, this issue didn't happen at the time of the upgrade from 10.2 to 11.2, but months later, after a database bounce.

The trade-off here is that you need to have db_cache_size large enough to accommodate the results of all frequently executed queries; otherwise they will age out of the cache. In case you are using AMM, you need to set a minimum size for db_cache_size along with the parameters memory_target and memory_max_target.
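For example, assuming AMM with hypothetical sizes, a minimum buffer cache can be guaranteed like this (with memory_target set, an explicit db_cache_size acts as a floor rather than a fixed size):

alter system set memory_target=8g scope=spfile;
alter system set db_cache_size=2g scope=spfile;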


If you want to create a tkprof report, use the following procedure.

  • First create the 10046 trace:
alter session set tracefile_identifier='10046_trace';
alter session set timed_statistics = true;  
alter session set statistics_level=all;  
alter session set max_dump_file_size = unlimited;  
alter session set events '10046 trace name context forever,level 12'; 
 -- Execute the queries or operations to be traced here --  
select * from dual;  
exit;

If the session is not exited then the trace can be disabled using:
alter session set events '10046 trace name context off';
  
  • Generate tkprof reports from the raw SQL traces in the USER_DUMP_DEST:

    tkprof "10046_trcfile" "output_filename" sort=execpu,exedsk explain=user/password waits=yes


Please feel free to contact me in case of any questions or suggestions.

               

Friday, March 29, 2013

ORA-15564: contents of the replay directory provided to the workload replay client do not match with the replay directory provided to the database server

This can happen for many different reasons. Some of the reasons I have noticed while running a DB replay are:

1) During connection remapping, you are not providing the same connection string/service name to the replay client.
2) The replay directory is not accessible from all the instances.
3) The replay workload is not pre-processed, or it was pre-processed on a different version of the database than the target database.
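For reference, the replay client is started with the wrc utility, pointing at the same replay directory the server uses (the credentials and path below are examples):

wrc system/password mode=replay replaydir=/u01/app/oracle/replay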

ORA-00600: internal error code, arguments: [kdBlkCheckError], [75], [409240], [18018], [], [], [], [], [], [], [], []

I was working on Real Application Testing on an Exadata box, and we faced the following error while running the test. The change made to the database:


  • We flashed back the database to guaranteed restore point after doing a Database replay.


ORA-00600: internal error code, arguments: [kdBlkCheckError], [75], [409240], [18018], [], [], [], [], [], [], [], []

On examining the alert log I noticed the following error. It points to corruption in tablespace ABC. There is no segment name or segment owner reported, so my initial thought was that the corrupt block was actually free space in this locally managed tablespace. I ran dbverify, RMAN validate and some other tests to identify the location of this corrupt block, but all in vain. Then I looked at the trace file to find more details of the corruption. The corruption actually happened at the time of the flashback, and it was in the bitmap free space. The workaround was to restart the flashback and rebuild the bitmap segment.


Corrupt Block Found
         TSN = 12, TSNAME = ABC
         RFN = 75, BLK = 409240, RDBA = -998346247
         OBJN = 2, OBJD = -1, OBJECT = , SUBOBJECT =
         SEGMENT OWNER = , SEGMENT TYPE =


  • The ORA-600 error arguments are explained as below:

arg[a]  --- File #
arg[b]  --- Block #
arg[c]  --- Application

So in case you are looking for a solution to other ORA-600 issues, you should search using ORA-600 and arg[c].

Workaround

1)
srvctl stop instance  -d rac -i rac1
srvctl start instance -d rac -i rac1 -o MOUNT
sqlplus / as sysdba
 Alter Database Flashback OFF;
 Alter Database Flashback ON;
 exit
srvctl stop instance  -d rac -i rac1
srvctl start database -d rac

2) EXECUTE DBMS_SPACE_ADMIN.TABLESPACE_REBUILD_BITMAPS('ABC');

Monday, February 18, 2013

Start dbconsole for multiple databases on same host

If you want to start the dbconsole for multiple databases on the same host, the parameters ORACLE_UNQNAME and ORACLE_HOSTNAME need to be set along with ORACLE_SID for each particular database.


[oracle@localhost ~]$ export ORACLE_SID=migrate
[oracle@localhost ~]$ export ORACLE_HOSTNAME=localhost
[oracle@localhost ~]$ export ORACLE_UNQNAME=migrate


[oracle@localhost ~]$ emctl start dbconsole
Oracle Enterprise Manager 11g Database Control Release 11.2.0.1.0
Copyright (c) 1996, 2009 Oracle Corporation.  All rights reserved.
https://localhost:5501/em/console/aboutApplication
Starting Oracle Enterprise Manager 11g Database Control ................................. started.
------------------------------------------------------------------
Logs are generated in directory /u01/app/oracle/product/11.1.0/db_1/localhost_migrate/sysman/log



Thursday, February 7, 2013

ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"

This error can happen for many different reasons. When it happens you may need to think about what has changed in your environment. It was on my testing machine that I started getting this error, and I was wondering what could have gone wrong. I searched on the internet but was not able to find anything matching my error and environment. I am using a standalone 11.2 machine on OEL5 with the Oracle Restart feature enabled. From 11.2 onwards, ASM is part of the grid infrastructure, and a diskgroup resource is always there under the clusterware. This diskgroup houses the most important file for the ASM instance, i.e. the SPFILE. This diskgroup resource (ora.DATA.dg) was not coming up, and it complained of ORA-15063 when I tried to start it manually.



[oracle@localhost ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  OFFLINE      localhost
ora.LISTENER.lsnr
               ONLINE  ONLINE       localhost
ora.LISTENER_GRID.lsnr
               ONLINE  ONLINE       localhost
ora.asm
               ONLINE  OFFLINE      localhost                Instance Shutdown
ora.eons
               OFFLINE OFFLINE      localhost
ora.ons
               OFFLINE OFFLINE      localhost
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        ONLINE  ONLINE       localhost
ora.diskmon
      1        ONLINE  ONLINE       localhost
ora.migrate.db
      1        ONLINE  OFFLINE                               Instance Shutdown
ora.orcl.db
      1        OFFLINE OFFLINE
ora.orcl.test.svc
      1        OFFLINE OFFLINE
ora.stby.db
      1        OFFLINE OFFLINE



From the above output I can see that the resource ora.DATA.dg, which is the CRS resource of my diskgroup DATA, is offline. When I tried to start it manually, I got the following error.



[oracle@localhost ~]$ crsctl start resource ora.DATA.dg
CRS-2672: Attempting to start 'ora.asm' on 'localhost'
ORA-01012: not logged on
CRS-2676: Start of 'ora.asm' on 'localhost' succeeded
CRS-2672: Attempting to start 'ora.DATA.dg' on 'localhost'
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"
CRS-2674: Start of 'ora.DATA.dg' on 'localhost' failed
CRS-2679: Attempting to clean 'ora.DATA.dg' on 'localhost'
CRS-2681: Clean of 'ora.DATA.dg' on 'localhost' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'localhost'
CRS-2677: Stop of 'ora.asm' on 'localhost' succeeded
CRS-4000: Command Start failed, or completed with errors.




Clearly from the above output I can see that I am not able to access the diskgroup DATA, and this error suggests there could be some issue with the disks, like corruption or missing metadata, but that's not true in my case. This error message is misleading.

When I try to start the ASM instance manually, I am able to do so, but with an error that no spfile was found.



[oracle@localhost ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.1.0 Production on Thu Feb 7 02:05:37 2013

Copyright (c) 1982, 2009, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORA-00099: warning: no parameter file specified for ASM instance
ASM instance started

Total System Global Area  283930624 bytes
Fixed Size                  2212656 bytes
Variable Size             256552144 bytes
ASM Cache                  25165824 bytes
ORA-15110: no diskgroups mounted


SQL> show parameter asm_

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskgroups                       string
asm_diskstring                       string
asm_power_limit                      integer     1
asm_preferred_read_failure_groups    string
SQL>


So I needed to investigate my ASM resource further, and I looked at the ASM resource profile. Suddenly I remembered that I had removed and re-added the ASM resource a day before. Following is the output of the ASM resource profile, and you can clearly see that SPFILE and ASM_DISKSTRING are blank. The point is that I didn't specify the ASM disk string at the time of adding the ASM resource, and that was my mistake.




[oracle@localhost ~]$ crsctl stat resource ora.asm -f
NAME=ora.asm
TYPE=ora.asm.type
STATE=ONLINE
TARGET=ONLINE
ACL=owner:oracle:rwx,pgrp:oinstall:rwx,other::r--
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
ALIAS_NAME=
ASM_DISKSTRING=              <<<<<<<<<<<<<<
AUTO_START=restore
CHECK_INTERVAL=1
CHECK_TIMEOUT=600
CREATION_SEED=83
CURRENT_RCOUNT=0
DEFAULT_TEMPLATE=PROPERTY(RESOURCE_CLASS=asm) ELEMENT(INSTANCE_NAME= %GEN_USR_ORA_INST_NAME%)
DEGREE=1
DESCRIPTION=Oracle ASM resource
ENABLED=1
FAILURE_COUNT=0
FAILURE_HISTORY=
GEN_USR_ORA_INST_NAME=+ASM
ID=ora.asm
INCARNATION=0
LAST_FAULT=0
LAST_RESTART=0
LAST_SERVER=
LOAD=1
LOGGING_LEVEL=1
NLS_LANG=
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
PROFILE_CHANGE_TEMPLATE=
RESTART_ATTEMPTS=5
SCRIPT_TIMEOUT=60
SPFILE=    <<<<<<<<<<<<<
START_DEPENDENCIES=hard(ora.cssd) weak(ora.LISTENER.lsnr)
START_TIMEOUT=900
STATE_CHANGE_TEMPLATE=
STATE_CHANGE_VERS=0
STATE_DETAILS=
STOP_DEPENDENCIES=hard(ora.cssd)
STOP_TIMEOUT=600
UPTIME_THRESHOLD=1d
USR_ORA_ENV=
USR_ORA_INST_NAME=+ASM
USR_ORA_OPEN_MODE=mount
USR_ORA_OPI=false
USR_ORA_STOP_MODE=immediate
VERSION=11.2.0.1.0



Solution:  

The solution was to modify, or remove and re-add, the ASM resource with the correct asm_diskstring path. Everything came online after that. You can modify the resource in case you don't want to remove it:

srvctl modify asm -d /dev/oracleasm/disks

and then stop and start ASM.

OR

[oracle@localhost ~]$ srvctl stop asm
[oracle@localhost disks]$ srvctl remove asm
[oracle@localhost disks]$  srvctl add asm -d /dev/oracleasm/disks
[oracle@localhost disks]$ srvctl start asm


[oracle@localhost disks]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       localhost
ora.LISTENER.lsnr
               ONLINE  ONLINE       localhost
ora.LISTENER_GRID.lsnr
               ONLINE  ONLINE       localhost
ora.asm
               ONLINE  ONLINE       localhost                Started
ora.eons
               OFFLINE OFFLINE      localhost
ora.ons
               OFFLINE OFFLINE      localhost
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        ONLINE  ONLINE       localhost
ora.diskmon
      1        ONLINE  ONLINE       localhost
ora.migrate.db
      1        ONLINE  OFFLINE                               Instance Shutdown
ora.orcl.db
      1        OFFLINE OFFLINE
ora.orcl.test.svc
      1        OFFLINE OFFLINE
ora.stby.db
      1        OFFLINE OFFLINE