Monday, April 23, 2018

ORA-01013: user requested cancel of current operation


If ORA-01013 appears in the alert log file, it usually means that your application is closing a session forcefully, and the database records this as an action canceled by the user/application (as the description of this error suggests). In most cases this is not something that can be handled from the database side to stop the message from appearing in the alert log file. Sometimes, however, this error appears because Oracle itself kills a call that exceeds a predefined time limit. I will provide further detail below.
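The simplest way to see this error in action is to cancel a long-running query yourself from SQL*Plus; a minimal sketch (the cross join below is only there to keep the session busy):
SQL> SELECT COUNT(*) FROM dba_objects a, dba_objects b;
-- press Ctrl+C while the query is still running; SQL*Plus reports:
ORA-01013: user requested cancel of current operation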

Tuesday, April 17, 2018

Error 12154 received logging on to the standby

There are several error messages related to a Data Guard environment, and many of them have been discussed on my blog; you can find links to those articles at the bottom of this article. In this article, I will explain error 12154, which I received in one of my Data Guard environments. This error message may appear in the alert log file of your primary database and would look similar to the following.
Error 12154 received logging on to the standby
FAL[client, USER]: Error 12154 connecting to <FAL_SERVER> for fetching gap sequence
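ORA-12154 is a TNS name-resolution error, so a quick first check (a sketch; MY_STANDBY_DB stands for whatever alias your FAL_SERVER and log_archive_dest_2 settings point to) is to confirm the alias resolves from the primary host:
SQL> SHOW PARAMETER fal_server
SQL> SHOW PARAMETER log_archive_dest_2
$ tnsping MY_STANDBY_DB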

Friday, April 13, 2018

Error 1033 received logging on to the standby



Troubleshooting a Data Guard environment is sometimes not very straightforward. If you find that the physical standby database is out of sync with the primary database, the first thing to do is check the alert log files of the primary and standby databases for any error messages. For example, if error 1033 is reported, you may see entries similar to the following in the primary database’s alert log file.
Mon Feb 06 00:10:40 2017
Error 1033 received logging on to the standby
PING[ARC1]: Heartbeat failed to connect to standby 'MY_STANDBY_DB'. Error is 1033.
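Error 1033 (ORA-01033: ORACLE initialization or shutdown in progress) usually points to the standby instance or its listener not being ready to accept connections. A quick first check on the standby host might look like this (a sketch, not the full diagnosis covered in the article):
$ lsnrctl status
SQL> SELECT status FROM v$instance;
SQL> SELECT open_mode, database_role FROM v$database;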

Monday, April 2, 2018

Starting and Stopping Grid Infrastructure on a Standalone GI Installation



This document explains how to start and stop an Oracle cluster. For a standalone Grid Infrastructure installation, the commands to start and stop the GI services are slightly different. Using the same commands as in RAC to start and stop the GI resources would return errors as follows:
[root@~]$ /u01/app/11203/grid/bin/crsctl stop crs
CRS-4013: This command is not supported in a single-node configuration.
CRS-4000: Command Stop failed, or completed with errors.
[root@ ~]$ /u01/app/11203/grid/bin/crsctl stop cluster -all
CRS-4013: This command is not supported in a single-node configuration.
CRS-4000: Command Stop failed, or completed with errors.
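On a standalone Grid Infrastructure (Oracle Restart) installation, the equivalent operations use the High Availability Services stack rather than the full clusterware stack; a minimal sketch (using the same GI home path as above):
[root@ ~]$ /u01/app/11203/grid/bin/crsctl stop has
[root@ ~]$ /u01/app/11203/grid/bin/crsctl start has
[root@ ~]$ /u01/app/11203/grid/bin/crsctl status resource -t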

Monday, March 26, 2018

Finding IO statistics in Linux Environment for Slow Disks


I have written two articles (AWR 1, AWR 2) about disk latency and how to read AWR reports to investigate IO slowness. In this article I will explain how we check disk IO performance on Linux systems. We will check how the disks where our datafiles and redo log files are stored are performing. These disks could be simple Linux mount points, or ASM disks. In the case of ASM, we first need to find out which disks are part of the ASM diskgroups so that we can check the performance of those disks. For example, the following shows how we can find the disks that belong to an ASM diskgroup.
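As a sketch, connected to the ASM instance, the disk paths behind each diskgroup can be listed from the V$ASM views, and their throughput then watched with iostat (the 5-second interval is just an example):
SQL> SELECT g.name diskgroup, d.path FROM v$asm_disk d
     JOIN v$asm_diskgroup g ON d.group_number = g.group_number ORDER BY 1, 2;
$ iostat -xm 5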

Monday, March 19, 2018

Reading and Understanding AWR Report for IO or Disk latency - 2



This is the second article about investigating IO latency issues using AWR; the first article can be found here. In this article I will further explain checking IO latencies at the OS level in Linux.

Log file sync

We had a production server running on a virtual machine (VMware), and after a downtime we started receiving complaints about a slow database. The AWR report showed that the “log file sync” wait event, which comes under the COMMIT wait class, was at the top, and the database was spending more than 30% of its time on log file sync waits. The log file sync wait event can be observed in a very busy OLTP database, but it should not consume as much time as we were seeing, and should normally be found near the bottom of the list of top wait events.
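Outside the AWR report, the cumulative impact of this wait can also be checked directly from the instance; a sketch of such a query (the figures are cumulative since instance startup):
SQL> SELECT event, total_waits, ROUND(time_waited_micro/1000000) seconds_waited,
     ROUND(time_waited_micro/total_waits/1000, 2) avg_wait_ms
     FROM v$system_event WHERE event = 'log file sync';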

Monday, March 12, 2018

Reading and Understanding AWR Report for IO or Disk latency - 1

Recently I performed a failover of my Oracle database (running on Linux) to my standby database, and after the failover the application team started complaining about extreme slowness. I was using OEM Cloud Control; its graph was showing high waits for “free buffer waits” and the alert log started showing “Checkpoint not complete”. Since I had never seen these waits on my previous primary server (now the standby), the first thing that came to my mind was that the disks on the standby server (now the primary) were probably very slow, because the hardware of my servers was very old. The servers also had internal disks (not SAN or NAS). I generated an AWR report for a time when the database was running fine without any performance issue, and another report based on the latest snapshots to see what was going wrong with the IO.
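For reference, both reports can be produced from SQL*Plus with the standard AWR script, which prompts roughly as follows (the snapshot IDs depend on the two time windows you want to compare):
SQL> @?/rdbms/admin/awrrpt.sql
Enter value for report_type: html
Enter value for num_days: 2
Enter value for begin_snap: <snapshot id at the start of the window>
Enter value for end_snap: <snapshot id at the end of the window>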

Tuesday, March 6, 2018

ORA-00742: Log read detects lost write in thread %d sequence %d block %d

During real-time apply on one of my physical standby RAC databases, the managed recovery process crashed with this error; following is the entry from the alert log file.
CORRUPTION DETECTED: In redo blocks starting at block 169592 count 142 for thread 4 sequence 157
Sat Jul 02 19:12:25 2016
MRP0: Background Media Recovery terminated with error 742
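In cases like this the damage is often limited to a corrupt copy of one archived or standby redo log on the standby; once the affected log has been re-fetched or re-copied from the primary, managed recovery can be restarted. The restart itself is just the standard MRP commands (a sketch; the article covers the actual root-cause handling):
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;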

Monday, February 26, 2018

Oracle Patching in Multitenant Environment with Minimal Downtime

Starting with 12c, it is even easier to patch/upgrade existing databases if you have already adopted the multitenant architecture. To patch a multitenant environment, we can install a new Oracle home, patch it to the level we want, and then run the required post-patch scripts in the container database (CDB). At this point you will have two Oracle homes: the new patched home, and the old home where the current pluggable database(s) keep running (they remain available while we install and patch the new Oracle home).
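A rough sketch of the final switch, assuming the new home is /u01/app/oracle/product/12.2.0/dbhome_2 (a hypothetical path) and the CDB has already been restarted from it, is to let datapatch apply the SQL portion of the patches:
$ export ORACLE_HOME=/u01/app/oracle/product/12.2.0/dbhome_2
$ export PATH=$ORACLE_HOME/bin:$PATH
$ cd $ORACLE_HOME/OPatch
$ ./datapatch -verbose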

Sunday, February 18, 2018

Unplugging and Plugging in of a Pluggable Database

In a multitenant environment we can unplug a pluggable database (PDB) from a container database (CDB) and then plug it into another CDB. After we unplug a PDB, it needs to be dropped from its CDB as it becomes unusable in that CDB. We can also plug the same PDB back into the same CDB. In this article I will explain how we perform this operation. I will unplug a database and plug it back into the same CDB; the method for plugging it into a different CDB is essentially the same.
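The core statements look roughly like the following; pdb1 and the XML manifest path are placeholders used only for this sketch:
SQL> ALTER PLUGGABLE DATABASE pdb1 CLOSE IMMEDIATE;
SQL> ALTER PLUGGABLE DATABASE pdb1 UNPLUG INTO '/u01/app/oracle/pdb1.xml';
SQL> DROP PLUGGABLE DATABASE pdb1 KEEP DATAFILES;
SQL> CREATE PLUGGABLE DATABASE pdb1 USING '/u01/app/oracle/pdb1.xml' NOCOPY TEMPFILE REUSE;
SQL> ALTER PLUGGABLE DATABASE pdb1 OPEN;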

Monday, February 5, 2018

Point in Time Recovery of a Pluggable Database

In this article I will explain how we can perform a point-in-time recovery of a single pluggable database. Point-in-time recovery of a single pluggable database does not have any effect on the other pluggable databases in the same container database, or on the container database itself. In the following, a real-world scenario of pluggable database point-in-time recovery is explained. Following are the points to consider for this scenario.
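The RMAN skeleton for such a recovery is sketched below; pdb1, the timestamp, and the auxiliary destination are placeholders for illustration:
RMAN> RUN {
  SET UNTIL TIME "TO_DATE('2018-02-05 10:00:00','YYYY-MM-DD HH24:MI:SS')";
  RESTORE PLUGGABLE DATABASE pdb1;
  RECOVER PLUGGABLE DATABASE pdb1 AUXILIARY DESTINATION '/u01/aux';
}
SQL> ALTER PLUGGABLE DATABASE pdb1 OPEN RESETLOGS;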

Monday, January 29, 2018

Testing ASM Disk Failure Scenario and disk_repair_time


When a disk failure occurs for an ASM disk, the behavior of ASM depends on what kind of redundancy is in use for the diskgroup. If the diskgroup has EXTERNAL REDUNDANCY, it keeps working only if you have redundancy at the external RAID level. If there is no RAID at the external level, the diskgroup immediately gets dismounted, the disk needs to be repaired or replaced, the diskgroup might then need to be dropped and re-created, and the data on this diskgroup would require recovery.
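For NORMAL or HIGH redundancy diskgroups, the disk_repair_time attribute controls how long ASM waits before dropping an offlined disk. As a sketch (the DATA diskgroup name and the 8h value are just examples), the current setting can be checked and changed like this:
SQL> SELECT g.name diskgroup, a.name attribute, a.value
     FROM v$asm_attribute a JOIN v$asm_diskgroup g ON a.group_number = g.group_number
     WHERE a.name = 'disk_repair_time';
SQL> ALTER DISKGROUP data SET ATTRIBUTE 'disk_repair_time' = '8h';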

Monday, January 22, 2018

ORA-00845: MEMORY_TARGET not supported on this system

If we face ORA-00845 during database startup, it means that the /dev/shm file system is not configured with the size required to start the database instance. We need to mount /dev/shm with a size equal to or greater than the total memory we want to allocate to all instances/SGAs (ASM as well as database instances) that will run on this host.
You may find the following type of warning in the alert log file.
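Beyond the warning itself, a quick sketch of the fix (the 8G size below is only an example; use a value at least as large as the combined MEMORY_TARGET/SGA of all instances on the host) is to resize /dev/shm online and persist the change in /etc/fstab:
[root@host ~]# mount -o remount,size=8G /dev/shm
[root@host ~]# grep shm /etc/fstab
tmpfs   /dev/shm   tmpfs   defaults,size=8G   0 0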

Monday, January 8, 2018

Slow RMAN Performance for CROSSCHECK and DELETE OBSOLETE

I faced an issue recently where my CROSSCHECK and DELETE OBSOLETE commands were extremely slow to execute; in fact, they appeared to hang. The reason for this issue was an IP address change of my NFS server, from which a share was mounted on the database server and used as the backup destination. After the NFS server’s IP address change, RMAN would still try to reach the old IP address to look for the old/obsolete backup pieces whenever we executed the CROSSCHECK and DELETE OBSOLETE commands.
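A quick way to confirm a stale NFS mount and then re-run the maintenance might look like this (the /backup mount point is a placeholder; df may hang while the old IP is unreachable):
$ mount -t nfs
$ df -h /backup
After correcting the NFS server entry in /etc/fstab, remount the share and retry:
# umount -l /backup
# mount /backup
RMAN> CROSSCHECK BACKUP;
RMAN> DELETE NOPROMPT OBSOLETE;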
