Print Friendly, PDF & Email

 
SYMPTOMS

The Oracle Enterprise Manage Cloud Control 13c is used to monitor the Oracle database. The Failed Logins Metric is enabled and by default executed every 30 minutes. The related metric queries with SQL_ID 83jcj6at6ugq3 (12c) and 1c74605ddbvjt (11gR2) are very slow and causes high IO in the monitored database.

 
 
RELATED
 
HOW TO FIND A QUERY THAT IS EXECUTED BY A DATABASE METRIC IN A TARGET DATABASE

 
 
DIAGNOSE
 
The Failed Logins Metric is enabled for the monitored database

The Failed Logins Metric executes in the monitored database the query with SQL_ID 83jcj6at6ugq3 every 30 minutes (by default)

83jcj6at6ugq3 (12c)

SELECT TO_CHAR(current_timestamp AT TIME ZONE :"SYS_B_0", :"SYS_B_1") AS curr_timestamp, 
SUM(failed_count) AS failed_count, TO_CHAR(MIN(first_occur_time), :"SYS_B_2") AS first_occur_time, 
TO_CHAR(MAX(last_occur_time), :"SYS_B_3") AS last_occur_time FROM ( SELECT COUNT(username) AS failed_count, 
MIN(timestamp) AS first_occur_time, MAX(timestamp) AS last_occur_time FROM sys.dba_audit_session 
WHERE returncode != :"SYS_B_4" AND timestamp >= current_timestamp - TO_DSINTERVAL(:"SYS_B_5") 
UNION SELECT COUNT(DBUSERNAME) AS failed_count, MIN(event_timestamp) AS first_occur_time, 
MAX(event_timestamp) AS last_occur_time  FROM unified_audit_trail WHERE ACTION_NAME=:"SYS_B_6" 
and return_code <> :"SYS_B_7" AND event_timestamp >= current_timestamp - TO_DSINTERVAL(:"SYS_B_8"))

1c74605ddbvjt (11g)

SELECT TO_CHAR(current_timestamp AT TIME ZONE :"SYS_B_0", :"SYS_B_1") AS curr_timestamp, 
COUNT(username) AS failed_count, TO_CHAR(MIN(timestamp), :"SYS_B_2") AS first_occur_time, 
TO_CHAR(MAX(timestamp), :"SYS_B_3") AS last_occur_time FROM  sys.dba_audit_session 
WHERE returncode != :"SYS_B_4" AND ACTION_NAME = :"SYS_B_5" AND 
timestamp >= current_timestamp - TO_DSINTERVAL(:"SYS_B_6")

 
The query is very slow and most part of the elapsed time of the query is the IO wait time. It can be easily confirmed by either reviewing execution details of the query in the v$sql view

*************************************** : ******************
                                  SQL_ID : 83jcj6at6ugq3
                                  Child# : 0
                           Cursor Status : VALID
                     Is Cursor Obsolete? : N
                          Parsing Schema : DBSNMP
                        System Timestamp : 17-08-2020 17-21-21
                               First Run : 2020-01-06/12:53:04
                        Last Active Time : 17-08-2020 17-19-39
                   Elapsed Time Total(s) : 34318.82
                       CPU Time Total(s) : 3851.97
                   IO Wait Time Total(s) : 33462.37
          Concurrency Wait Time Total(s) : 3.4
          Application Wait Time Total(s) : .95
              Cluster Wait Time Total(s) : 0
                              Executions : 91
                Elapsed Time per Exec(s) : 377.13
                    CPU Time per Exec(s) : 42.33
                IO Wait Time per Exec(s) : 367.72
       Concurrency Wait Time per Exec(s) : .04
       Application Wait Time per Exec(s) : .01
           Cluster Wait Time per Exec(s) : 0
                                         :
                              Disk Reads : 352363760
                             Buffer Gets : 371205986
                           Direct Writes : 0
                           Row Processed : 91
                                         :
                       Number of latches : 0
                             Sql Profile :
                        Sql Command Type : 3
*************************************** : ******************

SQL> select systimestamp from dual;

SYSTIMESTAMP
---------------------------------------------------------------------------
17-AUG-20 05.21.50.070321 PM +03:00

NOTE: According to v$sql view there were 91 executions those took 9.5 hours to complete

 
or by reviewing the section SQL ordered by User I/O Wait Time in an AWR report


 
In this specific example, the IO Wait Time of the query is almost 97,5% of elapsed time. The query is responsible for large part of I/O load (direct path read wait event) on the database therefore decreasing available performance of server.

In this example, the direct path read is almost half time (48,7%) of dbtime.

 
 
CAUSE
 
The reason why the Failed Logins Metric query 83jcj6at6ugq3 is too slow and causes the high IO is that the AUD$ table is big in size in the target database.

SET PAGES 999
SET LINES 300
COL OWNER FOR A10 
COL SEGMENT_NAME FOR A20 
COL SEGMENT_TYPE FOR A15 
COL MB FOR 9999999

SELECT OWNER,
       SEGMENT_NAME,
       SEGMENT_TYPE,
       ROUND(BYTES/1024/1024) MB 
FROM
       DBA_SEGMENTS
WHERE
       SEGMENT_NAME='AUD$';
OWNER      SEGMENT_NAME         SEGMENT_TYPE          MB
---------- -------------------- --------------- --------
SYS        AUD$                 TABLE              28780

 
By design the AUD$ table does not have any indexes but LOB indexes those are to be created for LOB columns by default.

SET PAGES 999
SET LINES 300
COL OWNER FOR A15 
COL INDEX_NAME FOR A30
COL INDEX_TYPE FOR A20

SELECT OWNER,
       INDEX_NAME,
       INDEX_TYPE	   
FROM
       DBA_INDEXES 
WHERE 
       TABLE_OWNER='SYS' AND 
       TABLE_NAME='AUD$';
OWNER           INDEX_NAME                     INDEX_TYPE
--------------- ------------------------------ --------------------
SYS             SYS_IL0000000407C00040$$       LOB
SYS             SYS_IL0000000407C00041$$       LOB

 

As there are no indexes for the AUD$ table performing any SELECT query against that table (Explicitly or Implicitly) will always lead to the TABLE ACCESS FULL operation. The more data in the table the more time is required to complete the FULL TABLE SCAN. The more queries performed against the table the more IO wait events in the database.
As the 83jcj6at6ugq3/1c74605ddbvjt query is executed repeatedly every 30 minutes (by default) and it implicitly accesses the AUD$ table through the SYS.DBA_AUDIT_SESSION view this is to be the reason for the described problem.

 
 
SOLUTION

 
The proper solution is to regularly purge the SYS.AUD$ table. Refer to THE AUD$ TABLE IS VERY LARGE for details about how to purge the table.
If THE AUD$ TABLE IS VERY LARGE does not work for you, eg you need to keep all audit records, then the only way you have is to create additional access paths (indexes).

The Effect Of Creating Index On Table Sys.Aud$ (Doc ID 1329731.1)
Partitioning The AUD$ Table (Doc ID 1379831.1)
Audit Tablespace Datafile Fails To Extend When AUD$ Grows. (Doc ID 1310035.1)
DBUPGDIAG.SQL Takes Long Time On Selecting SYS.AUD$ (Doc ID 1941679.1)
Performance Issues Caused by DB Collector When AUD$ Is Very Large (Full Table Scan on AUD$) (Doc ID 1356088.1)

 
 
As additional optional solution is to disable the Failed Logins Metric in the Oracle Enterprise Manage Cloud Control 13c. There are 2 Failed Logins Metrics : Failed Logins and Failed Logins(Historical). You should disable both

1. Failed Logins:

Target Database -> Monitoring -> Metric and Collection Settings -> Search for "Failed Logins" -> Click on "Every 30 minutes" -> Disable

 
2. Failed Logins (Historical):

Target Database -> Monitoring -> Metric and Collection Settings -> Other Collected Items -> Failed Logins (Historical) -> Click on the Duration "Every 12 hours" -> Disable.

 
 
REFERENCES
 
13.2 EM: “Failed Logins” and “Failed Logins (Historical)” Metrics Keep Running Despite Being Disabled (Doc ID 2459666.1)
EM13c : How to disable Failed Logins Metric Completely? (Doc ID 2429080.1)
EM 12c, 11g: High Cpu Utilization from Enterprise Manger Agent perl Process Running failedLogin.pl for audit_failed_logins metric (Doc ID 1265699.1)

 
 

Version  : 17:39 07.09.2020
Database : 11.2.0.4,12.1.0.1
OEM      : 13.3.0.0.0