Wednesday, August 24, 2022

Exadata Diagnostic Support Tools

 

The tools below are available on Exadata for collecting and uploading logs to help identify an issue. As newer models are released, tools may be added, removed, or merged, so always refer to Oracle.com for the latest information.


Note: This document is prepared by my friend Mr. Bakht Ali Khan (Oracle DBA).


1. SOSReport:

   ==========

   This is a tool for collecting troubleshooting data on an Oracle Linux system.

   Among other information, it includes the installed rpm versions, syslog, network configuration,

   mounted filesystems, disk partition details, loaded kernel modules, and the status of all services.

   

   # rpm -qa | grep -i sos => To check rpm installation

   # yum install sos => Installing sos rpm

   # sosreport => Generating sos report

     --  

   # /opt/oracle.SupportTools/sundiag.sh -h

   

   How To Collect Sosreport on Oracle Linux (Doc ID 1500235.1)  
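
   The rpm check above can be wrapped in a small helper. This is only a sketch: the function name has_sos_rpm is hypothetical (it is not part of the sos package), and it assumes the package appears in the rpm listing as sos-<version>.

```shell
#!/bin/bash
# Hypothetical helper (not part of sos): reads a package list on stdin,
# one package per line as printed by `rpm -qa`, and succeeds when a
# package named sos-<version> is present.
has_sos_rpm() {
    grep -Eq '^sos-[0-9]'
}

# Example use on a live system (run as root), left commented out here:
# if rpm -qa | has_sos_rpm; then
#     sosreport
# else
#     yum install sos
# fi
```

   Anchoring the pattern to the start of the line avoids false matches on unrelated packages whose names merely contain "sos".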

    

2. OSWatcher:

   ==========

   

   Oracle OSWatcher is a utility that collects data from commands such as vmstat, iostat, top, ps, netstat, 

   HP-UX sar, and Linux meminfo etc. OSWatcher archives the data files, automatically looks for issues, 

   and helps to determine the root cause of the issues, if possible.  

   

   It can be downloaded as a standalone archive (osw<version>.tar) or used as part of the TFA support tool.

   

   Doc - 301137.1 : OSWatcher (Includes: [Video])

   

3. ExaWatcher:

   ===========

   

   ExaWatcher produces a small set of charts when a collection is done. These charts are contained in the Charts.ExaWatcher.<nodeName> directory within the collection. Charts can only be generated for the following ExaWatcher collections:

   vmstat, iostat, mpstat, IBCardInfo, zonestat, meminfo, ldm, kstat etc

   The computer where you are viewing the charts must be connected to the internet.

   

   # ps -ef | grep -i ExaWatcher

   # cd /opt/oracle.ExaWatcher/

   # ./GetExaWatcherResults.sh --from 08/15/2022_13:00:00 --to 08/15/2022_14:00:00

     OR

   # /opt/oracle.ExaWatcher/GetExaWatcherResults.sh --from 10/14/2014_13:00:00 --to 10/14/2014_17:00:00

   Output directory: /opt/oracle.ExaWatcher/archive/ExtractedResults

   # cd  /opt/oracle.ExaWatcher/archive/ExtractedResults

   # ls -la

   # du -sh *

   

   The file size should be more than 10M or 100M.

   

   Doc - 1617454.1 : ExaWatcher Utility On Exadata and SuperCluster Compute and Storage Nodes 
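
   The --from and --to values shown above use the form MM/DD/YYYY_HH:MM:SS. A minimal sketch for producing that form from an ordinary timestamp follows. The function name exawatcher_stamp is hypothetical, and GNU date (the default on Oracle Linux) is assumed.

```shell
#!/bin/bash
# Hypothetical helper: converts a timestamp such as "2022-08-15 13:00:00"
# into the MM/DD/YYYY_HH:MM:SS form used by the --from/--to examples.
# Assumes GNU date; the input is interpreted in the local time zone.
exawatcher_stamp() {
    date -d "$1" '+%m/%d/%Y_%H:%M:%S'
}

# Example use on an Exadata node (run as root), left commented out here:
# /opt/oracle.ExaWatcher/GetExaWatcherResults.sh \
#     --from "$(exawatcher_stamp '2022-08-15 13:00:00')" \
#     --to   "$(exawatcher_stamp '2022-08-15 14:00:00')"
```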

   

4. SUNDIAG:

   ========

   sundiag collects the Oracle Exadata diagnostic information required for disk failures and some other hardware issues.

   It is an Exadata node/cell tool and works on both Linux and Solaris installations.

   Execution creates a date-stamped tar.bz2 file in /tmp/sundiag_/tar.bz2. Upload this file to the SR.

   

   Run as root on the compute node or cell server:

      # /opt/oracle.SupportTools/sundiag.sh -h (Help to see detail)

      # /opt/oracle.SupportTools/sundiag.sh [ilom | snapshot] [osw <time ranges>]

     --

   # /opt/oracle.SupportTools/sundiag.sh

   # /opt/oracle.SupportTools/sundiag.sh ilom

   # /opt/oracle.SupportTools/sundiag.sh snapshot

   # /opt/oracle.SupportTools/sundiag.sh osw <from>-<to> e.g <date>_<time>-<date>_<time>

   # /opt/oracle.SupportTools/sundiag.sh osw 2014/03/31_15:00:00-2014/03/31_18:00:00 (Not < 9 hours)
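
   The osw argument above takes a range of the form <date>_<time>-<date>_<time>. The sketch below builds such a range from two ordinary timestamps. The function name sundiag_osw_range is hypothetical, and GNU date is assumed.

```shell
#!/bin/bash
# Hypothetical helper: prints a range in the form
# YYYY/MM/DD_HH:MM:SS-YYYY/MM/DD_HH:MM:SS, matching the osw example above.
# Assumes GNU date; inputs are interpreted in the local time zone.
sundiag_osw_range() {
    local from to
    from=$(date -d "$1" '+%Y/%m/%d_%H:%M:%S') || return 1
    to=$(date -d "$2" '+%Y/%m/%d_%H:%M:%S') || return 1
    printf '%s-%s\n' "$from" "$to"
}

# Example use on a compute node or cell (run as root), commented out here:
# /opt/oracle.SupportTools/sundiag.sh osw \
#     "$(sundiag_osw_range '2014-03-31 15:00:00' '2014-03-31 18:00:00')"
```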

   

x.  Procwatcher:

    ============

It is a tool to examine and monitor Oracle database and/or clusterware processes at an interval.   

The tool will collect stack traces of these processes using Oracle tools like oradebug short_stack 

and/or OS debuggers like pstack, gdb, dbx, or ladebug and collect SQL data if specified.

This tool can be used in conjunction with other tools or troubleshooting methods depending on the situation.

To install the script, download it, put it in its own directory, unzip it, and give it execute permissions.

$ ./prw.sh stat - To check status

$ tfactl prw stat - To check the status if running inside of TFA.

$ ./prw.sh start - To start Procwatcher

$ ./prw.sh pack - To package up Procwatcher files to upload to support

Doc - 459694.1 : Procwatcher: Script to Monitor and Examine Oracle DB and Clusterware Processes

Procwatcher is ideal for:


    Session level hangs or severe contention in the database/instance. See Note: 1352623.1

    Severe performance issues. See Note: 1352623.1

    Instance evictions and/or DRM timeouts.

    Clusterware or DB processes stuck or consuming high CPU (must set EXAMINE_CLUSTER=true and run as root for clusterware processes)

    ORA-4031 and SGA memory management issues. (Set sgamemwatch=diag or sgamemwatch=avoid4031, not the default.) See Note: 1355030.1

    ORA-4030 and DB process memory issues. (Set USE_SQL=true and process_memory=y).

    RMAN slowness/contention during a backup. (Set USE_SQL=true and rmanclient=y).
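
The settings in the list above can be kept as a quick lookup. This is a sketch only: the function name prw_settings_for and the scenario labels are hypothetical, while the parameter names and values are the ones listed above. It prints the settings and does not edit or start Procwatcher.

```shell
#!/bin/bash
# Hypothetical lookup: prints the Procwatcher settings named in the list
# above for a given scenario label. The labels are made up for this sketch.
prw_settings_for() {
    case "$1" in
        clusterware) echo "EXAMINE_CLUSTER=true" ;;           # also run as root
        ora-4031)    echo "sgamemwatch=diag" ;;               # or sgamemwatch=avoid4031
        ora-4030)    echo "USE_SQL=true process_memory=y" ;;
        rman)        echo "USE_SQL=true rmanclient=y" ;;
        *)           echo "unknown scenario: $1" >&2; return 1 ;;
    esac
}

# Example:
# prw_settings_for rman
```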


5. ORAchk:

   =======

   

6. Exachk:

   =======

   Exachk is an Oracle Exadata diagnostic tool that comes with different levels of verification and

   collects hardware, software, firmware, Network Fabric Switches and configuration data on Exa systems.

   In addition to the automatic correction feature, individual checks have explanations, recommendations, 

   and manual verification commands so that customers and administrators can evaluate the risks of 

   and self-correct reported conditions.

 

   # ps -ef | grep exachk

   # cd /opt/oracle.SupportTools

   # ./exachk -d status

   # ./exachk -v

   

   How do I install only ORAchk or EXAchk without TFA?

   The -extract option installs only the named tool and does not install or alter any existing TFA installation:

   # ahf_setup (Installs TFA, ORAchk, Exachk etc)

   # tfactl version -all

   # ahf_setup -extract orachk (Installs only ORAchk)

   # ahf_setup -extract exachk (Installs only Exachk)
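
   The choice between the three installer invocations above can be sketched as a small helper. The function name ahf_setup_command and its argument values are hypothetical. It only prints the command line and does not run the installer.

```shell
#!/bin/bash
# Hypothetical helper: prints the ahf_setup invocation listed above for the
# requested scope. With no argument it prints the full install command.
ahf_setup_command() {
    case "${1:-all}" in
        all)    echo "ahf_setup" ;;
        orachk) echo "ahf_setup -extract orachk" ;;
        exachk) echo "ahf_setup -extract exachk" ;;
        *)      echo "unknown scope: $1" >&2; return 1 ;;
    esac
}

# Example:
# ahf_setup_command exachk
```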

   

   TFA, ORAchk & EXAchk work just the same as before. AHF is simply a combined installer for them.

   TFA is still the primary tool for diagnostic collection and management. ORAchk/EXAchk is still 

   the primary tool for proactively checking your system.

   The move to AHF is intended to make things easier by providing a single installation with all tools in a single location.

   

7. ADR:

   ====

   The Automatic Diagnostics Repository (ADR) is a hierarchical file-based repository for handling diagnostic information.

   

   sql> select name,value from v$diag_info;

   

   $ adrci    

   adrci> SHOW PROBLEM

   adrci> SHOW INCIDENT

   adrci> SHOW CONTROL

   

   Doc - 422893.1 : Understanding Automatic Diagnostic Repository 
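
   The same adrci commands can also be run without the interactive prompt by passing them in a single exec= argument, separated by semicolons. The sketch below builds that argument. The function name adrci_exec_arg is hypothetical, and on systems with several ADR homes a home may need to be selected first.

```shell
#!/bin/bash
# Hypothetical helper: joins adrci commands with "; " and prefixes the
# result with exec=, so the whole string can be passed to adrci as one
# argument.
adrci_exec_arg() {
    local joined="" cmd
    for cmd in "$@"; do
        if [ -n "$joined" ]; then
            joined="$joined; $cmd"
        else
            joined="$cmd"
        fi
    done
    printf 'exec=%s\n' "$joined"
}

# Example use as the Oracle software owner, left commented out here:
# adrci "$(adrci_exec_arg 'show problem' 'show incident')"
```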

8. Oracle Trace File Analyzer (TFA):

   =================================

   Oracle Trace File Analyzer (TFA) provides a number of diagnostic tools in a single bundle, making it easy 

   to gather diagnostic information about the Oracle database and clusterware, which in turn helps with problem 

   resolution when dealing with Oracle Support.  

   

9. Autonomous Health Framework (AHF):   

   ==================================

   The Trace File Analyzer (TFA) is now part of the Autonomous Health Framework (AHF). AHF is a collection of

   many tools, such as EXAchk and ORAchk.

   Oracle AHF is a collection of components that analyzes the diagnostic data collected and proactively identifies issues

   before they affect the health of your databases, clusters, or Oracle Real Application Clusters databases.

   

x. References:

   Doc - 1539451.1 : How to shutdown the Exadata database nodes and storage cells in a rolling fashion so certain hardware tasks can be performed.

   Doc - 1093890.1 : Steps To Shutdown/Startup The Exadata & RDBMS Services and Cell/Compute Nodes On An Exadata Configuration

   Doc - 19c E95727-09 - April 2021 : Clusterware Administration and Deployment Guide

   Doc - 1683842.1 : SRDC - EEST Sundiag

   Doc - 761868.1 : Oracle Exadata Diagnostic Information required for Disk Failures and some other Hardware issues

   Doc - 2799587.1 : Gathering sundiag in Exadata image version 20.1.11 and 20.1.12 fail with syntax error

   Doc - 1070954.1 : Oracle Exadata Database Machine EXAchk

   Doc - 2673298.1 : How To Restart Exachk Daemon

   Doc - 2550798.1 : Autonomous Health Framework (AHF) - Including TFA and ORAchk/EXAchk






 
