Many of you have already used exachk on your Exadata machines and are familiar with its uses. With a virtualized Exadata machine, things are a little different: you need to run exachk from multiple locations. The number of locations depends on how you have virtualized your Exadata machine. For example, if you have 2 VM clusters within your Exadata machine, you will have to run exachk from 3 locations. It does not matter how many nodes you have in a VM cluster; you only need to run exachk in the first user domain (domU) of each cluster and in the management domain (dom0).
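The arithmetic above can be sketched in a few lines of Python. The function name is hypothetical, and it assumes exactly one run in the management domain plus one run in the first user domain of each VM cluster.

```python
def exachk_run_locations(vm_clusters: int) -> int:
    """Return how many places exachk has to be run from.

    Assumes one run in the management domain (dom0) plus one run in the
    first user domain (domU) of each VM cluster, regardless of node count.
    """
    if vm_clusters < 0:
        raise ValueError("vm_clusters must be non-negative")
    return 1 + vm_clusters


print(exachk_run_locations(2))  # 2 VM clusters -> 3 locations
```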
Why Management Domain?
Even though there is no RDBMS or clusterware software installed on the management domain, you will still need to run exachk there to perform hardware and operating system level checks for database nodes, storage servers, the InfiniBand fabric and InfiniBand switches. You can also run exachk individually for database servers, storage servers and InfiniBand switches by specifying the following command line options (-clusternodes, -cells, -ibswitches).
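As a rough illustration of how these options combine, here is a small Python sketch that only builds the command line and does not run anything. The helper name and the host names are hypothetical, and passing each list as comma-separated host names is an assumption you should confirm against the exachk documentation for your version.

```python
import shlex


def build_exachk_command(clusternodes=None, cells=None, ibswitches=None):
    """Build an exachk command line limited to the given components.

    Each argument is an optional list of host names, joined with commas
    (assumed value format). With no arguments, a full run (-a) is built.
    """
    args = ["./exachk"]
    for flag, hosts in (("-clusternodes", clusternodes),
                        ("-cells", cells),
                        ("-ibswitches", ibswitches)):
        if hosts:
            args += [flag, ",".join(hosts)]
    if len(args) == 1:
        args.append("-a")
    # Quote each argument so the result can be pasted into a shell safely.
    return " ".join(shlex.quote(a) for a in args)


# Hypothetical storage server names, for illustration only.
print(build_exachk_command(cells=["exa01cel01", "exa01cel02"]))
```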
For this blog, I am only going to focus on running exachk on dom0 (management domain); you can check the following blog (http://blog.umairmansoob.com/running-exachk-on-exadata-machine/) for running exachk on VM clusters. You will need exachk version 12.1.0.2.2 or higher for virtualization support. You will be using the same command line options: exachk automatically detects that it is running in an Exadata OVM environment, and whether it is running in a management domain or a user domain, and performs the applicable audit checks.
1. Download the latest exachk version from Oracle Metalink (Doc ID 1070954.1). Copy exachk.zip to /opt/oracle.SupportTools/exachk and unzip it.
2. Check Version ( ./exachk -v )
3. Run exachk ( ./exachk -a )
Note: You will need the root password for each InfiniBand switch.
4. Collecting Database Nodes information
5. Collecting Storage Nodes information
6. Download the zip file generated by exachk to your laptop, unzip it and open exachk_XXXX_html
7. Check for FAIL items and warnings
Note: If there are queries that return slowly from the underlying databases, you can capture the SQL statements for the queries in the query log and provide them to the database administrator (DBA) for analysis. Usually DBAs are able to fix performance issues. But let me summarize the methods that you can use to improve query performance:
- Table Indexes : It is very important for the underlying table or tables to have indexes. There are different types of indexes (e.g. Primary Key, Bitmap, and Composite); make sure to use them properly. Indexes can become invalid for many reasons, so make sure to check them on a regular basis. It’s a big topic and I am planning to write a blog about indexes in Data Warehouse in detail.
- Table Partitions : If a table is big and contains a lot of rows, you can improve OBIEE query performance by partitioning the underlying table(s). There are many types of partitions in Oracle, like range, hash and interval; make sure to use them properly.
- Table Join : There are different types of joins in Oracle (Hash Join, Nested Loop Join); I have personally seen drastic performance differences when using different join types.
- Avoid Disk Sorts : SQL statement execution can create sort activity, especially if you are using Oracle aggregate functions. Check whether the query is doing a disk sort and find a way to avoid it.
- SQL HINT(s) : Sometimes a query doesn’t get the best execution plan from the optimizer, and you can use SQL hints to enforce an optimum execution plan in Oracle.
- Aggregate Tables : It is extremely important to use aggregate tables to improve query performance. Aggregate tables contain precalculated summarizations of data. It is much faster to retrieve an answer from an aggregate table than to recompute the answer from thousands of rows of detail.
- Database Cache : There are different types of caching techniques in Oracle, like result cache, database cache and Exadata Flash Cache. Caching can significantly improve query performance; use these caches properly.
- OBIEE Caching : The Oracle BI Server can store query results for reuse by subsequent queries. Query caching can dramatically improve the apparent performance of the system for users, particularly for commonly used dashboards, but it does not improve performance for most ad-hoc analysis.
Exachk is designed to evaluate HW & SW configuration, MAA best practices and database critical issues for all Oracle Engineered Systems. All checks have explanations, recommendations, and manual verification commands so that customers can self-correct all FAIL, ERROR and WARNING conditions reported.
Step 1 : Download the latest exachk version from Oracle Metalink (Doc ID 1070954.1). Copy exachk.zip to /opt/oracle.SupportTools/exachk and unzip it.
Step 2 : Check exachk Version
$ ./exachk -v
Step 3 : Run Exadata check
Step 4: Select Database(s) for checking best practices
Step 5 : Enter root password:
Step 6: Download .zip file and unzip
$ ls -ltr
Step 7: Analyze exachk_XXXXX_html
Step 8 : Check Exadata System Health Score
Step 9 : Check for fail items
You can use the following steps to extract flash cache contents into an external table. You can also automate this task by creating user equivalency between the compute nodes and all storage nodes.
Please download the following .pdf file for details.
Extracting Information from Exadata Flash Cache
ACFS is now supported on Exadata. But ACFS does not support Exadata smart scan and offloading, which means you should not place your critical databases on ACFS. Please see Oracle note 1929629.1 for details.
ACFS supported database versions:
- Oracle Database 10g Rel. 2 (10.2.0.4 and 10.2.0.5)
- Oracle Database 11g (11.2.0.4 and higher)
- Oracle Database 12c (12.1.0.1 and higher)
- Oracle ACFS replication or security/encryption/audit is only supported with general purpose files.
- Oracle ACFS does not currently support the Exadata offload features.
- Hybrid Columnar Compression (HCC) support requires fix for bug 19136936.
- Exadata Smart Flash Cache will cache read operations.
- Exadata Smart Flash Logging is not supported.
According to Oracle, “Oracle recommends that you multiplex your redo log files. The loss of the log file data can be catastrophic if recovery is required.”
Oracle also has a cautionary note on performance: “When you multiplex the redo log, the database must increase the amount of I/O that it performs. Depending on your configuration, this may impact overall database performance.”
So the question is: should we multiplex redo logs on Exadata, which is highly protected from disk failures? The answer is yes or no, depending on your ASM disk group redundancy levels. Oracle recommends making the DATA disk group high redundancy and placing all online redo logs and standby redo logs on the DATA disk group without multiplexing them.
Please use the following Exadata best practice matrix to decide whether to multiplex online redo logs or not.
- If a high redundancy disk group exists, place all redo logs in that high redundancy disk group.
- If both DATA and RECO are high redundancy, place all redo logs in DATA.
- If only normal redundancy disk groups exist, multiplex redo logs, placing them in separate disk groups.
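The matrix above can be expressed as a small decision function. This is only a sketch: the function name and the dictionary input are hypothetical, and picking the first two disk groups in the normal redundancy case is an assumption made for illustration.

```python
def redo_log_placement(redundancy):
    """Decide redo log placement from disk group redundancy levels.

    redundancy maps a disk group name to "HIGH" or "NORMAL".
    Returns (multiplex, disk_groups).
    """
    high = [name for name, level in redundancy.items()
            if level.upper() == "HIGH"]
    if high:
        # Prefer DATA when it is high redundancy, otherwise use whichever
        # high redundancy disk group exists. No multiplexing is needed.
        target = "DATA" if "DATA" in high else high[0]
        return False, [target]
    # Only normal redundancy disk groups: multiplex across separate groups.
    return True, list(redundancy)[:2]


print(redo_log_placement({"DATA": "HIGH", "RECO": "HIGH"}))
print(redo_log_placement({"DATA": "NORMAL", "RECO": "NORMAL"}))
```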
Recently I was tasked to look into the possibility of sharing an Exadata machine between SAP and non-SAP databases. As many of you already know, SAP has its own bundle patches called SBP (SAP Bundle Patches). Most of these patches are applied to the Oracle RDBMS home, and some may be applied to the Oracle GI home. You are required to maintain patches for both the RDBMS and GRID homes. Sharing an RDBMS home between SAP and non-SAP databases is not supported.
Now if you want to share Exadata Machine between SAP and NON-SAP databases you have the following options:
- Install two separate RDBMS homes, one for SAP databases and one for non-SAP databases. Maintain the SAP RDBMS home as per SAP-specific instructions and maintain the non-SAP RDBMS home as per Oracle-provided instructions. You also have a GRID home (GI home) that you need to maintain as per SAP-specific instructions.
- If you have more than 2 compute nodes (e.g. an Exadata half rack), you can install 2 clusters using 2 nodes for each cluster. Once you have installed the two clusters, you can dedicate 1 cluster each to SAP and non-SAP databases.
NOTE : SAP has not yet certified OVM with Exadata. Once that is done, you can install and maintain two separate VM clusters using OVM, 1 each for SAP and non-SAP databases.
Every time I go through an Exadata deployment process with a client, there is a discussion about the ASM redundancy level. As many of you already know, Exadata only supports two ASM redundancy levels (Normal or High), and Oracle recommends using High redundancy for both the DATA and RECO disk groups. Keep in mind that changing the redundancy level requires recreating the disk groups.
A brief description about respective redundancy levels is as follows:
- NORMAL redundancy provides protection against a single disk failure or an entire storage server failure.
- HIGH redundancy provides protection against 2 simultaneous disk failures from 2 distinct storage servers, or the failure of 2 entire storage servers. HIGH redundancy also provides redundancy during Exadata storage server rolling upgrades.
Choosing the redundancy level for your Exadata machine will depend on your database environment, available capacity, and desired protection level. Some databases are critical and need a HIGH redundancy disk group, while most other databases can use NORMAL redundancy disk groups. So if you choose Normal redundancy, it will not be against the norm, but you will not be following Oracle recommendations. I have seen clients using Normal redundancy more often than I would like. The following are some situations where you should always use High redundancy:
- If it is a production system with no DR in place.
- If your storage requirement is low and you are using HP capacity disks.
- If you want to perform storage server rolling upgrades.
The following are some situations where you can use Normal redundancy:
- If it is a Dev or UAT system.
- If you are space constrained.
- If you have Data Guard in place for production databases.
NOTE: Standard Exadata deployment will create 3 disk groups (DATA, RECO and DBFS_DG), but you can create additional disk groups with different redundancy levels based on your requirement.
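To make the two lists above easier to apply, here is a minimal Python sketch of the same rules of thumb. The function and parameter names are hypothetical, and giving the High redundancy reasons priority over the Normal ones is my own assumption, in line with Oracle recommending High by default. It leaves out the disk type criterion.

```python
def recommend_asm_redundancy(production, has_dr, rolling_upgrades,
                             space_constrained):
    """Return (level, reasons) based on the rules of thumb listed above."""
    high_reasons = []
    if production and not has_dr:
        high_reasons.append("production system with no DR in place")
    if rolling_upgrades:
        high_reasons.append("storage server rolling upgrades are required")
    if high_reasons:
        return "HIGH", high_reasons

    # Reaching this point means the system is non-production or has DR.
    normal_reasons = []
    if not production:
        normal_reasons.append("Dev or UAT system")
    if space_constrained:
        normal_reasons.append("space constrained")
    if production and has_dr:
        normal_reasons.append("Data Guard in place for production")
    return "NORMAL", normal_reasons


level, reasons = recommend_asm_redundancy(
    production=True, has_dr=False, rolling_upgrades=False,
    space_constrained=True)
print(level, reasons)
```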
Because an Oracle Standby database (Active Data Guard) is essentially a read-only database, it can be used as a Business intelligence query server, relieving the workload of the primary database and improving query performance.
How it works
You would think that since an Oracle Standby database is a read-only database and Oracle OBIEE only generates SQL queries, it should work with the default configuration. But it’s not that simple: OBIEE generates some write operations, and they need to be routed to the Primary database. The following are examples of OBIEE write operations.
- Oracle BI Scheduler job and instance data
- Temporary tables for performance enhancements
- Writeback scripts for aggregate persistence
- Usage tracking data, if usage tracking has been enabled
- Event polling table data, if event polling tables are being used
- Create a single database object for the standby database configuration, with temporary table creation disabled.
- Configure two connection pools for the database object:
  - A read-only connection pool that points to the standby database
  - A second connection pool that points to the primary database for write operations
- Update any connection scripts that write to the database so that they explicitly specify the primary database connection pool.
- If usage tracking has been enabled, update the usage tracking configuration to use the primary connection.
- If event polling tables are being used, update the event polling database configuration to use the primary connection.
- Ensure that Oracle BI Scheduler is not configured to use any standby sources.
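The routing idea behind these steps can be pictured with a tiny Python model. This is purely conceptual: it is not OBIEE configuration or an OBIEE API, and the pool and operation names are hypothetical labels for the write operations listed earlier.

```python
# Hypothetical pool names; in OBIEE these are defined in the repository.
STANDBY_POOL = "standby_read_only_pool"
PRIMARY_POOL = "primary_write_pool"

# Write operations from the list above that must go to the primary database.
WRITE_OPERATIONS = {
    "scheduler_job_data",
    "temporary_table",
    "aggregate_persistence_writeback",
    "usage_tracking",
    "event_polling",
}


def choose_connection_pool(operation: str) -> str:
    """Send known write operations to the primary, everything else to standby."""
    return PRIMARY_POOL if operation in WRITE_OPERATIONS else STANDBY_POOL


print(choose_connection_pool("usage_tracking"))
print(choose_connection_pool("dashboard_query"))
```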
Because the cost-based approach relies on statistics, you should generate statistics for all tables and clusters and all indexes accessed by your SQL statements before using the cost-based approach. If the size and data distribution of the tables change frequently, then regenerate these statistics regularly to ensure the statistics accurately represent the data in the tables.
Collecting optimizer statistics on Exadata is no different than on other systems. I usually recommend that my clients migrate their existing stats gathering methods from the old system. In case you were not collecting stats on the existing system, you should gather at least the following types of optimizer statistics:
- Table stats
- Index stats
- System stats
You can gather table and index stats at the schema level using the DBMS_STATS package:
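The exact command from the original post is not reproduced here, so the following is only a sketch of what a schema-level call could look like. It is a Python helper that builds an anonymous PL/SQL block around DBMS_STATS.GATHER_SCHEMA_STATS; the helper name, the schema name SALES_DW and the chosen parameter values are assumptions for illustration.

```python
def gather_schema_stats_sql(schema: str) -> str:
    """Return an anonymous PL/SQL block that gathers table and index stats.

    cascade => TRUE also gathers statistics on the indexes of each table.
    """
    # Accept only simple identifiers so the name can be embedded safely.
    if not schema.replace("_", "").isalnum():
        raise ValueError("unexpected characters in schema name")
    return (
        "BEGIN\n"
        "  DBMS_STATS.GATHER_SCHEMA_STATS(\n"
        f"    ownname          => '{schema.upper()}',\n"
        "    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,\n"
        "    cascade          => TRUE\n"
        "  );\n"
        "END;"
    )


# SALES_DW is a hypothetical schema name.
print(gather_schema_stats_sql("sales_dw"))
```

The generated block can be pasted into SQL*Plus (followed by a `/` line) or executed through a database driver.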
Gathering Exadata-specific system statistics ensures the optimizer is aware of Exadata scan speed. Accurately accounting for the speed of scan operations will ensure the optimizer chooses an optimal execution plan in an Exadata environment. Lack of Exadata-specific stats can lead to less performant optimizer plans.
The following command gathers Exadata-specific system statistics:
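The call for this is DBMS_STATS.GATHER_SYSTEM_STATS with the 'EXADATA' gathering mode. As a minimal sketch, the Python below wraps it in an anonymous PL/SQL block and runs it on any DB-API style cursor; the helper name is hypothetical, and using a Python driver such as python-oracledb to obtain the cursor is an assumption rather than part of the original post.

```python
# Anonymous PL/SQL block that gathers system statistics in Exadata mode.
EXADATA_SYSTEM_STATS = "BEGIN DBMS_STATS.GATHER_SYSTEM_STATS('EXADATA'); END;"


def gather_exadata_system_stats(cursor):
    """Execute the block on a DB-API style cursor supplied by the caller."""
    cursor.execute(EXADATA_SYSTEM_STATS)


print(EXADATA_SYSTEM_STATS)
```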
Note that this best practice is not a general recommendation to gather system statistics in Exadata mode for all Exadata environments. For existing customers who have acceptable performance with their current execution plans, do not gather system statistics in Exadata mode.
For existing customers whose cardinality estimates are accurate but who suffer from the optimizer overestimating the cost of a full table scan where the full scan performs better, gather system statistics in Exadata mode.
For new applications where the impact can be assessed from the beginning, and dealt with easily if there is a problem, gather system statistics in Exadata mode.
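The three cases above can be summarized as a small decision helper. This is a sketch with hypothetical names; how you judge "acceptable plans" or "overestimated full scan cost" is left to your own analysis.

```python
def should_gather_exadata_system_stats(new_application,
                                       plans_acceptable=True,
                                       full_scan_cost_overestimated=False):
    """Apply the guidance above and return True when Exadata mode applies."""
    if new_application:
        # The impact can be assessed and handled from the beginning.
        return True
    if plans_acceptable:
        # Existing system with acceptable plans: leave it alone.
        return False
    # Existing system with accurate cardinality estimates whose plans
    # suffer because full table scans are costed too high.
    return full_scan_cost_overestimated


print(should_gather_exadata_system_stats(new_application=True))
```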