Because the cost-based approach relies on statistics, you should generate statistics for all tables and clusters and all indexes accessed by your SQL statements before using the cost-based approach. If the size and data distribution of the tables change frequently, then regenerate these statistics regularly to ensure the statistics accurately represent the data in the tables.
Collecting optimizer statistics on Exadata is not any different than other systems. I usually recommend my clients for migrate existing gather stats methods from old system. In case you were not collecting stats on existing system , you should gather should at least following types of optimizer statistics.
- Table stats
- Index stats
- System stats
You can gather table / index stats using following procedure at schema level :
Gathering Exadata specific system statistics ensure the optimizer is aware of Exadata scan speed. Accurately accounting for the speed of scan operations will ensure the Optimizer chooses an optimal execution plan in a Exadata environment. Lack of Exadata specific stats can lead to less performant optimizer plans.
The following command gathers Exadata specific system statistics
Note this best practice is not a general recommendation to gather system statistics in Exadata mode for all Exadata environments. For existing customers who have acceptable performance with their current execution plans, do not gather system statistics in Exadata mode.
For existing customers whose cardinality estimates are accurate, but suffer from the optimizer over estimating the cost of a full table scan where the full scan performs better, then gather system statistics in Exadata mode.
For new applications where the impact can be assessed from the beginning, and dealt with easily if there is a problem, gather system statistics in Exadata mode.