Using Oracle ZFS Storage Appliance for OLTP Workload

As mentioned earlier, the ZFS appliance is best suited to OLAP workloads, but you can also use it for non-critical OLTP workloads. By nature, online transaction processing (OLTP) workloads tend to go after a small number of rows per I/O request. Imagine a busy OLTP system where thousands of random I/Os each go after a small amount of data: every one of them needs a short response time for the system to stay healthy. That means you should utilize the ZFS read and write flash SSDs to get reasonable performance for your OLTP applications. As with OLAP database workloads, it is recommended to use bigfile tablespaces and Oracle Direct NFS. Unlike OLAP workloads, however, it’s best to use Advanced Row Compression to optimize I/O response time and memory utilization.

 

Recommended OLAP Database Layout on ZFS Storage
Oracle Files    Record Size   Sync Write Bias   Read Cache                 Compression   Example Share Name
Datafiles       128K          throughput        do not use cache devices   LZ4           data
Temp            128K          latency           do not use cache devices   LZ4           temp
Archive Logs    1M            throughput        do not use cache devices   LZ4           archive
RMAN Backup     1M            throughput        do not use cache devices   LZ4           backup

Source: Oracle 

Again, let’s look at the rest of the recommendations for mapping the different database files to their recommended locations and record sizes. Data files, temp files, and archive logs can be placed on ZFS shares with record sizes of 32K, 128K, and 1M, respectively. Similarly, it’s best practice to place online redo logs and control files in the default Fast Recovery Area (FRA) on the preconfigured Exadata ASM disk group.

 

Using Oracle ZFS Storage Appliance for OLAP Workload

OLAP database workloads are mostly SQL queries going after large chunks of tables, or batch processes loading bulk data overnight. You might be loading terabytes of data into your data warehouse, but those loads are not random DML statements, so the critical workload is the user queries that support decision support systems (DSS). This nature of DSS makes it a good fit for database compression, including Hybrid Columnar Compression (HCC). When it comes to using database compression, understand that HCC requires some maintenance work to keep the data HCC-compressed, and it does not support OLTP compression. You should only use compression if you are cloning or duplicating a database environment that already uses database compression, so that the copy is a true representation of the source environment. Additionally, you can choose between the traditional smallfile and the bigfile tablespace format, but it’s recommended to use bigfile tablespaces to achieve better performance, flexibility, and capacity. Finally, remember to use Oracle Direct NFS, as it’s not recommended to use Oracle Intelligent Storage Protocol (OISP) for OLAP workloads.

 

Recommended OLTP Database Layout on ZFS Storage
Oracle Files    Record Size   Sync Write Bias   Read Cache                 Compression   Example Share Name   Storage Profile
Datafiles       32K           latency           all data and metadata      LZ4           data                 Mirrored (NSPF)
Temp            128K          latency           do not use cache devices   LZ4           temp                 Mirrored (NSPF)
Archive logs    1M            throughput        do not use cache devices   LZ4           archive              Mirrored (NSPF)
RMAN Backup     1M            throughput        do not use cache devices   LZ4           backup               Mirrored (NSPF)

Source: Oracle

Now let’s look at the rest of the recommendations for mapping the different database files to their recommended locations and record sizes. Data files, temp files, and archive logs can be placed on ZFS shares with record sizes of 128K, 128K, and 1M, respectively. Even though you have the option to place redo logs and control files on a ZFS share, it’s best practice to place online redo logs and control files in the default Fast Recovery Area (FRA) on the preconfigured Exadata ASM disk group. It is hard to justify placing these two types of sensitive database files on a ZFS share, as they do not take much space and they require low latency.

What is so special about Oracle ZFS Storage Appliance?

  1. Oracle ZFS provides extreme network bandwidth with 10G and InfiniBand connections and built-in network redundancy.
  2. Oracle ZFS ensures data integrity using features like copy-on-write and metadata checksumming, detecting silent data corruption and correcting errors before it is too late.
  3. Oracle ZFS uses the Oracle Intelligent Storage Protocol (OISP) to uniquely identify different types of database I/O requests, and helps you effectively address performance bottlenecks using built-in Analytics.
  4. Oracle ZFS is extremely easy to manage using its native web management interface and provides an integration option with Oracle Enterprise Manager.
  5. Oracle ZFS supports a full range of compression options and is tightly integrated with Oracle Database to provide Hybrid Columnar Compression (HCC), which is only available on Oracle storage.
  6. Oracle ZFS is tightly integrated with Oracle Database, which helps achieve extreme backup rates of up to 42 TB/hr and restore rates of up to 55 TB/hr.
  7. Oracle ZFS storage supports a highly redundant and scalable InfiniBand architecture which can be seamlessly integrated with Oracle Exadata Machine to provide a cost-effective storage option.
  8. Oracle ZFS appliance is integrated with Oracle RMAN to provide up to 2000 concurrent threads evenly distributed across many channels spread across multiple controllers.
  9. Oracle ZFS uses Oracle Direct NFS to reduce CPU and memory overhead by bypassing the operating system and writing buffers directly to user space.
  10. Oracle ZFS supports a 1M record size, which reduces the number of IOPS required to disk, preserves the I/O size from RMAN buffers to storage, and improves performance of large-block sequential operations.
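
To put a rough number on the record-size point, here is a back-of-the-envelope sketch in shell. It assumes, as a simplification, that every record moved costs exactly one I/O; `ios_needed` is a hypothetical helper name and not part of any Oracle tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: number of I/Os needed to move a given number of bytes
# when every I/O carries exactly one record (a simplifying assumption).
ios_needed() {
    local total_bytes=$1 record_bytes=$2
    # Ceiling division, so a partial final record still counts as one I/O.
    echo $(( (total_bytes + record_bytes - 1) / record_bytes ))
}

one_tib=$(( 1024 * 1024 * 1024 * 1024 ))
echo "128K records: $(ios_needed "$one_tib" $(( 128 * 1024 ))) I/Os"
echo "1M records:   $(ios_needed "$one_tib" $(( 1024 * 1024 ))) I/Os"
```

Under that assumption, moving 1 TiB takes 8388608 I/Os at a 128K record size and 1048576 I/Os at 1M, an eightfold reduction.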

Isolate your Exadata Network with InfiniBand Partitioning

An efficient system is one that provides a balance of CPU performance, memory bandwidth, and I/O performance. A lot of professionals will agree that Oracle Engineered Systems are good examples of efficient systems. The InfiniBand network plays a key role in providing that balance and gives close to 32 gigabits per second of network bandwidth with very low latency. It provides a high-speed, high-efficiency network for Oracle’s Engineered Systems, namely Exadata, Exalogic, Big Data Appliance, SuperCluster, ZFS Storage Appliance, etc.

If you are planning to consolidate systems on Exadata machine, you might be required to implement network isolation across the multiple environments within the consolidated system for security or compliance reasons. This is accomplished by using custom InfiniBand partitioning with the use of dedicated partition keys and partitioned tables. InfiniBand partitioning will provide you isolation across the different RAC clusters so that network traffic of one RAC cluster is not accessible to another RAC cluster. Note that you can implement similar functionality for the Ethernet networks with VLAN tagging.

An InfiniBand partition creates a group of InfiniBand members that are only allowed to communicate with each other. A unique partition key identifies and maintains each partition, which is managed by the master subnet manager. Members are assigned to these new custom partitions, and they can only communicate with other members of the same partition. For example, if you implement InfiniBand partitioning with OVM Exadata clusters, each cluster is assigned one dedicated partition for Clusterware communication and one partition for communication with the storage cells. The nodes of one RAC cluster will not be able to talk to the nodes of another RAC cluster, which belongs to a different partition, and this gives you network isolation within one Exadata Machine.

Temporary Tablespace on Exadata Machine

As we all know, Oracle generates a lot of temporary data through operations like bitmap merges, hash joins, bitmap index creation, and sorts. This data persists only for the duration of a transaction or session and does not require media or instance recovery. Since an Oracle RAC environment shares the temporary tablespace between multiple instances, handling a high concurrency of space management operations is critical. Please use the following guidelines for creating temporary tablespaces on an Exadata machine.

  1. A BigFile Tablespace
  2. Located in DATA or RECO, whichever one is not HIGH redundancy
  3. Sized 32GB Initially
  4. Configured with AutoExtend on at 4GB
  5. Configured with a Max Size defined to limit out of control growth.
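
The guidelines above can be sketched as a small shell helper that prints the matching CREATE statement. The tablespace name, disk group name, and maximum size used here are illustrative assumptions, not values from this post; `temp_tablespace_ddl` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Prints a CREATE statement that follows the guidelines above:
# bigfile, 32G initial size, autoextend in 4G steps, explicit max size.
# Arguments (all illustrative): tablespace name, ASM disk group, max size.
temp_tablespace_ddl() {
    local name=$1 diskgroup=$2 maxsize=$3
    cat <<EOF
CREATE BIGFILE TEMPORARY TABLESPACE ${name}
  TEMPFILE '+${diskgroup}' SIZE 32G
  AUTOEXTEND ON NEXT 4G MAXSIZE ${maxsize};
EOF
}

# Example: TEMP2 in the DATAC1 disk group, capped at 512G (assumed values).
temp_tablespace_ddl TEMP2 DATAC1 512G
```

Review the printed statement, pick the disk group that is not HIGH redundancy, and run it in SQL*Plus as a suitably privileged user.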

 

Upgrade Exadata Machine Clusterware to 12.2.0.1

Hello All! It’s time to upgrade your Exadata Machine infrastructure and databases to Oracle 12c Release 2. There are some software and patch requirements you need to meet before you can upgrade your Exadata machine to Oracle 12c Release 2. You might be able to perform most of the upgrade in a rolling manner, but the database upgrade will require some downtime. Ideally, you would upgrade your infrastructure to 12c Release 2 and install the Oracle 12c Release 2 software on a separate mount without impacting the existing databases running on the target Exadata machine. Later, upgrade your databases to Release 2 based on availability and downtime requirements. In any case, I strongly suggest making a foolproof recovery plan for your Exadata machine and being prepared for a complete Exadata machine recovery. If possible, start with a non-production environment and try to leverage a DR Exadata Machine for production environments. This blog will give you an overview of the process for upgrading your Exadata machine to 12c Release 2.

Caution: I consider this a high-risk activity. If you don’t have experience performing this kind of upgrade, hire someone who does.

Software Requirements

Your current Exadata machine configuration should meet the following requirements for the Oracle 12c Release 2 upgrade:

  • Current Oracle Database and Grid Infrastructure version must be 11.2.0.3, 11.2.0.4, 12.1.0.1 or 12.1.0.2.
  • Upgrades from 11.2.0.1 or 11.2.0.2 directly to 12.2.0.1 are not supported.
  • Exadata Storage Server version 12.2.1.1.0 will be required for full Exadata functionality, including ‘Smart Scan offloaded filtering’, ‘storage indexes’ and ‘I/O Resource Management’ (IORM).
  • When available: GI PSU 12.2.0.1.1 or later (which includes DB PSU 12.2.0.1.1). To be applied during the upgrade process, before running rootupgrade.sh on the Grid Infrastructure home, or after installing the new Database home, before upgrading the database.
  • Fix for bug 17617807 and bug 21255373 is required to successfully upgrade to 12.2.0.1 from 11.2.0.3.28, 11.2.0.4.160419 and 12.1.0.1.160419. The fix is already contained in 11.2.0.4.160419 and 12.1.0.2.160419.
  • Fix for bug 25556203 is required on top of the 12.2.0.1 Grid Infrastructure home before running rootupgrade.sh
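
As a quick sanity check on the first two requirements, the supported source releases can be encoded in a small shell function. This is a minimal sketch: `check_upgrade_source` is a hypothetical helper, it looks only at the release prefix, and it does not check the bug fixes or patch levels listed above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classifies a Grid Infrastructure / Database version
# string by the source releases listed above for a direct 12.2.0.1 upgrade.
# Only the four-part release prefix is examined.
check_upgrade_source() {
    case "$1" in
        11.2.0.3|11.2.0.3.*|11.2.0.4|11.2.0.4.*|12.1.0.1|12.1.0.1.*|12.1.0.2|12.1.0.2.*)
            echo "supported" ;;
        *)
            echo "unsupported" ;;
    esac
}

check_upgrade_source 12.1.0.2.0
check_upgrade_source 11.2.0.2.0
```

The first call prints `supported` and the second prints `unsupported`, matching the rule that 11.2.0.1 and 11.2.0.2 cannot be upgraded directly.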

Pre-Upgrade

Create temporary directory to hold all the software and patches

dcli -l oracle -g ~/dbs_group mkdir /u01/app/oracle/patchdepot

Download the following software and patches from E-delivery

V840012-01.zip: Oracle Database 12c Release 2 Grid Infrastructure (12.2.0.1.0) for Linux x86-64

V839960-01.zip: Oracle Database 12c Release 2 (12.2.0.1.0) for Linux x86-64

Exadata Storage Server Software 12.2.1.1.0

Patch 6880880 - OPatch latest update for 11.2, 12.1 and 12.2

Create the new Grid Infrastructure (GI_HOME) directory

(root)# dcli -g ~/dbs_group -l root mkdir -p /u01/app/12.2.0.1/grid
 (root)# dcli -g ~/dbs_group -l root chown oracle:oinstall /u01/app/12.2.0.1/grid

Install the Grid software by unzipping it into the new home; runInstaller is no longer supported

(oracle)$ unzip -q /u01/app/oracle/patchdepot/grid_home.zip -d /u01/app/12.2.0.1/grid

Obtain and Apply latest OPatch to the target 12.2.0.1 Grid Infrastructure home.

[oracle@exadb1 ~]$ unzip -oq -d /u01/app/12.2.0.1/grid /u01/upgrade/p6880880_122010_Linux-x86-64.zip

[oracle@exadb1 ~]$ cd /u01/app/12.2.0.1/grid/OPatch

[oracle@exadb1 OPatch]$ ./opatch version

Validate Readiness for Oracle Clusterware upgrade using CVU

Before executing CVU as the owner of the Grid Infrastructure unset ORACLE_HOME, ORACLE_BASE and ORACLE_SID.

unset ORACLE_HOME ORACLE_BASE ORACLE_SID

(oracle)$ /u01/app/12.2.0.1/grid/runcluvfy.sh stage -pre crsinst -upgrade -rolling -src_crshome /u01/app/12.1.0.2/grid -dest_crshome /u01/app/12.2.0.1/grid -dest_version 12.2.0.1.0 -fixupnoexec -verbose


Verifying User Equivalence ...PASSED

Verifying /dev/shm mounted as temporary file system ...PASSED

Verifying File system mount options for path /var ...PASSED

Verifying zeroconf check ...PASSED

Verifying ASM Filter Driver configuration ...PASSED


Pre-check for cluster services setup was successful.

CVU operation performed:      stage -pre crsinst

Date:                         Apr 20, 2018 8:15:00 AM

CVU home:                     /u01/app/12.2.0.1/grid/

Backup /u01 filesystem

tar -cvf oracle_inventory_apr20.tar /u01/app/oraInventory

tar -cvf oracle_gi_home_apr20.tar /u01/app/12.1.0.2/grid

Note: This task needs to be performed on all three nodes as root

Run Exachk report 

./exachk -a

Verify no active rebalance is running

SYS@+ASM1> select count(*) from gv$asm_operation;

COUNT(*)

----------

0

 Upgrade Clusterware

Upgrade Grid Infrastructure to 12.2.0.1 (will also apply the latest PSU)

(oracle)$ cd /u01/app/12.2.0.1/grid

(oracle)$ ./gridSetup.sh -applyPSU /u01/upgrade/26737266

Launching Oracle Grid Infrastructure Setup Wizard...

Upgrade wizard steps 

On "Select Configuration Options" screen, select "Upgrade Oracle Grid Infrastructure", and then click Next.

On "Grid Infrastructure Node Selection" screen, verify all database nodes are shown and selected, and then click Next.

On "Specify Management Options" screen, specify Enterprise Management details when choosing for Cloud Control registration.

On "Privileged Operating System Groups" screen, verify group names and change if desired, and then click Next. If presented with the warning "INS-41808, INS-41809, INS-41812: OSDBA for ASM, OSOPER for ASM, and OSASM are the same group. Are you sure you want to continue?", click Yes.

On "Specify Installation Location" screen, choose "Oracle Base" and change the software location. The GI_HOME directory can’t be chosen. It shows the software location /u01/app/12.2.0.1/grid, from where you started gridSetup.sh.

If prompted "The Installer has detected that the specified Oracle Base location is not empty on this and remote servers!" Are you sure you want to continue? Click Yes

On "Root script execution" screen, do not check the box. Keep root execution in your own control

On "Prerequisite Checks" screen, resolve any failed checks or warnings before continuing.

Solaris only: Review Document <2186095.1> Oracle Solaris-specific guidelines for GI software installation prerequisite check failure.

On "Summary" screen, verify the plan and click 'Install' to start the installation (recommended to save a response file for the next time)

On "Install Product" screen monitor the install, until you are requested to run rootupgrade.sh (recommended to save a response file for the next time)

Before executing the last steps (rootupgrade.sh) of the installation process an additional step is required. rootupgrade.sh execution will happen after next steps.

When required relink oracle binary with RDS 

(oracle)$ dcli -l oracle -g ~/dbs_group /u01/app/12.2.0.1/grid/bin/skgxpinfo

 If the command does not return 'rds' relink as follows:

For Linux: as owner of the Grid Infrastructure Home on all nodes execute the steps as follows before running rootupgrade.sh:

(oracle)$ dcli -l oracle -g ~/dbs_group ORACLE_HOME=/u01/app/12.2.0.1/grid make -C /u01/app/12.2.0.1/grid/rdbms/lib -f ins_rdbms.mk ipc_rds ioracle

Execute rootupgrade.sh on each database server

(root)# /u01/app/12.2.0.1/grid/rootupgrade.sh

Note: Do not run this script on all nodes at the same time.

Verify cluster status

(root)# /u01/app/12.2.0.1/grid/bin/crsctl check cluster -all
 **************************************************************
 node-1:
 CRS-4537: Cluster Ready Services is online
 CRS-4529: Cluster Synchronization Services is online
 CRS-4533: Event Manager is online
 **************************************************************
 node-2:
 CRS-4537: Cluster Ready Services is online
 CRS-4529: Cluster Synchronization Services is online
 CRS-4533: Event Manager is online
 **************************************************************
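
If you want to script this check rather than read the output by eye, a minimal sketch follows. It assumes the output format shown above and only looks for the three message codes that appear in it (CRS-4537, CRS-4529, CRS-4533); `cluster_all_online` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "crsctl check cluster -all" output on stdin and
# succeeds only if every node header is matched by one "online" line for each
# of the three services shown in the sample output above.
cluster_all_online() {
    awk '
        # Node header lines look like "node-1:" (a single word ending in a colon).
        /^[[:space:]]*[^[:space:]*]+:[[:space:]]*$/ { nodes++ }
        /CRS-4537/ { crs++ }   # Cluster Ready Services is online
        /CRS-4529/ { css++ }   # Cluster Synchronization Services is online
        /CRS-4533/ { evm++ }   # Event Manager is online
        END { exit !(nodes > 0 && crs == nodes && css == nodes && evm == nodes) }
    '
}

# Usage (as root, on a database server):
#   /u01/app/12.2.0.1/grid/bin/crsctl check cluster -all | cluster_all_online \
#       && echo "all nodes online" || echo "check cluster status"
```

The function only counts lines, so treat a failure as a prompt to read the full crsctl output rather than as a diagnosis.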

check, re-configuring for Cloud Control, and cleaning up the old, unused home areas.

Post-Upgrade

Disable Diagsnap for Exadata

NOTE: Due to unpublished bugs 24900613 25785073 and 25810099, Diagsnap should be disabled for Exadata.

(oracle)$ cd /u01/app/12.2.0.1/grid/bin

(oracle)$ ./oclumon manage -disable diagsnap

Advance COMPATIBLE.ASM diskgroup attribute

ALTER DISKGROUP RECOC1 SET ATTRIBUTE 'compatible.asm' = '12.2.0.1.0';

ALTER DISKGROUP DBFS_DG SET ATTRIBUTE 'compatible.asm' = '12.2.0.1.0';

ALTER DISKGROUP DATAC1 SET ATTRIBUTE 'compatible.asm' = '12.2.0.1.0';

De-Install GI HOME

Wait a few days before de-installing the old home

(oracle)$ cd $ORACLE_HOME/deinstall

(oracle)$ ./deinstall -checkonly

(oracle)$ ./deinstall

Change ORACLE_HOME to the previous Grid home before de-installing it:

(root)# dcli -l root -g ~/dbs_group chown -R oracle:oinstall /u01/app/12.1.0.2

(root)# dcli -l root -g ~/dbs_group chmod -R 755 /u01/app/12.1.0.2


(oracle)$ ./deinstall -checkonly

(oracle)$ ./deinstall

Acceptable Hidden Parameters on Exadata Machine

We have all seen hidden parameters used as a workaround to solve a specific problem; they should be removed once the system has been upgraded to a version that contains the fix for that problem. So what happens when you migrate a database to an Exadata Machine, especially using physical migration methods? Most likely the hidden parameters are not removed during the migration process, even though the new version might already contain the fix. Verifying hidden initialization parameter usage helps you avoid keeping hidden parameters any longer than necessary. Otherwise, the use of hidden initialization parameters not recommended by Oracle development can introduce instability, performance problems, corruptions, and crashes into your Exadata environments.

Please verify hidden initialization parameter usage in each ASM and database instance using the following SQL.

select name, value from v$parameter where substr(name,1,1)='_';

That said, there are some acceptable hidden parameters for Exadata Machine. Please review the list of acceptable hidden parameters based on their usage.

Generally Acceptable Hidden Parameters Table

  1. _file_size_increase_increment with a possible value of 2143289344
  2. _enable_NUMA_support, depending on the database version
  3. _asm_resyncckpt with a value of 0, to turn off resync checkpointing
  4. _smm_auto_max_io_size with a value of 1024, to permit 1MB I/Os for hash joins that spill to disk
  5. _parallel_adaptive_max_users with a value of 2
  6. _assm_segment_repair_bg set to false, as a workaround for bug 23734075
  7. _backup_disk_bufcnt set to 64 (only when ZFS-based backups are in use)
  8. _backup_disk_bufsz set to 1048576 (only when ZFS-based backups are in use)
  9. _backup_file_bufcnt set to 64 (only when ZFS-based backups are in use)
  10. _backup_file_bufsz set to 1048576 (only when ZFS-based backups are in use)
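
One way to use this list is as an allow-list against the names returned by the query above. The sketch below is a minimal shell filter under that assumption: it compares names only (not values or versions), expects one trimmed parameter name per line, and `unexpected_hidden_params` and `_example_param` are hypothetical names used for illustration.

```shell
#!/usr/bin/env bash
# Allow-list copied from the list above (names only; values are not checked).
ACCEPTABLE_HIDDEN_PARAMS='_file_size_increase_increment
_enable_NUMA_support
_asm_resyncckpt
_smm_auto_max_io_size
_parallel_adaptive_max_users
_assm_segment_repair_bg
_backup_disk_bufcnt
_backup_disk_bufsz
_backup_file_bufcnt
_backup_file_bufsz'

# Reads parameter names on stdin, one per line with no surrounding spaces,
# and prints the ones that are NOT on the allow-list.
unexpected_hidden_params() {
    # -F fixed strings, -x whole-line match, -i ignore case, -v invert match.
    # grep exits non-zero when nothing is printed, so normalize the status.
    grep -vxiF -e "$ACCEPTABLE_HIDDEN_PARAMS" || true
}

# Example with one acceptable name and one made-up name:
printf '%s\n' _smm_auto_max_io_size _example_param | unexpected_hidden_params
```

The example prints only `_example_param`. Any name the filter prints still needs a manual review against the fix it was meant to work around.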

 

 

How to Perform a Detailed Exadata Health Check

Exadata is a significant investment for any customer, and you should maximize that investment by configuring the Exadata machine according to best practices and utilizing all the features of engineered systems. Oracle provides an array of tools for the Exadata machine, but we see a gap between a standard Exadata configuration and a truly optimized Exadata machine. Exachk is a great tool provided by Oracle to validate Exadata configuration against Oracle best practices, but it’s designed as a standard tool for all Exadata machines. Exachk is not specific to a particular type of workload or application, and it doesn’t investigate enhancement opportunities for achieving extreme performance from an Exadata machine.

That is why you should perform a detailed health check of your Exadata machine that goes above and beyond Exachk validation and Oracle Enterprise Manager monitoring capabilities. The goal of this health check is to maximize the Exadata investment and reduce the number of incidents that can impact the availability of critical applications. Here is the list of tasks you should complete as part of a detailed Exadata health check:

  1. Review Exachk report to evaluate Exadata configuration, MAA Best practices, and database critical issues.
  2. Review various types of Exadata logs including Exawatcher, alert, trace, CRS, ASM, listener.
  3. Review Flash cache contents, verify smart flash log feature and check write-back cache functionality.
  4. Review Exadata feature usage like HCC Compression, Smart Scan, offloading, Storage Indexes
  5. Review Maximum Availability Architecture including backup of critical configuration files
  6. Review and validate Oracle Enterprise Manager Configuration of Exadata plugin.
  7. Review resource utilization at storage & database level and provide recommendations.
  8. Review AWR reports for contention and slow running processes.
  9. Review database parameter settings as per Oracle best practices including hidden parameters.
  10. Review log retention policy to optimize storage utilization and maintain historical data for troubleshooting any future issues.

 

Benefit of Using Oracle Bare Metal Cloud

As many organizations look to the cloud as a way to improve agility and flexibility, as well as to try and cut down their infrastructure support and maintenance costs, they are introduced to new cloud terminology: “Bare Metal Cloud,” or “Dedicated Instances.”

Let’s start by describing Oracle Bare Metal Cloud Services: A dedicated physical hardware/server with no other tenant residing and running workload on it. Bare Metal Cloud Services basically let you lease or rent dedicated physical hardware resource for running isolated workloads at an optimal cost. Depending on the cloud vendor, billing can be metered (hourly) or non-metered (monthly fixed cost).

When compared to the traditional public cloud, which is a Hypervisor Cloud that has many tenants per physical machine sharing hardware resources, Bare Metal Cloud Services is a dedicated physical hardware resource for isolation and performance comparable to on-premises hardware.

Flexibility

Flexibility is a key benefit of Oracle Bare Metal Cloud Services. It gives you complete control over cloud resources, so you can set up and customize them based on your requirements. Basically, you have direct access to the physical resources, whereas in typical cloud offerings the physical resources are hidden behind the hypervisor layer.

Bare Metal Cloud Services also allows a hypervisor on top of the dedicated physical resources, giving you the best of both worlds: allowing you to control the number of virtual machines and the workload on them. It is also important to understand that Bare Metal Cloud Services flexibility comes with a price — it takes a little longer to provision cloud resources, introducing time and complexity to the provisioning process.

Given the added complexity, you might ask why you would opt for Bare Metal Cloud Services. It’s the same reason customers opt for IaaS versus PaaS / SaaS cloud models. You have more control over your environment to install and configure your applications; you start to lose that control as you climb up the cloud stack from IaaS>>>PaaS>>>SaaS models.

Bare Metal Cloud Services offers agility for fast provisioning and on-demand resources, as well as high flexibility to define your servers, network, storage and security based on your requirements. All this makes Bare Metal Cloud Services a great alternative to traditional cloud offerings.

Performance

Performance is a major concern for organizations when it comes to moving their workload to the public cloud. Migrating to a traditional cloud environment can be considered risky for some environments because going from on-premise dedicated hardware to virtualized shared-cloud resources can introduce performance issues. Also, applications that require high memory and CPU sometimes do not fit well into the traditional cloud model. Bare Metal Cloud Services can offer Memory, CPU and Storage Allocations that the traditional shared-cloud service model cannot.

Though many public cloud vendors have not published concrete performance metrics, performance degradation can often occur due to the introduction of hypervisor layer as well as the inherent performance issues from a fully shared resource. Basically, public cloud is a shared environment where multiple Virtual Machines are fighting for the same physical resources, so performance degradation is to be expected. Therefore, if performance is key to your applications, then Bare Metal Cloud Services is probably the best option to run your application in cloud.

Bare Metal Cloud Services let you run your workload on dedicated physical servers without any noisy neighbors running their workload on the same server. This also allows you to troubleshoot performance issues more easily as you are the sole owner of the physical server, and you exactly understand what other type of workload is being run by other applications.

Security & Compliance

Like performance, security is a major concern for organizations when considering moving their environments to the public cloud. Cloud security is about requirements and capabilities to provide layers of security. It does not mean that Bare Metal Cloud Services is more secure than a traditional public cloud, but since you have more control, you can install and configure additional layers of security to further improve the security.

Additionally, because Bare Metal Cloud Services is a single-tenant solution, it provides you isolation, which can be an important compliance requirement for your organization. This allows the possibility that many security-sensitive organizations can move their workload to the public cloud by being able to conform to regulatory compliance requirements.

Furthermore, some software vendors do not support or accept licensing on virtualized hardware because of soft partitioning, since it’s hard to determine the actual number of software licenses required for any given virtualized server in the cloud. In this scenario, Bare Metal Cloud Services can be considered a viable public cloud option to satisfy the licensing requirements of any application or software vendor.