Using Oracle ZFS Storage Appliance for OLTP Workload

As mentioned earlier, the ZFS Storage Appliance is best suited to OLAP workloads, but you can use it for non-critical OLTP workloads as well. By nature, online transaction processing (OLTP) workloads tend to go after a small number of rows per I/O request. Imagine a busy OLTP system issuing thousands of random I/Os, each touching a small amount of data; it needs short response times to stay healthy. That means you should utilize the appliance's read and write flash SSDs to get reasonable performance for your OLTP applications. As with an OLAP database workload, it is recommended to use bigfile tablespaces and Oracle Direct NFS. Unlike an OLAP workload, however, it is best to use Advanced Row Compression to optimize I/O response time and memory utilization.
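
For example, the bigfile and Advanced Row Compression advice can be combined at the tablespace level. The statement below is a minimal sketch only: the tablespace name, datafile path, and size are hypothetical, and the ROW STORE COMPRESS ADVANCED clause assumes Oracle Database 12c or later (on 11g the equivalent default is COMPRESS FOR OLTP).

SQL> CREATE BIGFILE TABLESPACE oltp_data
     DATAFILE '/u02/oradata/ORCL/oltp_data01.dbf' SIZE 1T AUTOEXTEND ON
     DEFAULT ROW STORE COMPRESS ADVANCED;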


Recommended OLTP Database Layout on ZFS Storage
Oracle Files    | Record Size | Sync Write Bias | Read Cache               | Compression | Example Share Name | Storage Profile
Datafiles       | 32K         | Latency         | All data and metadata    | LZ4         | data               | Mirrored (NSPF)
Temp            | 128K        | Latency         | Do not use cache devices | LZ4         | temp               | Mirrored (NSPF)
Archive logs    | 1M          | Throughput      | Do not use cache devices | LZ4         | archive            | Mirrored (NSPF)

FRA (redo and control files): preconfigured high-redundancy ASM disk group (+RECO)

Recommended Redo and Control File OLTP Settings on ZFS Storage
Oracle Files    | Record Size | Sync Write Bias | Read Cache               | Compression | Example Share Name
Redo logs       | 128K        | Latency         | Do not use cache devices | Off         | reco
Control files   | 8K          | Latency         | All data and metadata    | Off         | control


Again, let's walk through the rest of the recommendations for mapping the different database files to their recommended locations and record sizes. Data files, temp files, and archive logs can be placed on ZFS shares with record sizes of 32K, 128K, and 1M respectively. Online redo logs and control files, meanwhile, are best placed in the default Fast Recovery Area (FRA) on the preconfigured Exadata ASM disk group.
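
To illustrate how the share settings in the tables above are applied, here is a minimal sketch of creating the OLTP data share from the appliance CLI. The appliance host name, project name (oltp), and share name are hypothetical, the prompts are abbreviated, and the property names (recordsize, logbias, secondarycache, compression) correspond to the Record Size, Sync Write Bias, Read Cache, and Compression columns; confirm the exact names and values against the documentation for your appliance software release.

zfssa01:> shares
zfssa01:shares> select oltp
zfssa01:shares oltp> filesystem data
zfssa01:shares oltp/data> set recordsize=32K
zfssa01:shares oltp/data> set logbias=latency
zfssa01:shares oltp/data> set secondarycache=all
zfssa01:shares oltp/data> set compression=lz4
zfssa01:shares oltp/data> commit

The temp and archive shares follow the same pattern, using the record size, write bias, and cache settings from the table above.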


Using Oracle ZFS Storage Appliance for OLAP Workload

OLAP database workloads mostly consist of SQL queries that scan large portions of tables, or batch processes that bulk-load data overnight. You might be loading terabytes of data into your data warehouse, but those loads are not random DML statements; the critical workload is the user queries supporting decision support systems (DSS). This very nature of DSS can make it a good fit for database compression, including Hybrid Columnar Compression (HCC). When using database compression, understand that HCC requires some maintenance work to keep the data HCC-compressed, since it is applied on direct-path loads, and it does not support OLTP compression. You should only use compression if you are cloning or duplicating a database environment that already uses database compression, so the copy is a true representation of the source environment. Additionally, you have the option of either tablespace format, traditional smallfile or bigfile, but it is recommended to use bigfile tablespaces to achieve better performance, flexibility, and capacity. Finally, remember to use Oracle Direct NFS; it is not recommended to use the Oracle Intelligent Storage Protocol (OISP) for OLAP workloads.
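
As an illustration of pairing a bigfile tablespace with HCC for a warehouse segment, the sketch below uses hypothetical object names, a hypothetical datafile path, and arbitrary sizes; COMPRESS FOR QUERY HIGH is one of the HCC levels and requires the datafiles to reside on Oracle storage such as the ZFS Storage Appliance or Exadata.

SQL> CREATE BIGFILE TABLESPACE dwh_data
     DATAFILE '/u02/oradata/DWH/dwh_data01.dbf' SIZE 4T AUTOEXTEND ON;

SQL> CREATE TABLE sales_hist
     ( sale_id NUMBER, sale_date DATE, amount NUMBER )
     TABLESPACE dwh_data
     COMPRESS FOR QUERY HIGH;

Remember that HCC is applied on direct-path loads (for example, INSERT /*+ APPEND */ or SQL*Loader direct path); conventionally inserted rows are not stored in HCC format.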

Recommended OLAP Database Layout on ZFS Storage
Oracle Files    | Record Size | Sync Write Bias | Read Cache               | Compression | Example Share Name
Datafiles       | 128K        | Throughput      | Do not use cache devices | LZ4         | data
Temp            | 128K        | Latency         | Do not use cache devices | LZ4         | temp
Archive logs    | 1M          | Throughput      | Do not use cache devices | LZ4         | archive

FRA (redo and control files): preconfigured high-redundancy ASM disk group (+RECO)

Recommended Redo and Control File OLAP Settings on ZFS Storage
Oracle Files    | Record Size | Sync Write Bias | Read Cache               | Compression | Example Share Name
Redo logs       | 128K        | Latency         | Do not use cache devices | Off         | reco
Control files   | 8K          | Latency         | All data and metadata    | Off         | control


Now let's look at the rest of the recommendations for mapping the different database files to their recommended locations and record sizes. Data files, temp files, and archive logs can be placed on ZFS shares with record sizes of 128K, 128K, and 1M respectively. Even though you have the option to place redo logs and control files on a ZFS share, it is best practice to place them in the default Fast Recovery Area (FRA) on the preconfigured Exadata ASM disk group. It is hard to justify putting these two sensitive file types on a ZFS share: they do not take much space and they require low latency.

What is so special about Oracle ZFS Storage Appliance?

  1. Oracle ZFS provides extreme network bandwidth over 10GbE and InfiniBand connections with built-in network redundancy.
  2. Oracle ZFS ensures data integrity using features such as copy-on-write and metadata checksumming, detecting silent data corruption and correcting errors before it is too late.
  3. Oracle ZFS uses the Oracle Intelligent Storage Protocol (OISP) to uniquely identify the different types of database I/O requests and, with its built-in Analytics, helps you effectively address performance bottlenecks.
  4. Oracle ZFS is extremely easy to manage through its native web management interface and provides integration with Oracle Enterprise Manager.
  5. Oracle ZFS supports a full range of compression options and is tightly integrated with the Oracle database to provide Hybrid Columnar Compression (HCC), which is only available on Oracle storage.
  6. Oracle ZFS is tightly integrated with the Oracle database, which helps achieve extreme backup rates of up to 42 TB/hr and restore rates of up to 55 TB/hr.
  7. Oracle ZFS storage supports a highly redundant and scalable InfiniBand architecture that can be seamlessly integrated with an Oracle Exadata machine to provide a cost-effective storage option.
  8. The Oracle ZFS appliance is integrated with Oracle RMAN to provide up to 2,000 concurrent threads, evenly distributed across many channels spread over multiple controllers.
  9. Oracle ZFS uses Oracle Direct NFS to reduce CPU and memory overhead by bypassing the operating system and writing buffers directly to user space.
  10. Oracle ZFS supports a 1 MB record size, which reduces the number of IOPS required against disk, preserves the I/O size from RMAN buffers to storage, and improves the performance of large-block sequential operations.

Benefits of Using Oracle Bare Metal Cloud

As many organizations look to the cloud as a way to improve agility and flexibility, as well as to try and cut down their infrastructure support and maintenance costs, they are introduced to new cloud terminology: “Bare Metal Cloud,” or “Dedicated Instances.”

Let’s start by describing Oracle Bare Metal Cloud Services: A dedicated physical hardware/server with no other tenant residing and running workload on it. Bare Metal Cloud Services basically let you lease or rent dedicated physical hardware resource for running isolated workloads at an optimal cost. Depending on the cloud vendor, billing can be metered (hourly) or non-metered (monthly fixed cost).

When compared to the traditional public cloud, which is a Hypervisor Cloud that has many tenants per physical machine sharing hardware resources, Bare Metal Cloud Services is a dedicated physical hardware resource for isolation and performance comparable to on-premises hardware.

Flexibility

Flexibility is a key benefit of Oracle Bare Metal Cloud Services. It gives you complete control over cloud resources, so you can set them up and customize them based on your requirements. Basically, you have direct physical access to the resources, compared to typical cloud offerings where physical resources are hidden behind the hypervisor layer.

Bare Metal Cloud Services also allows a hypervisor on top of the dedicated physical resources, giving you the best of both worlds: allowing you to control the number of virtual machines and the workload on them. It is also important to understand that Bare Metal Cloud Services flexibility comes with a price — it takes a little longer to provision cloud resources, introducing time and complexity to the provisioning process.

Given the added complexity, you might ask why you would opt for Bare Metal Cloud Services. It’s the same reason customers opt for IaaS versus PaaS / SaaS cloud models. You have more control over your environment to install and configure your applications; you start to lose that control as you climb up the cloud stack from IaaS>>>PaaS>>>SaaS models.

Bare Metal Cloud Services offers agility for fast provisioning and on-demand resources, as well as high flexibility to define your servers, network, storage and security based on your requirements. All this makes Bare Metal Cloud Services a great alternative to traditional cloud offerings.

Performance

Performance is a major concern for organizations when it comes to moving their workload to the public cloud. Migrating to a traditional cloud environment can be considered risky for some environments because going from on-premises dedicated hardware to virtualized shared-cloud resources can introduce performance issues. Also, applications that require high memory and CPU sometimes do not fit well into the traditional cloud model. Bare Metal Cloud Services can offer memory, CPU, and storage allocations that the traditional shared-cloud service model cannot.

Though many public cloud vendors have not published concrete performance metrics, performance degradation can often occur due to the introduction of a hypervisor layer as well as the inherent performance issues of a fully shared resource. Basically, the public cloud is a shared environment where multiple virtual machines fight for the same physical resources, so some performance degradation is to be expected. Therefore, if performance is key to your applications, then Bare Metal Cloud Services is probably the best option for running them in the cloud.

Bare Metal Cloud Services let you run your workload on dedicated physical servers without any noisy neighbors running their workload on the same server. This also allows you to troubleshoot performance issues more easily: you are the sole owner of the physical server, and you know exactly what other workloads your own applications are running on it.

Security & Compliance

Like performance, security is a major concern for organizations when considering moving their environments to the public cloud. Cloud security is about requirements and capabilities to provide layers of security. It does not mean that Bare Metal Cloud Services is more secure than a traditional public cloud, but since you have more control, you can install and configure additional layers of security to further improve the security.

Additionally, because Bare Metal Cloud Services is a single-tenant solution, it provides you isolation, which can be an important compliance requirement for your organization. This allows the possibility that many security-sensitive organizations can move their workload to the public cloud by being able to conform to regulatory compliance requirements.

Furthermore, some software vendors do not support or accept licensing on virtualized hardware that uses soft partitioning, because it is hard to determine the actual number of software licenses required for any given virtualized server in the cloud. In this scenario, Bare Metal Cloud Services can be a viable public cloud option to satisfy the licensing requirements of any application or software vendor.

Oracle ZFS Storage Pool Data Profile Best Practices

Hello everyone, recently I was part of an Oracle ZFS storage pool design discussion, mostly focused on data profile types and Oracle best practices. Oracle recommends the mirrored data profile for many ZFS storage use cases, including traditional RMAN backups and image backups, for the best performance and availability. I strongly recommend using mirrored pools for production systems. For non-production systems, you can use double parity or triple parity wide stripes if performance is not a major concern. Believing a picture says a thousand words, please see the chart below, which shows the availability, performance, and capacity of a 70 GB storage pool under each profile. As you can see, the striped data profile provides the most capacity but no availability, which can lead to data loss, while the mirrored data profile provides both performance and availability.

Note: The figure above is based on a storage pool with 70 GB of raw capacity.
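
As a rough back-of-the-envelope illustration only (the exact figures depend on the number of disks and the stripe widths the appliance chooses), 70 GB of raw capacity works out to approximately:

Striped: 70 GB usable (no redundancy)
Mirrored: 70 / 2 = 35 GB usable
Triple mirrored: 70 / 3 ≈ 23 GB usable
Double parity (assuming, say, 9 data + 2 parity disks per stripe): 70 × 9/11 ≈ 57 GB usable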

Please see below a detailed description of all the available data profiles:

Double parity: Each array stripe contains two parity disks, yielding high availability while increasing capacity over mirrored configurations. Double parity striping is recommended for workloads requiring little or no random access, such as backup/restore.

Mirrored: Duplicate copies of data yield fast and reliable storage by dividing access and redundancy evenly between two sets of disks. Mirroring is intended for workloads favoring high performance and availability over capacity, such as databases. When storage space is ample, consider triple mirroring for increased throughput and data protection at the cost of one-third total capacity.

Single parity, narrow stripes: Each narrow stripe assigns one parity disk for each set of three data disks, offering better random read performance than double parity stripes and larger capacity than mirrored configurations. Narrow stripes can be effective for configurations that are neither heavily random nor heavily sequential as it offers a compromise between the two access patterns.

Striped: Data is distributed evenly across all disks without redundancy, maximizing performance and capacity, but providing no protection from disk failure whatsoever. Striping is recommended only for workloads in which data loss is an acceptable tradeoff for marginal gains in throughput and storage space.

Triple mirrored: Three redundant copies of data yield a very fast and highly reliable storage system. Triple mirroring is recommended for workloads requiring both maximum performance and availability, such as critical databases. Compared to standard mirroring, triple mirrored storage offers increased throughput and an added level of protection against disk failure at the expense of capacity.

Triple parity, wide stripes: Each wide stripe has three parity disks and allocates more data disks to maximize capacity. Triple parity is not generally recommended due to its limiting effect on I/O operations and its low random access performance; however, these effects can be mitigated with cache devices.


Using Oracle Direct NFS with Exadata Machine

Oracle Direct NFS (dNFS) is an NFS client that resides within the database kernel and should be enabled on Exadata for all direct database and RMAN workloads between Exadata and the Oracle ZFS Storage Appliance. With this feature enabled, you get increased bandwidth and reduced CPU overhead. No additional steps are required beyond enabling dNFS, although it is recommended to increase the number of NFS server threads on the appliance from the default to 1000.

As per the Oracle documentation, using Oracle Direct NFS with Exadata can provide the following benefits.

  • Significantly reduces system CPU utilization by bypassing the operating system (OS) and caching data just once in user space with no second copy in kernel space
  • Boosts parallel I/O performance by opening an individual TCP connection for each database process
  • Distributes throughput across multiple network interfaces by alternating buffers to multiple IP addresses in a round-robin fashion
  • Provides high availability (HA) by automatically redirecting failed I/O to an alternate address

In Oracle Database 12c, dNFS is already enabled by default.   In 11g, Oracle Direct NFS may be enabled on a single database node with the following command:

$ make -f $ORACLE_HOME/rdbms/lib/ins_rdbms.mk dnfs_on

Exadata dcli may be used to enable dNFS on all of the database nodes simultaneously:

$ dcli -l oracle -g /home/oracle/dbs_group make -f $ORACLE_HOME/rdbms/lib/ins_rdbms.mk dnfs_on

Note: The database instance should be restarted after enabling Oracle Direct NFS.

You can confirm that dNFS is enabled by checking the database alert log:

“Oracle instance running with ODM: Oracle Direct NFS ODM Library Version 3.0”

You can also use the following SQL query to confirm dNFS activity:

SQL> select * from v$dnfs_servers;
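
If you want dNFS to spread traffic across more than one appliance interface (the round-robin behavior described in the benefits list above), the NFS servers and mount points are described in an oranfstab file, either $ORACLE_HOME/dbs/oranfstab or /etc/oranfstab. The entry below is a minimal sketch only; the server name, local and path IP addresses, export, and mount point are all hypothetical:

server: zfssa01
local: 192.168.28.1
path: 192.168.28.101
local: 192.168.29.1
path: 192.168.29.101
export: /export/data mount: /u02/oradata

Each local/path pair names a database-server interface and the appliance address it should use; dNFS alternates buffers across the listed paths and redirects I/O to a surviving path if one becomes unreachable.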


Patching Guidelines for Exadata Machine

Even though Oracle offers free Exadata patching to its Exadata customers under Oracle Platinum Services, you might still end up applying patches to your Exadata machine yourself for many reasons. A compliance issue or a scheduling conflict may prevent you from using Oracle Platinum Services to patch your Exadata systems. Remember, Oracle needs a minimum of 8-12 weeks' notice ahead of the date the customer wants to be patched, and that might not work for you. So if you are one of those lucky Exadata machine admins planning to apply patches to your own Exadata systems, here are some guidelines for completing the patching task safely and with minimum risk.

Guidelines

  1. Carefully review the patch README file and familiarize yourself with the known issues and rollback options.
  2. Create a detailed workbook for patching the Exadata machine, including the rollback options.
  3. Find a test system in your organization that mimics the production system in terms of capacity and software versions.
  4. Run the Exachk utility before you start applying the patch to establish a baseline, and fix any major issues reported in the Exadata Health Check report (see the sketch after this list).
  5. Reboot your Exadata machine before you start applying the patch.
  6. Make sure you have enough free space on all the mount points affected by the patch.
  7. Back up everything, and I mean everything: all the databases and the storage mounts holding the software binaries.
  8. Apply the patch on the test system and document each step in the workbook you will use to deploy the patch to the rest of your Exadata systems.
  9. Run the Exachk utility after the patch has been applied successfully and compare the new report against the baseline Exachk report.
  10. Reboot the Exadata machine after deploying the patch to make sure there will be no issues with future Exadata reboots.
  11. Verify all the Exadata software and hardware components: InfiniBand switches, storage cells, and compute nodes.
  12. After a successful patching exercise on the test system, move on to applying the patch to the production systems.
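
Steps 4, 9, and 11 lend themselves to a small, repeatable set of commands. The lines below are a rough sketch only: they assume Exachk is staged on the first database node, that the usual dcli group files (dbs_group, cell_group, ib_group) exist in the oracle user's home directory, and that the required ssh equivalence is in place; option names and output labels vary between Exachk and Exadata software releases.

$ ./exachk -a                              # full health check; keep the HTML report as the pre-patch baseline
$ dcli -l root -g ~/dbs_group imageinfo    # record the compute node image versions
$ dcli -l root -g ~/cell_group imageinfo   # record the storage cell image versions
$ dcli -l root -g ~/ib_group version       # record the InfiniBand switch firmware versions

Re-run the same commands after patching and compare the output, and the new Exachk report, against the baseline.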

Isolate your Exadata Network with InfiniBand Partitioning

An efficient system is one that provides a balance of CPU performance, memory bandwidth, and I/O performance. Many professionals will agree that Oracle Engineered Systems are good examples of efficient systems. The InfiniBand network plays a key role in providing that balance, delivering close to 32 gigabits per second of bandwidth with very low latency. It provides a high-speed, high-efficiency network for Oracle's engineered systems, namely Exadata, Exalogic, Big Data Appliance, SuperCluster, the ZFS Storage Appliance, and so on.

If you are planning to consolidate systems on an Exadata machine, you might be required to implement network isolation across the multiple environments within the consolidated system for security or compliance reasons. This is accomplished with custom InfiniBand partitioning, using dedicated partition keys (pkeys) and the partition tables maintained by the subnet manager. InfiniBand partitioning provides isolation across the different RAC clusters so that the network traffic of one RAC cluster is not accessible to another RAC cluster. Note that you can implement similar functionality for the Ethernet networks with VLAN tagging.

An InfiniBand partition creates a group of InfiniBand members that are only allowed to communicate with each other. A unique partition key identifies each partition, which is managed by the master subnet manager. Members are assigned to these custom partitions and can only communicate with other members of the same partition. For example, if you implement InfiniBand partitioning with OVM-based Exadata clusters, each cluster is assigned one dedicated partition for Clusterware communication and one partition for communication with the storage cells. The nodes of one RAC cluster cannot talk to the nodes of another RAC cluster belonging to a different partition, which gives you network isolation within a single Exadata machine.
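
To give a feel for how this is administered, partition keys are defined on the InfiniBand switch that hosts the master Subnet Manager, using the smpartition utility available on Exadata's InfiniBand switches. The commands below are a rough sketch only: the switch name, partition name, pkey value, and membership flag are shown as assumptions, and the authoritative step-by-step procedure (including the OEDA/OVM integration) is in the Exadata documentation.

[root@exaswitch01 ~]# smpartition start                                   # open a partition configuration session
[root@exaswitch01 ~]# smpartition create -n clu01 -pkey 0xa010 -m both    # define a pkey for one cluster's private network
[root@exaswitch01 ~]# smpartition commit                                  # activate and propagate the new partition configuration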