Listing Oracle ZFS Appliance Snapshots

The Oracle ZFS Storage Appliance features a snapshot data service. Snapshots are read-only copies of a filesystem at a given point in time. You can think of a ZFS snapshot as a restore point for a project or share's data set, which can be used to roll the data set back to a point in time, conceptually similar to Oracle database restore points. ZFS snapshots are purely logical entities, so you can create a virtually unlimited number of snapshots, and they consume no additional space until the underlying data changes. Snapshots can be scheduled or taken manually, depending on usage and policies. Snapshots can be managed using the Oracle ZFS Appliance browser user interface (BUI) or through scripts. Scripting is especially useful when you want to integrate snapshot management with Oracle backups. SSH user equivalence may be required if you want to execute the following script without providing the root password. The following example shows how to list project snapshots using a shell script on both ZFS controllers (in case you are using an active/active ZFS cluster).
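The scripts in this post assume that SSH user equivalence to the ZFS controllers is already in place. The following is only a minimal sketch of setting it up from an Exadata compute node; the controller hostnames match the ones used in the scripts below, and the appliance-side step of registering the public key is performed through the BUI or CLI and varies by appliance release.

# Generate an RSA key pair on the database node (no passphrase, so scripts can run unattended)
ssh-keygen -t rsa -b 2048 -f ~/.ssh/id_rsa -N ""

# Display the public key, then register it for the root user on each ZFS controller
# through the appliance BUI or CLI (exact navigation depends on your appliance release)
cat ~/.ssh/id_rsa.pub

# Confirm passwordless access to both controllers before scheduling any scripts
ssh -T -i ~/.ssh/id_rsa root@zfscontroller-1 exit
ssh -T -i ~/.ssh/id_rsa root@zfscontroller-2 exit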

Listing Project Snapshots

> cat list_snapshots.sh

echo "Head 1"
cat <<eof | ssh -T -i ~/.ssh/id_rsa root@zfscontroller-1
script
run('shares');
run('set pool=pool1');
run('select H1-dbname');
run('snapshots');
snapshots = list();
for (i = 0; i < snapshots.length; i++) {
  printf("%20s:", snapshots[i]);
  run('select ' + snapshots[i]);
  printf("%-10s\n", run('get space_data').split(/\s+/)[3]);
  run('cd ..');
}
eof

echo "Head 2"
cat <<eof | ssh -T -i ~/.ssh/id_rsa root@zfscontroller-2
script
run('shares');
run('set pool=pool2');
run('select H2-dbhome');
run('snapshots');
snapshots = list();
for (i = 0; i < snapshots.length; i++) {
  printf("%20s:", snapshots[i]);
  run('select ' + snapshots[i]);
  printf("%-10s\n", run('get space_data').split(/\s+/)[3]);
  run('cd ..');
}
eof

Script Output:

> ./list_snapshots.sh

Head 1

  snap_20170921_1720:9.17T

  snap_20170924_1938:18.5T

Head 2

  snap_20170921_1720:8.09T

  snap_20170924_1938:16.2T

oracle@exa2node:/home/oracle

Creating Oracle ZFS Snapshots Using a Shell Script

As explained in the previous section, snapshots can be managed through the Oracle ZFS Appliance BUI or through scripts, and scripting is especially useful when you want to integrate snapshot management with Oracle backups. SSH user equivalence may be required if you want to execute the following script without providing the root password. The following example shows how to create project snapshots using a shell script on both ZFS controllers (in case you are using an active/active ZFS cluster).

Creating Project Snapshots

 

> cat snapshotdt_project.sh

#ssh-agent bash
#ssh-add ~/.ssh/id_dsa
{
echo script
echo "{"
echo " run('cd /');"
echo " run('shares');"
echo " run('set pool=pool1');"
echo " run('select H1-dbname');"
dt=`date "+%Y%m%d_%H%M"`;
echo " run('snapshots snapshot snap_$dt');"
echo " printf('snapshot of the project H1-dbname completed..\n');"
echo "}"
echo "exit"
} | ssh -T -i ~/.ssh/id_rsa root@zfscontroller-1

{
echo script
echo "{"
echo " run('cd /');"
echo " run('shares');"
echo " run('set pool=pool2');"
echo " run('select H2-dbname');"
dt=`date "+%Y%m%d_%H%M"`;
echo " run('snapshots snapshot snap_$dt');"
echo " printf('snapshot of the project H2-dbname completed..\n');"
echo "}"
echo "exit"
} | ssh -T -i ~/.ssh/id_rsa root@zfscontroller-2

Script Output:

> ./snapshotdt_project.sh

snapshot of the project H1-dbname completed..

snapshot of the project H2-dbname completed..
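If you want to take these snapshots on a schedule rather than on demand, one simple option is to drive the script from cron on a database node. The schedule, script location, and log file below are assumptions for illustration only; adapt them to your own environment and retention policy.

# Example crontab entry (crontab -e as the oracle user): take project snapshots
# every night at 02:00 and append the output to a log file
0 2 * * * /home/oracle/scripts/snapshotdt_project.sh >> /home/oracle/scripts/snapshotdt_project.log 2>&1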

VLAN Tagging with Oracle Exadata Database Machine

A VLAN is an independent logical network created within a physical network; when it spans multiple switches, VLAN tagging is required. Essentially, a VLAN ID is inserted into every network packet header so that switches can identify the correct VLAN and route the packet to the correct network interface and port. The network administrator configures VLANs at the switch level, including specifying which ports belong to each VLAN.

If Exadata needs to access additional VLANs on the public network, for example to enable network isolation, then 802.1Q-based VLAN tagging is the solution. By default, the Exadata switch is minimally configured, without VLAN tagging. If you want to use VLAN tagging, you need to plan for it and enable it after the initial deployment. This applies to both physical and Oracle VM deployments. Both compute nodes and storage nodes can be configured to use VLANs for the management network, ILOM, the client network, and the backup access network.

The overall configuration process can be divided into two phases. In the first phase, VLAN-tagged interfaces are created at the Linux operating system level with persistent configurations, and their respective default gateway IP addresses are configured via iproute2 using unique routing tables. In the second phase, the database listeners and Clusterware resources are provisioned. A sketch of what the first phase produces is shown below.
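To illustrate the first phase, the sketch below shows what a persistent VLAN-tagged client interface and its dedicated routing table might look like on a database node. The VLAN ID 100, the IP addresses, and routing table 220 are assumptions for illustration only; on Exadata these files are normally created by the documented Oracle procedures, so treat this purely as a reference for the end result.

# /etc/sysconfig/network-scripts/ifcfg-bondeth0.100  (hypothetical VLAN ID 100 on the client bond)
DEVICE=bondeth0.100
ONBOOT=yes
BOOTPROTO=static
IPADDR=10.10.100.21
NETMASK=255.255.255.0
VLAN=yes

# /etc/sysconfig/network-scripts/route-bondeth0.100  (routes kept in a dedicated table)
10.10.100.0/24 dev bondeth0.100 table 220
default via 10.10.100.1 dev bondeth0.100 table 220

# /etc/sysconfig/network-scripts/rule-bondeth0.100  (policy rules pointing at that table)
from 10.10.100.21 table 220
to 10.10.100.21 table 220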

Other considerations

  • VLANs do not exist in InfiniBand. For equivalent functionality, see InfiniBand partitioning.
  • Network VLAN tagging is supported for Oracle Real Application Clusters on the public network.
  • If the backup network is on a tagged VLAN network, the client network must also be on a separate tagged VLAN network.
  • The backup and client networks can share the same network cables.
  • OEDA supports VLAN tagging for both physical and virtual deployments.
  • Client and backup VLAN networks must be bonded.
  • The admin network is never bonded.

 

Exadata Storage Indexes can store up to 24 Columns with 12.2.1.1.0

As we all know, Exadata storage indexes used to hold information for up to eight columns through Exadata Storage Server Software release 12.1.0.2. With Oracle Exadata Storage Server Software release 12.2.1.1.0, storage indexes have been enhanced to store column information for up to 24 columns.

It is important to understand that only the space to store column information for eight columns is guaranteed. For more than eight columns, space is shared between the column set membership summary and the column minimum/maximum summary. The type of workload determines whether the set membership summary gets stored in the storage index. I am hoping to test this new feature shortly and will post the results for my readers.
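Although storage indexes are created and maintained automatically and cannot be managed directly, you can measure how much disk I/O they are saving. The following is a minimal sketch run from a database node; the statistic name is the standard one exposed by the database, everything else is generic.

sqlplus -s / as sysdba <<'EOF'
-- Disk I/O avoided by storage indexes since instance startup
set linesize 120
column name format a55
select name, value
from   v$sysstat
where  name = 'cell physical IO bytes saved by storage index';
exit
EOF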


Using Oracle Direct NFS with Exadata Machine

Oracle Direct NFS (dNFS) is an NFS client that resides within the database kernel and should be enabled on Exadata for all direct database and RMAN workloads between Exadata and the Oracle ZFS Storage Appliance. With this feature enabled, you get increased bandwidth and reduced CPU overhead. No additional configuration steps are required for dNFS itself, although it is recommended to increase the number of NFS server threads on the appliance from the default to 1000.

As per the Oracle documentation, using Oracle Direct NFS with Exadata can provide the following benefits:

  • Significantly reduces system CPU utilization by bypassing the operating system (OS) and caching data just once in user space with no second copy in kernel space
  • Boosts parallel I/O performance by opening an individual TCP connection for each database process
  • Distributes throughput across multiple network interfaces by alternating buffers to multiple IP addresses in a round-robin fashion
  • Provides high availability (HA) by automatically redirecting failed I/O to an alternate address

In Oracle Database 12c, dNFS is already enabled by default.   In 11g, Oracle Direct NFS may be enabled on a single database node with the following command:

$ make -f $ORACLE_HOME/rdbms/lib/ins_rdbms.mk dnfs_on

Exadata dcli may be used to enable dNFS on all of the database nodes simultaneously:

$ dcli -l oracle -g /home/oracle/dbs_group make -f $ORACLE_HOME/rdbms/lib/ins_rdbms.mk dnfs_on

Note: The database instances should be restarted after enabling Oracle Direct NFS.

You can confirm that dNFS is enabled by checking the database alert log for a line similar to:

"Oracle instance running with ODM: Oracle Direct NFS ODM Library Version 3.0"

You can also use the following SQL query to confirm dNFS activity:

SQL> select * from v$dnfs_servers;
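When more than one network path exists between the database servers and the ZFS Storage Appliance, Direct NFS can also be given an oranfstab file so it load-balances and fails over across those paths. The sketch below is an example only; the server name, local and path IP addresses, and export/mount points are assumptions that must be replaced with your own values.

# $ORACLE_HOME/dbs/oranfstab (example only)
server: zfs-storage
local: 192.168.10.11
path: 192.168.10.101
local: 192.168.10.12
path: 192.168.10.102
export: /export/backup1 mount: /zfs/backup1
export: /export/backup2 mount: /zfs/backup2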

 

Patching Guidelines for Exadata Machine

Even though Oracle offers free Exadata patching to Exadata customers under Oracle Platinum Services, you might still end up applying patches to your Exadata Machine yourself for many reasons. A compliance issue or a scheduling conflict may prevent you from using Oracle Platinum Services to patch your Exadata systems. Remember, Oracle needs a minimum of 8-12 weeks' notice before the requested patching window, which might not work for you. So if you are one of those lucky Exadata Machine administrators planning to apply patches to your own Exadata systems, here are some guidelines for safely completing the patching task with minimum risk.

Guidelines

  1. Carefully review the patch readme file and familiarize yourself with known issues and rollback options.
  2. Create a detailed workbook for patching the Exadata Machine, including the rollback option.
  3. Find a test system in your organization that mimics the production system in terms of capacity and software versions.
  4. Run the Exachk utility before you start applying the patch to establish a baseline, and fix any major issues you see in the Exadata health check report.
  5. Reboot your Exadata Machine before you start applying the patch.
  6. Make sure you have enough storage on all the mounts affected by the patch.
  7. Back up everything, and I mean everything: all the databases and the storage mounts holding the software binaries.
  8. Apply the patch on the test system and document each step in the workbook so it can be reused for the rest of the Exadata systems.
  9. Run the Exachk utility after the successful patch application and compare it against the baseline Exachk report (see the sketch after this list).
  10. Reboot the Exadata Machine after deploying the patch to make sure there will not be issues with future Exadata reboots.
  11. Verify all the Exadata software and hardware components: InfiniBand switches, storage cells, and compute nodes.
  12. After a successful patching exercise on the test system, move on to applying the patch to the production systems.
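For guidelines 4 and 9, a typical flow is to capture an Exachk baseline before patching and compare it with the post-patching run. The sketch below assumes Exachk is installed under /opt/oracle.SupportTools/exachk and that your version supports the -diff option; both vary by release, so check the readme that ships with your Exachk distribution.

# Before patching: capture a baseline report
cd /opt/oracle.SupportTools/exachk
./exachk -a

# After patching: run Exachk again, then compare the two reports
./exachk -a
# <before> and <after> are placeholders for the two generated report directories or zip files
./exachk -diff exachk_<before>_report exachk_<after>_report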

Improve Temp Reads and Writes with Exadata Storage Server Release 12.2.1.1.0

We have all worked with large temp tablespaces in our data warehouse databases. I personally have worked with a 10 TB temp tablespace for a 50 TB data warehouse running on an Exadata machine, which was required for large table joins and aggregation operations. Temp writes and temp reads are used when large joins or aggregation operations don't fit in memory and must be spilled to storage. Before Oracle Exadata Storage Server release 12.2.1.1.0, temp writes were not cached in flash cache; both temp writes and subsequent temp reads went to hard disk only. With the release of Oracle Exadata Storage Server 12.2.1.1.0, temp writes are sent to flash cache so that subsequent temp reads can be served from flash cache as well. This can drastically improve performance for queries that spill into the temp area. As per Oracle, certain queries can run up to four times faster.

Additionally, imagine an application that uses a lot of temp tables; those operations can now run entirely from flash, which can improve performance for such applications manyfold. This feature uses a threshold of 128 KB to decide whether to send a request directly to disk or write it to flash cache. Therefore, direct load writes, flashback database log writes, archived log writes, and incremental backup writes bypass flash cache. Large writes are redirected into the flash cache only when they do not disrupt the higher-priority OLTP or scan workloads, and such writes are later written back to the disks when the disks are less busy.

Considerations:

  • Write-back flash cache has to be enabled for this feature to work (a quick check is shown after this list).
  • If you are running Oracle Database 11g release 2 (11.2) or Oracle Database 12c release 1 (12.1), you need the patch for bug 24944847.
  • This feature is supported on all Oracle Exadata hardware except the V2 and X2 storage servers.
  • Flash caching of temp writes and large writes is not supported when flash compression is enabled.
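A quick way to confirm the first consideration is to query the flash cache mode on every storage server. The sketch below assumes a cell_group file listing the storage cell hostnames, as is commonly kept on the first database node.

# Flash cache mode must report WriteBack on every cell for temp/large writes to be cached
dcli -g ~/cell_group -l root "cellcli -e list cell attributes name,flashCacheMode"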

 

Configure Exadata Storage Server Email Alerts

It is very important to configure proper monitoring and alerting for your Exadata Machine to decrease the risk of a problem not being detected in a timely manner. Oracle's recommended best practice for monitoring an Oracle Exadata Database Machine is Oracle Enterprise Manager (OEM) together with the suite of OEM plug-ins developed for the Oracle Exadata Database Machine. Please reference My Oracle Support (MOS) Note 1110675.1 for details.

Additionally, Exadata storage servers can send alerts via email. Sending these messages helps ensure that a problem is detected and corrected promptly. First, use the following cellcli command to validate the email configuration by sending a test email:

alter cell validate mail;

The output will be similar to:

Cell slcc09cel01 successfully altered

If the output is not successful, configure a storage server to send email alerts using the following cellcli command (tailored to your environment):

ALTER CELL smtpServer='mailserver.maildomain.com', -
smtpFromAddr='firstname.lastname@maildomain.com', -
smtpToAddr='firstname.lastname@maildomain.com', -
smtpFrom='Exadata cell', -
smtpPort='<port for mail server>', -
smtpUseSSL='TRUE', -
notificationPolicy='critical,warning,clear', -
notificationMethod='mail';
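If you manage a full rack, the same validation and review can be pushed to every storage server with dcli instead of logging in to each cell. This is a sketch that assumes a cell_group file listing the storage cell hostnames.

# Validate the e-mail configuration on all storage servers in one pass
dcli -g ~/cell_group -l root "cellcli -e alter cell validate mail"

# Review the resulting notification settings across all cells
dcli -g ~/cell_group -l root "cellcli -e list cell attributes name,smtpServer,smtpToAddr,notificationMethod"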


Checking Exadata Image Info

Log in to any storage cell or compute node as the root user and run the imageinfo command.

Checking the storage cell image:

login as: root
root@XX.XXX.XX.XX's password:
Last login: Mon Oct 16 17:13:57 2017 from cellnode1

[root@cellnode1 ~]# imageinfo

Kernel version: 4.1.12-61.47.1.el6uek.x86_64 #2 SMP Fri Jun 23 19:43:18 PDT 2017 x86_64
Cell version: OSS_12.2.1.1.2_LINUX.X64_170714
Cell rpm version: cell-12.2.1.1.2_LINUX.X64_170714-1.x86_64

Active image version: 12.2.1.1.2.170714
Active image kernel version: 4.1.12-61.47.1.el6uek
Active image activated: 2017-08-15 11:16:34 -0400
Active image status: success
Active system partition on device: /dev/md6
Active software partition on device: /dev/md8

Cell boot usb partition: /dev/sdm1
Cell boot usb version: 12.2.1.1.2.170714

Inactive image version: 12.2.1.1.1.170419
Inactive image activated: 2017-05-20 03:20:27 -0700
Inactive image status: success
Inactive system partition on device: /dev/md5
Inactive software partition on device: /dev/md7

Inactive marker for the rollback: /boot/I_am_hd_boot.inactive
Inactive grub config for the rollback: /boot/grub/grub.conf.inactive
Inactive kernel version for the rollback: 4.1.12-61.33.1.el6uek.x86_64
Rollback to the inactive partitions: Possible

Checking the compute node image:

[root@dbnode1 ~]# imageinfo

Kernel version: 4.1.12-61.47.1.el6uek.x86_64 #2 SMP Fri Jun 23 19:43:18 PDT 2017 x86_64
Image kernel version: 4.1.12-61.47.1.el6uek
Image version: 12.2.1.1.2.170714
Image activated: 2017-08-15 15:44:12 -0400
Image status: success
System partition on device: /dev/mapper/VGExaDb-LVDbSys1
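To compare image versions across the whole rack instead of node by node, you can run imageinfo through dcli on every server at once. This sketch assumes dbs_group and cell_group files listing the compute node and storage cell hostnames.

# Active image version on all database servers
dcli -g ~/dbs_group -l root "imageinfo -ver"

# Active image version on all storage servers
dcli -g ~/cell_group -l root "imageinfo -ver"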

Verify Exadata Machine Configuration

Hello everyone! I was recently tasked with performing a 360-degree health review of an Exadata Machine and decided to share my experience with my readers. As part of the Exadata 360 review, I performed a detailed review of the Exadata configuration by verifying the following 50 items. A fresh Exadata deployment usually does not require these checks, but if you have an older Exadata deployment, you might want to review the following items at least once a year.

 

  1. Primary and standby databases should NOT reside on the same IB Fabric
  2. Use hostname and domain name in lower case
  3. Verify ILOM Power Up Configuration
  4. Verify Hardware and Firmware on Database and Storage Servers
  5. Verify InfiniBand Cable Connection Quality
  6. Verify Ethernet Cable Connection Quality
  7. Verify InfiniBand Fabric Topology (verify-topology)
  8. Verify InfiniBand switch software version is 1.3.3-2 or higher
  9. Verify InfiniBand subnet manager is running on an InfiniBand switch
  10. Verify celldisk configuration on flash memory devices
  11. Verify there are no griddisks configured on flash memory devices
  12. Verify griddisk count matches across all storage servers where a given prefix name exists
  13. Verify griddisk ASM status (see the example after this list)
  14. Verify InfiniBand is the Private Network for Oracle Clusterware Communication
  15. Verify Oracle RAC Databases use RDS Protocol over InfiniBand Network.
  16. Verify Database and ASM instances use same SPFILE
  17. Configure Storage Server alerts to be sent via email
  18. Configure NTP and Timezone on the InfiniBand switches
  19. Verify NUMA Configuration
  20. Verify Exadata Smart Flash Log is Created
  21. Verify Exadata Smart Flash Cache is Created
  22. Verify Exadata Smart Flash Cache status is “normal”
  23. Verify database server disk controllers use writeback cache
  24. Configure NTP and Timezone on the InfiniBand switches
  25. Verify that “Disk Cache Policy” is set to “Disabled”
  26. Verify Management Network Interface (eth0) is on a Separate Subnet
  27. Verify Platform Configuration and Initialization Parameters for Consolidation
  28. Verify all datafiles have “AUTOEXTEND” attribute “ON”
  29. Verify all “BIGFILE” tablespaces have non-default “MAXBYTES” values set
  30. Ensure Temporary Tablespace is correctly defined
  31. Enable auditd on database servers
  32. Verify AUD$ and FGA_LOG$ tables use Automatic Segment Space Management
  33. Use dbca templates provided for current best practices
  34. Gather system statistics in Exadata mode if needed
  35. Verify Hidden Database Initialization Parameter Usage
  36. Verify bundle patch version installed matches bundle patch version registered in database
  37. Verify service exachkcfg autostart status
  38. Verify database server file systems have “Maximum mount count” = “-1”
  39. Verify database server file systems have “Check interval” = “0”
  40. Set SQLNET.EXPIRE_TIME=10 in DB Home
  41. Verify /etc/oratab
  42. Verify all Database and Storage Servers are synchronized with the same NTP server
  43. Verify there are no failed diskgroup rebalance operations
  44. Verify the CRS_HOME is properly locked
  45. Verify db_unique_name is used in I/O Resource Management (IORM) interdatabase plans
  46. Verify Database Server Quorum Disks are used when beneficial
  47. Verify Oracle Clusterware files are placed appropriately
  48. Verify “_reconnect_to_cell_attempts=9” on database servers which access X6 storage servers
  49. Verify Flex ASM Cardinality is set to “ALL”
  50. Verify no Oracle Enterprise Linux 5 (el5) rpms exist on database servers running Oracle Linux (ol6)
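Many of these checks can be scripted. As an example of items 12 and 13, the sketch below lists the griddisks on every storage server together with their ASM status; it assumes a cell_group file listing the storage cell hostnames. Every griddisk should report asmmodestatus=ONLINE and asmdeactivationoutcome=Yes.

# Items 12 and 13: griddisk counts and ASM status across all storage servers
dcli -g ~/cell_group -l root "cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome"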

 

Reference: Oracle Sun Database Machine Setup/Configuration Best Practices (Doc ID 1274318.1)