Install and Configure Oracle Database for Cloudera Software

Collecting Oracle Database Information

To configure CDH to work with an Oracle database, get the following information from your Oracle DBA:

  • Hostname - The DNS name or the IP address of the host where the Oracle database is installed.
  • Username - A username for each schema that is storing information. You could have four unique usernames for the four schema.
  • Password - A password corresponding to each username.

Configuring the Oracle Server

Adjusting Oracle Settings to Accommodate Larger Clusters

CDH services require high write throughput. Depending on the size of your deployments, your DBA may need to modify Oracle settings for monitoring services. These guidelines are for larger clusters and do not apply to smaller clusters. Many factors help determine whether you need to change your database settings, but in most cases, if your cluster has more than 100 hosts, you should consider making the following changes:
  • Enable direct and asynchronous I/O by setting the FILESYSTEMIO_OPTIONS parameter to SETALL.
  • Increase the RAM available to Oracle by changing the MEMORY_TARGET parameter. The amount of memory to assign depends on the size of the Hadoop cluster.
  • Create more redo log groups and spread the redo log members across separate disks or logical unit numbers.
  • Increase the size of redo log members to be at least 1 GB.

Reserving Ports for HiveServer 2

HiveServer2 uses port 10000 by default, but Oracle database changes the local port range. This can cause HiveServer2 to fail to start.

Manually reserve the default port for HiveServer2. For example, the following command reserves port 10000 and inserts a comment indicating the reason:

echo << EOF > /etc/sysctl.cnf
# HS2 uses port 10000
net.ipv4.ip_local_reserved_ports = 10000
EOF

sysctl -q -w net.ipv4.ip_local_reserved_ports=10000

Modifying the Maximum Number of Oracle Connections

Work with your Oracle database administrator to ensure appropriate values are applied for your Oracle database settings. You must determine the number of connections, transactions, and sessions to be allowed.

Allow 100 maximum connections for each service that requires a database and then add 50 extra connections. For example, for two services, set the maximum connections to 250. If you have five services that require a database on one host, set the maximum connections to 550.

From the maximum number of connections, you can determine the number of anticipated sessions using the following formula:

sessions = (1.1 * maximum_connections) + 5

For example, if a host has a database for two services, anticipate 250 maximum connections. If you anticipate a maximum of 250 connections, plan for 280 sessions.

Once you know the number of sessions, you can determine the number of anticipated transactions using the following formula:

transactions = 1.1 * sessions

Continuing with the previous example, if you anticipate 280 sessions, you can plan for 308 transactions.

Work with your Oracle database administrator to apply these derived values to your system.

Using the sample values above, Oracle attributes would be set as follows:

alter system set processes=250;
alter system set transactions=308;
alter system set sessions=280;

Ensuring Your Oracle Database Supports UTF8

The database you use must support UTF8 character set encoding. You can implement UTF8 character set encoding in Oracle databases by using the dbca utility. In this case, you can use the characterSet AL32UTF8 option to specify proper encoding. Consult your DBA to ensure UTF8 encoding is properly configured.

Installing the Oracle JDBC Connector

You must install the JDBC connector on any hosts that use a database.

Cloudera recommends that you assign all roles that require a database on the same host and install the connector on that host. Locating all such roles on the same host is recommended but not required. If you install a role on one host and other roles on a separate host, you must install the JDBC connector on each host running roles that access the database.

  1. Download the Oracle JDBC Driver from the Oracle website. For example, the version 6 JAR file is named ojdbc6.jar.

    For more information about supported Oracle Java versions, see CDH and Cloudera Manager Supported JDK Versions.

    The driver is available for download at http://www.oracle.com/technetwork/database/enterprise-edition/jdbc-112010-090769.html.

  2. Copy the Oracle JDBC JAR file to /usr/share/java/oracle-connector-java.jar. The Hive Mestastore database uses this shared file.
    sudo mkdir /usr/share/java (if necessary)
    sudo cp /tmp/ojdbc6.jar /usr/share/java/oracle-connector-java.jar

Creating Databases for Cloudera Software

Create schema and user accounts for components that require databases:

  • Each Hive metastore
  • Sentry Server
  • Sqoop Server
  • Hue
  • Oozie

You can create the Oracle database, schema and users on any hosts in the cluster.

Cloudera recommends installing the databases on different hosts than the services. Separating databases from services can help isolate the potential impact from failure or resource contention in one or the other. It can also simplify management in organizations that have dedicated database administrators.

The database must be configured to support UTF-8 character set encoding.

Record the values you enter for database names, usernames, and passwords. You will need them to configure the individual services later.

  1. Log into the Oracle client:
    sqlplus system@localhost
    Enter password: ******
  2. Create a user and schema for each service you are using from the below table:
    create user <user> identified by <password> default tablespace <tablespace>;
    grant CREATE SESSION to <user>;
    grant CREATE TABLE to <user>;
    grant CREATE SEQUENCE to <user>;
    grant EXECUTE on sys.dbms_lob to <user>;

    You can use any value you want for <schema>, <user>, and <password>. The following examples are the default names provided in the Cloudera Manager configuration settings, but you are not required to use them:

    Databases for Cloudera Software
    Service Database User
    Cloudera Manager Server scm scm
    Activity Monitor amon amon
    Reports Manager rman rman
    Hue hue hue
    Hive Metastore Server metastore hive
    Sentry Server sentry sentry
    Cloudera Navigator Audit Server nav nav
    Cloudera Navigator Metadata Server navms navms
    Oozie oozie oozie
  3. Grant a quota on the tablespace (the default tablespace is SYSTEM) where tables will be created:
    SQL> ALTER USER <user> quota 100m on <tablespace> 
    or for unlimited space:
    SQL> ALTER USER username quota unlimited on <tablespace>
  4. Set the following additional privileges for Oozie:
    SQL> grant alter index to oozie;
    grant alter table to oozie;
    grant create index to oozie;
    grant create sequence to oozie;
    grant create session to oozie;
    grant create table to oozie;
    grant drop sequence to oozie;
    grant select dictionary to oozie;
    grant drop table to oozie;
    alter user oozie quota unlimited on <tablespace>
  5. Set the following additional privileges for the Cloudera Navigator Audit Server database:
    GRANT EXECUTE ON sys.dbms_crypto TO <nav>;
    GRANT CREATE VIEW TO <nav>;
    where <nav> is the Navigator Audit Server user you specified above when you created the database.

For further information about Oracle privileges, see Authorization: Privileges, Roles, Profiles, and Resource Limitations.

Install CDH Packages

After completing the above instructions to install and configure Oracle databases for Cloudera software, continue to Step 4: Install CDH Packages.