Configuring and Starting the PostgreSQL Server
By default, PostgreSQL only accepts connections on the loopback interface. You must reconfigure PostgreSQL to accept connections from the fully qualified domain names (FQDN) of the hosts hosting the services for which you are configuring databases. If you do not make these changes, the services cannot connect to and use the database on which they depend.
Make sure that
LC_ALLis set to
en_US.UTF-8and initialize the database as follows:
echo 'LC_ALL="en_US.UTF-8"' >> /etc/locale.conf sudo su -l postgres -c "postgresql-setup initdb"
- RHEL 7:
- Enable MD5 authentication. Edit
pg_hba.conf, which is usually found in
/etc/postgresql/<version>/main. Add the following line:
host all all <range-start-ip-address>/28 md5If the default
pg_hba.conffile contains the following line:
host all all 127.0.0.1/32 ident
md5authentication shown above must be inserted before this
identline. Failure to do so may cause an authentication error when running the
scm_prepare_database.shscript. You can modify the contents of the
md5line shown above to support different configurations. For example, if you want to access PostgreSQL from a different host, replace
127.0.0.1with your IP address and update
postgresql.conf, which is typically found in the same place as
pg_hba.conf, to include:
listen_addresses = '*'
- Configure settings to ensure your system performs as expected. Update these
settings in the
/var/lib/postgresql/data/postgresql.conffile. Settings vary based on cluster size and resources as follows:
- Small to mid-sized clusters - Consider the following settings as starting points.
If resources are limited, consider reducing the buffer sizes and checkpoint segments
further. Ongoing tuning may be required based on each host's resource utilization. For
example, if the Cloudera Manager Server is running on the same host as other roles, the
following values may be acceptable:
max_connection- In general, allow each database on a host 100 maximum connections and then add 50 extra connections. You may have to increase the system resources available to PostgreSQL, as described at Connection Settings.
- Large clusters - Can contain up to 1000 hosts. Consider the following settings as
max_connection- For large clusters, each database is typically hosted on a different host. In general, allow each database on a host 100 maximum connections and then add 50 extra connections. You may have to increase the system resources available to PostgreSQL, as described at Connection Settings.
shared_buffers- 1024 MB. This requires that the operating system can allocate sufficient shared memory. See PostgreSQL information on Managing Kernel Resources for more information on setting kernel resources.
wal_buffers- 16 MB. This value is derived from the
wal_buffersto be approximately 3% of
shared_buffersup to a maximum of approximately 16 MB is sufficient in most cases.
checkpoint_segments- 128. The PostgreSQL Tuning Guide recommends values between 32 and 256 for write-intensive systems, such as this one.
- Small to mid-sized clusters - Consider the following settings as starting points. If resources are limited, consider reducing the buffer sizes and checkpoint segments further. Ongoing tuning may be required based on each host's resource utilization. For example, if the Cloudera Manager Server is running on the same host as other roles, the following values may be acceptable:
- Configure the PostgreSQL server to start at boot.
OS Command RHEL 7 compatible
sudo systemctl enable postgresql
- Restart the PostgreSQL database:
- RHEL 7
sudo systemctl restart postgresql
- RHEL 7 Compatible:
Creating Databases for Cloudera Software
You must create databases and service accounts for components that require databases.
The following components require databases:
- Cloudera Manager Server
- Cloudera Management Service roles:
- Reports Manager
- Data Analytics Studio (DAS) Supported with PostgreSQL only.
- Each Hive metastore
- Data Analytics Studio
- Schema Registry
- Streams Messaging Manager
The databases must be configured to support the PostgreSQL
UTF8 character set encoding.
Record the values you enter for database names, usernames, and passwords. The Cloudera Manager installation wizard requires this information to correctly connect to these databases.
To create databases for Cloudera Manager Server, Cloudera Management Service roles, Activity Monitor, Reports Manager, Hue, Hive metastores, Oozie, and DAS, complete the following steps:
Connect to PostgreSQL:
sudo -u postgres psql
- Create databases for each service you are using from the
CREATE ROLE <user> LOGIN PASSWORD '<password>';
You can use any value you want for <database>, <user>, and <password>. The following examples are the default names provided in the Cloudera Manager configuration settings, but you are not required to use them:
CREATE DATABASE <database> OWNER <user> ENCODING 'UTF8';
Table 1. Databases for Cloudera Software Service Database User Cloudera Manager Server scm scm Reports Manager rman rman Hue hue hue Hive Metastore Server metastore hive Oozie oozie oozie Data Analytics Studio (DAS) Supported with PostgreSQL only. das das Schema Registry schemaregistry schemaregistry Streams Messaging Manager smm smm
Record the databases, usernames, and passwords chosen because you will need them later.
- If you plan to use Apache Ranger, see the following topic for instructions on creating and configuring the Ranger database. See Configuring a PostgreSQL Database for Ranger or Ranger KMS.
- If you plan to use Schema Registry or Streams Messaging Manager, see the following topic for instructions on configuring the database: Configuring the Database for Streaming Components
- After you install and configure PostgreSQL databases for Cloudera software, continue to Step 5: Set up the Cloudera Manager Database to configure a database for Cloudera Manager.