Installing and Upgrading Apache Kudu
Kudu Installation Requirements
- Hardware
- Kudu currently requires a CPU that supports the SSSE3 and SSE4.2 instruction sets.
- One or more hosts to run Kudu masters. You should have either one master (provides no fault tolerance), three masters (can tolerate one failure), or five masters (can tolerate two failures).
- One or more hosts to run Kudu tablet servers. With replication, a minimum of three tablet servers is necessary.
- Operating systems
- Linux
- RHEL/CentOS 6.4, 6.5, 6.6, 6.7, 6.8, 7.1, 7.2, 7.3
- Oracle Linux (OL) 6.4, 6.5, 6.6, 6.7, 6.8, 6.9, 7.1, 7.2, 7.3
- Ubuntu 14.04 (Trusty), 16.04 (Xenial)
- Debian 8.2, 8.4 (Jessie)
- SLES 12 Service Pack 1
- A kernel version and filesystem that support hole punching. Hole punching is the use of the fallocate(2) system call with the FALLOC_FL_PUNCH_HOLE option set. See Error during hole punch test. If you cannot meet this requirement, use this workaround.
- NTP
- xfs or ext4 formatted drives.
- macOS
- OS X 10.10 Yosemite, OS X 10.11 El Capitan, and macOS Sierra.
- Pre-built macOS packages are not provided.
- Windows
- Microsoft Windows is not supported.
- Management - To manage Kudu with Cloudera Manager, Cloudera Manager 5.10.0 or later and CDH 5.10.0 or later are required.
- Storage - If solid state storage is available, storing Kudu WALs on such high-performance media might significantly improve latency when Kudu is configured for its highest durability levels.
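The hardware and filesystem requirements above can be checked before installation. The following is a minimal sketch, not an official Kudu tool: the function names are hypothetical, the CPU check assumes the Linux /proc/cpuinfo flag names ssse3 and sse4_2, and the hole punch check assumes the util-linux fallocate(1) command is installed. It only inspects the exit status of fallocate, so Kudu's own startup test may be stricter.

```shell
# Return 0 if a space-separated CPU flag list contains both ssse3 and sse4_2
# (the names Linux uses in /proc/cpuinfo for SSSE3 and SSE4.2).
cpu_supports_kudu() {
    case " $1 " in *" ssse3 "*) ;; *) return 1 ;; esac
    case " $1 " in *" sse4_2 "*) ;; *) return 1 ;; esac
    return 0
}

# Print how many master failures a cluster of $1 masters can tolerate.
# A majority of masters must stay up: 1 -> 0, 3 -> 1, 5 -> 2.
tolerated_master_failures() {
    echo $(( ($1 - 1) / 2 ))
}

# Hypothetical helper: try to punch a hole in a scratch file under directory $1.
# Assumes util-linux fallocate(1); returns its exit status.
can_punch_holes() {
    local f rc
    f=$(mktemp "$1/holepunch.XXXXXX") || return 1
    fallocate --length 1MiB "$f" &&
        fallocate --punch-hole --offset 0 --length 4096 "$f"
    rc=$?
    rm -f "$f"
    return $rc
}

# Example usage on a Linux host (not run automatically):
#   cpu_supports_kudu "$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)" \
#       || echo "CPU lacks SSSE3 or SSE4.2"
#   can_punch_holes /data/kudu || echo "hole punching not available here"
```

Passing the flag list as an argument keeps the CPU check independent of the host it runs on, which makes it easy to try against sample input.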
Install Kudu Using Cloudera Manager
Install Kudu Using Parcels
- In Cloudera Manager, go to Hosts and click Parcels. Find KUDU in the list, and click Download.
- When the download is complete, select your cluster from the Locations selector, and click Distribute. If you only have one cluster, it is selected automatically.
- When distribution is complete, click Activate to activate the parcel. Restart the cluster when prompted. This might take several minutes.
- Install the Kudu service on your cluster. Go to the cluster where you want to install Kudu and add a service. Select Kudu from the list, and click Continue.
- Select a host for the master role and one or more hosts for the tablet server roles. A host can act as both a master and a tablet server, but this might cause performance problems on a large cluster. The Kudu master process is not resource-intensive and can be collocated with other similar processes such as the HDFS NameNode or YARN ResourceManager. After selecting hosts, click Continue.
- Configure the storage locations for Kudu data and write-ahead log (WAL) files on masters and tablet servers. Cloudera Manager will create the directories.
- You can use the same directory to store data and WALs.
- You cannot store WALs in a subdirectory of the data directory.
- If any host is both a master and tablet server, configure different directories for master and tablet server. For instance, /data/kudu/master and /data/kudu/tserver.
- If you have chosen a filesystem that does not support hole punching, the Kudu service will fail to start. In this case only, exit the wizard by clicking the Cloudera logo at the top left, and enable the file block manager. This is not appropriate for production. See Enabling the File Block Manager.
- If your filesystem supports hole punching, do not exit the wizard. Click Continue. Kudu masters and tablet servers are started. Otherwise, go to the Kudu service, and start it from the Actions menu.
- Verify the Installation.
- To manage roles, go to the Kudu service and use the Actions menu to stop, start, restart, or otherwise manage the service.
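The directory rules in the steps above (data and WALs may share a directory, but WALs cannot live in a subdirectory of the data directory) can be expressed as a small check. This is a sketch with a hypothetical function name. It compares path strings only and does not resolve symbolic links or relative paths.

```shell
# Return 0 if the WAL directory ($1) is allowed relative to the data
# directory ($2): the same path is fine, a subdirectory of the data
# directory is not. Trailing slashes are ignored.
wal_dir_ok() {
    local wal="${1%/}" data="${2%/}"
    [ "$wal" = "$data" ] && return 0
    case "$wal" in
        "$data"/*) return 1 ;;   # WAL dir is nested under the data dir
    esac
    return 0
}

# Example usage:
#   wal_dir_ok /data/kudu/tserver/wal /data/kudu/tserver \
#       || echo "choose a WAL directory outside the data directory"
```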
Enabling the File Block Manager
If your filesystem supports hole punching, do not use the file block manager. The file block manager does not perform well at scale and must only be used for small-scale development and testing.
If your filesystem does not support hole punching, but you want to experiment with Kudu, you must enable the file block manager. If you do not enable the file block manager, Kudu will not start.
- If you are still in the Cloudera configuration wizard, exit the configuration wizard by clicking the Cloudera logo at the top of the Cloudera Manager interface.
- Go to the Kudu service.
- Go to Configuration and search for the Kudu Service Advanced Configuration Snippet (Safety Valve) for gflagfile configuration option.
- Add the following line to it, and save your changes:
--block_manager=file
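On a development host that is not managed by Cloudera Manager, the same flag would have to be placed in a gflagfile by hand. The sketch below shows one way to append a flag only if it is not already present. The function name is hypothetical, and the gflagfile path in the usage comment is an assumption about where the packages place the file, so check your own installation.

```shell
# Append flag $2 to gflagfile $1 unless an identical line already exists.
ensure_flag() {
    local file="$1" flag="$2"
    # -x matches whole lines, -F treats the flag as a fixed string,
    # -e keeps the leading "--" from being parsed as a grep option.
    grep -qxF -e "$flag" "$file" 2>/dev/null || printf '%s\n' "$flag" >> "$file"
}

# Example usage (path is an assumption, adjust to your installation):
#   ensure_flag /etc/kudu/conf/tserver.gflagfile "--block_manager=file"
```

Because the check matches the whole line, running the helper repeatedly leaves a single copy of the flag in the file.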
Install Kudu Using Packages
Operating System | Repository Package | Individual Packages |
---|---|---|
RHEL | RHEL 6 or RHEL 7 | RHEL 6 |
Ubuntu | Trusty, Xenial | Trusty, Xenial |
SLES | SLES 12 | SLES 12 |
Debian | Jessie | Jessie |
- Cloudera recommends installing the Kudu repositories for your operating system. Use the links in Kudu Repository and Package Links to download the appropriate repository installer. Save the repository installer to /etc/yum.repos.d/ for RHEL, /etc/apt/sources.list.d/ for Ubuntu/Debian, or /etc/zypp/repos.d for SLES.
- Add the Cloudera Public GPG repository key for each operating system in the cluster. This key enables you to verify that you are downloading genuine packages.
Operating System | Command |
---|---|
RHEL/CentOS 6 | sudo rpm --import https://archive.cloudera.com/kudu/redhat/6/x86_64/kudu/RPM-GPG-KEY-cloudera |
RHEL/CentOS 7 | sudo rpm --import https://archive.cloudera.com/kudu/redhat/7/x86_64/kudu/RPM-GPG-KEY-cloudera |
SLES | sudo rpm --import https://archive.cloudera.com/kudu/sles/12/x86_64/kudu/RPM-GPG-KEY-cloudera |
Debian Jessie | wget https://archive.cloudera.com/kudu/debian/jessie/amd64/kudu/archive.key -O archive.key && sudo apt-key add archive.key |
Ubuntu Xenial | wget https://archive.cloudera.com/kudu/ubuntu/xenial/amd64/kudu/archive.key -O archive.key && sudo apt-key add archive.key |
Ubuntu Trusty | wget https://archive.cloudera.com/kudu/ubuntu/trusty/amd64/kudu/archive.key -O archive.key && sudo apt-key add archive.key |
- Install the Kudu packages.
- If you use Cloudera Manager, you only need to install the kudu package:
Operating System | Install Command |
---|---|
RHEL/CentOS | sudo yum install kudu |
Ubuntu/Debian | sudo apt-get install kudu |
SLES | sudo zypper install kudu |
- If you need the C++ client development libraries or the Kudu SDK, install kudu-client and kudu-client-devel packages for RHEL, or libkuduclient0 and libkuduclient-dev packages for Ubuntu.
- Do not install the kudu-master and kudu-tserver packages. They provide operating system startup scripts for using Kudu without Cloudera Manager.
- Install the Kudu service on your cluster. Go to the cluster where you want to install Kudu and add a service. Select Kudu from the list, and click Continue.
- Select a host for the master role and one or more hosts for the tablet server roles. A host can act as both a master and a tablet server, but this might cause performance problems on a large cluster. The Kudu master process is not resource-intensive and can be collocated with other similar processes such as the HDFS NameNode or YARN ResourceManager. After selecting hosts, click Continue.
- Configure the storage locations for Kudu data and write-ahead log (WAL) files on masters and tablet servers. Cloudera Manager will create the directories.
- You can use the same directory to store data and WALs.
- You cannot store WALs in a subdirectory of the data directory.
- If any host is both a master and tablet server, configure different directories for master and tablet server. For instance, /data/kudu/master and /data/kudu/tserver.
- If you have chosen a filesystem that does not support hole punching, the Kudu service will fail to start. In this case only, exit the wizard by clicking the Cloudera logo at the top left, and enable the file block manager. This is not appropriate for production. See Enabling the File Block Manager.
- If your filesystem supports hole punching, do not exit the wizard. Click Continue. Kudu masters and tablet servers are started. Otherwise, go to the Kudu service, and start it from the Actions menu.
- Verify the Installation.
- To manage roles, go to the Kudu service and use the Actions menu to stop, start, restart, or otherwise manage the service.
Install Kudu Using the Command Line
Follow these steps on each node in your Kudu cluster.
- Cloudera recommends installing the Kudu repositories for your operating system. Use the links in the following table to download the appropriate repository installer. Save the repository installer to /etc/yum.repos.d/ for RHEL, /etc/apt/sources.list.d/ for Ubuntu/Debian, or /etc/zypp/repos.d for SLES.
- Add the Cloudera Public GPG repository key for each operating system in the cluster. This key enables you to verify that you are downloading genuine packages.
Operating System | Command |
---|---|
RHEL/CentOS 6 | sudo rpm --import https://archive.cloudera.com/kudu/redhat/6/x86_64/kudu/RPM-GPG-KEY-cloudera |
RHEL/CentOS 7 | sudo rpm --import https://archive.cloudera.com/kudu/redhat/7/x86_64/kudu/RPM-GPG-KEY-cloudera |
SLES | sudo rpm --import https://archive.cloudera.com/kudu/sles/12/x86_64/kudu/RPM-GPG-KEY-cloudera |
Debian Jessie | wget https://archive.cloudera.com/kudu/debian/jessie/amd64/kudu/archive.key -O archive.key && sudo apt-key add archive.key |
Ubuntu Xenial | wget https://archive.cloudera.com/kudu/ubuntu/xenial/amd64/kudu/archive.key -O archive.key && sudo apt-key add archive.key |
Ubuntu Trusty | wget https://archive.cloudera.com/kudu/ubuntu/trusty/amd64/kudu/archive.key -O archive.key && sudo apt-key add archive.key |
- Install the kudu package, using the appropriate commands for your operating system. Also install the kudu-master and kudu-tserver packages. They provide operating system start-up scripts for the Kudu master and tablet servers.
Operating System | Install Command | Description |
---|---|---|
RHEL/CentOS | sudo yum install kudu | Base Kudu files |
RHEL/CentOS | sudo yum install kudu-master | Kudu master init.d service script and default configuration |
RHEL/CentOS | sudo yum install kudu-tserver | Kudu tablet server init.d service script and default configuration |
RHEL/CentOS | sudo yum install kudu-client0 | Kudu C++ client shared library |
RHEL/CentOS | sudo yum install kudu-client-devel | Kudu C++ client SDK |
Ubuntu/Debian | sudo apt-get install kudu | Base Kudu files |
Ubuntu/Debian | sudo apt-get install kudu-master | Service scripts for managing kudu-master |
Ubuntu/Debian | sudo apt-get install kudu-tserver | Service scripts for managing kudu-tserver |
Ubuntu/Debian | sudo apt-get install libkuduclient0 | Kudu C++ client shared library |
Ubuntu/Debian | sudo apt-get install libkuduclient-dev | Kudu C++ client SDK |
SLES | sudo zypper install kudu | Base Kudu files |
SLES | sudo zypper install kudu-master | Kudu master init.d service script and default configuration |
SLES | sudo zypper install kudu-tserver | Kudu tablet server init.d service script and default configuration |
SLES | sudo zypper install kudu-client0 | Kudu C++ client shared library |
SLES | sudo zypper install kudu-client-devel | Kudu C++ client SDK |
- The packages create a kudu-conf entry in the operating system's alternatives database, and they ship the built-in conf.dist alternative. To adjust your configuration, you can either edit the files in /etc/kudu/conf/ directly, or create a new alternative using the operating system utilities. If you create a new alternative, make sure the alternative is the directory pointed to by the /etc/kudu/conf/ symbolic link, and create custom configuration files there. Some parts of the configuration are configured in /etc/default/kudu-master and /etc/default/kudu-tserver files as well. You must include or duplicate these configuration options if you create custom configuration files.
Review the configuration, including the default WAL and data directory locations, and adjust them according to your requirements.
- Configure the Kudu services to start automatically when the server starts, by adding them to the default runlevel.
sudo chkconfig kudu-master on # RHEL / CentOS
sudo chkconfig kudu-tserver on # RHEL / CentOS
sudo update-rc.d kudu-master defaults # Ubuntu / Debian
sudo update-rc.d kudu-tserver defaults # Ubuntu / Debian
For instructions on how to perform common administrative tasks in Kudu, see Apache Kudu Administration.
- Verify the Installation.
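When the same steps are scripted across hosts with different operating systems, the package manager has to be chosen per host. The following sketch maps an operating system identifier to the install command from the steps above. The function name is hypothetical, and using the ID field of /etc/os-release as input is an assumption about how you detect the operating system.

```shell
# Print the command that installs the base, master, and tablet server
# packages for the OS identifier given in $1. Returns 1 for other systems.
kudu_install_cmd() {
    local pkgs="kudu kudu-master kudu-tserver"
    case "$1" in
        rhel|centos)   echo "sudo yum install $pkgs" ;;
        ubuntu|debian) echo "sudo apt-get install $pkgs" ;;
        sles)          echo "sudo zypper install $pkgs" ;;
        *) echo "unsupported OS: $1" >&2; return 1 ;;
    esac
}

# Example usage on a host that has /etc/os-release:
#   . /etc/os-release
#   kudu_install_cmd "$ID"
```

The function only prints the command, so you can review it before running anything with sudo.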
Verify the Installation
- Verify that the Kudu master and tablet servers are running using one of the following methods:
  - Examine the output of the ps command on servers to verify that the kudu-master and kudu-tserver processes are running.
  - Access the master or tablet server web UI by going to http://<host_name>:8051/ for masters, or http://<host_name>:8050/ for tablet servers.
- If Kudu isn't running, look at the log files in /var/log/kudu. A file ending with .FATAL means that Kudu did not start.
  - If the error is related to a failed hole punch test or the file block manager, it might be a problem with your operating system.
  - If the error is related to clock synchronization, it is most likely a problem with the Network Time Protocol.
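The checks above can be partly scripted. This is a minimal sketch with hypothetical function names. It builds the web UI address from the ports given above and lists log files ending in .FATAL. The curl and ps commands in the usage comments are assumptions about the tools available on your hosts.

```shell
# Print the web UI URL for a role ("master" or "tserver") on host $2.
kudu_ui_url() {
    case "$1" in
        master)  echo "http://$2:8051/" ;;
        tserver) echo "http://$2:8050/" ;;
        *) return 1 ;;
    esac
}

# List files ending in .FATAL under the log directory (default /var/log/kudu).
find_fatal_logs() {
    find "${1:-/var/log/kudu}" -name '*.FATAL' 2>/dev/null
}

# Example usage:
#   ps aux | grep 'kudu-[mt]'                        # are the daemons running?
#   curl -sf "$(kudu_ui_url master localhost)" >/dev/null && echo "master UI up"
#   find_fatal_logs                                  # any output means a failed start
```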
Upgrade Kudu using Cloudera Manager
To use Cloudera Manager to upgrade Kudu using parcels or packages, use the following instructions. If you do not use Cloudera Manager, see Upgrade Kudu Using the Command Line.
Before upgrading Kudu, read the Release Notes relevant to the version you are upgrading to.
Upgrade Kudu Using Parcels
- Log in to Cloudera Manager.
- Go to Hosts. Click Parcels.
- Click Check For New Parcels.
- Find the new version of KUDU in the list of parcels. Download, distribute, and activate it on your cluster.
Upgrade Kudu Using Packages
- If you use a repository, re-download the repository list file to ensure that you have the latest information. See Kudu Repository and Package Links.
- Stop the Kudu service in Cloudera Manager. Go to the Kudu service and stop it from the Actions menu.
- Depending on your operating system, issue the following set of commands on each Kudu host:
Operating System | Upgrade Commands |
---|---|
RHEL/CentOS | sudo yum -y clean all && sudo yum -y upgrade kudu |
Ubuntu/Debian | sudo apt-get update && sudo apt-get install kudu |
SLES | sudo zypper clean --all && sudo zypper update kudu |
- Start the Kudu service in Cloudera Manager. Go to the Kudu service and start it from the Actions menu.
Upgrade Kudu Using the Command Line
If you use Cloudera Manager, do not use the following command-line instructions. See Upgrade Kudu using Cloudera Manager.
Before upgrading Kudu, read the Release Notes relevant to the version you are upgrading to. Note that rolling upgrades are not supported. Shut down all Kudu services before you begin upgrading the software.
- If you use a repository, re-download the repository list file to ensure that you have the latest information. See Kudu Repository and Package Links.
- Stop the Kudu master and tablet servers using the following commands:
sudo service kudu-master stop
sudo service kudu-tserver stop
- Depending on your operating system, issue the following set of commands on each Kudu host:
Operating System | Upgrade Commands |
---|---|
RHEL/CentOS | sudo yum -y clean all && sudo yum -y upgrade kudu |
Ubuntu/Debian | sudo apt-get update && sudo apt-get install kudu |
SLES | sudo zypper clean --all && sudo zypper update kudu |
- Start the Kudu master and tablet servers using the following commands:
sudo service kudu-master start
sudo service kudu-tserver start
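The stop, upgrade, and start sequence above can be printed as a per-host plan for review before anything is executed. This is a sketch with a hypothetical function name. It only echoes the commands from the steps above and runs none of them. Because rolling upgrades are not supported, the stop commands should complete on every host before any host moves on to the upgrade commands.

```shell
# Print, in order, the commands to upgrade Kudu on one host running the
# OS given in $1. Nothing is executed. Returns 1 for other systems.
kudu_upgrade_plan() {
    local refresh upgrade
    case "$1" in
        rhel|centos)
            refresh="sudo yum -y clean all"
            upgrade="sudo yum -y upgrade kudu" ;;
        ubuntu|debian)
            refresh="sudo apt-get update"
            upgrade="sudo apt-get install kudu" ;;
        sles)
            refresh="sudo zypper clean --all"
            upgrade="sudo zypper update kudu" ;;
        *) echo "unsupported OS: $1" >&2; return 1 ;;
    esac
    printf '%s\n' \
        "sudo service kudu-master stop" \
        "sudo service kudu-tserver stop" \
        "$refresh" \
        "$upgrade" \
        "sudo service kudu-master start" \
        "sudo service kudu-tserver start"
}

# Example usage:
#   kudu_upgrade_plan ubuntu
```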
Next Steps
Read about Using Apache Impala with Kudu.
For more information about using Kudu, go to the Kudu project page, where you can find official documentation, links to the GitHub repository and examples, and other resources.
For a reading list and other helpful links, refer to More Resources for Apache Kudu.