Software Distribution Management

A major function of Cloudera Manager is to install and upgrade Cloudera Runtime and other managed services. Cloudera Manager supports two software distribution formats: packages and parcels.

A package is a binary distribution format that contains compiled code and meta-information such as a package description, version, and dependencies. Package management systems evaluate this meta-information to allow package searches, perform upgrades to a newer version, and ensure that all dependencies of a package are fulfilled. Cloudera Manager uses the built-in system package manager for each supported OS to install and upgrade Cloudera Manager.

A parcel is a binary distribution format containing the program files, along with additional metadata used by Cloudera Manager. Parcels provide the following advantages:
  • Parcels are self-contained and installed in a versioned directory, which means that multiple versions of a given parcel can be installed side-by-side. You can then designate one of these installed versions as the active one. With packages, only one package can be installed at a time so there is no distinction between what is installed and what is active.
  • Parcels are required for rolling upgrades.
  • You can install parcels at any location in the filesystem. They are installed by default in /opt/cloudera/parcels. In contrast, packages are installed in /usr/lib.
  • When you install from the Parcels page, Cloudera Manager automatically downloads, distributes, and activates the correct parcel for the operating system running on each host in the cluster. All Cloudera Runtime hosts that make up a logical cluster must run on the same major OS release to be covered by Cloudera Support. Cloudera Manager must run on the same major OS release as at least one of the Cloudera Runtime clusters it manages, to be covered by Cloudera Support. The risk of issues caused by running different minor OS releases is considered lower than the risk of running different major OS releases. Cloudera recommends running the same minor release cross-cluster, because it simplifies issue tracking and supportability..
Because of their unique properties, parcels offer the following advantages over packages:
  • Distribution of Cloudera Runtime as a single object - Instead of having a separate package for each component of Cloudera Runtime, parcels are distributed as a single object. This makes it easier to distribute software to a cluster that is not connected to the Internet.
  • Internal consistency - All Cloudera Runtime components are matched, eliminating the possibility of installing components from different Cloudera Runtime versions.
  • Installation outside of /usr - In some environments, Hadoop administrators do not have privileges to install system packages. With parcels, administrators can install to /opt, or anywhere else.
  • Installation of Cloudera Runtime without sudo - Parcel installation is handled by the Cloudera Manager Agent running as root or another user, so you can install Cloudera Runtime without sudo.
  • Decoupled distribution from activation - With side-by-side install capabilities, you can stage a new version of Cloudera Runtime across the cluster before switching to it. This allows the most time-consuming part of an upgrade to be done ahead of time without affecting cluster operations, thereby reducing downtime.
  • Rolling upgrades - Using packages requires you to shut down the old process, upgrade the package, and then start the new process. Errors can be difficult to recover from, and upgrading requires extensive integration with the package management system to function seamlessly. With parcels, when a new version is staged side-by-side, you can switch to a new minor version by simply changing which version of Cloudera Runtime is used when restarting each process. You can then perform upgrades with rolling restarts, in which service roles are restarted in the correct order to switch to the new version with minimal service interruption. Your cluster can continue to run on the existing installed components while you stage a new version across your cluster, without impacting your current operations. Major version upgrades (for example, CDH 5 to Cloudera Runtime 7) require full service restarts because of substantial changes between the versions. Finally, you can upgrade individual parcels or multiple parcels at the same time.
  • Upgrade management - Cloudera Manager manages all the steps in a Cloudera Runtime cluster upgrade.
  • Additional components - Parcels are not limited to Cloudera Runtime. LZO and add-on service parcels are also available.
  • Compatibility with other distribution tools - Cloudera Manager works with other tools you use for download and distribution, such as Puppet. Or, you can download the parcel to Cloudera Manager Server manually if your cluster has no Internet connectivity and then have Cloudera Manager distribute the parcel to the cluster.

For more information, see Overview of Parcels.