3. Collect Information

To deploy your Hadoop installation, you need to collect the following information:

  • The fully qualified domain name (FQDN) of each host in your system, and the components you want to set up on each host. The Ambari install wizard does not support IP addresses. If you do not know a host's FQDN, run hostname -f on that host to check it.

    Note

    While it is possible to deploy all Hadoop components on a single host, this is appropriate only for initial evaluation. In general, you should use at least three hosts: one master host and two slave hosts.

  • The base directories you want to use as mount points for storing:

    • NameNode data

    • DataNode data

    • Secondary NameNode data

    • Oozie data

    • MapReduce data (Hadoop version 1.x)

    • YARN data (Hadoop version 2.x)

    • ZooKeeper data, if you install ZooKeeper

    • Various log, pid, and db files, depending on your install type

    Important

    You must use base directories that provide persistent storage locations for your HDP components and your Hadoop data. Installing HDP components in locations that may be removed from a host may result in cluster failure or data loss. For example, do not use /tmp in a base directory path.
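
The host name requirement in the first item above can be checked before you start the wizard. The following Python sketch is one possible way to do that; it is not part of Ambari. The function names looks_like_fqdn and is_ip_address are hypothetical, and the check only tests the shape of a name (at least a host label and a domain label, and not an IP address). It does not confirm that the name resolves in DNS.

```python
import ipaddress
import re

# One DNS label: 1-63 letters, digits, or hyphens, with no hyphen at either end.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_ip_address(name: str) -> bool:
    """Return True if name parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        return False


def looks_like_fqdn(name: str) -> bool:
    """Return True if name has the shape of an FQDN (host label plus domain).

    Hypothetical helper: this is a syntax check only and does not
    query DNS or the local resolver.
    """
    if is_ip_address(name) or len(name) > 253:
        return False
    # A single trailing dot (the DNS root) is allowed and ignored.
    labels = name.rstrip(".").split(".")
    return len(labels) >= 2 and all(_LABEL.fullmatch(label) for label in labels)
```

For example, looks_like_fqdn("host1.example.com") is true, while looks_like_fqdn("192.168.1.10") and looks_like_fqdn("host1") are false. The host names here are placeholders. On the local machine, Python's socket.getfqdn() returns a comparable value to hostname -f, although the two can differ depending on resolver configuration.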
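
The base directory guidance above can also be checked in advance. The following Python sketch flags planned base directories that are relative or that sit under /tmp. It is an illustration under stated assumptions, not an Ambari feature: the names VOLATILE_ROOTS, uses_volatile_location, and check_base_dirs are hypothetical, and only /tmp is taken from the guidance above. Add other non-persistent locations for your environment as needed.

```python
from pathlib import PurePosixPath
from typing import Dict, List

# Only /tmp comes from the guidance above. Extend this tuple with any
# other locations that are not persistent on your hosts (assumption).
VOLATILE_ROOTS = ("/tmp",)


def uses_volatile_location(path: str) -> bool:
    """Return True if path is a volatile root or lies beneath one."""
    candidate = PurePosixPath(path)
    for root in VOLATILE_ROOTS:
        root_path = PurePosixPath(root)
        if candidate == root_path or root_path in candidate.parents:
            return True
    return False


def check_base_dirs(base_dirs: Dict[str, str]) -> List[str]:
    """Return one message per problematic base directory.

    base_dirs maps a purpose (for example "NameNode") to a planned path.
    The paths are compared as strings; nothing is read from disk.
    """
    problems = []
    for purpose, path in base_dirs.items():
        if not PurePosixPath(path).is_absolute():
            problems.append(f"{purpose}: {path} is not an absolute path")
        elif uses_volatile_location(path):
            problems.append(f"{purpose}: {path} is under a non-persistent location")
    return problems
```

A call such as check_base_dirs({"NameNode": "/tmp/hdfs/nn"}) returns a list with one message, and an empty list means no problems were found. The path shown is a placeholder. Because the check works on path strings only, it cannot tell whether a directory is actually backed by persistent storage; confirm that separately for each mount point.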

