3. Operations

Operations components deploy and effectively manage the platform.

Deployment and management tasks include the following:

  • Provisioning, management, monitoring. Ambari provides an open operational framework for provisioning, managing and monitoring Hadoop clusters. Ambari includes a web interface that enables administrators to start/stop/test services, change configurations, and manage the ongoing growth of the cluster.

  • Integrating with other applications. The Ambari RESTful API enables integration with existing tools, such as Microsoft System Center and Teradata Viewpoint. For deeper customization, Ambari also leverages the standard technologies Nagios and Ganglia. ZooKeeper provides operational services for Hadoop clusters, including a distributed configuration service, a synchronization service, and a naming registry for distributed systems. Distributed applications use ZooKeeper to store and mediate updates to important configuration information.

  • Job scheduling. Oozie is a Java Web application used to schedule Hadoop jobs. Oozie enables Hadoop administrators to build complex data transformations out of multiple component tasks, which gives greater control over complex jobs and makes it easier to schedule repetitions of those jobs.
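
As an illustration of the integration point described above, the following is a minimal sketch of how an external tool might ask Ambari to start a service through its RESTful API. It only builds the HTTP request and does not send it. The host name, cluster name, and credentials are placeholders, and build_service_state_request is a hypothetical helper, not part of Ambari. The URL layout, the X-Requested-By header, and the JSON body are assumed to follow the Ambari v1 API; verify them against the API reference for your Ambari version.

```python
import base64
import json
import urllib.request


def build_service_state_request(base_url, cluster, service, state, user, password):
    """Build (but do not send) a PUT request that asks Ambari to change a service's state.

    Hypothetical helper for illustration. In the assumed v1 API, the state
    "STARTED" starts a service and "INSTALLED" stops it.
    """
    url = f"{base_url}/api/v1/clusters/{cluster}/services/{service}"
    payload = {
        "RequestInfo": {"context": f"Set {service} to {state} via REST"},
        "Body": {"ServiceInfo": {"state": state}},
    }
    # HTTP Basic authentication: base64 of "user:password".
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Basic {token}",
            # Assumed requirement: Ambari rejects modifying calls without this header.
            "X-Requested-By": "ambari",
        },
    )


# Placeholder values; replace with a real Ambari server, cluster, and account.
req = build_service_state_request(
    "http://ambari.example.com:8080", "mycluster", "HDFS", "STARTED", "admin", "admin"
)
print(req.get_method(), req.full_url)
```

To actually submit the request, pass req to urllib.request.urlopen against a reachable Ambari server. Keeping request construction separate from sending makes the call easy to inspect or log before it changes the cluster.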