Cloudera Search Architecture

Cloudera Search runs as a distributed service across a set of servers, each responsible for a portion of the content to be searched. The content is split into smaller pieces, each piece is copied, and the pieces and their copies are distributed among the servers. This provides two main advantages:

  • Dividing the content into smaller pieces distributes the task of indexing the content among the servers.
  • Duplicating the pieces allows query load to be spread across the copies and keeps content available if a server fails.
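
The splitting and copying described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the mechanism Cloudera Search itself uses: the function names `shard_for` and `place_replicas`, the MD5-based hashing, the round-robin placement, and the server names are all hypothetical.

```python
import hashlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document ID to a piece (shard) number.

    Assumption: a stable hash of the ID, modulo the number of pieces.
    """
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_shards


def place_replicas(num_shards: int, copies: int, servers: list) -> dict:
    """Assign each piece to `copies` distinct servers, round-robin."""
    if copies > len(servers):
        raise ValueError("cannot place more copies than there are servers")
    placement = {}
    for shard in range(num_shards):
        placement[shard] = [
            servers[(shard + offset) % len(servers)] for offset in range(copies)
        ]
    return placement


# Hypothetical cluster: three servers, three pieces, two copies of each piece.
placement = place_replicas(3, 2, ["server-a", "server-b", "server-c"])
# placement == {0: ["server-a", "server-b"],
#               1: ["server-b", "server-c"],
#               2: ["server-c", "server-a"]}
```

In this sketch every piece lives on two different servers, so indexing work is divided three ways while each piece remains searchable if any single server is lost.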

Each Cloudera Search server can handle requests. As a result, a client can send a request to index documents or to perform a search to any Search server, and that server routes the request to the correct server.
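
The routing behavior described above can be sketched as follows. This is a simplified illustration, not Cloudera Search's actual routing logic: the `PLACEMENT` table, the server names, the hashing scheme, and the rule of forwarding to the first listed holder of a piece are all assumptions made for the example.

```python
import hashlib

NUM_SHARDS = 2

# Hypothetical placement: piece (shard) number -> servers holding a copy.
PLACEMENT = {
    0: ["server-a", "server-b"],
    1: ["server-b", "server-c"],
}


def shard_for(doc_id: str) -> int:
    """Map a document ID to a piece number (assumed hashing scheme)."""
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_SHARDS


def route_index_request(receiving_server: str, doc_id: str) -> str:
    """Return the server that should handle an index request.

    The receiving server handles the request itself when it holds a copy
    of the target piece; otherwise it forwards to the first listed holder.
    """
    holders = PLACEMENT[shard_for(doc_id)]
    if receiving_server in holders:
        return receiving_server
    return holders[0]


# A client may contact any server; the request still reaches a holder
# of the piece that the document belongs to.
target = route_index_request("server-a", "doc-42")
```

Because the lookup happens on the server side, the client needs no knowledge of how the content is divided.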