Data Discovery and Exploration clusters

Learn about the default Data Discovery and Exploration clusters, including cluster definition and template names, included services, and compatible Runtime versions.

Data Discovery and Exploration (Tech Preview)

Data Discovery and Exploration (DDE) is designed to simplify the deployment, configuration, and serviceability of Solr-based analytics applications. It also lets application developers and data workers start building insight applications or exploration services on Apache Solr on a self-service basis. DDE is certified with all Data Hub Flow Management templates in CDP Data Hub, so you can ingest data into Solr through NiFi. This is useful, for example, when you need to make event or log data searchable in real time.

Common use cases are:
  • Ad-hoc exploration and discovery of data sets
  • Relevance-based analytics over unstructured data (logs, images, text, PDFs, etc.)
  • Log or event search
  • Making data in a data lake more accessible to everyone
  • Using Solr relevance-based matching as an automated text-filter step in a larger data workflow
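
As an illustration of the log search and text-filter use cases above, the following is a minimal Python sketch, not an official DDE client. It builds a request URL for Solr's standard /select query handler and then keeps only the documents whose relevance score reaches a threshold. The host name, the collection name "logs", the field name "message", and the helper names build_select_url and keep_relevant are assumptions made for this example. Authentication, which a secured CDP cluster requires, is deliberately left out, and the sketch does not send the request.

```python
from urllib.parse import urlencode


def build_select_url(base_url, collection, query, fields=("id", "score"), rows=10):
    """Build a URL for Solr's /select handler (hypothetical helper).

    q, fl, rows and wt are standard Solr query parameters; including
    "score" in fl asks Solr to return the relevance score per document.
    """
    params = {
        "q": query,
        "fl": ",".join(fields),
        "rows": rows,
        "wt": "json",
    }
    return f"{base_url.rstrip('/')}/{collection}/select?{urlencode(params)}"


def keep_relevant(docs, min_score):
    """Text-filter step: keep documents scoring at least min_score."""
    return [doc for doc in docs if doc.get("score", 0.0) >= min_score]


# Assumed endpoint and collection; replace with the values for your cluster.
url = build_select_url("https://solr.example.com/solr", "logs", "message:timeout")

# Documents shaped like the "docs" array of a Solr JSON response (sample data).
sample_docs = [
    {"id": "a", "score": 2.5},
    {"id": "b", "score": 0.4},
]
relevant = keep_relevant(sample_docs, min_score=1.0)
```

In a larger workflow, the filtered list would be passed on to the next processing step. The score threshold is workload-specific and would need tuning against real data.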
Cluster Definition Names
  • Data Discovery and Exploration for AWS (Tech Preview)
Cluster Template Name
  • CDP - Data Discovery and Exploration
Included Services
  • Solr
  • Spark
  • HDFS
  • Hue
  • YARN
  • ZooKeeper
Compatible Runtime Version
  • 7.2.0, 7.2.1, 7.2.2, 7.2.6, 7.2.7