Getting Started with Streaming Analytics

- Release Notes
- Hortonworks DataFlow 3.3.0 Release Notes
- Concepts
- HDF Platform Overview
- Apache NiFi Overview
- Streaming Analytics Manager Overview
- Schema Registry Overview
- Apache Kafka Overview
- Apache Storm Overview
- Installation & Upgrade
- Installing & Upgrading HDF
- Planning Your HDF Deployment
- Installing an HDF Cluster
- Installing Ambari
- Installing Databases
- Installing MySQL
- Configuring SAM and Schema Registry Metadata Stores in MySQL
- Configuring Druid and Superset Metadata Stores in MySQL
- Install Postgres
- Configure Postgres to Allow Remote Connections
- Configure SAM and Schema Registry Metadata Stores in Postgres
- Configure Druid and Superset Metadata Stores in Postgres
- Specifying an Oracle Database to Use with SAM and Schema Registry
- Switching to an Oracle Database After Installation
- Installing the HDF Management Pack on an HDF Cluster
- Install an HDF Cluster Using Ambari
- Configure HDF Components
- Configuring Schema Registry and SAM for High Availability
- Installing SmartSense
- Installing HDF Services on an Existing HDP Cluster
- Upgrade Ambari and HDP
- Installing Databases
- Installing MySQL
- Configuring SAM and Schema Registry Metadata Stores in MySQL
- Configuring Druid and Superset Metadata Stores in MySQL
- Install Postgres
- Configure Postgres to Allow Remote Connections
- Configure SAM and Schema Registry Metadata Stores in Postgres
- Configure Druid and Superset Metadata Stores in Postgres
- Specifying an Oracle Database to Use with SAM and Schema Registry
- Switching to an Oracle Database After Installation
- Installing the HDF Management Pack
- Update the HDF Base URL
- Add HDF Services to an HDP Cluster
- Configure HDF Components
- Configuring Schema Registry and SAM for High Availability
- Installing HDF Services on a New HDP Cluster
- Installing Ambari
- Installing Databases
- Installing MySQL
- Configuring SAM and Schema Registry Metadata Stores in MySQL
- Configuring Druid and Superset Metadata Stores in MySQL
- Install Postgres
- Configure Postgres to Allow Remote Connections
- Configure SAM and Schema Registry Metadata Stores in Postgres
- Configure Druid and Superset Metadata Stores in Postgres
- Specifying an Oracle Database to Use with SAM and Schema Registry
- Switching to an Oracle Database After Installation
- Deploying an HDP Cluster Using Ambari
- Installing the HDF Management Pack
- Update the HDF Base URL
- Add HDF Services to an HDP Cluster
- Configure HDF Components
- Configuring Schema Registry and SAM for High Availability
- Apache Ambari Managed HDF Upgrade
- Pre-Upgrade Tasks
- Upgrade Ambari and the HDF Management Pack
- Upgrade HDF
- Upgrading an HDF Cluster
- Prerequisites
- Registering Your Target Version
- Installing Your Target Version
- Upgrade Ambari Metrics
- Upgrade SmartSense
- Backup and Upgrade Ambari Infra
- Upgrade Ambari Log Search
- Verifying Symbolic Links for SAM and Schema Registry
- Upgrade HDF
- Start Ambari LogSearch and Metrics
- Migrate and Restore Ambari Infra
- Migrate Ambari Metrics Data
- Update Ranger Passwords
- Upgrading HDF 3.2.0 services on an HDP cluster
- Upgrading HDF 3.1.0 services on an HDP cluster
- Upgrading an HDF Cluster
- Post-Upgrade Tasks
- Installing & Upgrading HDF on IBM Power Systems
- Planning Your HDF Deployment on IBM Power Systems
- Installing an HDF Cluster on IBM Power Systems
- Installing Ambari
- Installing Databases
- Installing MySQL
- Configuring SAM and Schema Registry Metadata Stores in MySQL
- Configuring Druid and Superset Metadata Stores in MySQL
- Install Postgres
- Configure Postgres to Allow Remote Connections
- Configure SAM and Schema Registry Metadata Stores in Postgres
- Configure Druid and Superset Metadata Stores in Postgres
- Specifying an Oracle Database to Use with SAM and Schema Registry
- Switching to an Oracle Database After Installation
- Installing the HDF Management Pack on an HDF Cluster
- Install an HDF Cluster Using Ambari
- Configure HDF Components
- Configuring Schema Registry and SAM for High Availability
- Installing SmartSense
- Installing HDF Services on an Existing HDP Cluster using IBM Power Systems
- Upgrade Ambari and HDP
- Installing Databases
- Installing MySQL
- Configuring SAM and Schema Registry Metadata Stores in MySQL
- Configuring Druid and Superset Metadata Stores in MySQL
- Install Postgres
- Configure Postgres to Allow Remote Connections
- Configure SAM and Schema Registry Metadata Stores in Postgres
- Configure Druid and Superset Metadata Stores in Postgres
- Specifying an Oracle Database to Use with SAM and Schema Registry
- Switching to an Oracle Database After Installation
- Installing the HDF Management Pack
- Update the HDF Base URL
- Add HDF Services to an HDP Cluster
- Configure HDF Components
- Configuring Schema Registry and SAM for High Availability
- Installing HDF Services on a New HDP Cluster using IBM Power Systems
- Installing Ambari
- Installing Databases
- Installing MySQL
- Configuring SAM and Schema Registry Metadata Stores in MySQL
- Configuring Druid and Superset Metadata Stores in MySQL
- Install Postgres
- Configure Postgres to Allow Remote Connections
- Configure SAM and Schema Registry Metadata Stores in Postgres
- Configure Druid and Superset Metadata Stores in Postgres
- Specifying an Oracle Database to Use with SAM and Schema Registry
- Switching to an Oracle Database After Installation
- Deploying an HDP Cluster Using Ambari
- Installing the HDF Management Pack
- Update the HDF Base URL
- Add HDF Services to an HDP Cluster
- Configure HDF Components
- Configuring Schema Registry and SAM for High Availability
- Apache Ambari Managed HDF Upgrade for IBM Power Systems
- Pre-Upgrade Tasks
- Upgrade Ambari and the HDF Management Pack
- Upgrade HDF
- Upgrading an HDF Cluster
- Prerequisites
- Registering Your Target Version
- Installing Your Target Version
- Upgrade Ambari Metrics
- Upgrade SmartSense
- Backup and Upgrade Ambari Infra
- Upgrade Ambari Log Search
- Verifying Symbolic Links for SAM and Schema Registry
- Upgrade HDF
- Start Ambari LogSearch and Metrics
- Migrate and Restore Ambari Infra
- Migrate Ambari Metrics Data
- Update Ranger Passwords
- Upgrading HDF 3.2.0 services on an HDP cluster
- Upgrading HDF 3.1.0 services on an HDP cluster
- Upgrading an HDF Cluster
- Post-Upgrade Tasks
- Installing HDF Components
- Installing and Upgrading Apache NiFi
- Apache MiNiFi Quick Start
- Installing & Upgrading HDF
- How To
- Flow Management
- Using the Apache NiFi Interface
- Building an Apache NiFi DataFlow
- Managing an Apache NiFi DataFlow
- Navigating an Apache NiFi DataFlow
- Monitoring an Apache NiFi DataFlow
- Versioning an Apache NiFi DataFlow
- Using Apache NiFi Templates
- Using Apache NiFi Provenance Tools
- Adding functionality to Apache NiFi
- Introduction
- NiFi Components
- Processor API
- Documenting a Component
- Provenance Events
- Common Processor Patterns
- Error Handling
- General Design Considerations
- Controller Services
- Reporting Tasks
- UI Extensions
- Command Line Tools
- Testing
- NiFi Archives (NARs)
- Per-Instance ClassLoading
- Deprecating a Component
- Using Apache NiFi Registry
- Managing Schemas
- Streaming Analytics
- Creating Streaming Analytics Manager Data Visualizations using Superset
- Adding Custom Builder Components to Streaming Analytics Manager
- Building a Streaming Analytics Manager Application
- Using Streaming Analytics Manager
- Setting Up Your Streaming Analytics Manager Environment
- Mirroring Data Across Clusters with Apache Kafka MirrorMaker
- Developing Apache Kafka Producers and Consumers
- Creating Apache Kafka Topics
- Developing Apache Storm Applications
- Using Apache Storm to Move Data
- Working with Apache Storm Topologies
- Security
- Enabling Kerberos
- NiFi Authentication
- Configuring NiFi Authentication and Proxying with Apache Knox
- SAM Authentication
- Authorization with Ranger
- NiFi Authorization
- SAM Authorization
- Deploying SAM Applications in a Secure Cluster
- Flow Management
- Reference
- Apache NiFi Record Path Reference
- Apache NiFi Expression Language Reference
- Streaming Analytics Manager Configuration Values
- Apache NiFi Configuration Best Practices
- Apache NiFi Security Reference
- Apache NiFi State Management
- Apache NiFi System Properties
- System Properties
- Core Properties
- State Management
- H2 Settings
- FlowFile Repository
- Swap Management
- Content Repository
- File System Content Repository Properties
- Volatile Content Repository Properties
- Provenance Repository
- Write Ahead Provenance Repository Properties
- Encrypted Write Ahead Provenance Repository Properties
- Persistent Provenance Repository Properties
- Volatile Provenance Repository Properties
- Component Status Repository
- Site to Site Properties
- Site to Site Routing Properties for Reverse Proxies
- Web Properties
- Security Properties
- Identity Mapping Properties
- Cluster Common Properties
- Cluster Node Properties
- Claim Management
- ZooKeeper Properties
- Kerberos Properties
- Custom Properties
- Apache NiFi Toolkit
- Administering Apache NiFi Registry
- Learning & Training
- Getting Started with Apache NiFi
- Getting Started with Streaming Analytics
- Building an End-to-End Stream Application
- Prepare Your Environment
- Creating a Dataflow Application
- Pick your Streaming Engine
- Creating a Streaming Analytics Application with SAM
- Creating a Stream Analytics Application with SAM
- Two Options for Creating the Streaming Analytics Applications
- Creating a Service Pool and Environment
- Creating Your First Application
- Creating and Configuring the Kafka Source Stream
- Connecting Components
- Joining Multiple Streams
- Filtering Events in a Stream using Rules
- Using Aggregate Functions over Windows
- Implementing Business Rules on the Stream
- Transforming Data using a Projection Processor
- Streaming Alerts to an Analytics Engine for Dashboarding
- Streaming Violation Events to an Analytics Engine for Descriptive Analytics
- Streaming Violation Events into a Data Lake and Operational Data Store
- Deploy a SAM Application
- Advanced: Performing Predictive Analytics on the Stream using SAM
- Logistic Regression Model
- Export the Model into SAM's Model Registry
- Enrichment and Normalization of Model Features
- Upload Custom Processors and UDFs for Enrichment and Normalization
- Scoring the Model in the Stream using a Streaming Split Join Pattern
- Streaming Split Join Pattern
- Score the Model Using the PMML Processor and Alert
- Creating Visualizations Using Superset
- SAM Test Mode
- Four Test Cases using SAM's Test Mode
- Test Case 1: Testing Normal Event with No Violation Prediction
- Analyzing Test Case 1 Results
- Test Case 2: Testing Normal Event with Yes Violation Prediction
- Analyzing Test Case 2 Results
- Test Case 3: Testing Violation Event
- Analyzing Test Case 3 Results
- Test Case 4: Testing Multiple-Speeding-Events
- Analyzing Test Case 4 Results
- Running SAM Test Cases as JUnit Tests in CI Pipelines
- Creating Custom Sources and Sinks
- Stream Operations
- Troubleshooting and Debugging a Stream Application
- Creating a Stream Analytics Application with SAM
- Spark Streaming
- Running the Stream Simulator
- Managing Kafka with Streams Messaging Manager
- Getting Started with Apache NiFi Registry