What is Schema Registry?
Schema Registry is a standalone application that allows you to efficiently store and manage schemas for your streaming data. It supports Avro and JSON schema formats, and provides schema evolution capabilities with configurable compatibility modes. Schema Registry helps ensure data consistency across your streaming applications by providing a central repository of schemas that applications use to validate their data. It provides a REST API and client libraries for programmatic access.
Main benefits
Schema Registry provides a central, versioned repository for message schemas used by streaming applications. It decouples schema management from producers and consumers, enabling safer evolution of message formats while preserving read and write compatibility according to configurable policies.
Its capabilities simplify schema governance for streaming systems:
- Centralized schema storage with versioning and searchable metadata.
- Support for Avro and JSON schema formats, including Avro logical types.
- Configurable compatibility policies (backward, forward, full, none) to govern safe schema evolution.
- REST API and client libraries for registering, retrieving, and validating schemas programmatically.
Key concepts
- Schemas
  Schemas define the structure of messages. Each schema is assigned an identifier and can be versioned. Producers typically register schemas before sending data. Consumers retrieve the schema to deserialize messages.
- Schema entities
  Schema Registry models three primary entity types:
  - Schema Group – Organizes related schemas.
  - Schema Metadata – Defines the schema name, type, compatibility policy, and associated SerDes.
  - Schema Version – Stores a versioned instance of a schema associated with a Schema Metadata definition.
- Compatibility policies
  Compatibility policies determine whether a proposed schema can be registered given the versions that already exist. The policies are as follows:
  - Backward – A new schema can read data produced with previous schemas.
  - Forward – Previous schemas can read data produced with the new schema.
  - Full – Enforces both forward and backward compatibility.
  - None – No compatibility checks are performed.
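The following sketch illustrates how the four policies differ when a new version is proposed. It is not the Schema Registry implementation: it assumes Avro-style record schemas held as Python dictionaries and applies one simplified rule (a reader can decode a writer's data if every reader field either exists in the writer with the same type or declares a default). The names `can_read` and `is_compatible` are hypothetical helpers introduced only for this illustration.

```python
from typing import Any, Dict

Schema = Dict[str, Any]


def can_read(reader: Schema, writer: Schema) -> bool:
    """Return True if `reader` can decode records written with `writer`.

    Simplified rule (an assumption of this sketch): every reader field must
    either exist in the writer with the same type, or declare a default.
    """
    writer_fields = {f["name"]: f for f in writer["fields"]}
    for reader_field in reader["fields"]:
        written = writer_fields.get(reader_field["name"])
        if written is None:
            # Field is absent from the written data: a default is required.
            if "default" not in reader_field:
                return False
        elif written["type"] != reader_field["type"]:
            return False
    return True


def is_compatible(policy: str, new: Schema, old: Schema) -> bool:
    """Check a proposed schema against one existing version under a policy."""
    if policy == "NONE":
        return True
    if policy == "BACKWARD":
        return can_read(new, old)   # new schema reads old data
    if policy == "FORWARD":
        return can_read(old, new)   # old schema reads new data
    if policy == "FULL":
        return can_read(new, old) and can_read(old, new)
    raise ValueError(f"unknown policy: {policy}")


v1 = {"type": "record", "name": "Order", "fields": [
    {"name": "id", "type": "string"},
]}
# v2 adds a field without a default value.
v2 = {"type": "record", "name": "Order", "fields": [
    {"name": "id", "type": "string"},
    {"name": "note", "type": "string"},
]}

for policy in ("BACKWARD", "FORWARD", "FULL", "NONE"):
    print(policy, is_compatible(policy, v2, v1))
```

In this sketch, adding `note` without a default passes the forward check but fails the backward and full checks, because a reader using `v2` has no value to fall back on when it decodes data written with `v1`. Giving `note` a default would make the change pass under all four policies.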
Schema Registry API
Schema Registry exposes a REST API for programmatically managing schemas, serializers and deserializers (SerDes), and related metadata. Major categories of supported operations include:
- Schema metadata and version management – create, update, and delete schema metadata; add, merge, branch, enable/disable, and delete schema versions; retrieve the latest or a specific version.
- Import/Export and file operations – upload and download files for bulk import/export and version upload.
- Search and aggregation – search schemas by fields or metadata, and retrieve aggregated schema information.
- Confluent-compatible endpoints – subjects, versions, schema IDs, and compatibility checks that are compatible with the Confluent Schema Registry API.
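As an illustration of the Confluent-compatible endpoints, the sketch below builds (but does not send) two requests using only the Python standard library. The subject paths and the `application/vnd.schemaregistry.v1+json` content type follow the Confluent Schema Registry API. The base URL, including the host, port, and the `/api/v1/confluent` prefix, is an assumption made for this example, as are the helper names. Check the Schema Registry REST API reference for the values that apply to your deployment.

```python
import json
import urllib.request

# Hypothetical base URL: host, port, and path prefix depend on the deployment.
BASE_URL = "http://schema-registry.example.com:7788/api/v1/confluent"
CONTENT_TYPE = "application/vnd.schemaregistry.v1+json"


def _schema_body(schema: dict) -> bytes:
    # The Confluent API carries the schema as a JSON-encoded string in the body.
    return json.dumps({"schema": json.dumps(schema)}).encode("utf-8")


def register_request(subject: str, schema: dict) -> urllib.request.Request:
    """Build a POST that registers `schema` as a new version of `subject`."""
    return urllib.request.Request(
        f"{BASE_URL}/subjects/{subject}/versions",
        data=_schema_body(schema),
        headers={"Content-Type": CONTENT_TYPE},
        method="POST",
    )


def compatibility_request(subject: str, schema: dict) -> urllib.request.Request:
    """Build a POST that tests `schema` against the latest version of `subject`."""
    return urllib.request.Request(
        f"{BASE_URL}/compatibility/subjects/{subject}/versions/latest",
        data=_schema_body(schema),
        headers={"Content-Type": CONTENT_TYPE},
        method="POST",
    )


order_schema = {"type": "record", "name": "Order",
                "fields": [{"name": "id", "type": "string"}]}
request = register_request("orders-value", order_schema)
print(request.get_method(), request.full_url)

# Sending is left out so the sketch runs without a server:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes it easy to inspect the URL and payload before pointing the code at a real Schema Registry instance, where authentication and TLS settings would also need to be added.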
Documentation
- Cloudera Streams Messaging Operator for Kubernetes library – Includes Helm-based installation and configuration instructions such as configuration basics, external access setup, and supported security options. The library also publishes the Schema Registry REST API reference.
- Cloudera Runtime library – Includes conceptual descriptions, client development and integration guidance, usage instructions, and additional reference documentation.
