Project structure and development
Projects aim to provide a Software Development Lifecycle (SDLC) for streaming applications in SQL Stream Builder (SSB): they let developers organize their work around a task they want to solve using SSB, and collect all related resources, such as job and table definitions or data sources, in a central place.
Projects also facilitate collaboration between developers by sharing common resources among project members. Projects can be synchronized with a Git repository, allowing easy migration between different clusters. The environment concept in a project allows the templating of cluster-specific or sensitive properties.
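The idea behind environment templating can be illustrated with a minimal Python sketch. It does not use SSB's own placeholder syntax or API: the `${...}` placeholders come from Python's `string.Template`, and the property string, environment names, and values are all hypothetical. It only shows how a single definition can be rendered differently per cluster.

```python
from string import Template

# Hypothetical connection properties, written once with placeholders.
# The ${...} syntax here is Python's string.Template, not SSB's syntax.
TABLE_PROPERTIES = Template("bootstrap.servers=${kafka_brokers};topic=${topic}")

# Hypothetical environments holding the cluster-specific values.
ENVIRONMENTS = {
    "dev": {"kafka_brokers": "dev-broker:9092", "topic": "orders_dev"},
    "prod": {"kafka_brokers": "prod-broker:9093", "topic": "orders"},
}


def render(env_name: str) -> str:
    """Fill the placeholders with the values of the chosen environment.

    Raises KeyError if the environment or one of its keys is missing.
    """
    return TABLE_PROPERTIES.substitute(ENVIRONMENTS[env_name])


if __name__ == "__main__":
    for name in ENVIRONMENTS:
        print(name, "->", render(name))
```

Because the definition itself contains no cluster-specific values, the same text could be kept in a Git repository and rendered against a different environment on each cluster.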
A project is a collection of resources: static definitions of data sources, jobs with materialized views, virtual tables, user-defined functions (UDFs), and materialized view API keys. These resources are called internal to a project and can be safely used by any job within the project.
Jobs can also use external resources that the logged-in user has access to. These resources are defined in other projects (for example, a UDF or a virtual table in another project), or can come from a Data Source (for example, a Kudu table from the Kudu catalog). External resources are outside the scope of the project, so they can be changed or altered by external factors. It is recommended to use only internal resources for projects that are intended to be exported.
The following table summarizes the concepts and structural elements of projects in SSB:
Project-specific resources | |
Jobs | SSB job definitions, including SQL, Materialized View configuration and endpoints, and job settings such as checkpointing and parallelism. The status of the job and its associated Flink job (if any) are not part of the definition. |
Functions | User-defined JavaScript functions. |
Virtual Tables | Virtual tables of the project. These are stored in the <project_name> database of the ssb catalog. When exporting a project, only these tables (ssb.<project_name>.*) are exported. |
Data Sources | Data Sources include Kafka providers and Catalogs in the project. While the definition of the Data Source is considered internal to the project, the tables or topics provided by them are external, as they are managed by an outside system. |
Materialized Views | Materialized Views that have been defined for Jobs in the project. |
API Keys | API Keys for accessing Materialized View endpoints. |
Job Notifications | Job Notification actions (webhook and email notifications) as well as notification groups can be defined in a project. They can be assigned to Jobs so that they are triggered when a job fails. Job notifications are not synchronized when using the Git import/export feature. |
External resources | |
Virtual Tables | Virtual tables from other projects and catalogs that the logged-in user has access to. Other members of the active project might not have access to these. |
Connectors | Connectors available in SSB for connecting to external systems. These are shared among the whole SSB instance, and are accessible to all users in all projects. |
Data Formats | Data Formats available in SSB that can be used by connectors when connecting to external systems. These are shared among the whole SSB instance, and are accessible to all users in all projects. |
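The distinction between internal and external virtual tables can be sketched in a few lines of Python. The sketch relies only on the naming pattern from the table above (internal tables live under `ssb.<project_name>.*`). The function names and the sample table names are hypothetical, and quoted identifiers that contain dots are not handled.

```python
from typing import Iterable, List


def is_internal_table(qualified_name: str, project_name: str) -> bool:
    """Return True if the name has the form ssb.<project_name>.<table>."""
    parts = qualified_name.split(".")
    return len(parts) == 3 and parts[0] == "ssb" and parts[1] == project_name


def external_tables(table_names: Iterable[str], project_name: str) -> List[str]:
    """List the tables that would not be part of the project's export."""
    return [n for n in table_names if not is_internal_table(n, project_name)]


if __name__ == "__main__":
    # Hypothetical tables referenced by the jobs of a project named "fraud".
    used = [
        "ssb.fraud.orders",
        "kudu.default_database.customers",
        "ssb.other.payments",
    ]
    print(external_tables(used, "fraud"))
```

A check like this could be run before exporting a project to spot jobs that depend on tables from other projects or from catalogs.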
The ssb_default and [***USERNAME***]_default projects are generated automatically. The ssb_default project and its resources are visible to every user, while the [***USERNAME***]_default project and its resources are visible only to that user. Every user who has created an account in SSB gets their own project generated by default, to which members can be invited.