GitFlowPersistenceProvider

GitFlowPersistenceProvider stores flow contents under a Git directory.

In contrast to FileSystemFlowPersistenceProvider, this provider uses human friendly Bucket and Flow names so that those files can be accessed by external tools. However, it is NOT supported to modify stored files outside of NiFi Registry. Persisted files are only read when NiFi Registry starts up.

Buckets are represented as directories and Flow contents are stored as files in a Bucket directory they belong to. Flow snapshot histories are managed as Git commits, meaning only the latest version of Buckets and Flows exist in the Git directory. Old versions are retrieved from Git commit histories.

Example persisted files


Flow Storage Directory/
├── .git/
├── Bucket_A/
│   ├── bucket.yml
│   ├── Flow_1.snapshot
│   └── Flow_2.snapshot
└── Bucket_B/
    ├── bucket.yml
    └── Flow_4.snapshot

Each Bucket directory contains a YAML file named bucket.yml. The file manages links from NiFi Registry Bucket and Flow IDs to actual directory and file names. When NiFi Registry starts, this provider reads through Git commit histories and lookup these bucket.yml files to restore Buckets and Flows for each snapshot version.

Example bucket.yml


         layoutVer: 1
bucketId: d1beba88-32e9-45d1-bfe9-057cc41f7ce8
flows:
  219cf539-427f-43be-9294-0644fb07ca63: {ver: 7, file: Flow_1.snapshot}
  22cccb6c-3011-4493-a996-611f8f112969: {ver: 3, file: Flow_2.snapshot}
      

Qualified class name: org.apache.nifi.registry.provider.flow.git.GitFlowPersistenceProvider

Property

Description

Flow Storage Directory

REQUIRED: File system path for a directory where flow contents files are persisted to. The directory must exist when NiFi registry starts. Also must be initialized as a Git directory. See Initialize Git directory for detail.

Remote To Push

When a new flow snapshot is created, this persistence provider updates files in the specified Git directory, then creates a commit to the local repository. If Remote To Push is defined, it also pushes to the specified remote repository (e.g. origin). To define more detailed remote spec such as branch names, use Refspec (see https://git-scm.com/book/en/v2/Git-Internals-The-Refspec).

Remote Access User

This username is used to make push requests to the remote repository when Remote To Push is enabled, and the remote repository is accessed by HTTP protocol. If SSH is used, user authentication is done with SSH keys.

Remote Access Password

The password for the Remote Access User.

Remote Clone Repository

Remote repository URI to use to clone into Flow Storage Directory, if local repository is not present in Flow Storage Directory. If left empty the git directory needs to be configured as per Initialize Git directory. If URI is provided then Remote Access User and Remote Access Password also should be present. Currently, default branch of remote will be cloned.

Initialize Git directory

In order to use GitFlowPersistenceRepository, you need to prepare a Git directory on the local file system. You can do so by initializing a directory with git init command, or clone an existing Git project from a remote Git repository by git clone command. If you want to clone the default branch of remote repository automatically, set the Remote Clone Repository as described above.

Git user configuration

This persistence provider uses preconfigured Git user name and user email address when it creates Git commits. NiFi Registry user name is added to commit messages.

Example commit


commit 774d4bd125f2b1200f0a5ee1f1e9fedc6a415e83
Author: git-user <git-user@example.com>
Date:   Tue May 8 14:30:31 2018 +0900

    Commit message.

    By NiFi Registry user: nifi-registry-user-1

You can configure Git user name and email address by git config command.

Git user authentication

By default, this persistence repository only create commits to local repository. No user authentication is needed to do so. However, if 'Commit To Push' is enabled, user authentication to the remote Git repository is required.

If the remote repository is accessed by HTTP, then username and password for authentication can be configured in the providers XML configuration file.

When SSH is used, SSH keys are used to identify a Git user. In order to pick the right key to a remote server, the SSH configuration file ${USER_HOME}/.ssh/config is used. The SSH configuration file can contain multiple Host entries to specify a key file to login to a remote Git server. The Host must match with the target remote Git server hostname.

example SSH config file


Host git.example.com
  HostName git.example.com
  IdentityFile ~/.ssh/id_rsa

Host github.com
  HostName github.com
  IdentityFile ~/.ssh/key-for-github

Host bitbucket.org
  HostName bitbucket.org
  IdentityFile ~/.ssh/key-for-bitbucket