GitFlowPersistenceProvider stores flow contents under a Git directory.
In contrast to FileSystemFlowPersistenceProvider, this provider uses human-friendly
Bucket and Flow names so that those files can be accessed by external tools.
However, modifying stored files outside of NiFi Registry is NOT supported. Persisted
files are only read when NiFi Registry starts up.
Buckets are represented as directories, and Flow contents are stored as files in the
directory of the Bucket they belong to. Flow snapshot histories are managed as Git commits,
meaning only the latest version of each Bucket and Flow exists in the Git directory. Old
versions are retrieved from the Git commit history.
Example persisted files
Flow Storage Directory/
├── .git/
├── Bucket_A/
│   ├── bucket.yml
│   ├── Flow_1.snapshot
│   └── Flow_2.snapshot
└── Bucket_B/
    ├── bucket.yml
    └── Flow_4.snapshot
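
Because every flow snapshot version is stored as a Git commit, an earlier version of a file in the storage directory can be inspected read-only with ordinary Git commands. The following is a minimal sketch, assuming the `git` command-line client is installed. The helper names `history_args`, `show_args`, and `read_old_version` are hypothetical and not part of NiFi Registry.

```python
import subprocess
from typing import List


def history_args(repo_dir: str, rel_path: str) -> List[str]:
    """Build a `git log` command listing the commits that touched rel_path."""
    return ["git", "-C", repo_dir, "log", "--format=%H %s", "--", rel_path]


def show_args(repo_dir: str, commit: str, rel_path: str) -> List[str]:
    """Build a `git show` command printing rel_path as of the given commit."""
    return ["git", "-C", repo_dir, "show", f"{commit}:{rel_path}"]


def read_old_version(repo_dir: str, commit: str, rel_path: str) -> str:
    """Return the file content stored in an older commit (read-only)."""
    result = subprocess.run(
        show_args(repo_dir, commit, rel_path),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

These commands only read from the repository, which is consistent with the rule that stored files must not be modified outside of NiFi Registry.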
Each Bucket directory contains a YAML file named bucket.yml. The file
maps NiFi Registry Bucket and Flow IDs to the actual directory and file names.
When NiFi Registry starts, this provider reads through the Git commit history and looks up
these bucket.yml files to restore Buckets and Flows for each snapshot version.
Example bucket.yml
layoutVer: 1
bucketId: d1beba88-32e9-45d1-bfe9-057cc41f7ce8
flows:
  219cf539-427f-43be-9294-0644fb07ca63: {ver: 7, file: Flow_1.snapshot}
  22cccb6c-3011-4493-a996-611f8f112969: {ver: 3, file: Flow_2.snapshot}
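
The mapping from Flow IDs to file names can be sketched in a few lines. This is an illustration only, assuming the exact layout of the example file. It uses a small regular expression rather than a full YAML parser, and the function name `parse_bucket_yml` is hypothetical, not NiFi Registry's implementation.

```python
import re
from typing import Dict, Tuple

EXAMPLE = """\
layoutVer: 1
bucketId: d1beba88-32e9-45d1-bfe9-057cc41f7ce8
flows:
  219cf539-427f-43be-9294-0644fb07ca63: {ver: 7, file: Flow_1.snapshot}
  22cccb6c-3011-4493-a996-611f8f112969: {ver: 3, file: Flow_2.snapshot}
"""

# Matches one flow entry such as "<flow-id>: {ver: 7, file: Flow_1.snapshot}".
FLOW_LINE = re.compile(
    r"^\s+(?P<id>[0-9a-f-]{36}):\s*"
    r"\{ver:\s*(?P<ver>\d+),\s*file:\s*(?P<file>[^}]+)\}\s*$"
)


def parse_bucket_yml(text: str) -> Tuple[str, Dict[str, Tuple[int, str]]]:
    """Return (bucket_id, {flow_id: (ver, file_name)})."""
    bucket_id = ""
    flows: Dict[str, Tuple[int, str]] = {}
    for line in text.splitlines():
        if line.startswith("bucketId:"):
            bucket_id = line.split(":", 1)[1].strip()
            continue
        match = FLOW_LINE.match(line)
        if match:
            flows[match["id"]] = (int(match["ver"]), match["file"].strip())
    return bucket_id, flows


bucket_id, flows = parse_bucket_yml(EXAMPLE)
version, file_name = flows["219cf539-427f-43be-9294-0644fb07ca63"]
print(bucket_id, version, file_name)
```

A real tool would normally use a YAML library instead of a regular expression. The point here is only that the Flow ID, not the file name, is the stable key.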
Qualified class name: org.apache.nifi.registry.provider.flow.git.GitFlowPersistenceProvider
| Property | Description |
|---|---|
| Flow Storage Directory | REQUIRED: File system path of the directory where flow content files are persisted. The directory must exist when NiFi Registry starts and must already be initialized as a Git directory. |
| Remote To Push | When a new flow snapshot is created, this persistence provider updates files in the specified Git directory, then creates a commit in the local repository. If Remote To Push is defined, it also pushes to the specified remote repository (e.g. origin). To define a more detailed remote spec, such as branch names, use a refspec (see https://git-scm.com/book/en/v2/Git-Internals-The-Refspec). |
| Remote Access User | The username used to make push requests to the remote repository when Remote To Push is enabled and the remote repository is accessed over HTTP. If SSH is used, user authentication is done with SSH keys. |
| Remote Access Password | The password for the Remote Access User. |
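
The two requirements on Flow Storage Directory (the directory exists and is already a Git directory) can be checked before NiFi Registry is started. The following is a minimal sketch, assuming a regular non-bare repository whose `.git` entry sits directly in the storage directory. The function name `check_flow_storage_dir` is hypothetical.

```python
from pathlib import Path
from typing import List


def check_flow_storage_dir(path: str) -> List[str]:
    """Return a list of problems; an empty list means both checks passed."""
    directory = Path(path)
    if not directory.is_dir():
        return [f"{path} does not exist or is not a directory"]
    # Heuristic: a normal repository has a `.git` directory (or a `.git`
    # file for linked worktrees and submodules) at its top level.
    if not (directory / ".git").exists():
        return [f"{path} is not initialized as a Git directory"]
    return []
```

If the second check fails, running `git init` in the directory initializes it as a Git directory.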
Git user authentication
By default, this persistence provider only creates commits in the local repository. No user
authentication is needed to do so. However, if 'Remote To Push' is enabled, user
authentication to the remote Git repository is required.
If the remote repository is accessed over HTTP, the username and password for
authentication can be configured in the providers XML configuration file.
When SSH is used, SSH keys are used to identify a Git user. To pick the right
key for a remote server, the SSH configuration file ${USER_HOME}/.ssh/config is used.
The SSH configuration file can contain multiple Host entries, each specifying a key file
used to log in to a remote Git server. The Host value must match the target remote Git
server hostname.
Example SSH config file
Host git.example.com
  HostName git.example.com
  IdentityFile ~/.ssh/id_rsa
Host github.com
  HostName github.com
  IdentityFile ~/.ssh/key-for-github
Host bitbucket.org
  HostName bitbucket.org
  IdentityFile ~/.ssh/key-for-bitbucket
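
To illustrate how a Host entry selects the key for a remote server, the sketch below resolves the IdentityFile for a given hostname from a config like the example. It is a simplified model that assumes exact hostname matches only. Real SSH clients also support wildcard patterns and many more options. The function name `identity_file_for` is hypothetical.

```python
from typing import Dict, Optional

SSH_CONFIG = """\
Host git.example.com
  HostName git.example.com
  IdentityFile ~/.ssh/id_rsa
Host github.com
  HostName github.com
  IdentityFile ~/.ssh/key-for-github
Host bitbucket.org
  HostName bitbucket.org
  IdentityFile ~/.ssh/key-for-bitbucket
"""


def identity_file_for(hostname: str, config_text: str) -> Optional[str]:
    """Return the IdentityFile of the Host entry that exactly matches hostname."""
    keys: Dict[str, str] = {}
    current_host = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        keyword, value = parts[0].lower(), parts[1].strip()
        if keyword == "host":
            current_host = value
        elif keyword == "identityfile" and current_host is not None:
            # Keep the first IdentityFile seen for each host.
            keys.setdefault(current_host, value)
    return keys.get(hostname)


print(identity_file_for("github.com", SSH_CONFIG))  # ~/.ssh/key-for-github
print(identity_file_for("gitlab.com", SSH_CONFIG))  # None
```

A hostname without a matching Host entry returns no key in this model, which is why the Host value has to match the hostname of the remote Git server.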