Create policies
Create basic permissions for the DB with two roles: Full Access (Finance) and Read Only/Limited Access (Cloudera Staff) with Ranger Policies.
Best practise is to apply policies to roles, for example:
- Create Role ClimateAuditor (User role for data) ClimateAdmin (Admin Role)
- Admin roles have full control of the ClimateDatabases, can create new tables, add columns etc
- User roles have read write access to data
- Roles can be updated with additional policies and linked to both pre-existing and new groups.
Typically Administrators will link these roles to pre-existing Groups that exist in the corporate directory to enable existing HR processes to maintain memberships, for example a “Finance” group.
Create New Groups and link ClimateAccounting (e.g Composed of individual members such as Architects and Engineers), new groups may be project specific.
Ranger RMS function ensures that as well as the Hive Object, the underlying files are similarly protected with equivalent file ACLs.
Sync ACLs summarize Ranger RMS features and how to configure.
Migrating from Sentry to Ranger
>Ranger policies allowing create privilege
for Hadoop_SQL databases
Data encrichment
Whilst travel data will be extracted directly from the finance and accounting system, the enrichment process will see that data blended with external sources such as aircraft carbon emission data, in order to arrive at a more accurate true emissions analytic. Apache NiFi is an excellent technology for these types of enrichment pipelines enabling them to be loosely coupled whilst retaining regulatory compliance.
Data masking
Ingest from a governed data source requiring masking of certain columns, for example an employee's ID which would likely be incorporated into any journey information. (Note we capture the EmployeeID as the organization plans to evaluate a policy of budgeting CO2 emissions for individuals as part of its ongoing sustainability efforts.)
Read and analytical access via Impala (JDBC/Tableau etc) illustrate permissions and connection strings.
- Navigate to Apache Ranger Access Manager > Service Manager and select Add New Policy as follows:
- We then create the policy adding the Name (GDPR), Hive Database (carbonAccounting), Table (trip), Hive column (employID), Audit Logging (yes), validity.
- We then apply to the roles in this case Finance and GDPR.
- Select Masking Type, in this case, Redact.