Using generic ignore patterns

In addition to Hive metadata entities, generic ignore patterns can be set up to reduce the resource consumption caused by incoming metadata from any type of service.

Generic ignore patterns let you use regular expressions to specify entities that are excluded from metadata capture and processing. This helps you focus your metadata management efforts on relevant data entities, reducing clutter and improving the accuracy and efficiency of metadata operations.

The supported metadata sources include the following:
  • Apache HBase Hook
  • Apache Hive Metastore Hook
  • Apache HiveServer2 Hook
  • Apache Impala Hook
  • Apache Spark Hook

Generic ignore patterns on the server side

Ignore patterns can be configured to ignore entities based on Type Name, qualifiedName, or both. You can configure the generic ignore patterns in the Atlas Server Advanced Configuration Snippet (Safety Valve) for conf/atlas-application.properties:

  1. Go to Cloudera Manager > Clusters > Atlas > Configuration.
  2. Search for conf/atlas-application.properties.
  3. Enter the following configurations combined with your regular expression:
    Configuration for Type Name
    atlas.notification.consumer.preprocess.entity.type.ignore.pattern
    
    Configuration for qualifiedName
    atlas.notification.consumer.preprocess.entity.ignore.pattern
    
  4. Click Save Changes.
  5. Click Actions > Restart to apply your changes.

Ignoring all Apache Hive entities

atlas.notification.consumer.preprocess.entity.type.ignore.pattern=hive_.*

Ignoring all entities with suffix _tmp in their name

atlas.notification.consumer.preprocess.entity.ignore.pattern=.*\\..*_tmp.*
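
The doubled backslash in the pattern above is properties-file escaping: assuming standard Java-style properties loading, `\\` collapses to a single backslash, so the regular expression that is applied is `.*\..*_tmp.*`. The Python sketch below illustrates this. It assumes that a pattern must match the entire qualifiedName and that Hive table qualifiedNames look like `db.table@cluster`; the helper `unescape_properties_value` and the sample names are hypothetical and not part of Atlas.

```python
import re


def unescape_properties_value(raw: str) -> str:
    """Collapse doubled backslashes as a .properties loader would (hypothetical helper)."""
    return raw.replace("\\\\", "\\")


# Value exactly as typed into the safety valve (two backslashes before the dot).
raw_pattern = r".*\\..*_tmp.*"
regex = re.compile(unescape_properties_value(raw_pattern))

# Made-up qualifiedNames, assuming the db.table@cluster convention.
samples = [
    "default.sales_tmp@cm",
    "default.sales@cm",
    "staging.orders_tmp_2024@cm",
]

# Assumption: the whole qualifiedName has to match the pattern.
ignored = [qn for qn in samples if regex.fullmatch(qn)]
print(regex.pattern)
print(ignored)
```

Under these assumptions the first and third names are ignored. The trailing `.*` is what lets the pattern match past `_tmp`, so it also catches names where `_tmp` is followed by more characters, such as the cluster part after `@`.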

Ignoring all Apache Hive entities with "password" in the column name or "confidential" in the table name

atlas.notification.consumer.preprocess.entity.ignore.pattern=.*\\..*\\..*password.*,.*\\..*confidential.*
atlas.notification.consumer.preprocess.entity.type.ignore.pattern=hive_column,hive_table_ddl

Ignoring all Apache Hive entities with suffix _tmp in their name

atlas.notification.consumer.preprocess.entity.type.ignore.pattern=hive_.*
atlas.notification.consumer.preprocess.entity.ignore.pattern=.*\\..*_tmp.*
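
One way to picture how the Type Name and qualifiedName properties interact is sketched below in Python. The sketch rests on assumptions that the text above does not confirm: an entity is dropped only when it matches every property that is set, comma-separated values are alternative patterns, and each pattern must match the whole string. The functions `matches_any` and `is_ignored` and the sample entities are hypothetical.

```python
import re
from typing import Sequence


def matches_any(patterns: Sequence[str], value: str) -> bool:
    """True if at least one pattern matches the whole value."""
    return any(re.fullmatch(p, value) for p in patterns)


def is_ignored(type_name: str, qualified_name: str,
               type_patterns: Sequence[str] = (),
               name_patterns: Sequence[str] = ()) -> bool:
    """Assumed semantics: every configured pattern list must match.

    An empty list means that property is not set and does not filter.
    """
    if not type_patterns and not name_patterns:
        return False  # nothing configured, nothing ignored
    type_ok = not type_patterns or matches_any(type_patterns, type_name)
    name_ok = not name_patterns or matches_any(name_patterns, qualified_name)
    return type_ok and name_ok


# Patterns as the regex engine would see them after properties unescaping.
type_patterns = ["hive_.*"]
name_patterns = [r".*\..*_tmp.*"]

# Made-up entities: (type name, qualifiedName).
entities = [
    ("hive_table", "default.sales_tmp@cm"),   # type and name both match
    ("hive_table", "default.sales@cm"),       # type matches, name does not
    ("spark_table", "default.sales_tmp@cm"),  # name matches, type does not
]

results = [is_ignored(t, qn, type_patterns, name_patterns) for t, qn in entities]
print(results)
```

With both properties set, only the first sample entity is ignored in this sketch. Passing just `type_patterns` reproduces the "ignore all Apache Hive entities" case.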

Generic ignore patterns on the hook side

Ignore patterns can be configured on the hook side to ignore entities based on qualifiedName. You can configure the hook-side generic ignore patterns in the Cloudera Manager Advanced Configuration Snippet (Safety Valve) for atlas-application.properties of each individual hook:

  1. Go to Cloudera Manager > Clusters > ***Hook Service*** > Configuration.
  2. Search for atlas-application.properties.
  3. Enter the following configurations combined with your regular expression:
    Configuration for qualifiedName
    atlas.hook.entity.ignore.pattern
  4. Click Save Changes.
  5. Click Actions > Restart to apply your changes.

Ignoring all entities with “test” in their name

atlas.hook.entity.ignore.pattern=.*\\.test.*
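
Before restarting a hook service, it can help to try a pattern against a few qualifiedNames offline. The Python sketch below does this for the pattern above. It assumes whole-string matching and made-up names in the `db.table@cluster` and `db.table.column@cluster` style; it illustrates the regular expression only and is not the hook's actual code.

```python
import re

# The regex after properties unescaping of .*\\.test.*
hook_pattern = re.compile(r".*\.test.*")

# Made-up qualifiedNames for illustration.
candidates = [
    "default.test_orders@cm",       # table name starts with "test"
    "default.orders.test_flag@cm",  # column name starts with "test"
    "default.latest_orders@cm",     # "test" appears, but not right after a dot
    "test_db.orders@cm",            # only the database name contains "test"
]

# Assumption: the whole qualifiedName has to match the pattern.
verdicts = {
    qn: "ignored" if hook_pattern.fullmatch(qn) else "kept"
    for qn in candidates
}

for qn, verdict in verdicts.items():
    print(f"{qn}: {verdict}")
```

Under these assumptions only the first two names are ignored, because the pattern requires "test" directly after a dot. A pattern shaped like the server-side `_tmp` examples, such as `.*\\..*test.*`, would also catch names where "test" appears later in the table or column name.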