Step 2: Create DSL files
To create the DSL files, combine the different behaviors using the operators available in the Kraptr grammar.
Then create a schema that maps columns matching these DSLs to the tags defined in the previous step.
Define DSLs in files with the .kraptr_dsl.json extension and place all the DSL schema files in the following path:
/apps/dpprofiler/profilers/sensitive_info_profiler/${profiler_version}/lib/kraptr/dsl/sensitiveinfo/
Here is a sample schema with four DSL definitions.
{
  "groupName": "demo",
  "profilerInstanceName": "sensitivityinstance",
  "dsls": [
    {
      "matchType": "value",
      "dsl": "luhn_check",
      "tags": [
        "luhns_demo"
      ],
      "isEnabled": true
    },
    {
      "matchType": "value",
      "dsl": "regex(\"\\\\b(([a-zA-Z0-9_\\\\-\\\\.]+)@((\\\\[[0-9]{1,3}\\\\.[0-9]{1,3}\\\\.[0-9]{1,3}\\\\.)|(([a-zA-Z0-9\\\\-]+\\\\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\\\\]?))\\\\b\")",
      "tags": [
        "combined_demo"
      ],
      "isEnabled": true
    },
    {
      "matchType": "value",
      "dsl": "luhn_check or whitelist(\"/apps/dpprofiler/profilers/sensitive_info_profiler/1.0.0.1.2.1.0-16/meta/whitelist\")",
      "tags": [
        "combined_demo"
      ],
      "isEnabled": true
    },
    {
      "matchType": "name",
      "dsl": "is_in_ci(\"id\")",
      "tags": [
        "column_name_only_demo"
      ],
      "isEnabled": true
    }
  ],
  "isEnabled": true
}
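The regex entry in the sample carries two layers of escaping: one for the string literal inside the DSL expression and one for JSON itself. That is why each regex backslash appears as four backslashes in the file. The following Python sketch shows one way to generate such an entry instead of escaping it by hand. The helper names regex_dsl and dsl_entry and the example pattern are hypothetical. The rule that the DSL string literal needs backslashes and double quotes escaped once is an assumption inferred from the sample above, not confirmed Kraptr behavior.

```python
import json


def regex_dsl(pattern):
    """Wrap a raw regular expression in a regex("...") DSL call.

    Assumption: the DSL string literal uses backslash escapes, so every
    backslash and double quote in the pattern is escaped once here.
    """
    escaped = pattern.replace("\\", "\\\\").replace('"', '\\"')
    return 'regex("' + escaped + '")'


def dsl_entry(dsl, tags, match_type="value", enabled=True):
    """Build one element of the "dsls" array, using the keys from the sample."""
    return {
        "matchType": match_type,
        "dsl": dsl,
        "tags": list(tags),
        "isEnabled": enabled,
    }


# Hypothetical pattern, used only to illustrate the escaping.
entry = dsl_entry(regex_dsl(r"\b[0-9]{3}-[0-9]{2}-[0-9]{4}\b"), ["combined_demo"])

# json.dumps adds the second (JSON) layer of escaping.
print(json.dumps(entry, indent=2))
```

In the printed JSON, the regex token \b appears as \\\\b, which is the same form used by the email regex in the sample schema.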
The dsls section defines the different DSLs and the tags that they are mapped to.
- matchType - states whether this DSL is matched on the column name (ideal for cases like age, where identification by value is difficult) or on the column value. Set this to name or value accordingly.
- dsl - defines the DSL expression that is applied to the column name or value, based on matchType.
- tags - list of tags (defined in the Tag Schema) to apply if the DSL expression succeeds on this column name or value. Note that a successful DSL match on a single column value is not enough to tag the column. Kraptr also considers the percentages in the tag schema, any column name matches, and the percentage of rows matched out of the total rows to compute a final percentage. If this final percentage exceeds a threshold (currently fixed at 70 for the system), the column is tagged.
- isEnabled - if this is set to false, this DSL is not applied.
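Before placing a schema file in the DSL directory, it can help to check it against the fields described above. The following Python sketch is one possible pre-flight check, not part of Kraptr. The function names check_schema and check_file are hypothetical, and treating every key from the sample as required is an assumption.

```python
import json

# Keys taken from the sample schema; treating all of them as required is an assumption.
REQUIRED_TOP = ("groupName", "profilerInstanceName", "dsls", "isEnabled")
REQUIRED_DSL = ("matchType", "dsl", "tags", "isEnabled")
MATCH_TYPES = ("name", "value")


def check_schema(schema):
    """Return a list of problems found in a parsed DSL schema (empty if none)."""
    problems = []
    for key in REQUIRED_TOP:
        if key not in schema:
            problems.append("missing top-level key: " + key)
    for i, entry in enumerate(schema.get("dsls", [])):
        where = "dsls[%d]" % i
        for key in REQUIRED_DSL:
            if key not in entry:
                problems.append(where + ": missing key: " + key)
        if "matchType" in entry and entry["matchType"] not in MATCH_TYPES:
            problems.append(where + ": matchType must be name or value")
        if "tags" in entry and (not isinstance(entry["tags"], list) or not entry["tags"]):
            problems.append(where + ": tags must be a non-empty list")
        if "isEnabled" in entry and not isinstance(entry["isEnabled"], bool):
            problems.append(where + ": isEnabled must be true or false")
    return problems


def check_file(path):
    """Check the file extension, then parse the file and check its content."""
    problems = []
    if not path.endswith(".kraptr_dsl.json"):
        problems.append("file name must end with .kraptr_dsl.json")
    with open(path, encoding="utf-8") as handle:
        problems.extend(check_schema(json.load(handle)))
    return problems


sample = {
    "groupName": "demo",
    "profilerInstanceName": "sensitivityinstance",
    "dsls": [
        {"matchType": "value", "dsl": "luhn_check",
         "tags": ["luhns_demo"], "isEnabled": True},
        {"matchType": "name", "dsl": 'is_in_ci("id")',
         "tags": ["column_name_only_demo"], "isEnabled": True},
    ],
    "isEnabled": True,
}
print(check_schema(sample))
```

The check only covers structure. It does not parse the DSL expression itself or verify that the tags exist in the Tag Schema.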