Using constraints
You can use SQL constraints to enforce data integrity and improve performance. Using constraints, the optimizer can simplify queries. Constraints can make data predictable and easy to locate. Using constraints and supported modifiers, you can follow examples to constrain queries to unique or not null values, for example.
Hive enforces DEFAULT, NOT NULL and CHECK only, not PRIMARY KEY, FOREIGN KEY, and UNIQUE.
You can use the constraints listed below in your queries. Hive enforces DEFAULT, NOT NULL and CHECK only, not PRIMARY KEY, FOREIGN KEY, and UNIQUE. DEFAULT even if enforced, does not support complex types (array,map,struct). Constraint enforcement is limited to the metadata level. This limitation aids integration with third party tools and optimization of constraints declarations, such as materialized view rewriting.
- CHECK
- Limits the range of values you can place in a column.
- DEFAULT
- Ensures a value exists, which is useful in offloading data from a data warehouse.
- PRIMARY KEY
- Identifies each row in a table using a unique identifier.
- FOREIGN KEY
- Identifies a row in another table using a unique identifier.
- UNIQUE KEY
- Checks that values stored in a column are different.
- NOT NULL
- Ensures that a column cannot be set to NULL.
Supported modifiers
You can use the following optional modifiers:
- ENABLE
- Ensures that all incoming data conforms to the constraint.
- DISABLE
- Does not ensure that all incoming data conforms to the constraint.
- VALIDATE
- Checks that all existing data in the table conforms to the constraint.
- NOVALIDATE
- Does not check that all existing data in the table conforms to the constraint.
- ENFORCED
- Maps to ENABLE NOVALIDATE.
- NOT ENFORCED
- Maps to DISABLE NOVALIDATE.
- RELY
- Specifies abiding by a constraint; used by the optimizer to apply further optimizations.
- NORELY
- Specifies not abiding by a constraint.
You use modifiers as shown in the following syntax:
( ( ( (ENABLE | DISABLE) (VALIDATE | NOVALIDATE) ) | ( ENFORCED | NOT ENFORCED ) ) (RELY | NORELY) )
Default modfiers
The following default modifiers are in place:
- The default modifier for ENABLE is NOVALIDATE RELY.
- The default modifier for DISABLE is NOVALIDATE NORELY.
- If you do not specify a modifier when you declare a constraint, the default is ENABLE
NOVALIDATE RELY. The following constraints do not support ENABLE:
- PRIMARY KEY
- FOREIGN KEY
- UNIQUE KEY
Constraints examples
The optimizer uses the constraint information to make smart decisions. The following examples show the use of constraints.
The following example shows how to create a table that declares the NOT NULL in-line constraint to constrain a column.
CREATE TABLE t(a TINYINT, b SMALLINT NOT NULL ENABLE, c INT);
The constrained column b accepts a SMALLINT value as shown in the first INSERT statement.
INSERT INTO t values(2,45,5667);
...
----------------------------------------------------------
1 row affected ...
The constrained column b will not accept a NULL value.
INSERT INTO t values(2,NULL,5667);
Error: Error running query: org.apache.hadoop.hive.ql.exec.errors.DataConstraintViolationError: /
Either CHECK or NOT NULL constraint violated! (state=,code=0)
CREATE TABLE Persons (
ID INT NOT NULL,
Name STRING NOT NULL,
Age INT,
Creator STRING DEFAULT CURRENT_USER(),
CreateDate DATE DEFAULT CURRENT_DATE(),
PRIMARY KEY (ID) DISABLE NOVALIDATE);
CREATE TABLE BusinessUnit (
ID INT NOT NULL,
Head INT NOT NULL,
Creator STRING DEFAULT CURRENT_USER(),
CreateDate DATE DEFAULT CURRENT_DATE(),
PRIMARY KEY (ID) DISABLE NOVALIDATE,
CONSTRAINT fk FOREIGN KEY (Head) REFERENCES Persons(ID) DISABLE NOVALIDATE
);