Using constraints

You can use DEFAULT, PRIMARY KEY, FOREIGN KEY, and NOT NULL constraints in Hive ACID table definitions to improve the performance, accuracy, and reliability of data.

The Hive engine and BI tools can simplify queries if data is predictable and easily located. Hive enforces constraints as follows:
DEFAULT
Ensures a value exists, which is useful in EDW offload cases.
PRIMARY KEY
Identifies each row in a table using a unique identifier.
FOREIGN KEY
Identifies a row in another table using a unique identifier.
NOT NULL
Checks that a column value is not set to NULL.
The optimizer uses the information to make smart decisions. For example, if the engine knows that a value is a primary key, it does not look for duplicates. The following examples show the use of constraints:
CREATE TABLE Persons (   
   ID INT NOT NULL,   
   Name STRING NOT NULL,   
   Age INT,
   Creator STRING DEFAULT CURRENT_USER(),    
   CreateDate DATE DEFAULT CURRENT_DATE(),
   PRIMARY KEY (ID) DISABLE NOVALIDATE);
   
CREATE TABLE BusinessUnit (
   ID INT NOT NULL,    
   Head INT NOT NULL,
   Creator STRING DEFAULT CURRENT_USER(),    
   CreateDate DATE DEFAULT CURRENT_DATE(),
   PRIMARY KEY (ID) DISABLE NOVALIDATE,
   CONSTRAINT fk FOREIGN KEY (Head) REFERENCES Persons(ID) DISABLE NOVALIDATE
   );