Date strings are parsed using local timezone

Certain UDFs that use the VectorUDFDateDiffColScalar class, such as datediff(), return incorrect results when parsing scalar dates. Learn about the changes introduced to fix this issue.

Before Upgrade to CDP 7.1.7 SP1

The VectorUDFDateDiffColScalar class uses java.text.SimpleDateFormat to parse date strings and interprets the scalar date in the local time zone. As a result, datediff() can return a value that is off by one day, as in the following example.

Example:
create external table test_dt(id string, dt date);
insert into test_dt values('11', '2021-07-06'), ('22', '2021-07-07');

select datediff(dt1.dt, '2021-07-01') from test_dt dt1 left join test_dt dt on dt1.id = dt.id;

Actual output:
+------+
| _c0  |
+------+
| 6    |
| 7    |
+------+

Expected output:
+------+
| _c0  |
+------+
| 5    |
| 6    |
+------+

After Upgrade to CDP 7.1.7 SP1

The parsing mechanism of the VectorUDFDateDiffColScalar class is updated to interpret date strings in Coordinated Universal Time (UTC) and the UDFs now return correct values.
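
The effect of the parsing time zone can be reproduced outside Hive with a minimal Java sketch. This is not the Hive implementation: the DateDiffSketch class and its methods are hypothetical, the conversion to epoch days is a simplified stand-in for the internal logic, and Asia/Kolkata is only an assumed example of a local time zone ahead of UTC.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.util.TimeZone;

// Hypothetical illustration class; it is not part of Hive.
class DateDiffSketch {
    private static final long MILLIS_PER_DAY = 86_400_000L;

    // Parses a yyyy-MM-dd string in the given time zone and returns
    // the number of whole days since 1970-01-01 measured in UTC.
    static long scalarEpochDay(String date, String zoneId) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        format.setTimeZone(TimeZone.getTimeZone(zoneId));
        try {
            long millis = format.parse(date).getTime();
            return Math.floorDiv(millis, MILLIS_PER_DAY);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a yyyy-MM-dd date: " + date, e);
        }
    }

    // Days between a column date (already a plain calendar date) and a
    // scalar date string parsed in the given time zone.
    static long dateDiff(String columnDate, String scalarDate, String zoneId) {
        long columnDay = LocalDate.parse(columnDate).toEpochDay();
        return columnDay - scalarEpochDay(scalarDate, zoneId);
    }

    public static void main(String[] args) {
        // Assumed local zone ahead of UTC: midnight local time falls on the
        // previous day in UTC, so the difference grows by one.
        System.out.println(dateDiff("2021-07-06", "2021-07-01", "Asia/Kolkata")); // 6
        // Parsing the scalar in UTC gives the expected difference.
        System.out.println(dateDiff("2021-07-06", "2021-07-01", "UTC")); // 5
    }
}
```

In this sketch, midnight on 2021-07-01 in the assumed local zone corresponds to 18:30 on 2021-06-30 in UTC, so the scalar date lands one day earlier and the difference is one day too large. Parsing the same string in UTC removes the shift.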

For more information, see HIVE-25449.