Dates In Hive
After previously spending a lot of time looking at dates in SQL Server and oracle, I wanted to do the same justice to Hive.
Sample Date Functions
Hive is closer to oracle than to SQL Server when it comes to formatting datas, here are some samples:
WITH sample_date_data AS
(
SELECT TIMESTAMP(from_unixtime(unix_timestamp())) AS the_timestamp
)
SELECT the_timestamp AS the_timestamp,
CAST(the_timestamp AS DATE) AS the_date,
--TRUNC
TRUNC(the_timestamp, 'MM') AS date_month,
TRUNC(the_timestamp, 'YY') AS date_year,
--Common date parts
MONTH(the_timestamp) AS month,
YEAR(the_timestamp) AS year,
--DATE_FORMAT
DATE_FORMAT(the_timestamp, 'u') AS daynumber_of_week,
DATE_FORMAT(the_timestamp, 'EEEE') AS day_name
FROM sample_date_data
;
With the above example, one can also see the differentiation between a timestamp and a date (if you aren’t interested in the time, it’s easier not to store/manipulate it).
TRUNC
Like oracle, hive has the TRUNC command which rounds down a date to the granularity specified. Unlike Oracle, this doesn’t include a default of day. For getting a whole day. I’d recommend casting to DATE.
The TRUNC function is described as:
Returns date truncated to the unit specified by the format (as of Hive 1.2.0). Supported formats: MONTH/MON/MM, YEAR/YYYY/YY. Example: trunc(‘2015-03-17’, ‘MM’) = 2015-03-01.
Language Manual UDF
DATE_FORMAT
Similar to Oracle and SQL Server equivalents, this is based on a Java library for converting/representing values stored as dates in readable/different formats.
The DATE_FORMAT function is described as:
Converts a date/timestamp/string to a value of string in the format specified by the date format fmt (as of Hive 1.2.0). Supported formats are Java SimpleDateFormat formats – https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html. The second argument fmt should be constant. Example: date_format(‘2015-04-08’, ‘y’) = ‘2015’.
Language Manual UDF
The format strings for this come straight from Java, Some common format strings are:
- EEEE: Day Name
- u: Day number of the week
- M: Month in year
There is a full list of date format strings over on the oracle Java website: SimpleDateFormat