Dates In Hive
After previously spending a lot of time looking at dates in SQL Server and oracle, I wanted to do the same justice to Hive.
Sample Date Functions
Hive is closer to oracle than to SQL Server when it comes to formatting datas, here are some samples:
WITH sample_date_data AS ( SELECT TIMESTAMP(from_unixtime(unix_timestamp())) AS the_timestamp ) SELECT the_timestamp AS the_timestamp, CAST(the_timestamp AS DATE) AS the_date, --TRUNC TRUNC(the_timestamp, 'MM') AS date_month, TRUNC(the_timestamp, 'YY') AS date_year, --Common date parts MONTH(the_timestamp) AS month, YEAR(the_timestamp) AS year, --DATE_FORMAT DATE_FORMAT(the_timestamp, 'u') AS daynumber_of_week, DATE_FORMAT(the_timestamp, 'EEEE') AS day_name FROM sample_date_data ;
With the above example, one can also see the differentiation between a timestamp and a date (if you aren’t interested in the time, it’s easier not to store/manipulate it).
TRUNC
Like oracle, hive has the TRUNC
command which rounds down a date to the granularity specified. Unlike Oracle, this doesn’t include a default of day. For getting a whole day. I’d recommend casting to DATE
.
The TRUNC
function is described as:
Returns date truncated to the unit specified by the format (as of Hive 1.2.0). Supported formats: MONTH/MON/MM, YEAR/YYYY/YY. Example: trunc(‘2015-03-17’, ‘MM’) = 2015-03-01.
Language Manual UDF
DATE_FORMAT
Similar to Oracle and SQL Server equivalents, this is based on a Java library for converting/representing values stored as dates in readable/different formats.
The DATE_FORMAT
function is described as:
Converts a date/timestamp/string to a value of string in the format specified by the date format fmt (as of Hive 1.2.0). Supported formats are Java SimpleDateFormat formats – https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html. The second argument fmt should be constant. Example: date_format(‘2015-04-08’, ‘y’) = ‘2015’.
Language Manual UDF
The format strings for this come straight from Java, Some common format strings are:
- EEEE: Day Name
- u: Day number of the week
- M: Month in year
There is a full list of date format strings over on the oracle Java website: SimpleDateFormat