03-04-2014, 05:04 PM
Anyone know PIG/Hive queries?
I'm trying to do something relatively simple in PIG, but it's proving tough. Basically, I want to load time series data with a known schema (a flat CSV list of values) and a list of value pairs. For each pair in the list, I want to filter the data down by its ID and then call a function to decode the value and generate each tuple. I'm having trouble with the filter step, and it might be caused by a bug in PIG or something.
/* Input */
data // Time series data (year:int, month:int, day:int..., param_id:int, value:chararray)
parm // (id:int, sub_id:int)

/* Get rows target parameter */
out = FOREACH parm {
    p_tar = FILTER data BY param_id == id;
    --GENERATE FLATTEN(p_tar.(year, month, day, hour, minute, second, millisecond, model, serial)), id AS id, sub_id AS sub_id, GetValue(p_tar, parm.id, parm.sub_id) as val:double;
}
dump out;
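To make the intended result concrete, here is a minimal sketch of the same logic in plain Python rather than PIG. It is only an illustration under assumptions: the Row fields are a shortened version of the schema above, and get_value is a hypothetical stand-in for GetValue (the real decoder and its signature are not shown here). The sketch pairs each (id, sub_id) entry with the data rows whose param_id matches, which is effectively a join. That is probably also the shape this needs in PIG (a JOIN of data and parm on param_id == id, followed by FOREACH ... GENERATE), since a nested FOREACH block operates on bags inside the current tuple, not on a separate relation.

```python
from collections import defaultdict
from typing import Callable, Iterable, List, NamedTuple, Tuple


class Row(NamedTuple):
    # Shortened, assumed version of the time series schema.
    year: int
    month: int
    day: int
    param_id: int
    value: str


def get_value(value: str, param_id: int, sub_id: int) -> float:
    # Hypothetical decoder: treats value as ';'-separated numbers
    # and picks the one at position sub_id.
    return float(value.split(";")[sub_id])


def decode_rows(
    data: Iterable[Row],
    parm: Iterable[Tuple[int, int]],
    decoder: Callable[[str, int, int], float],
) -> List[tuple]:
    # Group rows by param_id once, so each parm entry is a lookup
    # instead of a full scan (the equivalent of a join on the ID).
    by_param = defaultdict(list)
    for row in data:
        by_param[row.param_id].append(row)

    out = []
    for param_id, sub_id in parm:
        for row in by_param.get(param_id, []):
            val = decoder(row.value, param_id, sub_id)
            out.append((row.year, row.month, row.day, param_id, sub_id, val))
    return out
```

Called with two rows and a single pair (7, 1), only the row with param_id 7 survives, and its value is decoded using sub_id 1.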