Another saga from the big data frontier: at work we have a source of GPS data, and I've been working with a colleague to aggregate it to a degree grid, not unlike the example linked. The data is in hive, and bizarrely it has two columns for each dimension, latitude magnitude as a positive float, and latitude sense as a string (N or S) etc. for longitude. To make our life simple we round the magnitude in the custom SQL that connects to hive, and we make the data signed again with a simple calculation in the Tableau source:
It turns out that it is an issue. I'm not sure what's happening in the data engine (two's complement going crazy because of negative zero?) but the two zeros are treated differently, recreated with the data of the post linked above.
Sure enough, looking at the two zeros there's one for -0 and one for +0. So we refine our calculation to avoid multiplying zero by -1
[round Latitude magnitude]*The rounding of the magnitude is fine as we also keep the sense in the group by dimensions in the custom SQL. The only special case here is when the rounded magnitude is zero, where had we done the sign assignment before the rounding, we'd have one bucket for zero instead of one for 0-0.5 and one for -0.5-0. But surely that shouldn't be an issue once we do the calculation above in tableau?
(case [latitude sense]when 'S' then -1
when 'N' then 1
end)
It turns out that it is an issue. I'm not sure what's happening in the data engine (two's complement going crazy because of negative zero?) but the two zeros are treated differently, recreated with the data of the post linked above.
Sure enough, looking at the two zeros there's one for -0 and one for +0. So we refine our calculation to avoid multiplying zero by -1
if [round Longitude magnitude]=0 then 0 else
[round Longitude magnitude]*
(case [longitude sense]
when 'W' then -1
when 'E' then 1
end)
end
No comments:
Post a Comment