Wednesday, 8 February 2017

The story of Pig

Yahoo! (don't forget the exclamation mark!) nowadays makes the news for all the wrong reasons: being taken over by other companies, being hacked several times by unnamed state actors, not to mention the very dodgy advertising all over their websites. It wasn't always thus.

Cast your minds back to the mid noughties, and you'll remember Yahoo! acquiring the smaller stars of the web 2.0 constellation: Flickr, which they kept; Delicious, which they sold on; and Upcoming, which they retired. Of course, Yahoo! already had a history of acquiring companies with great products and messing them up, from GeoCities to LAUNCHcast. But by 2005 it seemed like they were suddenly getting it and becoming cool.

On the technology side, Yahoo! was a pioneer of Big Data, with Open Source projects such as Hadoop, Pig and other bits of that ecosystem, which became Apache projects rather than proprietary products. (The writer of that first blog post, Jeremy Zawodny, later summed up the story of that time nicely on his personal blog.)

One wonders whether they would be better off now had they kept Hadoop as their own product. Maybe they would be the giants of cloud computing. Releasing it as open source, though, meant that it became an effective industry standard, with other companies contributing projects such as Hive. (In fact, have a look at this blog post that details the use of both Pig and Hive inside Yahoo!. If only someone would update it to add Spark SQL to the mix!) So even if Yahoo! goes down, the Apache Hadoop ecosystem will probably survive.

