Friday, January 31, 2014

Review of "Cassandra Design Patterns" book


A new book from Packt, for which I was a reviewer. Also, see my Amazon review of it here.

Sunday, January 26, 2014

Big Data Cartoon - Announcing Hadoop (TM) Coin

With Bitcoin's market cap at a reported 12B, and with the Hadoop yearly market targeting 4B by 2017, this is an apples-to-oranges comparison, and it is as hard to decide between the two as to answer the old children's question, "if an elephant steps on a whale, who will win?"

For this reason we decided to create a completely unauthorized Hadoop coin and offer it to the world. (Please keep in mind that Apache Hadoop is a trademark which we are only using here, not suggesting that we have the authority to speak on its behalf.)

How do you generate Hadoop coins? It is much simpler than with Bitcoins: all you need to do is forward this link or letter to a friend, and you have sent a Hadoop coin. It is eco-friendly, wasting no paper, and perhaps increasing the world's internet traffic by a paltry 1%. Thus, the possession of this coin provokes no envy and perhaps negates the ancient wisdom that "one who wants money will never have enough money." I, for one, can stare at this coin for a long time.

Tuesday, January 21, 2014

Big Data cartoon - a day in the life of a Big Data startup

Have you ever been a part of a startup? The normal worries: the team, the investors, the payroll, the plans?

Now, in a Big Data startup it's magnified: Big Data is unwieldy and it can be costly. Your Amazon clusters have been running idle for a week, your monthly bill is in the thousands, and your board is questioning you about the slow progress. Add to this that your HBase code works on a single node but not on a cluster, and you have a perfect storm.

Wednesday, January 15, 2014

Hadoop cartoons - one day in the life of a Big Data developer

What is one to do when his HBase application hangs, regardless of how he connects to ZooKeeper? Banging your head on the keyboard helps.

The possible reasons for this are:

  • ZooKeeper/HBase configuration problems (check them out independently with 'hbase shell' and zkCli.sh)
  • HADOOP_CLASSPATH not configured correctly
  • Running tests from Maven may also give you problems. When your 'hadoop classpath' is a few pages long and you run outside of Maven using this classpath, it will work.
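
For the first item, one quick way to rule out ZooKeeper itself is to send it the four-letter command "ruok", to which a healthy server answers "imok". The following is only a minimal sketch, not part of the original setup: the function name zookeeper_is_ok is made up for illustration, and the host and port defaults (localhost, 2181) are assumptions to adjust for your cluster. Newer ZooKeeper releases may also require four-letter commands to be explicitly enabled in the server configuration, in which case this check reports a failure even for a healthy server.

```python
import socket


def zookeeper_is_ok(host="localhost", port=2181, timeout=3.0):
    """Return True if the server at host:port answers 'ruok' with 'imok'.

    The name and the defaults are illustrative assumptions, not taken
    from any HBase or ZooKeeper client library.
    """
    reply = b""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            # The expected answer is exactly four bytes; stop once we
            # have them or the server closes the connection.
            while len(reply) < 4:
                data = sock.recv(4 - len(reply))
                if not data:
                    break
                reply += data
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
    return reply == b"imok"
```

If zookeeper_is_ok() returns True but the HBase application still hangs, the classpath items in the list above become the more likely suspects.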

Thursday, January 9, 2014

Hadoop Operations and Cluster Management Cookbook from Packt

I am a reviewer on this book, and here is what I say on Amazon about it:

The book talks about every aspect of Hadoop administration: choice of hardware/software, installation of Hadoop and all the tools, Pig, Hive, Mahout, etc. There are chapters on maintenance and monitoring. Lots of screenshots and command-line instructions.

I wish the book showed the latest developments in these areas, such as Cloudera Manager and Hortonworks Ambari, which make it all ridiculously simple. However, when those managers fail or are not supported, you are still back on the command line, so this approach definitely has its place.

I especially liked the monitoring chapters: Nagios, Ganglia and Ambari.