Tuesday, November 24, 2015

Thanksgiving in the Big Data Land

There are a few ways it could work out in the world of Big Data:

  • The traditional Hadoop elephant serves the traditional turkey
  • Two vegetarians, turkey and elephant, eat the traditional pumpkin pie
  • Nobody eats anybody at all, in the style of Alice in Wonderland.
Please take your pick, then send the postcard to your friend.

Tuesday, October 27, 2015

Hadoop and Spark cartoon

Our artist outdid herself with this cartoon; it is totally hilarious.

We have been teaching a lot of Spark courses lately, though.

Sunday, October 11, 2015

Learning Scala by Example

I have started a blog series, "Learning Scala by Example", based on Martin Odersky's excellent latest book of the same name.

So far I have three blog posts.

Further posts will appear regularly, one by one.

Enjoy the challenge and write back!

Wednesday, September 16, 2015

Houston Hadoop Meetup - David Ramirez presents Informatica

At the recent meeting, David Ramirez of Informatica presented the platform and explained how it saves time in data modeling for Big Data. His slides are found here, and they also contain a link to a demo on YouTube.

Keep in mind that David is working with Oil & Gas accounts in Houston - so you may get some interesting information from the slides.

Pizza was provided by Elephant Scale, and the meeting was hosted by Microsoft. Thanks to Tiru for helping with hosting.

Thursday, September 3, 2015

Big Data Cartoon - Hadoop is quite mature

Signs of Hadoop maturity:

  1. There is a competitor, Spark
  2. There is consolidation: Cloudera, Hortonworks, MapR dominate
  3. Promising startups are snatched: Hortonworks acquired Onyara, the maker of NiFi
  4. The elephant himself grew up - see this picture :)

Thursday, August 20, 2015

Mike Drob presents Cloudera Search at Houston Hadoop Meetup

Last Tuesday it was Cloudera's turn to take the podium. Mike Drob presented "Cloudera Search". The slides are right here. Every component is available via the open source channels, mostly in Solr.

Cloudera's value add is an open source, feature-rich search engine plus all the integrations with a data management ecosystem (also open source). This streamlines multi-workload search, or search alongside other workloads on the same data, without moving the data between systems. Cloudera also provides production tooling, audit, and security.

In addition, open source buffs can study the implementation described in these slides to glean best practices for their own solutions.

Very good, clear discussion - thank you, Mike!

Thanks again, Microsoft, for hosting the meetup at MS Campus.

Wednesday, July 22, 2015

Review of “Monitoring Hadoop” by Gurmukh Singh

This book was published recently, in April 2015, and it covers Nagios, Ganglia, Hadoop monitoring, and monitoring best practices.
The first part is rightfully devoted to Nagios, which is covered in considerable depth: installation, verification, and configuration. The book strikes the right balance: it does not repeat everything in the Nagios manual, but it gives you enough information to install Nagios and prepare it to monitor specific Hadoop daemons, ports, and hardware.
The same goes for Ganglia: it is covered in enough detail for you to install and run it, with due attention to Hadoop specifics.
What I did not find in the book, and what could be useful... to read further