"LexisNexis announced today that it will open-source its High Performance Computing Cluster (HPCC) technology, as well as offer an enterprise version with commercial support. The company is positioning HPCC Systems, developed internally by its Risk Solutions unit, as an alternative to Apache Hadoop. A virtual machine for testing purposes will be available soon, and code will be available in a few weeks." For fuller announcement, go here.
As a first impression, what are the major comparison points?
- LexisNexis has been using its technology for a while and has the marketing clout to match, but so far it has announced only plans to make the VM "available soon" and the code "in a few weeks." One wonders whether this is a reaction to the momentum that FreeEed has been gaining. FreeEed, by contrast, is already out on GitHub.
- LexisNexis is essentially a closed-source company, so one wonders how open the offering is really going to be. They may still succeed: look at Microsoft's open-source contributions. In LexisNexis's own words, "Only the core technology is being released, LexisNexis' own data linking techniques aren't being released, nor are its data sources." In contrast, FreeEed is pure open source (with commercial support options), and people are already investigating uses for it beyond eDiscovery, which illustrates the flexibility of an open-source offering.
- LexisNexis has Roxie, a system for queries and data warehousing, but FreeEed will offer the same capability, built on Cassandra.
- LexisNexis sports ECL (Enterprise Control Language), but Cassandra has CQL (Cassandra Query Language).
- LexisNexis's "HPCC team has been working with Amazon Web Services to make sure the product work well on AWS servers," but the FreeEed team has planned to use EC2 from the start and is actively working on it now.