Sunday, January 15, 2012

Open Source E-Discovery Finds Scalability

In his front-page article in Law Technology News (LTN), Evan Koblentz writes:

"Open-source electronic discovery products, which appeared on the market in 2011, are quickly expanding in the new year. There is an evolving system for email archiving and searching, along with a cloud-based processing engine, both of which are on the cusp of large scalability, allowing them to handle growing workloads."

The three products discussed are:

  • Enkive from LinuxBox, for email storage, compliance, and eDiscovery;
  • MailArchiva from Stimulus Software; and
  • FreeEed from SHMsoft.

Here is what Evan writes about FreeEed:

On the processing side, SHMsoft's FreeEed project is now in its third generation. Previous versions could be clustered -- the concept of sharing large projects across many servers, resulting in faster jobs and better reliability -- by using the Hadoop system when hosted locally. But now there is a version called EC2Eed running atop the Amazon EC2 cloud, which will get clustering in two months, project leader Mark Kerzner said.
SHMsoft has also hired its first business development manager to help the project grow beyond word-of-mouth, Kerzner said.
The article concludes with a prediction:
"Someone's got to take the plunge," David Horrigan, a 451 Group analyst, added, in Boston. "Invariably, as always, someone will. Because it's a huge potential market."

SHMsoft is Putting the “e” in Discovery with Hadoop Big Data Platforms

Press Release

SHMsoft, a worldwide leader in providing innovative solutions to high-tech problems based upon disruptive technologies, is pleased to announce new releases of FreeEed™† and EC2Eed™†, open-source platforms for processing e-discovery.


PRLog (Press Release) - Jan 12, 2012 -
“FreeEed and EC2Eed provide fully scalable, fault-tolerant ways for processing data and can cope with any size of e-discovery job, from gigabytes to petabytes,” said Mark Kerzner, President of SHMsoft of Houston.

“The processing engine in FreeEed and EC2Eed is organized by the Hadoop framework which supports data-intensive distributed applications under a free license.  I further developed the Hadoop platform using HDFS, HBase/SimpleDB, Tika, Lucene, and Solr, cutting-edge technologies that enable FreeEed and EC2Eed to handle all types of structured, unstructured and semi-structured data in a fast and efficient manner,” Kerzner said.

●   FreeEed Release 3.1.4 gives litigation support and IT professionals the option of processing data on their own Windows workstations.  Each project creates its own Lucene index for later searches.  Metadata results are output as a CSV file, while the native files and the extracted text are stored in one or more zip files. The end results can be used for culling and producing native files for legal review.  A release of FreeEed planned for February 2012 will create PDF and/or TIFF files for loading into your litigation review platform.

●   EC2Eed, a premium edition of FreeEed, processes e-discovery on the Amazon cloud.  The litigation support or IT professional can log into their AWS account and use EC2Eed as a platform-as-a-service for all their processing needs at a cost of 96 cents per hour per server (prices vary depending on the choice of server).  Data can either be stored on the Amazon S3 cloud or remain in the organization’s own data center.
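
As a rough illustration of working with the output described above (a CSV of metadata plus a zip of native files and extracted text), here is a minimal Python sketch. The function name summarize_output and the layout of the sample data are assumptions made for illustration only; the press release does not specify FreeEed's column names or archive layout, so the sketch reports whatever columns and entries it finds rather than relying on any particular ones.

```python
import csv
import io
import zipfile


def summarize_output(metadata_csv, archive):
    """Summarize a processing result: a metadata CSV plus a zip archive.

    metadata_csv: a text file-like object containing CSV with a header row.
    archive: a path or binary file-like object containing a zip archive.
    Both formats are taken from the press release; the exact columns and
    archive layout are NOT documented there, so nothing is assumed about them.
    """
    reader = csv.DictReader(metadata_csv)
    rows = list(reader)  # one row per processed document (assumption)
    with zipfile.ZipFile(archive) as zf:
        entries = sorted(zf.namelist())
    return {
        "documents": len(rows),
        "columns": list(reader.fieldnames or []),
        "archive_entries": entries,
    }


if __name__ == "__main__":
    # Build a tiny, entirely made-up sample in memory to exercise the function.
    sample_zip = io.BytesIO()
    with zipfile.ZipFile(sample_zip, "w") as zf:
        zf.writestr("native/memo.doc", b"binary content placeholder")
        zf.writestr("text/memo.txt", "extracted text placeholder")
    sample_zip.seek(0)
    sample_csv = io.StringIO("id,filename\n1,memo.doc\n")
    print(summarize_output(sample_csv, sample_zip))
```

A summary like this is only a sanity check before culling or loading into a review platform; a real workflow would use the actual column names found in the CSV that FreeEed produces.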

SHMsoft will provide support and training for both implementations.  SHMsoft’s e-discovery processing platforms provide a much-needed solution to the high costs of e-discovery.
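
To put the quoted EC2Eed price of 96 cents per hour per server in perspective, the following Python sketch estimates the bill for a job. The function name estimate_processing_cost is hypothetical, and rounding partial hours up to a full hour is an assumption about billing rather than something the press release states. Actual prices vary with the choice of server, so the rate is a parameter.

```python
import math


def estimate_processing_cost(servers, hours_per_server, rate_cents_per_server_hour=96):
    """Estimate the cost in dollars of running a job on several servers.

    The default rate of 96 cents per server-hour comes from the press
    release. Rounding partial hours up is an assumption, not a quoted term.
    """
    if servers < 1:
        raise ValueError("servers must be at least 1")
    if hours_per_server < 0:
        raise ValueError("hours_per_server must not be negative")
    billable_hours = math.ceil(hours_per_server)  # assumed hourly billing
    # Work in whole cents to avoid floating-point drift in the total.
    total_cents = servers * billable_hours * rate_cents_per_server_hour
    return total_cents / 100


if __name__ == "__main__":
    # One server for one hour, then ten servers for two and a half hours each.
    print(estimate_processing_cost(1, 1))
    print(estimate_processing_cost(10, 2.5))
```

Under these assumptions, spreading a job across more servers shortens the wall-clock time but does not by itself lower the number of server-hours billed.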

† FreeEed™ and EC2Eed™ are registered trademarks of SHMsoft, Inc.

For More Information Contact:
Julie Wade
Director of Business Development
SHMsoft, Inc.
(713) 204-2565

# # #

SHMsoft is a software development company located in Houston, Texas, specializing in Hadoop clusters and cloud implementations.