Friday, June 17, 2011

Open Source in eDiscovery – Discussion Continues

A commentary/opinion published today in Law Technology News is titled “The Cost of Open Source in eDiscovery.” It starts by saying that “There has been a lot of talk about open-source software and the dramatic effect it may have on e-discovery.” The author, attorney Sean Doherty, makes a number of good points. Let us see what open source can learn from them, and in doing so, let us also analyze the other points of view.

Any discussion is good, because it makes one wise, but what makes this discussion especially interesting is the fact that the original article by Evan Koblentz to which Doherty refers was itself published only three days ago. A "lot of talk" takes on a special meaning when applied to just three days.

After stating that the effect of open source may be limited to more technology-savvy lawyers, or that it may fit well into the eDiscovery market as it becomes commoditized, or that it may even go as far as bringing eDiscovery to every lawyer who needs it, Doherty makes an important point: open source and free are not one and the same.

The meaning of the word “free” in the context of open source has been discussed since the term's inception in 1998. The usual definition given is “free as in freedom, not as in free beer.” That still depends on whom you ask, but in my opinion it is fair to say that open source is free in the sense that the source code for the software is freely available, although its usage is regulated by the license under which it is released. It may be a license that requires that products that derive from it, or build on top of it, are also open-sourced, or it may be less restrictive. As you can see, “less restrictive” means that you are more free to use it. In general, an open-source project publishes its source code and specifies what one can do with it.

In practice, however, one can charge money for open-source products, and certainly for services and support built around them. The meaning of “free” is thus a general question about open source, not one specific to eDiscovery.

Specific to eDiscovery (and we will use the FreeEed open-source tool as an example), you need to hire someone to run it for you, especially since lawyers usually do not deal with technical matters. That, however, is true for any eDiscovery and any technology used in the practice of law, as evidenced by the existence of paralegals and IT departments in law firms. Still, the software being free might make a difference.

You also need to take care of upgrades – even though here open source may come out a winner, because its upgrades are free and frequent. Unless, as Doherty mentions, you use software-as-a-service. Here too there is nothing specific to open source. In particular, FreeEed is designed to work on the Amazon cloud, thus being software-as-a-service and also providing the scalability that comes with the use of compute clouds.

The next important point is licensing. There are a number of open-source licenses out there, and there is an ongoing disagreement between, for example, the Apache 2.0 license used by Apache projects and the GPL V2 license used by the Linux kernel. However, lawyers do not shy away from licensing issues. In fact, they will probably have an advantage in this over a lay business person. FreeEed uses the Apache 2.0 license, because the software packages which it builds upon – Hadoop for cluster processing, Lucene for text search, and Cassandra as a scalable fast NoSQL database – all use the Apache 2.0 license.

The next argument is that “Open source tools often require a level of sophistication that exceeds that needed for plug-and-play and click-to-receive software.” This may or may not be true. Some tools that are designed for programmers (a web crawler tool called Nutch comes to mind) expect one to be able to configure and run Linux applications. However, there are open-source applications that are quite easy to use. Consider such examples as OpenOffice, a Microsoft Office replacement, Ubuntu Linux for desktop use, Red Hat Linux for the enterprise, and the Firefox browser. FreeEed, for example, will do well to offer an easy-to-use graphical interface. This, in fact, was the first suggestion by Ron Chichester when he joined the FreeEed team, and the implementation has already begun. It will all depend on hard work and user feedback, not on any inherent limitations of closed-source or open-source software.

Next, “End point, it's one thing to be locked into a proprietary code base and another to be locked in by the developers and administrators to your open source tools.” Assuming that the tool is easy to use, its adoption is no different from that of closed source: for closed source, you have to evaluate the vendor, and for open source you have to evaluate the developer community and/or the commercial company providing support. Commercial eDiscovery vendors may be slow to respond, they may not care about you specifically, and they may be bought out or even go out of business. One extra option with open source is that your IT department can participate in the development and offer its contributions. If the tool is deployed for in-house processing, the department also has an additional guarantee: the code will always be available, should it ever have to continue using and developing it on its own.

The final argument is “Whether open source or proprietary code, you get what you pay for. And that value is often found in the service and support from manufacturers, not the software.” It is true that you get what you pay for; however, sometimes it costs less. And sometimes excellent products, like the ones mentioned above, are free. Google and Bing search are examples of free but excellent services. The service and support from manufacturers is another story. Any company is free to offer its support for any open-source tool, and SHMsoft is already offering this support for FreeEed. Any eDiscovery vendor can use FreeEed or any other tool of its choosing. Here everything will depend on the execution.
