Tuesday, November 22, 2011

Adding image creation to FreeEed

By "image creation" in eDiscovery we mean making PDF or TIFF images of the originals. Having these is convenient for review, because it eliminates the need for the various applications required to open the native file formats, and it is useful for redaction.

For the last three weeks I have been starting a new assignment that deals with text analytics and understanding in the context of Big Data. That is great, because deeper knowledge of the subject will help me create open source tools for automated document review later on. But it also meant that I had only a couple of hours in the evening to work on FreeEed, and only on two evenings.

Nevertheless, this was enough. OpenOffice and LibreOffice are free, open source applications that can print MS Office documents to PDF, and JODConverter is a bridge that lets Java code talk to them. Altogether, printing is done with five lines of code. Here they are:


OfficeManager officeManager = new DefaultOfficeManagerConfiguration().buildOfficeManager();
officeManager.start();
OfficeDocumentConverter converter = new OfficeDocumentConverter(officeManager);
converter.convert(new File("test.odt"), new File("test.pdf"));
officeManager.stop();

Taking out the start/stop code, you have just one line:

converter.convert(new File("test.odt"), new File("test.pdf"));

That's it! One line of code (and lots of computing power) converts all MS Office file formats to PDF. Isn't this amazing? Any time you need more computing power, you can get it from the cloud on the cheap, and this is where FreeEed really begins to shine, because it is designed for parallel processing in the cloud.
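
To give an idea of how that one line could be applied to a whole batch of files, here is a minimal sketch. It is not FreeEed's actual code: the class name PdfBatchSketch, the DocumentConverter interface, and the list of extensions are all assumptions made for illustration. The interface stands in for the single convert(...) call, so the sketch compiles without JODConverter on the classpath. It runs sequentially; spreading the work over many machines is a separate concern.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class PdfBatchSketch {

    /** Hypothetical stand-in for the one-line convert(source, target) call. */
    public interface DocumentConverter {
        void convert(File source, File target);
    }

    // Extensions treated as office documents; this list is an assumption.
    private static final Set<String> OFFICE_EXTENSIONS = new HashSet<>(
            Arrays.asList("doc", "docx", "xls", "xlsx", "ppt", "pptx", "odt"));

    /** Returns true if the file name ends in one of the extensions above. */
    static boolean isOfficeDocument(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false;
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return OFFICE_EXTENSIONS.contains(ext);
    }

    /** Maps e.g. "report.docx" to "<outputDir>/report.pdf". */
    static File pdfTargetFor(File source, File outputDir) {
        String name = source.getName();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        return new File(outputDir, base + ".pdf");
    }

    /** Converts every office document in the list and returns the PDF targets. */
    static List<File> convertAll(List<File> sources, File outputDir,
                                 DocumentConverter converter) {
        List<File> produced = new ArrayList<>();
        for (File source : sources) {
            if (!isOfficeDocument(source.getName())) {
                continue; // skip anything that is not an office document
            }
            File target = pdfTargetFor(source, outputDir);
            converter.convert(source, target);
            produced.add(target);
        }
        return produced;
    }
}
```

With the real library, the converter object from the five lines above could presumably be plugged in as the DocumentConverter, with the office manager started once before the loop and stopped once after it, since starting the office process is the expensive part.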

Sometimes I wish I had more time for FreeEed, perhaps even working on it full-time. But then again, since I can do so much with these great open source tools, maybe that is not even necessary.

3 comments:

Diane said...

If you're looking for a free online image converter, there's a free online application that converts image formats. From there, you can convert it into anything else you want. It works very well on any platform.

Mark Kerzner said...

Diane,

we already have one, and it needs to run on Linux and integrate with our code, so an online tool won't work.