Here are some statistics that tell users what to expect. On a high-CPU EC2 machine, each gigabyte took about one hour to process on average, at a cost of under $1 per gigabyte. The whole run was done on one machine and took about 2 days. It could have been shortened to under 4 hours by using 25-50 machines, but at this stage we were more interested in watching and debugging the process than in optimizing it.
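For anyone who wants to plug in their own numbers, here is a rough back-of-the-envelope sketch of the arithmetic above. It is only an estimate under assumptions: the 48 GB total is inferred from "one hour per gigabyte" and "about 2 days" rather than measured, the $1 per gigabyte is the upper bound quoted above, and the work is assumed to split evenly across machines with no startup or coordination overhead. The function name `estimate` is made up for this illustration and is not part of FreeEed.

```python
def estimate(total_gb, machines, hours_per_gb=1.0, max_cost_per_gb=1.0):
    """Return (wall_clock_hours, cost_ceiling_usd) for a processing run.

    Assumes perfectly even splitting across machines (an idealization)
    and treats max_cost_per_gb as an upper bound, not an exact price.
    """
    machine_hours = total_gb * hours_per_gb      # total work to be done
    wall_clock_hours = machine_hours / machines  # ideal linear speed-up
    cost_ceiling = total_gb * max_cost_per_gb    # cost scales with data, not machine count
    return wall_clock_hours, cost_ceiling


if __name__ == "__main__":
    # 48 GB is an inferred figure: about 2 days at roughly 1 hour per GB.
    for n in (1, 25, 50):
        hours, cost = estimate(48, n)
        print(f"{n:>2} machine(s): ~{hours:.1f} h wall clock, under ${cost:.0f} total")
```

In this idealized model the total cost stays the same no matter how many machines are used, since the same number of machine-hours is consumed either way. Real runs will be somewhat slower and costlier because of instance startup time and billing granularity.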
While processing was going on, we were also fixing the bugs we observed, mainly in the Tika parsers; the Tika team fixed some of them with a turnaround time of under one day. There is more work to do and more re-processing in sight, but the main takeaway is that FreeEed is mature and stable and can be relied upon for processing. Now is the time to take it to the next level by creating a Windows/Mac/Linux thin client and using Amazon EC2 for processing, which will make eDiscovery processing easily available to non-geek users.