Tuesday, August 30, 2011

FreeEed processing is made verifiably stable

FreeEed relies on Tika to process emails. In fact, a single line of code tells Tika to extract every attachment, whatever its format, and process it as well, adding the extracted text to the total.

Well, today I found a bug in Tika (it was not closing those attachments) that led FreeEed to crash. But also today, the nice Tika programmers Mike McCandless, Nick Burch, and Jukka Zitting fixed the bug! So FreeEed is now happily churning through those gigabytes of Enron PST files on the EC2 machines.
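
For the curious, here is a rough sketch of why an unclosed attachment matters when you walk through nested attachments. This is not Tika's code or API: the Attachment class, the openCount counter, and the extract method are all made up for illustration. It only shows the general pattern of adding each attachment's text to the total while making sure every attachment that gets opened is closed again.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, NOT Tika's API: an attachment holds some text,
// may contain nested attachments, and must be closed after use.
class AttachmentSketch {
    // Number of attachments currently open (stands in for file handles).
    static int openCount = 0;

    static class Attachment implements AutoCloseable {
        final String text;
        final List<Attachment> children = new ArrayList<>();

        Attachment(String text) { this.text = text; }

        // Nests a child attachment and returns this one for chaining.
        Attachment add(Attachment child) { children.add(child); return this; }

        Attachment open() { openCount++; return this; }

        @Override
        public void close() { openCount--; }
    }

    // Appends this attachment's text, then recursively the text of
    // everything nested inside it. try-with-resources closes each
    // attachment even if processing a child throws.
    static void extract(Attachment a, StringBuilder total) {
        try (Attachment opened = a.open()) {
            total.append(opened.text).append('\n');
            for (Attachment child : opened.children) {
                extract(child, total);
            }
        }
    }

    public static void main(String[] args) {
        Attachment email = new Attachment("email body")
            .add(new Attachment("report.doc text"))
            .add(new Attachment("archive.zip listing")
                .add(new Attachment("nested.pdf text")));

        StringBuilder total = new StringBuilder();
        extract(email, total);
        System.out.print(total);
        System.out.println("still open: " + openCount);
    }
}
```

Without the close, the counter would only ever go up, and on gigabytes of email with many attachments a real process can eventually run out of whatever the open attachments hold on to.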

