Hyperion Gray, LLC receives DARPA Memex funding to research technological superiority in domain-specific indexing and search
[Dateline]—Hyperion Gray, LLC, a web security, distributed computing and software R&D company, today announced receiving research funding from the Defense Advanced Research Projects Agency (DARPA) to develop an advanced web crawling and scraping system that constitutes a revolutionary advancement to the state of the art. This contract is part of DARPA’s Memex program, a three-year research effort to develop software that will enable domain-specific indexing of open, public web content and domain-specific search capabilities. Hyperion Gray, LLC has been selected by DARPA as a performer in the technical area of domain-specific indexing. The contract is administered by the Air Force Research Laboratory, Rome, NY.
“We’re excited to be working with DARPA again. It gives us the opportunity to come up with and work on high-risk, high-reward ideas to solve real-world problems in an incredibly supportive environment and alongside incredibly talented people. We’re also looking forward to being able to open source a significant portion of our code,” says Hyperion Gray founder Alejandro Caceres.
The Hyperion Gray Team includes some of the foremost experts in the web crawling and big data fields. We’ve assembled a powerhouse team capable of meeting any challenge we encounter throughout our research effort.
Our partner Cloudera (www.cloudera.com) is supporting Team Hyperion Gray with its Big Data expertise. Cloudera is the leader in enterprise analytic data management powered by Apache Hadoop™. It offers organizations one place to store, access, process, secure, and analyze all their data, empowering them to extend the value of existing investments while enabling fundamentally new ways to derive value from their data. Our proposed project must be able to handle truly massive-scale data, and our partner Cloudera will ensure that we can do so seamlessly.
Our partner Elephant Scale, LLC (www.elephantscale.com), a specialized Big Data consultant and Hadoop training provider, is also supporting this contract with Big Data expertise as well as machine learning and language analytics expertise. Elephant Scale delivers training and consulting in Big Data and publishes a popular Hadoop-based open source eDiscovery solution, FreeEed (http://freeeed.org). It is also known for its open source book “Hadoop Illuminated” and its new book “HBase Design Patterns”, which was recently published by Packt.
Our partner Openindex (www.openindex.io), a Dutch company specializing in customized web search solutions based on Apache Nutch, brings deep and highly specialized knowledge of Apache technologies to our team. Openindex develops custom search solutions based on Apache Solr/Lucene and Apache Nutch, and also offers 'Search as a Service' and a variety of web crawling, content extraction and data analysis services.
Our partner Scrapinghub, Ltd. (www.scrapinghub.com), an Irish company, is a pioneer in the field of web crawling and scraping, and its technologies provide our team with a strong foundation on which to build our revolutionary system. Scrapinghub was founded in 2010 by the creators of Scrapy (http://scrapy.org), the popular web crawling framework for Python, and has grown to a fully distributed team of 100 engineers working from over 20 countries. It provides a developer platform for running web crawlers as well as professional services to help companies collect data from the web successfully.
The proposed project focuses on advancing the state of the art of web crawling and scraping systems to be able to handle dynamically generated content. “With this impressive team, we are confident that we’re going to do some great work for DARPA and provide some awesome tools back to the open source community.”
Hyperion Gray, LLC
Hyperion Gray is a small research and development company focused on web security, software development and distributed computing.
For more information on DARPA and the Memex program, visit www.darpa.mil
The views expressed are those of the author and do not reflect the official policy or position of the Department of Defense or the U.S. Government.
Approved for Public Release, Distribution Unlimited.