Hyperion Gray, LLC Receives DARPA Memex funding to research technological superiority in the area of domain-specific indexing and search
[Dateline]—Hyperion Gray, LLC, a web security, distributed
computing and software R&D company, today announced receiving
research funding from the Defense Advanced Research Projects Agency
(DARPA) to develop an advanced web crawling and scraping system that
represents a revolutionary advance in the state of the art. This
contract is part of DARPA’s Memex program, a three-year research
effort to develop software that will enable domain-specific indexing
of open, public web content and domain-specific search capabilities.
Hyperion Gray, LLC has been selected by DARPA as a performer in the
technical area of domain-specific indexing. The contract is
administered by the Air Force Research Laboratory, Rome, NY.
“We’re excited to be working with DARPA again. It gives us the
opportunity to come up with and work on high-risk, high-reward ideas
to solve real-world problems, in an incredibly supportive environment
and alongside incredibly talented people. We’re also looking forward
to being able to open source a significant portion of our code,” says
Hyperion Gray founder Alejandro Caceres.
The Hyperion Gray Team includes some of the foremost experts in the web
crawling and big data fields. We’ve assembled a powerhouse team
capable of meeting any challenge we encounter throughout our research
effort.
Our partner Cloudera (www.cloudera.com) is supporting Team Hyperion
Gray with its Big Data expertise.
Cloudera is the leader in enterprise analytic data management powered
by Apache Hadoop™. It offers organizations one place to store,
access, process, secure, and analyze all their data, empowering them
to extend the value of existing investments while enabling
fundamental new ways to derive value from their data. Our proposed
project must be able to handle truly massive-scale data, and our
partner Cloudera will ensure that we can do so seamlessly.
Our partner Elephant Scale, LLC (www.elephantscale.com),
a specialized Big Data consultant and Hadoop training provider, is
also supporting this contract with Big Data expertise as well as
machine learning and language analytics expertise. Elephant Scale
delivers training and consulting in Big Data and publishes a popular
Hadoop-based open source eDiscovery solution, FreeEed
(http://freeeed.org).
It is also known for its open source book “Hadoop Illuminated”
and its new book “HBase Design Patterns”, which was recently
published by Packt.
Our partner Openindex (www.openindex.io),
a Dutch company specializing in customized web search solutions based
on Apache Nutch, brings a deep and highly specialized knowledge of
the Apache technologies to our team. Openindex develops custom search
solutions based on Apache Solr/Lucene and Apache Nutch, and also
offers 'Search as a Service' and a variety of web crawling, content
extraction and data analysis services.
Our partner Scrapinghub, Ltd. (www.scrapinghub.com), an Irish company,
is a pioneer in the field of web crawling and scraping, and its
technologies provide our team with a strong foundation on which to
build our revolutionary system. Scrapinghub was founded in 2010 by the
creators of Scrapy (http://scrapy.org), the popular web crawling
framework for Python, and has grown to a fully distributed team of 100
engineers working from over 20 countries. It provides a developer
platform for running web crawlers as well as professional services to
help companies collect data from the web successfully.
The proposed project focuses on advancing the state of the art of web
crawling and scraping systems to be able to handle dynamically
generated content. “With this impressive team, we are confident
that we’re going to do some great work for DARPA and provide some
awesome tools back to the open source community.”
Hyperion Gray, LLC
Hyperion Gray is a small research and development company focused on web
security, software development and distributed computing.
For more information on DARPA and the Memex program, visit www.darpa.mil.
The views expressed are those
of the author and do not reflect the official policy or position of
the Department of Defense or the U.S. Government.
Approved for Public Release,
Distribution Unlimited.
###