Below is the background for Boyd's presentation, and here are the slides.
PROS provides big data software applications designed to help companies outperform in their markets by combining big data techniques with scientific analysis to sell more effectively. Companies in the manufacturing, distribution, services, and travel industries use these applications, which can generate millions of events per customer per day in production.
To provide effective support at this scale, we need an up-to-date understanding of the state of each remote server running our applications, as well as of the applications themselves. Attaining this understanding is complicated by several factors, including access to the remote systems and the fact that the many possible sources of information arrive in different formats and styles.
We have developed a framework for shipping server information and logs from our customers’ servers to our internal support servers, where they can be analyzed for internal and external purposes. Incoming data is stored in Elasticsearch for the short term and in HDFS for the long term. Periodic Hadoop batch jobs perform analytics on the data, and the results are stored in both Elasticsearch and HDFS. We then use Kibana 3, a front end for Elasticsearch, so that support personnel can query the analysis results and source data as they investigate reported issues, and can receive advance warning when imminent issues are predicted.
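The flow described above (normalize incoming records, write them to a short-term and a long-term store, then run a batch job over the long-term copy) can be illustrated with a minimal sketch. This is not the PROS framework: the two input formats, the field names, and the `normalize` and `Pipeline` names are all hypothetical, and plain in-memory lists stand in for Elasticsearch and HDFS so the example runs on its own.

```python
import json
import re

# Hypothetical plain-text log format: "<ISO timestamp>Z <host> <LEVEL> <message>".
# The real source formats are not specified in the abstract.
TEXT_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})Z "
    r"(?P<host>\S+) (?P<level>[A-Z]+) (?P<msg>.*)$"
)


def normalize(raw, source):
    """Convert one raw record into a common event dict, or None if unparseable."""
    if source == "text":
        match = TEXT_LINE.match(raw)
        if match is None:
            return None
        return {
            "timestamp": match.group("ts") + "Z",
            "host": match.group("host"),
            "level": match.group("level"),
            "message": match.group("msg"),
        }
    if source == "json":
        # Hypothetical JSON field names: time, server, severity, text.
        try:
            record = json.loads(raw)
            return {
                "timestamp": record["time"],
                "host": record["server"],
                "level": str(record["severity"]).upper(),
                "message": record["text"],
            }
        except (ValueError, KeyError, TypeError):
            return None
    return None  # unknown source type


class Pipeline:
    """In-memory stand-ins for the short-term and long-term stores."""

    def __init__(self):
        self.short_term = []  # stand-in for a searchable index
        self.long_term = []   # stand-in for an append-only archive of JSON lines

    def ingest(self, raw, source):
        """Normalize a record and write it to both stores; return True on success."""
        event = normalize(raw, source)
        if event is None:
            return False
        self.short_term.append(event)
        self.long_term.append(json.dumps(event, sort_keys=True))
        return True

    def error_counts_by_host(self):
        """A toy batch job: count ERROR events per host from the archive."""
        counts = {}
        for line in self.long_term:
            event = json.loads(line)
            if event["level"] == "ERROR":
                counts[event["host"]] = counts.get(event["host"], 0) + 1
        return counts
```

Normalizing to one event shape before storage means the batch job and any query front end only need to understand a single format, whatever the original source looked like.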