Showing posts with label analytics. Show all posts

Tuesday, February 11, 2014

Mahout - Future Directions

Introduction

The goal of the Apache Mahout machine learning library is to provide scalable machine learning implementations. Mahout focuses primarily on Collaborative Filtering (Recommenders), Clustering, and Classification (known as the "3Cs"), along with the infrastructure needed to support those implementations. That infrastructure includes math packages for statistics, linear algebra, and other areas; Java primitive collections; local and distributed vector and matrix classes; and a variety of integration code for working with popular packages such as Apache Hadoop, Apache Lucene, Apache HBase, and Apache Cassandra.

Future Releases

Saturday, February 1, 2014

Using Apache Storm for real-time analytics at Rocket Lawyer.

With today’s data technologies, such as HDFS, Hadoop, and related architectures, storing data and scaling the infrastructure are becoming non-issues. Hadoop provides a batch-processing framework, MapReduce, for processing that data. However, batch processing suffers from high data read latency, which poses challenges for use cases like real-time analytics, clickstream visualization, and machine learning. We needed a real-time system that processes our customer- and system-generated data as it arrives, so that we can make important business decisions quickly. At Rocket Lawyer, we chose Apache Storm to supplement our data platform with real-time processing capabilities.