Can Machine Learning Really do That?
Anthony Goldbloom, Founder and CEO, Kaggle
Big data is one of today's hot technology trends. We’re told that our ability to store petabytes of data and the profusion of sensors will transform modern business. One element of the big data story that’s often overlooked is the development of machine learning techniques that allow us to do more with the data that we’re storing. This talk will introduce data mining competitions as a way to solve challenging problems. In doing this, it will also cover some of what Kaggle is seeing at the cutting edge of machine learning, including some of the applications that are now possible with modern techniques.
Watson in the Emerging Era of Cognitive Computing
Rob High, Jr., IBM Corporation
In this talk I will describe the role of cognitive computing in addressing some of the world’s most important business and social problems. I will discuss how Watson leverages human-readable information at massive scale to meet those needs. In addition, I will provide an update on how IBM has applied that capability to support clinical decision-making in oncology and to engage clients at insurance and financial services institutions.
Taming Big Data with Berkeley Data Analytics Stack (BDAS)
Ion Stoica, Professor, UC Berkeley; CEO, Databricks; CTO, Conviva
One of the most interesting developments over the past decade is the rapid increase in data; we are now deluged by data from on-line services (PBs per day), scientific instruments (PBs per minute), gene sequencing (250GB per person) and many other sources. Researchers and practitioners collect this massive data with one goal in mind: to extract "value" through sophisticated exploratory analysis, and to use it as the basis for decisions as varied as personalized treatment and ad targeting. Unfortunately, today's data analytics tools are slow in answering even simple queries, as they typically need to sift through huge amounts of data stored on disk, and they are even less suitable for complex computations, such as machine learning algorithms. These limitations leave the potential of extracting value from big data unfulfilled. To address this challenge, we are developing BDAS, an open source data analytics stack that provides interactive response times for complex computations on massive data. To achieve this goal, BDAS supports efficient, large-scale in-memory data processing, and allows users and applications to trade off query accuracy, time, and cost. In this talk, I'll present the architecture, challenges, early results, and our experience with developing BDAS. Some BDAS components have already been released: Mesos, a platform for cluster resource management, has been deployed by Twitter on 6,000+ servers, while Spark, an in-memory cluster computing framework, is already being used by tens of companies and research institutions.
Big Data and Big Analytics – So Much more Gunpowder!
Paul Kent, Vice President, SAS
Mathematical computing is changing for the better: modern analytic platforms such as Hadoop are embracing the massively parallel cluster. Spread your data out over the nodes, find ways to send the work to the data (instead of the other way around), and you’ll soon be enjoying some awesome advances in computing firepower. This talk details the transition to this new style of computing and gives examples of software techniques that were unreachable before but are now practical. Interactive, visual data exploration is available across all your data, and sophisticated programmatic model development that was previously computationally intractable is now within reach. The talk should challenge you to consider the status quo at your organization: are you doing enough to adapt to the new wave of computing?