Analytics on Unstructured Data

Rama Akkiraju

Distinguished Engineer

Building Compassionate Conversational Systems via User Modeling Analytics

Research on Conversational Systems is in vogue again, spurred by consumer-facing conversational agents such as Apple’s Siri, Amazon’s Alexa and Microsoft’s Cortana. To enable natural, personalized and compassionate conversations, we argue that Conversational Systems must be equipped with User Models. User Models are models of the people who use computer systems; they capture users’ context, preferences, personality, emotions, intentions, etc. While User Modeling has been an area of research in the field of Human-Computer Interaction (HCI) for quite some time, the rise of social media platforms such as Twitter, Facebook and Instagram, where users broadcast their daily activities and have social conversations with friends, is providing increased access to user data that can be analyzed (with users’ permission) for personalization. In this talk, I will present the work we are doing at IBM Watson to build user models using psycho-linguistic, natural language processing and machine learning approaches in service of enabling natural, personalized and compassionate conversations.


Rama Akkiraju is a Distinguished Engineer and Master Inventor at IBM. Rama is presently leading the mission of enabling natural, personalized and compassionate conversations between computers and humans via user modeling at IBM’s Watson division. Specific projects she is leading in the user modeling domain include researching and developing technologies to infer people’s personalities, emotions, tone, attitudes and intentions from social media data using linguistic and machine learning techniques.

In her career, Rama has worked on agent-based decision support systems, electronic marketplaces, and business process integration technologies including semantic Web services, for which she drove a World Wide Web Consortium (W3C) standard.

Rama has co-authored 4 book chapters and over 50 technical papers. Rama has over a dozen issued patents and 20+ others pending. She is the recipient of 3 best paper awards in the areas of AI and Operations Research. Rama has also received multiple awards and honors at IBM in her professional career.

Rama holds a Master’s degree in Computer Science and has received a gold medal from New York University for her MBA.



Jay Hopman

Data Scientist & Strategic Analyst, IT Business Value Pathfinding Team
Intel Corporation

Harvesting Crowd Knowledge Using Text Analytics

Knowledge collection through surveys or social media relies on human creation, consumption, and interpretation of text-based information, and the speed, efficacy, and ultimate impact of the information are limited by the bandwidth of participants and analysts. We have explored using text analytics to speed analysis and increase comprehension of crowd knowledge. Our method of running iterative topic modelers to map information to hierarchical models customized to topic domains has already shown tremendous improvement (orders of magnitude) over non-automated processes. Our findings from four pilot implementations show that combinations of supervised and unsupervised learning help digest information in quantities and at rates we could not previously achieve. We anticipate near real-time use of employee and customer knowledge to drive strategic and tactical decisions across the organization.


Jay Hopman is a data scientist and strategic analyst in Intel IT’s Business Value Pathfinding team. He has been developing and testing crowdsourcing solutions for the past decade and has worked with groups across Intel’s organization to bring ideas from the lab to the field. Solutions have included variations of prediction markets, crowd-based diffusion models for new products and technologies, and most recently models based on qualitative (text-based) crowd information. Jay holds a B.S. in Computer & Electrical Engineering from Purdue University and an MBA from the University of California, Davis. Outside of Intel, he has worked as visiting faculty at UC Davis.

Vishal Kapur

Principal, Advanced Analytics and Financial Regulatory Practice

A Practical Guide to Making NLP Work in Your Organization

Recent advances in Natural Language Processing technology and a proliferation of tools have made the hype and expectations stronger than ever before. While many organizations see the promise of the technology and have invested in prototypes and proofs of concept, delivering tangible, sustainable results is often more difficult than anticipated. This session will cover practical insights and lessons learned in implementing NLP for enterprise-scale problems – through a technical and a business lens. Using a case study format, the speakers will discuss practical insights and tips to address challenges across several dimensions such as data science, technology, architecture, strategy and organizational culture – and offer a roadmap for how you could make NLP-based solutions a reality in your organization.


Vishal Kapur is a principal in Deloitte’s Analytics and Information Management practice and has consulted with several large public sector agencies and commercial organizations for over 20 years. He has extensive experience in leading large-scale business transformations using information-driven solutions enabled with technologies that span mobile, digital, analytics and domain-specific applications. He has successfully combined his broad expertise across the strategy and business domains with the right data science techniques and enabling technologies to drive solutions that are innovative and practical and that create business impact. As a leader of Deloitte’s Federal Analytics practice, Vishal is actively engaged in the industry and with his clients in areas such as advanced analytics, cognitive computing, text mining, applied data science, and large-scale data management disciplines. Vishal received a Master’s degree in Mathematics and Computer Science from the Indian Institute of Technology, India.


Brian Ray

Cognitive Team Lead, Products & Solutions

Co-presenting with Vishal Kapur


Brian Ray is the Cognitive Team Lead for Deloitte Consulting LLP’s Products and Solutions group and has been consulting on Business Analytics Applications for nearly two decades.

Mr. Ray’s current focus is on Data Science leadership: assembling and managing a Cognitive team of data scientists within Deloitte’s Products and Solutions Group. With a multi-disciplinary approach, his team builds models using natural language processing (NLP), machine learning, image recognition, open source or text analytics, and more, to allow a higher level of interaction, understanding, and insight. He has deployed a dozen multi-million dollar products touching hundreds of clients.

In his free time, Mr. Ray has organized the Chicago Python User Group as its Chair for the past 14 years. He’s a frequent global flyer and foodie based out of Chicago and Atlanta.

Damon Samuel

Director of Data Science
RCG Global Services

Application of Machine Learning Techniques on Semi-structured Social Media Data for Classification

This presentation focuses on identifying, parsing, and leveraging structural elements found within social media data in order to build enough structure for both supervised and unsupervised machine learning techniques to be applied to determine which classes various posts belong to. Special attention is paid to the iterative process of exploring social media data to leverage elements such as emojis, links, tags, hashtags, and the common abbreviations and shortcuts found in this data. The discussion also highlights common tools and techniques for this effort and ends with a brief discussion of use cases for these analyses.


Damon Samuel, Director of Data Science at RCG Global Services, brings nearly 20 years of analytical experience to bear. Samuel has built models for numerous industries including Insurance, Automotive, Retail, Credit, Pharmaceuticals, Telecom, Staffing, and Utilities. These models have touched IT, Finance, Marketing, Real Estate, and more. Samuel was recognized by the Advertising Research Foundation as a top researcher in 2013 and was a board member for the Marketing Science Institute in 2015.

Jean Utke

Data Scientist
Allstate Insurance

Data Science at Allstate – What We Learned About Deep Learning

Allstate, the nation’s largest publicly held personal lines insurer, has a long tradition of using data and analytics to inform business decisions. With more than 200 data scientists and data engineers working in five locations across the U.S. and Europe, the company’s Quantitative Research & Analytics (QR&A) department uses unstructured and “big” data to solve complex problems for the business.
Allstate – like many other companies relying heavily on data science – is actively exploring deep learning approaches in an effort to determine where and how research results in artificial intelligence and deep learning can translate into tangible results for domain specific data analysis.
An early conclusion was that there are no off-the-shelf solutions for our specific problems and even published academic research didn’t address significant aspects of our problems. While the concrete use cases are proprietary, I will cover our experience in creating training data, establishing a hardware and software platform to support the effort, and highlight the algorithmic approaches developed with our academic partners. Using a non-insurance application scenario, I will demonstrate the gist of the problems and show that the fundamental algorithmic approaches are generic in nature and are of interest to the deep learning community in general.


Utke has been a data scientist at Allstate Insurance Company since 2014 working on predictive modeling projects for a variety of internal business partners and leading a team of data scientists investigating the applicability of novel methods and research results for Allstate’s use cases.
Prior to Allstate he worked for 11 years as a computer scientist on topics at the boundary of mathematics and computer science differentiation at the Mathematics and Computer Science division of Argonne National Laboratory with a joint appointment at The University of Chicago.
Preceding that he spent several years in consulting roles for software development in distributed systems at Motorola and other clients. He has a doctorate in mathematics (scientific computing) from Technical University of Dresden, Germany.