Poster presentations will be scheduled in two sessions held after lunch on Monday and Tuesday. Dessert will be served at the same time as the presentations. The poster presentations will be the only event on the program during these times so that all conference participants can attend the session.
Monday afternoon | Exhibit Hall
- Model Fit In Linear Regression: Overfitting Vs Underfitting
Shilpa Balan, College of Business and Economics at California State University-Los Angeles
Linear regression is a popular statistical model used to make predictions. It has seen many uses, including predicting sales trends and assessing risk in financial services, the credit card industry, and the healthcare industry, to name a few. However, several issues and questions continue to arise with linear regression for prediction. While previous research has examined overfitting and, to a lesser extent, underfitting, no research yet discusses in detail or clearly states the break-even point between underfitting and overfitting models that would determine the accurately predicted linear regression model. Does this imply that a moderately developed underfitting model could be mistaken for an overfitting model, or vice versa? Extreme cases of overfitting or underfitting are, of course, clear and obvious. However, this is not always the case, especially with moderately developed models. The prediction accuracy of moderately developed models is impacted by the dependent variable, the sample size, and the choice of independent variables, to name a few. This research examines the issues with model development in linear regression for making better predictions. For example, if the independent predictor variables are correlated, how would that impact the linear regression model? We examine each such case in linear regression model development.
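The contrast between underfitting and overfitting can be sketched in a few lines; the toy dataset and the three stand-in models below are invented for illustration and are not the study's models:

```python
# Illustrative sketch (hypothetical data): an underfit, a reasonable,
# and an overfit model compared on the same train/test split.

def mse(pred, xs, ys):
    """Mean squared error of a prediction function over a dataset."""
    return sum((pred(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Training and test data drawn roughly from y = 2x with small "noise".
train_x = [1, 2, 3, 4, 5]
train_y = [2.1, 3.9, 6.2, 7.8, 10.1]
test_x = [1.5, 2.5, 3.5, 4.5]
test_y = [3.0, 5.1, 6.9, 9.0]

# Underfit: ignore x entirely and always predict the training mean.
mean_y = sum(train_y) / len(train_y)
underfit = lambda x: mean_y

# Reasonable fit: ordinary least-squares line (closed form).
n = len(train_x)
mx, my = sum(train_x) / n, sum(train_y) / n
slope = sum((x - mx) * (y - my) for x, y in zip(train_x, train_y)) / \
        sum((x - mx) ** 2 for x in train_x)
intercept = my - slope * mx
linear = lambda x: slope * x + intercept

# Overfit: memorize the training set (1-nearest-neighbor lookup).
# Train error is exactly zero, but test error is far worse than OLS.
overfit = lambda x: train_y[min(range(n), key=lambda i: abs(train_x[i] - x))]

for name, model in [("underfit", underfit), ("linear", linear), ("overfit", overfit)]:
    print(name, round(mse(model, train_x, train_y), 3), round(mse(model, test_x, test_y), 3))
```

The "moderately developed" cases the abstract discusses sit between these extremes, where train and test errors alone do not cleanly separate the two failure modes.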
Dr. Shilpa Balan is an Assistant Professor in the College of Business and Economics at California State University-Los Angeles. Dr. Balan’s primary research interests are in Data Mining and Business Analytics. She has co-authored a book titled Business Intelligence in Healthcare with IBM Watson Analytics. Dr. Balan was awarded the Most Promising Research Award by the College of Business and Economics at California State University-Los Angeles in August 2016. Dr. Balan is a recipient of several Business Intelligence and Analytics Education Grants and a recipient of the Microsoft Azure Research Grant Award. She is a recipient of the Research Grant Award from the Center for Research and Development at California State University-Los Angeles for her research in Healthcare Data Analytics for the academic years 2015-2016 and 2016-2017.
- Interventional Radiology Resource Modeling And Simulation – A Case Study
Michael Prokle, Philips Research North America
We model and simulate an interventional radiology department (IR-Dept) seeking to decrease patient wait times. An IR-Dept’s performance depends on the simultaneous availability of professional resources (e.g. physician, nurse), physical assets (procedure room, equipment), clinical workflow (e.g. prep-procedure-recovery) and its ability to handle inherently stochastic processes (e.g. patient arrivals, procedure times). We show how these aspects can be simulated and validate the model’s output (i.e. patient wait times) against an actual day in a hospital’s IR-Dept. Further, we use this model to show how this IR-Dept can be optimized to improve patient wait times and length of stay.
Michael Prokle is a Data Scientist at Philips Research North America in Cambridge, MA. He received his PhD in Industrial Engineering & Operations Research from the University of Massachusetts Amherst. Michael holds master's degrees in Business & Industrial Engineering from the Karlsruhe Institute of Technology, Germany, and the University of Massachusetts Amherst. He studied Innovation and Entrepreneurship at the Royal Institute of Technology, Stockholm, Sweden. Michael has worked for McKesson and MAN Truck & Bus AG, one of the leading international providers of commercial vehicles. He received the Eugene M. Isenberg Scholar Award (2013, 2014) from the University of Massachusetts Amherst and the Judith Liebman Award at the 2015 INFORMS Annual Meeting. For his dissertation, Michael was advised by Professor Ana Muriel and collaborated with Pratt & Whitney and Artaic-Innovative Mosaic on inventory and supply chain synchronization.
- Teacher-school Assignment With Forward-looking Stability
Navid Matin Moghaddam
Teacher retention has been identified as a pressing problem for Arizona’s public schools. The available information, while limited, suggests that Arizona’s teacher retention rates are lower than the national average. For example, in Arizona approximately 24% of first-year teachers leave the profession, compared to 7.5% of first-year teachers nationally.
Previous studies have demonstrated that successive years of excellent teachers can help close achievement gaps, especially among poor and minority students. In contrast, research suggests that teachers leaving schools can lead to unstable learning environments and, as a result, negatively affect students' academic progress. Additionally, teachers are more likely to leave the schools where they are needed the most, namely those that serve poor and minority students. Such high rates of attrition leave positions to be filled by less experienced and less effective teachers.
In addition to their impact on students, high rates of teacher attrition are also costly to schools and create additional workloads for veteran teachers. The total financial costs associated with teacher turnover vary by district, ranging from $4,000 per teacher in small rural districts to $18,000 per teacher in large urban districts. Again, these costs are most damaging in high-poverty school districts where resources are scarce.
In this study, we propose and plan to test the following hypothesis: when teachers' and schools' preferences and attitudes towards the profession, professional responsibilities, and expectations are aligned, we would observe lower rates of teacher turnover. The hypothesized mechanism underlying the relationship between preferences and turnover is as follows: when teachers' perceptions of working conditions, expectations, and rewards are not aligned with those of schools and school administrators, the initial "match," i.e., the teacher-school pairing, would be unstable, and teachers would start looking for alternative options, such as leaving the profession entirely or moving to another school or school district.
The three major goals of this project are therefore to: 1. Analyze a longitudinal census of teacher data to identify characteristics of teacher-school pairings that plausibly influence teachers' mobility decisions. 2. Develop a mathematical formulation and solution algorithm that identifies optimal (or near-optimal) teacher-school pairings that would result in fewer teachers leaving schools. 3. Test this algorithm using (1) simulated data and (2) real data on teacher mobility. We see the major outcome of the project as a tool for schools and school districts that would help them make better hiring decisions, resulting in long-term stable employment and thus reducing both monetary costs (hiring and training new teachers) and non-pecuniary costs (the effect of less qualified and less experienced teachers on students' outcomes).
Navid Matin Moghaddam earned his M.Sc. in Industrial Engineering from Clemson University in 2015. In August 2015, he joined Arizona State University, where he is pursuing a PhD in Industrial Engineering. Navid's research interests span the development and application of novel OR and analytics methodologies, such as optimization and machine learning, to solve real-world problems.
- Examining The Effect Of Forecast Accuracy On A Dynamic Inventory Optimization Model For Spares
This study proposes an optimization solution to minimize costs for the retailer's inventory system. In the past, demand for different items was forecasted yearly and the distribution of each item was neglected. The retailer obtained weekly and monthly demand forecasts by simply dividing the yearly forecast by fixed numbers. As a result, the retailer purchased items in bulk to prepare for unexpected demand, which generated huge holding costs. If we model the distribution of each item, a dynamic economic order quantity model becomes possible. To solve this problem, we estimate the exact demand distribution for each item. Based on these distributions, we derived formulas to calculate costs and service levels. We created formulas as well as a simulation model to test them in practice. Lastly, with various constraints, we optimized our model to minimize cost while meeting several requirements, such as maximizing service level, for each item type.
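As a rough illustration of why per-item order sizing beats once-a-year bulk purchasing, here is a classic economic order quantity (EOQ) sketch; the demand and cost figures are invented, and the poster's actual model is distribution-based and richer:

```python
import math

# Classic deterministic EOQ as a simple stand-in for the per-item cost
# model; all numbers below are made up for illustration.

def eoq(annual_demand, order_cost, holding_cost):
    """Economic order quantity minimizing ordering + holding cost."""
    return math.sqrt(2 * annual_demand * order_cost / holding_cost)

def annual_cost(q, annual_demand, order_cost, holding_cost):
    """Total yearly ordering + holding cost for order size q."""
    return annual_demand / q * order_cost + holding_cost * q / 2

D, K, h = 1200, 50.0, 2.0            # units/year, $/order, $/unit/year
q_star = eoq(D, K, h)                # optimal order size (~245 units)
bulk = annual_cost(1200, D, K, h)    # one bulk order for the whole year
opt = annual_cost(q_star, D, K, h)   # cost at the EOQ order size
print(round(q_star, 1), round(bulk, 2), round(opt, 2))
```

The bulk policy's cost is dominated by holding inventory, which is exactly the effect the abstract describes; the dynamic model replaces the deterministic demand D with each item's estimated distribution.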
Shan worked as a market researcher in Taiwan and is passionate about consumer behavior and data-driven analytics. She is also experienced in project management. She enjoys bringing out the best in teammates from diverse backgrounds and letting them work together for maximum synergy.
- Cost Optimization for Automated Solution Design of IT Services
Aly Megahed, IBM
IT service providers typically need to prepare a solution that fulfills the IT requirements of clients (software, hardware, mobile computing, etc.). The solution consists of a set of offerings and their attribute values (e.g., a cloud computing offering with x GB of RAM, y vCPUs, and z GB of disk storage) that should come at the minimum possible cost. There are typically multiple possible offerings that can fulfill each requirement. Besides the need to determine the optimal set of offerings, another challenge is that the cost function for such offerings is usually a computationally expensive-to-evaluate black box. We developed a novel framework that relies on a MIP formulation, Bayesian optimization, and a decomposition scheme for solving the problem, in which we alternate between querying the true expensive function and a cheaper approximation of it. The problem is an immediate challenge faced by large IT service providers like the firm where this work was developed.
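One way to picture the alternation between the expensive black box and its cheap approximation is a screen-then-query loop; both cost functions below are invented stand-ins, not the actual offering cost model or the full Bayesian-optimization machinery:

```python
# Hypothetical sketch: rank candidate offering bundles with a cheap cost
# approximation, then spend expensive black-box evaluations only on the
# most promising few. Both cost functions are fabricated for illustration.

def expensive_cost(bundle):
    """Stand-in for the expensive-to-evaluate black-box cost."""
    return sum(x ** 2 for x in bundle) + 7.0

def cheap_cost(bundle):
    """Stand-in for a fast approximation of the black box."""
    return sum(x ** 2 for x in bundle)

def screen_then_query(candidates, top_k=2):
    """Shortlist by the cheap model; query the expensive model on the top k."""
    shortlist = sorted(candidates, key=cheap_cost)[:top_k]
    return min(shortlist, key=expensive_cost)

candidates = [(4, 1), (2, 2), (1, 1), (3, 5)]
print(screen_then_query(candidates))  # prints (1, 1)
```

In the actual framework, the cheap model would be a learned surrogate refined as expensive evaluations accumulate, and the candidate set would come from the MIP formulation rather than a fixed list.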
Dr. Aly Megahed is a research staff member at IBM’s Almaden Research Center in San Jose, CA. In his current job, he develops analytical tools for complex service engagements, cloud computing, and IoT, and advances research in analytics, machine learning, and operations research. Dr. Megahed got his Ph.D. in Industrial Engineering from Georgia Tech and he has two master’s degrees and a B.S. in Production Engineering. He has done multiple analytical research/consultancy projects for 6 companies in the past and has his work published in several academic journals and conferences, in addition to filing multiple patent disclosures. He has taught university level courses at 7 different institutions. Dr. Megahed won several internal and external awards. Being an active INFORMS member, he organized and gave talks at multiple INFORMS conferences, was a panelist in industrial panels, served as an organizer for Ph.D. colloquia, judged best paper awards, and mentored several students.
- Real World VRP Needs a GIS
Heather Moe, Esri
The Vehicle Routing Problem (VRP) becomes much more complex when the solutions need to be deployed in the real world. Esri has worked with a wide range of customers who need to send a fleet of vehicles out to accomplish their unique work. These VRP users have common modeling requirements that include more geographically precise inputs which help produce a more operational result. This can be achieved by integrating a Geographic Information System (GIS) into the problem-solving methodology. A GIS allows for a complete representation of a road network, which goes beyond the basic Origin-Destination (OD) matrix. To accurately reflect what a driver can and will accomplish, the GIS data model includes a detailed network dataset, precise street based geocoding, vehicle specific requirements, entry/exit requirements, road work, and local knowledge. With these GIS inputs, the VRP model will produce a route plan that is both cost efficient and realistic.
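The OD-matrix baseline that the abstract contrasts with GIS-derived inputs can be sketched with a simple nearest-neighbor route construction; the four-stop cost matrix below is invented for illustration, and a GIS would replace these static costs with network-derived, vehicle-specific travel times:

```python
# Minimal OD-matrix routing baseline (illustrative only): build a route
# greedily by always driving to the cheapest unvisited stop.

def nearest_neighbor_route(od, depot=0):
    """Greedy route over an origin-destination cost matrix, ending at depot."""
    n = len(od)
    route, unvisited = [depot], set(range(n)) - {depot}
    while unvisited:
        here = route[-1]
        nxt = min(unvisited, key=lambda j: od[here][j])
        route.append(nxt)
        unvisited.remove(nxt)
    route.append(depot)  # return to the depot
    cost = sum(od[a][b] for a, b in zip(route, route[1:]))
    return route, cost

# Invented symmetric OD matrix: depot plus three stops.
od = [
    [0, 4, 9, 5],
    [4, 0, 3, 7],
    [9, 3, 0, 2],
    [5, 7, 2, 0],
]
print(nearest_neighbor_route(od))  # prints ([0, 1, 2, 3, 0], 14)
```

The point of the poster is that each entry of such a matrix hides real-world detail (one-way streets, door locations, vehicle restrictions, road work) that a GIS network dataset models explicitly.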
Heather Moe received a B.S. in Biology from the Rochester Institute of Technology (RIT) in 2008 and an M.S. in Operations Research from Kansas State University in 2016. After graduating from RIT, Heather joined the National Oceanic and Atmospheric Administration (NOAA) Corps, where she served as an officer piloting ships and working on various fisheries and climate research projects. After resigning her commission at the end of 2015, Heather joined Esri, first as an intern and then as a full-time employee on the Network Analyst team, working primarily on the Vehicle Routing Problem.
- An Analytical Approach For Understanding Promotion Effects On Demand And Improving Profits
Chaitnay Singh, Purdue University
The objective of this study is to design and develop a better revenue management system that leverages an understanding of price elasticity and promotional effects to predict demand for grocery items. This study is important because the use of sales promotions in grocery retailing has intensified over the last decade as competition between retailers has increased. Category managers constantly face the challenge of maximizing sales and profits for each category. Price elasticities of demand play a major role in the selection of products for promotions and are a major lever retailers use to push not only the products on sale but other products as well. We model price sensitivity and develop highly accurate predictive demand models based on product, discount, and other promotional attributes, using machine learning approaches, and compare the performance of those models against time-series forecasts.
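A back-of-envelope version of the price-elasticity quantity such demand models estimate is the log-log (arc) elasticity between two observed price-demand points; the prices and unit counts below are invented for illustration:

```python
import math

# Illustrative arc elasticity from two hypothetical observations of the
# same grocery item, before and during a promotion.

def log_log_elasticity(p1, q1, p2, q2):
    """Slope of ln(q) vs ln(p): %-change in demand per %-change in price."""
    return (math.log(q2) - math.log(q1)) / (math.log(p2) - math.log(p1))

# Promotion cuts price from $4.00 to $3.00; weekly units rise 200 -> 290.
e = log_log_elasticity(4.00, 200, 3.00, 290)
print(round(e, 2))  # prints -1.29: demand is elastic for this item
```

An elasticity below -1 flags an item where a promotion lifts unit sales more than proportionally; the study's models generalize this to many products and promotional attributes at once.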
Chaitnay is currently pursuing an MS in Business Analytics at Purdue University. Prior to this, he worked at Mu Sigma as a Decision Scientist, where he was part of a team that helped a US-based pharmaceutical giant make better marketing decisions by making sense of data. During his professional career, Chaitnay worked on many interesting projects involving A/B testing, clustering, and more, which deepened his love for analytics. Chaitnay enjoys the technical aspects of analytics and would like to use his time at Purdue to become skilled in advanced analytical tools and techniques. He would like to apply these techniques to help businesses make better data-driven decisions.
- Tkmars: An Efficient Approach For A New Class Of Black Box Functions
Hadis Anahideh, University of Texas at Arlington
Simulators can be used to study, and potentially optimize, complex systems. Surrogate optimization approaches attempt to approximate and optimize computationally expensive simulation models at a reduced cost. Historically, surrogate optimization methods assume that the complex system involves no uncertainty in performance and that the set of important decision variables is known a priori. In the real world, both of these assumptions are violated. This research introduces a new surrogate optimization paradigm that addresses uncertainty in system performance and performs feature selection for important decision variables.
Hadis Anahideh has been an Industrial Engineering Ph.D. student at the University of Texas at Arlington (UTA) since 2013. She received her Master of Science in Industrial and Systems Engineering from the University of Tehran, Iran, and completed her Bachelor of Science in Applied Mathematics at Shahid Beheshti University, Iran. Prior to beginning her Ph.D. program, she worked for two years as a systems analyst in the IT group at Maskan Bank. She completed two internships with the American Airlines Advanced Analytics and Operations Research group, in the summers of 2015 and 2016. She is a member of COSMOS, working under the supervision of Dr. Jay Rosenberger and Dr. Victoria Chen. Her research interests include mathematical modeling, optimization, and statistical analysis.
- Marine Corps Logistics Command Warehouse Optimization Model
Amber Coleman, Marine Corps Logistics Command (MARCORLOGCOM)
Marine Corps Logistics Command (MARCORLOGCOM) is responsible for long-term storage and upkeep of billions of dollars of military equipment (ME), which must be maintained at high readiness levels. Indoor storage provides the best protection against environmental elements; however, MARCORLOGCOM has access to only 50 percent of the warehouse space required to store all ME. The Warehouse Optimization Model (WOM) enables leadership to optimally allocate limited warehouse space by minimizing the total replacement value of equipment stored outside, subject to available warehouse space and the size of warehouse doors. We validate this plan by developing visual load plans in the same manner that we load amphibious ships. This optimization tool enables sensitivity analysis and load plan verification before a single piece of ME is ever moved. Ultimately, MARCORLOGCOM is able to prioritize equipment, ensuring the most expensive equipment is stored indoors while maintaining the highest levels of readiness.
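The core allocation decision resembles a 0/1 knapsack: maximizing the replacement value stored indoors within the space budget is equivalent to minimizing the value left outside. A toy sketch with invented item sizes and values (the actual WOM also handles door dimensions and load-plan validation):

```python
# Illustrative knapsack sketch of the indoor-storage decision; the
# equipment names, space requirements, and replacement values are invented.

def store_indoors(items, capacity):
    """items: list of (name, space, replacement_value).
    Returns (indoor value, names) maximizing indoor value within capacity."""
    dp = [(0.0, [])] * (capacity + 1)  # dp[c] = best (value, names) in budget c
    for name, space, value in items:
        new = dp[:]
        for c in range(space, capacity + 1):
            v, chosen = dp[c - space]       # read pre-item table: 0/1, no reuse
            if v + value > new[c][0]:
                new[c] = (v + value, chosen + [name])
        dp = new
    return dp[capacity]

items = [("tank", 6, 9.0), ("truck", 3, 4.0), ("radar", 4, 5.5), ("gen", 2, 2.0)]
value_in, names = store_indoors(items, capacity=9)
print(names, value_in)  # prints ['tank', 'truck'] 13.0
```

Everything not selected stays outside, so the minimized outside value here is the total value minus 13.0; sensitivity analysis amounts to re-solving with different capacities or item sets.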
Major Amber Coleman is an Operations Research Analyst at Marine Corps Logistics Command (MARCORLOGCOM) in Albany, Georgia. She graduated from the United States Naval Academy in 2004 with a Bachelor of Science in Quantitative Economics. She received a Master of Science in Supply Chain Management from Syracuse University in 2012 and a Master of Science in Operations Research from the Naval Postgraduate School (NPS) in 2016. Major Coleman’s military experience includes two Operation IRAQI FREEDOM (OIF) deployments, command at the platoon and company level as well as multiple staff level positions. Her current analysis projects include: aggregating, cleaning, and analyzing Marine Corps depot-level maintenance cost data; inventory applications using optical character recognition; and warehouse space optimization.
- Employee-engaged Data Analytics Results In Innovative Scrap And Waste Reduction
Tim Dorsey, The Dorsey Group
Poster proposal for INFORMS 2018, Baltimore. Elevator pitch: A company achieved innovative scrap and waste reduction through employee-engaged data analytics. Experiencing excessive waste, the company sought to find the root cause and then take action. These goals were achieved by implementing an infrastructure, tools, and behaviors that empowered employees and enabled them to properly utilize data for prioritizing and decision-making. Modern analytics, including descriptive, predictive, and prescriptive analytics, were all utilized. The end result was a reduction of waste/scrap that positively impacted profits, employee engagement and morale, and the environment (reduction of physical waste).
How could this methodology be applicable in any organization? Waste exists everywhere. Waste is not only physical waste or scrap, which is costly to both the organization and the environment; it is also any non-value-added activity. Non-value-added activity is costly to the organization and to employee morale. Analytics on their own cannot eliminate waste; the employees must be involved. Employee engagement is essential to getting real results: significant, sustainable results. Employee engagement and empowering employees to utilize data in the decision-making process can be implemented in any organization, in any industry, at any time.
How is the project innovative? By engaging the employees, the company uncovered and put into action plans to reduce the waste/scrap. They turned data into action by following: Data => Information => Knowledge => Action. Their results were:
- Cost saving to the company via reduction of raw material usage and lower manufacturing costs
- Cost saving to company via significant reduction of finished goods scrapped
- Customer satisfaction via product having a longer shelf-life
- Positive environmental impact via less disposal of waste (raw material and expired product)
- Positive impact on employee morale due to engagement, innovation from within and fulfilment of social responsibility
- Developed a culture of performance improvement using tools and engagement that provide sustainable, ongoing means to explore improvement opportunities throughout the organization
What types of analytics did we use:
- Descriptive Analytics – tools used to detect current weaknesses
- Predictive Analytics – tools used to show trends/ predict consequences
- Prescriptive Analytics – tools used to determine potential action plans and interventions
Tim Dorsey, founder and principal of The Dorsey Group, provides performance improvement initiatives to domestic and international organizations. The Dorsey Group's distinctive Operational Excellence Strategy strives for world-class operations by documenting processes, identifying problem areas, implementing solutions, and creating a COPQ recovery plan with a focus on sustainability through motivated, engaged employees.
A frequent speaker and thought leader, Tim addresses the ever-changing work environment and how it can be optimized. Tim’s approach to successful performance improvement initiatives is to focus on people first to achieve long term sustainability.
Tim has a Master of Science in Economics and an undergraduate degree in Business Administration. Co-author: Carla Dorsey is the co-owner/founder of The Dorsey Group, a global consulting group that specializes in performance improvement initiatives and operational efficiencies. In addition, she is a doctoral candidate at the University of South Florida Muma College of Business. She has published her first academic scholarly work, a case study, Data Analytics and Operation Efficiency, which was accepted into the NACRA 2017 competition, where she also serves as a conference reviewer. She is a non-practicing CPA. Carla earned her BS in Accounting from the Ohio State University and has a Master's in Accounting.
- Simplifying Paradata Analytics For Field Managers
Craig Gagel, U.S. Census Bureau
The U.S. Census Bureau employs thousands of field representatives (FRs) located throughout the country to make personal visits or place telephone calls to households and conduct survey interviews. To manage this effort, the Bureau operates six regional offices, each responsible for field operations over a specific geographic area that encompasses several states. The field managers in these offices are responsible for simultaneously tracking and controlling the cost and performance of all surveys within their respective areas. Due to the size and scope of the survey operations, this results in a challenging and fast-paced environment where leadership needs to be well informed on the status of current operations and able to effectively communicate decisions to field staff. Various types of paradata, or data about the survey-data collection process, are captured to provide insights into the cost and performance of these survey field operations. The U.S. Census Bureau has developed new methods to more efficiently capture and effortlessly disseminate survey paradata and other operational information. In addition, the Bureau recognized the need to involve analytic-product users to better develop and understand such information to facilitate more informed and data-driven decisions.
Craig Gagel, Lead, Paradata Tools & Budget Analytics, Office of Survey and Census Analytics, U.S. Census Bureau
- Scientific Advances To Continuous Insider Threat Evaluation Program
Dan Hudson, U.S. Nuclear Regulatory Commission and Innovative Decisions, Inc.
The Intelligence Advanced Research Projects Activity (IARPA) Scientific advances to Continuous Insider Threat Evaluation (SCITE) program seeks to develop and test methods to detect insider threats. Insider threats are individuals who (1) have privileged access within an organization, and (2) engage (or intend to engage) in malicious behaviors such as espionage or sabotage. The SCITE program aims to advance several research areas, including: (1) engineering inference enterprise systems that detect low-probability events using low-accuracy sensors, (2) developing innovative statistical methods to estimate performance of these systems, and (3) advancing evidence-based forecasting methods and probabilistic reasoning. Innovative Decisions, Inc., (IDI) is an analytics and management consulting firm serving business and government clients through innovative applications of decision and risk analysis, operations research, and systems engineering. IDI applied diverse multi-modeling techniques to achieve superior performance among SCITE program contractors in developing Inference Enterprise Models (IEMs) that forecast insider threat detection system accuracy.
Dan Hudson is a Reliability Engineer and Risk Analyst for the U.S. Nuclear Regulatory Commission and a Senior Analyst for Innovative Decisions, Inc. (IDI). Dan previously served for several years in the U.S. Armed Forces, first as a Marine, and then as a Navy Submarine Warfare Officer. As a submariner, Dan was assigned to a fast-attack nuclear submarine and a SEAL delivery vehicle team, and earned certifications as a Navy Diver and Nuclear Engineer Officer. Dan is a Certified Analytics Professional (CAP) with INFORMS and is Certified in Public Health (CPH) with the National Board of Public Health Examiners. Dan earned a B.S. degree with distinction in Aerospace Engineering from the U.S. Naval Academy, an M.S. degree in Risk Analysis from the University of Maryland A. James Clark School of Engineering, and a Ph.D. degree in Risk Sciences and Public Policy from the Johns Hopkins Bloomberg School of Public Health.
- Assessment Of Factors Influencing Intent-to-use Big Data Analytics In Organization In A Post-adoption Context: A Survey Study
Wayne Madhlangobe, CAP, Enbridge Inc.
Big Data Analytics sits at the intersection of big data, machine learning, and modeling: the process of examining large data sets to uncover hidden patterns, unknown correlations, trends, and other useful information for decision-making. Big Data Analytics is quickly becoming a critically important driver of business success, and many organizations are increasing their information technology budgets for Big Data Analytics capabilities. The objective of this study is to assess the factors influencing an organization's post-adoption intent-to-use of Big Data Analytics. Big Data Analytics is forcing organizations to re-engineer their business processes due to the automation of cognitive and manual tasks. This level of automation can introduce anxiety in business users, setting up a clash between business users and technology. This environment presents an attractive opportunity to revisit existing IS concepts, such as how trust-in-technology can influence intent-to-use Big Data Analytics.
Wayne Madhlangobe is a Certified Analytics Professional (CAP) who currently leads a team of Data Scientists and Operations Research Specialists at Enbridge Inc. He is based in the oil-rich province of Alberta, Canada, with its long, cold winters but very beautiful summers. Wayne is currently finishing his doctoral studies at Nova Southeastern University with a focus on Information Systems.
- Identifying Major Defect Causes From Manufacturing Data Using Decision Rules: A Case Study
Josue Obregon, Department of Industrial and Management Systems Engineering, Kyung Hee University
Nowadays, manufacturing execution systems (MES) collect various types of data from manufacturing machines and the sensors installed on them during the manufacturing process. With this abundance of data, organizations are able to introduce data-driven approaches to enhance their productivity and analyze their current performance. A common use case of data-driven methods is classifying production lots as defective or non-defective, given product condition and quality measures. For this purpose, several machine learning and data mining methods exist and are commonly used. However, such models can become complex and difficult to interpret. We tackle this problem by presenting a framework in which the most influential variables affecting the classification models are presented as simple decision rules that are easy to understand and interpret. The framework and experimental results are presented through a case study analyzing quality data from a die-casting company in Korea.
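The flavor of the rule-extraction output can be sketched with a single-threshold ("decision stump") rule learner; the sensor readings and defect labels below are fabricated for illustration and are not the case-study data:

```python
# Hypothetical sketch: learn one interpretable rule of the form
# "IF sensor value > t THEN defect" from labeled production lots.

def best_threshold_rule(values, labels):
    """Pick the threshold on one variable that best predicts the label.
    Returns (accuracy, threshold)."""
    best = (0.0, None)
    for t in sorted(set(values)):
        acc = sum((v > t) == y for v, y in zip(values, labels)) / len(values)
        best = max(best, (acc, t))
    return best

# Invented die temperatures (one per lot) and defect outcomes.
temperature = [610, 625, 640, 655, 660, 675, 690, 700]
defect = [False, False, False, False, True, True, True, True]
acc, t = best_threshold_rule(temperature, defect)
print(f"IF temperature > {t} THEN defect  (accuracy {acc:.0%})")
```

The framework in the poster produces rules of this shape but derives them from full classification models over many variables, surfacing only the most influential ones.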
Josue Obregon is currently a PhD candidate in the Department of Industrial and Management Systems Engineering at Kyung Hee University in South Korea. He works in the Artificial Industrial Intelligence laboratory, where he is involved in smart manufacturing and smart energy projects. His research interests include machine learning and deep learning applied to industry, process mining, social networks, and decision support systems.
- Supporting Researchers With Metadata At Internal Revenue Service (IRS) Research, Applied Analytics, And Statistics (RAAS)
Robin Rappaport, Internal Revenue Service
Metadata is the key to turning data into information, and research is only as good as the researcher's understanding of the data. The Compliance Data Warehouse (CDW), managed and administered by Research, Applied Analytics, and Statistics (RAAS), is an analytical data environment designed to support compliance research activities, with over 2 petabytes of data from more than 50 different data sources, making it the largest Internal Revenue Service (IRS) database. Metadata is published to the CDW website, where it is used to search for, collect statistics on, and understand the meaning of data in support of research and analysis. Recognized as a best practice by the IRS and the data quality community (including academics, industry, government, and non-profits), CDW Metadata is an efficient model for metadata development, maintenance, and delivery, and an example of how to customize a metadata repository without reliance on expensive, limiting vendor-specific repository tools.
Robin Rappaport is the Data Quality Team Leader responsible for delivery of the Data Quality Initiative for Research Databases at the Internal Revenue Service (IRS). Her work and that of her team contributed to the IRS being awarded a Data Warehousing Institute (TDWI) 2011 Best Practices Award for the Compliance Data Warehouse (CDW), a Computerworld Honor, and a Government Computer News (GCN) Gala Award. She has over 25 years of experience as a Data Quality practitioner. Her undergraduate degree was in Economics with Computer Science; her graduate work was in Operations Research with a concentration in Mathematical Modeling in Information Systems. She has worked in both the private sector (6 years) and the public sector (since 1990). Her positions have included Computer Programmer, Systems Analyst, and Operations Research Analyst. She was a webinar facilitator for the International Association for Information & Data Quality (IAIDQ) (2011-2017). She has served on the Certified Analytics Professional (CAP) Exam Committee since 2013.
- Improving Customer Satisfaction And Railway Dwell Times By Better Crowd Management At Netherlands Railways
Matthias Rensink, Netherlands Railways
With a significant increase in passenger volume due to the economic upturn, Netherlands Railways (NS) is confronted with the negative effects of crowding in trains and on platforms. Crowding increases the variability in dwell times and, on one of the busiest railway networks in the world, therefore reduces the reliability of train services in the Netherlands.
To decrease this variability, NS conducted experiments in order to spread the passengers more evenly over the platform and the train. In these experiments NS used different channels to distribute crowding information to passengers, ranging from mobile phone apps to LED screens above the platforms. By giving the passengers information about the train composition and crowding it is possible to nudge them to the preferable position on the platform or in the train.
The experiments showed that sharing this information both decreases the variability in dwell times and raises overall customer satisfaction.
After finishing a Bachelor’s in Industrial Engineering & Management and a Master’s in Business Administration, Matthias Rensink started working for the Netherlands Railways (NS). As a Business Consultant at the Performance Management and Innovation (PI) department, he is responsible for the implementation of OR in the planning and dispatching departments. Research led by the PI department received the Franz Edelman Award in 2008.
- Predictive Analytics On Performing Arts Patrons
Aurelie Thiele, Engineering Management, Information and Systems department, Southern Methodist University
This poster describes an application of predictive analytics to customer segmentation in the context of performing arts, using sanitized, realistic multi-year data based on data provided by a regional theater venue. The work documents the relevance and impact of various analytical techniques for predictions, not only in the traditional setting of ticket-purchase forecasting but also in the nonprofit-specific setting of donation forecasting. Understanding the patterns of donations over time (donor profiles) and the position of each donor in the corresponding donor lifecycle is a critical task for nonprofit performing arts venues, which typically generate less than half their revenues from ticket sales. The output of the work is a segmentation of the customer base that identifies past, current, and potential key donors and ticket purchasers. This is particularly important for relationship management with patrons who may be willing to become more financially involved with a performing arts venue. To the best of our knowledge, this work is the first to apply analytics to nonprofit management in the context of nonprofit theater and to investigate the key characteristics of successful analytical models in this setting, with a focus on clustering and classification and regression trees. Hence, this work presents an innovative application of analytics to an important nonprofit setting.
Aurelie Thiele is an Associate Professor in the Engineering Management, Information and Systems department at Southern Methodist University in Dallas, TX. Prior to joining SMU in August 2016, she was a tenure-track and tenured faculty member at Lehigh University in Bethlehem, PA, where she also served as the co-director of the Master of Science program in Analytical Finance, and a Visiting Associate Professor at the Massachusetts Institute of Technology in Cambridge, MA. She has also served as the committee chairwoman for the INFORMS Undergraduate O.R. Student Paper Competition and the INFORMS O.R. & Analytics Student Team Competition, and a judge for the INFORMS George E. Nicholson Paper Competition and the INFORMS Wagner Prize Competition, among other responsibilities. She holds a M.S. and Ph.D. from the Massachusetts Institute of Technology and a “diplome d’ingenieur” from the Ecole des Mines de Paris, France.
- A Design And Analysis Of Computer Experiments For Green Building
Shirish Rao, Industrial, Manufacturing & Systems Engineering Department, University of Texas
Building structures have a significant impact on the environment and energy consumption. Green building seeks to reduce the energy consumption and environmental impact of buildings throughout their life cycle, from construction to operations to demolition. While computer models exist to assist with the green building design process, their usage to date has been limited to trial-and-error approaches. The goal of this research is to employ these computer models within an organized design and analysis of computer experiments approach to study how different building options, such as windows and insulation, affect sustainability metrics. Two new experimental design methods are introduced to handle the mix of continuous, discrete-numerical, and categorical factor variables. Two statistical modeling methods, treed regression and multivariate adaptive regression splines, are used to identify which building options are important for the different sustainability metrics.
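The mix of continuous and categorical factors described above can be illustrated with a small design-generation sketch. This is not the authors' new methods; the Latin-hypercube-style stratified sampling, the factor names, and the window levels are invented for illustration:

```python
import random

def lhs_mixed(n, continuous_ranges, categorical_levels, seed=0):
    """Latin-hypercube-style design for mixed factors.

    continuous_ranges: list of (low, high) tuples, one per continuous factor
    categorical_levels: list of level lists, one per categorical factor
    Returns n design points as dicts of factor values.
    """
    rng = random.Random(seed)
    points = [{} for _ in range(n)]
    # Continuous factors: one sample per stratum, shuffled across runs
    for j, (lo, hi) in enumerate(continuous_ranges):
        strata = list(range(n))
        rng.shuffle(strata)
        for i, s in enumerate(strata):
            u = (s + rng.random()) / n  # uniform draw within stratum s
            points[i][f"x{j}"] = lo + u * (hi - lo)
    # Categorical factors: balanced assignment of levels across runs
    for j, levels in enumerate(categorical_levels):
        seq = [levels[i % len(levels)] for i in range(n)]
        rng.shuffle(seq)
        for i in range(n):
            points[i][f"c{j}"] = seq[i]
    return points

# 8-run design: one continuous factor (insulation R-value fraction)
# and one categorical factor (window type)
design = lhs_mixed(8, [(0.1, 0.9)], [["double-pane", "triple-pane"]])
```

Each computer-model run would then evaluate one design point against the sustainability metrics.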
Shirish Rao is a PhD student working in the Center on Stochastic Modeling, Optimization, & Statistics (COSMOS) lab in the Industrial, Manufacturing & Systems Engineering department at the University of Texas at Arlington. He received his B.E. in Electrical Engineering from Mumbai University (India) in 2012. His research interests are in engineering statistics, and he is currently working with Dr. Victoria Chen on an interdisciplinary sustainability assessment project.
- Bagging Classification Algorithm For Predicting Contactability In Outbound Telemarketing
Jorge A. Samayoa, Galileo University
Nowadays, the Internet allows us to collect a great variety of information from customers. In particular, we can design mechanisms to find customers who might be interested in one or more of our products: lead customers. The natural next step for a company is to send a lead customer to its telemarketing division to validate the customer's interest and make a sale. To avoid unnecessary costs and expenses, telemarketing companies need to know, with some probability, whether or not a customer will answer a call, i.e., to predict the customer's contactability. This poster presents a bagging classification algorithm, with computational complexity O(n^3), that predicts whether or not a person will answer a call made by a telemarketing agent. We applied the proposed algorithm at a telesales company and increased contactability from 8% to 17%. The results were presented using a Shiny app.
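The abstract does not disclose the algorithm's internals, but the general bagging idea it builds on can be sketched in a few lines. The decision-stump base learner and the toy single-feature data below are assumptions for illustration, not the authors' implementation:

```python
import random

def stump_fit(data):
    """Fit a one-feature threshold classifier (decision stump) on
    (x, y) pairs with a numeric feature and labels in {0, 1}."""
    best = None
    for t in sorted({x for x, _ in data}):
        for pos_side in (0, 1):  # which side of the threshold predicts 1
            pred = lambda x, t=t, s=pos_side: int((x >= t) == bool(s))
            err = sum(pred(x) != y for x, y in data)
            if best is None or err < best[0]:
                best = (err, t, pos_side)
    _, t, s = best
    return lambda x: int((x >= t) == bool(s))

def bagging_fit(data, n_estimators=25, seed=0):
    """Bagging: fit each stump on a bootstrap resample of the data,
    then predict by majority vote over all stumps."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_estimators):
        boot = [rng.choice(data) for _ in data]
        models.append(stump_fit(boot))
    def predict(x):
        votes = sum(m(x) for m in models)
        return int(votes * 2 >= len(models))
    return predict

# Toy contactability data: feature = past answer rate, label = answered call
train = [(0.1, 0), (0.2, 0), (0.3, 0), (0.6, 1), (0.8, 1), (0.9, 1)]
model = bagging_fit(train)
```

Resampling plus voting reduces the variance of the unstable base learner, which is the core of bagging.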
Jorge A. Samayoa holds a B.S. in Electronics and Computer Science (UFM), an M.S. in Operations Research (Galileo), an M.S. in Applied Mathematics (Texas A&M), and a Ph.D. in Industrial Engineering (Purdue). In 2003 he started teaching undergraduate courses in mathematics at the Engineering School of Galileo University and has taught courses at several universities in Guatemala and at Texas A&M University, College Station, TX. In 2004, he founded the Teaching Assistants Department of Galileo University, where he was responsible for all the teaching assistants of several schools of the university. In 2006, he received the "Excellence in Teaching" Award of Galileo University. Currently, he chairs the only operations research program in Guatemala, one of the few online graduate programs in operations research in the world. His current research interests include complex systems, general systems theory, data science, and the efficiency and creation of teaching tools for engineering education.
- Methodology For Natural Gas Demand Forecasting In New England
Laszlo Steinhoff, Towson University
Accurate forecasting of United States natural gas consumption is of crucial importance to energy providers for decision making regarding natural gas purchasing and energy pricing to increase profits and efficiency. Energy forecasting covers a wide range of forecasting problems, such as generation forecasting, load forecasting, price forecasting, and demand response forecasting. According to previous research, although a significant amount of the literature has been devoted to energy forecasting, most studies remain at the theoretical level and have little practical value. This project seeks effective methods to improve the accuracy of natural gas demand forecasting in the US. To match the EIA monthly data, we establish a parallel method for forecasting natural gas demand: the method models the scrape data and ratio data individually in two modules and then combines the outputs of the two modules to generate the forecast. An assortment of models for forecasting the natural gas scrape data is investigated. In particular, we create a multi-scale method to treat the long-term trend and the medium- and short-term components, respectively. Two models of the multi-scale method are developed, based on a semi-parametric technique and a time series method. We test the proposed methods on one year of residential and commercial natural gas consumption data from the New England area. The proposed multi-scale method consistently generates models that are among the top performers, and the proposed parallel method performs very well at generating forecasts that closely match the published EIA data.
Towson University’s Applied Mathematics Laboratory (AML) is a long-running endeavor by the Department of Mathematics at TU. It undertakes applied research projects of a mathematical nature on problems of interest to a sponsoring organization, usually a private company or government agency. For each project, the AML forms a team of talented undergraduate and/or graduate students led by faculty advisers from the Mathematics Department. This team studies the problem and presents a written report of its findings to the Mathematics Department and to the sponsor. The team’s efforts usually involve a combination of modeling techniques, mathematical knowledge at the advanced undergraduate level, and modern computing. Some recent sponsors include the Science Applications International Corporation, the Baltimore City Fire Department, the National Institute of Justice, the Chemical Security Analysis Center, and RTR Technologies, LLC. The present AML team of seven students, under the supervision of Drs. Y. Cui and X. Wang, is working on a project sponsored by Constellation Energy.
- A Disease Dashboard For Tracking And Predicting The Quality Of Glycemic Control In Diabetic Patients
This project details a visualization tool developed as part of a collaboration between the Katz Graduate School of Business and the VA Pittsburgh Healthcare System's Telehealth Services Department to provide support and tools for the management of a chronic disease, specifically diabetes.
According to the Centers for Disease Control and Prevention’s (CDC) latest report, more than 30.3 million Americans (9.4 percent of the U.S. population) are now living with diabetes; diabetes was the seventh leading cause of death in the U.S. in 2015. Patients with diabetes, especially those with poor glycemic control, are at increased risk of serious health complications, including premature death, vision loss, cardiovascular disease, stroke, kidney failure, bone and joint problems, tooth and gum infections, amputation of toes, feet, or legs, bacterial infections, fungal infections, and non-healing wounds.
The Diabetes Dashboard integrates multiple elements of patients’ medical history, in order to facilitate better disease management support. The initial goal of the Dashboard is to increase provider efficiency, by speeding up the patient screening process for specialty consults. Longer term goals include using the Dashboard for knowledge elicitation, to support development of partly automated patient screening, as well as development of predictive analytical models for identifying if and when patients may experience poor glycemic control.
- Predictive Model For Banknotes Demand Forecasting
Jose Velasquez, Department of Industrial Engineering, Universidad de los Andes
As part of Colombia's financial system, banks in the country are challenged to determine the adequate amount of banknotes per denomination demanded at each branch. In this bank's case, the estimates currently calculated by its central treasury department are not precise enough to support decisions on this issue, so other quantitative alternatives are necessary. This work describes the implementation of a tool for exploratory analysis and automatic estimation of banknote demand employing multiple time series models: exponential smoothing, ARIMA, autoregressive neural networks, linear regression for time series, the seasonal naïve method, BATS, TBATS, structural time series, the theta method, the mean forecast, and the seasonal decomposition method. Through these models it is possible to make more precise forecasts of banknote demand by denomination at each branch, generating a significant information contribution to the decision-making process at the bank.
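As one illustration of the simpler models in the list above, simple exponential smoothing can be sketched in a few lines; the demand series here is invented, not bank data:

```python
def ses_forecast(series, alpha=0.3):
    """Simple exponential smoothing:
    level_t = alpha * y_t + (1 - alpha) * level_{t-1}.
    Returns the one-step-ahead forecast (the final smoothed level)."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level

# Illustrative monthly demand for one denomination at one branch
demand = [120, 130, 125, 140, 135, 150]
forecast = ses_forecast(demand)
```

Each of the other listed models would replace this forecasting function while the surrounding tool (exploratory analysis, per-branch iteration) stays the same.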
Jose Velasquez is an industrial engineer and economist, a graduate of the Master's program in Industrial Engineering at the Universidad de los Andes, and certified as a Six Sigma Black Belt at Arizona State University. Currently, he is an Instructor and Coordinator of the Master of Science Program in Analytics of the Department of Industrial Engineering at the Universidad de los Andes. He is also a member of the research groups Production and Logistics (PYLO) and the Center for Optimization and Applied Probability (COPA). His research interests include forecasting, demand planning, inventory management, optimization techniques, vehicle routing, the lean six sigma methodology, and supply chain management.
- Machine Learning Approach To Delivery Time Estimation For Industrial Equipment
Douglas Halim, Purdue University
Our research focuses on obtaining better predictions of lead time for made-to-order equipment for a large multinational corporation. In collaboration with this corporate partner, our team was tasked with creating a deployable solution that could provide reliable delivery predictions. The motivation for this work is that when customers place orders for equipment, they are provided an expectation that their product will be delivered in a timely manner. Without a delivery estimation system in place, the company cannot provide customers an expected time window, which is an inconvenience for customers who have their own operational plans for using the equipment. To predict this lead time, our team was provided access to tens of thousands of entries of equipment order data. We experimented with many models, considering the unique aspects of the features, and were able to obtain predictions of delivery time for each product line. Our predictive approach addresses this business problem by providing highly accurate, cross-validated predictions of delivery time along with a corresponding prediction interval. We believe our approach could easily be extended to similar supply-chain problems.
Doug Halim is a current Master of Business Analytics and Information Management candidate at Purdue University. Prior to his graduate studies, he studied economics at Southeast Missouri State University and international relations at the Latin American University of Science and Technology. While at Purdue, Doug worked alongside several corporate partners to build statistical models and help solve various business problems within the companies. Upon graduation, Doug wishes to continue applying his analytics experience and assist companies in making strategic data-driven decisions.
- Use of Recurrent Neural Networks And LSTM Neural Networks for Time Series Forecasting
This study focuses on predicting demand based on data collected across many periods. To help our client forecast demand effectively, we developed a model using Long Short-Term Memory (LSTM) networks, a type of recurrent neural network, to estimate demand based on historical patterns. While many models are available for time series problems, the LSTM model is relatively new and highly sophisticated compared to its counterparts. By comparing this model, which excels at sequential learning, to other existing models and techniques, we move closer to solving at least one of many complications apparent across industries. The study is all the more important for supply chain professionals, especially those in purchasing, as they can rely on a highly accurate model instead of basing their forecasts purely on intuition and recent customer behavior. We foresee substantial cost savings from this method through the avoidance of stockouts and excessive inventory storage.
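The gating mechanism that distinguishes an LSTM from a plain recurrent network can be sketched with a single scalar cell. The weights below are illustrative placeholders, not a trained model:

```python
import math

def lstm_step(x, h_prev, c_prev, W):
    """One forward step of a single-unit LSTM cell (scalar input/state).
    W maps each gate name to its (w_x, w_h, b) weights."""
    sig = lambda z: 1.0 / (1.0 + math.exp(-z))
    gate = lambda k, f: f(W[k][0] * x + W[k][1] * h_prev + W[k][2])
    i = gate("i", sig)        # input gate: how much new info to write
    f = gate("f", sig)        # forget gate: how much old state to keep
    o = gate("o", sig)        # output gate: how much state to expose
    g = gate("g", math.tanh)  # candidate cell value
    c = f * c_prev + i * g    # new cell state (the long-term memory)
    h = o * math.tanh(c)      # new hidden state (the cell's output)
    return h, c

# Run the cell over a short normalized demand sequence
W = {k: (0.5, 0.5, 0.0) for k in ("i", "f", "o", "g")}
h, c = 0.0, 0.0
for x in [0.2, 0.5, 0.1, 0.7]:
    h, c = lstm_step(x, h, c, W)
```

In a real forecasting model, frameworks stack many such cells and learn the weights by backpropagation through time; this sketch only shows the per-step recurrence.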
Ruthwik comes from Mumbai, India, and graduated from Manipal University with a Bachelor of Technology degree in Electronics and Communication in 2015. After graduation, he worked at ZS Associates as a Business Analyst for two years on numerous analytical projects for the global pharma industry. Ruthwik was exposed to analytics through his internship at Lister Technologies and his employment at ZS Associates. Along the way, he worked in analytics verticals such as forecasting, key-driver analytics, targeting, alignment, sizing, and decision sciences, developing expertise in tools such as Excel, VBA, SAS, R, SQL with Oracle, and Tableau. Going forward, Ruthwik aims to build a specialization in predictive analytics and data mining, leveraging it to drive business strategy and performance for leading consulting firms and enable effective decision-making.
- Well Construction Process Modeling And Simulation
Jason Baker, Transocean Offshore International
During offshore oil and gas well construction, a significant amount of time is spent during drilling operations. The operation involves a sequence of manual and semi-automated processes performed in unison to complete the operation as safely and efficiently as possible. The dozens of machines utilized were designed independently and operate independently but are tightly coupled during many complex activities. If any machine function is delayed, the critical path time may be impacted thus reducing the efficiency of the entire operation. Processing time across the drilling industry varies significantly given different equipment manufacturers and configurations, experience and fatigue of operations personnel, rig type and drill floor layout, weather conditions, equipment maintenance, and wellbore conditions among others. Over the life of a well, these operational variations and inefficiencies are typically hidden in the overall operation but can add up to days and weeks of invisible loss time thus making the fleet less productive than optimal. This presentation highlights the challenge of decomposing the complex drilling operations into its critical path activities and dependencies, and decomposing machine functions into discrete states and modes to enable precise measurement of overall system performance. Leveraging AnyLogic simulation software, state machine algorithms and discrete event process models were developed to analyze the drilling operation in real time. The simulation was first performed using Triangular distributions for each individual function based on historical data to replicate the real-world environment. The application was extended to read in real-time machine data to automatically identify machine states, process activities, critical path impacts, and capture descriptive statistics. The application outputs the data to a SQL database which is then read by Tableau software to visualize performance for management and operations personnel. 
The results provide powerful insight into system performance and trending that is being used to provide real time feedback to operations personnel as well as for post event analysis and trending. The initial results indicate over 20% of time could be saved by implementing various solutions. The methods defined during this project are also being leveraged in other projects to understand system of systems behavior and performance, predicting future performance and optimizing new system designs.
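The first stage of the simulation described above, sampling sequential activity durations from triangular distributions, can be sketched as follows. The activity names and duration parameters are invented for illustration, not field data:

```python
import random
import statistics

def simulate_connection(rng):
    """One simulated drill-pipe connection: three sequential machine
    activities, each with a triangular(low, high, mode) duration in
    minutes. Sequential activities sum along the critical path."""
    activities = [
        ("set slips",       rng.triangular(0.5, 2.0, 1.0)),
        ("make up joint",   rng.triangular(1.0, 4.0, 2.0)),
        ("resume drilling", rng.triangular(0.5, 3.0, 1.0)),
    ]
    return sum(duration for _, duration in activities)

# Monte Carlo over many connections to estimate the operation's
# duration distribution, as a stand-in for the AnyLogic model
rng = random.Random(42)
times = [simulate_connection(rng) for _ in range(10_000)]
mean_t = statistics.mean(times)
```

The full application replaces these sampled durations with real-time machine states and feeds the resulting statistics to a SQL database for visualization.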
Jason Baker is a senior systems engineer at Transocean Offshore International in Houston, Texas. He is responsible for leading a systems engineering team developing a digital twin of offshore drilling rigs. He has 10 years' experience applying systems engineering methodologies to developing safety systems in the oil and gas and aerospace industries. He is the Vice President of the International Council on Systems Engineering (INCOSE) Texas Gulf Coast Chapter. He received a B.S. in Human Factors from Embry-Riddle Aeronautical University and an M.E. in Systems Engineering from Stevens Institute of Technology, and is currently a Ph.D. student in the Industrial Engineering program at the University of Houston. He has received various professional certifications, including INCOSE CSEP and TÜV Rheinland Functional Safety Engineer.
- Modeling 2018 Tax-Induced Regime Shifts across Statewide Municipal Bond Term Structures Using AI and Big Data Analytics
Gordon Dash, The University of Rhode Island
In 2018, Americans face a newly enacted tax law. The Act is expected to increase demand for securities from investors hardest hit by limits on the ability to deduct state and local taxes. It is likely that the pricing of individual bonds will become more complex, more unpredictable, and increasingly dependent on issuer-state demographics. To eke out new levels of transparency in the thinly traded municipal bond markets, the Municipal Securities Rulemaking Board (MSRB) amended Rule G-15 to require brokerage firms to disclose to retail investors the pricing mark-up or mark-down on corporate, agency, and municipal bond trades. The enhanced data flow enables new research on market liquidity issues. This talk links the 2018 tax law to the efficient modeling of a 2018 local and regional muni-market reference structure. The talk is also practice driven, with remarks on user implementation of the embedded Big Data and artificial intelligence algorithms.
Gordon H. Dash, Professor of Finance, Decision Sciences and Interdisciplinary Neuroscience; http://www.GHDash.net. Professor Dash joined the faculty of the College of Business Administration (CBA) at The University of Rhode Island, USA (URI) in 1974. Prior to his arrival at URI he completed his undergraduate degree in business administration at Coe College, Cedar Rapids, IA (1968). He earned a master's degree and a dual-program Ph.D. in Finance and Operational Research from the University of Colorado at Boulder, CO (1978). Dr. Dash has authored over 90 manuscripts in his role as a professor of finance and computational neuroscience. Professor Dash's current research agenda focuses on the neuroeconomics of SMART cities, translational science, and neurobehavioral animal science. Most recently, his neurobehavioral animal models of prosocial residential behavior were on display at the URI Brain Fair (The Ryan Institute for Neuroscience) and the Mind-Brain Day Fair (Brown University). Current interdisciplinary research extends studies on rodent neurobehavioral responses to form a prosocial multi-objective optimization model for public housing assignment. Past publications on algorithmic optimal bank structures appear in several academic journals, including the Journal of Multi-Criteria Decision Analysis, the Journal of Banking and Finance, and The Financial Review. A multi-objective extension to the Sharpe model for efficient diversification with hedging appears in the International Transactions on Operational Research. His research in the broader area of quantitative financial decision-making appears in: The New England Journal of Business & Economics; the Journal of Portfolio Management; the Journal of Applied Operational Research; Operations Research: An International Journal; the African Finance Journal; the Handbook of Financial Engineering; and The Journal of Computing and e-Systems.
Past editorial appointments include: The Northeast Journal of Business & Economics; The International Review of Financial Analysis; and The Financial Review. Professor Dash has published three books: Recent Advances in Computational Finance; Applied Risk Management: the Fundamentals of Derivatives, Neuroeconomics and Automated Trading; and Operations Research Software, Volumes I and II. Globally, Professor Dash has hosted classes or made professional appearances in Germany, Greece, India, Italy, Japan, Lithuania, Malaysia, South Africa, Singapore, South Korea, Tunisia, Turkey, and Thailand.
- An Investigation Of Forecasting And Evaluating Intermittent Demand Time-series
Jingda Zhou, Purdue University
This research leverages machine learning methodologies to predict intermittent demand for an industrial partner. Intermittent demand refers to random, low-volume demand that appears irregularly, with a large proportion of zero values between demand periods. The unpredictable nature of intermittent demand poses challenges to companies managing sophisticated inventory systems, incurring excessive inventory costs or stockouts. The world's largest manufacturing companies are burdened with inventory cost; for those with $1.5 trillion in revenue, an average of 26% was spent on service operations. Therefore, a small improvement in the prediction accuracy of intermittent demand will translate into significant savings. The poster examined various machine learning techniques, including quantile random forests (QRF), neural networks (NN), and gradient boosting machines (GBM), and compared their predictive power against the classical Croston's method. The results showed that the QRF model significantly improved prediction accuracy and could save the partner company substantial inventory cost.
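The classical Croston's method used as the baseline above can be sketched in a few lines; the demand history is invented for illustration:

```python
def croston(demand, alpha=0.1):
    """Classical Croston's method for intermittent demand: smooth the
    nonzero demand sizes and the intervals between them separately,
    then forecast demand per period as size / interval."""
    size = interval = None
    periods_since = 0
    for d in demand:
        periods_since += 1
        if d > 0:
            if size is None:  # initialize on the first nonzero demand
                size, interval = float(d), float(periods_since)
            else:
                size += alpha * (d - size)
                interval += alpha * (periods_since - interval)
            periods_since = 0
    if size is None:  # no demand ever observed
        return 0.0
    return size / interval

# Intermittent history: mostly zeros with occasional demand spikes
history = [0, 0, 3, 0, 0, 0, 2, 0, 4, 0]
forecast = croston(history)
```

Smoothing sizes and intervals separately is what keeps the forecast stable between the zero-demand periods that defeat ordinary exponential smoothing.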
Jingda is an M.S. in Business Analytics and Information Management candidate at Purdue University. She has been exploring various areas of analytics, including machine learning, web scraping, database management, and data visualization. Prior to Purdue, she conducted quantitative research in communication behavior and built regression models to identify behavioral patterns. During her internship as a marketing analyst at MullenLowe, she performed consumer analytics, web analytics, and digital campaign optimization. Jingda enjoys the outdoors in her leisure time.
- Claims Prediction using Cox Hazard Model
Samuel Berestizhevsky, InProfix Inc.
The central piece of claim management is claims modeling. Two strategies are commonly used by insurers to analyze claims: the two-part approach, which decomposes claims cost into frequency and severity components, and the pure premium approach, which uses the Tweedie distribution. Here we evaluate an additional approach to claims analysis: time-to-event modeling. We provide a general framework for modeling claims using the Cox hazard model, a standard tool in survival analysis for studying the dependence of a hazard rate on covariates and time. Although the Cox hazard model is very popular in statistics, in practice the data to be analyzed often fail to satisfy the assumptions underlying the model. This work is also a case study intended to illustrate a possible application of the Cox hazard model to workers' compensation insurance, particularly the occurrence of claims.
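For a single covariate and no tied event times, the Cox partial likelihood underlying the model above can be written out directly. The toy workers'-compensation records and the crude grid search are illustrative, not the case-study analysis:

```python
import math

def cox_neg_log_partial_likelihood(beta, data):
    """Negative log partial likelihood of the Cox model for one
    covariate, assuming no tied event times.
    data: list of (time, event, x), event=1 if a claim occurred.
    At each event time, the risk set is everyone still under observation."""
    nll = 0.0
    for t_i, event, x_i in data:
        if not event:
            continue  # censored records contribute only via risk sets
        risk = [x for t, _, x in data if t >= t_i]
        nll -= beta * x_i - math.log(sum(math.exp(beta * x) for x in risk))
    return nll

# Toy data: (months observed, claim occurred, exposure score)
claims = [(2.0, 1, 1.3), (3.5, 0, 0.2), (5.0, 1, 0.9), (7.0, 0, 0.4)]

# Grid search for the beta minimizing the negative log partial likelihood
best_beta = min((b / 10 for b in range(-30, 31)),
                key=lambda b: cox_neg_log_partial_likelihood(b, claims))
```

Real implementations use Newton-type optimization and handle ties (e.g., Breslow or Efron corrections); this sketch only shows the likelihood being maximized.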
Samuel Berestizhevsky is an innovator and actionable analytics expert serving as the CEO of InProfix Inc., a stealth-mode startup that develops AI solutions for the insurance industry. Samuel has extensive knowledge of software development methods and technologies, artificial intelligence methods and algorithms, and statistically designed experiments. Samuel has co-authored two books on statistical analysis and metadata-based application development with SAS.
- A Machine Learning Approach To Estimating Oil Demand
Hongxia Shi, Purdue University
Predicting future liquefied petroleum gas (LPG) demand is of great importance, as it can help shipping companies optimize routing plans, commodity traders determine the precise timing to sell or purchase products, and government agencies develop better trading strategies. Current approaches rely on manual human analysis and cannot handle the massive amount of data, and thus cannot provide accurate predictions of future vessel patterns and behaviors. To tackle this challenge, we propose a predictive model that uses the availability of empty LPG vessels to predict how much LPG will be shipped in the future. We extracted features based on the total tonnage of empty vessels moving across the ocean and developed a regression model to predict the amount of LPG shipped three weeks later. The performance of multiple model configurations and regression models was compared and the optimal setting identified. Our final model successfully predicts future LPG shipping amounts with high accuracy.
Hongxia Shi is a master's student in the Business Analytics and Information Management Program at Purdue. She received her bachelor's degree in accounting from Nanjing University of Finance and Economics. Her work experience as a financial auditor and her technical knowledge sharpen her analytical competences. She aims to build a professional career in business analytics, employing her diverse expertise in business, data analysis, and database management to deliver data-driven solutions to enterprises embarking on this era of data.
Tuesday afternoon | Exhibit Hall
- Temporal Demand Forecasting Using Machine Learning Techniques: A Comparative Study Of Open-source And A Commercial In-house Solution
Kalyan Mupparaju, Purdue University
In this project, we outline the workflow of building open-source, modern time series prediction models to forecast the demand faced by a retail chain with numerous stores. We applied machine learning techniques (gradient boosting, LSTM neural networks, and support vector machines) and compared their performance to traditional ARIMA models. Ultimately, we compared the performance of these models to a commercial in-house machine learning platform. In today's world of extreme competition, cost reduction is of utmost importance, primarily in the retail and consumer packaged goods (CPG) industries. In addition to cost optimization, holding just the right amount of inventory is increasingly important for consumer satisfaction. Efficient and accurate demand forecasts enable organizations to anticipate demand and consequently allocate the optimal amount of resources to minimize stagnant inventory. This is more pronounced for perishable items, where the question changes from "when will it be sold" to "will it be sold before the end of its shelf life." While traditional techniques have given good results for demand forecasting at an aggregated (week/month) level, we demonstrated that by using deep-learning algorithms, forecasts can also be made at the day level with significant accuracy, giving retailers more flexibility in their resource planning exercises. These models performed below par compared to the commercial platform, yet demonstrated the effectiveness of an open-source workflow.
Kalyan graduated from the Indian Institute of Technology (BHU) in Varanasi, India, in 2015 with a Bachelor of Technology degree in Civil Engineering. After his undergraduate education, he joined Mu Sigma Business Solutions, where he worked with a Fortune 100 US-based retail giant on various analytical projects, including new service impact measurement, cannibalization analysis, customer targeting, customer satisfaction analysis, and store space optimization. Kalyan's work impressed clients, and he received special appreciation for a project in which he designed a framework to identify high-value customers very early in their shopping journey. Toward the later part of his time at Mu Sigma, he also led the training of new analysts joining his team in SQL and team-specific data knowledge. Kalyan believes that there will be a greater push for real-time data-driven decision making in the near future. He sees himself working with a major technology or retail company to lead this revolution by employing the skills and knowledge he will acquire while pursuing the MS BAIM program at Purdue University.
- Bootstrapping versus Geometric Brownian Motion – A Tale of Two Simulation Models for Portfolio Analysis
James DiLellio, Pepperdine University
Geometric Brownian motion (GBM) is an important stochastic process used in a variety of applications and is governed by two parameters: drift and variance. In financial applications, it has widespread use in modeling stock price movements and is fundamental to the Black-Scholes-Merton option pricing theory. However, the distribution of stock price observations can sometimes diverge from a normal distribution of returns, limiting the insights the GBM process can provide. This presentation discusses the alternative use of bootstrapping for simulating asset prices, demonstrating the benefits of this alternative formulation. Empirical studies over different historical periods of stock and bond market returns are presented, and the strengths of each model are highlighted in terms of their utility in supporting optimal decisions in portfolio analysis.
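The two simulation approaches being compared can be sketched side by side. The drift, volatility, and historical return sample are invented for illustration:

```python
import math
import random

def gbm_path(s0, mu, sigma, n_steps, dt, rng):
    """Geometric Brownian motion: S_{t+dt} = S_t * exp((mu - sigma^2/2) dt
    + sigma * sqrt(dt) * Z), with Z ~ N(0, 1). Returns the terminal price."""
    s = s0
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)
        s *= math.exp((mu - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * z)
    return s

def bootstrap_path(s0, historical_returns, n_steps, rng):
    """Bootstrap: resample observed per-period returns with replacement,
    so the simulation inherits the empirical (possibly non-normal) shape."""
    s = s0
    for _ in range(n_steps):
        s *= 1.0 + rng.choice(historical_returns)
    return s

rng = random.Random(7)
hist = [0.02, -0.01, 0.03, -0.04, 0.01, 0.05, -0.02, 0.00]  # illustrative monthly returns
gbm_end = gbm_path(100.0, 0.06, 0.15, 12, 1 / 12, rng)      # one simulated year
boot_end = bootstrap_path(100.0, hist, 12, rng)
```

GBM imposes lognormal terminal prices, while the bootstrap reproduces whatever fat tails or skew the historical sample contains, which is exactly the divergence the presentation examines.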
James A. DiLellio, PhD, MBA, is an Associate Professor of Decision Sciences in the Graziadio School of Business and Management at Pepperdine University. Dr. DiLellio has over a decade of domestic and international experience in the aerospace and defense industries. He most recently served as a department manager for Raytheon and has also served as a senior manager at the Boeing Company. Dr. DiLellio’s current research interests are primarily in the areas of nonlinear optimization, simulation, and Kalman filtering techniques for modeling investment problems. Applications of this research cover portfolio management, retirement planning, commodity price modeling, and the analysis of investment strategies. He has published papers in Decision Sciences, Energy Economics, Journal of Economics and Finance, Journal of Investing, and Financial Services Review.
- An Optimization Approach For Assortment Planning
Rishabh Mohan, Purdue University
In this report, an optimization model is built to address a central question of supply chain management: what to stock and how much to stock. There is a trade-off between inventory cost and customer satisfaction, since a high inventory level increases cost as well as customer service level, and vice versa. Firms therefore seek a balance between the two that maximizes service level at minimum cost. This leads to the idea of inventory assortment, i.e., which combinations and quantities of SKUs to stock so that customers can find the SKUs they want. The quantities and combinations vary with the objective of the firm, i.e., whether it wants to maximize profit, coverage, or customer satisfaction. In collaboration with a national retailer, we built an optimization model with profit maximization as the objective function over different assortments of SKUs, with total space and total quantity required as decision variables. Constraints in the model include the store and MPOG budget, store shelf-space availability, the minimum quantity required if a SKU is stocked, and bundling (the inclusion of one SKU with another). With this model, we provide an analytics-based decision-support system (DSS) for the problem. The model should be particularly helpful in the retail sector, where assortment planning is a major business problem; if it is not properly addressed, a firm may lose its competitive advantage and leave many customers unsatisfied. An analytical model that captures trends in historical customer data ultimately helps predict customer choice. With our DSS, a proper assortment of SKUs can be developed that caters to customers' needs. Few reports on assortment optimization build the model with the Gurobi package in R; Gurobi is one of the most powerful packages for building optimization models, and hence we use it to build our assortment optimization model.
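As a hedged illustration of the kind of model described (the authors solve theirs with the Gurobi package in R; the SKU data, constraint values, and brute-force search below are hypothetical stand-ins for a real solver run):

```python
from itertools import product

# Hypothetical SKUs: (profit per unit, shelf space per unit, cost per unit)
skus = {"A": (4.0, 2, 3.0), "B": (2.5, 1, 2.0), "C": (6.0, 3, 5.0)}
SHELF_CAP, BUDGET, MIN_QTY = 10, 20.0, 2  # min qty applies only if a SKU is stocked

def best_assortment(skus, shelf_cap, budget, min_qty, max_qty=10):
    best, best_profit = None, -1.0
    # Brute force over quantities; an integer-programming solver such as
    # Gurobi handles the same structure at realistic scale
    for qs in product(range(max_qty + 1), repeat=len(skus)):
        if any(0 < q < min_qty for q in qs):
            continue  # if stocked at all, stock at least min_qty
        space = sum(q * s[1] for q, s in zip(qs, skus.values()))
        cost = sum(q * s[2] for q, s in zip(qs, skus.values()))
        if space > shelf_cap or cost > budget:
            continue
        profit = sum(q * s[0] for q, s in zip(qs, skus.values()))
        if profit > best_profit:
            best, best_profit = dict(zip(skus, qs)), profit
    return best, best_profit

plan, profit = best_assortment(skus, SHELF_CAP, BUDGET, MIN_QTY)
```

The objective and constraints mirror the abstract's description (profit maximization subject to budget, shelf space, and minimum stocking quantities); bundling constraints would be added as additional linking conditions in the solver model.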
The author is a graduate student at Purdue University pursuing a master's in Business Analytics and Information Management, planning to graduate in May 2018. Going forward, he wants to contribute to teams that apply analytics, machine learning, and informatics to overcome roadblocks and make informed decisions in business processes as well as in drug discovery.
- Supply Chain Network Optimization at Nordstrom
With annual revenues exceeding $15 billion, Nordstrom's supply chain network is complex. Logistics and distribution costs in the United States are estimated at more than $1 billion annually and are expected to increase significantly with growing pressures in the transportation industry coupled with growth in the ecommerce segment of the business. Nordstrom purchases more than a million SKUs from a large number of suppliers around the world. The supply network originates with suppliers, with goods flowing through various distribution and fulfillment centers in the United States before reaching customers via stores or online deliveries. With the significant growth of the ecommerce fulfillment network alongside brick-and-mortar stores, integrating the ecommerce network with the physical footprint presents an opportunity for long-term optimization. In addition to network complexities, the business strategies pursued by management influence the design and operation of the supply chain: cost, capacity, service, and in-store inventory presentation all shape the network configuration. We adopted an integer programming model to optimize the distribution network; the model was formulated using a Python interface and solved with Gurobi. The results of our analysis helped senior leadership develop a long-term strategy for the Nordstrom supply chain network in a timely manner. The team is now developing more comprehensive supply chain network optimization models to solve problems unique to the Nordstrom network. The purpose of this presentation is to describe this ongoing project to optimize the supply chain network configuration.
- Clustering And Prediction Problem Framing For Sparse Demand Products
In collaboration with a national retailer, this study assessed the impact on sales prediction accuracy of clustering sparse-demand products in various ways, while trying to identify scenarios in which framing the problem as regression or as classification leads to the best demand decision-support. The problem is motivated by the fact that modeling very sparse demand is hard. Some retailers frame the prediction problem as classification, obtaining the propensity that a product will or will not sell within a specified planning horizon; others model it in a regression setting plagued by many zeros in the response. In our study, we clustered products using the k-means, SOM, and HDBSCAN algorithms on lifecycle, failure-rate, product-usability, and market-type features. We found a consistent story behind the clusters generated, distinguished primarily by particular demand patterns. Next, we aggregated the clustering results into a single input feature, which improved the prediction accuracy of the predictive models we examined. When forecasting sales, we investigated a variety of regression- and classification-type models and report a short list of those that performed best in each case. Lastly, we identify certain scenarios we observed when modeling the problem as classification versus regression, so that our partner could be more strategic in how they use these forecasts for their assortment decisions.
- A Gravity Model of Market Share Based on Transaction Records
Mohsen Bahrami, Sabanci University/MIT, Media Lab
Understanding customer patronage behavior is an essential step in solving facility location and market share estimation problems. While existing studies have relied on surveys to estimate merchants' market shares and attractiveness factors, recent trends in big data analysis enable us to understand human behavior and decision making more deeply. This paper proposes a novel approach to transaction-based patronage behavior modeling. We use the Huff gravity model together with large-scale transactional datasets to model customer patronage behavior at a regional scale. This paper is the first to use the Huff model in conjunction with a large-scale transactional dataset to model customer retail patronage behavior, enabling us to easily apply the model to different regions and merchant categories and to evaluate indicators that correlate with the Huff model's performance. Experimental results show that our method robustly models customer shopping behavior across various shopping categories.
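A minimal sketch of the Huff gravity model at the core of the approach (the attractiveness and distance values below are hypothetical; in the paper, attractiveness would be derived from transaction records):

```python
def huff_probabilities(attractiveness, distances, alpha=1.0, beta=2.0):
    """Huff model: the probability a customer patronizes each store is
    proportional to attractiveness**alpha / distance**beta."""
    utilities = [a**alpha / d**beta for a, d in zip(attractiveness, distances)]
    total = sum(utilities)
    return [u / total for u in utilities]

# Hypothetical example: three competing stores as seen from one customer location
probs = huff_probabilities(attractiveness=[100, 50, 200],
                           distances=[2.0, 1.0, 4.0])
```

Summing these probabilities over customer locations, weighted by population or transaction volume, yields each merchant's estimated market share.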
Mohsen Bahrami is a Ph.D. candidate in Operations & Supply Chain Management at Sabanci University. He is currently a research assistant in the Human Dynamics Group at the MIT Media Lab. His passion is using data to understand human behavior and to help better organize companies, public well-being, and governance. Prior to beginning the Ph.D. program, Mohsen worked as a manager and CEO in the electronics industry. He conducted a variety of projects involving qualitative and quantitative analysis in the Operations Management field, and he won an award at the "Superior Managers of Electricity and Electronics Industry" conference in Tehran, Iran, in 2012. Mohsen received his B.Sc. in Electrical Engineering from Sharif University of Technology and his MBA from Tehran Polytechnic. He is currently working on research projects in Big Data and Behavioral Analytics using transactional data from the finance industry.
- Machine Learning Approach To Identify Risky Projects
Meena Kewlani, DXC Technology
We identify a rare event, a customer reneging on a signed agreement, which is akin to problems such as fraud detection and diagnosis of rare diseases, where there is a high cost of misclassification. Our approach can be used whenever the class to be predicted is highly under-represented in the data (i.e., the data is imbalanced) because it is rare by design, there is a clear benefit attached to accurately classifying this class, and an even higher cost attached to misclassifying it. Pre-emptive classification of churn, contract cancellations, and identification of at-risk youths in a community are potential situations where our model development and evaluation approach can better classify rare but important events. We use Random Forest and Gradient Boosting classifiers to predict customers as members of a highly under-represented class and handle the imbalanced data using techniques such as SMOTE, class weights, and a combination of both. Finally, we compare cost-based multi-class classification models by measuring the dollar value of potential lost revenue and the costs our client can save by using our model to identify at-risk projects and proactively engage with such customers. While most research on imbalanced datasets deals with binary classification, ours is a multi-class classification problem, which adds another layer of intricacy.
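A simplified sketch of the SMOTE idea mentioned above, which synthesizes minority-class samples by interpolating between a point and one of its nearest neighbors (the points and parameters are illustrative; production work would use a library implementation such as imbalanced-learn):

```python
import random

def smote_oversample(minority, n_new, k=3, rng=None):
    """Minimal SMOTE sketch: create synthetic minority samples by
    interpolating between a point and one of its k nearest neighbors."""
    rng = rng or random.Random(42)

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    synthetic = []
    for _ in range(n_new):
        p = rng.choice(minority)
        neighbors = sorted((q for q in minority if q is not p),
                           key=lambda q: dist(p, q))[:k]
        q = rng.choice(neighbors)
        gap = rng.random()  # how far along the segment p->q to place the sample
        synthetic.append(tuple(x + gap * (y - x) for x, y in zip(p, q)))
    return synthetic

# Hypothetical 2-D minority-class points (e.g., reneging customers)
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_oversample(minority, n_new=6)
```

Because each synthetic point lies on a segment between two real minority samples, oversampling enriches the rare class without simply duplicating records.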
Meena is a data-driven decision facilitator who enjoys employing analytics and machine learning techniques to turn data into actionable insights. She graduated from Rajiv Gandhi Technical University, India, in 2014 and started her professional tenure with CSC (now DXC Technology). She worked as a Programmer Analyst for three years and specialized in CSC’s enterprise product “CyberLife,” which required not only technical expertise but also an analytical perspective to formulate business strategies and solve complex business problems. She also developed an automation application for the client HMI (HealthMarkets Inc.) to facilitate billing change and update requests from its customers. After acquiring a strong appreciation for the financial IT industry, she decided to pursue a graduate program to harness the unexplored potential of the world of business intelligence and eventually joined Krannert’s Business Analytics and Information Management program. Meena is now motivated to enhance her proficiency in the advanced analytic models and tools used to make better-informed and lucrative business decisions and wants to become a valuable asset to a company’s growth.
- Predicting Return To Acute Care Of Rehab Patients
Huigang Liang, East Carolina University
Readmissions have important implications for healthcare policy makers, payers, and providers. Return to acute care hospital (RACH), or readmission from inpatient rehabilitation facilities (IRFs) back to acute care hospitals before finishing the course of rehabilitation, is a significant and growing problem in the US. Current predictive models for RACH perform poorly. This study compared nine machine learning models on a 16-year data set. The random forest model outperformed the others with an AUC of 0.93 and a misclassification rate of 0.06. Using random forest, the accuracy of predicting RACH can be greatly improved.
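For readers unfamiliar with the AUC metric cited above, it can be computed directly from predicted scores via its rank (Mann-Whitney) interpretation; the scores below are hypothetical, not the study's:

```python
def auc(scores_pos, scores_neg):
    """AUC via the Mann-Whitney U statistic: the probability that a
    randomly chosen positive case outranks a randomly chosen negative one."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical predicted RACH probabilities for readmitted vs. non-readmitted patients
pos = [0.9, 0.8, 0.75, 0.6]
neg = [0.4, 0.3, 0.65, 0.2, 0.1]
value = auc(pos, neg)
```

An AUC of 0.93, as reported for the random forest, means a readmitted patient receives a higher predicted risk than a non-readmitted one about 93% of the time.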
Huigang Liang is Professor of MIS and Teer Endowed Chair at College of Business, East Carolina University. His research focuses on IT issues at both individual and organizational levels in a variety of contexts. His current interests include data analytics in healthcare, IT security, and IT strategy. His work has appeared in MIS Quarterly, Information Systems Research, Journal of MIS, Journal of AIS, Communications of the ACM, Decision Support Systems, Information Systems Journal, and Journal of Strategic Information Systems, among others. He is serving on the editorial board of MISQ, JAIS, and I&M.
- Jumpstarting Analytics Teams In The Healthcare Environment
Healthcare is a data-dense environment, with data originating from clinical treatment, billing, and patient scheduling. Two years ago, our institution created an analytics department to provide insight into the medical practice of 110 surgeons operating in 33 hospitals in the Los Angeles area. To complement existing tools such as the INFORMS Analytics Maturity Model, we describe in detail four management strategies to help ensure the success of new analytics teams in healthcare. First, since healthcare is a legacy industry, the staffing model of a new analytics team must consider hiring internally to ensure tacit organizational knowledge is available within the team. Second, since the project mix can significantly overlap with evidence-based clinical protocols, it is important to include clinical expertise at the mid-management level. Third, a surge in project proposals is observed at the initiation of an analytics team, requiring a choice between making slow progress on many projects or 'sprinting' on a few top-priority projects to demonstrate early wins. Lastly, to ensure steady growth of the analytics function, it is crucial to work with executives to define return-on-investment categories that measure the impact of projects executed by the analytics team in its infancy.
Manas Bhatnagar, MHSc, has developed and implemented several technology solutions for use in the healthcare setting. His work has supported research in several domains such as clinical decision support, quality improvement, mobile health apps, and design thinking in healthcare. His expertise lies in supporting clinical research through the creation of usable technology and data analytics. Mr. Bhatnagar is an engineer by training, earning his degrees at the University of Toronto and the University of Kansas.
- Bayesian Hierarchical Bernoulli Weibull Mixture Model for Extremely Rare Events on Internet Service
Yuki Ohnishi, BizReach, Inc.
Estimating the duration until user events, such as the first purchase on an EC site or the first login after registration, is a central concern for most internet companies. However, it is often the case that these events rarely occur, because most users become inactive after registration, and conventional statistical frameworks cannot be applied well in this scenario. To address this challenge, this work proposes a Bayesian hierarchical Bernoulli-Weibull mixture model for infrequent events. The results show that the proposed model offers substantial improvements over conventional, non-hierarchical survival models in terms of WAIC and WBIC. Furthermore, an experiment and extensive analysis were conducted using real-world data from the Japanese job search website CareerTrek, a popular matching platform for junior job seekers and headhunters in Japan, published by BizReach, Inc. The analysis raises several research questions, and a discussion of uncertainty is provided along with the analysis.
The author is a lead data scientist at BizReach, Inc., a rapidly growing IT company in Japan. He graduated from the University of Tokyo with a Master of Engineering degree (2014). His work focuses on the development of machine-learning-related systems and business analytics collaboration. His current interest is the interpretability of machine learning models, which can be derived from Bayesian modeling.
- A Retrospective Investigation Of Test And Learn Business Experiments & Lift Analysis
Manya Mittal, Purdue University
This study provides an analysis to retrospectively investigate how various promotional activities (e.g., discount rates and bundling) affect a firm's KPIs such as sales, traffic, and margins. The motivation for this study is that in the retail industry, a small change in price has significant business implications. The Fortune 500 retailer we collaborated with thrives on low price margins and had historically run many promotions; however, until this study, it had limited ability to estimate their impact on the business. The solution employs a traditional log-log model of demand versus price to obtain a baseline measure of price sensitivity, followed by an efficient dynamic time-series intermittent forecast to estimate the promotional lift. We believe our approach is a novel and practical way to retrospectively understand the promotional effects of test-and-learn experiments, one that any retailer could implement to help improve revenue management.
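In a log-log demand model, the regression slope is the constant price elasticity, which is what makes it a natural baseline for price sensitivity; a minimal sketch with synthetic data (the prices and the elasticity of -2 are illustrative, not the study's estimates):

```python
import math

def loglog_elasticity(prices, quantities):
    """OLS slope of log(quantity) on log(price); in a log-log demand
    model this slope is the (constant) price elasticity of demand."""
    xs = [math.log(p) for p in prices]
    ys = [math.log(q) for q in quantities]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical weekly price/sales observations generated with elasticity -2
prices = [10.0, 9.0, 8.0, 12.0, 11.0]
quantities = [100.0 * (p / 10.0) ** -2 for p in prices]
elasticity = loglog_elasticity(prices, quantities)
```

The promotional lift is then measured as the gap between actual promoted-period sales and this baseline's forecast.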
The author is a graduate student majoring in Business Analytics & Information Management (BAIM) at Purdue University. With two years of prior analytics consulting experience, she is adept at delivering high-quality recommendations to clients through simplification of complex data analytics problems.
- Estimating Home Delivery Orders in Grocery Sector: Leveraging Internal and External Data Sources
Sepideh Kaffash, Suffolk University
The online grocery sector is enjoying strong growth, putting retailers under pressure to deliver within one-hour time windows. To maintain contact with the end consumer, retailers invest in their own fleets and provide home delivery service in-house. The demand for home delivery orders must be estimated for fleet and route optimisation. This talk shows how data from internal and external sources can be combined to estimate location-based home delivery orders. Online grocery order data comprising the retailer's annual revenues and average basket sizes, population and sociodemographic statistics for the locations to be served, and retailers' market shares and store footprints in the same areas are inputs to a neural network whose output is the annual home delivery orders per location. We use 300K home deliveries from an online retailer to train, validate, and test the neural network. The intended audience for this talk is planners interested in novel methodologies for demand estimation.
Dr Kaffash was the financial trade manager and director of credit risk analytics at Bank Melli, the largest bank in Iran. She was responsible for supporting international financial trade transactions, including long-standing methods of payment and open accounts for customers. Kaffash controlled the pre-shipment, post-shipment, and receivable finance processes, and she managed a financial trade department. Kaffash is a visiting scholar in the Information Systems and Operations Management Department at Suffolk University. Previously she taught “Managerial Statistics,” “Operations Management,” and “Global Supply Chain Finance” at the University of Massachusetts Boston. Dr Kaffash has published in Economic Modelling and the Annals of Operations Research.
- Mining Business Value From Unstructured Data
William Amadio, Rider University
Is there business value in unstructured data such as the opinions expressed in online reviews? And if so, how does one exploit that value? To explore these questions, we present a text mining/visualization system for competitive analysis using online reviews. The system transforms a collection of reviews into a hierarchy of data of increasing dimensionality, and integrates the levels of the hierarchy through interactive visual summaries. This novel design enables analysts to identify, for example, features that discriminate amongst competitors or features required for competition at a given price point. The resulting intelligence can be used to plan competitive actions, anticipate rivals’ actions and find opportunities in changing and/or less competitive spaces.
Bill Amadio is Associate Professor of Information Systems and Director of the Center for Business Analytics at Rider University in Lawrenceville, NJ.
Dr. Amadio has worked on text mining projects ranging from marketing management to fraud detection. His statistical experience includes studies in the management of cancer pain, the design of employer/employee contributions to health benefits and genetic studies in newborns.
Dr. Amadio earned his doctorate in mathematics at the Polytechnic Institute of Brooklyn in New York City. He is the author of two books on computer system analysis, along with numerous articles in professional publications such as The Journal of Information Technology and The Journal of Human Genetics.
- A Dynamic Simulation Shows Patient-Matched Knee Implants Fabricated Through Additive Manufacturing Can Enhance Patient Outcomes And Be Cost-Effective
More than 6 million people are living with knee implants in the United States, and the number of patients requiring knee replacements is expected to increase to more than 3.4 million per year by 2030. Yet the costs and benefits of total joint replacement procedures have not been fully evaluated. The need for efficient strategies to improve the quality of knee implants and the efficiency of procedures and operations is more apparent than ever. Using a model that simulates the adoption of patient-matched (PM) and traditionally manufactured off-the-shelf (OTS) implants, we show that a higher rate of PM adoption has positive long-term effects not only on patient outcomes but also on the economics of the healthcare system. Results indicate that a higher rate of PM adoption ultimately saves hospitals and surgeons 11% of procedure time, decreases recovery time by 30%, reduces readmissions and revision surgeries by 2% and 4%, respectively, and lowers costs for the system's stakeholders by 18%.
I received my Bachelor’s degree in Electrical Engineering from Purdue University (PU). During my undergraduate studies at Purdue, I was assigned to an energy project within a sustainable energy research program. Throughout this initiative my colleagues and I designed and performed simulation analyses on a backup battery system for solar cell panels to provide electricity for student dormitories; this project was not only a strong educational experience for me but also sparked my interest in continuous learning and research. I subsequently completed my Engineering Management Master’s degree at Northeastern University in 2013 and began my PhD in Industrial Engineering at Northeastern University in September 2014. The focus of my research has been on modeling, dynamic simulation, process optimization, smart manufacturing, and 3D printing, and their applications in the medical field. Over the past years I have had the opportunity to attend several conferences, including the RAPID conference in 2017, where I received the Additive Manufacturing Fundamental Certification from SME. The curriculum at Northeastern has helped advance my statistical, optimization, and economic decision-making skills while I explored the applicability of additive manufacturing in the healthcare industry through a variety of conferences and professional networks.
- Optimization Of Preferred Boarding Groups By Markov Chains
Many low-cost carriers (such as VivaAerobus and Viviana in Mexico) offer boarding preference not only to business- or first-class passengers but also to economy-class passengers who pay a special fee during web check-in. Even though the company's profit from this fee is maximized if all economy-class passengers acquire the privilege, this extreme case is not stable: the preferred boarding line stops being attractive when all passengers enjoy the same preference, so many might decline to buy it for their next flights. On the other hand, when the majority of economy-class passengers must wait until their privileged peers pass the check-in counter, they either gain an incentive to buy a preferred boarding pass for their next flight or decide to arrive at the airport much later than they normally would in order to avoid the ordeal of the stalled check-in line for unprivileged passengers. The latter behavior can become troublesome for the airline, as it bears the risk of delayed departures. Therefore, there must be a trade-off for the carriers that determines the optimal proportion of preferred boarding passes offered for sale. Since optimality is understood here as the percentage of preferred boarding passes that implies the minimum deviation from its value in future sales, Markov steady-state techniques help solve the problem. Interesting conclusions and practical recommendations have been developed and justified in a rigorous mathematical form.
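The steady-state computation at the heart of the approach can be sketched as follows (the two-state chain and its transition probabilities are hypothetical simplifications of the paper's model):

```python
def stationary(P, iters=1000):
    """Power-iterate a row-stochastic transition matrix to its stationary
    distribution (assumes the chain is ergodic)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

# Hypothetical 2-state chain: state 0 = bought priority boarding, state 1 = did not.
# Transition probabilities encode how this flight's experience shapes the next purchase.
P = [[0.7, 0.3],   # priority buyers repeat with probability 0.7
     [0.4, 0.6]]   # non-buyers upgrade next time with probability 0.4
pi = stationary(P)
```

The stationary distribution gives the long-run fraction of passengers buying the preferred pass, which is the quantity the optimal sale proportion is matched against.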
Viacheslav V. Kalashnikov received his Ph.D. degree in Operations Research (OR) in 1981 from the Institute of Mathematics of the USSR Academy of Sciences in Novosibirsk and his Dr. Sc. (Habilitation) in OR in 1995 from the Central Economics and Mathematics Institute of the Russian Academy of Sciences. His work in the areas of bilevel programming and variational inequality problems is well known in the optimization community. He is author or co-author of 5 monographs and more than 90 papers published in many prestigious journals in the area of optimization. He has advised 9 Ph.D. students and 14 master's students at universities in Russia, Mexico, and Ukraine.
- Media Mix Modeling Solution By Discovery Communications And Civis Analytics
Anastasia Chen, Civis Analytics
The D%Mix Media Mix Modeling tool, developed by Discovery Communications and Civis Analytics, is a self-service application that empowers media strategy planners to create advertising campaigns that maximize viewership and ratings points of season premieres. In the media industry, where marketing budgets and plans have historically been determined with incomplete information, the ability to make data-driven decisions is “a breakthrough of many orders of magnitude.” Starting with a data set of historical show and campaign characteristics, we use propensity matching and predictive modeling techniques to build a model that accurately reflects the impact of advertising spend on TV ratings for supported shows. This model serves as the objective function for a non-linear optimization problem, which determines the media budget allocation that maximizes TV ratings for a premiere, given show characteristics and spending constraints. The model and optimization engine are packaged in a self-service web application that media planners access multiple times a week to plan campaigns throughout the year and demonstrate the value of marketing campaigns. D%Mix supports planning for more than 100 campaigns per year over multiple networks.
Ana is a Senior Applied Data Scientist at Civis Analytics, where she specializes in applications of optimization and customer analytics in the media and retail sectors. Her prior work includes developing a tool that combines optimization with predictive analytics to help clients allocate marketing spend, conducting field experiments to measure efficacy of marketing interventions, and designing and evaluating consumer segments for audience definition and targeted advertising. Prior to Civis, Ana was an operations consultant with Analytics Operations Engineering, where she worked with retail clients on problems ranging from marketing analytics to inventory management. She holds an MPhil in Management Science and Operations from the University of Cambridge Judge Business School and a BS in Management Science from the Massachusetts Institute of Technology.
- Predicting And Optimizing Transportation Costs For A National Retailer Under Tiered Rebate Constraints
In this study we developed a transportation delivery decision-support system in collaboration with a high-end national furniture retailer. This retailer transports millions of dollars' worth of products via a popular package delivery company. The retailer is rewarded with a significant annual rebate from the carrier based on total shipment dollars over the course of a year; however, the decision to choose a delivery vendor is made on a weekly basis. For every shipment, there is a trade-off between increasing total sales with the popular carrier (and thus achieving a higher tiered discount later) or choosing an alternative lower-cost transportation vendor today. The motivation for this solution is that minimizing transportation costs is an important objective for all retailers. Transportation delivery companies often offer tiered rebate policies to incentivize customers to do more business with them. This challenges the retailer to determine how much of its current and future deliveries each carrier should handle in order to reduce overall transportation costs over a specified planning horizon. Thus, knowing how to calculate applicable rebate levels to take advantage of reduced logistics costs, and determining who will deliver what and when, is an important problem for supply chain departments. After understanding the constraints of the business problem and the primary transportation provider's tiered rebate policy, we framed it as an analytics problem. Our analytical solution predicts the expected rebate rate at a week level by framing it as a time-series forecasting problem. Traditional ARIMA models are used to build features that serve as inputs to machine learning models (e.g., Random Forest, Artificial Neural Network, and a Deep Learning model), which we found led to even better forecasts.
As agreed upon with the partner, we focused on obtaining a model that achieved the lowest cross-validated Root Mean Squared Error (RMSE). Lastly, we developed a tool that would simulate various transportation scenarios to demonstrate how much of the delivery business could be allocated to other smaller carriers, while still ensuring the retailer would remain in a certain strategic rebate tier to minimize their overall yearly transportation costs.
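A sketch of the feature-construction step described above, turning a weekly series into supervised-learning rows of lagged values for a downstream ML regressor (the shipment figures are hypothetical; the study's features also draw on ARIMA fits):

```python
def make_lag_features(series, lags=(1, 2, 4), window=4):
    """Turn a weekly series into (features, target) rows: lagged values
    plus a rolling mean, as inputs for a downstream ML regressor."""
    rows = []
    start = max(max(lags), window)
    for t in range(start, len(series)):
        feats = [series[t - l] for l in lags]
        feats.append(sum(series[t - window:t]) / window)  # rolling mean
        rows.append((feats, series[t]))
    return rows

# Hypothetical weekly shipment totals (in $k) with the primary carrier
weekly = [120, 135, 128, 150, 160, 155, 170, 165, 180, 175]
rows = make_lag_features(weekly)
```

Each row pairs recent history with the next week's value, so models such as a random forest can be trained and cross-validated on RMSE exactly as the study describes.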
Surya Gundavarapu graduated from Purdue University in 2017 with Bachelor's degrees in Supply Chain, Information Analytics, and Economics, and a Certificate in Entrepreneurship and Innovation. During his coursework, he had the opportunity to work on several case competitions and academic projects that put his skillset to the test across industries and disciplines. His first taste of the corporate world came during his internship at MoboTap Inc. Surya's primary interest is in information systems and artificial intelligence; he had his first taste of analytics during a data dive competition he participated in as part of his co-curricular endeavors at Purdue University, where he realized the potential of big data and data analytics in pattern recognition and machine learning, both instrumental in artificial intelligence. Surya sees an advanced degree in data analytics as essential to following his dream of advancing artificial intelligence to the stage of recognizing speech patterns and understanding human language. Outside his professional life, Surya is an avid consumer of information and a voracious reader. He enjoys writing software programs and scripts in his spare time.
- A Machine Learning Approach To Estimating Product Substitution Behavior
Matthew Lanham, Krannert School of Management, Purdue University and Biz Analytics Lab, LLC
This study examines multi-classification machine learning methodologies that a retailer might use to gauge a product’s propensity to sell when purchase substitution is inevitable. This study is important because retail category managers routinely reevaluate the mix of products they offer consumers. In practice, predictive models enable the retailer to estimate the substitutability of its products. Category managers use these estimates as a decision-support mechanism to help them decide not only the assortment, but also to formulate their breadth-versus-depth strategy for the upcoming planning horizon. In collaboration with a national retailer, we build and assess traditional choice models, investigate more sophisticated machine learning multi-classification methodologies, and compare the results. We highlight where certain models perform better than others and show that, in certain cases, non-traditional machine learning methods provide improved category-manager decision support.
The author is a Clinical Assistant Professor at Purdue University’s Krannert School of Management and Co-Founder/Chief Data Scientist of Biz Analytics Lab, LLC in Lafayette, IN. He is the course coordinator and instructor for the Data Mining, Predictive Analytics, and Using R for Analytics courses within Krannert, and spends most of his time obtaining and mentoring experiential learning projects for students within Purdue’s M.S. in Business Analytics & Information Management (BAIM) program. His research and industry experience focus on designing and developing decision-support solutions for retail problems.
- Fake or Not? Classification and Visualization of News Articles
Jeh Lokhande, Georgia Tech
The US Presidential election in 2016 created a huge uproar, with reports suggesting that fake news spread online affected people’s sentiments and hence the results of the election. As the internet is increasingly used in education, we all want it to be a safe place for the younger generation to learn from. There should therefore be a way to fact-check articles on the internet to prevent the spread of misinformation.
The current method of fact-checking is tedious, time-consuming, and manual. We therefore decided to automate this process by creating a tool that takes a URL as input, parses the text of the news article, extracts features from the article, classifies it as fake or real, and then displays the article’s trends over time (using Google Trends).
Fake articles often contain structural or textual features that betray their fakeness. We conducted a literature survey on such features and extracted them from the article text. Text was converted into machine-readable vectors using Word2Vec, a neural embedding model for vectorizing text. All the features were then fed into a sequential deep learning model for classification. The model made predictions with accuracy in the range of 85-90%.
Our aim is to improve the robustness of the model across different genres of news and then make this tool publicly available. We are working toward reducing the spread of misinformation on the internet.
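The embed-then-classify step above can be sketched as follows. This is only an illustrative skeleton under loud assumptions: a tiny random embedding table stands in for trained Word2Vec vectors, logistic regression stands in for the sequential deep learning model, and the four labeled headlines are invented.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
DIM = 50

# Toy embedding table standing in for trained Word2Vec vectors
vocab = {}
def embed(word):
    if word not in vocab:
        vocab[word] = rng.normal(size=DIM)
    return vocab[word]

def article_vector(text):
    # Average the word vectors: a common way to get a fixed-length article feature
    return np.mean([embed(w) for w in text.lower().split()], axis=0)

# Tiny invented corpus with labels (1 = fake, 0 = real)
articles = [
    ("shocking secret cure doctors hate revealed", 1),
    ("miracle trick melts fat overnight guaranteed", 1),
    ("senate passes budget bill after long debate", 0),
    ("central bank holds interest rates steady", 0),
]
X = np.array([article_vector(t) for t, _ in articles])
y = np.array([label for _, label in articles])

# A linear classifier stands in here for the sequential deep learning model
clf = LogisticRegression().fit(X, y)
print(clf.predict(X))
```

In a real pipeline, the URL parsing step would supply the article text, and the averaged embedding would be concatenated with the hand-engineered structural features before classification.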
MS in Analytics candidate at Georgia Tech with 2 years of professional experience as an analytics consultant for Unilever and Liberty Mutual. Experienced in predictive modelling (machine learning), natural language processing, optimization and visualization.
- Modeling Production In Japan’s Cardboard Industry Using An Axiomatic Nonparametric Production Function
Daisuke Yagi, Texas A&M University
We develop a new approach to estimating a production function based on the economic axioms of the Regular Ultra Passum law and convex non-homothetic input isoquants. The standard approach in the productivity literature is to estimate the parameters of a parametric function, such as the Cobb-Douglas production function. However, a parametric function has several restrictive characteristics. The goal of this research is to develop an approach that is less dependent on functional-form assumptions for estimating a production function. Central to the development of our estimator is stating the axioms as shape constraints and using shape-constrained nonparametric regression methods.
Graduate Research Assistant at Department of Industrial and Systems Engineering, Texas A&M University
Yagi, D., Y. Chen, A.L. Johnson, and H. Morita, 2018, Modeling production in Japan’s cardboard industry using an axiomatic nonparametric production function, Working paper.
Yagi, D., Y. Chen, A.L. Johnson, and T. Kuosmanen, 2018, Shape constrained kernel-weighted least squares: Application to production function estimation for Chilean manufacturing industries, Journal of Business and Economic Statistics, (Accepted).
Yagi, D., K. Nagasawa, T. Irohara, H. Ehm, G. Yachi, 2014, Semiconductor supply planning by considering transit options to take advantage of pre-productions and order cancellations, Simulation Modelling Practice and Theory, Vol.41, pp.46-58.
- Data Mining Approaches For Stock Risk Assessment
Amirhossein Gandomi, Stevens Institute of Technology
The recent exponential growth in the number of stock market investors motivates the development of predictive models to forecast the total risk of investing in stock markets. In this study, several data mining approaches, including classical and evolutionary computation approaches, were proposed to predict the total risk of stock investment. To develop the evolutionary computation models, a multi-objective genetic programming (GP) strategy based on the non-dominated sorting genetic algorithm II (NSGA-II) was employed. Here, the mean-squared error as the fitness measure and the subtree complexity as the complexity measure are the problem objectives. The GP model ran for 500 generations with a population of 1,000, using training/testing sets to guard against over-fitting. The proposed models were developed using an S&P 500 database spanning a 20-year period. The various data mining algorithms were compared and evaluated. The results suggest that some of the proposed models can reach a high degree of accuracy and can be applied to various stock databases to assess the total risk of investment. The accurate models, along with stock-selection decision-support systems, can overcome the disadvantages of weighted-scoring stock selection.
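The core of the multi-objective setup above is Pareto dominance over the two objectives (mean-squared error and subtree complexity). A minimal sketch of extracting the non-dominated front, with invented candidate models, looks like this (NSGA-II adds fast sorting and crowding distance on top of this idea):

```python
def pareto_front(points):
    """Return the non-dominated (MSE, complexity) pairs: a point is dominated
    if some other point is at least as good on both objectives."""
    front = []
    for p in points:
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and q != p
            for q in points
        )
        if not dominated:
            front.append(p)
    return front

# Hypothetical candidate GP models: (mean-squared error, subtree complexity)
models = [(0.10, 30), (0.12, 12), (0.30, 5), (0.11, 40), (0.35, 4)]
print(pareto_front(models))
```

Here (0.11, 40) is dominated by (0.10, 30), which is both more accurate and simpler; the remaining four points trade accuracy against complexity and survive on the front.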
Amir H. Gandomi is an assistant professor of Analytics & Information Systems at the School of Business, Stevens Institute of Technology. Prior to joining Stevens, Dr. Gandomi was a distinguished research fellow at the headquarters of the BEACON NSF center at Michigan State University. He received his PhD in engineering and has lectured at several universities. Dr. Gandomi has published over 130 journal papers and 4 books. Some of those publications are now among the hottest papers in the field and collectively have been cited about 8,500 times (h-index = 46). Recently, he was named a 2017 Clarivate Analytics Highly Cited Researcher (the top 1%) and ranked 20th in the GP bibliography among more than 10,000 researchers. He has also served as associate editor, editor, and guest editor for several prestigious journals and has delivered several keynote/invited talks. Dr. Gandomi is part of a NASA technology cluster on Big Data, Artificial Intelligence, and Machine Learning. His research interests are global optimization and (big) data mining using machine learning and, in particular, evolutionary computation.
- A Network Economic Model with Time Competition
Jun Ma, University of Calgary and Shenyang University of Technology
In this paper, we study the design and management of an effective supply chain network model for perishable products with time-based competition. Time-based competition in a supply chain network is driven by consumer characteristics. The design of time competition is based on the consumer’s trade-off between selling price and time, which directly affects perishable-product quality. A discrete choice model is developed to quantify consumer purchase decisions between selling price and time. The goal is to develop a new supply chain network equilibrium model that considers consumer-driven time competition and the deterioration of perishable products, in order to optimize the profit of each supply chain in competing demand markets. Furthermore, a variational inequality formulation is used to express the equilibrium solution in supply chain networks for perishable products. Numerical examples with different perishable-product characteristics demonstrate that the model is reliable and reasonable.
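The discrete choice component mentioned above can be given a minimal multinomial-logit sketch. This is not the paper's actual model; the coefficients, offers, and the linear price/time utility are all invented for illustration:

```python
import math

def logit_shares(utilities):
    """Multinomial-logit choice probabilities from option utilities."""
    exps = [math.exp(u) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical taste coefficients: consumers dislike both higher price and
# longer delivery time (longer time also degrades perishable-product quality)
beta_price, beta_time = -0.05, -0.3
offers = [(20, 1), (15, 3), (12, 5)]  # (price, delivery days)

u = [beta_price * p + beta_time * t for p, t in offers]
shares = logit_shares(u)
print([round(s, 3) for s in shares])
```

In the equilibrium model, shares like these would feed the demand side while the variational inequality ties together the competing supply chains' profit-maximizing flows.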
Jun Ma is a Ph.D. student at the University of Calgary, Canada, and an associate professor at Shenyang University of Technology, China. He is also a visiting scholar and adjunct instructor at the State University of New York. His research interests include supply chain equilibrium and optimization, large-scale optimization, production management and control, evolutionary games, and complex networks.
- Performance Analytics Of Regional Blood Centers With Public-private Partnerships
Bilal Anwar, Xi’an Jiao Tong University
An efficient MIS can generate analytics that were theoretically possible but not pragmatically viable. One such challenging area is the provision of adequate safe blood for the health sector in developing countries, where key issues in safe blood transfusion include:
- a lack of efficient tools for information and knowledge management and process automation (e.g., inept blood-component preparation methods, poor blood storage, and excessive blood ordering);
- the absence of a central Management Information System (MIS) connecting the stakeholders;
- in some places, state-of-the-art management information systems that exist but are not integrated with other blood centers.
Due to budgetary constraints, governments are focusing on public-private partnerships (PPPs) to enhance efficiency. This case study reveals how the efficiency of public-hospital employees can be enhanced after mergers by addressing technical hitches through innovative practices, and it presents examples of success based on comparisons of KPI data before and after hospital mergers. The case study thus offers an original solution and promises to be eye-opening to those who have not given much thought to this topic, providing benchmark practices for developing areas in pursuit of UN Sustainable Development Goal 3: ensure healthy lives and promote well-being for all at all ages.
Mr. Bilal Anwar is a final-year Ph.D. scholar at Xi’an Jiao Tong University, China. He has always enjoyed his academic career, whether in departmental studies or extracurricular activities. He has diversified experience as well as education: he completed master’s degrees in both Physics and management. He has nine years of diversified experience in the management of national and multinational banks, as well as in marketing pharmaceutical products in Pakistan. As a doctoral research student, his keen interest is in sustainable (economic and social) infrastructure through public-private partnerships.
- Machine Learning Algorithm For The Vehicle Routing Problem
Juan Jaramillo, SUNY Farmingdale School of Business
The design of efficient routes for vehicles that serve multiple customers is known as the Vehicle Routing Problem (VRP). Given its wide applications in logistics, the VRP is one of the most studied problems in Combinatorial Optimization (CO). This work presents a Machine Learning based Memetic Algorithm (MLMA) that uses path relinking as its diversification strategy and tabu search as its intensification technique. Like most meta-heuristics, the MLMA builds and evaluates a large number of solutions in the process of converging to a high-quality solution. The innovative Machine Learning component maps which locations tend to group together in the same routes, which do not, and which tend to organize themselves in consecutive order. Information obtained by the mapping process is used to generate clusters that lead to good-quality initial solutions and to speed up the algorithm during the intensification process. Experimental results indicate that adding the Machine Learning component increases both the effectiveness and the speed of the Memetic Algorithm.
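The "mapping" step described above can be sketched as a simple co-occurrence count over the pool of evaluated solutions. This is an illustrative guess at the idea, not the MLMA's actual implementation; the three candidate solutions over locations 1-6 are invented:

```python
from collections import Counter
from itertools import combinations

def route_cooccurrence(solutions):
    """Count how often each pair of locations appears in the same route
    across a pool of candidate solutions."""
    pair_counts = Counter()
    for routes in solutions:
        for route in routes:
            for a, b in combinations(sorted(route), 2):
                pair_counts[(a, b)] += 1
    return pair_counts

# Three candidate solutions, each a list of routes over locations 1..6
solutions = [
    [[1, 2, 3], [4, 5, 6]],
    [[1, 2], [3, 4], [5, 6]],
    [[1, 2, 3], [4, 6], [5]],
]
counts = route_cooccurrence(solutions)
print(counts.most_common(3))
```

Pairs with high counts (here, locations 1 and 2 appear together in every solution) are natural candidates to seed the same cluster in the initial solutions, which is how the mapping can both improve starting points and prune moves during intensification.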
Juan is an assistant professor at the SUNY Farmingdale School of Business. He holds a PhD in Industrial Engineering from West Virginia University. Juan has been employed in both industry and academia, and has taught undergraduate and master’s students in the fields of Industrial Engineering and Business Administration. He designed the Supply Chain B.S. at Albany State University and is the lead faculty member in the design and implementation of the new B.S. in Business Analytics at SUNY Farmingdale. Currently, Juan chairs the INFORMS Innovative Applications in Analytics Award. His published research spans the fields of Analytics, Operations, Logistics, and Supply Chain Management.
- Common Errors in Marketing Experiments and How to Avoid Them
Tanya Kolosova, InProfix Inc
The methodology for designing experiments developed by Sir Ronald Fisher is more than 80 years old, yet many marketers still rely on simple A/B tests to compare the performance of marketing campaigns and to find the conditions that achieve the best results. Because marketing efficiency depends on combinations of factors rather than on factors acting independently, A/B tests are not only inefficient but actually unsuitable for marketing experiments. In this article, we describe the useful and efficient split-unit (or split-plot) design of marketing experiments. Split-unit structure often arises in marketing experiments but goes unrecognized, and is frequently missed or inappropriately analyzed. We use a real-life example to demonstrate some of the ideas involved and ways to correctly analyze a split-unit design.
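The randomization scheme that defines a split-unit design can be sketched in a few lines. This is a generic illustration, not the article's example; the factor names (channel as the hard-to-change whole-plot factor, creative as the easy-to-change sub-plot factor) are hypothetical:

```python
import random

def split_plot_design(whole_factor, sub_factor, replicates, seed=0):
    """Randomize a split-unit design: the hard-to-change factor is assigned
    to whole plots; the easy-to-change factor is randomized within each."""
    rng = random.Random(seed)
    plan = []
    for rep in range(replicates):
        wholes = list(whole_factor)
        rng.shuffle(wholes)           # randomize whole-plot levels per replicate
        for w in wholes:
            subs = list(sub_factor)
            rng.shuffle(subs)         # independent randomization within each whole plot
            for s in subs:
                plan.append({"replicate": rep, "channel": w, "creative": s})
    return plan

# Hypothetical marketing example: channel is costly to switch (whole plot),
# creative version is cheap to vary (sub plot)
plan = split_plot_design(["email", "display"], ["A", "B", "C"], replicates=2)
print(len(plan))  # 2 replicates x 2 channels x 3 creatives = 12 runs
```

The statistical consequence of this two-level randomization is the article's central point: whole-plot and sub-plot comparisons have different error terms, so analyzing the runs as if they came from one fully randomized experiment (as a naive A/B analysis would) gives wrong standard errors.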
Tanya Kolosova is an expert in the area of actionable analytics and analytical software development having served as the Senior Vice President of Research and Analytics at IPG Inc, Principal Researcher at Yahoo!, Vice President of Analytics at Nielsen and Chief Analytics Officer at YieldWise Inc. Tanya developed her expertise with extensive depth and breadth of experience in bringing mathematical disciplines to bear on marketing and other business problems. She has extensive knowledge of audience intelligence, design and analysis of marketing experiments, market-mix modeling, and multi-channel commerce and has worked in a variety of industries like online and offline retail, telecom, finance, and more. Tanya also co-authored two books on statistical analysis and metadata-based applications development with SAS which are used in universities globally and was featured in Forbes Magazine (2006) for her work for GAP. In 2017 Tanya co-founded InProfix Inc, a stealth mode startup that develops AI solutions for the insurance industry.
- Data Mining Approach For Cast Selection In Movies
Shishir Suman, Georgia Tech
Movie production is a risky endeavor: only 30-40% of movies break even at the box office. One of the primary determinants of movie success is the cast. Existing cast selection methods are prone to human bias, with casting managers relying on their experience and contacts to audition actors. Current analytical approaches that study the effect of cast on potential movie success come in at the post-production or post-release phases of the movie, when it is already too late to make major cast changes.
The poster provides a data mining approach to predict movie success scores based on attributes known during pre-production. It also recommends a ranked list of cast combinations based on associated movie success. The underlying algorithmic framework includes movie-similarity identification, co-actor network analysis, and a random forest implementation to build a success-scoring model. The technique achieves respectably high accuracy in predicting movie success. Novel attributes, such as cast novelty for the given genre, pairings between actors, and the uniqueness of a cast combination, have been incorporated into the scoring.
An exhaustive dataset compiled from IMDb, TMDb, and other similar movie databases has been used to develop a web-based utility that lets users explore the impact of prospective actors on a movie’s success. Users also have the option to choose actors with different levels of popularity, as the role requires. Users can even filter out actors ranking high on the popularity scale to focus on lesser-known actors. Such a data-driven approach is more efficient than heuristic approaches and can identify “hidden gems” who might otherwise be overlooked.
This paper presents a novel approach to movie casting, a domain with immense unexplored potential for data analytics.
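One of the co-actor-network features named above, cast novelty, can be sketched as the fraction of actor pairs in a candidate cast that have never appeared together before. This is a plausible reading of the feature, not the poster's exact definition, and the filmographies below are invented:

```python
from itertools import combinations

def cast_novelty(candidate_cast, past_movies):
    """Fraction of actor pairs in the candidate cast that have never
    appeared together before (a simple co-actor-network feature)."""
    seen_pairs = set()
    for cast in past_movies:
        for pair in combinations(sorted(cast), 2):
            seen_pairs.add(pair)
    pairs = list(combinations(sorted(candidate_cast), 2))
    if not pairs:
        return 0.0
    new_pairs = [p for p in pairs if p not in seen_pairs]
    return len(new_pairs) / len(pairs)

# Hypothetical filmographies
history = [["Actor A", "Actor B"], ["Actor B", "Actor C", "Actor D"]]
print(cast_novelty(["Actor A", "Actor B", "Actor C"], history))
```

A feature like this, alongside genre similarity and cast popularity, would then be one input column to the random forest success-scoring model.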
Currently enrolled as a graduate student in the MS in Analytics program at Georgia Tech, I have experience using statistics and machine learning to find useful insights in data and to present those insights to a wider audience.
As an experienced professional with more than 3 years of product development experience in fast-paced, dynamic environments, I have developed a flair for flexible problem solving. Global exposure executing projects and serving clients in the US, East Asia, and India has helped me develop a solid use-case-driven perspective.
- Creative Outpatient Center Layout Design Using Simulation And Optimization
Vahab Vahdatzad, Northeastern University
Among developed countries, the United States has the highest annual per-capita health expenditures, yet there are significant challenges in increasing the effectiveness and efficiency of care delivery. These challenges result from numerous design and operational decisions, including long-term strategic decisions, medium-term tactical planning decisions, and short-term operational decisions. Facility layout and design are examples of strategic decisions that have been studied in the healthcare sector. Hospital layout researchers aim to minimize travel distances or the associated costs of locating clinical and operational units inside hospitals. However, optimizing hospital layouts by minimizing distances between units alone disregards functionality, i.e., designing the layout efficiently for stochastic patient flow. For this purpose, a simulation-optimization model is used to design a healthcare center layout through joint decisions that not only optimize walking distances but also seek to optimize patient flow and resource utilization.
Vahab Vahdatzad is a PhD candidate in Industrial Engineering at Northeastern University, Boston, MA. Vahab’s interest is in applying systems thinking and systems engineering to healthcare. He has worked on several healthcare projects applied at Boston Children’s Hospital, Massachusetts General Hospital, and Brigham and Women’s Hospital.
- Process Wind Tunnel for Improving Insurance Business Processes
Dr Sudhendu Rai, AIG
Massive amounts of data are being generated and collected by the multitude of software systems that orchestrate insurance business processes. With the latest developments in data science and operations research, such as data wrangling, visualization, process mining, tail scheduling, and simulation optimization, this data can yield key insights. The process wind tunnel is a framework and a set of software tools encapsulating these new methods and technologies. The framework helps assess, design, and optimize process structure, parameters, and operational policies using the data. We discuss two high-impact case studies to illustrate the framework and the corresponding data, analytical methods, considerations related to implementing process-change recommendations, and the results achieved. The intended audience for this talk is anyone interested in significantly improving their processes by leveraging data and new analytical methods and tools.
Sudhendu Rai is a Senior Director in the AIG Science group, responsible for developing and deploying innovative process improvement solutions for insurance business processes and operations. Prior to joining AIG, he was a Research Fellow at PARC and a Xerox Fellow at Xerox Corporation. He is a past Edelman finalist for his work on the print shop productivity improvement solution LDP (Lean Document Production), which delivered $200M in profits across the Xerox value chain. He holds 75 patents and has over 45 publications. He received his Ph.D. from MIT.
- Integrating Natural Language Processing And Fuzzy Logic To Address Contract Ambiguity
Mehdi Rajabi Asadabadi
In the procurement process, a set of requirements for the product and obligations for the vendor and purchaser are defined, and the vendor is then committed to providing the product. Since the buyer does not get involved in the process, avoiding vagueness in specifying the requirements and obligations is crucial to the success of such contracts. If the buyer signs an ambiguous contract, the result may be an unsatisfactory product. Despite the significance of contract clarification in the procurement process, only limited studies in software engineering have been undertaken. The aim of this research is to expose contract ambiguity in the context of the procurement process and to propose a semi-automated framework combining Natural Language Processing (NLP) and fuzzy logic to address the issue.
Mehdi was awarded the International Postgraduate Research Scholarship (IPRS) from the University of New South Wales. He is currently developing a framework that uses mathematical approaches to address ambiguities in contracts. His current study contributes to contract theory by opening a new line of research that paves the way for leveraging AI techniques in automated or semi-automated contract monitoring, and it should stimulate extensive applications of AI in contracting. Such applications will be beneficial in a future world where systems will need to arrange, and even sign, accurate contracts in place of humans.
Production & Presentation Guidelines
Posters must be produced as a single-sheet exposition that can fit on the bulletin board provided by INFORMS. The bulletin board measures 90” wide by 43” tall. We recommend a poster size of 72” wide by 36” tall or 48” wide by 36” tall. However, other sizes are acceptable as long as they do not exceed the bulletin board dimensions of 90” wide by 43” tall.
In preparing your poster, you may want to reference these online sites that provide templates, as well as printing services. Please see below for our guidelines on poster content and design.
We strongly recommend that you have your poster printed locally or by an online site before you travel to the meeting, rather than attempt to have it printed onsite in Baltimore.