Assistant Professor, Management Science & Statistics
University of Maryland, College Park
Addressing the Gender Pay Gap with Analytics
The gender pay gap (and other demographic pay gaps) is a topic of discussion in the boardroom, in the media, and among policy makers, with multiple new laws being passed at the US state level in addition to federal policies and legislation in the rest of the world. Increasingly, firms engage outside consultants to measure the pay gap or measure it internally. The most common methodology for determining pay discrimination is to fit log-linear regression models, regressing the natural log of wages on gender while controlling for observable characteristics such as education, job role, age, and experience (hereafter referred to as traits). The coefficient for gender indicates the percent pay gap between men and women. “Equal pay” certification typically means that the percent difference in pay between men and women, after all observable characteristics are controlled for, is less than some small threshold value. A manager faced with closing or reducing the gender pay gap has so far not had any tools to support her decision making. In this presentation we address this knowledge gap. We first describe a cost-optimal approach, based on statistics and optimization, that can meet the “equal pay for equal work” standard for less than half the cost of the naive method of increasing all female workers’ wages equally. We then explore the impacts of closing the gap based solely on cost efficiency. Because the gender pay gap is measured using log-wages, a $1 raise to an employee with a low wage has a greater percentage impact than the same raise to one with a high wage. Therefore, all else being equal, employees with low wages are more likely to receive a raise. Further, women who resemble men in terms of traits tend to have a larger influence on the measured gender pay gap and are therefore more likely to receive a salary increase. In addition, we show, counterintuitively, that there may exist men within a firm who, if they receive salary increases, will reduce the measured gender pay gap.
These men strongly typify male employees in terms of traits. We demonstrate these results on simplified simulated data. Finally, we discuss ghettoization and the limitations of these models in preventing it. To balance cost efficiency with fairness, we discuss other fairness-driven algorithmic approaches that address and close the gender pay gap. These approaches, while more expensive than the cost-optimal approach, can still save significant costs compared to the naïve approach. We demonstrate the above algorithmic approaches, and their savings and costs, using real data from our development partners.
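To make the log-wage mechanics concrete, here is a minimal, hypothetical sketch (the data and the no-controls gap measure are illustrative, not the authors' model): with no trait controls, the measured gap reduces to the difference in mean log wages, and a fixed raise budget shrinks it most when directed at the lowest-paid women, because log-wages respond more to each dollar at low wage levels.

```python
import math

# Hypothetical illustration: three men, three women, a $6,000 raise budget.
men = [50_000.0, 60_000.0, 70_000.0]
women = [40_000.0, 55_000.0, 65_000.0]

def gap(men, women):
    """Measured gap with no controls: difference in mean log wages."""
    mean = lambda xs: sum(xs) / len(xs)
    return mean([math.log(w) for w in men]) - mean([math.log(w) for w in women])

budget = 6_000.0

# Naive approach: split the budget equally among all women.
equal = [w + budget / len(women) for w in women]

# Cost-aware approach: direct the whole budget to the lowest-paid woman,
# where a dollar moves log-wage the most.
targeted = sorted(women)
targeted[0] += budget

# Same budget, smaller measured gap when targeted at low wages.
assert gap(men, targeted) < gap(men, equal) < gap(men, women)
```

The cost-optimal approach in the talk additionally controls for traits and optimizes over all employees, but the same log-wage leverage drives which employees receive raises.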
Margrét Vilborg Bjarnadóttir is an Assistant Professor of Management Science and Statistics at the Robert H. Smith School of Business. Dr. Bjarnadóttir holds a B.Sc. degree in Mechanical and Industrial Engineering from the University of Iceland (2001) and a Ph.D. in Operations Research from the Massachusetts Institute of Technology (2008). She teaches quantitative modeling and data analytics at the graduate level, both in the traditional classroom format and online. Before joining the Smith School she was at Stanford’s Graduate School of Business for two years. Dr. Bjarnadóttir’s research focuses on data-driven decision-making, combining operations research modeling with data analytics. In her work she has developed advanced data models to drive decision-making through optimization and predictive analytics. These models either build directly on the data, or take into account the uncertainty of different scenarios through the models’ predictive performance. In addition to the main focus of her work, which is health care, she has applied these models to contexts in finance and sports and, most recently, to people analytics. Examples of the topics of her papers include healthcare cost prediction, drug surveillance design, drug pattern analysis, cross-ownership analysis, and purchasing recommendations. Dr. Bjarnadóttir has advised a number of health care start-ups, such as D2Hawkeye, 360Fresh, and Benefit Science, on cost predictions and risk evaluations. She worked with the parliament-appointed Special Investigation Commission into the Banking Crash in Iceland, and later for the Central Bank of Iceland, where she focused on capital control fraud detection in the post-crash era.
Senior Program Manager
DC Water Pipeline Management with Risk Analysis and Machine Learning
Buried infrastructure is the backbone of our modern society. It connects communities, empowers our economy, and enables everyday life. Utility managers have to make tough decisions every day to balance repair and replacement of their ageing pipe inventory. The District of Columbia Water and Sewer Authority (DC Water) faces a problem that will require capital investment, which is limited in availability. In 2016, more than 250 water pipeline failures occurred in DC. DC Water will need to invest capital dollars to curb the inevitable increase in operational costs as well as impacts to service levels; however, capital budgets are tightly constrained. DC Water adopted a comprehensive risk management program for its 240 miles of large-diameter water mains to determine which pipe assets should be targeted for capital investment. By using asset risk to prioritize pipeline inspections, DC Water can implement the right approach, at the right time, to manage water pipelines with the lowest financial impact. With recent advances in pressure pipe inspection technologies, assessment techniques, and repair/rehabilitation methods, a risk-based approach will ensure resources are focused on the correct pipelines. The pipeline prioritization model was developed by estimating probability-of-failure and consequence-of-failure factors. A predictive model calculates the probability of failure using machine learning, based on pipe design information, 60 years of historical break records, annual climate data, soil data, and other relevant data sources. DC Water evaluated multiple aspects of the consequence of pipeline failure, including social, economic, and environmental impact. The risk assessment helped DC Water identify high-priority pipelines and is the foundation for performance-based resource allocation decisions. This presentation will cover the following aspects: 1. Challenges faced in water industry asset management 2. DC Water’s pipeline asset management strategies 3. DC Water’s pipeline failure machine learning model 4. DC Water’s consequence of failure model 5. DC Water’s risk mitigation practice
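The prioritization logic described above can be sketched in a few lines. This is a hypothetical illustration, not DC Water's actual model: the pipe records, the consequence weights, and the scores are invented; in practice the probability of failure would come from the machine learning model trained on break history, climate, and soil data.

```python
# Each pipe carries a modeled probability of failure (PoF) and
# social/economic/environmental consequence scores in [0, 1].
pipes = [
    {"id": "P1", "pof": 0.30, "social": 0.9, "economic": 0.8, "environmental": 0.4},
    {"id": "P2", "pof": 0.70, "social": 0.2, "economic": 0.3, "environmental": 0.1},
    {"id": "P3", "pof": 0.50, "social": 0.7, "economic": 0.9, "environmental": 0.6},
]

def consequence(pipe, weights=(0.4, 0.4, 0.2)):
    """Weighted consequence of failure (weights are illustrative)."""
    return (weights[0] * pipe["social"]
            + weights[1] * pipe["economic"]
            + weights[2] * pipe["environmental"])

def risk(pipe):
    """Risk = probability of failure x consequence of failure."""
    return pipe["pof"] * consequence(pipe)

# Inspect/renew the highest-risk assets first.
priority = sorted(pipes, key=risk, reverse=True)
```

Note that the highest-PoF pipe (P2) is not the top priority here: its low consequence score demotes it, which is exactly the trade-off a risk-based (rather than condition-only) program captures.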
Over the past 18 years Craig has worked for both public utilities and consulting companies, where he has performed a variety of tasks related to the development and management of water, wastewater, and stormwater infrastructure. Craig Daly currently works for Pure Technologies as a Senior Program Manager, where he performs a variety of tasks related to pipeline condition assessment, focusing on risk management, risk-based planning, and asset management of utility systems. Craig manages a group of engineers that utilizes advanced analytics to: • Develop a deeper understanding of how uncertainty affects the decision-making process. • Develop methods to incorporate uncertainty into capital planning for utility assets. • Utilize optimization methodology, along with a deeper understanding of uncertainty, to minimize capital investment while producing the maximum benefit with respect to infrastructure management. Craig received a Bachelor of Science degree in Forest Engineering from the University of New Brunswick in Canada and a Master of Environmental Engineering from Johns Hopkins University.
The Analytic Edge
From the Sit Room to the Board Room: Leveraging Intelligence Analytic Tools to Identify and Manage Risk
US Intelligence Community (USIC) analysts/managers and their private sector counterparts live in the same world. Both assess risk and vulnerabilities, forecast what lies ahead, and look for ways to influence or exploit outcomes. The private sector can use the same analytic tools to transform information into actionable intelligence that drives near-term and strategic decisions. For this session, Dr. Grusin will use the collapse of Target Canada in 2015 as a case study to demonstrate how structured analytic techniques, in this case the key assumptions check (KAC), can help private sector organizations identify and manage risk for routine and high-stakes decisions across business areas, at any level of the organization, and regardless of data sets or decision-making model. The session will close with an introduction to using the KAC exercise results to shape forecasts and mitigation strategies.
Dr. Grusin completed a 29-plus-year career at the Central Intelligence Agency in 2008. He is a member of the Senior Intelligence Service and received the Director’s Career Intelligence Medal in recognition of his contributions. Recognized for his skills as an analyst and manager/teacher of analysts, Dr. Grusin helped create and deliver an analytic tradecraft course that redefined how analysis would be taught and formed the foundation of the Directorate of Analysis training curriculum. He is a Central Intelligence Agency University certified instructor with over 14,000 hours in the classroom instructing analysts and their managers. After four years with SAIC/Leidos, Dr. Grusin established his company in 2012 to bring intelligence training to private and public sector organizations in and outside the US Intelligence Community. He serves clients in Washington DC and around the country. He has a BA from Bradley University and an MA and PhD from the University of Arizona.
Senior Director, Advanced Analytics
Dun & Bradstreet
Third Party Compliance Risk Analytics with Incomplete Data
Compliance risk is one of the critical third-party risks that businesses are exposed to. While leading the industry in commercial credit and operational risk analytics, Dun & Bradstreet (D&B) has also worked with multiple enterprises across various industries on their third-party compliance risk, a relatively new arena in risk analytics. One of the most common obstacles in third-party compliance risk analytics is scarcity of data, especially information related to a third party’s past compliance alerts and/or sanctions. Even in cases where such information exists from past compliance endeavors, it is often incomplete or systematically biased. D&B has the world’s largest commercial information databases, enhanced with its proprietary algorithms for processing and managing both structured and unstructured data. While this helps alleviate the above-mentioned lack of key information, the incompleteness and inaccuracy of information collected by data providers still significantly hinder statistical modeling results in many cases. To this end, D&B developed proprietary methods that have been proven over the course of recent years. To demonstrate the challenges and benefits of this type of analytics, this presentation will focus on a case study where third-party compliance risk analysis is applied for anti-bribery and anti-corruption purposes. The first step in the analysis is to identify Third-Party Intermediaries (TPIs), the agents or middlemen in business activities. TPIs are more likely to be exposed to bribery/corruption risks. Once TPIs are identified, a business can proceed with the required compliance due diligence. In this case study, all known TPIs are self-reported, and thus subject to incomplete and systematically biased information. In other words, supervised learning methods based on such inaccurate/incomplete data inputs will miss those TPIs that are not reported.
To capture TPIs that are not self-reported, unsupervised learning algorithms, married with business knowledge, are used to segment the third-party portfolio into distinct groups. Third parties in the top segment(s), which include most of the known TPIs, are treated as TPIs. At this point, supervised machine learning kicks in to analyze and score the entire third-party portfolio. The TPI Compliance Risk Model is then created to score TPIs based on historical third-party information, which includes various types of compliance alerts and sanctions; D&B and its data partners collect and store this data for exactly these purposes. Because of the constant changes in the compliance field, and to further improve model accuracy, third-party compliance risk analytics employs a recursive and adaptive learning process. With the help of this new type of analytics, business compliance practices can now gain new insights on the known TPIs and the third-party compliance alerts/sanctions related to them. To capture new trends and any pattern changes, new information is continuously fed into the TPI Compliance Risk Model, resulting in updated and more accurate risk predictions. Using the above analysis methods and their iterations, even incomplete business information can be effectively used for third-party compliance risk analytics.
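The two-stage idea of expanding an incomplete self-reported label set via clustering can be sketched as follows. This is a toy stand-in, not D&B's proprietary method: the feature names, data points, and the single centroid-assignment step (in place of full unsupervised segmentation) are all illustrative assumptions.

```python
# Each third party: hypothetical features
# (intermediary_activity_share, transaction_opacity), both in [0, 1].
portfolio = {
    "A": (0.90, 0.80), "B": (0.85, 0.90),  # known, self-reported TPIs
    "C": (0.80, 0.85),                     # unreported, resembles TPIs
    "D": (0.10, 0.20), "E": (0.20, 0.10),  # ordinary suppliers
}
known_tpis = {"A", "B"}

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroid(names):
    pts = [portfolio[n] for n in names]
    return tuple(sum(coord) / len(pts) for coord in zip(*pts))

# One k-means-style assignment step, seeding one centroid from the known
# TPIs and one from the rest -- a stand-in for full unsupervised
# segmentation of the portfolio into distinct groups.
c_tpi = centroid(known_tpis)
c_other = centroid(set(portfolio) - known_tpis)
inferred_tpis = {n for n, p in portfolio.items()
                 if sq_dist(p, c_tpi) < sq_dist(p, c_other)}
```

The unreported third party "C" lands in the TPI segment because its features resemble the known TPIs; the expanded label set would then feed the supervised scoring stage.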
Jonathan Yan is a Senior Director of Advanced Analytics Solutions of Dun & Bradstreet located in Short Hills, New Jersey. In his current role, he is leading global risk analytics projects. Having previously worked in analytical positions at New York Life Insurance and Dreyfus, Jonathan has 20+ years of professional experience managing research teams and projects. His expertise includes global risk management solutions, statistical modeling, data management and analysis. Jonathan has an extensive background providing innovative analytical solutions to global businesses in various industries. Jonathan holds a PhD degree from the University of Connecticut. Jonathan’s research papers have been presented at conferences of the International Communication Association (ICA), his article on advertising impact on audience reception was published in the Journal of Communication, and he has a patent pending on global risk modeling algorithms.
Senior Data Scientist
How to Reliably Release a New Feature for a Mobile App? A Staged Rollout Framework with Comparative Analytics
As Uber grows, the company needs to continue to move fast while ensuring quality and reliability during the feature rollout process. Without a guided process, a buggy feature can cause large-scale app crashes and user experience degradation. Without effective analytics and monitoring, a feature’s negative impact can go unnoticed for a long time. To ensure reliability during feature rollout, we developed an innovative procedure to control the blast radius of the rollout and continuously monitor the impact on app health using comparative analytics. The takeaways of this talk include: • The staged rollout framework for new feature releases to control blast radius • The practical challenges we encountered in continuously monitoring the impact and detecting potential regressions during the feature rollout process • The comparative analytics developed for continuously monitoring the feature rollout, using a sequential likelihood ratio test with nonparametric variance estimation
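As background for the last bullet, here is a minimal sketch of a classical sequential probability ratio test on crash indicators, one simple member of the sequential likelihood-ratio family the talk builds on. The crash rates, error tolerances, and decision labels are illustrative assumptions, not Uber's production system (which also uses nonparametric variance estimation).

```python
import math

# H0: healthy crash rate p0 vs H1: degraded crash rate p1 (assumed values).
p0, p1 = 0.01, 0.03
alpha, beta = 0.05, 0.05           # tolerated error probabilities

upper = math.log((1 - beta) / alpha)   # cross above -> accept H1, halt rollout
lower = math.log(beta / (1 - alpha))   # cross below -> accept H0, continue

def monitor(sessions):
    """sessions: iterable of 0/1 crash indicators, in arrival order."""
    llr = 0.0
    for crashed in sessions:
        # Accumulate the log-likelihood ratio after every session,
        # so a regression can be caught as early as the data allows.
        if crashed:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "halt"          # strong evidence of a regression
        if llr <= lower:
            return "continue"      # feature looks healthy at this stage
    return "keep monitoring"       # not enough evidence either way
```

Unlike a fixed-horizon test, this check is valid at every session, which is what makes it suitable for continuous monitoring during a staged rollout: each stage widens the blast radius only after the test fails to find a regression.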
Zhenyu Zhao is a senior data scientist working on the data science platform at Uber. He received his PhD in Statistics from Northwestern University. Before joining Uber, he worked as a data scientist at Yahoo on experimentation and data insights. Zhenyu has hands-on experience in online experimentation for multiple mobile app and web products. He also contributes to data science methodology and tool development for the internal platform. He has developed several novel approaches in experimentation to tackle practical challenges and satisfy business needs, with a focus on delivering actionable, high-quality insights.