Drawing Causal Inference from Big Data

March 26-27, 2015
National Academy of Sciences
Washington, DC

Speakers
Edoardo Airoldi, Susan Athey, Leon Bottou, Peter Bühlmann, Susan Dumais, Dean Eckles, James Fowler, Michael Hawrylycz, David Heckerman, Jennifer Hill, Michael Jordan, Steven Levitt, David Madigan, Judea Pearl, Thomas Richardson, James M Robins, Bernhard Schölkopf, Jasjeet Sekhon, Cosma Shalizi, Richard Shiffrin, John Stamatoyannopoulos, Hal Varian, Bin Yu

Edoardo Airoldi

Edoardo Airoldi is well known for his research that explores modeling, inferential, and other methodological issues that often arise in applied problems where network data (i.e., measurements on pairs of units, or tuples more generally) need to be considered, and standard statistical theory and methods are no longer adequate to support the goals of the analysis. More broadly, his research encompasses statistical methodology and theory with application to molecular biology and computational social science. His areas of technical interest include approximation theorems, inequalities, convex and combinatorial optimization, and geometry.

Dr. Airoldi is an Associate Professor of Statistics and Director of Graduate Studies at Harvard University. He received a Ph.D. from Carnegie Mellon University in 2007, working at the intersection of statistical machine learning and computational social science with Stephen Fienberg and Kathleen Carley. His PhD thesis explored modeling approaches and inference strategies for analyzing social and biological networks. He was a postdoctoral fellow in the Lewis-Sigler Institute for Integrative Genomics and the Department of Computer Science at Princeton University working with Olga Troyanskaya and David Botstein.

Susan Athey

Susan Athey is well regarded for her research in the areas of industrial organization, microeconomic theory, and applied econometrics. Her current research focuses on the design of auction-based marketplaces and the economics of the internet, primarily on online advertising and the economics of the news media. She has also studied dynamic mechanisms and games with incomplete information, comparative statics under uncertainty, and econometric methods for analyzing auction models.

Dr. Athey is a Professor of Economics at Stanford University and an NAS Member. She is also a Senior Fellow at the Stanford Institute for Economic Policy Research. She received her bachelor’s degree from Duke University and her PhD from Stanford, and she holds an honorary doctorate from Duke University. She previously taught in the economics departments at MIT, Stanford and Harvard. At the age of 36, Professor Athey received the John Bates Clark Medal. The Clark Medal was awarded by the American Economic Association every other year to “that American economist under the age of forty who is adjudged to have made the most significant contribution to economic thought and knowledge.”

Leon Bottou

Leon Bottou is best known for his work in machine learning and data compression. His work presents stochastic gradient descent as a fundamental learning algorithm. He is also one of the main creators of the DjVu image compression technology (together with Yann LeCun and Patrick Haffner), and the maintainer of DjVuLibre, the open source implementation of DjVu. He is the original developer of the Lush programming language.

Dr. Bottou is currently a researcher with Facebook AI Research. He obtained the Diplôme d'Ingénieur from École Polytechnique, and a PhD from Université Paris-Sud. In 1995, he returned to Bell Laboratories, where he developed a number of new machine learning methods, such as Graph Transformer Networks, and applied them to handwriting recognition and OCR. In 2010 he joined the Microsoft adCenter in Redmond, WA, and in 2012 became a Principal Researcher at Microsoft Research in New York City. He is associate editor of the Journal of Machine Learning Research, the IEEE Transactions on Pattern Analysis and Machine Intelligence, and Pattern Recognition Letters.

Peter Bühlmann

Peter Bühlmann is a Professor in the Department of Mathematics at ETH Zurich and the Chair of the Mathematics department. His research interests include statistics, machine learning, and computational biology, ranging from methodology and mathematical theory to interdisciplinary research in biology and bio-medicine.

Dr. Bühlmann received his PhD from ETH Zurich. He is a Group Leader of the Competence Center for the Systems of Physiology and Metabolic Diseases and a Member of the German-Swiss Research Group FOR916: Statistical Regularization and Qualitative Constraints. He was Co-Editor of the Annals of Statistics from 2010 to 2012. In 2014 he was awarded the honor of Distinguished Lecturer at the Chinese Academy of Sciences and recognized as a Highly Cited Researcher in Mathematics by Thomson Reuters. He was also the 2013 winner of the Winton Research Prize in London.

Susan Dumais

Susan Dumais is a Distinguished Scientist and Deputy Managing Director at Microsoft Corporation. She is best known for her work in algorithms and interfaces for improved information retrieval, as well as general issues in human-computer interaction. Susan has been at Microsoft Research since July 1997. In 2014 she was honored with the ACM-W Athena Lecture Award and the Tony Kent Strix Award.

Dr. Dumais is a member of the National Academy of Engineering. Her current research focuses on gaze-enhanced interaction, the temporal dynamics of information systems, user modeling and personalization, novel interfaces for interactive retrieval, and search evaluation. Previous research studied a variety of information access and management challenges, including personal information management, desktop search, question answering, text categorization, collaborative filtering, interfaces for improving search and navigation, and user/task modeling. She has worked closely with several Microsoft groups (Bing, Windows Desktop Search, SharePoint Portal Server, and Office Online Help) on search-related innovations. Prior to Microsoft, she co-developed a statistical method for concept-based retrieval known as Latent Semantic Indexing.

Dean Eckles

Dean Eckles is a social scientist, statistician, and member of the Data Science team at Facebook. He studies how interactive technologies affect human behavior by mediating, amplifying, and directing social influence, as well as the statistical methods used to study these processes. His current work uses large field experiments and observational studies. His research appears in peer-reviewed proceedings and journals in computer science, marketing, and statistics. Dean holds degrees from Stanford University in philosophy (BA), cognitive science (BS, MS), statistics (MS), and communication (PhD).

Dr. Eckles completed his PhD in Clifford Nass’s CHIMe Lab at Stanford University. He was previously a member of the research staff at Nokia Research Center, Palo Alto. Before joining Nokia, he worked with BJ Fogg on research in mobile persuasive technologies in the Stanford Persuasive Technology Lab and worked at Yahoo! Research Berkeley, designing and studying mobile photo sharing apps and services.

James Fowler

James Fowler earned his PhD from Harvard in 2003 and is currently a Professor at the University of California, San Diego. His work lies at the intersection of the natural and social sciences, with a focus on social networks, behavior, evolution, politics, genetics, and big data.

Dr. Fowler was recently named a Fellow of the John Simon Guggenheim Foundation, one of Foreign Policy's Top 100 Global Thinkers, TechCrunch's Top 20 Most Innovative People, Politico's 50 Key Thinkers, Doers, and Dreamers, and Most Original Thinker of the year by The McLaughlin Group. He has also appeared on The Colbert Report. His research has been featured in numerous best-of lists including New York Times Magazine's Year in Ideas, Time's Year in Medicine, Discover Magazine's Year in Science, and Harvard Business Review's Breakthrough Business Ideas. Together with Nicholas Christakis, James wrote a book on social networks for a general audience called Connected.

Michael Hawrylycz

Mike Hawrylycz joined the Allen Institute in 2003. He is responsible for the direction of the data analysis and annotation effort. Hawrylycz has worked in a variety of applied mathematics and computer science areas, addressing challenges in consumer and investment finance, electrical engineering and image processing, and computational biology and genomics.

Dr. Hawrylycz received his Ph.D. in applied mathematics at the Massachusetts Institute of Technology. He subsequently was a post-doctoral researcher in the Computer Research and Applications Group at the Los Alamos National Laboratory. He received his Masters in Mathematics at Wesleyan University. He is a member of the Society for Industrial and Applied Mathematics as well as the American Statistical Association and the Society for Neuroscience. He has served as a Review Editor for Frontiers in Neurogenomics and a Reviewer for the Journal of Neuroscience, Nature Biotechnology and Physiological Genomics.

David Heckerman

David Heckerman is currently a Senior Director with Microsoft Corporation. In his early work, he demonstrated the importance of probability theory in Artificial Intelligence, and developed methods to learn graphical models from data, including methods for causal discovery. More recently, he has been developing machine learning and statistical approaches for biological and medical applications, including HIV vaccine design and genomics. At Microsoft, David has developed numerous applications including data-mining tools in SQL Server and Commerce Server, the junk-mail filters in Outlook, Exchange, and Hotmail, handwriting recognition in the Tablet PC, text mining software in SharePoint Portal Server, troubleshooters in Windows, and the Answer Wizard in Office.

Dr. Heckerman began his education with the intent of becoming a physicist, but his interests eventually led him into the medical sciences. While working on his MD at Stanford, he began looking at the problems of Artificial Intelligence. For his PhD work, he submitted an impressive construct he called the “probabilistic expert system,” which led to Microsoft hiring him to build such systems for non-medical applications.

Jennifer Hill

Jennifer Hill is an Associate Professor of Social Sciences at New York University. She works on the development of methods that help us to answer the causal questions that are so vital to policy research and scientific development. In particular, she focuses on situations in which it is difficult or impossible to perform traditional randomized experiments, or when even seemingly pristine study designs are complicated by missing data or hierarchically structured data.

Most recently Dr. Hill has been pursuing two major strands of research. The first focuses on Bayesian nonparametric methods that allow for flexible estimation of causal models without the need for methods such as propensity score matching. The second line of work pursues strategies for exploring the impact of violations of typical assumptions in this work that require that all confounders have been measured. Hill earned her PhD in Statistics at Harvard University in 2000 and completed a post-doctoral fellowship in Child and Family Policy at Columbia University's School of Social Work in 2002.

Michael Jordan

Michael I. Jordan is the Pehong Chen Distinguished Professor in the Department of Electrical Engineering and Computer Science and the Department of Statistics at the University of California, Berkeley. He received his master's degree in Mathematics from Arizona State University, and earned his PhD in Cognitive Science in 1985 from the University of California, San Diego. He was a professor at MIT from 1988 to 1998.

Dr. Jordan’s research interests bridge the computational, statistical, cognitive and biological sciences, and have focused in recent years on Bayesian nonparametric analysis, probabilistic graphical models, spectral methods, kernel machines and applications to problems in distributed computing systems, natural language processing, signal processing and statistical genetics. Prof. Jordan is a member of the National Academy of Sciences, a member of the National Academy of Engineering and a member of the American Academy of Arts and Sciences. He is a Fellow of the American Association for the Advancement of Science. In 2015, he received the David E. Rumelhart Prize.

Steven Levitt

Steve Levitt is the William B. Ogden Distinguished Service Professor of Economics at the University of Chicago, where he directs the Becker Center on Chicago Price Theory. Levitt received his BA from Harvard University in 1989 and his PhD from MIT in 1994. He has taught at Chicago since 1997.

In 2004, Dr. Levitt was awarded the John Bates Clark Medal, given to the most influential economist under the age of 40. In 2006, he was named one of Time magazine's “100 People Who Shape Our World.” Steve co-authored Freakonomics, which spent over 2 years on the New York Times Best Seller list and has sold more than 4 million copies worldwide. SuperFreakonomics, released in 2009, includes brand new research on topics from terrorism to prostitution to global warming. Steve is also the co-author of the popular Freakonomics Blog.

David Madigan

David Madigan received a bachelor’s degree in Mathematical Sciences and a Ph.D. in Statistics, both from Trinity College Dublin. He has previously worked for AT&T Inc., Soliloquy Inc., the University of Washington, Rutgers University, and SkillSoft, Inc. He has over 100 publications in such areas as Bayesian statistics, text mining, Monte Carlo methods, pharmacovigilance and probabilistic graphical models. He is an elected Fellow of the American Statistical Association and of the Institute of Mathematical Statistics.

Dr. Madigan recently completed a term as Editor-in-Chief of Statistical Science. For the past three years, David has worked as a principal investigator on the OMOP research program, making significant contributions to the project's methodological work including the development, implementation, and analysis of a variety of statistical methods applied to various observational databases. His expertise is in the application of statistical methods to large-scale data problems. David is interested in large-scale predictive modeling and statistical analysis of healthcare data.

Judea Pearl

Judea Pearl is a graduate of the Technion-Israel Institute of Technology. He came to the United States for postgraduate work in 1960, and the following year he received a master’s degree in electrical engineering from Newark College of Engineering, now New Jersey Institute of Technology. In 1965, he simultaneously received a master’s degree in physics from Rutgers University and a PhD from the Brooklyn Polytechnic Institute, now Polytechnic Institute of New York University.

Dr. Pearl joined the faculty of UCLA in 1969, where he is currently a professor of computer science and statistics and director of the Cognitive Systems Laboratory. He is known internationally for his contributions to artificial intelligence, human reasoning, and philosophy of science. He is the author of more than 350 scientific papers and three landmark books in his fields of interest. A member of the National Academy of Sciences and the National Academy of Engineering and a founding Fellow of the American Association for Artificial Intelligence, Pearl is the recipient of numerous scientific prizes, including the Turing Award and IEEE Intelligent Systems' AI's Hall of Fame.

Thomas Richardson

Thomas Richardson is Professor and Chair of the Department of Statistics at the University of Washington. He is also an Adjunct Professor in the Departments of Economics and Electrical Engineering and a member of the eScience Steering Committee. He received his BA from the University of Oxford and his MS and PhD from Carnegie Mellon University. He is a Fellow of the Center for Advanced Studies in the Behavioral Sciences at Stanford University. His research interests include graphical models and causality.

Dr. Richardson is known for his research in graphical models, algorithmic model selection, Bayesian inference, causal models, and applications in economics. He was a Visiting Senior Research Fellow at Jesus College in Oxford and was awarded a fellowship at the Institute for Advanced Studies at the University of Bologna.

James M Robins

James M. Robins is an epidemiologist and biostatistician best known for advancing methods for drawing causal inferences from complex observational studies and randomized trials, particularly those in which the treatment varies with time. He is the 2013 recipient of the Nathan Mantel Award for lifetime achievement in statistics and epidemiology.

Dr. Robins graduated in medicine from Washington University in 1976. He is currently Mitchell L. and Robin LaFoley Dong Professor of Epidemiology at Harvard School of Public Health. He has published over 100 papers in academic journals and is an ISI highly cited researcher. In his original paper on causal inference, Robins described two new methods for controlling confounding bias, which can be applied in the generalized setting of time-dependent exposures: the G-formula and G-Estimation of Structural Nested Models. He later introduced a third class of models, Marginal Structural Models, in which the parameters are estimated using inverse probability of treatment weights. He has also contributed significantly to the theory of dynamic treatment regimes, which are of high significance in comparative effectiveness research and personalized medicine.

Bernhard Schölkopf

Bernhard Schölkopf studied physics, mathematics and philosophy in Tübingen and London. He received his doctorate in computer science from the Technical University of Berlin. He was Director and Scientific Member at the Max Planck Institute for Biological Cybernetics in Tübingen. Since May 2011 he has been Director of the Max Planck Institute for Intelligent Systems in Tübingen and Stuttgart. He taught at the Humboldt University of Berlin, the Technical University of Berlin, and in Tübingen, and since 2002 he has been an honorary professor at the Technical University of Berlin. He has received numerous outstanding prizes, most recently the Royal Society Milner Award in 2014 and the 2011 Max Planck Research Award.

Dr. Schölkopf is one of the leading international experts in the field of machine learning. With his research team he developed new learning methods that can detect patterns in observed data. More recently, he has dealt with the problem of causal data analysis and found an interesting link between causality and description complexity.

Jasjeet Sekhon

Jasjeet S. Sekhon is Robson Professor of Political Science and Statistics at the University of California, Berkeley. His current research focuses on methods for causal inference in observational and experimental studies and evaluating social science, public health and medical interventions. Professor Sekhon has done research on elections, voting behavior and public opinion in the United States, multivariate matching methods for causal inference, machine learning algorithms for irregular optimization problems, robust estimators with bounded influence functions, health economic cost effectiveness analysis, and the philosophy and history of inference and statistics in the social sciences.

Dr. Sekhon studied at the University of British Columbia. He earned his MA and PhD at Cornell University. In 2012, he won the Society for Political Methodology Software Award and the Warren Miller Prize. He is a BIDS Co-PI for the Moore/Sloan Data Science Environment.

Cosma Shalizi

Cosma Shalizi is an Associate Professor of Statistics at Carnegie Mellon University. He is best known for his research in statistical inference for complex systems; nonparametric prediction for stochastic processes; causal inference; large deviations and ergodic theory; networks and information flow in neuroscience, economics and social sciences; heavy-tailed distributions; and self-organization.

Dr. Shalizi earned his PhD in theoretical physics from the University of Wisconsin-Madison. As a post-doc, he moved from the mathematics of optimal prediction to devising algorithms to estimate such predictors from finite data, and applying those algorithms to concrete problems. On the algorithmic side, he devised an algorithm, CSSR, which exploits the formal properties of the optimal predictive states to efficiently reconstruct them from discrete sequence data. He also developed a reconstruction algorithm for spatio-temporal random fields. His most recent work falls into the areas of heavy tails, learning theory for time series, Bayesian consistency, neuroscience, network analysis and causal inference, with some overlap between these.

Richard Shiffrin

Richard M Shiffrin heads the Memory and Perception Laboratory in the Department of Psychological and Brain Sciences at Indiana University; the MAPLAB website gives information about present and past lab members and projects. He is a Distinguished Professor and Luther Dana Waterman Professor and has additional appointments in Cognitive Science (which he founded in 1988) and Statistics.

Dr. Shiffrin is a member of the National Academy of Sciences. His research interests are quite broad, more or less covering the fields of Cognitive Science and Psychology. Generally speaking, the research involves empirical studies and quantitative and computational modeling of the results. Current projects are generally tailored toward the interests of the graduate students and postdoctoral researchers in the lab, and the need to carry out research funded by external grants (presently from NSF and AFOSR).

John Stamatoyannopoulos

John Stamatoyannopoulos, M.D., is an Assistant Professor of Genome Sciences and Medicine at the University of Washington School of Medicine. He graduated from Stanford University in 1990 with degrees in Biology, Symbolic Systems, and Classics, and received an M.D. in 1995 from the University of Washington. He completed his residency in Internal Medicine at Brigham and Women's Hospital, Harvard Medical School, and was a fellow in Oncology and Hematology at Dana Farber Cancer Institute and the Massachusetts General Hospital. He was awarded a Howard Hughes Medical Institute Physician-postdoctoral fellowship at Dana Farber. Dr. Stamatoyannopoulos then served as Chief Scientific Officer of the biotechnology company Regulome Corp., and subsequently joined the Departments of Genome Sciences and Medicine (Oncology) at the University of Washington in 2005. He was elected to the American Society for Clinical Investigation in 2009 and is a member of the Editorial Board of Genome Research.

Dr. Stamatoyannopoulos' lab focuses on understanding the large-scale cis-regulatory circuitry of the human genome, and the functional consequences of non-coding genetic variation. He is PI of the UW ENCODE Project, and Director of the Northwest Reference Epigenome Mapping Center.

Hal Varian

Hal R. Varian is the Chief Economist at Google. He is also an emeritus professor at the University of California, Berkeley in three departments: business, economics, and information management. He received his SB degree from MIT in 1969 and his MA in mathematics and Ph.D. in economics from UC Berkeley in 1973. He has also taught at MIT, Stanford, Oxford, Michigan and other universities around the world.

Dr. Varian is a fellow of the Guggenheim Foundation, the Econometric Society, and the American Academy of Arts and Sciences. Professor Varian has published numerous papers in economic theory, industrial organization, financial economics, econometrics and information economics. He is the author of two major economics textbooks which have been translated into 22 languages. He is the co-author of a bestselling book on business strategy, Information Rules: A Strategic Guide to the Network Economy, and wrote a monthly column for the New York Times from 2000 to 2007.

Bin Yu

Bin Yu is a Professor of Statistics, and of Electrical Engineering & Computer Science, at the University of California, Berkeley. She was Chancellor's Professor in Statistics from 2006 to 2011. Her current research interests include statistical inference, machine learning, information theory (Minimum Description Length Principle), as well as data modeling in areas such as remote sensing, internet tomography, sensor networks, neuroscience, bioinformatics, and finance.

Dr. Yu received her B.S. degree in Mathematics from Peking University in 1984, and her M.S. and Ph.D. degrees in Statistics from the University of California at Berkeley in 1987 and 1990 respectively. Her doctoral research was on empirical processes for dependent data and the Minimum Description Length (MDL) Principle. She is an elected Fellow of IEEE, the Institute of Mathematical Statistics (IMS) and the American Statistical Association (ASA). She was a Miller Research Professor, Miller Institute, UC Berkeley. She is an Associate Editor for The Annals of Statistics, the Journal of the American Statistical Association (JASA), and Statistica Sinica, and an Action Editor for the Journal of Machine Learning Research (JMLR). She served on the Board of the IEEE Information Theory Society (two terms) and on the Council of IMS (one term).