VLDB2020: Keynote Speakers

Keynote 1

Title

The Relational Data Borg is Learning

Speaker

Dan Olteanu

Primary

2020-09-01T09:00:00Z

Repeat

2020-09-01T21:00:00Z

Abstract

As we witness the data science revolution, each research community legitimately reflects on its relevance and place in this new landscape. The database research community has at least three reasons to feel empowered by this revolution: the pervasiveness of relational data in data science, the widespread need for efficient data processing, and the new processing challenges posed by data science workloads beyond the classical database workloads. The first two reasons are widely acknowledged as core to the community’s raison d’être. The third reason explains the longevity of the success of relational database management systems: whenever a promising new data-centric technology surfaces, research is under way to show that it can be captured naturally by variations or extensions of the existing relational techniques. Like Star Trek’s Borg Collective co-opting the technology and knowledge of alien species, the Relational Data Borg assimilates ideas and applications from related fields to adapt to new requirements and become ever more powerful and versatile. Unlike the former, the latter moves fast, has great skin complexion, and is reasonably happy. Resistance is futile in either case.

In this talk, I will make the case for a first-principles approach to machine learning over relational databases that guided recent development in database systems and theory. This includes theoretical development on the algebraic and combinatorial structure of relational data processing. It also includes systems development on compilation for hybrid database and learning workloads and on computation sharing across aggregates in learning-specific batches. Such development can dramatically boost the performance of machine learning.

This work is the outcome of extensive collaboration of the author with colleagues from relationalAI (https://www.relational.ai), in particular Mahmoud Abo Khamis, Molham Aref, Hung Ngo, and XuanLong Nguyen, and from the FDB research project (https://fdbresearch.github.io/), in particular Ahmet Kara, Milos Nikolic, Maximilian Schleich, Amir Shaikhha, and Haozhe Zhang.

Bio

Dan Olteanu has recently become Professor for Big Data Science at the University of Zurich after spending over 12 years at the University of Oxford. Over the last two decades, he has published in the areas of database systems, database theory, and AI, contributing to XML query processing, incomplete information and probabilistic databases, factorised databases, in-database machine learning, incremental maintenance for analytics, and the commercial systems LogicBlox and relationalAI. He co-authored the book "Probabilistic Databases" (2011). He served or is serving as associate editor for PVLDB (2012, 2020), IEEE TKDE (2013-2015), ACM TODS (2018-), and the SIGMOD Record database principles column (2019-). He also served, among other roles, as PC vice chair for SIGMOD 2017 and will serve as PC chair for ICDT 2022. He is the recipient of the ICDT 2019 best paper award, a SIGMOD 2018 Distinguished PC member award, an ERC Consolidator grant (2016), and an Oxford Outstanding Teaching award (2009). Some of his recent work on machine learning over relational databases, incremental maintenance, and declarative probabilistic programming has been invited to best-of-conference (ICDT 2016 and 2019, PODS 2018 and 2019) issues of ACM TODS.

Keynote 2

Title

Responsible Data Management

Speaker

Julia Stoyanovich

Primary

2020-09-02T15:00:00Z

Repeat

2020-09-03T03:00:00Z

Abstract

The need for responsible data management intensifies with the growing impact of data on society. One central locus of the societal impact of data is Automated Decision Systems (ADS), socio-legal-technical systems that are used broadly in industry, non-profits, and government. ADS process data about people, help make decisions that are consequential to people’s lives, are designed with the stated goals of improving efficiency and promoting equitable access to opportunity, involve a combination of human and automated decision making, and are subject to auditing for legal compliance and to public disclosure. They may or may not use AI, and may or may not operate with a high degree of autonomy, but they rely heavily on data.

In this talk I hope to convince you that the data management community should play a central role in the responsible design, development, use, and oversight of ADS. I outline a technical research agenda and also argue that, to make progress, we may need to step outside our engineering comfort zone and start reasoning in terms of values and beliefs, in addition to checking results against known ground truths and optimizing for efficiency objectives. This seems high-risk, but one of the upsides is being able to explain to our children what we do and why it matters.

Bio

Julia Stoyanovich is an Assistant Professor at New York University in the Department of Computer Science and Engineering at the Tandon School of Engineering, and the Center for Data Science. Julia's research focuses on responsible data management and analysis practices: on operationalizing fairness, diversity, transparency, and data protection in all stages of the data acquisition and processing lifecycle. She established the Data, Responsibly consortium (https://dataresponsibly.github.io/), and served on the New York City Automated Decision Systems Task Force, by appointment from Mayor de Blasio. Julia developed and is teaching courses on Responsible Data Science at NYU (https://dataresponsibly.github.io/courses/). In addition to data ethics, she works on management and analysis of preference data, and on querying large evolving graphs. She holds M.S. and Ph.D. degrees in Computer Science from Columbia University, and a B.S. in Computer Science and in Mathematics and Statistics from the University of Massachusetts at Amherst. Julia's work has been funded by the NSF, the BSF, and industry. She is a recipient of an NSF CAREER award and of an NSF/CRA CI Fellowship.

Keynote 3

Title

Out-of-order Execution of Query Processing and New Advances in COVID-19 Information

Speaker

Masaru Kitsuregawa

Primary

2020-09-03T08:00:00Z

Repeat

2020-09-04T00:00:00Z

Abstract

This talk will have two parts. The first part, on “out-of-order execution” algorithms, is a long-term vision that has become reality, and the second part, on COVID-19 information, is the current reality that may lead to future advances. First, I will talk about “out-of-order execution” algorithms that we have been working on for more than 10 years. The idea was simple, but it took years to understand the essence of the out-of-order execution principle. We have verified significant speedups for a variety of queries and datasets over disk-based and flash-based database systems. A practical application enabled by out-of-order execution is a healthcare data platform supporting interactive analytics on country-scale insurance claims (approximately two hundred billion records) in Japan. Out-of-order execution reduced the typical query response time from many days to a few minutes, enabling active and productive use by medical and public administration researchers to improve treatments based on outcomes from real data for the entire country. Second, I will describe some of our recent research advances for obtaining and managing COVID-19 information in several urgently useful application areas.

Bio

Masaru Kitsuregawa is Director General of the National Institute of Informatics and Professor at the Institute of Industrial Science, the University of Tokyo. He received his Ph.D. degree from the University of Tokyo in 1983. He has served in various positions, such as President of the Information Processing Society of Japan (2013–2015) and Chairman of the Committee for Informatics, Science Council of Japan (2014–2016). He has wide research interests, especially in database engineering, and has published many papers on high-performance database systems, including the GRACE hash join. He has received many awards, including the ACM SIGMOD E. F. Codd Innovations Award in 2009 and the IEEE Innovation in Societal Infrastructure Award in 2020. He was also awarded the Medal with Purple Ribbon in 2013, the Chevalier de la Légion d’Honneur in 2016, and the Japan Academy Prize in 2020. He is a fellow of ACM, IEEE, IEICE, and IPSJ, and an honorary member of CCF.