aiDM 2019
Second International Workshop on Exploiting Artificial Intelligence Techniques for Data Management (aiDM)

Friday, July 5, 2019
In conjunction with SIGMOD/PODS 2019
Workshop Overview

Recently, the Artificial Intelligence (AI) field has been experiencing a resurgence. AI covers a wide swath of techniques, including logic-based approaches, probabilistic graphical models, and machine learning/deep learning approaches. Advances in hardware capabilities, such as Graphics Processing Units (GPUs), software components (e.g., accelerated libraries, programming frameworks), and systems infrastructures (e.g., GPU-enabled cloud providers) have led to the widespread adoption of AI techniques in a variety of domains. Examples of such domains include image classification, autonomous driving, automatic speech recognition (ASR), and conversational systems (chatbots). AI solutions not only support multiple datatypes (e.g., free text, images, or speech), but are also available in various configurations, from personal devices to large-scale distributed systems.

Despite the wide-ranging applications of AI techniques, their interaction with data management systems remains in its infancy. At present, a majority of database management systems (DBMS) are used primarily as repositories for feeding input data and storing results. Recently, there has been some activity in using AI techniques in data management systems, e.g., enabling natural language interfaces to relational databases and applying machine learning techniques for query optimization. However, a lot more needs to be done to fully exploit the power of AI for data management workloads.

We propose to organize a one-day workshop that will bring together people from academia and industry to discuss various ways of integrating AI techniques with data management systems. The primary goal of the proposed workshop is to explore opportunities for AI techniques to enhance different components of data management systems, e.g., user interfaces, tooling, performance optimizations, new query types, and workloads. Special emphasis will be given to transparent exploitation of AI techniques using existing data management systems for enterprise-class workloads. We hope this workshop will identify important areas of research and spur new efforts in this emerging field.

Topics of Interest

The goal of the workshop is to take a holistic view of various AI technologies and investigate how they can be applied to different components of an end-to-end data management pipeline. Special emphasis will be given to how AI techniques can enhance the user experience by reducing complexity in tools, providing newer insights, or providing better user interfaces. Topics of interest include, but are not restricted to:

  • Characterizing different AI approaches: logic-based approaches, probabilistic graphical models, and machine learning/deep learning approaches
  • Evaluation of different learning approaches: unsupervised learning, supervised or reinforcement learning, transfer learning, zero-shot learning, adversarial networks, and deep probabilistic models
  • New AI-enabled business intelligence (BI) queries for relational databases
  • Natural language queries and chatbot interfaces
  • Natural language result summarization
  • Issues with explainability/interpretability
  • Evaluating quality of approximate results from AI-enabled queries
  • Supporting multiple datatypes (e.g., images or time-series data)
  • Supporting semi-structured, streaming, and graph databases
  • Reasoning over knowledge bases
  • Data exploration and visualization
  • Integrating structured and unstructured data sources
  • AI-enabled data integration strategies
  • Reinforcement learning for database tuning
  • Impact of AI on tooling, e.g., ETL or data cleaning
  • Performance implications of AI-enabled queries
  • Case studies of AI-accelerated workloads
  • Social implications of AI-enabled databases (e.g., de-biasing)


Workshop Co-Chairs

For questions regarding the workshop, please send email to bordaw AT us DOT ibm DOT com.

Program Committee

  • Uri Alon, Technion
  • Bortik Bandyopadhyay, Ohio State University
  • Raul Castro Fernandez, MIT
  • Bugra Gedik, Bilkent University
  • Tin Kam Ho, IBM Cloud and Cognitive Software
  • Lipyeow Lim, University of Hawaii
  • Sharad Mehrotra, University of California, Irvine
  • Ryan Marcus, Brandeis University
  • Subhabrata Mukherjee, Amazon
  • Sebastian Schelter, NYU
  • Shaikh Quader, IBM Cloud and Cognitive Software
  • Jennifer Sleeman, University of Maryland, Baltimore County
  • Kavitha Srinivas, IBM Research
  • Seema Sundara, Oracle Labs

Workshop Program
9 am-6 pm, Administratiezaal

Session 1 (9-10.30 am)

  • Keynote Presentation: Exploratory Data Analysis - ML to the rescue

    Prof. Tova Milo, Tel Aviv University
  • Abstract: Exploratory Data Analysis (EDA) is a critical procedure in any data-driven discovery process. Yet it is known to be a difficult process, especially for non-expert users, since it requires profound analytical skills and familiarity with the data domain. In this work, we will examine the use of Machine Learning techniques, in particular Deep Reinforcement Learning (DRL), to simplify Exploratory Data Analysis. We suggest an end-to-end framework architecture, coupled with an initial implementation of each component. The goal of the talk is to encourage the exploration of DRL models and techniques for facilitating a full-fledged, autonomous solution for EDA.

  • Scheduling OLTP Transactions via Learned Abort Prediction, Yangjun Sheng, Anthony Tomasic, Tieying Zhang, and Andy Pavlo, Carnegie Mellon University

Session 2 (11 am-12.30 pm)

  • Considerations for Handling Updates in Learned Index Structures, Ali Hadian and Thomas Heinis, Imperial College, London
  • Cardinality Estimation with Local Deep Learning Models, Lucas Woltmann, Claudio Hartmann, Maik Thiele, Dirk Habich and Wolfgang Lehner, TU Dresden
  • Towards Learning a Partitioning Advisor with Deep Reinforcement Learning, Benjamin Hilprecht and Carsten Binnig, TU Darmstadt, and Uwe Röhm, The University of Sydney

Session 3 (2-3.30 pm)

  • Keynote Presentation: Our AI world: The rise of data and the death of coding

    Sam Lightstone, IBM
  • Abstract: IBM CTO for Data and IBM Fellow Sam Lightstone will discuss how AI is fundamentally changing computer science and the practice of coding, what Machine Learning means today, recent advances in hardware and software, and breakthrough innovations that are being researched. Sam will discuss how the advent of AI will change the use and objectives of data systems. He'll summarize which tools and languages are commonly used, as well as some emerging ones for powerful simplification and improved scalability. Hear about deep fakes, Generative Adversarial Networks (GANs), neurosynaptic computing, and much more.

  • Interpreting Deep Learning Models for Entity Resolution: An Experience Report Using LIME, Vincenzo Di Cicco and Donatella Firmani, Roma Tre University, Nick Koudas, University of Toronto, Paolo Merialdo, Roma Tre University, and Divesh Srivastava, AT&T Labs-Research

Session 4 (4.30-6 pm)

  • Termite: A System for Tunneling Through Heterogeneous Data, Raul Castro Fernandez and Samuel Madden, MIT
  • Learning to Optimize Federated Queries, Liqi Xu, University of Illinois Urbana-Champaign, and Richard Cole and Daniel Ting, Tableau
  • Question Answering via Web Extracted Tables, Bhavya Karki, Fan Hu, Nithin Haridas, Suhail Barot, Zihua Liu, Lucile Callebert, Matthias Grabmair and Anthony Tomasic, Carnegie Mellon University

Submission Instructions

Important Dates 

  • Paper Submission: Monday, 18th March, 2019
  • Notification of Acceptance: Friday, 19th April, 2019
  • Camera-ready Submission: Friday, 3rd May, 2019
  • Workshop Date: Friday, 5th July, 2019

Submission Site 

All submissions will be handled electronically via EasyChair.

Formatting Guidelines 

We will use the same document templates as the SIGMOD/PODS'19 conferences (using the 2017 ACM format).

It is the authors' responsibility to ensure that their submissions adhere strictly to the 2017 ACM format. In particular, authors may not modify the format in order to squeeze in more material. Submissions that do not comply with the formatting detailed here will be rejected without review.

Full papers are limited to 8 pages. Shorter papers (4 pages) are encouraged as well.

All accepted papers will be indexed in the ACM Digital Library and will be available for download from the workshop webpage in the Digital Library.