Recently, the field of Artificial Intelligence
(AI) has been experiencing a
resurgence. AI broadly covers
a wide swath of techniques,
which include logic-based
approaches, probabilistic
graphical models, and machine
learning approaches such as
deep learning. Advances in
specialized hardware
capabilities (e.g., Graphics
Processing Units (GPUs),
Tensor Processing Units
(TPUs), Field-Programmable
Gate Arrays (FPGAs), etc.),
software ecosystems (e.g.,
programming languages such as
Python, Data Science frameworks, and
accelerated ML libraries), and
systems infrastructure (e.g.,
cloud servers with AI
accelerators) have led to
widespread adoption of AI
techniques in a variety of
domains. Examples of such
domains include image
classification, autonomous
driving, automatic speech
recognition, and
conversational systems (e.g.,
chatbots). AI solutions not
only support multiple data
types (e.g., images, speech,
or text), but are also
available in various
configurations and settings,
from personal devices to
large-scale distributed
systems.
In spite
of the wide-ranging techniques
and applications of AI, their
interactions with data
management systems remain in
their infancy. Database management
systems have been, for a long
time, simply used as
repositories for feeding
inputs and storing
results. Only very recently
have we started seeing some
new efforts in using AI
techniques in data management
systems, e.g., enabling
natural language interfaces to
relational databases and
applying machine learning
techniques for query
optimization. However, a lot
more needs to be done to fully
exploit the power of AI for
data management systems and
workloads.
aiDM is a one-day workshop that will bring
together people from academia
and industry to discuss
various ways of integrating AI
techniques with data
management systems. The
primary goal of the workshop
is to explore opportunities
for using AI techniques in
enhancing various components
of data management systems,
such as user interfaces, tooling, performance optimization, and support
for new query types and workloads. Special emphasis will be given to
transparent exploitation of AI techniques using existing data
management infrastructures for enterprise-class workloads. We hope this workshop will
identify important areas of research and spur new efforts in this
emerging field.
The goal of the workshop is to take a holistic view of various AI technologies and
investigate how they can be applied to different components of an end-to-end data management
pipeline. Special emphasis will be given to how AI techniques can be used for enhancing
user experience by reducing complexity in tools, or providing newer insights, or providing
better user interfaces. Topics of interest include, but are not restricted to:
- Characterizing different AI approaches: Logic-based, probabilistic graphical models, and machine learning/deep learning approaches
- Evaluation of different learning approaches: unsupervised,
self-supervised, supervised, or reinforcement learning, transfer learning,
zero-shot learning, adversarial networks, and deep probabilistic models
- New AI-enabled business intelligence (BI) queries for relational databases
- Natural language enablement (e.g., queries, result summarization,
chatbot interfaces, etc.)
- Explainability and interpretability
- Fairness of AI-based system components
- Integration with Data Science and Deep Learning toolkits (e.g.,
sklearn, TensorFlow, PyTorch, ONNX, etc.)
- Evaluating quality of approximate results from AI-enabled queries
- Supporting multiple datatypes (e.g., images, time-series data, etc.)
- Supporting semi-structured, streaming, and graph databases
- Reasoning over knowledge bases
- Data exploration and visualization
- Integrating structured and unstructured data sources
- AI-enabled data integration strategies (e.g., entity resolution,
schema matching, etc.)
- Reinforcement learning for database tuning
- Impact of AI on tooling, e.g., ETL or data cleaning
- Performance implications of AI-enabled queries
- Case studies of AI-accelerated workloads
- Social implications of AI-enabled databases (e.g., detection and
elimination of bias)
- Learned data structures, database algorithms or systems
components
- AI-enabled databases for managing and supporting AI workloads
- AI strategies for data provenance, access control, anomaly detection and cyber security
- Experiences with database systems employing AI-enhanced components and interaction among AI-enhanced components
Session 1 (8.30-10am PST) (Chair: Yael Amsterdamer)
- Introductory Remarks
Yael Amsterdamer, Department of Computer Science, Bar-Ilan University
- AutoCure: Automated Tabular Data Curation Technique for ML Pipelines
Mohamed Abdelaal, Software AG; Rashmi Koparde, Otto von Guericke University Magdeburg; and Harald Schoening, Software AG
- Tuple Bubble: Learned Tuple Representation for Tunable Approximate Query Processing
Damjan Gjurovski and Sebastian Michel, RPTU Kaiserslautern-Landau
- Adversarial and Clean Data Are Not Twins
Zhitao Gong, Auburn University and Wenlu Wang, Texas A&M University-Corpus Christi
- Zero-Shot Cost Models for Parallel Stream Processing
Pratyush Agnihotri, Technical University of Darmstadt; Boris Koldehofe, Technical University of Ilmenau; Carsten Binnig and Manisha Luthra, Technical University of Darmstadt and DFKI
Coffee Break
(10-10.30am PST)
Session 2 (10.30am - 12pm PST) (Chair: Oded Shmueli)
- Keynote 1: Reasoning in Natural Language, Dan Roth, VP/Distinguished Scientist, AWS AI Labs and the Eduardo D. Glandt Distinguished Professor, CIS, University of Pennsylvania
Dan Roth is the Eduardo D. Glandt Distinguished Professor at the Department of Computer and Information Science, University of Pennsylvania, a VP/Distinguished Scientist at AWS AI Labs, and a Fellow of the AAAS, the ACM, AAAI, and the ACL.
In 2017 Roth was awarded the John McCarthy Award, the highest award the AI community gives to mid-career AI researchers. Roth was recognized “for major conceptual and theoretical advances in the modeling of natural language understanding, machine learning, and reasoning.”
Roth has published broadly in machine learning, natural language processing, knowledge representation and reasoning, and learning theory. He was the Editor-in-Chief of the Journal of Artificial Intelligence Research (JAIR), has served as the Program Chair for AAAI, ACL and CoNLL, and as a Conference Chair for a few top conferences. Roth has been involved in several startups; most recently he was a co-founder and chief scientist of NexLP, a startup that leverages the latest advances in Natural Language Processing (NLP), Cognitive Analytics, and Machine Learning in the legal and compliance domains. NexLP was acquired by Reveal in 2020. Prof. Roth received his B.A. summa cum laude in Mathematics from the Technion, Israel, and his Ph.D. in Computer Science from Harvard University in 1995.
- OmniscientDB: A Large Language Model-Augmented DBMS That Knows What Other DBMSs Do Not Know
Matthias Urban, Duc Dat, and Carsten Binnig, Technical University of Darmstadt
Lunch Break (12-1.30pm PST)
Session 3 (1.30-3pm PST) (Chair: Yael Amsterdamer)
- (Keynote 2)
Jun Wan, Databricks
- Learned Spatial Data Partitioning
Keizo Hori, Yuya Sasaki, Daichi Amagata, Yuki Murosaki, and Makoto Onizuka, Osaka University
Coffee Break (3-3.30pm PST)
Session 4 (3.30-5pm PST)
- (Panel) Foundation Models and Databases: Opportunities and Challenges, Moderator: Rajesh Bordawekar
Workshop Steering Committee
- Rajesh Bordawekar, IBM T.J. Watson Research Center
- Oded Shmueli, Hirundo Ltd., and Emeritus Professor at Technion - Israel Institute of Technology
Workshop Program Chairs
Program Committee
- Zainab Abbas, KTH
- Laure Berti-Equille, IRD
- Cansu Kaynak Kocberber, Oracle
- Nick Koudas, University of Toronto
- Manisha Luthra, TU Darmstadt
- Umar Farooq Minhas, Apple
- Felix Naumann, HPI
- Rekha Singhal, Tata Consultancy Services
- Anthony Tomasic, CMU
- Brit Youngmann, MIT
Important Dates
- Paper Submission: Friday, 17th March 2023, 12 pm PST
- Notification of Acceptance: Monday, 17th April, 2023
- Camera-ready Submission: Monday, 8th May, 2023
Submission Site
All submissions will be handled electronically via EasyChair.
Formatting Guidelines
We will use the same document templates as the SIGMOD/PODS'23
conferences (the
ACM format). It is the authors' responsibility to ensure that
their submissions adhere
strictly to the ACM
format. In particular, it is not allowed to modify the format with the objective of squeezing in more material. Submissions that do not comply with the formatting detailed here will be rejected without review.
The length of a full paper is limited to 12
pages, with unlimited pages of references. However, shorter papers
(4 or 8 pages)
are encouraged as
well.
All accepted papers will be
indexed via the ACM Digital
Library and available for
download from the workshop
webpage in the digital
library.