About me
I am a postdoc in the Data Systems Group at the University of Waterloo. I am working with Renée J. Miller on data lakes. I worked with Tamer Özsu on graph data management. Previously, I was a postdoc at the University of Lyon (Lyon 1), working with Angela Bonifati. I received my PhD in computer science from the University of Clermont Auvergne, advised by Farouk Toumani. I also worked closely with Inria and Oracle Labs (Zurich).
I work on data systems. My recent focus is on data systems and large language models, collectively referred to as Data Intelligence Platforms. My research interests include the following:
- Query Processing: indexing methods; stream processing; multimodal and heterogeneous data; natural language queries.
- Data Lakes: data discovery; vector and semantic search; metadata management; data integration; open lakehouse and AI.
Key Works
- Ada-ef (SIGMOD ’26): Adaptive vector search with declarative recall.
- BIC (VLDB ’24, KDD ’21): Incremental computing framework for sliding window query processing over data streams.
- SUDAF (ACM TODS 2024): Processing and optimizing queries with user-defined aggregate functions.
- RLC Index (ICDE ’23): Indexing multi-hop, recursive relationships on graphs.
- LazyVLM: Scaling vision-language-model–based video analytics.
Publications
DBLP: Chao Zhang 0045
Distribution-Aware Exploration for Adaptive HNSW Search.
Chao Zhang, Renée J. Miller.
SIGMOD 2026 (accepted for publication).

Indexing Techniques for Graph Reachability Queries.
Chao Zhang, Angela Bonifati, M. Tamer Özsu.
ACM CSUR 2025 (accepted for publication). [Extended version]

Incremental Sliding Window Connectivity over Streaming Graphs.
Chao Zhang, Angela Bonifati, M. Tamer Özsu.
VLDB 2024. [Extended version] [Code]

Sharing Queries with Nonequivalent User-Defined Aggregate Functions.
Chao Zhang, Farouk Toumani.
ACM TODS 2024, Volume 49, Issue 2.

An Overview of Reachability Indexes on Graphs.
Chao Zhang, Angela Bonifati, M. Tamer Özsu.
SIGMOD 2023 (Tutorial). [Slides (short)] [Slides (full)]

A Reachability Index for Recursive Label-Concatenated Graph Queries.
Chao Zhang, Angela Bonifati, Hugo Kapp, Vlad Ioan Haprian, Jean-Pierre Lozi.
ICDE 2023. [Extended version] [Code] [Slides]

Efficient Incremental Computation of Aggregations over Sliding Windows.
Chao Zhang, Reza Akbarinia, Farouk Toumani.
KDD 2021. [Code]

SUDAF: Sharing User-Defined Aggregate Functions.
Chao Zhang, Farouk Toumani, Bastien Doreau.
ICDE 2020 (Demo).

Sharing Computations for User-Defined Aggregate Functions.
Chao Zhang, Farouk Toumani.
EDBT 2020.

Symmetric and Asymmetric Aggregate Functions in Massively Parallel Computing.
Chao Zhang.
VLDB PhD Workshop 2017.
Preprints & Submissions
LazyVLM: Neuro-Symbolic Approach to Video Analytics.
Xiangru Jian, Wei Pang, Zhengyuan Dong, Chao Zhang, M. Tamer Özsu.

GRAPHOMNI: A Comprehensive and Extendable Benchmark Framework for Large Language Models on Graph-theoretic Tasks.
Hao Xu, Xiangru Jian, Xinjian Zhao, Wei Pang, Chao Zhang, Suyuchen Wang, Qixin Zhang, Zhengyuan Dong, Joao Monteiro, Bang Liu, Qiuzhuang Sun, Tianshu Yu.

Low-Latency Sliding Window Connectivity.
Chao Zhang, Angela Bonifati, M. Tamer Özsu.

Parallelization of Incremental Aggregations over Sliding Windows.
Chao Zhang, Reza Akbarinia, Farouk Toumani.
Presentations
- Towards Efficient and Trustworthy Query Processing over Heterogeneous Data Lakes.
  - School of Information Technology, York University, November 20, 2025.
  - Department of Computer Science, University of Lyon (Lyon 1) / CNRS LIRIS, October 13, 2025.
- Towards High-Throughput and Low-Latency Sliding Window Processing over Data Streams.
  - School of Computer Science, University of Guelph, June 18, 2025.
  - School of Information Technology, York University, April 2, 2024.
- Big Graph Processing Systems (with Angela Bonifati).
  - MDD 2022 Summer School, Bastia, June 19-23, 2022.
  - eBISS 2022 Summer School, Cesena, July 4-8, 2022.
Service
- Program Committee Member:
  - SIGMOD: 2025, 2024, 2022
  - VLDB: 2026, 2025, 2024, 2023
  - ICDE: 2026
  - ACM SoCC: 2025, 2024, 2023
  - BDA: 2021 (Demo)
- Invited Journal Reviewer:
  - ACM TODS, VLDBJ, IEEE TKDE
Awards
- Best Paper Award at BDA 2021, French Data Management Community.
- Best PhD Thesis (runner-up) at BDA 2020, French Data Management Community.
- Full scholarship for the 1st Big Sky Earth training school at DLR, Germany, April 4-9, 2016.
Teaching
- CS 348: Introduction to Database Management at the University of Waterloo, Winter 2023.
- Big Data Processing: Apache Spark in Action at the University of Clermont Auvergne, Spring 2022.
- Semantic Web (labs) at the University of Clermont Auvergne, Fall 2021.