About me

I am a postdoc in the Data Systems Group at the University of Waterloo. I am working with Renée J. Miller on data lakes. I worked with Tamer Özsu on graph data management. Previously, I was a postdoc at the University of Lyon (Lyon 1), working with Angela Bonifati. I received my PhD in computer science from the University of Clermont Auvergne, advised by Farouk Toumani. I also worked closely with Inria and Oracle Labs (Zurich).

I work on data systems. My recent focus is on data systems and large language models, collectively referred to as Data Intelligence Platforms. My research interests include the following:

  • Query Processing: heterogeneous data; natural language queries; indexing methods; graph queries; OLAP; stream processing.
  • Data Lakes: data discovery; vector and semantic search; metadata management; data integration; open lakehouse and AI.

I am looking for PhD students, postdocs, Master’s students, research interns, and visiting students. Please feel free to drop me a message.

Key Works

  • Ada-ef (SIGMOD ’26): Adaptive vector search with declarative recall.
  • BIC (VLDB ’24, KDD ’21): Incremental computing framework for sliding window query processing over data streams.
  • SUDAF (ACM TODS 2024): Processing and optimizing queries with user-defined aggregate functions.
  • RLC Index (ICDE ’23): Indexing multi-hop, recursive relationships on graphs.
  • LazyVLM (ICDE ’26 Demo): Scaling vision-language-model–based video analytics.

Publications

DBLP: Chao Zhang 0045

  • Distribution-Aware Exploration for Adaptive HNSW Search.
    Chao Zhang, Renée J. Miller.
    SIGMOD 2026 (accepted for publication). [Extended version] [Code]

  • LazyVLM: Neuro-Symbolic Approach to Video Analytics.
    Xiangru Jian, Wei Pang, Zhengyuan Dong, Chao Zhang, M. Tamer Özsu.
    ICDE 2026 Demo (accepted for publication).

  • GRAPHOMNI: A Comprehensive and Extendable Benchmark Framework for Large Language Models on Graph-theoretic Tasks.
    Hao Xu, Xiangru Jian, Xinjian Zhao, Wei Pang, Chao Zhang, Suyuchen Wang, Qixin Zhang, Zhengyuan Dong, Joao Monteiro, Bang Liu, Qiuzhuang Sun, Tianshu Yu.
    ICLR 2026.

  • Indexing Techniques for Graph Reachability Queries.
    Chao Zhang, Angela Bonifati, M. Tamer Özsu.
    ACM CSUR 2025, Volume 58, Issue 6. [Extended version]

  • Incremental Sliding Window Connectivity over Streaming Graphs.
    Chao Zhang, Angela Bonifati, M. Tamer Özsu.
    VLDB 2024. [Extended version] [Code]

  • Sharing Queries with Nonequivalent User-Defined Aggregate Functions.
    Chao Zhang, Farouk Toumani.
    ACM TODS 2024, Volume 49, Issue 2.

  • An Overview of Reachability Indexes on Graphs.
    Chao Zhang, Angela Bonifati, M. Tamer Özsu.
    SIGMOD 2023 Tutorial. [Slides (short)] [Slides]

  • A Reachability Index for Recursive Label-Concatenated Graph Queries.
    Chao Zhang, Angela Bonifati, Hugo Kapp, Vlad Ioan Haprian, Jean-Pierre Lozi.
    ICDE 2023. [Extended version] [Code] [Slides]

  • Efficient Incremental Computation of Aggregations over Sliding Windows.
    Chao Zhang, Reza Akbarinia, Farouk Toumani.
    KDD 2021. [Code]

  • SUDAF: Sharing User-Defined Aggregate Functions.
    Chao Zhang, Farouk Toumani, Bastien Doreau.
    ICDE 2020 Demo.

  • Sharing Computations for User-Defined Aggregate Functions.
    Chao Zhang, Farouk Toumani.
    EDBT 2020.

  • Symmetric and Asymmetric Aggregate Functions in Massively Parallel Computing.
    Chao Zhang.
    VLDB PhD Workshop 2017.

Preprints & Submissions

Presentations

Invited talks, excluding paper presentations at conferences.

  • Distribution-Aware Exploration for Adaptive HNSW Search.
    • Infrastructure System Lab, ByteDance USA, January 27, 2026.
  • Towards Efficient and Trustworthy Query Processing over Heterogeneous Data Lakes.
    • School of Information Technology, York University, November 20, 2025.
    • Department of Computer Science, University of Lyon (Lyon 1) / CNRS LIRIS, October 13, 2025.
  • Towards High-Throughput and Low-Latency Sliding Window Processing over Data Streams.
    • School of Computer Science, University of Guelph, June 18, 2025.
    • School of Information Technology, York University, April 2, 2024.
  • Big Graph Processing Systems (with Angela Bonifati).
    • MDD 2022 Summer School, Bastia, June 19-23, 2022.
    • eBISS 2022 Summer School, Cesena, July 4-8, 2022.

Service

  • Program Committee Member:
    • SIGMOD: 2025, 2024, 2022
    • VLDB: 2027, 2026, 2025, 2024, 2023
    • ICDE: 2026
    • EDBT: 2027
    • ACM SoCC: 2025, 2024, 2023
    • VLDB Demo: 2026
    • BDA Demo: 2021
  • Invited Journal Reviewer:
    • ACM TODS, VLDBJ, IEEE TKDE, CACM

Awards

  • Best Paper Award at BDA 2021, French Data Management Community.
  • Best PhD Thesis (runner-up) at BDA 2020, French Data Management Community.
  • Full scholarship for the 1st Big Sky Earth training school at DLR, Germany, April 4-9, 2016.

Teaching

  • CS 348: Introduction to Database Management at the University of Waterloo, Winter 2023.
  • Big Data Processing: Apache Spark in Action at the University of Clermont Auvergne, Spring 2022.
  • Semantic Web (labs) at the University of Clermont Auvergne, Fall 2021.
