TechTalk – Data Intelligence of Large Language Models

All members of the HKU community and the general public are welcome to join!
Speaker: Professor Reynold C.K. Cheng, Professor and Department Head of AI & Data Science, School of Computing and Data Science, HKU
Date: 18th June 2026 (Thursday)
Time: 4:00 PM

Mode: Mixed

About the TechTalk
Moderator: Professor Ben Kao, Professor, Department of AI & Data Science, School of Computing and Data Science, HKU
Mode: Mixed (both face-to-face and online). Seats for on-site participants are limited. A confirmation email will be sent to participants who have successfully registered.
Language: English

Database systems, which provide operations for defining and querying data, underpin large-scale AI systems and intelligent applications in many domains. With recent advances in large language models (LLMs), automating database operations through code generation has become increasingly attainable. This capability, the data intelligence of LLMs, has given rise to a new paradigm, Data-Centric Code Generation (DCCG), which aims to build systems that can automatically understand, manipulate, and reason over data.
To realize DCCG, I will discuss our team’s efforts in building benchmarking systems, including BIRD-SQL, a large-scale Text-to-SQL benchmark on real databases, and SWE-SQL, which gauges an LLM’s ability to resolve users’ SQL issues. These benchmarks, widely used in industry, reveal hallucination and other issues faced by LLMs. To address these challenges, I will present our work on graph-aware reasoning, SQL correction, and multi-turn tabular data analysis. This work aims to evolve LLMs from static code generators into autonomous, trustworthy agents with data intelligence that can understand and generate data-driven software systems.

Registration
  • The tech talk “Data Intelligence of Large Language Models” will be held at Tam Wing Fan Innovation Wing Two (G/F, Run Run Shaw Building, HKU) on 18th June 2026 (Thursday) at 4:00 PM.
  • Seats are limited. A Zoom broadcast will be available if the seating quota is full.
  • Registrants on the waiting list will be notified of the arrangement after the registration deadline (with seating/free-standing/other arrangement).
About the speaker

Professor Reynold C.K. Cheng

Professor Cheng was named an AI 2000 Most Influential Scholar Honorable Mention in Database from 2023 to 2025. He received the ACM Distinguished Membership Award and the HKU Outstanding Research Student Supervisor Award in 2023. He was listed among the World’s Top 2% Scientists by Stanford University in 2022. He has also received the SIGMOD Research Highlights Award 2020, International Exhibition of Inventions Geneva Award (2026), HKICT Awards (2021, 2023), HKU Knowledge Exchange Award (2024), HKU Engineering Knowledge Exchange Award (2021, 2024), HKU Engineering Best Teaching Award (2023, 2024), and HKU Outstanding Young Researcher Award 2011-12.

