Computer Science

Young Scholar TechTalk – Subgraph Federated Learning with Missing Neighbor Generation

In computer science, a graph is a network that models objects and their unique interactions. A graph learning model is a machine learning model specialized to learn on graphs. Like traditional machine learning models, a well-performing graph learning model can capture the global data distribution when given sufficient and unbiased training data. However, in a distributed subgraph system, most data owners hold only a small portion of the data (small subgraphs) in their local systems, and that data can carry unpredictable biases.
In this talk, the speaker will introduce this novel yet realistic setting, subgraph federated learning, which aims to let distributed data owners collaboratively train a powerful and generalized graph learning model without directly sharing their subgraphs. For this setting, the research team proposes two major techniques: (1) FedSage, which trains a GraphSage model based on FedAvg to integrate node features, link structures, and task labels on multiple local subgraphs; and (2) FedSage+, which trains a missing neighbor generator alongside FedSage to deal with missing links across local subgraphs. Empirical results demonstrate the effectiveness of the proposed models, and theoretical analysis proves their generalization ability.
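To make the FedAvg step mentioned above concrete, here is a minimal sketch of its server-side aggregation: each data owner trains locally, and the server averages the resulting parameters, weighting each owner by its number of local training samples. This is only an illustration of generic FedAvg aggregation, not the FedSage implementation; the function name `fedavg` and the flat dictionary-of-lists parameter layout are assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical layout: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def fedavg(client_params: List[Params], client_sizes: List[int]) -> Params:
    """Average client parameters, weighted by each client's local sample count."""
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Params = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        averaged[name] = [
            sum(p[name][i] * n for p, n in zip(client_params, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Two data owners; the second holds three times as much data as the first.
global_params = fedavg(
    [{"w": [0.0, 4.0]}, {"w": [4.0, 0.0]}],
    [1, 3],
)
print(global_params)
```

In a full training loop the averaged parameters would be sent back to every data owner for the next round of local training, so raw subgraphs never leave their owners.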

Tech Talk – dPRO: A Generic Performance Diagnosis and Optimization Toolkit for Expediting Distributed DNN Training

Distributed training using multiple devices (i.e., GPU servers) has been widely adopted for learning DNN models over large datasets. However, the performance of large-scale distributed training tends to be far from linear speed-up in practice. Given the complexity of distributed systems, it is challenging to identify the root cause(s) of inefficiency and apply effective performance optimizations when training is unexpectedly slow. To date, no software tool exists that diagnoses performance issues and helps expedite distributed DNN training across different machine learning frameworks. This paper proposes dPRO, a toolkit that includes: (1) an efficient profiler that collects runtime traces of distributed DNN training across multiple frameworks, especially fine-grained communication traces, and constructs global data flow graphs including detailed communication operations for accurate replay; (2) an optimizer that effectively identifies performance bottlenecks and explores optimization strategies (from computation, communication and memory aspects) for training acceleration. We implement dPRO on multiple deep learning frameworks (PyTorch, TensorFlow, MXNet) and representative communication schemes (AllReduce and Parameter Server architecture). Extensive experiments show that dPRO predicts the performance of distributed training in various settings with <5% error in most cases and finds optimization strategies with up to 87.1% speed-up over the baselines.
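As a rough illustration of what replaying a data flow graph can mean, the sketch below estimates iteration time as the longest dependency chain through a small graph of operations with measured durations. It is a deliberate simplification and not dPRO's replayer: it assumes unlimited parallelism, so operations that would contend for the same GPU or network link are not serialized. The function name `replay_makespan` and the operation names and durations are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+
from typing import Dict, List


def replay_makespan(durations: Dict[str, float], deps: Dict[str, List[str]]) -> float:
    """Return the finish time of the last op when every op starts as soon as its inputs are done."""
    graph = {op: deps.get(op, []) for op in durations}
    finish: Dict[str, float] = {}
    # static_order() yields each op only after all of its predecessors.
    for op in TopologicalSorter(graph).static_order():
        start = max((finish[d] for d in graph[op]), default=0.0)
        finish[op] = start + durations[op]
    return max(finish.values(), default=0.0)


# Hypothetical two-layer model: gradient communication overlaps with backward computation.
durations = {"fw": 2.0, "bw2": 2.0, "bw1": 2.0, "comm2": 3.0, "comm1": 1.0, "update": 1.0}
deps = {
    "bw2": ["fw"],
    "bw1": ["bw2"],
    "comm2": ["bw2"],
    "comm1": ["bw1"],
    "update": ["comm1", "comm2"],
}
print(replay_makespan(durations, deps))
```

Changing a duration or an edge and re-running the estimate is the basic loop by which a replay-based tool can compare candidate optimizations without re-running the actual training job.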

Tech Talk – HINCare: Using Heterogeneous Information Networks for Elderly Care Recommendation

In Hong Kong, the number of elderly citizens is estimated to rise to one third of the population, or 2.37 million, by 2037. As they age and become more frail, the demand for formal support services (e.g., domestic or escort services) will increase significantly in the coming years. However, there is a severe lack of manpower to meet these needs: some elderly-care homes have reported a 70% shortage of employees. There is thus a strong need for voluntary or part-time helpers to take care of elders.
In this talk, Prof. Cheng will introduce HINCare, a software platform that encourages a culture of mutual help and volunteering in the community. HINCare uses a HIN (Heterogeneous Information Network) to recommend helpers to elders or other service recipients. The algorithms that use HINs and AI technologies for matching elders and helpers are based on our recent research results. This is the first time that a HIN has been used to support elderly care.
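For readers unfamiliar with HINs, the sketch below shows one generic way a typed network can drive a recommendation: it counts Elder -> Service <- Helper paths and ranks helpers by how many of an elder's needed services they offer. This is a textbook-style meta-path count, not HINCare's matching algorithm, which is not described here; the node types, the function name `rank_helpers`, and the sample data are all hypothetical.

```python
from typing import Dict, List, Set


def rank_helpers(
    needs: Dict[str, Set[str]],   # elder -> services the elder needs
    offers: Dict[str, Set[str]],  # helper -> services the helper offers
    elder: str,
) -> List[str]:
    """Rank helpers by the number of Elder -> Service <- Helper paths to the given elder."""
    wanted = needs.get(elder, set())
    path_counts = {helper: len(wanted & services) for helper, services in offers.items()}
    # Highest path count first; ties broken alphabetically for a stable order.
    ranked = sorted(path_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [helper for helper, count in ranked if count > 0]


needs = {"elder_a": {"escort", "cleaning"}}
offers = {
    "helper_x": {"escort"},
    "helper_y": {"escort", "cleaning"},
    "helper_z": {"tutoring"},
}
print(rank_helpers(needs, offers, "elder_a"))
```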
HINCare is now available on the Apple App Store and Google Play, and has been serving more than a thousand elders and helpers in NGOs (e.g., SKH and CSFC). The app was originally designed for elderly users, but has since expanded its services to support the Community Investment and Inclusion Fund (CIIF) and 10 NGOs engaged in teenage and family services. The system won the HKICT Award 2021, the Asia Smart App Award 2020, and the HKU Faculty Knowledge Exchange Awards 2021.

Duckietown AI-Driving

The HKU Duckietown Project is an interdisciplinary project that aims to democratise A.I. and robotics research. Through this project, students gain hands-on experience, in a fun and playful way, in prototyping self-driving robots and applying A.I. on the physical education platform developed by MIT for experiential learning.
A project highlight is the students’ participation in the A.I. Driving Olympics (AI-DO) international contests, whose live final events are co-located with NIPS and ICRA, the prestigious conferences in A.I. and robotics. Over more than half a year of preparation for the competitions, students undergo intensive training, foster peer collaboration, solidify their knowledge and demonstrate a sustainable learning outcome.