Young Scholar TechTalk – Subgraph Federated Learning with Missing Neighbor Generation
In computer science, a graph is a structure that models objects and the interactions among them. A graph learning model is a machine learning model specialized to learn from graph data. Like traditional machine learning models, a well-performing graph learning model can capture the global data distribution when given sufficient and unbiased training data. However, in a distributed subgraph system, most data owners hold only a small portion of the data (a small subgraph) in their local systems, and that portion can be biased in unpredictable ways.
In this talk, the speaker will introduce a novel yet realistic setting, subgraph federated learning, which aims to let distributed data owners collaboratively train a powerful and generalized graph learning model without directly sharing their subgraphs. The research team proposes two major techniques for this setting: (1) FedSage, which trains a GraphSage model based on FedAvg to integrate node features, link structures, and task labels across multiple local subgraphs; and (2) FedSage+, which trains a missing neighbor generator alongside FedSage to deal with links that are missing across local subgraphs. Empirical results demonstrate the effectiveness of the proposed models, and theoretical analysis establishes their generalization ability.
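To make the FedAvg step behind FedSage more concrete, here is a minimal sketch of the aggregation, in which a server averages the model parameters sent by each data owner. It is an illustration under assumptions, not the research team's implementation: the function name `fedavg`, the dictionary-of-lists parameter layout, and the choice to weight each owner by the number of nodes in its local subgraph are all hypothetical.

```python
from typing import Dict, List

# Hypothetical layout: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def fedavg(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Average client parameters, weighting each client by its data size.

    Assumption: every client reports the same parameter names and shapes,
    and client_sizes holds the number of local training nodes per client.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one size per client and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total data size must be positive")

    averaged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        # Weighted mean of each parameter entry across all clients.
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Made-up parameters from two data owners; only parameters are shared,
# never the local subgraphs themselves.
owner_a = {"layer1": [1.0, 2.0]}
owner_b = {"layer1": [3.0, 6.0]}
global_weights = fedavg([owner_a, owner_b], client_sizes=[1, 3])
```

In a full training loop, the averaged parameters would be sent back to each data owner for another round of local training on its own subgraph. The sketch covers only the averaging step, not the missing neighbor generator that FedSage+ adds.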