Projects

1.Angel

Angel is a high-performance distributed machine learning and graph computing platform based on the Parameter Server philosophy. It is tuned for performance on big data from Tencent, is widely applicable and stable, and shows a growing advantage as model dimensionality increases. Angel is jointly developed by Tencent and Peking University, balancing the high availability required in industry with innovation from academia.

With a model-centered core design, Angel partitions the parameters of complex models across multiple parameter-server nodes and implements a variety of machine learning and graph algorithms using efficient model-updating interfaces and functions, together with a flexible consistency model for synchronization. Angel is developed in Java and Scala and runs on Yarn. Through its PS Service abstraction, it supports Spark on Angel. Support for graph computing and deep learning frameworks is under development and will be released in the future.
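The idea of partitioning parameters across several server nodes can be sketched in a few lines. This is a toy, single-process illustration only, not Angel's actual API: the class names `ParameterServerShard` and `ShardedModel`, the modulo partitioning rule, and the plain gradient-descent update are all assumptions made for the example.

```python
from collections import defaultdict


class ParameterServerShard:
    """One server node holding a partition of the parameters (hypothetical)."""

    def __init__(self):
        self.params = defaultdict(float)  # parameter index -> value, default 0.0

    def pull(self, keys):
        # Return the current values of the requested parameters.
        return {k: self.params[k] for k in keys}

    def push(self, grads, lr):
        # Apply a gradient-descent update to the parameters this shard owns.
        for k, g in grads.items():
            self.params[k] -= lr * g


class ShardedModel:
    """Routes pull/push requests to the shard that owns each parameter."""

    def __init__(self, num_shards):
        self.shards = [ParameterServerShard() for _ in range(num_shards)]

    def _shard_of(self, key):
        # Assumed partitioning rule: integer parameter index modulo shard count.
        return key % len(self.shards)

    def pull(self, keys):
        by_shard = defaultdict(list)
        for k in keys:
            by_shard[self._shard_of(k)].append(k)
        values = {}
        for s, ks in by_shard.items():
            values.update(self.shards[s].pull(ks))
        return values

    def push(self, grads, lr=0.1):
        by_shard = defaultdict(dict)
        for k, g in grads.items():
            by_shard[self._shard_of(k)][k] = g
        for s, part in by_shard.items():
            self.shards[s].push(part, lr)


model = ShardedModel(num_shards=3)
model.push({0: 1.0, 4: -2.0}, lr=0.5)  # a worker sends sparse gradients
pulled = model.pull([0, 4, 7])         # a worker fetches only what it needs
```

A worker touches only the shards that own the parameters it reads or updates, which is what lets a very high-dimensional model be spread over many nodes.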

We welcome everyone interested in machine learning or graph computing to contribute code, open issues, or submit pull requests. Please refer to the Angel Contribution Guide for more details.

2.SGL

SGL is a Graph Neural Network (GNN) toolkit targeting scalable graph learning, which supports deep graph learning on extremely large datasets. SGL allows users to easily implement scalable graph neural networks and evaluate their performance on various downstream tasks such as node classification, node clustering, and link prediction. In addition, SGL supports automatic neural architecture search based on OpenBox. SGL is designed and developed by the graph learning team from the DAIR Lab at Peking University.

SGL differs from existing GNN toolkits, such as PyTorch Geometric (PyG) and Deep Graph Library (DGL), in the following three respects.

High scalability: Following the scalable design paradigm SGAP in PaSca, SGL can scale to graph data with billions of nodes and edges.
Auto neural architecture search: SGL can automatically choose decent and scalable graph neural architectures according to specific tasks and pre-defined multiple objectives (e.g., inference time, memory cost, and predictive performance).
Ease of use: SGL has user-friendly interfaces for implementing existing scalable GNNs and executing various downstream tasks.
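Choosing architectures under several objectives at once usually means keeping the candidates that no other candidate beats on every objective (the Pareto front). The sketch below shows that selection step only. It is not SGL's search procedure, and the architecture names and objective values are made-up placeholders for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better
    on at least one. All objectives are assumed to be minimized."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates):
    """Return the names of candidates that no other candidate dominates.

    candidates: dict mapping a name to a tuple of objective values.
    """
    return sorted(
        name
        for name, objectives in candidates.items()
        if not any(
            dominates(other, objectives)
            for other_name, other in candidates.items()
            if other_name != name
        )
    )


# Hypothetical values: (inference time in ms, memory in MB, validation error).
candidates = {
    "arch_a": (12.0, 900, 0.10),
    "arch_b": (8.0, 1200, 0.12),
    "arch_c": (15.0, 1000, 0.11),
}
front = pareto_front(candidates)
```

Here `arch_c` is dropped because `arch_a` is at least as good on all three objectives, while `arch_a` and `arch_b` both remain because each wins on a different objective.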

3.MindWare

MindWare is an efficient open-source system that helps users automate 1) data pre-processing, 2) feature engineering, 3) algorithm selection, 4) architecture design, 5) hyper-parameter tuning, and 6) model ensembling. It improves its AutoML capability by decomposing the large AutoML search space into smaller ones and solving the resulting sub-problems jointly and efficiently.

MindWare is developed by the DAIR Lab at Peking University. Its goal is to make machine learning easier to apply in both industry and academia.

4.OpenBox

OpenBox is an efficient open-source system designed for solving generalized black-box optimization (BBO) problems.

It has the following features:

BBO with multiple objectives and constraints.
BBO with transfer learning.
BBO with distributed parallelization.
BBO with multi-fidelity acceleration.
BBO with early stopping.
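For readers new to black-box optimization, the basic loop is: propose a configuration, evaluate the objective without looking inside it, and keep the best result. The sketch below uses plain random search with a patience-based stopping rule as one simple reading of early stopping. It does not use OpenBox's API, and the function name `random_search` and its parameters are hypothetical.

```python
import random


def random_search(objective, bounds, max_trials=200, patience=30, seed=0):
    """Minimize a black-box objective by random sampling.

    bounds: list of (low, high) pairs, one per input dimension.
    Stops early once `patience` consecutive trials bring no improvement.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    best_x, best_y = None, float("inf")
    stale = 0
    history = []
    for _ in range(max_trials):
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        y = objective(x)  # the only access to the objective is evaluation
        history.append((x, y))
        if y < best_y:
            best_x, best_y = x, y
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break  # no recent progress: stop the search early
    return best_x, best_y, history


def sphere(x):
    # Toy objective standing in for an expensive evaluation.
    return sum(v * v for v in x)


best_x, best_y, history = random_search(sphere, bounds=[(-5.0, 5.0)] * 2)
```

Practical BBO systems replace the random proposal step with a model-based one (for example, Bayesian optimization), but the evaluate-and-record loop has the same shape.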

Wentao Zhang
