Graphs serve as powerful representations for capturing relationships between entities. Graph Neural Networks (GNNs) have emerged as a transformative approach for processing graph data, learning from complex relational information by exploiting the rich interconnectedness of graphs. The unique computational requirements of GNN acceleration have been addressed by several FPGA and ASIC accelerators, such as HyGCN and GenGNN. Despite their relative success, previous works have been shown to be limited to small graphs with up to 20k nodes, such as Cora, Citeseer and Pubmed. Since the computational overhead of GNN inference grows with graph size, current accelerators are ill-suited to medium- and large-scale graphs, particularly in real-time applications.

AGILE (Accelerated Graph Inference Logic Engine) is an FPGA accelerator enabling real-time GNN inference on large graphs, introduced during an FYP project last year (see GitHub). Node aggregation is performed over an efficient Network-on-Chip (NoC) architecture, while feature transformation is carried out by a systolic array, as sketched below.
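To make this division of labour concrete, a GCN layer can be viewed as two kernels with very different compute patterns: a dense matrix multiplication for the transformation, and a sparse-dense product for the aggregation. The following sketch is illustrative only; the sizes, graph density and use of NumPy/SciPy are assumptions for the example, not part of AGILE:

```python
import numpy as np
import scipy.sparse as sp

# Illustrative sizes; the workloads AGILE targets are far larger.
N, F_IN, F_OUT = 1000, 64, 32

A = sp.random(N, N, density=0.01, format="csr")  # sparse adjacency matrix
H = np.random.rand(N, F_IN)                      # node feature matrix
W = np.random.rand(F_IN, F_OUT)                  # layer weights

# Transformation: dense GEMM, the regular kernel mapped to the systolic array.
HW = H @ W

# Aggregation: sparse-dense product, the irregular, memory-bound kernel
# mapped to the NoC.
H_next = A @ HW
```

The main contributions over presently available solutions are as follows: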

Despite the performance gains obtained from AGILE, it has so far only been deployed on Graph Convolutional Networks (GCNs). A number of design and interface changes are required to support other network topologies, such as Graph Attention Networks (GATs) and Graph Isomorphism Networks (GINs). Furthermore, the Machine Learning community proposes state-of-the-art GNN architectures at a faster pace than custom accelerators can be designed. These architectures can all be described by the Message Passing Mechanism, a mathematical model that encompasses GNN topologies in general, as formalised below. As such, this project aims to fully support this abstraction within the AGILE infrastructure, enabling acceleration of arbitrary GNN models.
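Concretely, a message-passing layer updates each node's embedding from its neighbourhood. Following the standard formulation (the symbols below are the conventional ones, not taken from the AGILE documentation):

\[
h_v^{(k)} = \phi\left( h_v^{(k-1)},\ \bigoplus_{u \in \mathcal{N}(v)} \psi\left( h_v^{(k-1)},\, h_u^{(k-1)},\, e_{uv} \right) \right)
\]

where $\psi$ is the message function, $\bigoplus$ is a permutation-invariant aggregator (e.g. sum, mean or max), $\phi$ is the update function, and $e_{uv}$ denotes optional edge features. GCN, GAT and GIN differ only in their choices of $\psi$, $\bigoplus$ and $\phi$. This is also how software frameworks expose the abstraction; the sketch below uses PyTorch Geometric's MessagePassing base class purely as an illustration, with hypothetical class and parameter names:

```python
import torch
from torch_geometric.nn import MessagePassing

class MinimalMPLayer(MessagePassing):
    """Minimal message-passing layer: psi = identity on the source node's
    feature, aggregation = sum, phi = a learned linear update."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__(aggr="add")  # permutation-invariant sum aggregator
        self.phi = torch.nn.Linear(in_dim, out_dim)

    def forward(self, x, edge_index):
        # x: [num_nodes, in_dim]; edge_index: [2, num_edges]
        return self.propagate(edge_index, x=x)

    def message(self, x_j):
        # psi: the message sent along each edge (source node feature x_j)
        return x_j

    def update(self, aggr_out):
        # phi: transform the aggregated neighbourhood representation
        return self.phi(aggr_out)
```

Under this view, GCN sets $\psi$ to a degree-normalised linear map with sum aggregation, GAT weights each message by a learned attention coefficient, and GIN aggregates by summation before applying an MLP.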

This project involves:

Potential extension tasks: