
Pure Transformers Can Be Powerful Hypergraph Learners

TokenHGT: Pure Transformers Can Be Powerful Hypergraph Learners

This is my master's thesis project at DIAG, Sapienza University of Rome.

[Sapienza University of Rome logo]

Author: Kai Peng. Academic Year: 2022/2023

Thesis supervisor:

Quick Facts

Existing Problems:

  • Graph/hypergraph convolution operations (message-passing methods) can lead to over-smoothing problems.
  • Transformers with structures modified for specific tasks may lose versatility, hindering integration into multi-task and multi-modal general-purpose attention architectures.
  • The Tokenized Graph Transformer (TokenGT) has successfully addressed these issues for graphs, but not yet for hypergraphs.

Thesis contributions:

  • This thesis extends TokenGT to hypergraphs, to overcome the limitations of message passing and of graph-specific structural modifications in the hypergraph field.
  • It provides an alternative method for processing hypergraphs.

TokenHGT: Based on TokenGT

This work is based on TokenGT; our model is called the Tokenized HyperGraph Transformer (TokenHGT). Because hypergraphs differ from graphs, our pipeline still introduces several innovations.

The following is a comparison of the TokenGT and TokenHGT pipelines. The TokenGT pipeline is as follows:

[Figure: TokenGT pipeline]

Our TokenHGT pipeline is as follows:

[Figure: TokenHGT pipeline]

The main differences, sketched in code after this list, are:

  • The Laplacian eigendecomposition formula is different, since it is computed on the hypergraph Laplacian rather than the graph Laplacian.
  • A graph edge connects exactly two nodes, while a hyperedge can connect any number of nodes, so we cannot align eigenvectors by the number of connected nodes as TokenGT does for graph edges; instead, I directly add (sum) the eigenvectors of a hyperedge's incident nodes for feature fusion.
  • I concatenate feature tokens with eigenvector tokens instead of summing them, based on experimental results.
  • I did not use the "Type Identifier": it reduced the model's performance (in my personal opinion, it is not a hand-crafted feature but noise).
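
Below is a minimal, hypothetical sketch of these two steps: the hypergraph Laplacian eigendecomposition (assuming the standard normalized hypergraph Laplacian of Zhou et al. with unit hyperedge weights) and a token construction that concatenates features with eigenvector positional encodings, summing the eigenvectors of a hyperedge's incident nodes. The function names, shapes, and NumPy-only setting are illustrative assumptions, not the repository's actual API (see eigen.py for that).

```python
import numpy as np

def hypergraph_laplacian_eigvecs(H, k):
    """Top-k eigenvectors of the normalized hypergraph Laplacian
    L = I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}  (Zhou et al., 2006).

    H: (n_nodes, n_edges) incidence matrix; H[v, e] = 1 if node v is in hyperedge e.
    """
    n, m = H.shape
    w = np.ones(m)                        # unit hyperedge weights (assumption)
    d_v = H @ w                           # node degrees
    d_e = H.sum(axis=0)                   # hyperedge degrees
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(d_v))
    De_inv = np.diag(1.0 / d_e)
    L = np.eye(n) - Dv_inv_sqrt @ H @ np.diag(w) @ De_inv @ H.T @ Dv_inv_sqrt
    _, eigvecs = np.linalg.eigh(L)        # L is symmetric; eigenvalues ascending
    return eigvecs[:, :k]                 # (n_nodes, k) positional encodings


def build_tokens(X_node, X_edge, H, P):
    """Token sequence fed to a plain Transformer encoder.

    X_node: (n_nodes, d) node features, X_edge: (n_edges, d) hyperedge features,
    P: (n_nodes, k) Laplacian eigenvectors from above.
    """
    node_tokens = np.concatenate([X_node, P], axis=1)         # concat, not sum
    edge_pos = H.T @ P                                         # sum of incident-node eigenvectors
    edge_tokens = np.concatenate([X_edge, edge_pos], axis=1)
    return np.concatenate([node_tokens, edge_tokens], axis=0)
```

Concatenation keeps the feature and positional channels separate, which matches the experimental observation above that concatenating outperformed summing.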

Experiment

Now let's do some experiments!

The TokenHGT algorithm is designed to operate at the graph (whole-hypergraph) level, making it suitable for datasets that contain a significant number of hypergraphs. Ideally, the dataset should include both node and hyperedge features to capture the structural and attribute information inherent in the hypergraphs.

However, it is challenging to find readily available datasets that meet these requirements. Therefore, we have explored two methods to create suitable hypergraph datasets.

I found two options:

I tried the DHT method on PCQM4Mv2 and ogbg-molhiv; converting a molecular graph into a hypergraph is interesting, but due to hardware limitations (money is all you need :), I had to give it up.

Converting a text into a hypergraph is a good choice, since text datasets are always small. One possible way to convert a text into a hypergraph is sketched below.
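
As a hedged illustration only: the repository's generate_lda.py suggests an LDA-based construction, a common choice in the text-hypergraph literature, where the words of a corpus are nodes and each LDA topic contributes a hyperedge over its top words. The snippet below sketches that idea with gensim; the function name, parameters, and the gensim dependency are my assumptions, not the project's actual code.

```python
# Hypothetical sketch: words as nodes, LDA topics as hyperedges over their top words.
# Assumes gensim; not the repository's actual pipeline (see generate_lda.py for that).
from gensim import corpora, models

def text_to_hypergraph(docs, num_topics=10, top_k=10):
    """docs: list of tokenized documents (lists of words).
    Returns the vocabulary and a list of hyperedges, each a set of word ids."""
    vocab = corpora.Dictionary(docs)
    corpus = [vocab.doc2bow(doc) for doc in docs]
    lda = models.LdaModel(corpus, num_topics=num_topics, id2word=vocab)
    hyperedges = []
    for t in range(num_topics):
        top_words = lda.get_topic_terms(t, topn=top_k)   # [(word_id, prob), ...]
        hyperedges.append({word_id for word_id, _ in top_words})
    return vocab, hyperedges
```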

Here are the datasets I used:

[Figure: dataset statistics]

Results:

[Figure: experiment results]

You will notice that I used 5 datasets but only show results for 3 of them; that is because our model performs too poorly on 20NG & Ohsumed.

Conclusions & Limitations

Conclusions:

  • The TokenHGT model applies pure transformers to the hypergraph domain in a general way.
  • TokenHGT is effective at overcoming the limitations of message-passing methods, leading to superior performance on specific datasets.
  • Meanwhile, the pure transformer architecture preserves the versatility of the model, which contributes to future multimodal research.

Limitations:

  • TokenHGT is not good at processing large hypergraphs. As noted in Graphormer, the self-attention module has quadratic complexity, which limits its applicability to large graphs: with n nodes and m hyperedges the token sequence has length n + m, so the attention cost grows as (n + m)².
  • It requires a suitable hypergraph dataset, which can be challenging to find.

Code Introduction

The full code is in the "FullCode" folder; please keep the file structure. A possible run order is sketched after the list below.

  • Download the raw MR dataset: MR_Download.py
  • Generate the LDA file: generate_lda.py (details here)
  • Run the main code: main.py
    • The model structure is in model_mr.py
    • The LDA data is processed by preprocess.py & utils.py
    • Details about the hypergraph Laplacian eigendecomposition are in eigen.py
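
For orientation only, here is a hypothetical run order based on the file descriptions above; the working directory, arguments, and defaults are assumptions, so consult each script for its actual options.

```python
# Hypothetical run order, assuming the scripts are executed from inside "FullCode"
# with their default settings; not a documented entry point of the repository.
import subprocess

for script in ("MR_Download.py",   # download the raw MR dataset
               "generate_lda.py",  # build the LDA file that defines the hyperedges
               "main.py"):         # train and evaluate TokenHGT (model in model_mr.py)
    subprocess.run(["python", script], check=True)
```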
