This repository contains the codebase supporting the findings of the paper "Learning-Based Link Anomaly Detection in Continuous-Time Dynamic Graphs". With the code provided in this repository, we explore the performance of temporal graph learning models on the link anomaly detection task. See our paper for more details.
Our implementation works with Python >= 3.9 and can be installed as follows:
- Set up a conda environment.
conda create -n tgb_env python=3.9
conda activate tgb_env
- Install external packages.
pip install pandas==1.5.3
pip install matplotlib==3.7.1
pip install clint==0.5.1
pip install mlflow==2.10.0
pip install omegaconf==2.3.0
- Install PyTorch and PyG dependencies to run the examples.
pip install torch==2.0.0 --index-url https://download.pytorch.org/whl/cu117
pip install torch_geometric==2.3.0
pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.0.0+cu117.html
- Clone the TGB-link-anomaly-detection repository and install local dependencies under its root directory /TGB-link-anomaly-detection.
git clone https://github.com/timpostuvan/TGB-link-anomaly-detection.git
cd TGB-link-anomaly-detection
pip install -e .
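Once the installation steps above have finished, it can be useful to confirm that the pinned versions actually ended up in the active environment. The following is a minimal sketch using only the standard library; the helper name check_pins and the PINNED table are illustrative and not part of the repository, and the pins are simply copied from the pip commands above.

```python
from importlib import metadata

# Pins copied from the installation commands above (illustrative table, not repo code).
PINNED = {
    "pandas": "1.5.3",
    "matplotlib": "3.7.1",
    "clint": "0.5.1",
    "mlflow": "2.10.0",
    "omegaconf": "2.3.0",
    "torch": "2.0.0",
    "torch_geometric": "2.3.0",
}


def check_pins(pins):
    """Return {package: status}; status is 'ok', 'missing', or 'mismatch (found X)'."""
    report = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        # Local version tags such as "+cu117" are ignored for the comparison.
        base = found.split("+")[0]
        report[name] = "ok" if base == wanted else f"mismatch (found {found})"
    return report


if __name__ == "__main__":
    for package, status in check_pins(PINNED).items():
        print(f"{package:16s} {status}")
```

Any package reported as missing or mismatched can be reinstalled with the corresponding pip command from the list above.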
Running experiments generally requires two steps: generating anomalies for the dataset, and training and evaluating the model. Additionally, some datasets (e.g., LANL and DARPA-THEIA) first have to be pre-processed with the scripts provided in TGB-link-anomaly-detection. A simple example of training and evaluating the TGN model on the Wikipedia dataset with injected temporal-structural-contextual anomalies can be run as follows:
- Generate anomalies
To generate temporal-structural-contextual anomalies for the Wikipedia dataset from TGB, run the following command inside the /TGB-link-anomaly-detection/tgb/datasets/dataset_scripts directory:
python link_anomaly_generator.py \
--dataset_name tgbl-wiki \
--val_ratio 0.15 \
--test_ratio 0.15 \
--anom_type temporal-structural-contextual \
--output_root <OUTPUT-DIR>
The anomalies are generated for the validation and test splits according to the 70/15/15 data split. The data is saved under the <OUTPUT-DIR> directory, which should be specified as an absolute path.
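The 70/15/15 split follows from --val_ratio 0.15 and --test_ratio 0.15, with the remainder used for training. As a rough illustration only, assuming edges are ordered by time and split by count (the repository's generator may split differently, for example on timestamp quantiles), the split could be sketched as below. The function names split_sizes and chronological_split are hypothetical.

```python
def split_sizes(num_edges, val_ratio=0.15, test_ratio=0.15):
    """Return (train, val, test) edge counts for a chronological split."""
    if val_ratio < 0 or test_ratio < 0 or val_ratio + test_ratio >= 1:
        raise ValueError("ratios must be non-negative and sum to less than 1")
    num_val = round(num_edges * val_ratio)
    num_test = round(num_edges * test_ratio)
    # Training gets whatever remains, so the three parts always sum to num_edges.
    num_train = num_edges - num_val - num_test
    return num_train, num_val, num_test


def chronological_split(timestamps, val_ratio=0.15, test_ratio=0.15):
    """Split edge indices by time: earliest edges for training, latest for testing."""
    order = sorted(range(len(timestamps)), key=lambda i: timestamps[i])
    num_train, num_val, _ = split_sizes(len(order), val_ratio, test_ratio)
    train = order[:num_train]
    val = order[num_train:num_train + num_val]
    test = order[num_train + num_val:]
    return train, val, test
```

Under this assumption, only the edges in the val and test index ranges would be candidates for anomaly injection, matching the statement that anomalies are generated for the validation and test splits.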
- Train and evaluate the model
The TGN model can be trained and evaluated by running the following command inside the repository's root directory:
python train_tgb_linkanomdet.py --config_path=experiments/example.yaml
Note that the <OUTPUT-DIR> directory in the configuration file has to be replaced with the same directory that was specified when generating anomalies.
The results of the experiment and the best model checkpoints are saved in <OUTPUT-DIR>/EXPERIMENTS/saved_results and <OUTPUT-DIR>/EXPERIMENTS/saved_models, respectively.
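Replacing the placeholder in the configuration file and locating the output folders can both be scripted. The following is a minimal sketch that treats the configuration as plain text and makes no assumption about its keys. The helper names fill_output_dir and experiment_dirs are hypothetical, and the placeholder spelling is an assumption that should be matched to whatever appears in your copy of the configuration file.

```python
from pathlib import Path

# Assumed placeholder spelling; adjust it to match the configuration file.
PLACEHOLDER = "<OUTPUT-DIR>"


def fill_output_dir(config_text, output_dir):
    """Replace every placeholder occurrence with an absolute output directory."""
    path = Path(output_dir)
    if not path.is_absolute():
        raise ValueError(f"output directory must be an absolute path, got {output_dir!r}")
    return config_text.replace(PLACEHOLDER, str(path))


def experiment_dirs(output_dir):
    """Return the (results, checkpoints) directories named in this README."""
    base = Path(output_dir) / "EXPERIMENTS"
    return base / "saved_results", base / "saved_models"
```

A typical use would read experiments/example.yaml with Path.read_text(), pass the text through fill_output_dir, and write the result back before launching training.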
This section presents the commands to reproduce experiments from the paper.
- Learning-based temporal graph models:
python train_tgb_linkanomdet.py --config_path=experiments/experiment.yaml
- EdgeBank models:
python train_tgb_linkanomdet_edge_bank.py --config_path=experiments/experiment_EdgeBank.yaml
- Learning-based temporal graph models:
python train_tgb_linkanomdet.py --config_path=experiments/experiment_LANL.yaml
python train_tgb_linkanomdet.py --config_path=experiments/experiment_DARPA_THEIA.yaml
- EdgeBank models:
python train_tgb_linkanomdet_edge_bank.py --config_path=experiments/experiment_EdgeBank_LANL.yaml
python train_tgb_linkanomdet_edge_bank.py --config_path=experiments/experiment_EdgeBank_DARPA_THEIA.yaml
- Learning-based temporal graph models:
python train_tgb_linkanomdet.py --config_path=experiments/experiment_synthetic_vs_organic_anomalies_LANL.yaml
python train_tgb_linkpred_for_linkanomdet.py --config_path=experiments/link_prediction_experiment_LANL.yaml
- Learning-based temporal graph models:
python train_tgb_linkanomdet_without_conditioning_on_context.py --config_path=experiments/experiment_without_conditioning_on_context.yaml
- Learning-based temporal graph models:
python train_tgb_linkanomdet_without_improved_training.py --config_path=experiments/experiment_without_improved_training.yaml
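To run several of the commands above one after another, a small driver script may be convenient. The sketch below reuses only script and configuration names that appear in this section; the EXPERIMENTS list and the helper names build_command and run_all are hypothetical, and the script assumes it is launched from the repository root.

```python
import subprocess
import sys

# (script, config) pairs copied from the commands listed above; extend as needed.
EXPERIMENTS = [
    ("train_tgb_linkanomdet.py", "experiments/experiment.yaml"),
    ("train_tgb_linkanomdet_edge_bank.py", "experiments/experiment_EdgeBank.yaml"),
    ("train_tgb_linkanomdet.py", "experiments/experiment_LANL.yaml"),
    ("train_tgb_linkanomdet_edge_bank.py", "experiments/experiment_EdgeBank_LANL.yaml"),
]


def build_command(script, config, python=sys.executable):
    """Build the argv list for one experiment run."""
    return [python, script, f"--config_path={config}"]


def run_all(experiments, dry_run=True):
    """Run experiments in order; with dry_run=True only return the commands."""
    commands = [build_command(script, config) for script, config in experiments]
    if not dry_run:
        for command in commands:
            # check=True stops the batch at the first failing experiment.
            subprocess.run(command, check=True)
    return commands
```

Calling run_all(EXPERIMENTS) returns the commands without executing them, which makes it easy to review the batch before passing dry_run=False.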
The code is adapted from the TGB_Baselines repository. If this code repository is useful for your research, please consider citing the original authors of the TGB paper as well.
If this repository is helpful for your research, please consider citing our paper below.
@article{postuvan2024learningbased,
title={Learning-Based Link Anomaly Detection in Continuous-Time Dynamic Graphs},
author={Tim Postuvan and Claas Grohnfeldt and Michele Russo and Giulio Lovisotto},
journal={Transactions on Machine Learning Research},
issn={2835-8856},
year={2024},
url={https://openreview.net/forum?id=8imVCizVEw}
}