The PatchDB Dataset


To foster large-scale research on vulnerability mitigation and to enable comparison of different detection approaches, we are making PatchDB, the dataset from our DSN 2021 paper, publicly available.

PatchDB is a large-scale security patch dataset that contains around 12K security patches and 24K non-security patches from the real world. You can find more details on the dataset in the paper "PatchDB: A Large-Scale Security Patch Dataset".

You can see some typical examples on our website. To download the PatchDB dataset, please carefully read the download policy, disclaimer, and agreement.

If you are using PatchDB for work that will result in a publication (thesis, dissertation, paper, article), please use the following citation:

@inproceedings{wang2021PatchDB,
  title={PatchDB: A Large-Scale Security Patch Dataset},
  author={Wang, Xinda and Wang, Shu and Feng, Pengbin and Sun, Kun and Jajodia, Sushil},
  booktitle={2021 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)},
  year={2021},
  organization={IEEE}
}
OR
Xinda Wang, Shu Wang, Pengbin Feng, Kun Sun and Sushil Jajodia, "PatchDB: A Large-Scale 
Security Patch Dataset," 2021 51st Annual IEEE/IFIP International Conference on
Dependable Systems and Networks (DSN), 2021.

Team


The PatchDB dataset was built by the Sun Security Laboratory (SunLab) at George Mason University, Fairfax, VA.


