Download Policy


We are happy to share PatchDB and hope you find the dataset useful in your research.
You have two options to download the PatchDB dataset.

Option 1: download PatchDB in '.json' format from Hugging Face 🤗

You can download the dataset from sunlab/patch_db on Hugging Face.
This repository is publicly accessible. To download the dataset, you need to log in to Hugging Face, share your contact information (email and username), and agree to the terms and conditions (if any).
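
As a minimal sketch of Option 1, the snippet below fetches the repository with the `huggingface_hub` Python package and lists the '.json' files it contains. It assumes that the package is installed, that you have already accepted the access conditions on the dataset page, and that you have logged in locally (for example with `huggingface-cli login`). The function names and the default target directory `patch_db` are illustrative choices, not part of PatchDB, and the layout of the files inside the repository is not assumed here.

```python
from pathlib import Path

REPO_ID = "sunlab/patch_db"  # dataset ID as named on this page


def download_patchdb(local_dir="patch_db"):
    """Download a snapshot of the dataset repository into local_dir.

    Hypothetical helper: assumes access has been granted on Hugging Face
    and a login token is stored locally.
    """
    # Imported lazily so that list_json_files works without the package.
    from huggingface_hub import snapshot_download

    # repo_type="dataset" is needed because this is a dataset, not a model.
    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",
        local_dir=local_dir,
    )


def list_json_files(root):
    """Return every .json file found under root, sorted by path."""
    return sorted(Path(root).rglob("*.json"))
```

Calling `download_patchdb()` and passing the returned path to `list_json_files` should show which '.json' files were retrieved. If the download fails with an authorization error, the most likely cause is that the access conditions have not yet been accepted for the logged-in account.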

Option 2: download the '.zip' dataset using the request form.

Request Steps:

1. Please open the online request form in a browser.
Link to PatchDB Request Form: https://forms.gle/4CXnx9th1GcJAjC4A
(If you are unable to access the page, please contact SunLab by email.)

2. Sign in to your Google account.
Because our request form and download link are hosted by Google, please use your Gmail address as the email address for receiving the form response.

3. In the request form, please include your name, affiliation, work email, homepage, and the purpose of using PatchDB.
The information is needed for verification.
Note that your request may be ignored if we are not able to determine your identity or affiliation.
We do not share your personal information with any third parties.

4. Acknowledge that all the information you provided is correct.

5. Read and acknowledge the Disclaimer & Download Agreement for PatchDB.

6. Submit the request form.
A request receipt will be emailed to the email address you provided.
Once we verify your information, we will email the download link to you as soon as possible.

Disclaimer & Download Agreement


To download the PatchDB dataset, you must agree to the terms of the following Disclaimer & Download Agreement. Please read these terms carefully before submitting the PatchDB request form.

  • PatchDB is constructed and cross-checked by three experts who work in security patch research. Because subjective factors may cause misclassification, the Sun Security Laboratory (SunLab) cannot guarantee 100% accuracy for the samples in the dataset.

  • The copyright of the PatchDB dataset is owned by SunLab.

  • PatchDB should be used only for non-commercial research and/or personal use. The dataset should not be used commercially or for any for-profit purpose.

  • The PatchDB dataset should not be resold or redistributed. Anyone who has obtained PatchDB should not share the dataset with others without permission from SunLab.

Citation


If you are using PatchDB for work that will result in a publication (thesis, dissertation, paper, article), please use the following citation:

@inproceedings{wang2021PatchDB,
  title={PatchDB: A Large-Scale Security Patch Dataset},
  author={Wang, Xinda and Wang, Shu and Feng, Pengbin and Sun, Kun and Jajodia, Sushil},
  booktitle={2021 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)},
  year={2021},
  pages={149--160},
  doi={10.1109/DSN48987.2021.00030}
}
OR
Xinda Wang, Shu Wang, Pengbin Feng, Kun Sun and Sushil Jajodia, "PatchDB: A Large-Scale 
Security Patch Dataset," 2021 51st Annual IEEE/IFIP International Conference on Dependable
Systems and Networks (DSN 2021), 2021, pp. 149-160, doi: 10.1109/DSN48987.2021.00030.

The PatchDB Dataset | Sun Security Laboratory at George Mason University