This is a deep learning dataset for cross-version binary code similarity detection.
To clone it, you need git-lfs. Follow these steps:
- Install git-lfs as described at https://www.atlassian.com/git/tutorials/git-lfs#installing-git-lfs
- Clone the repository: git lfs clone https://github.com/twelveand0/alphadiff-dataset.git
- On Linux, you can extract the dataset with the following commands:
>> cd alphadiff-dataset
>> cat dataset.z01 dataset.z02 dataset.z03 dataset.z04 dataset.z05 dataset.z06 dataset.z07 dataset.z08 dataset.z09 dataset.zip > complete.zip
>> unzip complete.zip
>> unzip data.zip
PS: because the original ZIP file is split into multiple parts, you must first concatenate the parts in order (dataset.z01 through dataset.z09, then dataset.zip).
- On Windows, you can just right-click the dataset.zip file and select extract....
- On Mac, I have not tried it yet.
coming soon...
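
The Linux concatenation step can also be wrapped in a small shell function that checks every part is present before joining them, so that a partial download fails early instead of producing a broken archive. This is only a sketch: the function name `reassemble_dataset` and its two arguments are hypothetical, and the part names are assumed to be exactly those listed in the Linux commands.

```shell
# Sketch only: reassemble_dataset is a hypothetical helper, not part of this repository.
# Usage: reassemble_dataset [directory] [output-name]
reassemble_dataset() {
    dir="${1:-.}"              # directory holding dataset.z01 ... dataset.zip
    out="${2:-complete.zip}"   # name of the reassembled archive

    # Build the ordered list of parts: z01 ... z09, then dataset.zip last.
    set --
    for n in 01 02 03 04 05 06 07 08 09; do
        set -- "$@" "$dir/dataset.z$n"
    done
    set -- "$@" "$dir/dataset.zip"

    # Stop before writing anything if a part is missing.
    for p in "$@"; do
        [ -f "$p" ] || { echo "missing part: $p" >&2; return 1; }
    done

    # Concatenate the parts in order into one archive.
    cat "$@" > "$dir/$out"
}
```

After sourcing the function, `reassemble_dataset alphadiff-dataset` would write alphadiff-dataset/complete.zip, which can then be extracted with the two unzip commands from the Linux step.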