
VinJ: An Automated Tool for Large-Scale Software Vulnerability Data Generation

  • Washington State University Pullman
  • University of Texas at Dallas

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

We present VinJ, an efficient automated tool for generating diverse vulnerability data at large scale. VinJ generates vulnerability data automatically by injecting vulnerabilities into given programs, based on knowledge learned from existing vulnerability data. It covers 18 CWEs with a 69% success rate and generated 686k vulnerability samples in 74 hours (i.e., about 0.4 seconds per sample), which demonstrates its efficiency. The generated data significantly improves existing DL-based vulnerability detection, localization, and repair models. A demo video of VinJ is available at https://youtu.be/-oKoUqBbxD4, and the tool is hosted at https://github.com/NewGillig/VInj. We also release the generated large-scale vulnerability dataset at https://zenodo.org/records/10574446.

Original language: English
Title of host publication: FSE Companion - Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering
Editors: Marcelo d'Amorim
Publisher: Association for Computing Machinery, Inc
Pages: 567-571
Number of pages: 5
ISBN (Electronic): 9798400706585
DOIs
State: Published - Jul 10 2024
Event: 32nd ACM International Conference on the Foundations of Software Engineering, FSE Companion - Porto de Galinhas, Brazil
Duration: Jul 15 2024 to Jul 19 2024

Publication series

Name: FSE Companion - Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering

Conference

Conference: 32nd ACM International Conference on the Foundations of Software Engineering, FSE Companion
Country/Territory: Brazil
City: Porto de Galinhas
Period: 07/15/24 to 07/19/24

Keywords

  • Vulnerability analysis
  • data augmentation
  • deep learning
