TY - GEN
T1 - On the vulnerability proneness of multilingual code
AU - Li, Wen
AU - Li, Li
AU - Cai, Haipeng
N1 - Publisher Copyright:
© 2022 Owner/Author.
PY - 2022/11/7
Y1 - 2022/11/7
N2 - Software construction using multiple languages has long been a norm, yet it is still unclear if multilingual code construction has significant security implications and real security consequences. This paper aims to address this question with a large-scale study of popular multi-language projects on GitHub and their evolution histories, enabled by our novel techniques for multilingual code characterization. We found statistically significant associations between the proneness of multilingual code to vulnerabilities (in general and of specific categories) and its language selection. We also found this association is correlated with that of the language interfacing mechanism, not that of individual languages. We validated our statistical findings with in-depth case studies on actual vulnerabilities, explained via the mechanism and language selection. Our results call for immediate actions to assess and defend against multilingual vulnerabilities, for which we provide practical recommendations.
AB - Software construction using multiple languages has long been a norm, yet it is still unclear if multilingual code construction has significant security implications and real security consequences. This paper aims to address this question with a large-scale study of popular multi-language projects on GitHub and their evolution histories, enabled by our novel techniques for multilingual code characterization. We found statistically significant associations between the proneness of multilingual code to vulnerabilities (in general and of specific categories) and its language selection. We also found this association is correlated with that of the language interfacing mechanism, not that of individual languages. We validated our statistical findings with in-depth case studies on actual vulnerabilities, explained via the mechanism and language selection. Our results call for immediate actions to assess and defend against multilingual vulnerabilities, for which we provide practical recommendations.
KW - cross-language vulnerability
KW - language interfacing
KW - multi-language software
KW - multilingual code
KW - regression analysis
KW - software security
UR - https://www.scopus.com/pages/publications/85143072313
U2 - 10.1145/3540250.3549173
DO - 10.1145/3540250.3549173
M3 - Conference contribution
AN - SCOPUS:85143072313
T3 - ESEC/FSE 2022 - Proceedings of the 30th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering
SP - 847
EP - 859
BT - ESEC/FSE 2022 - Proceedings of the 30th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering
A2 - Roychoudhury, Abhik
A2 - Cadar, Cristian
A2 - Kim, Miryung
PB - Association for Computing Machinery, Inc
T2 - 30th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2022
Y2 - 14 November 2022 through 18 November 2022
ER -