
Poster: Android malware detection using multi-flows and API patterns

  • Feng Shen
  • Justin Del Vecchio
  • Aziz Mohaisen
  • Steven Y. Ko
  • Lukasz Ziarek
  • SUNY Buffalo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citations

Abstract

This paper proposes a new technique for detecting mobile malware based on information flow analysis. Our approach focuses on the structure of information flows we gather in our analysis, and the patterns of behavior present in information flows. Our analysis not only gathers simple flows that have a single source and a single sink, but also Multi-Flows that either start from a single source and flow to multiple sinks, or start from multiple sources and flow to a single sink. This analysis captures more complex behavior that both recent malware and recent benign applications exhibit. We leverage N-gram analysis to understand both unique and common behavioral patterns present in Multi-Flows. Our tool leverages N-gram analysis over sequences of API calls that occur along control flow paths in Multi-Flows to precisely analyze Multi-Flows with respect to app behavior. Using our approach, we show that there is a need to look beyond simple flows in order to effectively leverage information flow analysis for malware detection. By analyzing recently collected malware, we show there has been an evolution in malware beyond simply collecting sensitive information and immediately exposing it. Many previous systems focus on identifying the existence of simple information flows, i.e., considering an information flow as just a (source, sink) pair. However, modern malware performs complex computations before, during, and after collecting sensitive information and tends to aggregate data before exposing it. A simple (source, sink) view of information flow does not adequately capture such behavior. The uniqueness of our approach comes from the following two features. First, our information flow analysis represents an information flow not as a simple (source, sink) pair, but as a sequence of API calls. This gives us the ability to distinguish different flows with the same sources and sinks based on the computation performed along the information flow.
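To illustrate how N-grams over API-call sequences can separate two flows that share a source and a sink, here is a minimal Python sketch. It is not the BlueSeal implementation; the function names `api_ngrams` and `ngram_counts`, the choice of N = 2, and the two example call sequences are assumptions made purely for illustration.

```python
from collections import Counter

def api_ngrams(calls, n):
    """Return every contiguous n-gram (as a tuple) in a sequence of API calls."""
    return [tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)]

def ngram_counts(calls, n):
    """Count how often each n-gram occurs; usable as a sparse feature vector."""
    return Counter(api_ngrams(calls, n))

# Two hypothetical flows with the same source and the same sink,
# but different computation along the path (illustrative data only).
flow_a = ["getDeviceId", "StringBuilder.append", "HttpURLConnection.connect"]
flow_b = ["getDeviceId", "Cipher.doFinal", "HttpURLConnection.connect"]

features_a = ngram_counts(flow_a, 2)
features_b = ngram_counts(flow_b, 2)

# As (source, sink) pairs the flows are identical ...
same_pair = (flow_a[0], flow_a[-1]) == (flow_b[0], flow_b[-1])
# ... but their 2-gram features differ.
same_features = features_a == features_b
```

In this sketch `same_pair` is true while `same_features` is false, which is the distinction a plain (source, sink) view cannot make.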
Second, our information flow analysis detects Multi-Flows, flows that either start with a single source and flow to multiple sinks, or start with multiple sources and flow to a single sink. We treat such flows as a single flow, instead of multiple distinct flows. Fig. 1 shows a comparison of a Multi-Flow and its corresponding simple flows. This allows us to examine the structure of the flows themselves. We leverage machine learning techniques to extract features from Multi-Flows and their API sequences (N-gram analysis) and use these features to perform SVM-based classification. Based on this approach, we build an open source implementation of Multi-Flow analysis and API sequencing in the BlueSeal framework [2] [3] [1], along with N-gram analysis and an SVM-based classifier. We also conduct a detailed evaluation study, highlighting the differences between old and new apps. We leverage the app behavior extracted as features from both Multi-Flows and their API usage patterns and apply machine learning techniques to automatically identify malware based on the structure of its computation over sensitive data. We test our tool on a set of 1,576 benign apps downloaded from Google Play and 2,422 known malicious apps. Our results show that differences in app behavior on sensitive data can be a significant factor in malware detection.
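The Multi-Flow definition above can be sketched as a grouping step over simple (source, sink) pairs. This is only an illustrative sketch under stated assumptions, not the authors' analysis: the function name `group_multi_flows`, the dictionary representation, and the example flows are hypothetical, and the real tool derives flows from static analysis rather than from a hand-written list.

```python
from collections import defaultdict

def group_multi_flows(simple_flows):
    """Group simple (source, sink) pairs into Multi-Flows.

    Returns two dicts:
      fan_out: source -> sorted sinks, for sources reaching more than one sink
      fan_in:  sink -> sorted sources, for sinks reached from more than one source
    """
    by_source = defaultdict(set)
    by_sink = defaultdict(set)
    for source, sink in simple_flows:
        by_source[source].add(sink)
        by_sink[sink].add(source)
    fan_out = {src: sorted(sinks) for src, sinks in by_source.items() if len(sinks) > 1}
    fan_in = {snk: sorted(srcs) for snk, srcs in by_sink.items() if len(srcs) > 1}
    return fan_out, fan_in

# Hypothetical simple flows (illustrative data only).
flows = [
    ("getDeviceId", "sendTextMessage"),
    ("getDeviceId", "Log.d"),
    ("getLastKnownLocation", "HttpURLConnection.connect"),
    ("getLine1Number", "HttpURLConnection.connect"),
]

fan_out, fan_in = group_multi_flows(flows)
```

Here the four simple flows collapse into one single-source Multi-Flow (from `getDeviceId`) and one single-sink Multi-Flow (into `HttpURLConnection.connect`), each of which would then be treated as a single unit for feature extraction.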

Original language: English
Title of host publication: MobiSys 2017 - Proceedings of the 15th Annual International Conference on Mobile Systems, Applications, and Services
Publisher: Association for Computing Machinery, Inc
Pages: 171
Number of pages: 1
ISBN (Electronic): 9781450349284
State: Published - Jun 16 2017
Event: 15th ACM International Conference on Mobile Systems, Applications, and Services, MobiSys 2017 - Niagara Falls, United States
Duration: Jun 19 2017 to Jun 23 2017

Publication series

Name: MobiSys 2017 - Proceedings of the 15th Annual International Conference on Mobile Systems, Applications, and Services

Conference

Conference: 15th ACM International Conference on Mobile Systems, Applications, and Services, MobiSys 2017
Country/Territory: United States
City: Niagara Falls
Period: 06/19/17 to 06/23/17
