DOI: 10.1145/2505515.2507848
poster

Software plagiarism detection: a graph-based approach

Published: 27 October 2013

Abstract

As software plagiarism increases rapidly, there is a growing need for software plagiarism detection systems. In this paper, we propose a software plagiarism detection system that uses an API-labeled control flow graph (A-CFG) to abstract the functionality of a program. The A-CFG reflects both the sequence and the frequency of API calls, whereas previous work rarely considers both together. To compare a pair of A-CFGs scalably, we use random walk with restart (RWR), which computes an importance score for each node in a graph. With RWR, we generate a single score vector for each A-CFG and compare A-CFGs by comparing their score vectors. Extensive evaluations on a set of Windows applications demonstrate the effectiveness and scalability of the proposed system compared with existing methods.
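
The abstract does not give the details of the RWR computation or of the vector comparison, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a uniform restart distribution, a restart probability of 0.15, score aggregation per API label so that vectors from different graphs share coordinates, and cosine similarity as the comparison measure. All function names and both toy graphs are hypothetical.

```python
import math


def rwr_scores(edges, nodes, restart=0.15, iters=200, tol=1e-12):
    """Random walk with restart, solved by power iteration.

    Assumptions (not from the paper): the walker restarts at a node chosen
    uniformly at random, and a node without outgoing edges jumps uniformly.
    """
    out = {v: [] for v in nodes}
    for src, dst in edges:
        out[src].append(dst)
    base = 1.0 / len(nodes)
    score = {v: base for v in nodes}
    for _ in range(iters):
        nxt = {v: restart * base for v in nodes}
        for v in nodes:
            targets = out[v] or nodes
            share = (1.0 - restart) * score[v] / len(targets)
            for t in targets:
                nxt[t] += share
        delta = sum(abs(nxt[v] - score[v]) for v in nodes)
        score = nxt
        if delta < tol:
            break
    return score


def api_vector(edges, labels):
    """Sum node scores per API label to get one score vector per graph."""
    scores = rwr_scores(edges, list(labels))
    vec = {}
    for node, api in labels.items():
        vec[api] = vec.get(api, 0.0) + scores[node]
    return vec


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(x * v.get(k, 0.0) for k, x in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


# Toy A-CFG: open a file, read in a loop, close it.
labels_a = {0: "CreateFileW", 1: "ReadFile", 2: "ReadFile", 3: "CloseHandle"}
edges_a = [(0, 1), (1, 2), (2, 1), (2, 3)]

# Same structure with renumbered nodes, standing in for a copied program.
labels_b = {10: "CreateFileW", 11: "ReadFile", 12: "ReadFile", 13: "CloseHandle"}
edges_b = [(10, 11), (11, 12), (12, 11), (12, 13)]

# An unrelated program that shares no API labels with the first two.
labels_c = {0: "socket", 1: "connect", 2: "send"}
edges_c = [(0, 1), (1, 2)]

vec_a = api_vector(edges_a, labels_a)
vec_b = api_vector(edges_b, labels_b)
vec_c = api_vector(edges_c, labels_c)

sim_copy = cosine(vec_a, vec_b)
sim_other = cosine(vec_a, vec_c)
print(sim_copy, sim_other)
```

In this sketch the loop between the two `ReadFile` nodes raises the aggregated `ReadFile` score, which is one way a single vector could carry both call order (through the edges) and call frequency (through repeated labels). Comparing two fixed-length vectors also avoids pairwise graph matching, which is where the scalability benefit would come from.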

    Published In

    CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management
    October 2013
    2612 pages
    ISBN:9781450322638
    DOI:10.1145/2505515

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. binary analysis
    2. graph
    3. similarity
    4. software plagiarism

    Qualifiers

    • Poster

    Conference

    CIKM'13
    CIKM'13: 22nd ACM International Conference on Information and Knowledge Management
    October 27 - November 1, 2013
    San Francisco, California, USA

    Acceptance Rates

    CIKM '13 Paper Acceptance Rate 143 of 848 submissions, 17%;
    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
