Published December 25, 2009 | Version 13000
Journal article Open

Syntax Sensitive and Language Independent Detection of Code Clones

Creators

Description

This paper proposes a new technique for detecting code clones from both the lexical and the syntactic point of view, based on the PALEX source code representation. PALEX code contains the recorded parsing actions together with lexical formatting information, including white space and comments. The list of parsing actions (shift, reduce, and reading a token) is recorded while the compiler parses the source code, so it is complete once the compiler has finished its analysis. The proposed technique has two advantages: it is syntax sensitive, and it is language independent.
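The idea of recording parsing actions and comparing them can be illustrated with a toy sketch. This is not the PALEX format and not the paper's tool: the tiny assignment grammar, the function names `parse_actions` and `longest_shared_run`, and the tuple layout of each action are all assumptions made for illustration. Unlike PALEX as described above, the sketch also discards white space and lexemes rather than recording them.

```python
import re
from difflib import SequenceMatcher

# Hypothetical toy grammar (an assumption, not taken from the paper):
#   stmt -> id = expr ;      expr -> expr + term | term      term -> id | num
TOKEN = re.compile(r"\s*(?:(?P<id>[A-Za-z_]\w*)|(?P<num>\d+)|(?P<op>[+;=]))")

RULES = [
    (("expr", "+", "term"), "expr"),
    (("term",), "expr"),
    (("id", "=", "expr", ";"), "stmt"),
]


def tokenize(src):
    """Return token kinds ('id', 'num', '+', '=', ';'); lexemes are dropped."""
    src = src.rstrip()
    kinds, pos = [], 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise ValueError(f"unexpected input at offset {pos}")
        group = m.lastgroup
        kinds.append(m.group(group) if group == "op" else group)
        pos = m.end()
    return kinds


def parse_actions(src):
    """Run a naive shift-reduce parse and return the list of recorded actions."""
    tokens = tokenize(src)
    stack, actions = [], []
    for i, tok in enumerate(tokens):
        lookahead = tokens[i + 1] if i + 1 < len(tokens) else None
        actions.append(("read", tok))
        stack.append(tok)
        actions.append(("shift", tok))
        # An operand becomes a term, unless it is the target of an assignment.
        if tok == "num" or (tok == "id" and lookahead != "="):
            stack[-1] = "term"
            actions.append(("reduce", f"term -> {tok}"))
        reduced = True
        while reduced:  # keep reducing while some rule matches the stack top
            reduced = False
            for rhs, lhs in RULES:
                if tuple(stack[-len(rhs):]) == rhs:
                    del stack[-len(rhs):]
                    stack.append(lhs)
                    actions.append(("reduce", f"{lhs} -> {' '.join(rhs)}"))
                    reduced = True
                    break
    return actions


def longest_shared_run(actions_a, actions_b):
    """Length of the longest contiguous run of actions common to both lists."""
    matcher = SequenceMatcher(None, actions_a, actions_b, autojunk=False)
    return matcher.find_longest_match(0, len(actions_a), 0, len(actions_b)).size


if __name__ == "__main__":
    a = parse_actions("x = a + 1;")
    b = parse_actions("total   =  price + 42 ;")
    print(a == b)  # renamed identifiers and spacing do not change the actions
    print(longest_shared_run(a, parse_actions("y = b;")))
```

Because the actions carry token kinds and grammar rules rather than the source text, two fragments that differ only in names or layout yield the same sequence. The comparison step itself knows nothing about the grammar, which is one way to read the language-independence claim; a real detector would need a far more scalable matching method than `difflib`.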

Files

13000.pdf (133.6 kB)
md5:d8b8962efcd81f71af35a88379ef49c0

Additional details

References

  • Bill Moggridge, "Designing Interactions," The MIT Press, 2007.
  • Brenda S. Baker, "On Finding Duplication and Near-Duplication in Large Software Systems," Working Conference on Reverse Engineering, pp.86-95, 1995.
  • Ira D. Baxter, Andrew Yahin, et al., "Clone Detection Using Abstract Syntax Trees," International Conference on Software Maintenance, pp.368-377, 1998.
  • Stéphane Ducasse, Matthias Rieger, Serge Demeyer, "A Language Independent Approach for Detecting Duplicated Code," 15th IEEE International Conference on Software Maintenance, pp.109-118, 1999.
  • Cory Kapser and Michael W. Godfrey, "'Cloning Considered Harmful' Considered Harmful," Working Conference on Reverse Engineering, pp.19-28, 2006.
  • Kazuaki Maeda, "XML-Based Source Code Representation with Parsing Actions," The International Conference on Software Engineering Research and Practice, 2007.
  • PMD: Finding copied and pasted code, available from http://pmd.sourceforge.net/cpd.html (accessed 2009-11-28).
  • Toshihiro Kamiya, Shinji Kusumoto, Katsuro Inoue, "CCFinder: A Multilinguistic Token-Based Code Clone Detection System for Large Scale Source Code," IEEE Transactions on Software Engineering, vol.28, no.7, pp.654-670, Jul. 2002.
  • Vera Wahler, Dietmar Seipel, et al., "Clone Detection in Source Code by Frequent Itemset Techniques," IEEE International Workshop on Source Code Analysis and Manipulation, pp.128-135, 2004.
  • William S. Evans, Christopher W. Fraser, Fei Ma, "Clone Detection via Structural Abstraction," Software Quality Journal, vol.17, no.4, pp.309-330, 2009.
  • Raghavan Komondoor, Susan Horwitz, "Using Slicing to Identify Duplication in Source Code," pp.40-56, LNCS vol.2126, 2001.
  • Jens Krinke, "Identifying Similar Code with Program Dependence Graphs," Working Conference on Reverse Engineering, pp.301-309, 2001.
  • Chao Liu, Chen Chen, et al., "GPLAG: Detection of Software Plagiarism by Program Dependence Graph Analysis," The 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp.872-881, 2006.
  • Stephen C. Johnson, "Yacc: Yet Another Compiler Compiler," UNIX Programmer's Manual, vol. 2, pp. 353-387, 1979.
  • Charles Donnelly, Richard Stallman, "Bison - The Yacc-Compatible Parser Generator," Free Software Foundation, 2006.
  • Maxime Crochemore, Christophe Hancart, Thierry Lecroq, "Algorithms on Strings," Cambridge University Press, 2001.