Clones

2020

  • H. W. Alomari and M. Stephan, “srcClone: Detecting Code Clones via Decompositional Slicing,” International conference on program comprehension (icpc), vol. 11, iss. 20, 2020. doi:10.1145/3387904

    Detecting code clones is an established method for comprehending and maintaining systems. One important but challenging form of code clone detection involves detecting semantic clones, which are those that are semantically similar code segments that differ syntactically. Existing approaches to semantic clone detection do not scale well to large code bases and have room for improvement in their precision and recall. In this paper, we present a scalable slicing-based approach for detecting code clones, including semantic clones. We determine code segment similarity based on their corresponding program slices. We take advantage of a lightweight, publicly available, and scalable program slicing approach to compute the necessary information. Our approach uses dependency analysis to find and measure cloned elements, and provides insights into elements of the code that are affected by an entire clone set/-class. We have implemented our approach as a tool called srcClone. We evaluate it by comparing it to two semantic clone detectors in terms of clones, performance, and scalability; and perform recall and precision analysis using established benchmark scenarios. In our evaluation, we illustrate our approach is both relatively scalable and accurate. srcClone can also be used by program analysts to run on non-compilable and incomplete source code, which serves comprehension and maintenance tasks very well. We believe our approach is an important advancement in program comprehension that can help improve clone detection practices and provide developers greater insights into their software.

    @article{alomari_srcclone_2020,
    title = {{srcClone}: {Detecting} {Code} {Clones} via {Decompositional} {Slicing}},
    volume = {11},
    url = {http://www.users.miamioh.edu/stephamd/papers/icpc2020.pdf},
    doi = {10.1145/3387904},
    number = {20},
    journal = {International Conference on Program Comprehension (ICPC)},
    author = {Alomari, Hakam W and Stephan, Matthew},
    year = {2020},
    note = {ISBN: 9781450379588
    Publisher: ACM},
    keywords = {Code clone, Clone detection, Program slicing, Semantic clones}
    }

  • S. Baltes and C. Treude, “Code Duplication on Stack Overflow,” 42nd international conference on software engineering: new ideas and emerging results (icse-nier 2020), 2020. doi:10.1145/3377816.3381744

    Despite the unarguable importance of Stack Overflow (SO) for the daily work of many software developers and despite existing knowledge about the impact of code duplication on software maintainability, the prevalence and implications of code clones on SO have not yet received the attention they deserve. In this paper, we motivate why studies on code duplication within SO are needed and how existing studies on code reuse differ from this new research direction. We present similarities and differences between code clones in general and code clones on SO and point to open questions that need to be addressed to be able to make data-informed decisions about how to properly handle clones on this important platform. We present results from a first preliminary investigation, indicating that clones on SO are common and diverse. We further point to specific challenges, including incentives for users to clone successful answers and difficulties with bulk edits on the platform, and conclude with possible directions for future work.

    @article{baltes_code_2020,
    title = {Code {Duplication} on {Stack} {Overflow}},
    url = {https://arxiv.org/abs/2002.01275},
    doi = {10.1145/3377816.3381744},
    author = {Baltes, Sebastian and Treude, Christoph},
    month = feb,
    year = {2020},
    journal = {42nd International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER 2020)},
    note = {ISBN: 9781450371261
    \_eprint: 2002.01275v1},
    keywords = {code clones, code duplication, software evolution, software licenses, software maintenance, stack overflow}
    }

  • P. Gautam and H. Saini, “Mutation testing-based evaluation framework for evaluating software clone detection tools,” in Lecture notes in mechanical engineering, 2020, pp. 21-35. doi:10.1007/978-981-15-3746-2_3

    Mutation testing has become a prominent research area in the past few decades. The mutation testing has been basically used in the testing society. It is a type of software testing where we mutate (small change, modification in the program) source code using mutant operators by introducing potential new bugs in the program code without changing its behavior. Analogously, mutant operators generate new clones by copy/paste editing activities. However, several software clone detection tools and techniques have been introduced by numerous scientists and a large number of tools comprises for a perceivable evaluation. Moreover, there have been a lot of efforts to empirically assess and analyze variant state-of-the-art tools. The current abstraction exhibits that various aspects that could leverage the legitimacy of the outcome of such assessment have been roughly anticipated due to lack of legitimized software clone benchmark. In this paper, we present a mutation testing-based automatic evaluation structure for valuating software clone detection tools and techniques. The proposed framework uses the edit-based taxonomy of mutation operator for assessing code clone detection tools. The proposed structure injects software clones in the source code automatically, and after that, we evaluate clone detection tools. The clone detection tools are evaluated on the basis of precision (number of corrected clones) and recall (total number of clones). We visualize that such a framework will present a valuable augmentation to the research community.

    @inproceedings{gautam_mutation_2020,
    title = {Mutation Testing-Based Evaluation Framework for Evaluating Software Clone Detection Tools},
    isbn = {9789811537455},
    url = {https://link.springer.com/chapter/10.1007/978-981-15-3746-2_3},
    doi = {10.1007/978-981-15-3746-2_3},
    booktitle = {Lecture Notes in Mechanical Engineering},
    publisher = {Springer},
    author = {Gautam, Pratiksha and Saini, Hemraj},
    year = {2020},
    note = {ISSN: 21954364},
    keywords = {Mutation analysis, Mutation operators, Mutation techniques, Software clone},
    pages = {21-35}
    }

  • M. Hammad, H. A. Basit, S. Jarzabek, and R. Koschke, “A systematic mapping study of clone visualization,” Computer science review, vol. 37, p. 100266, 2020. doi:10.1016/j.cosrev.2020.100266

    Knowing code clones (similar code fragments) is helpful in software maintenance and re-engineering. As clone detectors return huge numbers of clones, visualization techniques have been proposed to make cloning information more comprehensible and useful for programmers. We present a mapping study of clone visualization techniques, classifying visualizations in respect to the user goals to be achieved by means of clone visualizations and relevant clone-related information needs. Our mapping study will aid tool users in selecting clone visualization tools suitable for the task at hand, tool vendors in improving capabilities of their tools, and researchers in identifying open problems in clone visualization research.

    @article{hammad_survey_2020,
    title = {A systematic mapping study of clone visualization},
    volume = {37},
    url = {https://doi.org/10.1016/j.cosrev.2020.100266},
    doi = {10.1016/j.cosrev.2020.100266},
    journal = {Computer Science Review},
    author = {Hammad, Muhammad and Basit, Hamid Abdul and Jarzabek, Stan and Koschke, Rainer},
    year = {2020},
    note = {ISBN: 2020.100266},
    keywords = {Clone, Feature analysis, Human-computer interaction, Information needs, User goals, Visualization techniques},
    pages = {100266}
    }

  • D. Lee, U. Ko, I. Aitkazin, S. Park, H. Tak, and H. Cho, “A fast detecting method for clone functions using global alignment of token sequences,” in Proceedings of the 2020 12th international conference on machine learning and computing, New York, NY, USA, 2020, pp. 17–22. doi:10.1145/3383972.3384014
    @inproceedings{10.1145/3383972.3384014,
    author = {Lee, Da-Young and Ko, Uram and Aitkazin, Ibrahim and Park, SangUn and Tak, Hae-Sung and Cho, Hwan-Gue},
    title = {A Fast Detecting Method for Clone Functions Using Global Alignment of Token Sequences},
    year = {2020},
    isbn = {9781450376426},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3383972.3384014},
    doi = {10.1145/3383972.3384014},
    booktitle = {Proceedings of the 2020 12th International Conference on Machine Learning and Computing},
    pages = {17--22},
    numpages = {6},
    keywords = {clone function, global alignment, Clone detection, code analysis},
    location = {Shenzhen, China},
    series = {ICMLC 2020}
    }

  • G. Mostaeen, B. Roy, C. Roy, K. Schneider, and J. Svajlenko, “A machine learning based framework for code clone validation,” 2020.
    @techreport{mostaeen_machine_2020,
    title={A Machine Learning Based Framework for Code Clone Validation},
    url = {https://arxiv.org/abs/2005.00967},
    author = {Mostaeen, Golam and Roy, Banani and Roy, Chanchal and Schneider, Kevin and Svajlenko, Jeffrey},
    year = {2020},
    journal = {Journal of Systems and Software},
    note = {Publication Title: arxiv.org},
    keywords = {Clone management, Code clones, Machine learning, Validation}
    }

  • W. Vanhoof and G. Yernaux, “Generalization-Driven Semantic Clone Detection in CLP,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2020, pp. 228-242. doi:10.1007/978-3-030-45260-5_14

    In this work we provide an algorithm capable of searching for semantic clones in CLP program code. Two code fragments are considered semantically cloned (at least to some extent) when they can both be transformed into a single code fragment thus representing the functionality that is shared between the fragments. While the framework of what constitutes such semantic clones has been established before, it is parametrized by a set of admissible program transformations and no algorithm exists that effectively performs the search with a concrete set of allowed transformations. In this work we use the well-known unfolding and slicing transformations to establish such an algorithm, and we show how the generalization of CLP goals can be a driving factor both for controlling the search process (i.e. keeping it finite) as for guiding the search (i.e. choosing what transformation(s) to apply at what moment).

    @inproceedings{vanhoof_generalization-driven_2020,
    title = {Generalization-{Driven} {Semantic} {Clone} {Detection} in {CLP}},
    volume = {12042 LNCS},
    isbn = {978-3-030-45259-9},
    url = {https://link.springer.com/chapter/10.1007/978-3-030-45260-5_14},
    doi = {10.1007/978-3-030-45260-5_14},
    booktitle = {Lecture {Notes} in {Computer} {Science} (including subseries {Lecture} {Notes} in {Artificial} {Intelligence} and {Lecture} {Notes} in {Bioinformatics})},
    publisher = {Springer},
    author = {Vanhoof, Wim and Yernaux, Gonzague},
    year = {2020},
    note = {ISSN: 16113349},
    pages = {228-242}
    }

  • A. Walker, T. Cerny, and E. Song, “Open-source tools and benchmarks for code-clone detection,” Acm sigapp applied computing review, vol. 19, iss. 4, pp. 28-39, 2020. doi:10.1145/3381307.3381310

    Code duplication is a common problem, and a well known sign of bad design. But Code duplication is one of the most popular forms of software reuse among developers. Clone detection or code duplication detection is the technique concerned with the identification of code fragments that essentially compute the same results. The primary aim of clone detection is to identify clone code and replace them with a single function call where the function would mimic the behavior of a single instance from the set of clones. As a result of that, in the last decade, the issue of detecting code duplication led to various tools that can automatically find duplicated blocks of code. In this paper different methods for code clone detection, different tools and technique used for that and the code analysis will be discussed.

    @article{walker_open-source_2020,
    title = {Open-source tools and benchmarks for code-clone detection},
    volume = {19},
    issn = {1559-6915},
    url = {https://dl.acm.org/doi/abs/10.1145/3381307.3381310},
    doi = {10.1145/3381307.3381310},
    number = {4},
    journal = {ACM SIGAPP Applied Computing Review},
    author = {Walker, Andrew and Cerny, Tomas and Song, Eungee},
    month = jan,
    year = {2020},
    note = {Publisher: Association for Computing Machinery (ACM)},
    pages = {28-39}
    }

  • J. Akram, M. Mumtaz, and P. Luo, “IBFET: Index-based features extraction technique for scalable code clone detection at file level granularity,” Software – practice and experience, vol. 50, iss. 1, pp. 22-46, 2020. doi:10.1002/spe.2759

    Many techniques have been developed over the years to detect code clones in different software systems to maintain security measures. These techniques often require the source code to compare the subject system against a very large data set of big code. This paper presents index-based features extraction technique (IBFET) to detect code clones at a very large-scale level to billions of LOC at file level granularity. We performed preprocessing, indexing, and clone detection for more than 324 billion of LOC using a Hadoop distributed environment, which is quite faster and more efficient as compared to existing distributed indexing and clone detection techniques; meanwhile, it detects all three types of clones efficiently. The MapReduce rule of divide and conquer is used for a count and retrieve the similar features between different systems. We evaluated the execution time, scalability, precision, and recall of IBFET by using a well-known clone detection data set IJaDataset and BigCloneBench; furthermore, we compared the results with other state-of-the-art tools. Our approach is faster, flexible, scalable, and provides accurate results with high authenticity and can be implemented at a large-scale level.

    @article{akram_ibfet_2020,
    title = {{IBFET}: {Index}-based features extraction technique for scalable code clone detection at file level granularity},
    volume = {50},
    issn = {1097024X},
    doi = {10.1002/spe.2759},
    url = {https://onlinelibrary.wiley.com/doi/full/10.1002/spe.2759},
    number = {1},
    journal = {Software - Practice and Experience},
    author = {Akram, Junaid and Mumtaz, Majid and Luo, Ping},
    month = jan,
    year = {2020},
    note = {Publisher: John Wiley and Sons Ltd},
    keywords = {clone detection, plagiarism detection, big code, code similarity detection, software reuse, software security and maintenance},
    pages = {22-46}
    }

  • J. Akram, “Droidsd: an efficient indexed based android applications similarity detection tool,” Journal of information science and engineering, pp. 13-29, 2020. doi:10.6688/JISE.202001
    @article{akram_droidsd_2020,
    author = {Akram, Junaid},
    year = {2020},
    month = {01},
    pages = {13-29},
    title = {DroidSD: An Efficient Indexed Based Android Applications Similarity Detection Tool},
    journal = {Journal of Information Science and Engineering},
    doi = {10.6688/JISE.202001}
    }

  • D. Alfageh, H. Alhakami, A. Baz, E. Alanazi, and T. Alsubait, “Clone detection techniques for javascript and language independence: review,” International journal of advanced computer science and applications, vol. 11, 2020. doi:10.14569/IJACSA.2020.01104102
    @article{alfageh_clone_2020,
    author = {Alfageh, Danyah and Alhakami, Hosam and Baz, Abdullah and Alanazi, Eisa and Alsubait, Tahani},
    year = {2020},
    month = {01},
    pages = {},
    title = {Clone Detection Techniques for JavaScript and Language Independence: Review},
    volume = {11},
    journal = {International Journal of Advanced Computer Science and Applications},
    doi = {10.14569/IJACSA.2020.01104102}
    }

  • V. Bandi, C. K. Roy, and C. Gutwin, “Clone swarm: a cloud based code-clone analysis tool,” in 2020 ieee 14th international workshop on software clones (iwsc), 2020, pp. 52-56.
    @INPROCEEDINGS{9047642,
    author={V. Bandi and C. K. Roy and C. Gutwin},
    booktitle={2020 IEEE 14th International Workshop on Software Clones (IWSC)},
    title={Clone Swarm: A Cloud Based Code-Clone Analysis Tool},
    year={2020},
    volume={},
    number={},
    url = {https://ieeexplore.ieee.org/abstract/document/9047642/},
    pages={52-56},
    }

  • S. Bharti and H. Singh, “Comprehending code fragment in code clones: a literature-based perspective,” in Proceedings of icric 2019, 2020, pp. 785-795.
    @inproceedings{bharti_comprehending_nodate,
    title = {Comprehending Code Fragment in Code Clones: A Literature-Based Perspective},
    url = {https://link.springer.com/chapter/10.1007/978-3-030-29407-6_56},
    booktitle={Proceedings of ICRIC 2019},
    year = {2020},
    pages={785-795},
    author={Bharti, Sarveshwar and Singh, Hardeep},
    }

  • S. Bharti and H. Singh, “Proactively managing clones inside an ide: a systematic literature review,” International journal of computers and applications, pp. 1-20, 2020. doi:10.1080/1206212X.2020.1720952
    @article{doi:10.1080/1206212X.2020.1720952,
    author = {Sarveshwar Bharti and Hardeep Singh},
    title = {Proactively managing clones inside an IDE: a systematic literature review},
    journal = {International Journal of Computers and Applications},
    pages = {1-20},
    year = {2020},
    publisher = {Taylor \& Francis},
    doi = {10.1080/1206212X.2020.1720952},
    URL = {https://doi.org/10.1080/1206212X.2020.1720952},
    eprint = {https://doi.org/10.1080/1206212X.2020.1720952}
    }

  • P. M. Caldeira, K. Sakamoto, H. Washizaki, Y. Fukazawa, and T. Shimada, “Improving syntactical clone detection methods through the use of an intermediate representation,” in 2020 ieee 14th international workshop on software clones (iwsc), 2020, pp. 8-14.
    @INPROCEEDINGS{9047637,
    author={P. M. {Caldeira} and K. {Sakamoto} and H. {Washizaki} and Y. {Fukazawa} and T. {Shimada}},
    booktitle={2020 IEEE 14th International Workshop on Software Clones (IWSC)},
    title={Improving Syntactical Clone Detection Methods through the Use of an Intermediate Representation},
    year={2020},
    url = {http://doi.org/10.1109/IWSC50091.2020.9047637},
    volume={},
    number={},
    pages={8-14},
    }

  • C. Fang, Z. Liu, Y. Shi, J. Huang, and Q. Shi, “Functional code clone detection with syntax and semantics fusion learning,” in Proceedings of the 29th acm sigsoft international symposium on software testing and analysis, New York, NY, USA, 2020, pp. 516–527. doi:10.1145/3395363.3397362
    @inproceedings{10.1145/3395363.3397362,
    author = {Fang, Chunrong and Liu, Zixi and Shi, Yangyang and Huang, Jeff and Shi, Qingkai},
    title = {Functional Code Clone Detection with Syntax and Semantics Fusion Learning},
    year = {2020},
    isbn = {9781450380089},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3395363.3397362},
    doi = {10.1145/3395363.3397362},
    booktitle = {Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis},
    pages = {516--527},
    numpages = {12},
    keywords = {code representation, functional clone detection, Code clone detection, syntax and semantics fusion learning},
    location = {Virtual Event, USA},
    series = {ISSTA 2020}
    }

  • Z. Gao, L. Jiang, X. Xia, D. Lo, and J. Grundy, “Checking smart contracts with structural code embedding,” Ieee transactions on software engineering, pp. 1-1, 2020.
    @ARTICLE{8979435,
    author={Z. {Gao} and L. {Jiang} and X. {Xia} and D. {Lo} and J. {Grundy}},
    journal={IEEE Transactions on Software Engineering},
    title={Checking Smart Contracts with Structural Code Embedding},
    year={2020},
    url = {https://ieeexplore.ieee.org/abstract/document/8979435/},
    volume={},
    number={},
    pages={1-1},
    }

  • S. Gholamian and P. A. S. Ward, “Logging statements’ prediction based on source code clones,” in Proceedings of the 35th annual acm symposium on applied computing, New York, NY, USA, 2020, pp. 82–91. doi:10.1145/3341105.3373845
    @inproceedings{10.1145/3341105.3373845,
    author = {Gholamian, Sina and Ward, Paul A. S.},
    title = {Logging Statements’ Prediction Based on Source Code Clones},
    year = {2020},
    isbn = {9781450368667},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3341105.3373845},
    doi = {10.1145/3341105.3373845},
    booktitle = {Proceedings of the 35th Annual ACM Symposium on Applied Computing},
    pages = {82--91},
    numpages = {10},
    keywords = {logging statement, source code, automation, software engineering, code clones},
    location = {Brno, Czech Republic},
    series = {SAC ’20}
    }

  • M. Hammad, H. A. Basit, S. Jarzabek, and R. Koschke, “A systematic mapping study of clone visualizations,” Computer science review, vol. 37, p. 100266, 2020. doi:10.1016/j.cosrev.2020.100266

    Knowing code clones (similar code fragments) is helpful in software maintenance and re-engineering. As clone detectors return huge numbers of clones, visualization techniques have been proposed to make cloning information more comprehensible and useful for programmers. We present a mapping study of clone visualization techniques, classifying visualizations in respect to the user goals to be achieved by means of clone visualizations and relevant clone-related information needs. Our mapping study will aid tool users in selecting clone visualization tools suitable for the task at hand, tool vendors in improving capabilities of their tools, and researchers in identifying open problems in clone visualization research.

    @article{hammad_survey_2020-1,
    title = {A systematic mapping study of clone visualizations},
    volume = {37},
    url = {http://www.sciencedirect.com/science/article/pii/S1574013719302679},
    doi = {10.1016/j.cosrev.2020.100266},
    journal = {Computer Science Review},
    author = {Hammad, Muhammad and Basit, Hamid Abdul and Jarzabek, Stan and Koschke, Rainer},
    year = {2020},
    keywords = {Clone, Feature analysis, Human-computer interaction, Information needs, User goals, Visualization techniques},
    pages = {100266}
    }

  • Y. Hung and S. Takada, “Cppcd: a token-based approach to detecting potential clones,” in 2020 ieee 14th international workshop on software clones (iwsc), 2020, pp. 26-32.
    [BibTeX] [PDF]
    @INPROCEEDINGS{9047636,
    author={Y. {Hung} and S. {Takada}},
    booktitle={2020 IEEE 14th International Workshop on Software Clones (IWSC)},
    title={CPPCD: A Token-Based Approach to Detecting Potential Clones},
    year={2020},
    url = {https://ieeexplore.ieee.org/abstract/document/9047636/},
    volume={},
    number={},
    pages={26-32},
    }

  • G. Li, Y. Wu, C. K. Roy, J. Sun, X. Peng, N. Zhan, B. Hu, and J. Ma, “Saga: efficient and large-scale detection of near-miss clones with gpu acceleration,” in 2020 ieee 27th international conference on software analysis, evolution and reengineering (saner), 2020, pp. 272-283.
    [BibTeX] [PDF]
    @INPROCEEDINGS{9054832,
    author={G. {Li} and Y. {Wu} and C. K. {Roy} and J. {Sun} and X. {Peng} and N. {Zhan} and B. {Hu} and J. {Ma}},
    booktitle={2020 IEEE 27th International Conference on Software Analysis, Evolution and Reengineering (SANER)},
    title={SAGA: Efficient and Large-Scale Detection of Near-Miss Clones with GPU Acceleration},
    year={2020},
    url = {https://ieeexplore.ieee.org/abstract/document/9054832/},
    volume={},
    number={},
    pages={272-283},}

  • S. Li, X. Niu, Z. Jia, X. Liao, J. Wang, and T. Li, “Guiding log revisions by learning from software evolution history,” Empirical software engineering, pp. 2302-2340, 2020. doi:10.1007/s10664-019-09757-y
    [BibTeX] [Abstract] [PDF]

    Despite the importance of log statements in postmortem debugging, developers are difficult to establish good logging practices. There are mainly two reasons. First, there are no rigorous specifications or systematic processes to instruct logging practices. Second, logging code evolves with bug fixes or feature updates. Without considering the impact of software evolution, previous works on log enhancement can partially release the first problem but are hard to solve the latter. To fill this gap, this paper proposes to guide log revisions by learning from evolution history. Motivated by code clones, we assume that logging code with similar context is pervasive and deserves similar modifications and conduct an empirical study on 12 open-source projects to validate our assumption. Upon this, we design and implement LogTracker, an automatic tool that learns log revision rules by mining the correlation between logging context and modifications and recommends candidate log revisions by applying these rules. With an enhanced modeling of logging context, LogTracker can instruct more intricate log revisions that cannot be covered by existing tools. Our experiments show that LogTracker can detect 369 instances of candidates when applied to the latest versions of software. So far, we have reported 79 of them, and 52 have been accepted.

    @article{li_guiding_nodate,
    title = {Guiding log revisions by learning from software evolution history},
    url = {https://doi.org/10.1007/s10664-019-09757-y},
    doi = {10.1007/s10664-019-09757-y},
    abstract = {Despite the importance of log statements in postmortem debugging, developers are difficult to establish good logging practices. There are mainly two reasons. First, there are no rigorous specifications or systematic processes to instruct logging practices. Second, logging code evolves with bug fixes or feature updates. Without considering the impact of software evolution, previous works on log enhancement can partially release the first problem but are hard to solve the latter. To fill this gap, this paper proposes to guide log revisions by learning from evolution history. Motivated by code clones, we assume that logging code with similar context is pervasive and deserves similar modifications and conduct an empirical study on 12 open-source projects to validate our assumption. Upon this, we design and implement LogTracker, an automatic tool that learns log revision rules by mining the correlation between logging context and modifications and recommends candidate log revisions by applying these rules. With an enhanced modeling of logging context, LogTracker can instruct more intricate log revisions that cannot be covered by existing tools. Our experiments show that LogTracker can detect 369 instances of candidates when applied to the latest versions of software. So far, we have reported 79 of them, and 52 have been accepted.},
    journal = {Empirical Software Engineering},
    pages = {2302-2340},
    year = {2020},
    author = {Li, Shanshan and Niu, Xu and Jia, Zhouyang and Liao, Xiangke and Wang, Ji and Li, Tao},
    keywords = {Software evolution, Empirical study, Failure diagnose, Log revision}
    }

  • M. Mondal, B. Roy, C. K. Roy, and K. A. Schneider, “Associating code clones with association rules for change impact analysis,” in 2020 ieee 27th international conference on software analysis, evolution and reengineering (saner), 2020, pp. 93-103.
    [BibTeX] [PDF]
    @INPROCEEDINGS{9054846,
    author={M. {Mondal} and B. {Roy} and C. K. {Roy} and K. A. {Schneider}},
    booktitle={2020 IEEE 27th International Conference on Software Analysis, Evolution and Reengineering (SANER)},
    title={Associating Code Clones with Association Rules for Change Impact Analysis},
    year={2020},
    url = {https://ieeexplore.ieee.org/abstract/document/9054846/},
    volume={},
    number={},
    pages={93-103},}

  • M. Nadim, M. Mondal, and C. K. Roy, “Evaluating performance of clone detection tools in detecting cloned cochange candidates,” in 2020 ieee 14th international workshop on software clones (iwsc), 2020, pp. 15-21.
    [BibTeX] [PDF]
    @INPROCEEDINGS{9047639,
    author={M. {Nadim} and M. {Mondal} and C. K. {Roy}},
    booktitle={2020 IEEE 14th International Workshop on Software Clones (IWSC)},
    title={Evaluating Performance of Clone Detection Tools in Detecting Cloned Cochange Candidates},
    year={2020},
    url = {https://ieeexplore.ieee.org/document/9047639},
    volume={},
    number={},
    pages={15-21},
    }

  • D. Pizzolotto and K. Inoue, “Blanker: a refactor-oriented cloned source code normalizer,” in 2020 ieee 14th international workshop on software clones (iwsc), 2020, pp. 22-25.
    [BibTeX] [PDF]
    @INPROCEEDINGS{9047634,
    author={D. {Pizzolotto} and K. {Inoue}},
    booktitle={2020 IEEE 14th International Workshop on Software Clones (IWSC)},
    title={Blanker: A Refactor-Oriented Cloned Source Code Normalizer},
    year={2020},
    url = {https://ieeexplore.ieee.org/abstract/document/9047634},
    volume={},
    number={},
    pages={22-25},
    }

  • M. Pyl, B. van Bladel, and S. Demeyer, “An empirical study on accidental cross-project code clones,” in 2020 ieee 14th international workshop on software clones (iwsc), 2020, pp. 33-37.
    [BibTeX] [PDF]
    @INPROCEEDINGS{9047641,
    author={M. {Pyl} and B. {van Bladel} and S. {Demeyer}},
    booktitle={2020 IEEE 14th International Workshop on Software Clones (IWSC)},
    title={An Empirical Study on Accidental Cross-Project Code Clones},
    year={2020},
    url = {https://ieeexplore.ieee.org/abstract/document/9047641/},
    volume={},
    number={},
    pages={33-37},}

  • W. Rahman, Y. Xu, F. Pu, J. Xuan, X. Jia, M. Basios, L. Kanthan, L. Li, F. Wu, and B. Xu, “Clone detection on large scala codebases,” in 2020 ieee 14th international workshop on software clones (iwsc), 2020, pp. 38-44.
    [BibTeX] [PDF]
    @INPROCEEDINGS{9047640,
    author={W. {Rahman} and Y. {Xu} and F. {Pu} and J. {Xuan} and X. {Jia} and M. {Basios} and L. {Kanthan} and L. {Li} and F. {Wu} and B. {Xu}},
    booktitle={2020 IEEE 14th International Workshop on Software Clones (IWSC)},
    title={Clone Detection on Large Scala Codebases},
    year={2020},
    url = {https://ieeexplore.ieee.org/abstract/document/9047640/},
    volume={},
    number={},
    pages={38-44},}

  • M. K. Singh and K. Kumar, “Scalable and Accurate Detection of Function Clones in Software Using Multithreading,” Springer, 2020.
    [BibTeX] [PDF]
    @article{singh_scalable_nodate,
    title = {Scalable and {Accurate} {Detection} of {Function} {Clones} in {Software} {Using} {Multithreading}},
    url = {https://link.springer.com/chapter/10.1007/978-3-030-26574-8_3},
    journal = {Springer},
    year = {2020},
    booktitle = {Integrating Research and {Practice}},
    author = {Singh, M. K. and Kumar, K.}
    }

  • J. Svacina, J. Simmons, and T. Cerny, “Semantic code clone detection for enterprise applications,” in Proceedings of the 35th annual acm symposium on applied computing, New York, NY, USA, 2020, p. 129–131. doi:10.1145/3341105.3374117
    [BibTeX] [PDF]
    @inproceedings{10.1145/3341105.3374117,
    author = {Svacina, Jan and Simmons, Jonathan and Cerny, Tomas},
    title = {Semantic Code Clone Detection for Enterprise Applications},
    year = {2020},
    isbn = {9781450368667},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3341105.3374117},
    doi = {10.1145/3341105.3374117},
    booktitle = {Proceedings of the 35th Annual ACM Symposium on Applied Computing},
    pages = {129–131},
    numpages = {3},
    keywords = {enterprise software, source code analysis, semantic clone, software engineering, code clone detection},
    location = {Brno, Czech Republic},
    series = {SAC ’20}
    }

  • H. Thaller, L. Linsbauer, and A. Egyed, “Towards semantic clone detection via probabilistic software modeling,” in 2020 ieee 14th international workshop on software clones (iwsc), 2020, pp. 64-69.
    [BibTeX] [PDF]
    @INPROCEEDINGS{9047635,
    author={H. {Thaller} and L. {Linsbauer} and A. {Egyed}},
    booktitle={2020 IEEE 14th International Workshop on Software Clones (IWSC)},
    title={Towards Semantic Clone Detection via Probabilistic Software Modeling},
    year={2020},
    url = {https://ieeexplore.ieee.org/abstract/document/9047635/},
    volume={},
    number={},
    pages={64-69},}

  • S. Tokui, N. Yoshida, E. Choi, and K. Inoue, “Clone notifier: developing and improving the system to notify changes of code clones,” in 2020 ieee 27th international conference on software analysis, evolution and reengineering (saner), 2020, pp. 642-646.
    [BibTeX] [PDF]
    @INPROCEEDINGS{9054793,
    author={S. {Tokui} and N. {Yoshida} and E. {Choi} and K. {Inoue}},
    booktitle={2020 IEEE 27th International Conference on Software Analysis, Evolution and Reengineering (SANER)},
    title={Clone Notifier: Developing and Improving the System to Notify Changes of Code Clones},
    year={2020},
    url = {https://ieeexplore.ieee.org/abstract/document/9054793/},
    volume={},
    number={},
    pages={642-646},
    }

  • W. Wang, G. Li, B. Ma, X. Xia, and Z. Jin, “Detecting Code Clones with Graph Neural Network and Flow-Augmented Abstract Syntax Tree,” in SANER 2020 – Proceedings of the 2020 IEEE 27th International Conference on Software Analysis, Evolution, and Reengineering, 2020, p. 261–271. doi:10.1109/SANER48275.2020.9054857
    [BibTeX] [Abstract] [PDF]

    Code clones are semantically similar code fragments pairs that are syntactically similar or different. Detection of code clones can help to reduce the cost of software maintenance and prevent bugs. Numerous approaches of detecting code clones have been proposed previously, but most of them focus on detecting syntactic clones and do not work well on semantic clones with different syntactic features. To detect semantic clones, researchers have tried to adopt deep learning for code clone detection to automatically learn latent semantic features from data. Especially, to leverage grammar information, several approaches used abstract syntax trees (AST) as input and achieved significant progress on code clone benchmarks in various programming languages. However, these AST-based approaches still can not fully leverage the structural information of code fragments, especially semantic information such as control flow and data flow. To leverage control and data flow information, in this paper, we build a graph representation of programs called flow-augmented abstract syntax tree (FA-AST). We construct FA-AST by augmenting original ASTs with explicit control and data flow edges. Then we apply two different types of graph neural networks (GNN) on FA-AST to measure the similarity of code pairs. As far as we have concerned, we are the first to apply graph neural networks on the domain of code clone detection. We apply our FA-AST and graph neural networks on two Java datasets: Google Code Jam and BigCloneBench. Our approach outperforms the state-of-the-art approaches on both Google Code Jam and BigCloneBench tasks.

    @inproceedings{wang_detecting_2020,
    title = {Detecting {Code} {Clones} with {Graph} {Neural} {Network} and {Flow}-{Augmented} {Abstract} {Syntax} {Tree}},
    isbn = {978-1-72815-143-4},
    doi = {10.1109/SANER48275.2020.9054857},
    url = {https://arxiv.org/abs/2002.08653},
    abstract = {Code clones are semantically similar code fragments pairs that are syntactically similar or different. Detection of code clones can help to reduce the cost of software maintenance and prevent bugs. Numerous approaches of detecting code clones have been proposed previously, but most of them focus on detecting syntactic clones and do not work well on semantic clones with different syntactic features. To detect semantic clones, researchers have tried to adopt deep learning for code clone detection to automatically learn latent semantic features from data. Especially, to leverage grammar information, several approaches used abstract syntax trees (AST) as input and achieved significant progress on code clone benchmarks in various programming languages. However, these AST-based approaches still can not fully leverage the structural information of code fragments, especially semantic information such as control flow and data flow. To leverage control and data flow information, in this paper, we build a graph representation of programs called flow-augmented abstract syntax tree (FA-AST). We construct FA-AST by augmenting original ASTs with explicit control and data flow edges. Then we apply two different types of graph neural networks (GNN) on FA-AST to measure the similarity of code pairs. As far as we have concerned, we are the first to apply graph neural networks on the domain of code clone detection. We apply our FA-AST and graph neural networks on two Java datasets: Google Code Jam and BigCloneBench. Our approach outperforms the state-of-the-art approaches on both Google Code Jam and BigCloneBench tasks.},
    booktitle = {{SANER} 2020 - {Proceedings} of the 2020 {IEEE} 27th {International} {Conference} on {Software} {Analysis}, {Evolution}, and {Reengineering}},
    publisher = {Institute of Electrical and Electronics Engineers Inc.},
    author = {Wang, Wenhan and Li, Ge and Ma, Bo and Xia, Xin and Jin, Zhi},
    month = feb,
    year = {2020},
    note = {\_eprint: 2002.08653},
    keywords = {clone detection, deep learning, control flow, data flow, graph neural network},
    pages = {261--271}
    }

2019

  • Q. Ul Ain, F. Azam, M. W. Anwar, and A. Kiran, “A model-driven approach for token based code clone detection techniques – an introduction to umlccd,” in Proceedings of the 2019 8th international conference on educational and information technology, New York, NY, USA, 2019, p. 312–317. doi:10.1145/3318396.3318440
    [BibTeX] [PDF]
    @inproceedings{10.1145/3318396.3318440,
    author = {Ul Ain, Qurat and Azam, Farooque and Anwar, Muhammad Waseem and Kiran, Ayesha},
    title = {A Model-Driven Approach for Token Based Code Clone Detection Techniques - An Introduction to UMLCCD},
    year = {2019},
    isbn = {9781450362672},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3318396.3318440},
    doi = {10.1145/3318396.3318440},
    booktitle = {Proceedings of the 2019 8th International Conference on Educational and Information Technology},
    pages = {312–317},
    numpages = {6},
    keywords = {MDA, Token based approaches, UMLCCD, Code clone detection},
    location = {Cambridge, United Kingdom},
    series = {ICEIT 2019}
    }

  • F. Alomari and M. Harbi, “Scalable source code similarity detection in large code repositories,” Eai endorsed transactions on scalable information systems, vol. 6, iss. 22, 2019. doi:10.4108/eai.13-7-2018.159353
    [BibTeX] [PDF]
    @ARTICLE{10.4108/eai.13-7-2018.159353,
    author={Firas Alomari and Muhammed Harbi},
    title={Scalable Source Code Similarity Detection in Large Code Repositories},
    journal={EAI Endorsed Transactions on Scalable Information Systems},
    volume={6},
    number={22},
    publisher={EAI},
    journal_a={SIS},
    year={2019},
    month={7},
    url = {https://eudl.eu/doi/10.4108/eai.13-7-2018.159353},
    keywords={clones, software similarity, Control Flow Graphs, Fingerprints},
    doi={10.4108/eai.13-7-2018.159353}
    }

  • V. Arammongkolvichai, R. Koschke, C. Ragkhitwetsagul, M. Choetkiertikul, and T. Sunetnanta, “Improving clone detection precision using machine learning techniques,” in 2019 10th international workshop on empirical software engineering in practice (iwesep), 2019, pp. 31-315.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8945086,
    author={V. Arammongkolvichai and R. Koschke and C. Ragkhitwetsagul and M. Choetkiertikul and T. Sunetnanta},
    booktitle={2019 10th International Workshop on Empirical Software Engineering in Practice (IWESEP)},
    title={Improving Clone Detection Precision Using Machine Learning Techniques},
    year={2019},
    volume={},
    number={},
    pages={31-315},
    url = {https://ieeexplore.ieee.org/abstract/document/8945086}
    }

  • M. Badri, L. Badri, O. Hachemane, and A. Ouellet, “Measuring the effect of clone refactoring on the size of unit test cases in object-oriented software: an empirical study,” Innov. syst. softw. eng., vol. 15, iss. 2, p. 117–137, 2019. doi:10.1007/s11334-019-00334-6
    [BibTeX] [PDF]
    @article{10.1007/s11334-019-00334-6,
    author = {Badri, Mourad and Badri, Linda and Hachemane, Oussama and Ouellet, Alexandre},
    title = {Measuring the Effect of Clone Refactoring on the Size of Unit Test Cases in Object-Oriented Software: An Empirical Study},
    year = {2019},
    issue_date = {June 2019},
    publisher = {Springer-Verlag},
    address = {Berlin, Heidelberg},
    volume = {15},
    number = {2},
    issn = {1614-5046},
    url = {https://doi.org/10.1007/s11334-019-00334-6},
    doi = {10.1007/s11334-019-00334-6},
    journal = {Innov. Syst. Softw. Eng.},
    month = jun,
    pages = {117–137},
    numpages = {21},
    keywords = {Object-oriented software, Linear regression, Source code attributes, Clone refactoring, Metrics, Unit test cases, Machine learning algorithms, Relationships}
    }

  • S. Bharti and H. Singh, “An Efficient Architectural Framework for Non-obtrusive and Instantaneous Real-Time Identification of Clones During the Software Development Process in IDE,” in Communications in computer and information science, 2019, pp. 397-405. doi:10.1007/978-981-15-0111-1_35
    [BibTeX] [Abstract] [PDF]

    Code Clones are well-known Source Code Smells that impacts the Software maintenance thus research community proposed various real-time clone detection approaches to proactively manage them during the software development process. The present-day real-time Code Clone identifiers have at least one of the five inadequacies: (a) entails Developer involvement to start the Clone Detection process, (b) despite of having focused search capability from few tools, Clone Detection necessitates to be triggered by the Developer, (c) in spite of few plug-in tools instigating concentrated search a large portion of available plug-ins procedure Clones in bunch mode and in this way expends much time to find clones, (d) despite being plugins to the IDEs, current tools require Software Programmer to trigger the visualization of Clone Detection results thus deficits instantaneous real-time Clone recognition functionality, (e) uses indexing techniques that can further be replaced by other available more efficient techniques to reduce the response time. This paper presents the Architectural Framework of underdevelopment real-time Code Clone Detection plug-in tool, which is proficiently adequate as a resolution to all the above-unveiled issues. The tool architecture description clearly indicates the proficiency of our approach in the application of automatic triggering of Clone Detection process as well as focused block level search on interception of block end leading to instantaneous real-time identification of clones and immediate recommendation mechanism.

    @inproceedings{bharti_efficient_2019,
    title = {An {Efficient} {Architectural} {Framework} for {Non}-obtrusive and {Instantaneous} {Real}-{Time} {Identification} of {Clones} {During} the {Software} {Development} {Process} in {IDE}},
    volume = {1076},
    isbn = {9789811501104},
    doi = {10.1007/978-981-15-0111-1_35},
    abstract = {Code Clones are well-known Source Code Smells that impacts the Software maintenance thus research community proposed various real-time clone detection approaches to proactively manage them during the software development process. The present-day real-time Code Clone identifiers have at least one of the five inadequacies: (a) entails Developer involvement to start the Clone Detection process, (b) despite of having focused search capability from few tools, Clone Detection necessitates to be triggered by the Developer, (c) in spite of few plug-in tools instigating concentrated search a large portion of available plug-ins procedure Clones in bunch mode and in this way expends much time to find clones, (d) despite being plugins to the IDEs, current tools require Software Programmer to trigger the visualization of Clone Detection results thus deficits instantaneous real-time Clone recognition functionality, (e) uses indexing techniques that can further be replaced by other available more efficient techniques to reduce the response time. This paper presents the Architectural Framework of underdevelopment real-time Code Clone Detection plug-in tool, which is proficiently adequate as a resolution to all the above-unveiled issues. The tool architecture description clearly indicates the proficiency of our approach in the application of automatic triggering of Clone Detection process as well as focused block level search on interception of block end leading to instantaneous real-time identification of clones and immediate recommendation mechanism.},
    booktitle = {Communications in Computer and Information Science},
    publisher = {Springer},
    author = {Bharti, Sarveshwar and Singh, Hardeep},
    year = {2019},
    note = {ISSN: 18650937},
    keywords = {Architectural framework, Instantaneous real-time identification, Non-obtrusive Code Clone Detection, Software clones},
    pages = {397-405},
    url = {https://link.springer.com/chapter/10.1007/978-981-15-0111-1_35}
    }

  • S. Bharti and H. Singh, “Investigating developers’ sentiments associated with software cloning practices,” in Communications in computer and information science, 2019, pp. 397-406. doi:10.1007/978-981-13-3140-4_36
    [BibTeX] [Abstract] [PDF]

    Researchers through empirical observations have established that efficiency of software development tasks and their output relies upon software developer’s associated persuasions. Thus, empathizing software developer’s sentiments has now become one of the goals of an effective Software Engineering. This paper presents the developers’ sentiments associated with software cloning practices. SentiStrength, a frequently used Sentiment Analysis tool in software engineering is used to explore the sentiment polarity of the developers during programming tasks. 39 responses collected via online industrial survey were analyzed with SentiStrength tool. Sentiment Analysis performed on the developer responses mainly indicate the neutral polarity i.e. developers under study don’t think clones and cloning practices as good or bad practice, instead 71.79% expressed neutral sentiments. The collected opinions indicate neither the acceptance nor rejection of harmfulness or benefits of clones, rather depicted the neutral opinion of software developers towards clones.

    @inproceedings{bharti_investigating_2019,
    title = {Investigating developers' sentiments associated with software cloning practices},
    volume = {955},
    isbn = {9789811331398},
    doi = {10.1007/978-981-13-3140-4_36},
    abstract = {Researchers through empirical observations have established that efficiency of software development tasks and their output relies upon software developer's associated persuasions. Thus, empathizing software developer's sentiments has now become one of the goals of an effective Software Engineering. This paper presents the developers' sentiments associated with software cloning practices. SentiStrength, a frequently used Sentiment Analysis tool in software engineering is used to explore the sentiment polarity of the developers during programming tasks. 39 responses collected via online industrial survey were analyzed with SentiStrength tool. Sentiment Analysis performed on the developer responses mainly indicate the neutral polarity i.e. developers under study don't think clones and cloning practices as good or bad practice, instead 71.79\% expressed neutral sentiments. The collected opinions indicate neither the acceptance nor rejection of harmfulness or benefits of clones, rather depicted the neutral opinion of software developers towards clones.},
    booktitle = {Communications in Computer and Information Science},
    publisher = {Springer Verlag},
    author = {Bharti, Sarveshwar and Singh, Hardeep},
    year = {2019},
    note = {ISSN: 18650929},
    keywords = {Developers' behavior, Sentiment analysis, Software cloning},
    pages = {397-406},
    url = {https://link.springer.com/chapter/10.1007/978-981-13-3140-4_36}
    }

  • A. Calleja, J. Tapiador, and J. Caballero, “The malsource dataset: quantifying complexity and code reuse in malware development,” Ieee transactions on information forensics and security, vol. 14, iss. 12, pp. 3175-3190, 2019.
    [BibTeX]
    @ARTICLE{8568018,
    author={A. Calleja and J. Tapiador and J. Caballero},
    journal={IEEE Transactions on Information Forensics and Security},
    title={The MalSource Dataset: Quantifying Complexity and Code Reuse in Malware Development},
    year={2019},
    volume={14},
    number={12},
    pages={3175-3190},
    }

  • M. E. Batista, P. A. Parreira, and H. Costa, “An exploratory study on detection of cloned code in information systems,” in Proceedings of the xv brazilian symposium on information systems, New York, NY, USA, 2019. doi:10.1145/3330204.3330277
    [BibTeX] [PDF]
    @inproceedings{10.1145/3330204.3330277,
    author = {Batista, Mall\'{u} Eduarda and Parreira, Paulo Afonso and Costa, Heitor},
    title = {An Exploratory Study on Detection of Cloned Code in Information Systems},
    year = {2019},
    isbn = {9781450372374},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3330204.3330277},
    doi = {10.1145/3330204.3330277},
    booktitle = {Proceedings of the XV Brazilian Symposium on Information Systems},
    articleno = {67},
    numpages = {8},
    keywords = {Clone Detection, Clone Detection Tools, Clone Detection Approaches, Clone Code, Programming Language, Programming Paradigm},
    location = {Aracaju, Brazil},
    series = {SBSI’19}
    }

  • A. A. Elkhail, J. Svacina, and T. Cerny, “Intelligent token-based code clone detection system for large scale source code,” in Proceedings of the conference on research in adaptive and convergent systems, New York, NY, USA, 2019, p. 256–260. doi:10.1145/3338840.3355654
    [BibTeX] [PDF]
    @inproceedings{10.1145/3338840.3355654,
    author = {Elkhail, Abdulrahman Abu and Svacina, Jan and Cerny, Tomas},
    title = {Intelligent Token-Based Code Clone Detection System for Large Scale Source Code},
    year = {2019},
    isbn = {9781450368438},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3338840.3355654},
    doi = {10.1145/3338840.3355654},
    booktitle = {Proceedings of the Conference on Research in Adaptive and Convergent Systems},
    pages = {256–260},
    numpages = {5},
    keywords = {BigCloneBench, clone detection, code clone, case study},
    location = {Chongqing, China},
    series = {RACS ’19}
    }

  • W. Haider, M. W. Anwar, F. Azam, Q. Ul Ain, W. Haider Butt, W. Anwar, and B. Maqbool, “Recent advancements in code clone detection – techniques and tools,” Ieee access, 2019. doi:10.1109/ACCESS.2019.2918202
    [BibTeX] [Abstract] [PDF]

    Code cloning refers to the duplication of source code. It is the most common way of reusing source code in software development. If a bug is identified in one segment of code, all the similar segments need to be checked for the same bug. Consequently, this cloning process may lead to bug propagation that significantly affect maintenance cost. By considering this problem, Code Clone Detection (CCD) appears as an active area of research. Consequently, there is a strong need to investigate the latest techniques, trends and tools in the domain of CCD. Therefore, in this article, we comprehensively inspect the latest tools and techniques utilized for the detection of code clones. Particularly, a Systematic Literature Review (SLR) is performed to select and investigate 54 studies pertaining to CCD. Consequently, six categories are defined to incorporate the selected studies as per relevance i.e. textual approaches (12), lexical approaches (8), tree-based approaches (3), metric-based approaches (7), semantic approaches (7) and hybrid approaches (17). We identified and analyzed 26 code clone detection tools, i.e., 13 existing and 13 proposed / developed. Moreover, 62 open source subject systems whose source code is utilized for code clone detection are presented. It is concluded that there exist several researches to detect type1, type2, type3 and type4 clones individually. However, there is a need to develop novel approaches with complete tool support in order to detect all four types of clones collectively. Furthermore, it is also required to introduce more approaches to simplify the development of Program Dependency Graph (PDG) while dealing with the detection of type4 clones.

    @article{haider_recent_2019,
    title = {Recent Advancements in Code Clone Detection – Techniques and Tools},
    url = {https://www.researchgate.net/profile/Bilal_Maqbool3/publication/333301646_Recent_Advancements_in_Code_Clone_Detection_-_Techniques_and_Tools/links/5cfe875b92851c874c5d7d84/Recent-Advancements-in-Code-Clone-Detection-Techniques-and-Tools.pdf},
    doi = {10.1109/ACCESS.2019.2918202},
    abstract = {Code cloning refers to the duplication of source code. It is the most common way of reusing source code in software development. If a bug is identified in one segment of code, all the similar segments need to be checked for the same bug. Consequently, this cloning process may lead to bug propagation that significantly affect maintenance cost. By considering this problem, Code Clone Detection (CCD) appears as an active area of research. Consequently, there is a strong need to investigate the latest techniques, trends and tools in the domain of CCD. Therefore, in this article, we comprehensively inspect the latest tools and techniques utilized for the detection of code clones. Particularly, a Systematic Literature Review (SLR) is performed to select and investigate 54 studies pertaining to CCD. Consequently, six categories are defined to incorporate the selected studies as per relevance i.e. textual approaches (12), lexical approaches (8), tree-based approaches (3), metric-based approaches (7), semantic approaches (7) and hybrid approaches (17). We identified and analyzed 26 code clone detection tools, i.e., 13 existing and 13 proposed / developed. Moreover, 62 open source subject systems whose source code is utilized for code clone detection are presented. It is concluded that there exist several researches to detect type1, type2, type3 and type4 clones individually. However, there is a need to develop novel approaches with complete tool support in order to detect all four types of clones collectively. Furthermore, it is also required to introduce more approaches to simplify the development of Program Dependency Graph (PDG) while dealing with the detection of type4 clones.},
    journal = {IEEE Access},
    author = {Ul Ain, Qurat and Haider Butt, Wasi and Anwar, Muhammad Waseem and Azam, Farooque and Maqbool, Bilal},
    year = {2019},
    keywords = {CCD tools, Code Clone Detection, Code Clone Types, CCD, SLR}
    }

  • N. He, L. Wu, H. Wang, Y. Guo, and X. Jiang, “Characterizing code clones in the ethereum smart contract ecosystem,” Corr, 2019.
    [BibTeX] [Abstract] [PDF]

    In this paper, we present the first large-scale and systematic study to characterize the code reuse practice in the Ethereum smart contract ecosystem. We first performed a detailed similarity comparison study on a dataset of 10 million contracts we had harvested, and then we further conducted a qualitative analysis to characterize the diversity of the ecosystem, understand the correlation between code reuse and vulnerabilities, and detect the plagiarist DApps. Our analysis revealed that over 96% of the contracts had duplicates, while a large number of them were similar, which suggests that the ecosystem is highly homogeneous. Our results also suggested that roughly 9.7% of the similar contract pairs have exactly the same vulnerabilities, which we assume were introduced by code clones. In addition, we identified 41 DApps clusters, involving 73 plagiarized DApps which had caused huge financial loss to the original creators, accounting for 1/3 of the original market volume.

    @article{he_characterizing_2019,
    title = {Characterizing Code Clones in the Ethereum Smart Contract Ecosystem},
    url = {http://arxiv.org/abs/1905.00272},
    abstract = {In this paper, we present the first large-scale and systematic study to characterize the code reuse practice in the Ethereum smart contract ecosystem. We first performed a detailed similarity comparison study on a dataset of 10 million contracts we had harvested, and then we further conducted a qualitative analysis to characterize the diversity of the ecosystem, understand the correlation between code reuse and vulnerabilities, and detect the plagiarist DApps. Our analysis revealed that over 96\% of the contracts had duplicates, while a large number of them were similar, which suggests that the ecosystem is highly homogeneous. Our results also suggested that roughly 9.7\% of the similar contract pairs have exactly the same vulnerabilities, which we assume were introduced by code clones. In addition, we identified 41 DApps clusters, involving 73 plagiarized DApps which had caused huge financial loss to the original creators, accounting for 1/3 of the original market volume.},
    author = {He, Ningyu and Wu, Lei and Wang, Haoyu and Guo, Yao and Jiang, Xuxian},
    month = may,
    year = {2019},
    journal = {CoRR},
    eprint = {1905.00272},
    archivePrefix = {arXiv}
    }

  • M. Mondal, B. Roy, C. K. Roy, and K. A. Schneider, “Ranking co-change candidates of micro-clones,” in Proceedings of the 29th annual international conference on computer science and software engineering, USA, 2019, p. 244–253.
    [BibTeX] [PDF]
    @inproceedings{10.5555/3370272.3370298,
    author = {Mondal, Manishankar and Roy, Banani and Roy, Chanchal K. and Schneider, Kevin A.},
    title = {Ranking Co-Change Candidates of Micro-Clones},
    year = {2019},
    publisher = {IBM Corp.},
    url = {https://dl.acm.org/doi/10.5555/3370272.3370298},
    address = {USA},
    booktitle = {Proceedings of the 29th Annual International Conference on Computer Science and Software Engineering},
    pages = {244–253},
    numpages = {10},
    location = {Toronto, Ontario, Canada},
    series = {CASCON ’19}
    }

  • G. Mostaeen, J. Svajlenko, B. Roy, C. K. Roy, and K. A. Schneider, “Clonecognition: machine learning based code clone validation tool,” in Proceedings of the 2019 27th acm joint meeting on european software engineering conference and symposium on the foundations of software engineering, New York, NY, USA, 2019, p. 1105–1109. doi:10.1145/3338906.3341182
    [BibTeX] [PDF]
    @inproceedings{10.1145/3338906.3341182,
    author = {Mostaeen, Golam and Svajlenko, Jeffrey and Roy, Banani and Roy, Chanchal K. and Schneider, Kevin A.},
    title = {CloneCognition: Machine Learning Based Code Clone Validation Tool},
    year = {2019},
    isbn = {9781450355728},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3338906.3341182},
    doi = {10.1145/3338906.3341182},
    booktitle = {Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering},
    pages = {1105–1109},
    numpages = {5},
    keywords = {Machine Learning, Artificial Neural Network, Clone Management, Validation, Code Clones},
    location = {Tallinn, Estonia},
    series = {ESEC/FSE 2019}
    }

  • L. Nichols, M. Emre, and B. Hardekopf, “Structural and nominal cross-language clone detection,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2019, p. 247–263. doi:10.1007/978-3-030-16722-6_14
    [BibTeX] [Abstract] [PDF]

    In this paper we address the challenge of cross-language clone detection. Due to the rise of cross-language libraries and applications (e.g., apps written for both Android and iPhone), it has become common for code fragments in one language to be ported over into another language in an extension of the usual “copy and paste” coding methodology. As with single-language clones, it is important to be able to detect these cross-language clones. However there are many real-world cross-language clones that existing techniques cannot detect. We describe the first general, cross-language algorithm that combines both structural and nominal similarity to find syntactic clones, thereby enabling more complete clone detection than any existing technique. This algorithm also performs comparably to the state of the art in single-language clone detection when applied to single-language source code; thus it generalizes the state of the art in clone detection to detect both single- and cross-language clones using one technique.

    @inproceedings{nichols_structural_2019,
    title = {Structural and nominal cross-language clone detection},
    volume = {11424},
    isbn = {978-3-030-16721-9},
    url = {https://link.springer.com/chapter/10.1007/978-3-030-16722-6_14},
    doi = {10.1007/978-3-030-16722-6_14},
    abstract = {In this paper we address the challenge of cross-language clone detection. Due to the rise of cross-language libraries and applications (e.g., apps written for both Android and iPhone), it has become common for code fragments in one language to be ported over into another language in an extension of the usual “copy and paste” coding methodology. As with single-language clones, it is important to be able to detect these cross-language clones. However there are many real-world cross-language clones that existing techniques cannot detect. We describe the first general, cross-language algorithm that combines both structural and nominal similarity to find syntactic clones, thereby enabling more complete clone detection than any existing technique. This algorithm also performs comparably to the state of the art in single-language clone detection when applied to single-language source code; thus it generalizes the state of the art in clone detection to detect both single- and cross-language clones using one technique.},
    booktitle = {Lecture {Notes} in {Computer} {Science} (including subseries {Lecture} {Notes} in {Artificial} {Intelligence} and {Lecture} {Notes} in {Bioinformatics})},
    publisher = {Springer Verlag},
    author = {Nichols, Lawton and Emre, Mehmet and Hardekopf, Ben},
    year = {2019},
    issn = {1611-3349},
    pages = {247--263}
    }

  • S. Rongrong, Z. Liping, and Z. Fengrong, “A method for identifying and recommending reconstructed clones,” in Proceedings of the 2019 3rd international conference on management engineering, software engineering and service sciences, New York, NY, USA, 2019, pp. 39-44. doi:10.1145/3312662.3312709
    [BibTeX] [PDF]
    @inproceedings{10.1145/3312662.3312709,
    author = {Rongrong, She and Liping, Zhang and Fengrong, Zhao},
    title = {A Method for Identifying and Recommending Reconstructed Clones},
    year = {2019},
    isbn = {9781450361897},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3312662.3312709},
    doi = {10.1145/3312662.3312709},
    booktitle = {Proceedings of the 2019 3rd International Conference on Management Engineering, Software Engineering and Service Sciences},
    pages = {39-44},
    numpages = {6},
    keywords = {clone reconstruction, clone tracking, Clone code, feature extraction, clone family},
    location = {Wuhan, China},
    series = {ICMSS 2019}
    }

  • H. Yu, W. Lam, L. Chen, G. Li, T. Xie, and Q. Wang, “Neural detection of semantic code clones via tree-based convolution,” in 2019 ieee/acm 27th international conference on program comprehension (icpc), 2019, pp. 70-80.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8813290,
    author={H. {Yu} and W. {Lam} and L. {Chen} and G. {Li} and T. {Xie} and Q. {Wang}},
    booktitle={2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)},
    title={Neural Detection of Semantic Code Clones Via Tree-Based Convolution},
    year={2019},
    url = {https://ieeexplore.ieee.org/abstract/document/8813290/},
    pages={70-80}
    }

  • S. Baars and A. Oprescu, “Towards automated refactoring of code clones in object-oriented programming languages,” 2019.
    [BibTeX] [Abstract] [PDF]

    Duplication in source code can have a major negative impact on the maintainability of source code, as it creates implicit dependencies between fragments of code. Such implicit dependencies often cause bugs and increase maintenance efforts. In this study, we look into the opportunities to automatically refactor these duplication problems for object-oriented programming languages. We propose a method to detect clones that are suitable for refactoring. This method focuses on the context and scope of clones, ensuring our refactoring improves the design and does not create side effects. Our intermediate results indicate that more than half of the duplication in code is related to each other through inheritance, making it easier to refactor these clones in a clean way. About 40 percent of the duplication can be refactored through method extraction, while other clones require other refactoring techniques or further transformations. Future measurements will provide further insight into what clones should be refactored to improve the design of software systems.

    @techreport{baars_easychair_2019,
    title = {Towards Automated Refactoring of Code Clones in Object-Oriented Programming Languages},
    url = {https://wvvw.easychair.org/publications/preprint_download/Jlvk},
    abstract = {Duplication in source code can have a major negative impact on the maintainability of source code, as it creates implicit dependencies between fragments of code. Such implicit dependencies often cause bugs and increase maintenance efforts. In this study, we look into the opportunities to automatically refactor these duplication problems for object-oriented programming languages. We propose a method to detect clones that are suitable for refactoring. This method focuses on the context and scope of clones, ensuring our refactoring improves the design and does not create side effects. Our intermediate results indicate that more than half of the duplication in code is related to each other through inheritance, making it easier to refactor these clones in a clean way. About 40 percent of the duplication can be refactored through method extraction, while other clones require other refactoring techniques or further transformations. Future measurements will provide further insight into what clones should be refactored to improve the design of software systems.},
    author = {Baars, Simon and Oprescu, Ana},
    journal = {International Journal of Innovative Research in Science, Engineering and Technology},
    year = {2019},
    volume = {5}
    }

  • O. Babur, L. Cleophas, and M. van den Brand, “Metamodel Clone Detection with SAMOS (extended abstract),” 2019.
    [BibTeX] [PDF]
    @techreport{babur_metamodel_nodate,
    title = {Metamodel {Clone} {Detection} with {SAMOS} (extended abstract)},
    url = {https://www.researchgate.net/publication/331611919_Metamodel_clone_detection_with_SAMOS},
    author = {Babur, Onder and Cleophas, Loek and van den Brand, Mark},
    note = {Publication Title: Elsevier},
    journal = {Journal of Computer Languages},
    year = {2019}
    }

  • L. Chen, W. Ye, and S. Zhang, “Capturing source code semantics via tree-based convolution over api-enhanced ast,” in Proceedings of the 16th acm international conference on computing frontiers, New York, NY, USA, 2019, p. 174–182. doi:10.1145/3310273.3321560
    [BibTeX] [PDF]
    @inproceedings{10.1145/3310273.3321560,
    author = {Chen, Long and Ye, Wei and Zhang, Shikun},
    title = {Capturing Source Code Semantics via Tree-Based Convolution over API-Enhanced AST},
    year = {2019},
    isbn = {9781450366854},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3310273.3321560},
    doi = {10.1145/3310273.3321560},
    booktitle = {Proceedings of the 16th ACM International Conference on Computing Frontiers},
    pages = {174–182},
    numpages = {9},
    keywords = {big code, code semantics, representation learning, code search, API, clone detection, semantic clone, tree-based LSTM, code summarization, tree-based convolution, AST},
    location = {Alghero, Italy},
    series = {CF ’19}
    }

  • M. Gharehyazie, B. Ray, M. Keshani, M. S. Zavosht, A. Heydarnoori, and V. Filkov, “Cross-project code clones in GitHub,” Empirical software engineering, vol. 24, iss. 3, pp. 1538-1573, 2019. doi:10.1007/s10664-018-9648-z
    [BibTeX] [PDF]
    @article{gharehyazie_cross-project_2019,
    title = {Cross-project code clones in {GitHub}},
    volume = {24},
    issn = {15737616},
    url = {https://link.springer.com/article/10.1007/s10664-018-9648-z},
    doi = {10.1007/s10664-018-9648-z},
    number = {3},
    journal = {Empirical Software Engineering},
    author = {Gharehyazie, Mohammad and Ray, Baishakhi and Keshani, Mehdi and Zavosht, Masoumeh Soleimani and Heydarnoori, Abbas and Filkov, Vladimir},
    month = jun,
    year = {2019},
    note = {Publisher: Springer New York LLC},
    keywords = {Clone detection, GitHub, Cross-project cloning, Deckard},
    pages = {1538-1573}
    }

  • H. Honda, S. Tokui, K. Yokoi, E. Choi, N. Yoshida, and K. Inoue, “Ccevovis: a clone evolution visualization system for software maintenance,” in 2019 ieee/acm 27th international conference on program comprehension (icpc), 2019, pp. 122-125.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8813292,
    author={H. {Honda} and S. {Tokui} and K. {Yokoi} and E. {Choi} and N. {Yoshida} and K. {Inoue}},
    booktitle={2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)},
    title={CCEvovis: A Clone Evolution Visualization System for Software Maintenance},
    year={2019},
    url ={https://ieeexplore.ieee.org/document/8813292},
    pages={122-125}
    }

  • J. F. Islam, M. Mondal, C. K. Roy, and K. A. Schneider, “Comparing bug replication in regular and micro code clones,” in 2019 ieee/acm 27th international conference on program comprehension (icpc), 2019, pp. 81-92.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8813261,
    author={J. F. {Islam} and M. {Mondal} and C. K. {Roy} and K. A. {Schneider}},
    booktitle={2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)},
    title={Comparing Bug Replication in Regular and Micro Code Clones},
    year={2019},
    url = {https://ieeexplore.ieee.org/abstract/document/8813261/},
    pages={81-92}
    }

  • J. F. Islam, M. Mondal, and C. K. Roy, “A comparative study of software bugs in micro-clones and regular code clones,” in 2019 ieee 26th international conference on software analysis, evolution and reengineering (saner), 2019, pp. 73-83.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8667993,
    author={J. F. {Islam} and M. {Mondal} and C. K. {Roy}},
    booktitle={2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER)},
    title={A Comparative Study of Software Bugs in Micro-clones and Regular Code Clones},
    year={2019},
    url = {https://ieeexplore.ieee.org/abstract/document/8667993/},
    pages={73-83}
    }

  • H. K. Jnanamurthy, R. Jetley, F. Henskens, D. Paul, M. Wallis, and S. D. Sudarsan, “Analysis of industrial control system software to detect semantic clones,” in 2019 ieee international conference on industrial technology (icit), 2019, pp. 773-779.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8754957,
    author={H. K. {Jnanamurthy} and R. {Jetley} and F. {Henskens} and D. {Paul} and M. {Wallis} and S. D. {Sudarsan}},
    booktitle={2019 IEEE International Conference on Industrial Technology (ICIT)},
    title={Analysis of Industrial Control System Software to Detect Semantic Clones},
    year={2019},
    url = {https://ieeexplore.ieee.org/abstract/document/8754957/},
    pages={773-779}
    }

  • M. J. I. Mostafa, “An empirical study on clone evolution by analyzing clone lifetime,” in 2019 ieee 13th international workshop on software clones (iwsc), 2019, pp. 20-26.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8665850,
    author={M. J. I. {Mostafa}},
    booktitle={2019 IEEE 13th International Workshop on Software Clones (IWSC)},
    title={An Empirical Study on Clone Evolution by Analyzing Clone Lifetime},
    year={2019},
    url ={https://ieeexplore.ieee.org/document/8665850},
    pages={20-26}
    }

  • J. Kanwal, O. Maqbool, H. A. Basit, and M. A. Sindhu, “Evolutionary perspective of structural clones in software,” Ieee access, vol. 7, pp. 58720-58739, 2019.
    [BibTeX] [PDF]
    @ARTICLE{8701448,
    author={J. {Kanwal} and O. {Maqbool} and H. A. {Basit} and M. A. {Sindhu}},
    journal={IEEE Access},
    title={Evolutionary Perspective of Structural Clones in Software},
    year={2019},
    url = {https://ieeexplore.ieee.org/abstract/document/8701448/},
    volume={7},
    pages={58720-58739}
    }

  • H. Kaur and R. Maini, “Assessing lexical similarity between short sentences of source code based on granularity,” Int. j. inf. tecnol., p. 599–614, 2019. doi:10.1007/s41870-018-0213-1
    [BibTeX] [Abstract] [PDF]

    Detecting similarity between two source code bases or inside one code base has many applications in the area of plagiarism detection and reused code which is manageable for refactoring. In this paper, State of the art techniques: Levenshtein Distance, Cosine Similarity, Hamming Distance and ASCII based hashing and Rabin-Karp rolling hashing have been investigated on source code strings, which is an extended work to already published research work. From experimentation, it has been observed that Rabin-Karp hashing performs better than other techniques in terms of running time, accuracy and type-of-clones. All techniques face one issue of increase in similarity searching time linearly with database size, whereas Rabin-Karp hashing handled this issue efficiently. Moreover, Rabin-Karp rolling hash method reported minimum false positives and it is also able to manage multiple patterns at a time.

    @article{kaur_assessing_nodate,
    title = {Assessing lexical similarity between short sentences of source code based on granularity},
    url = {https://doi.org/10.1007/s41870-018-0213-1},
    doi = {10.1007/s41870-018-0213-1},
    abstract = {Detecting similarity between two source code bases or inside one code base has many applications in the area of plagiarism detection and reused code which is manageable for refactoring. In this paper, State of the art techniques: Levenshtein Distance, Cosine Similarity, Hamming Distance and ASCII based hashing and Rabin-Karp rolling hashing have been investigated on source code strings, which is an extended work to already published research work. From experimentation, it has been observed that Rabin-Karp hashing performs better than other techniques in terms of running time, accuracy and type-of-clones. All techniques face one issue of increase in similarity searching time linearly with database size, whereas Rabin-Karp hashing handled this issue efficiently. Moreover, Rabin-Karp rolling hash method reported minimum false positives and it is also able to manage multiple patterns at a time.},
    journal = {Int. j. inf. tecnol.},
    pages = {599–614},
    year = {2019},
    author = {Kaur, Harpreet and Maini, Raman},
    keywords = {Clones, Granularity, Similarity Index, Token}
    }

  • A. Lerina and L. Nardi, “Investigating on the impact of software clones on technical debt,” in 2019 ieee/acm international conference on technical debt (techdebt), 2019, pp. 108-112.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8786015,
    author={A. {Lerina} and L. {Nardi}},
    booktitle={2019 IEEE/ACM International Conference on Technical Debt (TechDebt)},
    title={Investigating on the Impact of Software Clones on Technical Debt},
    year={2019},
    url = {https://ieeexplore.ieee.org/abstract/document/8786015/},
    pages={108-112}
    }

  • D. Mondal, M. Mondal, C. K. Roy, K. A. Schneider, S. Wang, and Y. Li, “Towards visualizing large scale evolving clones,” in 2019 ieee/acm 41st international conference on software engineering: companion proceedings (icse-companion), 2019, pp. 302-303.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8802819,
    author={D. {Mondal} and M. {Mondal} and C. K. {Roy} and K. A. {Schneider} and S. {Wang} and Y. {Li}},
    booktitle={2019 IEEE/ACM 41st International Conference on Software Engineering: Companion Proceedings (ICSE-Companion)},
    title={Towards Visualizing Large Scale Evolving Clones},
    year={2019},
    url = {https://ieeexplore.ieee.org/abstract/document/8802819/},
    pages={302-303}
    }

  • D. Mondal, M. Mondal, C. K. Roy, K. A. Schneider, Y. Li, and S. Wang, “Clone-world: a visual analytic system for large scale software clones,” Visual informatics, vol. 3, iss. 1, pp. 18-26, 2019. doi:10.1016/j.visinf.2019.03.003
    [BibTeX] [PDF]
    @article{MONDAL201918,
    title = "Clone-World: A visual analytic system for large scale software clones",
    journal = "Visual Informatics",
    volume = "3",
    number = "1",
    pages = "18-26",
    year = "2019",
    note = "Proceedings of PacificVAST 2019",
    issn = "2468-502X",
    doi = "10.1016/j.visinf.2019.03.003",
    url = "http://www.sciencedirect.com/science/article/pii/S2468502X1930018X",
    author = "Debajyoti Mondal and Manishankar Mondal and Chanchal K. Roy and Kevin A. Schneider and Yukun Li and Shisong Wang",
    keywords = "Visual analytics, Software clones, Multivariate networks",
    }

  • M. Mondal, B. Roy, C. K. Roy, and K. A. Schneider, “Investigating context adaptation bugs in code clones,” in 2019 ieee international conference on software maintenance and evolution (icsme), 2019, pp. 157-168.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8919065,
    author={M. {Mondal} and B. {Roy} and C. K. {Roy} and K. A. {Schneider}},
    booktitle={2019 IEEE International Conference on Software Maintenance and Evolution (ICSME)},
    title={Investigating Context Adaptation Bugs in Code Clones},
    year={2019},
    url = {https://ieeexplore.ieee.org/abstract/document/8919065/},
    pages={157-168}
    }

  • F. P. Viertel, W. Brunotte, D. Strüber, and K. Schneider, “Detecting Security Vulnerabilities using Clone Detection and Community Knowledge,” in SEKE, 2019. doi:10.18293/SEKE2019-183
    [BibTeX] [Abstract] [PDF]

    Faced with the severe financial and reputation implications associated with data breaches, enterprises now recognize security as a top concern for software analysis tools. While software engineers are typically not equipped with the required expertise to identify vulnerabilities in code, community knowledge in the form of publicly available vulnerability databases could come to their rescue. For example, the Common Vulnerabilities and Exposures Database (CVE) contains data about already reported weaknesses. However, the support with available examples in these databases is scarce. CVE entries usually do not contain example code for a vulnerability, its exploit or patch. They just link to reports or repositories that provide this information. Manually searching these sources for relevant information is time-consuming and error-prone. In this paper, we propose a vulnerability detection approach based on community knowledge and clone detection. The key idea is to harness available example source code of software weaknesses, from a large-scale vulnerability database, which are matched to code fragments using clone detection. We leverage a clone detection technique from the literature, which we adapted to make it applicable to vulnerability databases. In an evaluation based on 20 reports and affected projects, our approach showed good precision and recall.

    @inproceedings{patrick_viertel_detecting_nodate,
    title = {Detecting {Security} {Vulnerabilities} using {Clone} {Detection} and {Community} {Knowledge}},
    url = {https://ksiresearch.org/seke/seke19paper/seke19paper_183.pdf},
    doi = {10.18293/SEKE2019-183},
    abstract = {Faced with the severe financial and reputation implications associated with data breaches, enterprises now recognize security as a top concern for software analysis tools. While software engineers are typically not equipped with the required expertise to identify vulnerabilities in code, community knowledge in the form of publicly available vulnerability databases could come to their rescue. For example, the Common Vulnerabilities and Exposures Database (CVE) contains data about already reported weaknesses. However, the support with available examples in these databases is scarce. CVE entries usually do not contain example code for a vulnerability, its exploit or patch. They just link to reports or repositories that provide this information. Manually searching these sources for relevant information is time-consuming and error-prone. In this paper, we propose a vulnerability detection approach based on community knowledge and clone detection. The key idea is to harness available example source code of software weaknesses, from a large-scale vulnerability database, which are matched to code fragments using clone detection. We leverage a clone detection technique from the literature, which we adapted to make it applicable to vulnerability databases. In an evaluation based on 20 reports and affected projects, our approach showed good precision and recall.},
    author = {Viertel, Fabien Patrick and Brunotte, Wasja and Strüber, Daniel and Schneider, Kurt},
    keywords = {Code Clones, Security, Information Systems},
    booktitle = {SEKE},
    year = {2019}
    }

  • D. Perez and S. Chiba, “Cross-language clone detection by learning over abstract syntax trees,” in Proceedings of the 16th international conference on mining software repositories, 2019, p. 518–528. doi:10.1109/MSR.2019.00078
    [BibTeX] [PDF]
    @inproceedings{10.1109/MSR.2019.00078,
    author = {Perez, Daniel and Chiba, Shigeru},
    title = {Cross-Language Clone Detection by Learning over Abstract Syntax Trees},
    year = {2019},
    publisher = {IEEE Press},
    url = {https://doi.org/10.1109/MSR.2019.00078},
    doi = {10.1109/MSR.2019.00078},
    booktitle = {Proceedings of the 16th International Conference on Mining Software Repositories},
    pages = {518–528},
    numpages = {11},
    keywords = {machine learning, source code representation, clone detection},
    location = {Montreal, Quebec, Canada},
    series = {MSR ’19}
    }

  • C. Ragkhitwetsagul, J. Krinke, M. Paixao, G. Bianco, and R. Oliveto, “Toxic code snippets on stack overflow,” Ieee transactions on software engineering, pp. 1-1, 2019.
    [BibTeX] [PDF]
    @ARTICLE{8643998,
    author={C. {Ragkhitwetsagul} and J. {Krinke} and M. {Paixao} and G. {Bianco} and R. {Oliveto}},
    journal={IEEE Transactions on Software Engineering},
    title={Toxic Code Snippets on Stack Overflow},
    year={2019},
    url = {http://dx.doi.org/10.1109/TSE.2019.2900307},
    pages={1-1}
    }

  • R. Rwemalika, M. Kintis, M. Papadakis, Y. Le Traon, and P. Lorrach, “On the evolution of keyword-driven test suites,” in 2019 12th ieee conference on software testing, validation and verification (icst), 2019, pp. 335-345.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8730167,
    author={R. {Rwemalika} and M. {Kintis} and M. {Papadakis} and Y. {Le Traon} and P. {Lorrach}},
    booktitle={2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST)},
    title={On the Evolution of Keyword-Driven Test Suites},
    url = {https://ieeexplore.ieee.org/abstract/document/8730167/},
    year={2019},
    pages={335-345}
    }

  • V. Saini, F. Farmahinifarahani, Y. Lu, D. Yang, P. Martins, H. Sajnani, P. Baldi, and C. V. Lopes, “Towards automating precision studies of clone detectors,” in Proceedings of the 41st international conference on software engineering, 2019, p. 49–59. doi:10.1109/ICSE.2019.00023
    [BibTeX] [PDF]
    @inproceedings{10.1109/ICSE.2019.00023,
    author = {Saini, Vaibhav and Farmahinifarahani, Farima and Lu, Yadong and Yang, Di and Martins, Pedro and Sajnani, Hitesh and Baldi, Pierre and Lopes, Cristina V.},
    title = {Towards Automating Precision Studies of Clone Detectors},
    year = {2019},
    publisher = {IEEE Press},
    url = {https://doi.org/10.1109/ICSE.2019.00023},
    doi = {10.1109/ICSE.2019.00023},
    booktitle = {Proceedings of the 41st International Conference on Software Engineering},
    pages = {49–59},
    numpages = {11},
    keywords = {machine learning, precision evaluation, open source labeled datasets, clone detection},
    location = {Montreal, Quebec, Canada},
    series = {ICSE ’19}
    }

  • P. Sharma and C. Singh, “A Novel Method of Clone Detection by Neural Networks,” European journal of engineering research and science, pp. 9-15, 2019. doi:10.24018/ejers.2019.4.12.1642
    [BibTeX] [Abstract] [PDF]

    Code clone is that type of engine that helps to find duplicate code patterns find within the whole code. Programmers usually adopt code reusability task from previous few years, so that time consumption can be reduced. Code reusability can be done via replication or by just copy-paste. Code reusability leads to not writing code from scratch, just copy paste the useful part of the code. In finding of duplicated code fragment or text, plagiarism detection also works pretty well but it is not applicable to the large system in finding functional clone and also it is more time consuming even at small scale which make the detection method inappropriate. In this paper, we proposed a pattern similarity conditions on the basis of textual similarity for finding the code or text clones in the large content on the basis of SVM, Neural Network using Java coding, Neural Network and Sim Cad. This approach detects code or text clones from original one. The resultant simulation is taken place in the MATLAB environment, and it has shown that it is providing better results. The proposed algorithm performance is measured using parameters i.e. FRR, FAR and Accuracy.

    @article{sharma_novel_nodate,
    title = {A {Novel} {Method} of {Clone} {Detection} by {Neural} {Networks}},
    url = {http://dx.doi.org/10.24018/ejers.2019.4.12.1642},
    doi = {10.24018/ejers.2019.4.12.1642},
    abstract = {Code clone is that type of engine that helps to find duplicate code patterns find within the whole code. Programmers usually adopt code reusability task from previous few years, so that time consumption can be reduced. Code reusability can be done via replication or by just copy-paste. Code reusability leads to not writing code from scratch, just copy paste the useful part of the code. In finding of duplicated code fragment or text, plagiarism detection also works pretty well but it is not applicable to the large system in finding functional clone and also it is more time consuming even at small scale which make the detection method inappropriate. In this paper, we proposed a pattern similarity conditions on the basis of textual similarity for finding the code or text clones in the large content on the basis of SVM, Neural Network using Java coding, Neural Network and Sim Cad. This approach detects code or text clones from original one. The resultant simulation is taken place in the MATLAB environment, and it has shown that it is providing better results. The proposed algorithm performance is measured using parameters i.e. FRR, FAR and Accuracy.},
    journal = {European Journal of Engineering Research and Science},
    author = {Sharma, P and Singh, C},
    year = {2019},
    pages = {9-15},
    keywords = {Index Terms-Clone Detection, Code-Fragment, Text, Textual Comparison}
    }

  • M. Stephan, “Towards a cognizant virtual software modeling assistant using model clones,” in Proceedings of the 41st international conference on software engineering: new ideas and emerging results, 2019, p. 21–24. doi:10.1109/ICSE-NIER.2019.00014
    [BibTeX] [PDF]
    @inproceedings{10.1109/ICSE-NIER.2019.00014,
    author = {Stephan, Matthew},
    title = {Towards a Cognizant Virtual Software Modeling Assistant Using Model Clones},
    year = {2019},
    publisher = {IEEE Press},
    url = {https://doi.org/10.1109/ICSE-NIER.2019.00014},
    doi = {10.1109/ICSE-NIER.2019.00014},
    booktitle = {Proceedings of the 41st International Conference on Software Engineering: New Ideas and Emerging Results},
    pages = {21–24},
    numpages = {4},
    keywords = {model driven engineering, model clone detection, model clones, machine learning, software modeling},
    location = {Montreal, Quebec, Canada},
    series = {ICSE-NIER ’19}
    }

  • P. Thongtanunam, W. Shang, and A. E. Hassan, “Will this clone be short-lived? Towards a better understanding of the characteristics of short-lived clones,” Empirical software engineering, vol. 24, iss. 2, pp. 937-972, 2019. doi:10.1007/s10664-018-9645-2
    [BibTeX] [Abstract] [PDF]

    Code clones are created when a developer duplicates a code fragment to reuse existing functionalities. Mitigating clones by refactoring them helps ease the long-term maintenance of large software systems. However, refactoring can introduce an additional cost. Prior work also suggest that refactoring all clones can be counterproductive since clones may live in a system for a short duration. Hence, it is beneficial to determine in advance whether a newly-introduced clone will be short-lived or long-lived to plan the most effective use of resources. In this work, we perform an empirical study on six open source Java systems to better understand the life expectancy of clones. We find that a large number of clones (i.e., 30\% to 87\%) lived in the systems for a short duration. Moreover, we find that although short-lived clones were changed more frequently than long-lived clones throughout their lifetime, short-lived clones were consistently changed with their siblings less often than long-lived clones. Furthermore, we build random forest classifiers in order to determine the life expectancy of a newly-introduced clone (i.e., whether a clone will be short-lived or long-lived). Our empirical results show that our random forest classifiers can determine the life expectancy of a newly-introduced clone with an average AUC of 0.63 to 0.92. We also find that the churn made to the methods containing a newly-introduced clone, the complexity and size of the methods containing the newly-introduced clone are highly influential in determining whether the newly-introduced clone will be short-lived. Furthermore, the size of a newly-introduced clone shares a positive relationship with the likelihood that the newly-introduced clone will be short-lived. 
    Our results suggest that, to improve the efficiency of clone management efforts, practitioners can leverage our classifiers and insights in order to determine whether a newly-introduced clone will be short-lived or long-lived to plan the most effective use of their clone management resources in advance.

    @article{thongtanunam_will_2019,
    title = {Will this clone be short-lived? {Towards} a better understanding of the characteristics of short-lived clones},
    volume = {24},
    issn = {15737616},
    url = {https://link.springer.com/article/10.1007/s10664-018-9645-2},
    doi = {10.1007/s10664-018-9645-2},
    abstract = {Code clones are created when a developer duplicates a code fragment to reuse existing functionalities. Mitigating clones by refactoring them helps ease the long-term maintenance of large software systems. However, refactoring can introduce an additional cost. Prior work also suggest that refactoring all clones can be counterproductive since clones may live in a system for a short duration. Hence, it is beneficial to determine in advance whether a newly-introduced clone will be short-lived or long-lived to plan the most effective use of resources. In this work, we perform an empirical study on six open source Java systems to better understand the life expectancy of clones. We find that a large number of clones (i.e., 30\% to 87\%) lived in the systems for a short duration. Moreover, we find that although short-lived clones were changed more frequently than long-lived clones throughout their lifetime, short-lived clones were consistently changed with their siblings less often than long-lived clones. Furthermore, we build random forest classifiers in order to determine the life expectancy of a newly-introduced clone (i.e., whether a clone will be short-lived or long-lived). Our empirical results show that our random forest classifiers can determine the life expectancy of a newly-introduced clone with an average AUC of 0.63 to 0.92. We also find that the churn made to the methods containing a newly-introduced clone, the complexity and size of the methods containing the newly-introduced clone are highly influential in determining whether the newly-introduced clone will be short-lived. Furthermore, the size of a newly-introduced clone shares a positive relationship with the likelihood that the newly-introduced clone will be short-lived. 
Our results suggest that, to improve the efficiency of clone management efforts, practitioners can leverage our classifiers and insights in order to determine whether a newly-introduced clone will be short-lived or long-lived to plan the most effective use of their clone management resources in advance.},
    number = {2},
    journal = {Empirical Software Engineering},
    author = {Thongtanunam, Patanamon and Shang, Weiyi and Hassan, Ahmed E.},
    month = apr,
    year = {2019},
    note = {Publisher: Springer New York LLC},
    keywords = {Code clone, Software evolution, Software maintenance, Software management},
    pages = {937-972}
    }

  • K. Uemura, A. Mori, E. Choi, and H. Iida, “Tracking method-level clones and a case study,” in 2019 ieee 13th international workshop on software clones (iwsc), 2019, pp. 27-33.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8665851,
    author={K. {Uemura} and A. {Mori} and E. {Choi} and H. {Iida}},
    booktitle={2019 IEEE 13th International Workshop on Software Clones (IWSC)},
    title={Tracking Method-Level Clones and a Case Study},
    year={2019},
    url = {https://ieeexplore.ieee.org/abstract/document/8665851/},
    volume={},
    number={},
    pages={27-33},
    }

  • B. van Bladel and S. Demeyer, “A novel approach for detecting type-iv clones in test code,” in 2019 ieee 13th international workshop on software clones (iwsc), 2019, pp. 8-12.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8665855,
    author={B. {van Bladel} and S. {Demeyer}},
    booktitle={2019 IEEE 13th International Workshop on Software Clones (IWSC)},
    title={A Novel Approach for Detecting Type-IV Clones in Test Code},
    year={2019},
    url = {https://ieeexplore.ieee.org/abstract/document/8665855/},
    volume={},
    number={},
    pages={8-12},
    }

  • D. Yu, J. Yang, X. Chen, and J. Chen, “Detecting java code clones based on bytecode sequence alignment,” Ieee access, vol. 7, pp. 22421-22433, 2019.
    [BibTeX] [PDF]
    @ARTICLE{8637956,
    author={D. {Yu} and J. {Yang} and X. {Chen} and J. {Chen}},
    journal={IEEE Access},
    title={Detecting Java Code Clones Based on Bytecode Sequence Alignment},
    year={2019},
    url = {https://ieeexplore.ieee.org/abstract/document/8637956/},
    volume={7},
    number={},
    pages={22421-22433},
    }

  • Q. U. Ain, W. H. Butt, M. W. Anwar, F. Azam, and B. Maqbool, “A systematic review on code clone detection,” Ieee access, vol. 7, pp. 86121-86144, 2019.
    [BibTeX] [PDF]
    @ARTICLE{8719895,
    author={Q. U. {Ain} and W. H. {Butt} and M. W. {Anwar} and F. {Azam} and B. {Maqbool}},
    journal={IEEE Access},
    title={A Systematic Review on Code Clone Detection},
    year={2019},
    url = {https://ieeexplore.ieee.org/document/8719895},
    volume={7},
    number={},
    pages={86121-86144},
    }

2018

  • M. H. Alalfi, E. P. Antony, and J. R. Cordy, “An approach to clone detection in sequence diagrams and its application to security analysis,” Softw. syst. model., vol. 17, iss. 4, pp. 1287-1309, 2018. doi:10.1007/s10270-016-0557-6
    [BibTeX] [PDF]
    @article{10.1007/s10270-016-0557-6,
    author = {Alalfi, Manar H. and Antony, Elizabeth P. and Cordy, James R.},
    title = {An Approach to Clone Detection in Sequence Diagrams and Its Application to Security Analysis},
    year = {2018},
    issue_date = {October 2018},
    publisher = {Springer-Verlag},
    address = {Berlin, Heidelberg},
    volume = {17},
    number = {4},
    issn = {1619-1366},
    url = {https://doi.org/10.1007/s10270-016-0557-6},
    doi = {10.1007/s10270-016-0557-6},
    journal = {Softw. Syst. Model.},
    month = oct,
    pages = {1287-1309},
    numpages = {23},
    keywords = {Model clone detection, Model based security analysis}
    }

  • S. Baltes, L. Dumani, C. Treude, and S. Diehl, “The evolution of stack overflow posts: reconstruction and analysis,” Corr, 2018.
    [BibTeX] [PDF]
    @article{baltes_evolution_2018,
    title = {The Evolution of Stack Overflow Posts: Reconstruction and Analysis},
    url = {http://arxiv.org/abs/1811.00804},
    author = {Baltes, Sebastian and Dumani, Lorik and Treude, Christoph and Diehl, Stephan},
    month = nov,
    journal = {CoRR},
    year = {2018},
    note = {arXiv: 1811.00804}
    }

  • L. Barbour, L. An, F. Khomh, Y. Zou, and S. Wang, “An investigation of the fault-proneness of clone evolutionary patterns,” Software quality journal, vol. 26, iss. 4, p. 1187–1222, 2018. doi:10.1007/s11219-017-9375-5
    [BibTeX] [PDF]
    @article{10.1007/s11219-017-9375-5,
    author = {Barbour, Liliane and An, Le and Khomh, Foutse and Zou, Ying and Wang, Shaohua},
    title = {An Investigation of the Fault-Proneness of Clone Evolutionary Patterns},
    year = {2018},
    issue_date = {December 2018},
    publisher = {Kluwer Academic Publishers},
    address = {USA},
    volume = {26},
    number = {4},
    issn = {0963-9314},
    url = {https://doi.org/10.1007/s11219-017-9375-5},
    doi = {10.1007/s11219-017-9375-5},
    journal = {Software Quality Journal},
    month = dec,
    pages = {1187–1222},
    numpages = {36},
    keywords = {Metrics, Clone genealogies, Fault-proneness}
    }

  • M. Elsabagh, R. Johnson, and A. Stavrou, “Resilient and scalable cloned app detection using forced execution and compression trees,” in 2018 ieee conference on dependable and secure computing (dsc), 2018, pp. 1-8.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8625133,
    author={M. {Elsabagh} and R. {Johnson} and A. {Stavrou}},
    booktitle={2018 IEEE Conference on Dependable and Secure Computing (DSC)},
    title={Resilient and Scalable Cloned App Detection Using Forced Execution and Compression Trees},
    year={2018},
    url = {https://ieeexplore.ieee.org/abstract/document/8625133/},
    volume={},
    number={},
    pages={1-8},
    }

  • C. K. Roy and J. R. Cordy, “Adventures in nicad: a ten-year retrospective,” in Proceedings of the 26th conference on program comprehension, New York, NY, USA, 2018, p. 19. doi:10.1145/3196321.3196325
    [BibTeX] [PDF]
    @inproceedings{10.1145/3196321.3196325,
    author = {Roy, Chanchal K. and Cordy, James R.},
    title = {Adventures in NICAD: A Ten-Year Retrospective},
    year = {2018},
    isbn = {9781450357142},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3196321.3196325},
    doi = {10.1145/3196321.3196325},
    booktitle = {Proceedings of the 26th Conference on Program Comprehension},
    pages = {19},
    numpages = {1},
    location = {Gothenburg, Sweden},
    series = {ICPC ’18}
    }

  • V. Saini, F. Farmahinifarahani, Y. Lu, P. Baldi, and C. V. Lopes, “Oreo: detection of clones in the twilight zone,” in Proceedings of the 2018 26th acm joint meeting on european software engineering conference and symposium on the foundations of software engineering, New York, NY, USA, 2018, p. 354–365. doi:10.1145/3236024.3236026
    [BibTeX] [PDF]
    @inproceedings{10.1145/3236024.3236026,
    author = {Saini, Vaibhav and Farmahinifarahani, Farima and Lu, Yadong and Baldi, Pierre and Lopes, Cristina V.},
    title = {Oreo: Detection of Clones in the Twilight Zone},
    year = {2018},
    isbn = {9781450355735},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3236024.3236026},
    doi = {10.1145/3236024.3236026},
    booktitle = {Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering},
    pages = {354–365},
    numpages = {12},
    keywords = {Clone detection, Machine Learning, Software Metrics},
    location = {Lake Buena Vista, FL, USA},
    series = {ESEC/FSE 2018}
    }

  • R. Tekchandani, R. Bhatia, and M. Singh, “Semantic code clone detection for Internet of Things applications using reaching definition and liveness analysis,” Journal of supercomputing, vol. 74, iss. 9, pp. 4199-4226, 2018. doi:10.1007/s11227-016-1832-6
    [BibTeX] [Abstract] [PDF]

    Knowledge extraction from existing software resources for maintenance, re-engineering and bug removal through code clone detection is an integral part of most of the internet-enabled devices. Similar code fragments which are live at different locations are called code clones. These Internet-enabled devices are used for knowledge sharing and data extraction to execute various applications related to code clone detection. However, most of the existing semantic code clone detection techniques are unable to provide heuristic solution for problems such as statement reordering, inversion of control predicates and insertion of irrelevant statements which may cause a performance bottleneck in this environment. To address these issues, we propose a novel approach that finds semantic code clones in a program or procedure using data flow analysis on the basis of reaching definition and liveness analysis. The algorithm based on reaching definition and liveness analysis is designed to find similar code fragments which are structurally divergent, but semantically equivalent. The results obtained demonstrate that the proposed approach using reaching definition and liveness analysis is effective in detection of semantic code clones for various applications running on the Internet-enabled devices. We have found 5831 semantically equivalent clone pairs on subject systems taken from DeCapo benchmark after elimination of 29,029 dead codes/statements having 2,16,579 line of code (LOC).

    @article{tekchandani_semantic_2018,
    title = {Semantic code clone detection for {Internet} of {Things} applications using reaching definition and liveness analysis},
    volume = {74},
    issn = {15730484},
    url = {https://link.springer.com/article/10.1007/s11227-016-1832-6},
    doi = {10.1007/s11227-016-1832-6},
    abstract = {Knowledge extraction from existing software resources for maintenance, re-engineering and bug removal through code clone detection is an integral part of most of the internet-enabled devices. Similar code fragments which are live at different locations are called code clones. These Internet-enabled devices are used for knowledge sharing and data extraction to execute various applications related to code clone detection. However, most of the existing semantic code clone detection techniques are unable to provide heuristic solution for problems such as statement reordering, inversion of control predicates and insertion of irrelevant statements which may cause a performance bottleneck in this environment. To address these issues, we propose a novel approach that finds semantic code clones in a program or procedure using data flow analysis on the basis of reaching definition and liveness analysis. The algorithm based on reaching definition and liveness analysis is designed to find similar code fragments which are structurally divergent, but semantically equivalent. The results obtained demonstrate that the proposed approach using reaching definition and liveness analysis is effective in detection of semantic code clones for various applications running on the Internet-enabled devices. We have found 5831 semantically equivalent clone pairs on subject systems taken from DeCapo benchmark after elimination of 29,029 dead codes/statements having 2,16,579 line of code (LOC).},
    number = {9},
    journal = {Journal of Supercomputing},
    author = {Tekchandani, Rajkumar and Bhatia, Rajesh and Singh, Maninder},
    month = sep,
    year = {2018},
    note = {Publisher: Springer New York LLC},
    keywords = {Code clones, Control flow, Data flow, Liveness analysis, Reaching definition},
    pages = {4199-4226}
    }

  • N. Yoshida, T. Ishizu, B. Edwards, and K. Inoue, “How slim will my system be?: Estimating refactored code size by merging clones,” in Proceedings – International Conference on Software Engineering, 2018, pp. 352-360. doi:10.1145/3196321.3196353
    [BibTeX] [Abstract] [PDF]

    We have been doing code clone analysis with industry collaborators for a long time, and have been always asked a question, “OK, I understand my system contains a lot of code clones, but how slim will it be after merging redundant code clones?” As a software system evolves for long period, it would increasingly contain many code clones due to quick bug fix and new feature addition. Industry collaborators would recognize decay of initial design simplicity, and try to evaluate current system from the view point of maintenance effort and cost. As one of resources for the evaluation, the estimated code size by merging code clone is very important for them. In this paper, we formulate this issue as “slimming” problem, and present three different slimming methods, Basic, Complete, and Heuristic Methods, each of which gives a lower bound, upper bound, and modest reduction rates, respectively. Application of these methods to OSS systems written in C/C++ showed that the reduction rate is at most 5.7\% of the total size, and to a commercial COBOL system, it is at most 15.4\%. For this approach, we have gotten initial but very positive feedback from industry collaborators.

    @inproceedings{yoshida_how_2018,
    title = {How slim will my system be?: {Estimating} refactored code size by merging clones},
    isbn = {978-1-4503-5714-2},
    url = {https://dl.acm.org/doi/abs/10.1145/3196321.3196353},
    doi = {10.1145/3196321.3196353},
    abstract = {We have been doing code clone analysis with industry collaborators for a long time, and have been always asked a question, "OK, I understand my system contains a lot of code clones, but how slim will it be after merging redundant code clones?" As a software system evolves for long period, it would increasingly contain many code clones due to quick bug fix and new feature addition. Industry collaborators would recognize decay of initial design simplicity, and try to evaluate current system from the view point of maintenance effort and cost. As one of resources for the evaluation, the estimated code size by merging code clone is very important for them. In this paper, we formulate this issue as "slimming" problem, and present three different slimming methods, Basic, Complete, and Heuristic Methods, each of which gives a lower bound, upper bound, and modest reduction rates, respectively. Application of these methods to OSS systems written in C/C++ showed that the reduction rate is at most 5.7\% of the total size, and to a commercial COBOL system, it is at most 15.4\%. For this approach, we have gotten initial but very positive feedback from industry collaborators.},
    booktitle = {Proceedings - {International} {Conference} on {Software} {Engineering}},
    publisher = {IEEE Computer Society},
    author = {Yoshida, Norihiro and Ishizu, Takuya and Edwards, Bufurod and Inoue, Katsuro},
    month = may,
    year = {2018},
    note = {ISSN: 02705257},
    keywords = {code clone, refactoring, size estimation},
    pages = {352-360}
    }

  • J. Akram, Z. Shi, M. Mumtaz, and P. Luo, “Dccd: an efficient and scalable distributed code clone detection technique for big code.” 2018, pp. 354-390. doi:10.18293/SEKE2018-117
    [BibTeX]
    @inproceedings{inproceedings,
    author = {Akram, Junaid and Shi, Zhendong and Mumtaz, Majid and Luo, Ping},
    year = {2018},
    month = {07},
    pages = {354-390},
    title = {DCCD: An Efficient and Scalable Distributed Code Clone Detection Technique for Big Code},
    doi = {10.18293/SEKE2018-117},
    booktitle = {The 30th International Conference on Software Engineering and Knowledge Engineering}
    }

  • H. W. Alomari and M. Stephan, “Towards slice-based semantic clone detection,” in 2018 ieee 12th international workshop on software clones (iwsc), 2018, pp. 58-59.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8327320,
    author={H. W. {Alomari} and M. {Stephan}},
    booktitle={2018 IEEE 12th International Workshop on Software Clones (IWSC)},
    title={Towards slice-based semantic clone detection},
    year={2018},
    volume={},
    number={},
    pages={58-59},
    url = {https://ieeexplore.ieee.org/document/8327320}
    }

  • S. B. Ankali and L. Parthiban, “Dictionary based approach to detect cross language clones of c and java language,” Journal of computational and theoretical nanoscience, vol. 15, pp. 3334-3340, 2018.
    [BibTeX] [PDF]
    @article{ankali_dictionary_nodate,
    title = {Dictionary Based Approach to Detect Cross Language Clones of C and Java Language},
    url = {https://www.ingentaconnect.com/content/asp/jctn/2018/00000015/f0020011/art00045},
    journal = {Journal of Computational and Theoretical Nanoscience},
    author = {Ankali, SB and Parthiban, L},
    year = {2018},
    pages = {3334-3340},
    volume = {15}
    }

  • J. Barbosa, R. M. C. Andrade, J. B. F. Filho, C. I. M. Bezerra, I. Barreto, and R. Capilla, “Cloning in customization classes: a case of a worldwide software product line,” in Proceedings of the vii brazilian symposium on software components, architectures, and reuse, New York, NY, USA, 2018, pp. 43-52. doi:10.1145/3267183.3267188
    [BibTeX] [PDF]
    @inproceedings{10.1145/3267183.3267188,
    author = {Barbosa, Jefferson and Andrade, Rossana M. C. and Filho, Jo\~{a}o Bosco F. and Bezerra, Carla I. M. and Barreto, Isaac and Capilla, Rafael},
    title = {Cloning in Customization Classes: A Case of a Worldwide Software Product Line},
    year = {2018},
    isbn = {9781450365543},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3267183.3267188},
    doi = {10.1145/3267183.3267188},
    booktitle = {Proceedings of the VII Brazilian Symposium on Software Components, Architectures, and Reuse},
    pages = {43-52},
    numpages = {10},
    keywords = {Software Product Line, Customization, Clone},
    location = {Sao Carlos, Brazil},
    series = {SBCARS ’18}
    }

  • A. Blasi and A. Gorla, “Replicomment: identifying clones in code comments,” in Proceedings of the 26th conference on program comprehension, New York, NY, USA, 2018, p. 320–323. doi:10.1145/3196321.3196360
    [BibTeX] [PDF]
    @inproceedings{10.1145/3196321.3196360,
    author = {Blasi, Arianna and Gorla, Alessandra},
    title = {Replicomment: Identifying Clones in Code Comments},
    year = {2018},
    isbn = {9781450357142},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3196321.3196360},
    doi = {10.1145/3196321.3196360},
    booktitle = {Proceedings of the 26th Conference on Program Comprehension},
    pages = {320–323},
    numpages = {4},
    keywords = {software quality, code comments, clones, bad smell},
    location = {Gothenburg, Sweden},
    series = {ICPC ’18}
    }

  • Z. Chen, Y. W. Kwon, and M. Song, “Clone refactoring inspection by summarizing clone refactorings and detecting inconsistent changes during software evolution,” Journal of software: evolution and process, vol. 30, iss. 10, 2018. doi:10.1002/smr.1951
    [BibTeX] [PDF]
    @article{chen_clone_2018,
    title = {Clone refactoring inspection by summarizing clone refactorings and detecting inconsistent changes during software evolution},
    volume = {30},
    url = {https://onlinelibrary.wiley.com/doi/abs/10.1002/smr.1951},
    issn = {20477481},
    doi = {10.1002/smr.1951},
    number = {10},
    journal = {Journal of Software: Evolution and Process},
    author = {Chen, Zhiyuan and Kwon, Young Woo and Song, Myoungkyu},
    month = oct,
    year = {2018},
    note = {Publisher: John Wiley and Sons Ltd},
    keywords = {code clone, refactoring, software maintenance and evolution}
    }

  • V. Guna and S. M. Kumar, “A Survey on Software Code Clone Detection to Improve the Maintenance Effort and Maintenance Cost of the Software,” International journal of computer sciences and engineering, 2018. doi:10.26438/ijcse/v6si3.188192
    [BibTeX] [Abstract] [PDF]

    During the development of the software the developers have a chance to copy the code continuously. Due to copying of the code there is a chance of having the identical or more similar code fragments in the software and it is called as software clones or code clones. These clones can be detected from the existing code that is in c, c++, java etc programming languages. By the Argo UML tool to the existing code to generate the class diagrams by using reverse engineering process. In software development process, coping of existing code fragment and pasting them with or without modification is a frequent process. Code clone means copy of an original form or duplicate. Software clone detection is important to reduce the software maintenance cost and to recognize the software system in a better way. There are many software code clone detection techniques such as text-based, token-based, Abstract Syntax tree based etc. and they are used to spot and finding the existence of clones in software system. Mainly detection of clones is on the type-1, type-2 and type-3 clones. These clones can be detected by using several novel algorithms are ARIMA, Back propagation, Multi objective genetic algorithm, support vector machines and also with several hybrid techniques with respect to recall and precision.

    @article{guna_survey_2018,
    title = {A {Survey} on {Software} {Code} {Clone} {Detection} to {Improve} the {Maintenance} {Effort} and {Maintenance} {Cost} of the {Software}},
    issn = {2347-2693},
    url = {https://www.researchgate.net/publication/330653359_A_Survey_on_Software_Code_Clone_Detection_to_Improve_the_Maintenance_Effort_and_Maintenance_Cost_of_the_Software},
    doi = {10.26438/ijcse/v6si3.188192},
    abstract = {During the development of the software the developers have a chance to copy the code continuously. Due to copying of the code there is a chance of having the identical or more similar code fragments in the software and it is called as software clones or code clones. These clones can be detected from the existing code that is in c, c++, java etc programming languages. By the Argo UML tool to the existing code to generate the class diagrams by using reverse engineering process. In software development process, coping of existing code fragment and pasting them with or without modification is a frequent process. Code clone means copy of an original form or duplicate. Software clone detection is important to reduce the software maintenance cost and to recognize the software system in a better way. There are many software code clone detection techniques such as text-based, token-based, Abstract Syntax tree based etc. and they are used to spot and finding the existence of clones in software system. Mainly detection of clones is on the type-1, type-2 and type-3 clones. These clones can be detected by using several novel algorithms are ARIMA, Back propagation, Multi objective genetic algorithm, support vector machines and also with several hybrid techniques with respect to recall and precision.},
    journal = {International Journal of Computer Sciences and Engineering},
    author = {Guna, V and Kumar, M Sunil},
    year = {2018},
    pages = {188-192},
    keywords = {Software maintenance, Code Clones, Recall and Precision, Type-1, Type-II and Type-III clones}
    }

  • M. R. Islam and M. F. Zibran, “On the characteristics of buggy code clones: a code quality perspective,” in 2018 ieee 12th international workshop on software clones (iwsc), 2018, pp. 23-29.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8327315,
    author={M. R. {Islam} and M. F. {Zibran}},
    booktitle={2018 IEEE 12th International Workshop on Software Clones (IWSC)},
    title={On the characteristics of buggy code clones: A code quality perspective},
    year={2018},
    url = {https://ieeexplore.ieee.org/abstract/document/8327315/},
    volume={},
    number={},
    pages={23-29},
    }

  • J. Kanwal, H. A. Basit, and O. Maqbool, “Structural clones: an evolution perspective,” in 2018 ieee 12th international workshop on software clones (iwsc), 2018, pp. 9-15.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8327313,
    author={J. {Kanwal} and H. A. {Basit} and O. {Maqbool}},
    booktitle={2018 IEEE 12th International Workshop on Software Clones (IWSC)},
    title={Structural clones: An evolution perspective},
    year={2018},
    url = {https://ieeexplore.ieee.org/abstract/document/8327313/},
    volume={},
    number={},
    pages={9-15},
    }

  • K. Kim, D. Kim, T. F. Bissyandé, E. Choi, L. Li, J. Klein, and Y. L. Traon, “Facoy: a code-to-code search engine,” in Proceedings of the 40th international conference on software engineering, New York, NY, USA, 2018, p. 946–957. doi:10.1145/3180155.3180187
    [BibTeX] [PDF]
    @inproceedings{10.1145/3180155.3180187,
    author = {Kim, Kisub and Kim, Dongsun and Bissyand\'{e}, Tegawend\'{e} F. and Choi, Eunjong and Li, Li and Klein, Jacques and Traon, Yves Le},
    title = {FaCoY: A Code-to-Code Search Engine},
    year = {2018},
    isbn = {9781450356381},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3180155.3180187},
    doi = {10.1145/3180155.3180187},
    booktitle = {Proceedings of the 40th International Conference on Software Engineering},
    pages = {946–957},
    numpages = {12},
    location = {Gothenburg, Sweden},
    series = {ICSE ’18}
    }

  • M. Mondal, C. K. Roy, and K. A. Schneider, “Bug-proneness and late propagation tendency of code clones: a comparative study on different clone types,” Journal of systems and software, vol. 144, pp. 41-59, 2018. doi:10.1016/j.jss.2018.05.028
    [BibTeX] [PDF]
    @article{MONDAL201841,
    title = "Bug-proneness and late propagation tendency of code clones: A Comparative study on different clone types",
    journal = "Journal of Systems and Software",
    volume = "144",
    pages = "41 - 59",
    year = "2018",
    issn = "0164-1212",
    doi = "10.1016/j.jss.2018.05.028",
    url = "http://www.sciencedirect.com/science/article/pii/S0164121218301079",
    author = "Manishankar Mondal and Chanchal K. Roy and Kevin A. Schneider",
    keywords = "Code clones, Clone-types, Bug-proneness, Late propagation",
    }

  • M. Mondal, C. K. Roy, and K. A. Schneider, “Micro-clones in evolving software,” in 2018 ieee 25th international conference on software analysis, evolution and reengineering (saner), 2018, pp. 50-60.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8330196,
    author={M. {Mondal} and C. K. {Roy} and K. A. {Schneider}},
    booktitle={2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)},
    title={Micro-clones in evolving software},
    year={2018},
    url = {https://ieeexplore.ieee.org/abstract/document/8330196/},
    volume={},
    number={},
    pages={50-60},}

  • G. Mostaeen, J. Svajlenko, B. Roy, C. K. Roy, and K. A. Schneider, “[research paper] on the use of machine learning techniques towards the design of cloud based automatic code clone validation tools,” in 2018 ieee 18th international working conference on source code analysis and manipulation (scam), 2018, pp. 155-164.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8530729,
    author={G. {Mostaeen} and J. {Svajlenko} and B. {Roy} and C. K. {Roy} and K. A. {Schneider}},
    booktitle={2018 IEEE 18th International Working Conference on Source Code Analysis and Manipulation (SCAM)},
    title={[Research Paper] On the Use of Machine Learning Techniques Towards the Design of Cloud Based Automatic Code Clone Validation Tools},
    year={2018},
    url = {https://ieeexplore.ieee.org/abstract/document/8530729/},
    volume={},
    number={},
    pages={155-164},}

  • K. Narasimhan, C. Reichenbach, and J. Lawall, “Cleaning up copy–paste clones with interactive merging,” Automated software engg., vol. 25, iss. 3, p. 627–673, 2018. doi:10.1007/s10515-018-0238-5
    [BibTeX] [PDF]
    @article{10.1007/s10515-018-0238-5,
    author = {Narasimhan, Krishna and Reichenbach, Christoph and Lawall, Julia},
    title = {Cleaning up Copy--Paste Clones with Interactive Merging},
    year = {2018},
    issue_date = {September 2018},
    publisher = {Kluwer Academic Publishers},
    address = {USA},
    volume = {25},
    number = {3},
    issn = {0928-8910},
    url = {https://doi.org/10.1007/s10515-018-0238-5},
    doi = {10.1007/s10515-018-0238-5},
    journal = {Automated Software Engg.},
    month = sep,
    pages = {627–673},
    numpages = {47},
    keywords = {Static analysis, Clone management, Program analysis, Source code analysis}
    }

  • R. Perez-Castillo and M. Piattini, “An empirical study on how project context impacts on code cloning,” Journal of software: evolution and process, vol. 30, iss. 12, p. e2115, 2018. doi:10.1002/smr.2115
    [BibTeX] [Abstract] [PDF]

    Code cloning can seriously affect software quality. Code clones are various fragments of syntactically or semantically equivalent code. Some authors argue that code clones have a negative impact on maintainability and understandability, since clones propagate defects and make it mandatory to pay attention to several copies. However, other authors believe clones are not necessarily bad, since self-admitted clones favor system stability and allow developers to move projects forward. Although some root causes and effects of cloning have been widely studied, there is not much relevant work analyzing how certain projects context factors impact on code cloning. This work presents an empirical validation of six open source projects by considering certain factors from Git repositories measured throughout a total of 70 releases for the 6 systems. The factors analyzed were the number of commits and committers per release, the average size of the commits and the size of the system in each release. The main conclusion obtained from the study is that, while the number of commits and committers and the system size do not significantly affect cloning, larger commits lead to a higher cloning ratio. These insights contribute to predicting and preventing code cloning, thus enabling a software quality improvement.

    @article{doi:10.1002/smr.2115,
    author = {Perez-Castillo, Ricardo and Piattini, Mario},
    title = {An empirical study on how project context impacts on code cloning},
    journal = {Journal of Software: Evolution and Process},
    volume = {30},
    number = {12},
    pages = {e2115},
    keywords = {code cloning, development context, empirical study, git},
    doi = {10.1002/smr.2115},
    url = {https://onlinelibrary.wiley.com/doi/abs/10.1002/smr.2115},
    eprint = {https://onlinelibrary.wiley.com/doi/pdf/10.1002/smr.2115},
    note = {e2115 JSME-17-0218.R2},
    abstract = {Code cloning can seriously affect software quality. Code clones are various fragments of syntactically or semantically equivalent code. Some authors argue that code clones have a negative impact on maintainability and understandability, since clones propagate defects and make it mandatory to pay attention to several copies. However, other authors believe clones are not necessarily bad, since self-admitted clones favor system stability and allow developers to move projects forward. Although some root causes and effects of cloning have been widely studied, there is not much relevant work analyzing how certain projects context factors impact on code cloning. This work presents an empirical validation of six open source projects by considering certain factors from Git repositories measured throughout a total of 70 releases for the 6 systems. The factors analyzed were the number of commits and committers per release, the average size of the commits and the size of the system in each release. The main conclusion obtained from the study is that, while the number of commits and committers and the system size do not significantly affect cloning, larger commits lead to a higher cloning ratio. These insights contribute to predicting and preventing code cloning, thus enabling a software quality improvement.},
    year = {2018}
    }

  • C. Ragkhitwetsagul, J. Krinke, and B. Marnette, “A picture is worth a thousand words: code clone detection based on image similarity,” in 2018 ieee 12th international workshop on software clones (iwsc), 2018, pp. 44-50.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8327318,
    author={C. {Ragkhitwetsagul} and J. {Krinke} and B. {Marnette}},
    booktitle={2018 IEEE 12th International Workshop on Software Clones (IWSC)},
    title={A picture is worth a thousand words: Code clone detection based on image similarity},
    year={2018},
    url = {https://ieeexplore.ieee.org/document/8327318},
    volume={},
    number={},
    pages={44-50},}

  • C. K. Roy and J. R. Cordy, “Benchmarks for software clone detection: a ten-year retrospective,” in 2018 ieee 25th international conference on software analysis, evolution and reengineering (saner), 2018, pp. 26-37.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8330194,
    author={C. K. {Roy} and J. R. {Cordy}},
    booktitle={2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)},
    title={Benchmarks for software clone detection: A ten-year retrospective},
    year={2018},
    url = {https://ieeexplore.ieee.org/abstract/document/8330194/},
    volume={},
    number={},
    pages={26-37},
    }

  • N. Saini, S. Singh, and Suman, “Code clones: Detection and management,” Procedia computer science, pp. 718-727, 2018.
    [BibTeX] [PDF]
    @article{saini_code_nodate,
    title = {Code clones: {Detection} and management},
    url = {https://www.sciencedirect.com/science/article/pii/S1877050918308123},
    journal = {Procedia Computer Science},
    author = {Neha Saini and Sukhdip Singh and Suman},
    pages = {718-727},
    year = {2018}
    }

  • Y. Semura, N. Yoshida, E. Choi, and K. Inoue, “Multilingual detection of code clones using antlr grammar definitions,” in 2018 25th asia-pacific software engineering conference (apsec), 2018, pp. 673-677.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8719568,
    author={Y. {Semura} and N. {Yoshida} and E. {Choi} and K. {Inoue}},
    booktitle={2018 25th Asia-Pacific Software Engineering Conference (APSEC)},
    title={Multilingual Detection of Code Clones Using ANTLR Grammar Definitions},
    year={2018},
    url = {https://ieeexplore.ieee.org/document/8719568},
    volume={},
    number={},
    pages={673-677},
    }

  • P. Sharma and A. Kaur, “Code Smell Detection Techniques and Process: A Review,” 2018.
    [BibTeX] [Abstract] [PDF]

    A code smell is a hint that something has turned out badly some place in your code. The idea of code smells was introduced to characterize various different types of design shortcomings in code. Code and design smells are poor solutions to recurring implementation and design problems. They may hinder the evolution of a system by making it hard for software engineers to carry out changes. In this paper, we reviewed code smell detection tool like: Décor, InFusion, JDeodorant, PMD, Stench Blossom, etc. Furthermore, we discussed various code smells detecting techniques. Code clones are indistinguishable fragment of source code which may be embedded deliberately or inadvertently. Reusing code pieces through reordering with or without minor adjustments is general undertaking in programming advancement. We’ve examined several papers to explore various tools and techniques used for code smell. In addition, we reviewed the process of code smell detection.

    @techreport{sharma_code_nodate,
    title = {Code {Smell} {Detection} {Techniques} and {Process}: {A} {Review}},
    url = {http://www.ijfrcsce.org/download/browse/Volume_4/April_18_Volume_4_Issue_4/1524737260_26-04-2018.pdf},
    abstract = {A code smell is a hint that something has turned out badly some place in your code. The idea of code smells was introduced to characterize various different types of design shortcomings in code. Code and design smells are poor solutions to recurring implementation and design problems. They may hinder the evolution of a system by making it hard for software engineers to carry out changes. In this paper, we reviewed code smell detection tool like: Décor, InFusion, JDeodorant, PMD, Stench Blossom, etc. Furthermore, we discussed various code smells detecting techniques. Code clones are indistinguishable fragment of source code which may be embedded deliberately or inadvertently. Reusing code pieces through reordering with or without minor adjustments is general undertaking in programming advancement. We've examined several papers to explore various tools and techniques used for code smell. In addition, we reviewed the process of code smell detection.},
    author = {Sharma, Pratiksha and Kaur, Arshpreet},
    journal = {International Journal on Future Revolution in Computer Science \& Communication Engineering},
    keywords = {Code Smell, Detection tool (ie, fragment of source code, Infusion and Deodorant)},
    pages = {425-430},
    year = {2018}
    }

  • J. Svajlenko and C. K. Roy, “Fast, scalable and user-guided clone detection,” in Proceedings of the 40th international conference on software engineering: companion proceedings, New York, NY, USA, 2018, p. 352–353. doi:10.1145/3183440.3195005
    [BibTeX] [PDF]
    @inproceedings{10.1145/3183440.3195005,
    author = {Svajlenko, Jeffrey and Roy, Chanchal K.},
    title = {Fast, Scalable and User-Guided Clone Detection},
    year = {2018},
    isbn = {9781450356633},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3183440.3195005},
    doi = {10.1145/3183440.3195005},
    booktitle = {Proceedings of the 40th International Conference on Software Engineering: Companion Proceedings},
    pages = {352–353},
    numpages = {2},
    keywords = {fast, large-scale, clone detection, scalable, user guided},
    location = {Gothenburg, Sweden},
    series = {ICSE ’18}
    }

  • T. Vislavski, G. Rakić, N. Cardozo, and Z. Budimac, “Licca: a tool for cross-language clone detection,” 25th ieee international conference on software analysis, evolution and reengineering (saner), 2018. doi:10.1109/SANER.2018.8330250
    [BibTeX] [Abstract] [PDF]

    Code clones mostly have been proven harmful for the development and maintenance of software systems, leading to code deterioration and an increase in bugs as the system evolves. Modern software systems are composed of several components, incorporating multiple technologies in their development. In such systems, it is common to replicate (parts of) functionality across the different components, potentially in a different programming language. Effect of these duplicates is more acute, as their identification becomes more challenging. This paper presents LICCA, a tool for the identification of duplicate code fragments across multiple languages. LICCA is integrated with the SSQSA platform and relies on its high-level representation of code in which it is possible to extract syntactic and semantic characteristics of code fragments positing full cross-language clone detection. LICCA is on a technology development level. We demonstrate its potential by adopting a set of cloning scenarios, extended and rewritten in five characteristic languages: Java, C, JavaScript, Modula-2 and Scheme.

    @article{vislavski_licca_nodate,
    title = {LICCA: A tool for cross-language clone detection},
    url = {https://www.researchgate.net/publication/323535753},
    doi = {10.1109/SANER.2018.8330250},
    abstract = {Code clones mostly have been proven harmful for the development and maintenance of software systems, leading to code deterioration and an increase in bugs as the system evolves. Modern software systems are composed of several components, incorporating multiple technologies in their development. In such systems, it is common to replicate (parts of) functionality across the different components, potentially in a different programming language. Effect of these duplicates is more acute, as their identification becomes more challenging. This paper presents LICCA, a tool for the identification of duplicate code fragments across multiple languages. LICCA is integrated with the SSQSA platform and relies on its high-level representation of code in which it is possible to extract syntactic and semantic characteristics of code fragments positing full cross-language clone detection. LICCA is on a technology development level. We demonstrate its potential by adopting a set of cloning scenarios, extended and rewritten in five characteristic languages: Java, C, JavaScript, Modula-2 and Scheme.},
    journal = {25th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER)},
    year = {2018},
    author = {Vislavski, Tijana and Rakić, Gordana and Cardozo, Nicolás and Budimac, Zoran}
    }

  • N. Volanschi, “Stereo: editing clones refactored as code generators,” in 2018 ieee international conference on software maintenance and evolution (icsme), 2018, pp. 595-604.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8530072,
    author={N. {Volanschi}},
    booktitle={2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)},
    title={Stereo: Editing Clones Refactored as Code Generators},
    year={2018},
    url = {https://ieeexplore.ieee.org/document/8530072},
    volume={},
    number={},
    pages={595-604},
    }

  • P. Wang, J. Svajlenko, Y. Wu, Y. Xu, and C. K. Roy, “Ccaligner: a token based large-gap clone detector,” in Proceedings of the 40th international conference on software engineering, New York, NY, USA, 2018, p. 1066–1077. doi:10.1145/3180155.3180179
    [BibTeX] [PDF]
    @inproceedings{10.1145/3180155.3180179,
    author = {Wang, Pengcheng and Svajlenko, Jeffrey and Wu, Yanzhao and Xu, Yun and Roy, Chanchal K.},
    title = {CCAligner: A Token Based Large-Gap Clone Detector},
    year = {2018},
    isbn = {9781450356381},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3180155.3180179},
    doi = {10.1145/3180155.3180179},
    booktitle = {Proceedings of the 40th International Conference on Software Engineering},
    pages = {1066–1077},
    numpages = {12},
    keywords = {evaluation, clone detection, large-gap clone},
    location = {Gothenburg, Sweden},
    series = {ICSE ’18}
    }

  • K. Yokoi, E. Choi, N. Yoshida, and K. Inoue, “Investigating vector-based detection of code clones using bigclonebench,” in 2018 25th asia-pacific software engineering conference (apsec), 2018, pp. 699-700.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8719484,
    author={K. {Yokoi} and E. {Choi} and N. {Yoshida} and K. {Inoue}},
    booktitle={2018 25th Asia-Pacific Software Engineering Conference (APSEC)},
    title={Investigating Vector-Based Detection of Code Clones Using BigCloneBench},
    year={2018},
    url = {https://ieeexplore.ieee.org/abstract/document/8719484/},
    volume={},
    number={},
    pages={699-700},
    }

  • R. Yue, Z. Gao, N. Meng, Y. Xiong, X. Wang, and J. D. Morgenthaler, “Automatic clone recommendation for refactoring based on the present and the past,” in 2018 ieee international conference on software maintenance and evolution (icsme), 2018, pp. 115-126.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8530022,
    author={R. {Yue} and Z. {Gao} and N. {Meng} and Y. {Xiong} and X. {Wang} and J. D. {Morgenthaler}},
    booktitle={2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)},
    title={Automatic Clone Recommendation for Refactoring Based on the Present and the Past},
    year={2018},
    url = {https://ieeexplore.ieee.org/abstract/document/8530022/},
    volume={},
    number={},
    pages={115-126},}

  • A. V. Zarras, G. Mamalis, A. Papamichail, P. Kollias, and P. Vassiliadis, “And the tool created a gui that was impure and without form: anti-patterns in automatically generated guis,” in Proceedings of the 23rd european conference on pattern languages of programs, New York, NY, USA, 2018. doi:10.1145/3282308.3282333
    [BibTeX] [PDF]
    @inproceedings{10.1145/3282308.3282333,
    author = {Zarras, Apostolos V. and Mamalis, Georgios and Papamichail, Aggelos and Kollias, Panagiotis and Vassiliadis, Panos},
    title = {And the Tool Created a GUI That Was Impure and Without Form: Anti-Patterns in Automatically Generated GUIs},
    year = {2018},
    isbn = {9781450363877},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3282308.3282333},
    doi = {10.1145/3282308.3282333},
    booktitle = {Proceedings of the 23rd European Conference on Pattern Languages of Programs},
    articleno = {24},
    numpages = {8},
    keywords = {GUIs, Responsibilities, Patterns, Refactoring, Code Clones},
    location = {Irsee, Germany},
    series = {EuroPLoP ’18}
    }

2017

  • S. Charalampidou, A. Ampatzoglou, A. Chatzigeorgiou, and P. Avgeriou, “Assessing code smell interest probability: a case study,” in Proceedings of the xp2017 scientific workshops, New York, NY, USA, 2017. doi:10.1145/3120459.3120465
    [BibTeX] [PDF]
    @inproceedings{10.1145/3120459.3120465,
    author = {Charalampidou, Sofia and Ampatzoglou, Apostolos and Chatzigeorgiou, Alexander and Avgeriou, Paris},
    title = {Assessing Code Smell Interest Probability: A Case Study},
    year = {2017},
    isbn = {9781450352642},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3120459.3120465},
    doi = {10.1145/3120459.3120465},
    booktitle = {Proceedings of the XP2017 Scientific Workshops},
    articleno = {5},
    numpages = {8},
    keywords = {technical debt, case study, change proneness, interest probability},
    location = {Cologne, Germany},
    series = {XP ’17}
    }

  • Y. Dang, D. Zhang, S. Ge, R. Huang, C. Chu, and T. Xie, “Transferring code-clone detection and analysis to practice,” in Proceedings of the 39th international conference on software engineering: software engineering in practice track, 2017, pp. 53-62. doi:10.1109/ICSE-SEIP.2017.6
    [BibTeX] [PDF]
    @inproceedings{10.1109/ICSE-SEIP.2017.6,
    author = {Dang, Yingnong and Zhang, Dongmei and Ge, Song and Huang, Ray and Chu, Chengyun and Xie, Tao},
    title = {Transferring Code-Clone Detection and Analysis to Practice},
    year = {2017},
    isbn = {9781538627174},
    publisher = {IEEE Press},
    url = {https://doi.org/10.1109/ICSE-SEIP.2017.6},
    doi = {10.1109/ICSE-SEIP.2017.6},
    booktitle = {Proceedings of the 39th International Conference on Software Engineering: Software Engineering in Practice Track},
    pages = {53-62},
    numpages = {10},
    location = {Buenos Aires, Argentina},
    series = {ICSE-SEIP ’17}
    }

  • T. A. D. Henderson and A. Podgurski, “Rethinking dependence clones,” in 2017 ieee 11th international workshop on software clones (iwsc), 2017, pp. 1-7.
    [BibTeX]
    @INPROCEEDINGS{7880512,
    author={T. A. D. Henderson and A. Podgurski},
    booktitle={2017 IEEE 11th International Workshop on Software Clones (IWSC)},
    title={Rethinking dependence clones},
    year={2017},
    volume={},
    number={},
    pages={1-7},
    }

  • C. V. Lopes, P. Maj, P. Martins, V. Saini, D. Yang, J. Zitny, H. Sajnani, and J. Vitek, “Déjàvu: a map of code duplicates on github,” Proc. acm program. lang., vol. 1, iss. OOPSLA, 2017. doi:10.1145/3133908
    [BibTeX] [PDF]
    @article{10.1145/3133908,
    author = {Lopes, Cristina V. and Maj, Petr and Martins, Pedro and Saini, Vaibhav and Yang, Di and Zitny, Jakub and Sajnani, Hitesh and Vitek, Jan},
    title = {D\'{e}J\`{a}Vu: A Map of Code Duplicates on GitHub},
    year = {2017},
    issue_date = {October 2017},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    volume = {1},
    number = {OOPSLA},
    url = {https://doi.org/10.1145/3133908},
    doi = {10.1145/3133908},
    journal = {Proc. ACM Program. Lang.},
    month = oct,
    articleno = {84},
    numpages = {28},
    keywords = {Clone Detection, Source Code Analysis}
    }

  • A. Sheneamer, S. Roy, and J. Kalita, “A detection framework for semantic code clones and obfuscated code,” Expert systems with applications, vol. 97, pp. 405-420, 2017. doi:10.1016/j.eswa.2017.12.040
    [BibTeX] [Abstract] [PDF]

    Code obfuscation is a staple tool in malware creation where code fragments are altered substantially to make them appear different from the original, while keeping the semantics unaffected. A majority of the obfuscated code detection methods use program structure as a signature for detection of unknown codes. They usually ignore the most important feature, which is the semantics of the code, to match two code fragments or programs for obfuscation. Obfuscated code detection is a special case of the semantic code clone detection task. We propose a detection framework for detecting both code obfuscation and clone using machine learning. We use features extracted from Java bytecode dependency graphs (BDG), program dependency graphs (PDG) and abstract syntax trees (AST). BDGs and PDGs are two representations of the semantics or meaning of a Java program. ASTs capture the structural aspects of a program. We use several publicly available code clone and obfuscated code datasets to validate the effectiveness of our framework. We use different assessment parameters to evaluate the detection quality of our proposed model. Experimental results are excellent when compared with contemporary obfuscated code and code clone detectors. Interestingly, we achieve 100% success in detecting obfuscated code based on recall, precision, and F1-Score. When we compare our method with other methods for all of obfuscations types, viz, contraction, expansion, loop transformation and renaming, our model appears to be the winner. In case of clone detection our model achieve very high detection accuracy in comparison to other similar detectors.

    @article{sheneamer_detection_2017,
    title = {A detection framework for semantic code clones and obfuscated code},
    volume = {97},
    url = {https://doi.org/10.1016/j.eswa.2017.12.040},
    doi = {10.1016/j.eswa.2017.12.040},
    abstract = {Code obfuscation is a staple tool in malware creation where code fragments are altered substantially to make them appear different from the original, while keeping the semantics unaffected. A majority of the obfuscated code detection methods use program structure as a signature for detection of unknown codes. They usually ignore the most important feature, which is the semantics of the code, to match two code fragments or programs for obfuscation. Obfuscated code detection is a special case of the semantic code clone detection task. We propose a detection framework for detecting both code obfuscation and clone using machine learning. We use features extracted from Java bytecode dependency graphs (BDG), program dependency graphs (PDG) and abstract syntax trees (AST). BDGs and PDGs are two representations of the semantics or meaning of a Java program. ASTs capture the structural aspects of a program. We use several publicly available code clone and obfuscated code datasets to validate the effectiveness of our framework. We use different assessment parameters to evaluate the detection quality of our proposed model. Experimental results are excellent when compared with contemporary obfuscated code and code clone detectors. Interestingly, we achieve 100\% success in detecting obfuscated code based on recall, precision, and F1-Score. When we compare our method with other methods for all of obfuscations types, viz, contraction, expansion, loop transformation and renaming, our model appears to be the winner. In case of clone detection our model achieve very high detection accuracy in comparison to other similar detectors.},
    journal = {Expert Systems With Applications},
    author = {Sheneamer, Abdullah and Roy, Swarup and Kalita, Jugal},
    year = {2017},
    keywords = {Machine learning, Bytecode dependency graph, Code obfuscation, Program dependency graph, Semantic code clones},
    pages = {405-420}
    }

  • A. Charpentier, J. Falleri, F. Morandat, E. Ben Hadj Yahia, and L. Réveillère, “Raters’ reliability in clone benchmarks construction,” Empirical softw. engg., vol. 22, iss. 1, p. 235–258, 2017. doi:10.1007/s10664-015-9419-z
    [BibTeX] [PDF]
    @article{10.1007/s10664-015-9419-z,
    author = {Charpentier, Alan and Falleri, Jean-R\'{e}my and Morandat, Flor\'{e}al and Ben Hadj Yahia, Elyas and R\'{e}veill\`{e}re, Laurent},
    title = {Raters’ Reliability in Clone Benchmarks Construction},
    year = {2017},
    issue_date = {February 2017},
    publisher = {Kluwer Academic Publishers},
    address = {USA},
    volume = {22},
    number = {1},
    issn = {1382-3256},
    url = {https://doi.org/10.1007/s10664-015-9419-z},
    doi = {10.1007/s10664-015-9419-z},
    journal = {Empirical Softw. Engg.},
    month = feb,
    pages = {235–258},
    numpages = {24},
    keywords = {Empirical study, Software metrics, Code clone, Duplication}
    }

  • M. O. Elish, “On the association between code cloning and fault-proneness: an empirical investigation,” in 2017 computing conference, 2017, pp. 928-935.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8252205,
    author={M. O. {Elish}},
    booktitle={2017 Computing Conference},
    title={On the association between code cloning and fault-proneness: An empirical investigation},
    year={2017},
    url = {https://ieeexplore.ieee.org/document/8252205},
    volume={},
    number={},
    pages={928-935},}

  • M. Farrell, R. Monahan, and J. Power, “Specification clones: an empirical study of the structure of event-b specifications,” in International conference on software engineering and formal methods, 2017, pp. 152-167. doi:10.1007/978-3-319-66197-1_10
    [BibTeX] [PDF]
    @inproceedings{inproceedings,
    author = {Farrell, Marie and Monahan, Rosemary and Power, James},
    year = {2017},
    url = {https://link.springer.com/chapter/10.1007/978-3-319-66197-1_10},
    month = {08},
    booktitle = {International Conference on Software Engineering and Formal Methods},
    pages = {152-167},
    title = {Specification Clones: An Empirical Study of the Structure of Event-B Specifications},
    isbn = {978-3-319-66196-4},
    doi = {10.1007/978-3-319-66197-1_10}
    }

  • P. Gautam and H. Saini, “Non-trivial software clone detection using program dependency graph,” International journal of open source software and processes, vol. 8, iss. 2, pp. 1-24, 2017. doi:10.4018/IJOSSP.2017040101
    [BibTeX] [Abstract] [PDF]

    Code clones are copied fragments that occur at different levels of abstraction and may have different origins in a software system. This article presents an approach which shows the significant parts of source code. Further, by using significant parts of a source code, a control flow graph can be generated. This control flow graph represents the statements of a code/program in the form of basic blocks or nodes and the edges represent the control flow between those basic blocks. A hybrid approach, named the Program Dependence Graph (PDG) is also presented in this article for the detection of non-trivial code clones. The program dependency graph approach consists of two approaches as a control dependency graph and a data dependency graph. The control dependency graph is generated by using a control flow graph. This article proposes an approach which can easily generate control flow graphs and by using control flow graph and reduced flowgraph approach, the trivial software clone, a similar textual structure, can be detected. The proposed approach is based on a tokenization concept.

    @article{gautam_non-trivial_2017,
    title = {Non-trivial software clone detection using program dependency graph},
    volume = {8},
    issn = {19423934},
    url = {https://dl.acm.org/doi/10.4018/IJOSSP.2017040101},
    doi = {10.4018/IJOSSP.2017040101},
    abstract = {Code clones are copied fragments that occur at different levels of abstraction and may have different origins in a software system. This article presents an approach which shows the significant parts of source code. Further, by using significant parts of a source code, a control flow graph can be generated. This control flow graph represents the statements of a code/program in the form of basic blocks or nodes and the edges represent the control flow between those basic blocks. A hybrid approach, named the Program Dependence Graph (PDG) is also presented in this article for the detection of non-trivial code clones. The program dependency graph approach consists of two approaches as a control dependency graph and a data dependency graph. The control dependency graph is generated by using a control flow graph. This article proposes an approach which can easily generate control flow graphs and by using control flow graph and reduced flowgraph approach, the trivial software clone, a similar textual structure, can be detected. The proposed approach is based on a tokenization concept.},
    number = {2},
    journal = {International Journal of Open Source Software and Processes},
    author = {Gautam, Pratiksha and Saini, Hemraj},
    month = apr,
    year = {2017},
    note = {Publisher: IGI Global},
    keywords = {Control Flow, Cyclomatic Complexity, Graph Software Clones, Program Dependency Graph},
    pages = {1-24}
    }

  • T. Görg, “Deriving categories of semantic clones from a coding contest,” in Softwaretechnik-trends, Berlin, 2017, pp. 58-59.
    [BibTeX] [PDF]
    @inproceedings{mci/Görg2017,
    author = {Görg, Torsten},
    title = {Deriving Categories of Semantic Clones from a Coding Contest},
    booktitle = {Softwaretechnik-Trends},
    year = {2017},
    volume = {37},
    number = {2},
    url = {https://dl.gi.de/handle/20.500.12116/4695},
    pages = {58-59},
    publisher = {Gesellschaft für Informatik e.V., Fachgruppe PARS},
    address = {Berlin}
    }

  • T. Hatano and A. Matsuo, “Removing code clones from industrial systems using compiler directives,” in 2017 ieee/acm 25th international conference on program comprehension (icpc), 2017, pp. 336-345.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7961534,
    author = {Hatano, Tomomi and Matsuo, Akihiko},
    booktitle={2017 IEEE/ACM 25th International Conference on Program Comprehension (ICPC)},
    title={Removing Code Clones from Industrial Systems Using Compiler Directives},
    year={2017},
    url = {https://ieeexplore.ieee.org/abstract/document/7961534/},
    volume={},
    number={},
    pages={336-345},
    }

  • Y. Yuki, Y. Higo, and S. Kusumoto, “A technique to detect multi-grained code clones,” in 2017 ieee 11th international workshop on software clones (iwsc), 2017, pp. 1-7.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7880510,
    author={Y. {Yuki} and Y. {Higo} and S. {Kusumoto}},
    booktitle={2017 IEEE 11th International Workshop on Software Clones (IWSC)},
    title={A technique to detect multi-grained code clones},
    year={2017},
    url = {https://ieeexplore.ieee.org/document/7880510},
    volume={},
    number={},
    pages={1-7},}

  • J. F. Islam, M. Mondal, C. K. Roy, and K. A. Schneider, “Comparing Software Bugs in Clone and Non-clone Code: An Empirical Study,” International journal of software engineering and knowledge engineering, vol. 27, iss. 9-10, pp. 1507-1527, 2017. doi:10.1142/S0218194017400083
    [BibTeX] [Abstract] [PDF]

    Code cloning is a recurrent operation in everyday software development. Whether it is a good or bad practice is an ongoing debate among researchers and developers for the last few decades. In this paper, we conduct a comparative study on bug-proneness in clone code and non-clone code by analyzing commit logs. According to our inspection of thousands of revisions of seven diverse subject systems, the percentage of changed files due to bug-fix commits is significantly higher in clone code compared with non-clone code. We perform a Mann-Whitney-Wilcoxon (MWW) test to show the statistical significance of our findings. In addition, the possibility of occurrence of severe bugs is higher in clone code than in non-clone code. Bug-fixing changes affecting clone code should be considered more carefully. Finally, our manual investigation shows that clone code containing if-condition and if-else blocks has a high risk of having severe bugs. Changes to such types of clone fragments should be done carefully during software maintenance. According to our findings, clone code appears to be more bug-prone than non-clone code.

    @article{islam_comparing_2017,
    title = {Comparing {Software} {Bugs} in {Clone} and {Non}-clone {Code}: {An} {Empirical} {Study}},
    volume = {27},
    url = {https://www.worldscientific.com/doi/abs/10.1142/S0218194017400083},
    doi = {10.1142/S0218194017400083},
    abstract = {Code cloning is a recurrent operation in everyday software development. Whether it is a good or bad practice is an ongoing debate among researchers and developers for the last few decades. In this paper, we conduct a comparative study on bug-proneness in clone code and non-clone code by analyzing commit logs. According to our inspection of thousands of revisions of seven diverse subject systems, the percentage of changed files due to bug-fix commits is significantly higher in clone code compared with non-clone code. We perform a Mann-Whitney-Wilcoxon (MWW) test to show the statistical significance of our findings. In addition, the possibility of occurrence of severe bugs is higher in clone code than in non-clone code. Bug-fixing changes affecting clone code should be considered more carefully. Finally, our manual investigation shows that clone code containing if-condition and if-else blocks has a high risk of having severe bugs. Changes to such types of clone fragments should be done carefully during software maintenance. According to our findings, clone code appears to be more bug-prone than non-clone code.},
    number = {9-10},
    journal = {International Journal of Software Engineering and Knowledge Engineering},
    author = {Islam, Judith F and Mondal, Manishankar and Roy, Chanchal K and Schneider, Kevin A},
    month = dec,
    year = {2017},
    note = {Publisher: World Scientific Publishing Co. Pte Ltd},
    keywords = {Code clones, severe bugs, software bugs},
    pages = {1507-1527}
    }

  • M. R. Islam, M. F. Zibran, and A. Nagpal, “Security vulnerabilities in categories of clones and non-cloned code: an empirical study,” in 2017 acm/ieee international symposium on empirical software engineering and measurement (esem), 2017, pp. 20-29.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8169981,
    author={M. R. {Islam} and M. F. {Zibran} and A. {Nagpal}},
    booktitle={2017 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)},
    title={Security Vulnerabilities in Categories of Clones and Non-Cloned Code: An Empirical Study},
    year={2017},
    url = {https://ieeexplore.ieee.org/abstract/document/8169981/},
    volume={},
    number={},
    pages={20-29},
    }

  • M. C. Júnior, M. Farias, J. Jorge, B. Torres, M. C. R. Junior, M. André, and F. Farias, “Procedural x oo – a corporative experiment on source code clone mining,” 19th international conference on enterprise information systems, pp. 395-402, 2017. doi:10.5220/0006325003950402
    [BibTeX] [Abstract] [PDF]

    Open Source Software (OSS) repositories are widely used to execute studies around code clone detection, mostly inside the public scenario. However, corporative code Repositories have their content restricted and protected from access by developers who are not part of the company. Besides, there are a lot of questions regarding paradigm efficiency and its relation to clone manifestation. This article presents an experiment performed on systems developed in a large private education company, to observe and compare the incidence of cloned code between Object Oriented and Procedural proprietary software, using an exact similarity threshold. The results indicate that Object Oriented Software wondrously showed higher cloned lines of code incidence and a similar use of abstraction (clone sets) for functions or methods.

    @article{junior_procedural_2017,
    title = {Procedural x OO - A Corporative Experiment on Source Code Clone Mining},
    url = {https://www.researchgate.net/publication/317115281},
    doi = {10.5220/0006325003950402},
    abstract = {Open Source Software (OSS) repositories are widely used to execute studies around code clone detection, mostly inside the public scenario. However, corporative code Repositories have their content restricted and protected from access by developers who are not part of the company. Besides, there are a lot of questions regarding paradigm efficiency and its relation to clone manifestation. This article presents an experiment performed on systems developed in a large private education company, to observe and compare the incidence of cloned code between Object Oriented and Procedural proprietary software, using an exact similarity threshold. The results indicate that Object Oriented Software wondrously showed higher cloned lines of code incidence and a similar use of abstraction (clone sets) for functions or methods.},
    journal = {19th International Conference on Enterprise Information Systems},
    author = {Júnior, Methanias Colaço and Farias, Mário and Jorge, José and Torres, Barreto and Junior, Methanias C R and André, Mário and Farias, Freitas},
    year = {2017},
    pages = {395-402},
    note = {ISBN: 9789897582486},
    keywords = {Mining Software Repositories, Clones, Software, Closed-source Projects, Experimental Software Engineering}
    }

  • H. Kaur and R. Maini, “Performance evaluation and comparative analysis of code-clone-detection techniques and tools,” International journal of software engineering and its applications, vol. 11, iss. 3, pp. 31-50, 2017. doi:10.14257/ijseia.2017.11.3.04
    [BibTeX] [Abstract] [PDF]

    Since Code Cloning is the recent area of research in software engineering, it is crucial to have good understanding of all the code-clone-detection techniques. Clones in software development increases maintenance cost and it leads to poor software quality. This paper is basically combination of two issues: literature review of code clone detection techniques and experimental work for the evaluation of chosen techniques from literature. This paper firstly list out the various studies and then evaluates the performance of three chosen techniques (Text-based, Token-based and Tree-based) by means of automated tools. Netbeans-Javadoc, JBoss and Java-Quizz source codes has been examined to validate results. From the analysis it has been observed that token based approach reports more false positives as compared to other techniques. Text based and token based approaches have precision values greater than tree based approach, but tree based approach has higher recall values. Token based, Tree based and metric based approaches are useful in combination with refactoring tools. It has been observed that in terms of speed, text-based approach is suitable to small size projects, but token based technique is scalable to large size projects also. Tree-based and token based techniques work effectively to detect near-miss clones and give more safe and sound result. DuDe, ccFinder, solid-SDD and cloneDr tools have been used for validation. From the experimental work it has been observed that Dude tool is suitable for small projects, but ccFinder is scalable from small to large projects. False positives are reported by ccFinder because of its token based approach, but cloneDr leads to minimum false positives as compared to ccFinder. The aim of the paper is to find the strengths and weaknesses of these techniques which will be helpful to select a clone detection technique for a particular purpose.

    @article{kaur_performance_2017,
    title = {Performance Evaluation and Comparative Analysis of Code-Clone-Detection Techniques and Tools},
    volume = {11},
    issn = {1738-9984},
    url = {https://www.researchgate.net/publication/316531860_Performance_Evaluation_and_Comparative_Analysis_of_Code-Clone-Detection_Techniques_and_Tools},
    doi = {10.14257/ijseia.2017.11.3.04},
    abstract = {Since Code Cloning is the recent area of research in software engineering, it is crucial to have good understanding of all the code-clone-detection techniques. Clones in software development increases maintenance cost and it leads to poor software quality. This paper is basically combination of two issues: literature review of code clone detection techniques and experimental work for the evaluation of chosen techniques from literature. This paper firstly list out the various studies and then evaluates the performance of three chosen techniques (Text-based, Token-based and Tree-based) by means of automated tools. Netbeans-Javadoc, JBoss and Java-Quizz source codes has been examined to validate results. From the analysis it has been observed that token based approach reports more false positives as compared to other techniques. Text based and token based approaches have precision values greater than tree based approach, but tree based approach has higher recall values. Token based, Tree based and metric based approaches are useful in combination with refactoring tools. It has been observed that in terms of speed, text-based approach is suitable to small size projects, but token based technique is scalable to large size projects also. Tree-based and token based techniques work effectively to detect near-miss clones and give more safe and sound result. DuDe, ccFinder, solid-SDD and cloneDr tools have been used for validation. From the experimental work it has been observed that Dude tool is suitable for small projects, but ccFinder is scalable from small to large projects. False positives are reported by ccFinder because of its token based approach, but cloneDr leads to minimum false positives as compared to ccFinder. The aim of the paper is to find the strengths and weaknesses of these techniques which will be helpful to select a clone detection technique for a particular purpose.},
    number = {3},
    journal = {International Journal of Software Engineering and Its Applications},
    author = {Kaur, Harpreet and Maini, Raman},
    year = {2017},
    keywords = {Code Clone, Software Maintenance, Clone-class, Code fragment},
    pages = {31-50}
    }

  • J. Krüger, L. Nell, W. Fenske, G. Saake, and T. Leich, “Finding lost features in cloned systems,” in Proceedings of the 21st international systems and software product line conference – volume b, New York, NY, USA, 2017, pp. 65–72. doi:10.1145/3109729.3109736
    [BibTeX] [PDF]
    @inproceedings{10.1145/3109729.3109736,
    author = {Kr\"{u}ger, Jacob and Nell, Louis and Fenske, Wolfram and Saake, Gunter and Leich, Thomas},
    title = {Finding Lost Features in Cloned Systems},
    year = {2017},
    isbn = {9781450351195},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3109729.3109736},
    doi = {10.1145/3109729.3109736},
    booktitle = {Proceedings of the 21st International Systems and Software Product Line Conference - Volume B},
    pages = {65–72},
    numpages = {8},
    keywords = {code clone detection, extractive approach, feature location, legacy system, reverse engineering, Software product line},
    location = {Sevilla, Spain},
    series = {SPLC ’17}
    }

  • L. Li, H. Feng, W. Zhuang, N. Meng, and B. Ryder, “Cclearner: a deep learning-based clone detection approach,” in 2017 ieee international conference on software maintenance and evolution (icsme), 2017, pp. 249-260.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8094426,
    author={L. {Li} and H. {Feng} and W. {Zhuang} and N. {Meng} and B. {Ryder}},
    booktitle={2017 IEEE International Conference on Software Maintenance and Evolution (ICSME)},
    title={CCLearner: A Deep Learning-Based Clone Detection Approach},
    url = {https://ieeexplore.ieee.org/abstract/document/8094426/},
    year={2017},
    volume={},
    number={},
    pages={249-260},}

  • T. Matsushita and I. Sasano, “Detecting code clones with gaps by function applications,” in Proceedings of the 2017 acm sigplan workshop on partial evaluation and program manipulation, New York, NY, USA, 2017, pp. 12–22. doi:10.1145/3018882.3018892
    [BibTeX] [PDF]
    @inproceedings{10.1145/3018882.3018892,
    author = {Matsushita, Tsubasa and Sasano, Isao},
    title = {Detecting Code Clones with Gaps by Function Applications},
    year = {2017},
    isbn = {9781450347211},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3018882.3018892},
    doi = {10.1145/3018882.3018892},
    booktitle = {Proceedings of the 2017 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation},
    pages = {12–22},
    numpages = {11},
    keywords = {abstract syntax tree, function application, code clone, gap},
    location = {Paris, France},
    series = {PEPM 2017}
    }

  • M. Mondal, C. K. Roy, and K. A. Schneider, “Identifying code clones having high possibilities of containing bugs,” in 2017 ieee/acm 25th international conference on program comprehension (icpc), 2017, pp. 99-109.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7961508,
    author={M. {Mondal} and C. K. {Roy} and K. A. {Schneider}},
    booktitle={2017 IEEE/ACM 25th International Conference on Program Comprehension (ICPC)},
    title={Identifying Code Clones Having High Possibilities of Containing Bugs},
    year={2017},
    url = {https://ieeexplore.ieee.org/abstract/document/7961508/},
    volume={},
    number={},
    pages={99-109},}

  • Vishwachi and S. Gupta, “Detection of near-miss clones using metrics and abstract syntax trees,” in 2017 international conference on inventive communication and computational technologies (icicct), 2017, pp. 230-234.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7975193,
    author={ {Vishwachi} and S. {Gupta}},
    booktitle={2017 International Conference on Inventive Communication and Computational Technologies (ICICCT)},
    title={Detection of near-miss clones using metrics and Abstract Syntax Trees},
    year={2017},
    url = {https://ieeexplore.ieee.org/abstract/document/7975193/},
    volume={},
    number={},
    pages={230-234},}

  • J. Pati, B. Kumar, D. Manjhi, and K. K. Shukla, “Machine Learning Strategies for Temporal Analysis of Software Clone Evolution using Software Metrics,” International journal of applied engineering research, vol. 12, iss. 11, pp. 2798-2806, 2017.
    [BibTeX] [Abstract] [PDF]

    During software evolution, there is a tendency to duplicate the code, and modify the copy slightly, giving rise to clones. Cloned code fragments adversely affect software quality and maintenance. In this paper, we discuss identification of different types of clone components using Abstract Syntax Tree based approach and also propose models for prediction of the evolution of cloned components in future versions of the software. The primary focus of the paper is modelling of the evolution of clones in a software application. Detection of clones in a large software system is challenging as it depends on the internal design of software modules and methods. Object-oriented metrics like DIT, NOC, WMC, LCOM, and Cyclomatic complexity can be used as good indicators of clone contents. We demonstrate a correlation between clones and various metrics of the source. The first part of our study is to identify the cloned components using Abstract Syntax Tree. The second part is to predict the evolution of cloned components using advanced time series modelling using machine learning approaches. Evaluation of our model is performed using a large open source software system. The assessment includes quantifying the correlation between software metrics and the clone contents in the software.

    @article{pati_machine_2017,
    title = {Machine {Learning} {Strategies} for {Temporal} {Analysis} of {Software} {Clone} {Evolution} using {Software} {Metrics}},
    url = {https://www.researchgate.net/publication/322482317_Machine_learning_strategies_for_temporal_analysis_of_software_clone_evolution_using_software_metrics},
    abstract = {During software evolution, there is a tendency to duplicate the code, and modify the copy slightly, giving rise to clones. Cloned code fragments adversely affect software quality and maintenance. In this paper, we discuss identification of different types of clone components using Abstract Syntax Tree based approach and also propose models for prediction of the evolution of cloned components in future versions of the software. The primary focus of the paper is modelling of the evolution of clones in a software application. Detection of clones in a large software system is challenging as it depends on the internal design of software modules and methods. Object-oriented metrics like DIT, NOC, WMC, LCOM, and Cyclomatic complexity can be used as good indicators of clone contents. We demonstrate a correlation between clones and various metrics of the source. The first part of our study is to identify the cloned components using Abstract Syntax Tree. The second part is to predict the evolution of cloned components using advanced time series modelling using machine learning approaches. Evaluation of our model is performed using a large open source software system. The assessment includes quantifying the correlation between software metrics and the clone contents in the software.},
    number = {11},
    author = {Pati, Jayadeep and Kumar, Babloo and Manjhi, Devesh and Shukla, K K},
    year = {2017},
    volume = {12},
    journal = {International Journal of Applied Engineering Research},
    keywords = {Machine Learning, Software Maintenance, Abstract Syntax Tree, Exact Match Clones, Near-Miss Clones, Software Metrics, Software Clones, Time Series Analysis},
    pages = {2798-2806}
    }

  • M. S. Rahman and C. K. Roy, “On the relationships between stability and bug-proneness of code clones: an empirical study,” in 2017 ieee 17th international working conference on source code analysis and manipulation (scam), 2017, pp. 131-140.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8090146,
    author={M. S. {Rahman} and C. K. {Roy}},
    booktitle={2017 IEEE 17th International Working Conference on Source Code Analysis and Manipulation (SCAM)},
    title={On the Relationships Between Stability and Bug-Proneness of Code Clones: An Empirical Study},
    year={2017},
    url = {https://ieeexplore.ieee.org/abstract/document/8090146/},
    volume={},
    number={},
    pages={131-140},}

  • M. R. H. Misu and K. Sakib, “Interface driven code clone detection,” in 2017 24th asia-pacific software engineering conference (apsec), 2017, pp. 747-748.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8306014,
    author={M. R. H. {Misu} and K. {Sakib}},
    booktitle={2017 24th Asia-Pacific Software Engineering Conference (APSEC)},
    title={Interface Driven Code Clone Detection},
    url = {https://ieeexplore.ieee.org/document/8306014},
    year={2017},
    volume={},
    number={},
    pages={747-748},}

  • G. Robles, J. Moreno-León, E. Aivaloglou, and F. Hermans, “Software clones in scratch projects: on the presence of copy-and-paste in computational thinking learning,” in 2017 ieee 11th international workshop on software clones (iwsc), 2017, pp. 1-7.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7880506,
    author={G. {Robles} and J. {Moreno-León} and E. {Aivaloglou} and F. {Hermans}},
    booktitle={2017 IEEE 11th International Workshop on Software Clones (IWSC)},
    title={Software clones in scratch projects: on the presence of copy-and-paste in computational thinking learning},
    year={2017},
    url = {https://ieeexplore.ieee.org/document/7880506},
    volume={},
    number={},
    pages={1-7},}

  • M. R. H. Misu, A. Satter, and K. Sakib, “An exploratory study on interface similarities in code clones,” in 2017 24th asia-pacific software engineering conference workshops (apsecw), 2017, pp. 126-133.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8312535,
    author={M. R. H. {Misu} and A. {Satter} and K. {Sakib}},
    booktitle={2017 24th Asia-Pacific Software Engineering Conference Workshops (APSECW)},
    title={An Exploratory Study on Interface Similarities in Code Clones},
    year={2017},
    url = {https://ieeexplore.ieee.org/document/8312535},
    volume={},
    number={},
    pages={126-133},
    }

  • Y. Semura, N. Yoshida, E. Choi, and K. Inoue, “Ccfindersw: clone detection tool with flexible multilingual tokenization,” in 2017 24th asia-pacific software engineering conference (apsec), 2017, pp. 654-659.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8305997,
    author={Y. {Semura} and N. {Yoshida} and E. {Choi} and K. {Inoue}},
    booktitle={2017 24th Asia-Pacific Software Engineering Conference (APSEC)},
    title={CCFinderSW: Clone Detection Tool with Flexible Multilingual Tokenization},
    year={2017},
    url = {https://ieeexplore.ieee.org/abstract/document/8305997/},
    volume={},
    number={},
    pages={654-659},
    }

  • S. Sharad Patil, S. Santosh Chaudhari, A. Mukunda Sonawane, S. Siddharth Salunke, and M. Ramakant Bhole, “Code Clone Detection Using Hybrid Approach,” International journal of innovative research and creative technology (ijirct), vol. 201, pp. 201-204, 2017.
    [BibTeX] [Abstract] [PDF]

    Many researchers have looked over distinct techniques to detect duplicate code in programs exceeding thousand lines of code. These techniques have drawback of finding either the structural or functional clones. Code clones are the duplicated code that degrade the software quality and hence increase maintenance value. Detection of code clone in software system is extremely necessary to improve design structure and quality of software product. The proposed lightweight hybrid approach uses textual comparison and template conversion for detection of method level syntactical and semantic clones in C file and functional clones in C and Java file.

    @article{sharad_patil_code_nodate,
    title = {Code {Clone} {Detection} {Using} {Hybrid} {Approach}},
    volume = {201},
    issn = {2454-5988},
    url = {http://www.ijirct.org/viewPaper.php?paperId=IJIRCT1601037},
    abstract = {Many researchers have looked over distinct techniques to detect duplicate code in programs exceeding thousand lines of code. These techniques have drawback of finding either the structural or functional clones. Code clones are the duplicated code that degrade the software quality and hence increase maintenance value. Detection of code clone in software system is extremely necessary to improve design structure and quality of software product. The proposed lightweight hybrid approach uses textual comparison and template conversion for detection of method level syntactical and semantic clones in C file and functional clones in C and Java file.},
    journal = {International Journal of Innovative Research and Creative Technology (IJIRCT)},
    author = {Sharad Patil, Sayali and Santosh Chaudhari, Sachin and Mukunda Sonawane, Ashwini and Siddharth Salunke, Sonal and Ramakant Bhole, Makarand},
    year = {2017},
    pages = {201-204},
    keywords = {Clone detection, Functional clones, Textual analysis}
    }

  • A. Sheneamer, H. Hazazi, S. Roy, and J. Kalita, “Schemes for labeling semantic code clones using machine learning,” in 2017 16th ieee international conference on machine learning and applications (icmla), 2017, pp. 981-985.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8260767,
    author={A. {Sheneamer} and H. {Hazazi} and S. {Roy} and J. {Kalita}},
    booktitle={2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)},
    title={Schemes for Labeling Semantic Code Clones using Machine Learning},
    year={2017},
    url = {https://ieeexplore.ieee.org/abstract/document/8260767/},
    volume={},
    number={},
    pages={981-985},
    }

  • M. Suzuki, A. Carvalho de Paula, E. Guerra, C. V. Lopes, and O. A. Lazzarini Lemos, “An exploratory study of functional redundancy in code repositories,” in 2017 ieee 17th international working conference on source code analysis and manipulation (scam), 2017, pp. 31-40.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8090136,
    author={M. {Suzuki} and A. {Carvalho de Paula} and E. {Guerra} and C. V. {Lopes} and O. A. {Lazzarini Lemos}},
    booktitle={2017 IEEE 17th International Working Conference on Source Code Analysis and Manipulation (SCAM)},
    title={An Exploratory Study of Functional Redundancy in Code Repositories},
    year={2017},
    url = {https://ieeexplore.ieee.org/document/8090136},
    volume={},
    number={},
    pages={31-40},
    }

  • S. Thompson, H. Li, and A. Schumacher, “The pragmatics of clone detection and elimination,” The art, science, and engineering of programming, vol. 1, 2017. doi:10.22152/programming-journal.org/2017/1/8
    [BibTeX] [PDF]
    @article{article,
    author = {Thompson, Simon and Li, Huiqing and Schumacher, Andreas},
    year = {2017},
    month = mar,
    title = {The pragmatics of clone detection and elimination},
    volume = {1},
    url = {https://www.researchgate.net/publication/315748526_The_pragmatics_of_clone_detection_and_elimination},
    journal = {The Art, Science, and Engineering of Programming},
    doi = {10.22152/programming-journal.org/2017/1/8}
    }

  • N. Tsantalis, D. Mazinanian, and S. Rostami, “Clone refactoring with lambda expressions,” in 2017 ieee/acm 39th international conference on software engineering (icse), 2017, pp. 60-70.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7985650,
    author={N. {Tsantalis} and D. {Mazinanian} and S. {Rostami}},
    booktitle={2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)},
    title={Clone Refactoring with Lambda Expressions},
    url = {https://ieeexplore.ieee.org/abstract/document/7985650/},
    year={2017},
    volume={},
    number={},
    pages={60-70},}

  • K. Uemura, A. Mori, K. Fujiwara, E. Choi, and H. Iida, “Detecting and analyzing code clones in hdl,” in 2017 ieee 11th international workshop on software clones (iwsc), 2017, pp. 1-7.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7880501,
    author={K. {Uemura} and A. {Mori} and K. {Fujiwara} and E. {Choi} and H. {Iida}},
    booktitle={2017 IEEE 11th International Workshop on Software Clones (IWSC)},
    title={Detecting and analyzing code clones in HDL},
    url = {https://ieeexplore.ieee.org/abstract/document/7880501/},
    year={2017},
    volume={},
    number={},
    pages={1-7},}

  • M. Wang, P. Wang, and Y. Xu, “Ccsharp: an efficient three-phase code clone detector using modified pdgs,” in 2017 24th asia-pacific software engineering conference (apsec), 2017, pp. 100-109.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8305932,
    author={M. {Wang} and P. {Wang} and Y. {Xu}},
    booktitle={2017 24th Asia-Pacific Software Engineering Conference (APSEC)},
    title={CCSharp: An Efficient Three-Phase Code Clone Detector Using Modified PDGs},
    year={2017},
    url = {https://ieeexplore.ieee.org/abstract/document/8305932/},
    volume={},
    number={},
    pages={100-109},
    }

  • H. Wei and M. Li, “Positive and Unlabeled Learning for Detecting Software Functional Clones with Adversarial Training,” 2017.
    [BibTeX] [Abstract] [PDF]

    Software clone detection is an important problem for software maintenance and evolution and it has attracted lots of attentions. However, existing approaches ignore a fact that people would label the pairs of code fragments as clone only if they happen to discover the clones while a huge number of undiscovered clone pairs and non-clone pairs are left unlabeled. In this paper, we argue that the clone detection task in the real-world should be formalized as a Positive-Unlabeled (PU) learning problem, and address this problem by proposing a novel positive and unlabeled learning approach, namely CDPU, to effectively detect software functional clones, i.e., pieces of codes with similar functionality but differing in both syntactical and lexical level, where adversarial training is employed to improve the robustness of the learned model to those non-clone pairs that look extremely similar but behave differently. Experiments on software clone detection benchmarks indicate that the proposed approach together with adversarial training outperforms the state-of-the-art approaches for software functional clone detection.

    @techreport{wei_positive_2017,
    title = {Positive and {Unlabeled} {Learning} for {Detecting} {Software} {Functional} {Clones} with {Adversarial} {Training}},
    url = {https://pdfs.semanticscholar.org/ad3a/07425891ffa7d6cc2a1edbc2185874925048.pdf},
    abstract = {Software clone detection is an important problem for software maintenance and evolution and it has attracted lots of attentions. However, existing approaches ignore a fact that people would label the pairs of code fragments as clone only if they happen to discover the clones while a huge number of undiscovered clone pairs and non-clone pairs are left unlabeled. In this paper, we argue that the clone detection task in the real-world should be formalized as a Positive-Unlabeled (PU) learning problem, and address this problem by proposing a novel positive and unlabeled learning approach, namely CDPU, to effectively detect software functional clones, i.e., pieces of codes with similar functionality but differing in both syntactical and lexical level, where adversarial training is employed to improve the robustness of the learned model to those non-clone pairs that look extremely similar but behave differently. Experiments on software clone detection benchmarks indicate that the proposed approach together with adversarial training outperforms the state-of-the-art approaches for software functional clone detection.},
    author = {Wei, Hui-Hui and Li, Ming},
    journal = {Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI)},
    year = {2017},
    pages = {2840-2846},
    keywords = {Multidisciplinary Topics and Applications: Knowled, Machine Learning: Semi-Supervised Learning}
    }

  • C. Wijesiriwardana and P. Wimalaratne, “Component-based experimental testbed to facilitate code clone detection research,” 2017 8th ieee international conference on software engineering and service, pp. 165-168, 2017. doi:10.1109/ICSESS.2017.8342888
    [BibTeX] [Abstract] [PDF]

    Over the past few decades clone detection has become a major area of study among software researchers and software practitioners. Clone detection experiments present a number of challenges such as accurate data collection, data cleaning and selection of proper code detection algorithms. This urges the need for a systematic and unambiguous approach to conduct clone detection experiments. As a solution, this paper presents an experimental testbed, which consists of a collection of “clone detection components (CDCs)”. CDCs are concrete representations of tasks associated with clone detection experiments such as data extraction, pre-processing and detection of clones. These CDCs could be used in isolation to represent a simple task or could be composed to represent a complex task. The usefulness of the experimental testbed is evaluated with an important clone detection experiment for three open source projects.

    @article{wijesiriwardana_component-based_nodate,
    title = {Component-based experimental testbed to facilitate code clone detection research},
    url = {https://www.researchgate.net/publication/324725510},
    doi = {10.1109/ICSESS.2017.8342888},
    abstract = {Over the past few decades clone detection has become a major area of study among software researchers and software practitioners. Clone detection experiments present a number of challenges such as accurate data collection, data cleaning and selection of proper code detection algorithms. This urges the need for a systematic and unambiguous approach to conduct clone detection experiments. As a solution, this paper presents an experimental testbed, which consists of a collection of "clone detection components (CDCs)". CDCs are concrete representations of tasks associated with clone detection experiments such as data extraction, pre-processing and detection of clones. These CDCs could be used in isolation to represent a simple task or could be composed to represent a complex task. The usefulness of the experimental testbed is evaluated with an important clone detection experiment for three open source projects.},
    journal = {2017 8th IEEE International Conference on Software Engineering and Service},
    year = {2017},
    pages = {165-168},
    author = {Wijesiriwardana, Chaman and Wimalaratne, Prasad},
    keywords = {clone detection component, component composition, Index Terms-code clone detection}
    }

  • D. Yu, J. Wang, Q. Wu, J. Yang, J. Wang, W. Yang, and W. Yan, “Detecting java code clones with multi-granularities based on bytecode,” in 2017 ieee 41st annual computer software and applications conference (compsac), 2017, pp. 317-326.
    [BibTeX] [PDF]
    @INPROCEEDINGS{8029624,
    author={D. {Yu} and J. {Wang} and Q. {Wu} and J. {Yang} and J. {Wang} and W. {Yang} and W. {Yan}},
    booktitle={2017 IEEE 41st Annual Computer Software and Applications Conference (COMPSAC)},
    title={Detecting Java Code Clones with Multi-granularities Based on Bytecode},
    year={2017},
    url = {https://ieeexplore.ieee.org/abstract/document/8029624/},
    volume={1},
    number={},
    pages={317-326},}

  • T. Zhang and M. Kim, “Automated transplantation and differential testing for clones,” in 2017 ieee/acm 39th international conference on software engineering (icse), 2017, pp. 665-676.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7985703,
    author={T. {Zhang} and M. {Kim}},
    booktitle={2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)},
    title={Automated Transplantation and Differential Testing for Clones},
    url = {https://ieeexplore.ieee.org/abstract/document/7985703/},
    year={2017},
    volume={},
    number={},
    pages={665-676},
    }

2016

  • F. Al-omari and C. K. Roy, “Is code cloning in games really different?,” in Proceedings of the 31st annual acm symposium on applied computing, New York, NY, USA, 2016, pp. 1512-1519. doi:10.1145/2851613.2851792
    [BibTeX] [PDF]
    @inproceedings{10.1145/2851613.2851792,
    author = {Al-omari, Farouq and Roy, Chanchal K.},
    title = {Is Code Cloning in Games Really Different?},
    year = {2016},
    isbn = {9781450337397},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2851613.2851792},
    doi = {10.1145/2851613.2851792},
    booktitle = {Proceedings of the 31st Annual ACM Symposium on Applied Computing},
    pages = {1512-1519},
    numpages = {8},
    keywords = {software clones, game clones, open source games},
    location = {Pisa, Italy},
    series = {SAC ’16}
    }

  • S. Kaur, G. Singh, and B. Sohal, “Language Independent Code Clone Detection Approach Using JSON String Parsing,” International journal of control theory and applications, vol. 9, iss. 41, pp. 565-573, 2016.
    [BibTeX] [Abstract] [PDF]

    Cloning is an easy way of reusing the software. Some of the studies revealed that about 25-30 per cent of code in long term software development project may have been cloned. Though there are some benefits of software cloning but still software cloning is considered as a harmful practice. If there is an error in one code that is to be cloned then that error can be transferred to all the modules where cloning has been done. Cloning also increases the maintenance cost. So, the clone detection becomes an important part for project success. In this paper, a new language independent approach is proposed for code clone detection which will work on JSON (Java Script Object Notation) string parsing. This approach will trace code clones for nearly all the languages.

    @article{kaur_language_2016,
    title = {Language {Independent} {Code} {Clone} {Detection} {Approach} {Using} {JSON} {String} {Parsing}},
    volume = {9},
    issn = {0974-5572},
    url = {https://www.researchgate.net/publication/315698515},
    abstract = {Cloning is an easy way of reusing the software. Some of the studies revealed that about 25-30 per cent of code in long term software development project may have been cloned. Though there are some benefits of software cloning but still software cloning is considered as a harmful practice. If there is an error in one code that is to be cloned then that error can be transferred to all the modules where cloning has been done. Cloning also increases the maintenance cost. So, the clone detection becomes an important part for project success. In this paper, a new language independent approach is proposed for code clone detection which will work on JSON (Java Script Object Notation) string parsing. This approach will trace code clones for nearly all the languages.},
    number = {41},
    journal = {International Journal of Control Theory and Applications},
    author = {Kaur, Sandeep and Singh, Gaurav and Sohal, Bhavneesh},
    year = {2016},
    keywords = {Clone Detection, Reuse, Plagiarism, JSON, Language Independent, Parsing},
    pages = {565-573}
    }

  • D. E. Krutz and M. Mirakhorli, “Architectural clones: toward tactical code reuse,” in Proceedings of the 31st annual acm symposium on applied computing, New York, NY, USA, 2016, p. 1480–1485. doi:10.1145/2851613.2851787
    [BibTeX] [PDF]
    @inproceedings{10.1145/2851613.2851787,
    author = {Krutz, Daniel E. and Mirakhorli, Mehdi},
    title = {Architectural Clones: Toward Tactical Code Reuse},
    year = {2016},
    isbn = {9781450337397},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2851613.2851787},
    doi = {10.1145/2851613.2851787},
    booktitle = {Proceedings of the 31st Annual ACM Symposium on Applied Computing},
    pages = {1480–1485},
    numpages = {6},
    keywords = {software architecture, tactical code clone, code reuse},
    location = {Pisa, Italy},
    series = {SAC ’16}
    }

  • W. Liu and H. Liu, “Major motivations for extract method refactorings: analysis based on interviews and change histories,” Front. comput. sci, vol. 10, iss. 4, pp. 644-656, 2016. doi:10.1007/s11704-016-5131-4
    [BibTeX] [Abstract] [PDF]

    Extract method is one of the most popular software refactorings. However, little work has been done to investigate or validate the major motivations for such refactorings. Digging into this issue might help researchers to improve tool support for extract method refactorings, e.g., proposing better tools to recommend refactoring opportunities, and to select fragments to be extracted. To this end, we conducted an interview with 25 developers, and our results suggest that current reuse, decomposition of long methods, clone resolution, and future reuse are the major motivations for extract method refactorings. We also validated the results by analyzing the refactoring history of seven open-source applications. Analysis results suggest that current reuse was the primary motivation for 56% of extract method refactorings, decomposition of methods was the primary motivation for 28% of extract method refactorings, and clone resolution was the primary motivation for 16% of extract method refactorings. These findings might suggest that recommending extract method opportunities by analyzing only the inner structure (e.g., complexity and length) of methods alone would miss many extract method opportunities. These findings also suggest that extract method refactorings are often driven by current and immediate reuse. Consequently, how to recognize or predict reuse requirements timely during software evolution may play a key role in the recommendation and automation of extract method refactorings. We also investigated the likelihood for the extracted methods to be reused in future, and our results suggest that such methods have a small chance (12%) to be reused in future unless the extracted fragment could be reused immediately in software evolution and extracting such a fragment can resolve existing clones at the same time.

    @article{liu_major_2016,
    title = {Major motivations for extract method refactorings: analysis based on interviews and change histories},
    volume = {10},
    url = {https://link.springer.com/article/10.1007/s11704-016-5131-4},
    doi = {10.1007/s11704-016-5131-4},
    abstract = {Extract method is one of the most popular software refactorings. However, little work has been done to investigate or validate the major motivations for such refactorings. Digging into this issue might help researchers to improve tool support for extract method refactorings, e.g., proposing better tools to recommend refactoring opportunities, and to select fragments to be extracted. To this end, we conducted an interview with 25 developers, and our results suggest that current reuse, decomposition of long methods, clone resolution, and future reuse are the major motivations for extract method refactorings. We also validated the results by analyzing the refactoring history of seven open-source applications. Analysis results suggest that current reuse was the primary motivation for 56\% of extract method refactorings, decomposition of methods was the primary motivation for 28\% of extract method refactorings, and clone resolution was the primary motivation for 16\% of extract method refactorings. These findings might suggest that recommending extract method opportunities by analyzing only the inner structure (e.g., complexity and length) of methods alone would miss many extract method opportunities. These findings also suggest that extract method refactorings are often driven by current and immediate reuse. Consequently, how to recognize or predict reuse requirements timely during software evolution may play a key role in the recommendation and automation of extract method refactorings. We also investigated the likelihood for the extracted methods to be reused in future, and our results suggest that such methods have a small chance (12\%) to be reused in future unless the extracted fragment could be reused immediately in software evolution and extracting such a fragment can resolve existing clones at the same time.},
    number = {4},
    journal = {Front. Comput. Sci},
    author = {Liu, Wenmei and Liu, Hui},
    month = aug,
    year = {2016},
    note = {Publisher: Higher Education Press},
    keywords = {software quality, data mining, extract method, motivation, software refactoring},
    pages = {644-656}
    }

  • C. Ragkhitwetsagul, M. Paixao, M. Adham, S. Busari, J. Krinke, and J. H. Drake, “Searching for Configurations in Clone Evaluation A Replication Study,” 2016.
    [BibTeX] [Abstract] [PDF]

    Clone detection is the process of finding duplicated code within a software code base in an automated manner. It is useful in several areas of software development such as code quality analysis, bug detection, and program understanding. We replicate a study of a genetic-algorithm based framework that optimises parameters for clone agreement (EvaClone). We apply the framework to 14 releases of Mockito, a Java mocking framework. We observe that the optimised parameters outperform the tools’ default parameters in term of clone agreement by 19.91% to 66.43%. However, the framework gives undesirable results in term of clone quality. EvaClone either maximises or minimises a number of clones in order to achieve the highest agreement resulting in more false positives or false negatives introduced consequently.

    @techreport{ragkhitwetsagul_searching_nodate,
    title = {Searching for {Configurations} in {Clone} {Evaluation} {A} {Replication} {Study}},
    journal = {Search Based Software Engineering (SSBSE)},
    url = {https://link.springer.com/chapter/10.1007/978-3-319-47106-8_20},
    abstract = {Clone detection is the process of finding duplicated code within a software code base in an automated manner. It is useful in several areas of software development such as code quality analysis, bug detection, and program understanding. We replicate a study of a genetic-algorithm based framework that optimises parameters for clone agreement (EvaClone). We apply the framework to 14 releases of Mockito, a Java mocking framework. We observe that the optimised parameters outperform the tools' default parameters in term of clone agreement by 19.91\% to 66.43\%. However, the framework gives undesirable results in term of clone quality. EvaClone either maximises or minimises a number of clones in order to achieve the highest agreement resulting in more false positives or false negatives introduced consequently.},
    author = {Ragkhitwetsagul, Chaiyong and Paixao, Matheus and Adham, Manal and Busari, Saheed and Krinke, Jens and Drake, John H},
    volume = {9962},
    year = {2016}
    }

  • A. Sheneamer and J. K. Kalita, “A survey of software clone detection techniques,” International journal of computer applications, vol. 137, pp. 1-21, 2016.
    [BibTeX] [PDF]
    @article{Sheneamer2016ASO,
    title={A Survey of Software Clone Detection Techniques},
    author={Abdullah Sheneamer and Jugal Kumar Kalita},
    journal={International Journal of Computer Applications},
    year={2016},
    volume={137},
    url = {https://www.semanticscholar.org/paper/A-Survey-of-Software-Clone-Detection-Techniques-Sheneamer-Kalita/1b150df71fea9b7d131acb9ab0eecc504c920f3f?p2df},
    pages={1-21}
    }

  • D. Strüber, J. Plöger, and V. Acreţoaie, “Clone Detection for Graph-Based Model Transformation Languages,” International conference on model transformation, vol. 9765, pp. 191-206, 2016. doi:10.1007/978-3-319-42064-6_13
    [BibTeX] [Abstract] [PDF]

    Cloning is a convenient mechanism to enable reuse across and within software artifacts. On the downside, it is also a practice related to significant long-term maintainability impediments, thus generating a need to identify clones in affected artifacts. A large variety of clone detection techniques has been proposed for programming and modeling languages; yet no specific ones have emerged for model transformation languages. In this paper, we explore clone detection for graph-based model transformation languages. We introduce potential use cases for such techniques in the context of constructive and analytical quality assurance. From these use cases, we derive a set of key requirements. We describe our customization of existing model clone detection techniques allowing us to address these requirements. Finally, we provide an experimental evaluation, indicating that our customization of ConQAT, one of the existing techniques, is well-suited to satisfy all identified requirements.

    @article{struber_clone_2016,
    title = {Clone {Detection} for {Graph}-{Based} {Model} {Transformation} {Languages}},
    volume = {9765},
    url = {https://www.researchgate.net/publication/302311346},
    doi = {10.1007/978-3-319-42064-6_13},
    abstract = {Cloning is a convenient mechanism to enable reuse across and within software artifacts. On the downside, it is also a practice related to significant long-term maintainability impediments, thus generating a need to identify clones in affected artifacts. A large variety of clone detection techniques has been proposed for programming and modeling languages; yet no specific ones have emerged for model transformation languages. In this paper, we explore clone detection for graph-based model transformation languages. We introduce potential use cases for such techniques in the context of constructive and analytical quality assurance. From these use cases, we derive a set of key requirements. We describe our customization of existing model clone detection techniques allowing us to address these requirements. Finally, we provide an experimental evaluation, indicating that our customization of ConQAT, one of the existing techniques, is well-suited to satisfy all identified requirements.},
    journal = {International Conference on Model Transformation},
    author = {Strüber, Daniel and Plöger, Jennifer and Acreţoaie, Vlad},
    year = {2016},
    note = {Publisher: Springer Verlag},
    pages = {191-206}
    }

  • F. Su, J. Bell, K. Harvey, S. Sethumadhavan, G. Kaiser, and T. Jebara, “Code relatives: detecting similarly behaving software,” in Proceedings of the 2016 24th acm sigsoft international symposium on foundations of software engineering, New York, NY, USA, 2016, p. 702–714. doi:10.1145/2950290.2950321
    [BibTeX] [PDF]
    @inproceedings{10.1145/2950290.2950321,
    author = {Su, Fang-Hsiang and Bell, Jonathan and Harvey, Kenneth and Sethumadhavan, Simha and Kaiser, Gail and Jebara, Tony},
    title = {Code Relatives: Detecting Similarly Behaving Software},
    year = {2016},
    isbn = {9781450342186},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2950290.2950321},
    doi = {10.1145/2950290.2950321},
    booktitle = {Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering},
    pages = {702–714},
    numpages = {13},
    keywords = {link analysis, code clones, Code relatives, runtime behavior, subgraph matching},
    location = {Seattle, WA, USA},
    series = {FSE 2016}
    }

  • M. White, M. Tufano, C. Vendome, and D. Poshyvanyk, “Deep learning code fragments for code clone detection,” in ASE 2016 – Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering, 2016, pp. 87-98. doi:10.1145/2970276.2970326
    [BibTeX] [Abstract] [PDF]

    Code clone detection is an important problem for software maintenance and evolution. Many approaches consider either structure or identifiers, but none of the existing detection techniques model both sources of information. These techniques also depend on generic, handcrafted features to represent code fragments. We introduce learning-based detection techniques where everything for representing terms and fragments in source code is mined from the repository. Our code analysis supports a framework, which relies on deep learning, for automatically linking patterns mined at the lexical level with patterns mined at the syntactic level. We evaluated our novel learning-based approach for code clone detection with respect to feasibility from the point of view of software maintainers. We sampled and manually evaluated 398 file- and 480 method-level pairs across eight real-world Java systems; 93% of the file- and method-level samples were evaluated to be true positives. Among the true positives, we found pairs mapping to all four clone types. We compared our approach to a traditional structure-oriented technique and found that our learning-based approach detected clones that were either undetected or suboptimally reported by the prominent tool Deckard. Our results affirm that our learning-based approach is suitable for clone detection and a tenable technique for researchers.

    @inproceedings{white_deep_2016,
    title = {Deep learning code fragments for code clone detection},
    isbn = {978-1-4503-3845-5},
    url = {https://tufanomichele.com/publications/C5.pdf},
    doi = {10.1145/2970276.2970326},
    abstract = {Code clone detection is an important problem for software maintenance and evolution. Many approaches consider either structure or identifiers, but none of the existing detection techniques model both sources of information. These techniques also depend on generic, handcrafted features to represent code fragments. We introduce learning-based detection techniques where everything for representing terms and fragments in source code is mined from the repository. Our code analysis supports a framework, which relies on deep learning, for automatically linking patterns mined at the lexical level with patterns mined at the syntactic level. We evaluated our novel learning-based approach for code clone detection with respect to feasibility from the point of view of software maintainers. We sampled and manually evaluated 398 file- and 480 method-level pairs across eight real-world Java systems; 93\% of the file- and method-level samples were evaluated to be true positives. Among the true positives, we found pairs mapping to all four clone types. We compared our approach to a traditional structure-oriented technique and found that our learning-based approach detected clones that were either undetected or suboptimally reported by the prominent tool Deckard. Our results affirm that our learning-based approach is suitable for clone detection and a tenable technique for researchers.},
    booktitle = {{ASE} 2016 - {Proceedings} of the 31st {IEEE}/{ACM} {International} {Conference} on {Automated} {Software} {Engineering}},
    publisher = {Association for Computing Machinery, Inc},
    author = {White, Martin and Tufano, Michele and Vendome, Christopher and Poshyvanyk, Denys},
    month = aug,
    year = {2016},
    keywords = {Code clone detection, Machine learning, Abstract syntax trees, Deep learning, Language models, Neural networks},
    pages = {87-98}
    }

  • N. Yoshida, “When, why and for whom do practitioners detect technical debt?: An experience report,” 1st international workshop on technical debt analytics (tda 2016), 2016. doi:10.1007/s10664-008-9076-6
    [BibTeX] [Abstract] [PDF]

    Code cloning is one of the most well-known code-level technical debts. In this paper, I discuss when, why and for whom practitioners detect code clones based on my experience of industry/university collaboration. At first, I introduce five project instances based on my experience. Next, I identify elements of the context model of a software maintenance project. After that, I discuss the impact of the context of a software maintenance project on technical debt.

    @article{yoshida_when_nodate,
    title = {When, why and for whom do practitioners detect technical debt?: {An} experience report},
    url = {http://ceur-ws.org/Vol-1771/paper10.pdf},
    doi = {10.1007/s10664-008-9076-6},
    abstract = {Code cloning is one of the most well-known code-level technical debts. In this paper, I discuss when, why and for whom practitioners detect code clones based on my experience of industry/university collaboration. At first, I introduce five project instances based on my experience. Next, I identify elements of the context model of a software maintenance project. After that, I discuss the impact of the context of a software maintenance project on technical debt.},
    journal = {1st International Workshop on Technical Debt Analytics (TDA 2016)},
    year ={2016},
    author = {Yoshida, Norihiro}
    }

  • M. Ahmad and M. Dmutto, “A Novel Approach for Code Clone Detection Using Hybrid Technique,” International journal of advanced engineering, management and science (ijaems), vol. 2, iss. 9, 2016.
    [BibTeX] [Abstract] [PDF]

    Code clones have been studied for long, and there is strong evidence that they are a major source of software faults. The copying of code has been studied within software engineering mostly in the area of clone analysis. Software clones are regions of source code which are highly similar; these regions of similarity are called clones, clone classes, or clone pairs. In this paper a hybrid approach using metric based technique with the combination of text based technique for detection and reporting of clones is proposed. The proposed work is divided into two stages: selection of potential clones and comparing of potential clones using textual comparison. The proposed technique detects exact clones on the basis of metric match and then by text match.

    @article{ahmad_novel_2016,
    title = {A {Novel} {Approach} for {Code} {Clone} {Detection} {Using} {Hybrid} {Technique}},
    volume = {2},
    url = {https://www.semanticscholar.org/paper/A-Novel-Approach-for-Code-Clone-Detection-Using-Ahmad-Dmutto/cdc0fe9233884185a35f1086fa6d947f738f2ae2},
    issn = {2454-1311},
    abstract = {Code clones have been studied for long, and there is strong evidence that they are a major source of software faults. The copying of code has been studied within software engineering mostly in the area of clone analysis. Software clones are regions of source code which are highly similar; these regions of similarity are called clones, clone classes, or clone pairs. In this paper a hybrid approach using metric based technique with the combination of text based technique for detection and reporting of clones is proposed. The proposed work is divided into two stages: selection of potential clones and comparing of potential clones using textual comparison. The proposed technique detects exact clones on the basis of metric match and then by text match.},
    number = {9},
    journal = {International Journal of Advanced Engineering, Management and Science (IJAEMS)},
    author = {Ahmad, Muneer and Dmutto, Mudasirahma},
    year = {2016},
    note = {Publisher: Infogain Publication},
    keywords = {code clone, Functional clone, Hybrid, Textual clones}
    }

  • Z. D. Al-Saffar, S. S. Sarhan, and S. Elmougy, “Automatic detecting and removal duplicate codes clones,” International journal of intelligent computing and information sciences, iss. 3, pp. 81-93, 2016.
    [BibTeX] [Abstract] [PDF]

    Code clones is considered now an important part of improving the overall design of software structure and software maintenance through making the source code more readable and more capable for maintenance. To remove code clones from a written code, refactoring technique could be used. Copying and pasting fragments of codes is a type of code clones that should be handled and has many practical applications such as software and project plagiarism detection clones and copyright infringements. To overcome this problem, we propose a computerized refactoring system to remove duplicate code clones. The simulation results of applying the proposed system showed that it increases the maintainability and quality of software system based on the total lines of code, blank lines and total methods count for the four used Java open source projects.

    @article{al-saffar_automatic_2016,
    title = {Automatic Detecting and Removal Duplicate Codes Clones},
    url = {https://journals.ekb.eg/article_19841.html},
    abstract = {Code clones is considered now an important part of improving the overall design of software structure and software maintenance through making the source code more readable and more capable for maintenance. To remove code clones from a written code, refactoring technique could be used. Copying and pasting fragments of codes is a type of code clones that should be handled and has many practical applications such as software and project plagiarism detection clones and copyright infringements. To overcome this problem, we propose a computerized refactoring system to remove duplicate code clones. The simulation results of applying the proposed system showed that it increases the maintainability and quality of software system based on the total lines of code, blank lines and total methods count for the four used Java open source projects.},
    number = {3},
    author = {Al-Saffar, Z D and Sarhan, S S and Elmougy, S},
    year = {2016},
    journal = {International Journal of Intelligent Computing and Information Sciences},
    pages = {81-93},
    keywords = {Clone Refactoring, Clones Removing, Code Smells, Duplicated Code, Code Clones}
    }

  • S. Alam, R. Riley, I. Sogukpinar, and N. Carkaci, “Droidclone: detecting android malware variants by exposing code clones,” in 2016 sixth international conference on digital information and communication technology and its applications (dictap), 2016, pp. 79-84.
    [BibTeX]
    @INPROCEEDINGS{7544005,
    author={S. Alam and R. Riley and I. Sogukpinar and N. Carkaci},
    booktitle={2016 Sixth International Conference on Digital Information and Communication Technology and its Applications (DICTAP)},
    title={DroidClone: Detecting android malware variants by exposing code clones},
    year={2016},
    volume={},
    number={},
    pages={79-84},
    }

  • A. Ashish, “Clones Clustering Using K-Means,” 10th international conference on intelligent systems and control (isco), 2016. doi:10.1109/ISCO.2016.7726943
    [BibTeX] [Abstract] [PDF]

    Cloning is a process of reusing the existing code for development of fresh code or to modify an existing system. It involves using a known pattern or source code as aviation over which a new code designed with or without modifying the original source. Several approaches are being used for detection of clones. In our work we modified LSH base approach of Deckard to find clones. Deckard is a scalable and accurate clone detection tool which is LSH (Locality Sensitive Hashing) algorithm based. In this paper have we to replace the call to LSH with K-Means algorithm. LSH based proposed system will be used for of clones in Java, C, Php programs and will help in clone code optimization. K-Means algorithm for clustering uses set of observations to partition them into clusters.

    @article{ashish_clones_nodate,
    title = {Clones {Clustering} {Using} {K}-{Means}},
    url = {https://www.researchgate.net/publication/305992168},
    doi = {10.1109/ISCO.2016.7726943},
    abstract = {Cloning is a process of reusing the existing code for development of fresh code or to modify an existing system. It involves using a known pattern or source code as aviation over which a new code designed with or without modifying the original source. Several approaches are being used for detection of clones. In our work we modified LSH base approach of Deckard to find clones. Deckard is a scalable and accurate clone detection tool which is LSH (Locality Sensitive Hashing) algorithm based. In this paper have we to replace the call to LSH with K-Means algorithm. LSH based proposed system will be used for of clones in Java, C, Php programs and will help in clone code optimization. K-Means algorithm for clustering uses set of observations to partition them into clusters.},
    journal = {10th International Conference on Intelligent Systems and Control (ISCO)},
    author = {Ashish, Aveg},
    year={2016},
    keywords = {Clone Clustering, Clusters, Post-Processing, Similarity, Tokenization, Vector Generation}
    }

  • W. Casey and A. Shelmire, “Signature limits: an entire map of clone features and their discovery in nearly linear time,” in 2016 11th international conference on malicious and unwanted software (malware), 2016, pp. 1-10.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7888740,
    author={Casey, William and Shelmire, Aaron},
    booktitle={2016 11th International Conference on Malicious and Unwanted Software (MALWARE)},
    title={Signature limits: an entire map of clone features and their discovery in nearly linear time},
    year={2016},
    url = {https://ieeexplore.ieee.org/abstract/document/7888740/},
    volume={},
    number={},
    pages={1-10},
    }

  • D. Chatterji, J. C. Carver, and N. A. Kraft, “Code clones and developer behavior: results of two surveys of the clone research community,” Empirical softw. engg., vol. 21, iss. 4, p. 1476–1508, 2016. doi:10.1007/s10664-015-9394-4
    [BibTeX] [PDF]
    @article{10.1007/s10664-015-9394-4,
    author = {Chatterji, Debarshi and Carver, Jeffrey C. and Kraft, Nicholas A.},
    title = {Code Clones and Developer Behavior: Results of Two Surveys of the Clone Research Community},
    year = {2016},
    issue_date = {August 2016},
    publisher = {Kluwer Academic Publishers},
    address = {USA},
    volume = {21},
    number = {4},
    issn = {1382-3256},
    url = {https://doi.org/10.1007/s10664-015-9394-4},
    doi = {10.1007/s10664-015-9394-4},
    journal = {Empirical Softw. Engg.},
    month = aug,
    pages = {1476–1508},
    numpages = {33},
    keywords = {Clone evolution, Code clones, Clone management, Software maintenance, Developer behavior, Community survey}
    }

  • J. Chen, T. R. Dean, and M. H. Alalfi, “Clone detection in MATLAB Stateflow models,” Software quality journal, vol. 24, pp. 917-946, 2016. doi:10.1007/s11219-015-9296-0
    [BibTeX] [Abstract] [PDF]

    MATLAB Simulink is one of the leading tools for model-based software development in the automotive industry. One extension to Simulink is Stateflow, which allows the user to embed Statecharts as components in a Simulink model. These state machines contain nested states, an action language that describes events, guards, conditions, actions, and complex transitions. As Stateflow has become increasingly important in Simulink models for the automotive sector, we extend previous work on clone detection of Simulink models to Stateflow components. While Stateflow models are stored in the same file as the Simulink models that host them, the representations differ. Our approach incorporates a pretransformation that converts the Stateflow models into a form that allows us to use the SIMONE model clone detector to identify candidates and cluster them into classes. In addition, we push the results of the Stateflow clone detection back into the Simulink models, improving the accuracy of the clones found in the host Simulink models. We validated our approach on the MATLAB Simulink/Stateflow demo set. Our approach showed promising results on the identification of Stateflow clones in isolation, as well as integrated components of the Simulink models that are hosting them.

    @article{chen_clone_2016,
    title = {Clone detection in {MATLAB} {Stateflow} models},
    volume = {24},
    url = {https://link.springer.com/article/10.1007/s11219-015-9296-0},
    doi = {10.1007/s11219-015-9296-0},
    abstract = {MATLAB Simulink is one of the leading tools for model-based software development in the automotive industry. One extension to Simulink is Stateflow, which allows the user to embed Statecharts as components in a Simulink model. These state machines contain nested states, an action language that describes events, guards, conditions, actions, and complex transitions. As Stateflow has become increasingly important in Simulink models for the automotive sector, we extend previous work on clone detection of Simulink models to Stateflow components. While Stateflow models are stored in the same file as the Simulink models that host them, the representations differ. Our approach incorporates a pretransformation that converts the Stateflow models into a form that allows us to use the SIMONE model clone detector to identify candidates and cluster them into classes. In addition, we push the results of the Stateflow clone detection back into the Simulink models, improving the accuracy of the clones found in the host Simulink models. We validated our approach on the MATLAB Simulink/Stateflow demo set. Our approach showed promising results on the identification of Stateflow clones in isolation, as well as integrated components of the Simulink models that are hosting them.},
    journal = {Software Quality Journal},
    author = {Chen, Jian and Dean, Thomas R. and Alalfi, Manar H.},
    year = {2016},
    keywords = {Model, State machine, Stateflow},
    pages = {917-946}
    }

  • X. Cheng, Z. Peng, L. Jiang, H. Zhong, H. Yu, and J. Zhao, “Mining revision histories to detect cross-language clones without intermediates,” in Proceedings of the 31st ieee/acm international conference on automated software engineering, New York, NY, USA, 2016, p. 696–701. doi:10.1145/2970276.2970363
    [BibTeX] [PDF]
    @inproceedings{10.1145/2970276.2970363,
    author = {Cheng, Xiao and Peng, Zhiming and Jiang, Lingxiao and Zhong, Hao and Yu, Haibo and Zhao, Jianjun},
    title = {Mining Revision Histories to Detect Cross-Language Clones without Intermediates},
    year = {2016},
    isbn = {9781450338455},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2970276.2970363},
    doi = {10.1145/2970276.2970363},
    booktitle = {Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering},
    pages = {696–701},
    numpages = {6},
    keywords = {diff, cross-language clone, revision history},
    location = {Singapore, Singapore},
    series = {ASE 2016}
    }

  • P. Ciancarini, D. Russo, A. Sillitti, and G. Succi, “A guided tour of the legal implications of software cloning,” in Proceedings of the 38th international conference on software engineering companion, New York, NY, USA, 2016, p. 563–572. doi:10.1145/2889160.2889220
    [BibTeX] [PDF]
    @inproceedings{10.1145/2889160.2889220,
    author = {Ciancarini, Paolo and Russo, Daniel and Sillitti, Alberto and Succi, Giancarlo},
    title = {A Guided Tour of the Legal Implications of Software Cloning},
    year = {2016},
    isbn = {9781450342056},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2889160.2889220},
    doi = {10.1145/2889160.2889220},
    booktitle = {Proceedings of the 38th International Conference on Software Engineering Companion},
    pages = {563–572},
    numpages = {10},
    keywords = {copyright, software reuse, software cloning, IPR},
    location = {Austin, Texas},
    series = {ICSE ’16}
    }

  • V. Fördos and M. Tóth, “Identifying Code Clones with RefactorErl,” Acta cybernetica, vol. 22, pp. 553-571, 2016. doi:10.14232/actacyb.22.3.2016.1
    [BibTeX] [Abstract] [PDF]

    Code clones, the results of “copy&paste programming”, have a negative impact on software maintenance. Therefore several tools and techniques have been developed to identify them in the source code. Most of them concentrate on imperative, well known languages, while in this paper, we give an AST/metric based clone detection algorithm for the functional programming language Erlang. We propose a standalone solution that does not overload users with results that are insignificant from the point of view of the user. We emphasise that the maintenance costs can be decreased by using our solution, because the programmers need to deal only with important issues.

    @article{ford_identifying_2016,
    title = {Identifying {Code} {Clones} with {RefactorErl}},
    volume = {22},
    url = {http://cyber.bibl.u-szeged.hu/index.php/actcybern/article/view/3895},
    doi = {10.14232/actacyb.22.3.2016.1},
    abstract = {Code clones, the results of "copy\&paste programming", have a negative impact on software maintenance. Therefore several tools and techniques have been developed to identify them in the source code. Most of them concentrate on imperative, well known languages, while in this paper, we give an AST/metric based clone detection algorithm for the functional programming language Erlang. We propose a standalone solution that does not overload users with results that are insignificant from the point of view of the user. We emphasise that the maintenance costs can be decreased by using our solution, because the programmers need to deal only with important issues.},
    journal = {Acta Cybernetica},
    author = {Fördos, Viktória and Tóth, Melinda},
    year = {2016},
    pages = {553-571}
    }

  • S. Gupta and P. C. Gupta, “A Novel Approach to Detect Duplicate Code Blocks to Reduce Maintenance Effort,” International journal of advanced computer science and applications, vol. 7, iss. 4, 2016.
    [BibTeX] [Abstract] [PDF]

    It was found in many cases that a code might be a clone for one programmer but not the same for another one. This problem occurs because of inaccurate documentation. According to research, the maintainers are not aware of the original design and thus, face the difficulty of agreeing on the system’s components and their relations or understanding the work of the application. The problem also occurs because of the different team of development and maintenance resulting in more effort and time during maintenance. This paper proposes a novel approach to detect the clones at the programmer side such that if a particular code is a clone then it can be well documented. This approach will provide both the individual duplicate statements as well as the block in which they appear. The approach has been examined on seven open source systems.

    @article{gupta_novel_2016,
    title = {A {Novel} {Approach} to {Detect} {Duplicate} {Code} {Blocks} to {Reduce} {Maintenance} {Effort}},
    url = {https://www.researchgate.net/publication/301945885_A_Novel_Approach_to_Detect_Duplicate_Code_Blocks_to_Reduce_Maintenance_Effort},
    abstract = {It was found in many cases that a code might be a clone for one programmer but not the same for another one. This problem occurs because of inaccurate documentation. According to research, the maintainers are not aware of the original design and thus, face the difficulty of agreeing on the system's components and their relations or understanding the work of the application. The problem also occurs because of the different team of development and maintenance resulting in more effort and time during maintenance. This paper proposes a novel approach to detect the clones at the programmer side such that if a particular code is a clone then it can be well documented. This approach will provide both the individual duplicate statements as well as the block in which they appear. The approach has been examined on seven open source systems.},
    number = {4},
    author = {Gupta, Sonam and Gupta, P C},
    year = {2016},
    volume= {7},
    journal = {International Journal of Advanced Computer Science and Applications},
    keywords = {Clones, Abstract Syntax Tree (AST), Program Dependence Graph (PDG), Control Flow Graph (CFG)}
    }

  • T. A. D. Henderson and A. Podgurski, “Sampling code clones from program dependence graphs with GRAPLE,” in Proceedings of the 2nd international workshop on software analytics, New York, NY, USA, 2016, p. 47–53. doi:10.1145/2989238.2989241
    [BibTeX] [PDF]
    @inproceedings{10.1145/2989238.2989241,
    author = {Henderson, Tim A. D. and Podgurski, Andy},
    title = {Sampling Code Clones from Program Dependence Graphs with {GRAPLE}},
    year = {2016},
    isbn = {9781450343954},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2989238.2989241},
    doi = {10.1145/2989238.2989241},
    booktitle = {Proceedings of the 2nd International Workshop on Software Analytics},
    pages = {47–53},
    numpages = {7},
    keywords = {sampling estimation, clone detection, frequent subgraph mining, Markov chains, program dependence graphs, bug mining},
    location = {Seattle, WA, USA},
    series = {SWAN 2016}
    }

  • J. F. Islam, M. Mondal, and C. K. Roy, “Bug replication in code clones: an empirical study,” in 2016 ieee 23rd international conference on software analysis, evolution, and reengineering (saner), 2016, pp. 68-78.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7476631,
    author={J. F. {Islam} and M. {Mondal} and C. K. {Roy}},
    booktitle={2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER)},
    title={Bug Replication in Code Clones: An Empirical Study},
    year={2016},
    url = {https://ieeexplore.ieee.org/abstract/document/7476631/},
    volume={1},
    number={},
    pages={68-78},
    }

  • M. R. Islam and M. F. Zibran, “A comparative study on vulnerabilities in categories of clones and non-cloned code,” in 2016 ieee 23rd international conference on software analysis, evolution, and reengineering (saner), 2016, pp. 8-14.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7476787,
    author={M. R. {Islam} and M. F. {Zibran}},
    booktitle={2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER)},
    title={A Comparative Study on Vulnerabilities in Categories of Clones and Non-cloned Code},
    year={2016},
    url = {https://ieeexplore.ieee.org/abstract/document/7476787/},
    volume={3},
    number={},
    pages={8-14},
    }

  • J. Kaur and R. Singh, “Implementation of Model Cloning in Software Models using UML diagrams,” International journal of technology and computing (ijtc), pp. 114-119, 2016.
    [BibTeX] [Abstract] [PDF]

    As model-based development in the software field is increasing rapidly, clone detection is emerging as an active research area. Models are an integral part of software development, and maintenance of the designed model is very important. One of the major problems that occurs when a model is designed is cloning. Cloning is the process of creating copies of various elements of the model. Clones are divided into two kinds: code clones and model clones. Model cloning is the process of developing duplicate parts of the models. Code cloning is the process of duplicating code. Cloning increases the maintenance cost of the model and also increases the probability of bugs in the system, so the detection of clones is an important process. In this paper a detailed study of cloning is presented. Along with this, the major causes of the creation of clones and various methods of clone detection are discussed.

    @article{kaur_implementation_2016,
    title = {Implementation of {Model} {Cloning} in {Software} {Models} using {UML} diagrams},
    url = {http://www.ijtc.org/download/Volume-2/April-2/IJTC201604009-Implementation of Model Cloning in Software Models using UML diagrams-s129.pdf},
    abstract = {As model-based development in the software field is increasing rapidly, clone detection is emerging as an active research area. Models are an integral part of software development, and maintenance of the designed model is very important. One of the major problems that occurs when a model is designed is cloning. Cloning is the process of creating copies of various elements of the model. Clones are divided into two kinds: code clones and model clones. Model cloning is the process of developing duplicate parts of the models. Code cloning is the process of duplicating code. Cloning increases the maintenance cost of the model and also increases the probability of bugs in the system, so the detection of clones is an important process. In this paper a detailed study of cloning is presented. Along with this, the major causes of the creation of clones and various methods of clone detection are discussed.},
    author = {Kaur, Jashandeep and Singh, Rasbir},
    year = {2016},
    pages = {114-119},
    journal = {International Journal of Technology and Computing (IJTC)},
    keywords = {Clone detection, code clones, software, development, model clones}
    }

  • S. Kaur, G. Chatley, and B. Sohal, “Software Clone Detection: A review,” International journal of control theory and applications, vol. 9, iss. 41, pp. 555-563, 2016.
    [BibTeX] [Abstract] [PDF]

    Software cloning is the current issue in industries, making acknowledgement of clones a key bit of programming examination. Existing writing on the topic of software or programming clones is grouped comprehensively into various classifications. Utilization of existing code either by duplication and paste methods or by performing minor adjustments in the current code is known as software cloning. Programming clones may prompt bug engendering and genuine support issues. Clone sorts/types, techniques of clones and different procedures are included in this paper. Also this paper will serve as a guide to a potential client of clone identification strategies, to help them in choosing the right apparatuses or methods for their interests.

    @article{kaur_software_2016,
    title = {Software {Clone} {Detection}: {A} review},
    volume = {9},
    issn = {0974-5572},
    url = {https://www.researchgate.net/publication/315702342},
    abstract = {Software cloning is the current issue in industries, making acknowledgement of clones a key bit of programming examination. Existing writing on the topic of software or programming clones is grouped comprehensively into various classifications. Utilization of existing code either by duplication and paste methods or by performing minor adjustments in the current code is known as software cloning. Programming clones may prompt bug engendering and genuine support issues. Clone sorts/types, techniques of clones and different procedures are included in this paper. Also this paper will serve as a guide to a potential client of clone identification strategies, to help them in choosing the right apparatuses or methods for their interests.},
    number = {41},
    journal = {International Journal of Control Theory and Applications},
    author = {Kaur, Sandeep and Chatley, Geetika and Sohal, Bhavneesh},
    year = {2016},
    keywords = {Code clone, refactoring, parsing, plagiarism, reuse, similarity, semantic, syntactic},
    pages = {555-563}
    }

  • N. Kesswani, U. Devi, and A. Sharma, “A study on the nature of code clone occurrence predominantly in feature oriented programming and the prospects of refactoring,” International journal of computer applications, vol. 141, iss. 8, 2016. doi:10.5120/ijca2016909724
    [BibTeX] [Abstract] [PDF]

    In this position paper, it is tried to analyze the diverse type of code clones which is present and can easily be perpetuated in feature oriented programming. Along with that, a brief summary of the type of code clones and the use of Refactoring methodologies and tools which is effectively known to remove the problem of code clones is also discussed. The main observation that is made in this paper is the various type of code clones which are present in FOP. Through this discussion, it is intended to draw the attention to the various ways in which code clones could propagate and how important it is to curb it at the initial stages to reduce the complexities.

    @article{kesswani_study_2016,
    title = {A Study on the Nature of Code Clone Occurrence Predominantly in Feature Oriented Programming and the Prospects of Refactoring},
    volume = {141},
    url = {https://www.researchgate.net/publication/303319044},
    doi = {10.5120/ijca2016909724},
    abstract = {In this position paper, it is tried to analyze the diverse type of code clones which is present and can easily be perpetuated in feature oriented programming. Along with that, a brief summary of the type of code clones and the use of Refactoring methodologies and tools which is effectively known to remove the problem of code clones is also discussed. The main observation that is made in this paper is the various type of code clones which are present in FOP. Through this discussion, it is intended to draw the attention to the various ways in which code clones could propagate and how important it is to curb it at the initial stages to reduce the complexities.},
    number = {8},
    journal = {International Journal of Computer Applications},
    author = {Kesswani, Nishtha and Devi, U and Sharma, A},
    year = {2016},
    keywords = {Refactoring, Code clone metrics, Code clones in FOP and OOP, Code clone detection techniques and metrics, Refactoring methods, Code clones},
    issn = {0975-8887}
    }

  • D. Mazinanian, N. Tsantalis, R. Stein, and Z. Valenta, “JDeodorant: clone refactoring,” in Proceedings of the 38th international conference on software engineering companion, New York, NY, USA, 2016, p. 613–616. doi:10.1145/2889160.2889168
    [BibTeX] [PDF]
    @inproceedings{10.1145/2889160.2889168,
    author = {Mazinanian, Davood and Tsantalis, Nikolaos and Stein, Raphael and Valenta, Zackary},
    title = {{JDeodorant}: Clone Refactoring},
    year = {2016},
    isbn = {9781450342056},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2889160.2889168},
    doi = {10.1145/2889160.2889168},
    booktitle = {Proceedings of the 38th International Conference on Software Engineering Companion},
    pages = {613–616},
    numpages = {4},
    keywords = {refactoring, code duplication, refactorability analysis},
    location = {Austin, Texas},
    series = {ICSE ’16}
    }

  • M. Mondal, C. K. Roy, and K. A. Schneider, “A comparative study on the intensity and harmfulness of late propagation in near-miss code clones,” Software quality journal, vol. 24, pp. 883-915, 2016. doi:10.1007/s11219-016-9305-y
    [BibTeX] [Abstract] [PDF]

    Exact or nearly similar code fragments in a software system’s source code are referred to as code clones. It is often the case that updates (i.e., changes) to a code clone will need to be propagated to its related code clones to preserve their similarity and to maintain source code consistency. When there is a delay in propagating the changes (possibly because the developer is unaware of the related cloned code), the system might behave incorrectly. A delay in propagating a change is referred to as ‘late propagation,’ and a number of studies have investigated this phenomenon. However, these studies did not investigate the intensity of late propagation nor how late propagation differs by clone type. In this research, we investigate late propagation separately for each of the three clone types (Type 1, Type 2, and Type 3). According to our experimental results on thousands of revisions of eight diverse subject systems written in two programming languages, late propagation occurs more frequently in Type 3 clones compared with the other two clone types. More importantly, there is a higher probability that Type 3 clones will experience buggy late propagations compared with the other two clone types. Also, we discovered that block clones are more involved in late propagation than method clones. Refactoring and tracking of Similarity Preserving Change Pattern (SPCP) clones (i.e., the clone fragments that evolve following a SPCP) can help us minimize the occurrences of late propagation in clones.

    @article{mondal_comparative_2016,
    title = {A comparative study on the intensity and harmfulness of late propagation in near-miss code clones},
    volume = {24},
    url = {https://doi.org/10.1007/s11219-016-9305-y},
    doi = {10.1007/s11219-016-9305-y},
    abstract = {Exact or nearly similar code fragments in a software system's source code are referred to as code clones. It is often the case that updates (i.e., changes) to a code clone will need to be propagated to its related code clones to preserve their similarity and to maintain source code consistency. When there is a delay in propagating the changes (possibly because the developer is unaware of the related cloned code), the system might behave incorrectly. A delay in propagating a change is referred to as 'late propagation,' and a number of studies have investigated this phenomenon. However, these studies did not investigate the intensity of late propagation nor how late propagation differs by clone type. In this research, we investigate late propagation separately for each of the three clone types (Type 1, Type 2, and Type 3). According to our experimental results on thousands of revisions of eight diverse subject systems written in two programming languages, late propagation occurs more frequently in Type 3 clones compared with the other two clone types. More importantly, there is a higher probability that Type 3 clones will experience buggy late propagations compared with the other two clone types. Also, we discovered that block clones are more involved in late propagation than method clones. Refactoring and tracking of Similarity Preserving Change Pattern (SPCP) clones (i.e., the clone fragments that evolve following a SPCP) can help us minimize the occurrences of late propagation in clones.},
    journal = {Software Quality Journal},
    author = {Mondal, Manishankar and Roy, Chanchal K. and Schneider, Kevin A.},
    year = {2016},
    keywords = {Code clones, Clone genealogy, Late propagation, Near-miss clones},
    pages = {883-915}
    }

  • Y. Nakamura, E. Choi, N. Yoshida, S. Haruna, and K. Inoue, “Towards detection and analysis of interlanguage clones for multilingual web applications,” in 2016 ieee 23rd international conference on software analysis, evolution, and reengineering (saner), 2016, pp. 17-18.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7476789,
    author={Y. {Nakamura} and E. {Choi} and N. {Yoshida} and S. {Haruna} and K. {Inoue}},
    booktitle={2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER)},
    title={Towards Detection and Analysis of Interlanguage Clones for Multilingual Web Applications},
    year={2016},
    url = {https://ieeexplore.ieee.org/abstract/document/7476789},
    volume={3},
    number={},
    pages={17-18},
    }

  • A. Okutan and O. T. Yildiz, “A novel kernel to predict software defectiveness,” Journal of systems and software, pp. 109-121, 2016. doi:10.1016/j.jss.2016.06.006
    [BibTeX] [Abstract] [PDF]

    Although the software defect prediction problem has been researched for a long time, the results achieved are not so bright. In this paper, we propose to use novel kernels for defect prediction that are based on the plagiarized source code, software clones and textual similarity. We generate precomputed kernel matrices and compare their performance on different data sets to model the relationship between source code similarity and defectiveness. Each value in a kernel matrix shows how much parallelism exists between the corresponding files of a software system chosen. Our experiments on 10 real world datasets indicate that support vector machines (SVM) with a precomputed kernel matrix performs better than the SVM with the usual linear kernel in terms of F-measure. Similarly, when used with a precomputed kernel, the k-nearest neighbor classifier (KNN) achieves comparable performance with respect to KNN classifier. The results from this preliminary study indicate that source code similarity can be used to predict defect proneness.

    @article{okutan_novel_2016,
    title = {A novel kernel to predict software defectiveness},
    url = {https://www.researchgate.net/publication/304106172},
    doi = {10.1016/j.jss.2016.06.006},
    abstract = {Although the software defect prediction problem has been researched for a long time, the results achieved are not so bright. In this paper, we propose to use novel kernels for defect prediction that are based on the plagiarized source code, software clones and textual similarity. We generate precomputed kernel matrices and compare their performance on different data sets to model the relationship between source code similarity and defectiveness. Each value in a kernel matrix shows how much parallelism exists between the corresponding files of a software system chosen. Our experiments on 10 real world datasets indicate that support vector machines (SVM) with a precomputed kernel matrix performs better than the SVM with the usual linear kernel in terms of F-measure. Similarly, when used with a precomputed kernel, the k-nearest neighbor classifier (KNN) achieves comparable performance with respect to KNN classifier. The results from this preliminary study indicate that source code similarity can be used to predict defect proneness.},
    journal = {Journal of Systems and Software},
    author = {Okutan, Ahmet and Yildiz, Olcay Taner},
    year = {2016},
    pages = {109-121},
    keywords = {Defect prediction, SVM, kernel methods}
    }

  • S. Jadon, “Code clones detection using machine learning technique: support vector machine,” in 2016 international conference on computing, communication and automation (iccca), 2016, pp. 299-303.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7813733,
    author={S. {Jadon}},
    booktitle={2016 International Conference on Computing, Communication and Automation (ICCCA)},
    title={Code clones detection using machine learning technique: Support vector machine},
    year={2016},
    url = {https://ieeexplore.ieee.org/abstract/document/7813733/},
    volume={},
    number={},
    pages={299-303},
    }

  • D. Rabiser, P. Grünbacher, H. Prähofer, and F. Angerer, “A prototype-based approach for managing clones in clone-and-own product lines,” in Proceedings of the 20th international systems and software product line conference, New York, NY, USA, 2016, p. 35–44. doi:10.1145/2934466.2934487
    [BibTeX] [PDF]
    @inproceedings{10.1145/2934466.2934487,
    author = {Rabiser, Daniela and Gr\"{u}nbacher, Paul and Pr\"{a}hofer, Herbert and Angerer, Florian},
    title = {A Prototype-Based Approach for Managing Clones in Clone-and-Own Product Lines},
    year = {2016},
    isbn = {9781450340502},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2934466.2934487},
    doi = {10.1145/2934466.2934487},
    booktitle = {Proceedings of the 20th International Systems and Software Product Line Conference},
    pages = {35–44},
    numpages = {10},
    keywords = {feature modeling, industrial systems, cloning, co-evolution},
    location = {Beijing, China},
    series = {SPLC ’16}
    }

  • D. Rattan and J. Kaur, “Systematic mapping study of metrics based clone detection techniques,” in Proceedings of the international conference on advances in information communication technology & computing, New York, NY, USA, 2016. doi:10.1145/2979779.2979855
    [BibTeX] [PDF]
    @inproceedings{10.1145/2979779.2979855,
    author = {Rattan, Dhavleesh and Kaur, Jagdeep},
    title = {Systematic Mapping Study of Metrics Based Clone Detection Techniques},
    year = {2016},
    isbn = {9781450342131},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2979779.2979855},
    doi = {10.1145/2979779.2979855},
    booktitle = {Proceedings of the International Conference on Advances in Information Communication Technology \& Computing},
    articleno = {76},
    numpages = {7},
    keywords = {systematic mapping, Code clone, systematic review, software metrics},
    location = {Bikaner, India},
    series = {AICTC ’16}
    }

  • N. A. Sahithi and K. Ramani, “A Survey on Software Refactorability through Software Clone Detection Tools and Techniques,” International journal for trends in engineering and technology, vol. 13, 2016.
    [BibTeX] [Abstract] [PDF]

    Code duplication consists of copies of the same code at multiple locations of source code representing software clones. Refactoring is a way to eliminate duplicate codes without changing overall behaviour of the program. Several tools and techniques were developed to handle type-1, type-2, type-3 and type-4 categories of clones. This paper is a review on refactoring of clone detection tools and techniques supported by various platforms and programming languages.

    @article{sahithi_survey_2016,
    title = {A {Survey} on {Software} {Refactorability} through {Software} {Clone} {Detection} {Tools} and {Techniques}},
    url = {http://ijtet.com/wp-content/plugins/ijtet/file/upload/docx/3996IJTET1301006-pdf.pdf},
    abstract = {Code duplication consists of copies of the same code at multiple locations of source code representing software clones. Refactoring is a way to eliminate duplicate codes without changing overall behaviour of the program. Several tools and techniques were developed to handle type-1, type-2, type-3 and type-4 categories of clones. This paper is a review on refactoring of clone detection tools and techniques supported by various platforms and programming languages.},
    author = {Sahithi, A Naga and Ramani, K},
    year = {2016},
    volume = {13},
    journal = {International Journal for Trends in Engineering and Technology},
    keywords = {Software clones, Clone Detection Tools, Software Refactoring}
    }

  • V. Saini, H. Sajnani, J. Kim, and C. Lopes, “SourcererCC and SourcererCC-I: Tools to detect clones in batch mode and during software development,” in Proceedings of International Conference on Software Engineering, 2016, pp. 597-600. doi:10.1145/2889160.2889165
    [BibTeX] [Abstract] [PDF]

    Given the availability of large source-code repositories, there has been a large number of applications for large-scale clone detection. Unfortunately, despite a decade of active research, there is a marked lack in clone detectors that scale to big software systems or large repositories, specifically for detecting near-miss (Type 3) clones where significant editing activities may take place in the cloned code. This paper demonstrates: (i) SourcererCC, a token-based clone detector that targets the first three clone types, and exploits an index to achieve scalability to large inter-project repositories using a standard workstation. It uses an optimized inverted-index to quickly query the potential clones of a given code block. Filtering heuristics based on token ordering are used to significantly reduce the size of the index, the number of code-block comparisons needed to detect the clones, as well as the number of required token-comparisons needed to judge a potential clone; and (ii) SourcererCC-I, an Eclipse plug-in, that uses SourcererCC’s core engine to identify and navigate clones (both inter and intra project) in real-time during software development. In our experiments, comparing SourcererCC with the state-of-the-art tools, we found that it is the only clone detection tool to successfully scale to 250 MLOC on a standard workstation with 12 GB RAM and efficiently detect the first three types of clones (precision 86\% and recall 86-100\%). Link to the demo: https://youtu.be/17F-9Qp-ks4

    @inproceedings{saini_sourcerercc_2016,
    title = {{SourcererCC} and {SourcererCC}-{I}: {Tools} to detect clones in batch mode and during software development},
    isbn = {978-1-4503-4161-5},
    doi = {10.1145/2889160.2889165},
    url = {https://ieeexplore.ieee.org/document/7883349},
    abstract = {Given the availability of large source-code repositories, there has been a large number of applications for large-scale clone detection. Unfortunately, despite a decade of active research, there is a marked lack in clone detectors that scale to big software systems or large repositories, specifically for detecting near-miss (Type 3) clones where significant editing activities may take place in the cloned code. This paper demonstrates: (i) SourcererCC, a token-based clone detector that targets the first three clone types, and exploits an index to achieve scalability to large inter-project repositories using a standard workstation. It uses an optimized inverted-index to quickly query the potential clones of a given code block. Filtering heuristics based on token ordering are used to significantly reduce the size of the index, the number of code-block comparisons needed to detect the clones, as well as the number of required token-comparisons needed to judge a potential clone; and (ii) SourcererCC-I, an Eclipse plug-in, that uses SourcererCC's core engine to identify and navigate clones (both inter and intra project) in real-time during software development. In our experiments, comparing SourcererCC with the state-of-the-art tools, we found that it is the only clone detection tool to successfully scale to 250 MLOC on a standard workstation with 12 GB RAM and efficiently detect the first three types of clones (precision 86\% and recall 86-100\%). Link to the demo: https://youtu.be/17F-9Qp-ks4},
    booktitle = {Proceedings of {International} {Conference} on {Software} {Engineering}},
    publisher = {IEEE Computer Society},
    author = {Saini, Vaibhav and Sajnani, Hitesh and Kim, Jaewoo and Lopes, Cristina},
    month = may,
    year = {2016},
    note = {ISSN: 0270-5257, arXiv: 1603.01661},
    pages = {597-600}
    }

  • H. Sajnani, V. Saini, J. Svajlenko, C. K. Roy, and C. V. Lopes, “SourcererCC: Scaling code clone detection to big-code,” in Proceedings of International Conference on Software Engineering, 2016, pp. 1157-1168. doi:10.1145/2884781.2884877
    [BibTeX] [Abstract] [PDF]

    Despite a decade of active research, there has been a marked lack in clone detection techniques that scale to large repositories for detecting near-miss clones. In this paper, we present a token-based clone detector, SourcererCC, that can detect both exact and near-miss clones from large interproject repositories using a standard workstation. It exploits an optimized inverted-index to quickly query the potential clones of a given code block. Filtering heuristics based on token ordering are used to significantly reduce the size of the index, the number of code-block comparisons needed to detect the clones, as well as the number of required token-comparisons needed to judge a potential clone. We evaluate the scalability, execution time, recall and precision of SourcererCC, and compare it to four publicly available and state-of-the-art tools. To measure recall, we use two recent benchmarks: (1) a big benchmark of real clones, BigCloneBench, and (2) a Mutation/Injection-based framework of thousands of fine-grained artificial clones. We find SourcererCC has both high recall and precision, and is able to scale to a large inter-project repository (25K projects, 250MLOC) using a standard workstation.

    @inproceedings{sajnani_sourcerercc_2016,
    title = {{SourcererCC}: {Scaling} code clone detection to big-code},
    isbn = {978-1-4503-3900-1},
    doi = {10.1145/2884781.2884877},
    url = {https://arxiv.org/abs/1512.06448},
    abstract = {Despite a decade of active research, there has been a marked lack in clone detection techniques that scale to large repositories for detecting near-miss clones. In this paper, we present a token-based clone detector, SourcererCC, that can detect both exact and near-miss clones from large interproject repositories using a standard workstation. It exploits an optimized inverted-index to quickly query the potential clones of a given code block. Filtering heuristics based on token ordering are used to significantly reduce the size of the index, the number of code-block comparisons needed to detect the clones, as well as the number of required token-comparisons needed to judge a potential clone. We evaluate the scalability, execution time, recall and precision of SourcererCC, and compare it to four publicly available and state-of-the-art tools. To measure recall, we use two recent benchmarks: (1) a big benchmark of real clones, BigCloneBench, and (2) a Mutation/Injection-based framework of thousands of fine-grained artificial clones. We find SourcererCC has both high recall and precision, and is able to scale to a large inter-project repository (25K projects, 250MLOC) using a standard workstation.},
    booktitle = {Proceedings of {International} {Conference} on {Software} {Engineering}},
    publisher = {IEEE Computer Society},
    author = {Sajnani, Hitesh and Saini, Vaibhav and Svajlenko, Jeffrey and Roy, Chanchal K. and Lopes, Cristina V.},
    month = may,
    year = {2016},
    note = {ISSN: 02705257},
    pages = {1157-1168}
    }

  • S. Sargsyan, S. Kurmangaleev, A. Belevantsev, and A. Avetisyan, “Scalable and Accurate Detection of Code Clones,” Programming and computer software, vol. 42, iss. 1, pp. 27-33, 2016. doi:10.1134/S0361768816010072
    [BibTeX] [Abstract] [PDF]

    A detailed description of a method for detection of code clones is described. This method is based on the semantic analysis of programs and on new algorithms that make it scalable without affecting its accuracy. The proposed method involves two phases. In the first phase, the program dependence graph (PDG) is constructed while the program is compiled. LLVM is used as the compilation infrastructure. In the second phase, similar subgraphs of maximum size that represent code clones are detected. Before starting the search for similar subgraphs, the PDG is divided into subgraphs that will be considered as potential clones of each other. To ensure scalability of the search for similar subgraphs, the composition of algorithms is used. The first algorithm checks that a pair of graphs cannot have similar subgraphs of the desired size; this is done in a linear amount of time. If this algorithm fails, another (approximate) algorithm is executed to find similar sub-graphs of maximum size. After similar subgraphs have been found, the program code is additionally checked for the position of the code lines corresponding to the detected clone candidates. Tests showed that the developed tool is more accurate than similar tools, such as MOSS, CCFinder, and CloneDR. Results obtained for the projects Linux-2.6, Firefox Mozilla, LLVM/Clang, and OpenSSL are presented.

    @article{sargsyan_scalable_2016,
    title = {Scalable and {Accurate} {Detection} of {Code} {Clones}},
    volume = {42},
    url = {https://link.springer.com/article/10.1134/S0361768816010072},
    doi = {10.1134/S0361768816010072},
    abstract = {A detailed description of a method for detection of code clones is described. This method is based on the semantic analysis of programs and on new algorithms that make it scalable without affecting its accuracy. The proposed method involves two phases. In the first phase, the program dependence graph (PDG) is constructed while the program is compiled. LLVM is used as the compilation infrastructure. In the second phase, similar subgraphs of maximum size that represent code clones are detected. Before starting the search for similar subgraphs, the PDG is divided into subgraphs that will be considered as potential clones of each other. To ensure scalability of the search for similar subgraphs, the composition of algorithms is used. The first algorithm checks that a pair of graphs cannot have similar subgraphs of the desired size; this is done in a linear amount of time. If this algorithm fails, another (approximate) algorithm is executed to find similar sub-graphs of maximum size. After similar subgraphs have been found, the program code is additionally checked for the position of the code lines corresponding to the detected clone candidates. Tests showed that the developed tool is more accurate than similar tools, such as MOSS, CCFinder, and CloneDR. Results obtained for the projects Linux-2.6, Firefox Mozilla, LLVM/Clang, and OpenSSL are presented.},
    number = {1},
    journal = {Programming and Computer Software},
    author = {Sargsyan, S and Kurmangaleev, Sh and Belevantsev, A and Avetisyan, A},
    month = jan,
    year = {2016},
    publisher = {Pleiades Publishing},
    pages = {27-33}
    }

  • S. Sharma and P. Mehta, “To Enhance Type 4 Clone Detection in Clone Testing,” 2016.
    [BibTeX] [Abstract] [PDF]

    The means of software reuse is copying and modifying block of code that detect cloning. As a survey, it is observed that 20-30\% of module in system may be cloned. So it is mandatory to detect clones in system to reduce replication and improve reusability. Code clone is similar or duplicate code in source code that is created either by replication or some modifications. Clone is a persistent form of Software Reuse that effect on maintenance of large software. In previous research, the researcher emphasis on detect type 1, type 2, and type 3 of type of clones. The existing code clone detection tools are used to detect clone in source code. In this research, the enhancement in code clone detection algorithm will be proposed which detect type 4. In this work, firstly, use an existing algorithm to detect clone. Secondly, we put some intensification in that algorithm to detect clone. Thirdly, we combine algorithm with type 4 to detect a clone in particular function. By using type 4, the efficiency of clone detection is increased. Clone is detected in particular function, which is more accurate and more efficient in manner.

    @techreport{sharma_enhance_nodate,
    title = {To {Enhance} {Type} 4 {Clone} {Detection} in {Clone} {Testing}},
    url = {https://pdfs.semanticscholar.org/6423/5ec2fad562872ea2e43d4ba97c73df4a470d.pdf},
    abstract = {The means of software reuse is copying and modifying block of code that detect cloning. As a survey, it is observed that 20-30\% of module in system may be cloned. So it is mandatory to detect clones in system to reduce replication and improve reusability. Code clone is similar or duplicate code in source code that is created either by replication or some modifications. Clone is a persistent form of Software Reuse that effect on maintenance of large software. In previous research, the researcher emphasis on detect type 1, type 2, and type 3 of type of clones. The existing code clone detection tools are used to detect clone in source code. In this research, the enhancement in code clone detection algorithm will be proposed which detect type 4. In this work, firstly, use an existing algorithm to detect clone. Secondly, we put some intensification in that algorithm to detect clone. Thirdly, we combine algorithm with type 4 to detect a clone in particular function. By using type 4, the efficiency of clone detection is increased. Clone is detected in particular function, which is more accurate and more efficient in manner.},
    author = {Sharma, Swati and Mehta, Priyanka},
    year = {2016},
    journal = {International Journal of Computer Science and Information Technologies},
    pages = {967-971},
    keywords = {code clone, Software clone, clone detection, algorithm, clone testing, effectiveness of software}
    }

  • A. Sheneamer and J. Kalita, “Semantic clone detection using machine learning,” in 2016 15th ieee international conference on machine learning and applications (icmla), 2016, pp. 1024-1028.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7838289,
    author={A. {Sheneamer} and J. {Kalita}},
    booktitle={2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA)},
    title={Semantic Clone Detection Using Machine Learning},
    url= {https://ieeexplore.ieee.org/document/7838289},
    year={2016},
    volume={},
    number={},
    pages={1024-1028},}

  • F. Su, J. Bell, G. Kaiser, and S. Sethumadhavan, “Identifying functionally similar code in complex codebases,” in 2016 ieee 24th international conference on program comprehension (icpc), 2016, pp. 1-10.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7503720,
    author={F. {Su} and J. {Bell} and G. {Kaiser} and S. {Sethumadhavan}},
    booktitle={2016 IEEE 24th International Conference on Program Comprehension (ICPC)},
    title={Identifying functionally similar code in complex codebases},
    year={2016},
    url = {https://ieeexplore.ieee.org/document/7503720},
    volume={},
    number={},
    pages={1-10},}

  • F. Su, J. Bell, G. Kaiser, and S. Sethumadhavan, “Discovering Functionally Similar Code with Taint Analysis,” Journal of software: evolution and process, vol. 00, pp. 1-23, 2016. doi:10.1002/smr
    [BibTeX] [Abstract] [PDF]

    Identifying similar code in software systems can assist many software engineering tasks such as program understanding and software refactoring. While most approaches focus on identifying code that looks alike, some techniques aim at detecting code that functions alike. Detecting these functional clones (code that functions alike) in object oriented languages remains an open question because of the difficulty in exposing and comparing programs’ functionality effectively, in general cases undecidable. We propose a novel technique, In-Vivo Clone Detection, which detects functional clones in arbitrary programs by identifying and mining their inputs and outputs. The key insight is to use existing workloads to execute programs and then measure functional similarities between programs based on their inputs and outputs. Further, to identify inputs and outputs of programs appropriately, we use the techniques of static and dynamic data flow analysis. These enhancements mitigate the problems in object oriented languages with respect to identifying program I/Os as reported by prior work. We implement such techniques in our system, HitoshiIO, which is open source and freely available. Our experimental results show that HitoshiIO detects ∼900 and ∼2,000 functional clones by static and dynamic data flow analysis, respectively, across a corpus of 118 projects. In a random sample of the detected clones by the static data flow analysis, HitoshiIO achieves 68+\% true positive rate with only 15\% false positive rate.

    @article{su_discovering_2016,
    title = {Discovering {Functionally} {Similar} {Code} with {Taint} {Analysis}},
    volume = {00},
    url = {https://www.semanticscholar.org/paper/Discovering-Functionally-Similar-Code-with-Taint-Su-Bell/20f8ff1e20f999f332885ab4fe60beb5332e4f9a},
    doi = {10.1002/smr},
    abstract = {Identifying similar code in software systems can assist many software engineering tasks such as program understanding and software refactoring. While most approaches focus on identifying code that looks alike, some techniques aim at detecting code that functions alike. Detecting these functional clones (code that functions alike) in object oriented languages remains an open question because of the difficulty in exposing and comparing programs' functionality effectively, in general cases undecidable. We propose a novel technique, In-Vivo Clone Detection, which detects functional clones in arbitrary programs by identifying and mining their inputs and outputs. The key insight is to use existing workloads to execute programs and then measure functional similarities between programs based on their inputs and outputs. Further, to identify inputs and outputs of programs appropriately, we use the techniques of static and dynamic data flow analysis. These enhancements mitigate the problems in object oriented languages with respect to identifying program I/Os as reported by prior work. We implement such techniques in our system, HitoshiIO, which is open source and freely available. Our experimental results show that HitoshiIO detects ∼900 and ∼2,000 functional clones by static and dynamic data flow analysis, respectively, across a corpus of 118 projects. In a random sample of the detected clones by the static data flow analysis, HitoshiIO achieves 68+\% true positive rate with only 15\% false positive rate.},
    journal = {Journal of Software: Evolution and Process},
    author = {Su, Fang-Hsiang and Bell, Jonathan and Kaiser, Gail and Sethumadhavan, Simha},
    year = {2016},
    keywords = {code clone detection, dynamic analysis, data flow analysis, I/O behavior},
    pages = {1-23}
    }

  • J. Svajlenko and C. K. Roy, “Efficiently Measuring an Accurate and Generalized Clone Detection Precision using Clone Clustering,” in Proceedings of the 28th international conference on software engineering and knowledge engineering (seke), 2016, pp. 426-433. doi:10.18293/SEKE2016-150
    [BibTeX] [PDF]
    @inproceedings{svajlenko_efficiently_nodate,
    title = {Efficiently {Measuring} an {Accurate} and {Generalized} {Clone} {Detection} {Precision} using {Clone} {Clustering}},
    url = {https://pdfs.semanticscholar.org/664c/1739942bd6bd32d559afad5109e68bd962cd.pdf},
    doi = {10.18293/SEKE2016-150},
    booktitle = {Proceedings of the 28th International Conference on Software Engineering and Knowledge Engineering (SEKE)},
    year = {2016},
    pages = {426-433},
    author = {Svajlenko, Jeffrey and Roy, Chanchal K}
    }

  • W. T. Cheung, S. Ryu, and S. Kim, “Development nature matters: An empirical study of code clones in JavaScript applications,” Empirical software engineering, pp. 517-564, 2016. doi:10.1007/s10664-015-9368-6
    [BibTeX] [Abstract] [PDF]

    Code cloning is one of the active research areas in the software engineering community. Specifically, researchers have conducted numerous empirical studies on code cloning and reported that 7 \% to 23 \% of the code in a typical software system has been cloned. However, there was less awareness of code clones in dynamically-typed languages and most studies are limited to statically-typed languages such as Java, C, and C++. In addition, most previous studies did not consider different application domains such as standalone projects or web applications. As a result, very little is known about clones in dynamically-typed languages, such as JavaScript, in different application domains. In this paper, we report a large-scale clone detection experiment in a dynamically-typed programming language, JavaScript, for different application domains: web pages and standalone projects. Our experimental results showed that unlike JavaScript standalone projects, JavaScript web applications have 95 \% of inter-file clones and 91-97 \% of widely scattered clones. We observed that web application developers created clones intentionally and such clones may not be as risky as claimed in previous studies. Understanding the risks of cloning in web applications requires further studies, as cloning may be due to either good or bad intentions. Also, we identified unique development practices such as including browser-dependent or device-specific code in code clones of JavaScript web applications. This indicates that features of programming languages and technologies affect how developers duplicate code.

    @article{ting_cheung_development_nodate,
    title = {Development nature matters: {An} empirical study of code clones in {JavaScript} applications},
    url = {https://link.springer.com/content/pdf/10.1007/s10664-015-9368-6.pdf},
    doi = {10.1007/s10664-015-9368-6},
    abstract = {Code cloning is one of the active research areas in the software engineering community. Specifically, researchers have conducted numerous empirical studies on code cloning and reported that 7 \% to 23 \% of the code in a typical software system has been cloned. However, there was less awareness of code clones in dynamically-typed languages and most studies are limited to statically-typed languages such as Java, C, and C++. In addition, most previous studies did not consider different application domains such as standalone projects or web applications. As a result, very little is known about clones in dynamically-typed languages, such as JavaScript, in different application domains. In this paper, we report a large-scale clone detection experiment in a dynamically-typed programming language, JavaScript, for different application domains: web pages and standalone projects. Our experimental results showed that unlike JavaScript standalone projects, JavaScript web applications have 95 \% of inter-file clones and 91-97 \% of widely scattered clones. We observed that web application developers created clones intentionally and such clones may not be as risky as claimed in previous studies. Understanding the risks of cloning in web applications requires further studies, as cloning may be due to either good or bad intentions. Also, we identified unique development practices such as including browser-dependent or device-specific code in code clones of JavaScript web applications. This indicates that features of programming languages and technologies affect how developers duplicate code.},
    journal = {Empirical Software Engineering},
    author = {Cheung, Wai Ting and Ryu, Sukyoung and Kim, Sunghun},
    year = {2016},
    pages = {517-564},
    keywords = {Code clones, Clone properties, Cloning patterns, JavaScript, Software metrics, Web applications}
    }

  • J. J. Torres, M. C. Junior, and F. R. Santos, “Mining source code clones in a corporate environment,” in Information technology: new generations, 2016, pp. 531-541.
    [BibTeX] [PDF]
    @InProceedings{10.1007/978-3-319-32467-8_47,
    author={Torres, Jose J. and Junior, Methanias C. and Santos, Francisco R.},
    title={Mining Source Code Clones in a Corporate Environment},
    booktitle={Information Technology: New Generations},
    year={2016},
    url = {https://link.springer.com/chapter/10.1007/978-3-319-32467-8_47},
    publisher={Springer International Publishing},
    pages={531-541},
    }

  • Y. Udagawa, “Maximal frequent sequence mining for finding software clones,” in Proceedings of the 18th international conference on information integration and web-based applications and services, New York, NY, USA, 2016, p. 26–33. doi:10.1145/3011141.3011160
    [BibTeX] [PDF]
    @inproceedings{10.1145/3011141.3011160,
    author = {Udagawa, Yoshihisa},
    title = {Maximal Frequent Sequence Mining for Finding Software Clones},
    year = {2016},
    isbn = {9781450348072},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3011141.3011160},
    doi = {10.1145/3011141.3011160},
    booktitle = {Proceedings of the 18th International Conference on Information Integration and Web-Based Applications and Services},
    pages = {26--33},
    numpages = {8},
    keywords = {frequent sequence, maximal frequent sequence, software clone, control statement, method identifier, Java source code},
    location = {Singapore, Singapore},
    series = {iiWAS ’16}
    }

  • R. van Tonder and C. Le Goues, “Defending against the attack of the micro-clones,” in 2016 ieee 24th international conference on program comprehension (icpc), 2016, pp. 1-4.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7503736,
    author={R. {van Tonder} and C. {Le Goues}},
    booktitle={2016 IEEE 24th International Conference on Program Comprehension (ICPC)},
    title={Defending against the attack of the micro-clones},
    year={2016},
    url = {https://ieeexplore.ieee.org/document/7503736},
    volume={},
    number={},
    pages={1-4},
    }

  • S. Wagner, A. Abdulkhaleq, K. Kaya, and A. Paar, “On the relationship of inconsistent software clones and faults: an empirical study,” in 2016 ieee 23rd international conference on software analysis, evolution, and reengineering (saner), 2016, pp. 79-89.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7476632,
    author={S. {Wagner} and A. {Abdulkhaleq} and K. {Kaya} and A. {Paar}},
    booktitle={2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER)},
    title={On the Relationship of Inconsistent Software Clones and Faults: An Empirical Study},
    year={2016},
    url = {https://ieeexplore.ieee.org/abstract/document/7476632/},
    volume={1},
    number={},
    pages={79-89},
    }

  • S. Numata, N. Yoshida, E. Choi, and K. Inoue, “On the effectiveness of vector-based approach for supporting simultaneous editing of software clones,” in International conference on product-focused software process improvement, 2016, pp. 560-567. doi:10.1007/978-3-319-49094-6_41
    [BibTeX] [PDF]
    @inproceedings{inproceedings,
    author = {Numata, Seiya and Yoshida, Norihiro and Choi, Eunjong and Inoue, Katsuro},
    year = {2016},
    month = {11},
    pages = {560-567},
    booktitle = {International Conference on Product-Focused Software Process Improvement},
    title = {On the Effectiveness of Vector-Based Approach for Supporting Simultaneous Editing of Software Clones},
    doi = {10.1007/978-3-319-49094-6_41},
    url = {https://www.researchgate.net/publication/312693231_On_the_Effectiveness_of_Vector-Based_Approach_for_Supporting_Simultaneous_Editing_of_Software_Clones}
    }

  • M. F. Zibran, “Towards implementation of an integrated clone management infrastructure,” in 2016 ieee 23rd international conference on software analysis, evolution, and reengineering (saner), 2016, pp. 60-61.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7476798,
    author={M. F. {Zibran}},
    booktitle={2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER)},
    title={Towards Implementation of an Integrated Clone Management Infrastructure},
    year={2016},
    url = {https://ieeexplore.ieee.org/abstract/document/7476798},
    volume={3},
    number={},
    pages={60-61},
    }

2015

  • M. Akhin and A. Suhinin, “Discovering clones in software: from complex algorithms to everyday desktop tool,” in Proceedings of the 11th central & eastern european software engineering conference in russia, New York, NY, USA, 2015. doi:10.1145/2855667.2855676
    [BibTeX] [PDF]
    @inproceedings{10.1145/2855667.2855676,
    author = {Akhin, Marat and Suhinin, Alexandr},
    title = {Discovering Clones in Software: From Complex Algorithms to Everyday Desktop Tool},
    year = {2015},
    isbn = {9781450341301},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2855667.2855676},
    doi = {10.1145/2855667.2855676},
    booktitle = {Proceedings of the 11th Central \& Eastern European Software Engineering Conference in Russia},
    articleno = {8},
    numpages = {6},
    keywords = {suffix trie, clone detection, IDE},
    location = {Moscow, Russia},
    series = {CEE-SECR ’15}
    }

  • M. O. Elish and Y. Al-Ghamdi, “Fault density analysis of object-oriented classes in presence of code clones,” in Proceedings of the 19th international conference on evaluation and assessment in software engineering, New York, NY, USA, 2015. doi:10.1145/2745802.2745811
    [BibTeX] [PDF]
    @inproceedings{10.1145/2745802.2745811,
    author = {Elish, Mahmoud O. and Al-Ghamdi, Yasser},
    title = {Fault Density Analysis of Object-Oriented Classes in Presence of Code Clones},
    year = {2015},
    isbn = {9781450333504},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2745802.2745811},
    doi = {10.1145/2745802.2745811},
    booktitle = {Proceedings of the 19th International Conference on Evaluation and Assessment in Software Engineering},
    articleno = {10},
    numpages = {7},
    keywords = {software quality, code cloning, fault density, software metrics},
    location = {Nanjing, China},
    series = {EASE ’15}
    }

  • R. Gauci, “Smelling out Code Clones: Clone Detection Tool Evaluation and Corresponding Challenges,” Corr, 2015.
    [BibTeX] [Abstract] [PDF]

    Software clones have been an active area of research for the past two decades. However, although numerous clone detection tools are now available, only a small fraction of the literature has focused on tool evaluation, and this is in fact still an open problem. This is mostly due to the fact that standard information retrieval metrics such as recall and precision require a priori knowledge of clones already in the system. Detection tools also typically have a large number of parameters which are difficult to fine-tune for optimal performance on a particular software system, and different outputs produced by different tools add to the complexity of comparing one tool to another. In this review, we further explore the reasons why tool evaluation is still an open challenge, and present the current tools and frameworks targeted at mitigating these problems, focusing on the current standard benchmarks used to evaluate modern clone detection tools, and also presenting a recent method aimed at finding optimal tool configurations.

    @article{gauci_smelling_2015,
    title = {Smelling out {Code} {Clones}: {Clone} {Detection} {Tool} {Evaluation} and {Corresponding} {Challenges}},
    url = {http://arxiv.org/abs/1503.00711},
    abstract = {Software clones have been an active area of research for the past two decades. However, although numerous clone detection tools are now available, only a small fraction of the literature has focused on tool evaluation, and this is in fact still an open problem. This is mostly due to the fact that standard information retrieval metrics such as recall and precision require a priori knowledge of clones already in the system. Detection tools also typically have a large number of parameters which are difficult to fine-tune for optimal performance on a particular software system, and different outputs produced by different tools add to the complexity of comparing one tool to another. In this review, we further explore the reasons why tool evaluation is still an open challenge, and present the current tools and frameworks targeted at mitigating these problems, focusing on the current standard benchmarks used to evaluate modern clone detection tools, and also presenting a recent method aimed at finding optimal tool configurations.},
    author = {Gauci, Rachel},
    month = mar,
    year = {2015},
    journal = {CoRR},
    note = {\_eprint: 1503.00711}
    }

  • B. Joshi, P. Budhathoki, W. L. Woon, and D. Svetinovic, “Software clone detection using clustering approach,” in Proceedings, part ii, of the 22nd international conference on neural information processing – volume 9490, Berlin, Heidelberg, 2015, p. 520–527. doi:10.1007/978-3-319-26535-3_59
    [BibTeX] [PDF]
    @inproceedings{10.1007/978-3-319-26535-3_59,
    author = {Joshi, Bikash and Budhathoki, Puskar and Woon, Wei Lee and Svetinovic, Davor},
    title = {Software Clone Detection Using Clustering Approach},
    year = {2015},
    isbn = {9783319265346},
    publisher = {Springer-Verlag},
    address = {Berlin, Heidelberg},
    url = {https://doi.org/10.1007/978-3-319-26535-3_59},
    doi = {10.1007/978-3-319-26535-3_59},
    booktitle = {Proceedings, Part II, of the 22nd International Conference on Neural Information Processing - Volume 9490},
    pages = {520–527},
    numpages = {8},
    keywords = {Function clones, Clone detection, Data mining, Software metrics},
    location = {Istanbul, Turkey},
    series = {ICONIP 2015}
    }

  • M. Staron, W. Meding, P. Eriksson, J. Nilsson, N. Lövgren, and P. Österström, “Classifying Obstructive and Nonobstructive Code Clones of Type I Using Simplified Classification Scheme: A Case Study,” Advances in software engineering, vol. 2015, pp. 1-18, 2015. doi:10.1155/2015/829389
    [BibTeX] [Abstract] [PDF]

    Code cloning is a part of many commercial and open source development products. Multiple methods for detecting code clones have been developed and finding the clones is often used in modern quality assurance tools in industry. There is no consensus whether the detected clones are negative for the product and therefore the detected clones are often left unmanaged in the product code base. In this paper we investigate how obstructive code clones of Type I (duplicated exact code fragments) are in large software systems from the perspective of the quality of the product after the release. We conduct a case study at Ericsson and three of its large products, which handle mobile data traffic. We show how to use automated analogy-based classification to decrease the classification effort required to determine whether a clone pair should be refactored or remain untouched. The automated method allows classifying 96\% of Type I clones (both algorithms and data declarations) leaving the remaining 4\% for the manual classification. The results show that cloning is common in the studied commercial software, but that only 1\% of these clones are potentially obstructive and can jeopardize the quality of the product if left unmanaged.

    @article{staron_classifying_2015,
    title = {Classifying {Obstructive} and {Nonobstructive} {Code} {Clones} of {Type} {I} {Using} {Simplified} {Classification} {Scheme}: {A} {Case} {Study}},
    url = {http://dx.doi.org/10.1155/2015/829389},
    doi = {10.1155/2015/829389},
    abstract = {Code cloning is a part of many commercial and open source development products. Multiple methods for detecting code clones have been developed and finding the clones is often used in modern quality assurance tools in industry. There is no consensus whether the detected clones are negative for the product and therefore the detected clones are often left unmanaged in the product code base. In this paper we investigate how obstructive code clones of Type I (duplicated exact code fragments) are in large software systems from the perspective of the quality of the product after the release. We conduct a case study at Ericsson and three of its large products, which handle mobile data traffic. We show how to use automated analogy-based classification to decrease the classification effort required to determine whether a clone pair should be refactored or remain untouched. The automated method allows classifying 96\% of Type I clones (both algorithms and data declarations) leaving the remaining 4\% for the manual classification. The results show that cloning is common in the studied commercial software, but that only 1\% of these clones are potentially obstructive and can jeopardize the quality of the product if left unmanaged.},
    journal = {Advances in Software Engineering},
    volume = {2015},
    author = {Staron, Miroslaw and Meding, Wilhelm and Eriksson, Peter and Nilsson, Jimmy and Lövgren, Nils and Österström, Per},
    year = {2015},
    pages = {1-18}
    }

  • H. Wang, Y. Guo, Z. Ma, and X. Chen, “Wukong: a scalable and accurate two-phase approach to android app clone detection,” in Proceedings of the 2015 international symposium on software testing and analysis, New York, NY, USA, 2015, p. 71–82. doi:10.1145/2771783.2771795
    [BibTeX] [PDF]
    @inproceedings{10.1145/2771783.2771795,
    author = {Wang, Haoyu and Guo, Yao and Ma, Ziang and Chen, Xiangqun},
    title = {WuKong: A Scalable and Accurate Two-Phase Approach to Android App Clone Detection},
    year = {2015},
    isbn = {9781450336208},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2771783.2771795},
    doi = {10.1145/2771783.2771795},
    booktitle = {Proceedings of the 2015 International Symposium on Software Testing and Analysis},
    pages = {71–82},
    numpages = {12},
    keywords = {third-party library, mobile applications, Android, Clone detection, repackaging},
    location = {Baltimore, MD, USA},
    series = {ISSTA 2015}
    }

  • T. M. Ahmed, W. Shang, and A. E. Hassan, “An empirical study of the copy and paste behavior during development,” Working conference on mining software repositories, 2015. doi:10.1109/MSR.2015.17
    [BibTeX] [Abstract] [PDF]

    Developers frequently employ Copy and Paste. However, little is known about the copy and paste behavior during development. To better understand the copy and paste behavior, automated approaches are proposed to identify cloned code. However, such automated approaches can only identify the location of the code that has been copied and pasted, but little is known about the context of the copy and paste. On the other hand, prior research studying actual copy and paste behavior is based on a small number of users in an experimental setup. In this paper, we study the behavior of developers copying and pasting code while using the Eclipse IDE. We mine the usage data of over 20,000 Eclipse users. We aim to explore the different patterns of Copy and Paste (C&P) that are used by Eclipse users during development. We compare such usage patterns to the regular users’ usage of copy and paste during non-development tasks reported in earlier studies. Our findings instruct builders of future IDEs. We find that developers’ C&P behavior is considerably different from the behavior of regular users. For example, developers tend to perform more frequent C&P in the same file contrary to regular users, who tend to perform C&P across different windows. Moreover, we find that C&P across different programming languages is a common behavior as we extracted more than 75,000 C&P incidents across different programming languages. Such a finding highlights the need for clone detection techniques that can detect code clones across different programming languages.

    @article{ahmed_empirical_nodate,
    title = {An Empirical Study of the Copy and Paste Behavior during Development},
    url = {https://www.researchgate.net/publication/281117910},
    doi = {10.1109/MSR.2015.17},
    abstract = {Developers frequently employ Copy and Paste. However, little is known about the copy and paste behavior during development. To better understand the copy and paste behavior, automated approaches are proposed to identify cloned code. However, such automated approaches can only identify the location of the code that has been copied and pasted, but little is known about the context of the copy and paste. On the other hand, prior research studying actual copy and paste behavior is based on a small number of users in an experimental setup. In this paper, we study the behavior of developers copying and pasting code while using the Eclipse IDE. We mine the usage data of over 20,000 Eclipse users. We aim to explore the different patterns of Copy and Paste (C\&P) that are used by Eclipse users during development. We compare such usage patterns to the regular users' usage of copy and paste during non-development tasks reported in earlier studies. Our findings instruct builders of future IDEs. We find that developers' C\&P behavior is considerably different from the behavior of regular users. For example, developers tend to perform more frequent C\&P in the same file contrary to regular users, who tend to perform C\&P across different windows. Moreover, we find that C\&P across different programming languages is a common behavior as we extracted more than 75,000 C\&P incidents across different programming languages. Such a finding highlights the need for clone detection techniques that can detect code clones across different programming languages.},
    journal = {Working Conference on Mining Software Repositories},
    author = {Ahmed, Tarek M and Shang, Weiyi and Hassan, Ahmed E},
    year = {2015}
    }

  • H. A. Basit, H. S. Khan, F. Hamid, and I. Suhail, “Tool support for managing method clones,” in 2015 ieee 9th international workshop on software clones (iwsc), 2015, pp. 40-46.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7069888,
    author={H. A. {Basit} and H. S. {Khan} and F. {Hamid} and I. {Suhail}},
    booktitle={2015 IEEE 9th International Workshop on Software Clones (IWSC)},
    title={Tool support for managing method clones},
    year={2015},
    url = {https://ieeexplore.ieee.org/abstract/document/7069888/},
    volume={},
    number={},
    pages={40-46},
    }

  • H. Basit, M. Hammad, and R. Koschke, “A survey on goal-oriented visualization of clone data,” in Ieee 3rd working conference on software visualization (vissoft), 2015, pp. 46-55. doi:10.1109/VISSOFT.2015.7332414
    [BibTeX] [PDF]
    @inproceedings{inproceedings,
    author = {Basit, Hamid and Hammad, Muhammad and Koschke, Rainer},
    year = {2015},
    month = {09},
    pages = {46-55},
    url = {https://www.researchgate.net/publication/308106751},
    booktitle = {IEEE 3rd Working Conference on Software Visualization (VISSOFT)},
    title = {A survey on goal-oriented visualization of clone data},
    doi = {10.1109/VISSOFT.2015.7332414}
    }

  • H. A. Basit, M. Hammad, S. Jarzabek, and R. Koschke, “What do we need to know about clones? deriving information needs from user goals,” in 2015 ieee 9th international workshop on software clones (iwsc), 2015, pp. 51-57.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7069891,
    author={H. A. {Basit} and M. {Hammad} and S. {Jarzabek} and R. {Koschke}},
    booktitle={2015 IEEE 9th International Workshop on Software Clones (IWSC)},
    title={What do we need to know about clones? deriving information needs from user goals},
    year={2015},
    url = {https://ieeexplore.ieee.org/document/7069891/footnotes#footnotes},
    volume={},
    number={},
    pages={51-57},
    }

  • A. Charpentier, J. Falleri, D. Lo, and L. Réveillère, “An empirical assessment of bellon’s clone benchmark,” in Proceedings of the 19th international conference on evaluation and assessment in software engineering, New York, NY, USA, 2015. doi:10.1145/2745802.2745821
    [BibTeX] [PDF]
    @inproceedings{10.1145/2745802.2745821,
    author = {Charpentier, Alan and Falleri, Jean-R\'{e}my and Lo, David and R\'{e}veill\`{e}re, Laurent},
    title = {An Empirical Assessment of Bellon’s Clone Benchmark},
    year = {2015},
    isbn = {9781450333504},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2745802.2745821},
    doi = {10.1145/2745802.2745821},
    booktitle = {Proceedings of the 19th International Conference on Evaluation and Assessment in Software Engineering},
    articleno = {20},
    numpages = {10},
    keywords = {software metrics, code clone, empirical study},
    location = {Nanjing, China},
    series = {EASE ’15}
    }

  • J. Chen, M. H. Alalfi, T. R. Dean, and Y. Zou, “Detecting Android Malware Using Clone Detection,” Journal of computer science and technology, vol. 30, iss. 5, pp. 942-956, 2015. doi:10.1007/s11390-015-1573-7
    [BibTeX] [Abstract] [PDF]

    Android is currently one of the most popular smartphone operating systems. However, Android has the largest share of global mobile malware and significant public attention has been brought to the security issues of Android. In this paper, we investigate the use of a clone detector to identify known Android malware. We collect a set of Android applications known to contain malware and a set of benign applications. We extract the Java source code from the binary code of the applications and use NiCad, a near-miss clone detector, to find the classes of clones in a small subset of the malicious applications. We then use these clone classes as a signature to find similar source files in the rest of the malicious applications. The benign collection is used as a control group. In our evaluation, we successfully decompile more than 1 000 malicious apps in 19 malware families. Our results show that using a small portion of malicious applications as a training set can detect 95\% of previously known malware with very low false positives and high accuracy at 96.88\%. Our method can effectively and reliably pinpoint malicious applications that belong to certain malware families.

    @article{chen_detecting_2015,
    title = {Detecting {Android} {Malware} {Using} {Clone} {Detection}},
    volume = {30},
    url = {http://www.forbes.com/sites/gordonkelly/2014/03/24/report-97-of-mobile-malware-is-on-android-this-is-the-easy-way-you-st-},
    doi = {10.1007/s11390-015-1573-7},
    abstract = {Android is currently one of the most popular smartphone operating systems. However, Android has the largest share of global mobile malware and significant public attention has been brought to the security issues of Android. In this paper, we investigate the use of a clone detector to identify known Android malware. We collect a set of Android applications known to contain malware and a set of benign applications. We extract the Java source code from the binary code of the applications and use NiCad, a near-miss clone detector, to find the classes of clones in a small subset of the malicious applications. We then use these clone classes as a signature to find similar source files in the rest of the malicious applications. The benign collection is used as a control group. In our evaluation, we successfully decompile more than 1 000 malicious apps in 19 malware families. Our results show that using a small portion of malicious applications as a training set can detect 95\% of previously known malware with very low false positives and high accuracy at 96.88\%. Our method can effectively and reliably pinpoint malicious applications that belong to certain malware families.},
    number = {5},
    journal = {Journal of Computer Science and Technology},
    author = {Chen, Jian and Alalfi, Manar H and Dean, Thomas R and Zou, Ying},
    month = sep,
    year = {2015},
    note = {Publisher: Springer New York LLC},
    keywords = {Android, clone detection, malware},
    pages = {942-956}
    }

  • M. Claes, T. Mens, N. Tabout, and P. Grosjean, “An empirical study of identical function clones in cran,” in 2015 ieee 9th international workshop on software clones (iwsc), 2015, pp. 19-25.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7069885,
    author={Claes, Maëlick and Mens, Tom and Tabout, Narjisse and Grosjean, Philippe},
    booktitle={2015 IEEE 9th International Workshop on Software Clones (IWSC)},
    title={An empirical study of identical function clones in CRAN},
    year={2015},
    volume={},
    url = {https://ieeexplore.ieee.org/document/7069885},
    number={},
    pages={19-25},
    }

  • J. R. Cordy, “Simone: architecture-sensitive near-miss clone detection for simulink models,” in 2015 first international workshop on automotive software architecture (wasa), 2015, pp. 1-2.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7447218,
    author={Cordy, James R},
    booktitle={2015 First International Workshop on Automotive Software Architecture (WASA)},
    title={SIMONE: architecture-sensitive near-miss clone detection for Simulink models},
    year={2015},
    url = {https://ieeexplore.ieee.org/document/7447218},
    volume={},
    number={},
    pages={1-2},
    }

  • S. Dang and S. A. Wani, “Survey based analysis of effect of code clones on software quality,” International journal of engineering and technical research, pp. 371-379, 2015. doi:10.17577/IJERTV4IS030495
    [BibTeX] [Abstract] [PDF]

    Code clones are similar code portions. Cloning is a process of duplicating code segments by copy-paste activities that is a common activity in software development. It is believed that the presence of code clone is one of the factors that have a great impact on software quality attributes. In literature many techniques have been proposed to detect and eliminate code clones on this basis. Various research efforts are being performed to reduce somber problems caused by code clones. This paper presents the study of the effect of code clones on software quality. In this paper an industrial study is presented to understand impact of code clones on a software system from software developer’s point of view. This study involves a questionnaire survey and collects enough data about the reasons behind the cloning activity and the impact of code clones on a software system. The results of the study show that clones have a harmful effect on the system. This study also suggests that maintenance is the mostly effected software quality attribute.

    @article{dang_survey_2015,
    title = {Survey Based Analysis of Effect of Code Clones on Software Quality},
    url = {https://www.researchgate.net/publication/274326181_Survey_Based_Analysis_of_Effect_of_Code_Clones_on_Software_Quality},
    doi = {10.17577/IJERTV4IS030495},
    abstract = {Code clones are similar code portions. Cloning is a process of duplicating code segments by copy-paste activities that is a common activity in software development. It is believed that the presence of code clone is one of the factors that have a great impact on software quality attributes. In literature many techniques have been proposed to detect and eliminate code clones on this basis. Various research efforts are being performed to reduce somber problems caused by code clones. This paper presents the study of the effect of code clones on software quality. In this paper an industrial study is presented to understand impact of code clones on a software system from software developer's point of view. This study involves a questionnaire survey and collects enough data about the reasons behind the cloning activity and the impact of code clones on a software system. The results of the study show that clones have a harmful effect on the system. This study also suggests that maintenance is the mostly effected software quality attribute.},
    journal = {International Journal of Engineering and Technical Research},
    author = {Dang, Shilpa and Wani, Shahid Ahmad},
    year = {2015},
    pages = {371-379},
    keywords = {Code Clones, Abstract Syntax Tree (AST), Program Dependence Graph (PDG)}
    }

  • A. El-Matarawy, M. El-Ramly, and R. Bahgat, “Code Clone Detection using Sequential Pattern Mining,” International journal of computer applications, vol. 127, iss. 2, 2015.
    [BibTeX] [Abstract] [PDF]

    This paper presents a new technique for clone detection using sequential pattern mining titled EgyCD. Over the last decade many techniques and tools for software clone detection have been proposed such as textual approaches, lexical approaches, syntactic approaches, semantic approaches …, etc. In this paper, we explore the potential of data mining techniques in clone detection. In particular, we developed a clone detection technique based on sequential pattern mining (SPM). The source code is treated as a sequence of transactions processed by the SPM algorithm to find frequent itemsets. We run three experiments to discover code clones of Type I, Type II and Type III and for plagiarism detection. We compared the results with other established code clone detectors. Our technique discovers all code clones in the source code and hence it is slower than the compared code clone detectors since they discover few code clones compared with EgyCD.

    @article{el-matarawy_code_2015,
    title = {Code {Clone} {Detection} using {Sequential} {Pattern} {Mining}},
    url = {http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.735.1950&rep=rep1&type=pdf},
    abstract = {This paper presents a new technique for clone detection using sequential pattern mining titled EgyCD. Over the last decade many techniques and tools for software clone detection have been proposed such as textual approaches, lexical approaches, syntactic approaches, semantic approaches \ldots, etc. In this paper, we explore the potential of data mining techniques in clone detection. In particular, we developed a clone detection technique based on sequential pattern mining (SPM). The source code is treated as a sequence of transactions processed by the SPM algorithm to find frequent itemsets. We run three experiments to discover code clones of Type I, Type II and Type III and for plagiarism detection. We compared the results with other established code clone detectors. Our technique discovers all code clones in the source code and hence it is slower than the compared code clone detectors since they discover few code clones compared with EgyCD.},
    number = {2},
    author = {El-Matarawy, Ali and El-Ramly, Mohammad and Bahgat, Reem},
    year = {2015},
    journal = {International Journal of Computer Applications},
    volume = {127},
    keywords = {Clone Detection, Data Mining, Sequential Pattern Mining},
    issn = {0975-8887}
    }

  • P. Estefó, R. Robbes, J. Fabry, and R. Labs, “Code duplication in ros launchfiles,” in 2015 34th international conference of the chilean computer science society (sccc), 2015, pp. 1-6.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7416575,
    author={Estefó, Pablo and Robbes, Romain and Fabry, Johan and Labs, Rych},
    booktitle={2015 34th International Conference of the Chilean Computer Science Society (SCCC)},
    title={Code duplication in ROS launchfiles},
    year={2015},
    url = {https://ieeexplore.ieee.org/document/7416575},
    volume={},
    number={},
    pages={1-6},
    }

  • “Comprehensible presentation of clone detection results,” in Aip conference proceedings, 2015. doi:10.1063/1.4912559
    [BibTeX] [PDF]
    @inproceedings{fordos_comprehensible_2015,
    title = {Comprehensible presentation of clone detection results},
    volume = {1648},
    isbn = {978-0-7354-1287-3},
    doi = {10.1063/1.4912559},
    url = {https://aip.scitation.org/doi/10.1063/1.4912559},
    month = mar,
    year = {2015},
    booktitle = {AIP Conference Proceedings},
    note = {ISSN: 15517616},
    keywords = {comprehensive clone detection results, grouping, maximal cliques}
    }

  • E. Grover and E. Rana, “Detection of Non Continguous Clones in Software using Program Slicing,” International journal of advanced research in computer engineering & technology, vol. 4, 2015.
    [BibTeX] [PDF]
    @article{grover_detection_nodate,
    title = {Detection of {Non} {Continguous} {Clones} in {Software} using {Program} {Slicing}},
    author = {Grover, ER and Rana, Er},
    volume = {4},
    url = {http://ijarcet.org/wp-content/uploads/IJARCET-VOL-4-ISSUE-5-1784-1789.pdf},
    journal = {International Journal of Advanced Research in Computer Engineering \& Technology},
    year = {2015},
    }

  • S. Gupta and P. C. Gupta, “Algorithm to Detect Non-Contiguous Clones with High Precision,” International journal of innovations in engineering and technology, pp. 215-219, 2015.
    [BibTeX] [Abstract] [PDF]

    Researchers have proved that duplication of code occur frequently in several systems because of various reasons [5,7]. It had been proved that almost 70\% of the effort is wasted in resolving the clones during maintenance [8] since if the clones are not removed then it will lead to further more problems like hindrance to comprehension of the program, independent evolution of clones, bad design etc. Although code clones are a major problem but still they are evolved in the system because of the limitation of the programmer to finish the work as soon as possible. This paper focuses on detection of clones in large software systems so as to reduce the effort during maintenance. It describes a novel approach to design and implement a tool for detecting cloned codes in the system. The algorithm is developed in such a manner that it is precise and scalable with performance factor.

    @article{gupta_algorithm_nodate,
    title = {Algorithm to {Detect} {Non}-{Contiguous} {Clones} with {High} {Precision}},
    url = {http://ijiet.com/wp-content/uploads/2015/02/30.pdf},
    abstract = {Researchers have proved that duplication of code occur frequently in several systems because of various reasons [5,7]. It had been proved that almost 70\% of the effort is wasted in resolving the clones during maintenance [8] since if the clones are not removed then it will lead to further more problems like hindrance to comprehension of the program, independent evolution of clones, bad design etc. Although code clones are a major problem but still they are evolved in the system because of the limitation of the programmer to finish the work as soon as possible. This paper focuses on detection of clones in large software systems so as to reduce the effort during maintenance. It describes a novel approach to design and implement a tool for detecting cloned codes in the system. The algorithm is developed in such a manner that it is precise and scalable with performance factor.},
    author = {Gupta, Sonam and Gupta, P C},
    year = {2015},
    pages = {215-219},
    journal = {International Journal of Innovations in Engineering and Technology},
    keywords = {Software Maintenance, clones, collectors, statement grouping}
    }

  • H. Murakami, Y. Higo, and S. Kusumoto, “Clonepacker: a tool for clone set visualization,” in 2015 ieee 22nd international conference on software analysis, evolution, and reengineering (saner), 2015, pp. 474-478.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7081859,
    author={H. {Murakami} and Y. {Higo} and S. {Kusumoto}},
    booktitle={2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER)},
    title={ClonePacker: A tool for clone set visualization},
    year={2015},
    url = {https://ieeexplore.ieee.org/document/7081859},
    volume={},
    number={},
    pages={474-478},
    }

  • T. Kamiya, “An execution-semantic and content-and-context-based code-clone detection and analysis,” in 2015 ieee 9th international workshop on software clones (iwsc), 2015, pp. 1-7.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7069882,
    author={T. {Kamiya}},
    booktitle={2015 IEEE 9th International Workshop on Software Clones (IWSC)},
    title={An execution-semantic and content-and-context-based code-clone detection and analysis},
    year={2015},
    url = {https://ieeexplore.ieee.org/abstract/document/7069882/},
    volume={},
    number={},
    pages={1-7},
    }

  • B. Kaur and E. H. Kaur, “A Review of Clone Detection in UML Models,” Advances in computer science and information technology (acsit), vol. 2, iss. 7, pp. 27-32, 2015.
    [BibTeX] [Abstract] [PDF]

    Model Driven Engineering has become standard and important framework in software research field. Unified Modeling Language (UML) domain models are conceptual models which are used to design and develop software in software development life cycle. Models contain design level similarities, these are called model clones. Model clones are harmful for software maintenance as code clones and also lead to bad design. So number of clones need to be detected from UML domain models. Awareness of clones helps in reusable mechanism. Many techniques have been proposed for code clone detection but a few work has been done on model clone detection. In this paper review has been provided related to various techniques for detection of clones in UML models. Tree comparison technique is used to find similarity in two fragments of a model. Tree is less false positive because of minimum non-relevant matches. Suffix array technique is used to detect clones in class diagrams. Suffix array consumes minimum memory. NiCad Clone detector tool is a scalable and flexible tool to detect type-3 near-miss clones in behavioural models. This paper provides comparative features of these above different techniques.

    @article{kaur_review_nodate,
    title = {A {Review} of {Clone} {Detection} in {UML} {Models}},
    volume = {2},
    issn = {2393-9915},
    url = {http://www.krishisanskriti.org/ACSIT.html},
    abstract = {Model Driven Engineering has become standard and important framework in software research field. Unified Modeling Language (UML) domain models are conceptual models which are used to design and develop software in software development life cycle. Models contain design level similarities, these are called model clones. Model clones are harmful for software maintenance as code clones and also lead to bad design. So number of clones need to be detected from UML domain models. Awareness of clones helps in reusable mechanism. Many techniques have been proposed for code clone detection but a few work has been done on model clone detection. In this paper review has been provided related to various techniques for detection of clones in UML models. Tree comparison technique is used to find similarity in two fragments of a model. Tree is less false positive because of minimum non-relevant matches. Suffix array technique is used to detect clones in class diagrams. Suffix array consumes minimum memory. NiCad Clone detector tool is a scalable and flexible tool to detect type-3 near-miss clones in behavioural models. This paper provides comparative features of these above different techniques.},
    number = {7},
    journal = {Advances in Computer Science and Information Technology (ACSIT)},
    author = {Kaur, Balwinder and Kaur, Harpreet},
    note = {Publisher: Krishi Sanskriti Publications},
    keywords = {Code clones, Model clone detection, Model clones, UML models},
    pages = {27-32},
    year ={2015}
    }

  • I. Keivanloo, F. Zhang, and Y. Zou, “Threshold-free code clone detection for a large-scale heterogeneous java repository,” in 2015 ieee 22nd international conference on software analysis, evolution, and reengineering (saner), 2015, pp. 201-210.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7081830,
    author={I. {Keivanloo} and F. {Zhang} and Y. {Zou}},
    booktitle={2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER)},
    title={Threshold-free code clone detection for a large-scale heterogeneous Java repository},
    year={2015},
    url ={https://ieeexplore.ieee.org/document/7081830},
    volume={},
    number={},
    pages={201-210},}

  • S. Karus and K. Kilgi, “Code clone detection using wavelets,” in 2015 ieee 9th international workshop on software clones (iwsc), 2015, pp. 8-14.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7069883,
    author={S. {Karus} and K. {Kilgi}},
    booktitle={2015 IEEE 9th International Workshop on Software Clones (IWSC)},
    title={Code clone detection using wavelets},
    year={2015},
    url = {https://ieeexplore.ieee.org/document/7069883},
    volume={},
    number={},
    pages={8-14},}

  • D. E. Krutz, S. A. Malachowsky, and E. Shihab, “Examining the effectiveness of using concolic analysis to detect code clones,” in Proceedings of the 30th annual acm symposium on applied computing, New York, NY, USA, 2015, pp. 1610–1615. doi:10.1145/2695664.2695929
    [BibTeX] [PDF]
    @inproceedings{10.1145/2695664.2695929,
    author = {Krutz, Daniel E. and Malachowsky, Samuel A. and Shihab, Emad},
    title = {Examining the Effectiveness of Using Concolic Analysis to Detect Code Clones},
    year = {2015},
    isbn = {9781450331968},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2695664.2695929},
    doi = {10.1145/2695664.2695929},
    booktitle = {Proceedings of the 30th Annual ACM Symposium on Applied Computing},
    pages = {1610–1615},
    numpages = {6},
    keywords = {concolic analysis, software engineering, code clones},
    location = {Salamanca, Spain},
    series = {SAC ’15}
    }

  • Y. Lin, X. Peng, Z. Xing, D. Zheng, and W. Zhao, “Clone-based and interactive recommendation for modifying pasted code,” in Proceedings of the 2015 10th joint meeting on foundations of software engineering, New York, NY, USA, 2015, pp. 520–531. doi:10.1145/2786805.2786871
    [BibTeX] [PDF]
    @inproceedings{10.1145/2786805.2786871,
    author = {Lin, Yun and Peng, Xin and Xing, Zhenchang and Zheng, Diwen and Zhao, Wenyun},
    title = {Clone-Based and Interactive Recommendation for Modifying Pasted Code},
    year = {2015},
    isbn = {9781450336758},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2786805.2786871},
    doi = {10.1145/2786805.2786871},
    booktitle = {Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering},
    pages = {520–531},
    numpages = {12},
    keywords = {differencing, recommendation, code clone, copy and paste, reuse},
    location = {Bergamo, Italy},
    series = {ESEC/FSE 2015}
    }

  • S. Mallaiah and L. Rangarajan, “Duplicate code detection using control statements,” International journal of computer applications technology and research, vol. 4, iss. 10, pp. 728-736, 2015. doi:10.7753/IJCATR0410.1003
    [BibTeX] [Abstract] [PDF]

    Code clone detection is an important area of research as reusability is a key factor in software evolution. Duplicate code degrades the design and structure of software and software qualities like readability, changeability, maintainability. Code clone increases the maintenance cost as incorrect changes in copied code may lead to more errors. In this paper we address structural code similarity detection and propose new methods to detect structural clones using structure of control statements. By structure we mean order of control statements used in the source code. We have considered two orders of control structures: (i) Sequence of control statements as it appears (ii) Execution flow of control statements.

    @article{mallaiah_duplicate_2015,
    title = {Duplicate Code Detection using Control Statements},
    volume = {4},
    issn = {2319-8656},
    url = {https://www.researchgate.net/publication/282431515_Duplicate_Code_Detection_using_Control_Statements/references},
    doi = {10.7753/IJCATR0410.1003},
    abstract = {Code clone detection is an important area of research as reusability is a key factor in software evolution. Duplicate code degrades the design and structure of software and software qualities like readability, changeability, maintainability. Code clone increases the maintenance cost as incorrect changes in copied code may lead to more errors. In this paper we address structural code similarity detection and propose new methods to detect structural clones using structure of control statements. By structure we mean order of control statements used in the source code. We have considered two orders of control structures: (i) Sequence of control statements as it appears (ii) Execution flow of control statements.},
    number = {10},
    journal = {International Journal of Computer Applications Technology and Research},
    author = {Mallaiah, Sudhamani and Rangarajan, Lalitha},
    year = {2015},
    keywords = {Structural similarity, Control statements, Control structure, Execution flow, Similarity value},
    pages = {728-736}
    }

  • M. Mondal, C. K. Roy, and K. A. Schneider, “A comparative study on the bug-proneness of different types of code clones,” in 2015 ieee international conference on software maintenance and evolution (icsme), 2015, pp. 91-100.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7332455,
    author={M. {Mondal} and C. K. {Roy} and K. A. {Schneider}},
    booktitle={2015 IEEE International Conference on Software Maintenance and Evolution (ICSME)},
    title={A comparative study on the bug-proneness of different types of code clones},
    year={2015},
    url = {https://ieeexplore.ieee.org/abstract/document/7332455/},
    volume={},
    number={},
    pages={91-100},}

  • M. Mondal, C. K. Roy, and K. A. Schneider, “Spcp-miner: a tool for mining code clones that are important for refactoring or tracking,” in 2015 ieee 22nd international conference on software analysis, evolution, and reengineering (saner), 2015, pp. 484-488.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7081861,
    author={M. {Mondal} and C. K. {Roy} and K. A. {Schneider}},
    booktitle={2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER)},
    title={SPCP-Miner: A tool for mining code clones that are important for refactoring or tracking},
    year={2015},
    url = {https://ieeexplore.ieee.org/abstract/document/7081861},
    volume={},
    number={},
    pages={484-488},}

  • A. Mubarak-Ali, S. Syed-Mohamad, and S. Sulaiman, “Enhancing generic pipeline model for code clone detection using divide and conquer approach,” International arab journal of information technology, vol. 12, iss. 5, 2015.
    [BibTeX] [Abstract] [PDF]

    Code clone is known as identical copies of the same instances or fragments of source codes in software. Current code clone research focuses on the detection and analysis of code clones in order to help software developers identify code clones in source codes and reuse the source codes in order to decrease the maintenance cost. Many approaches such as textual based comparison approach, token based comparison and tree based comparison approach have been used to detect code clones. As software grows and becomes a legacy system, the complexity of these approaches in detecting code clones increases. Thus, this scenario makes it more difficult to detect code clones. Generic pipeline model is the most recent code clone detection that comprises five processes which are parsing process, pre-processing process, pooling process, comparing processes and filtering process to detect code clone. This research highlights the enhancement of the generic pipeline model using divide and conquer approach that involves concatenation process. The aim of this approach is to produce a better input for the generic pipeline model by processing smaller part of source code files before focusing on the large chunk of source codes in a single pipeline. We implement and apply the proposed approach with the support of a tool called Java Code Clone Detector (JCCD). The result obtained shows an improvement in the rate of code clone detection and overall runtime performance as compared to the existing generic pipeline model.

    @article{mubarak-ali_enhancing_2015,
    title = {Enhancing Generic Pipeline Model for Code Clone Detection Using Divide and Conquer Approach},
    url = {https://www.researchgate.net/publication/270760261_Enhancing_Generic_Pipeline_Model_for_Code_Clone_Detection_Using_Divide_and_Conquer_Approach},
    abstract = {Code clone is known as identical copies of the same instances or fragments of source codes in software. Current code clone research focuses on the detection and analysis of code clones in order to help software developers identify code clones in source codes and reuse the source codes in order to decrease the maintenance cost. Many approaches such as textual based comparison approach, token based comparison and tree based comparison approach have been used to detect code clones. As software grows and becomes a legacy system, the complexity of these approaches in detecting code clones increases. Thus, this scenario makes it more difficult to detect code clones. Generic pipeline model is the most recent code clone detection that comprises five processes which are parsing process, pre-processing process, pooling process, comparing processes and filtering process to detect code clone. This research highlights the enhancement of the generic pipeline model using divide and conquer approach that involves concatenation process. The aim of this approach is to produce a better input for the generic pipeline model by processing smaller part of source code files before focusing on the large chunk of source codes in a single pipeline. We implement and apply the proposed approach with the support of a tool called Java Code Clone Detector (JCCD). The result obtained shows an improvement in the rate of code clone detection and overall runtime performance as compared to the existing generic pipeline model.},
    volume = {12},
    number = {5},
    author = {Mubarak-Ali, Al-Fahim and Syed-Mohamad, Sharifah and Sulaiman, Shahida},
    year = {2015},
    note = {Publication Title: The International Arab Journal of Information Technology Volume: 12},
    journal = {International Arab Journal of Information Technology},
    keywords = {Code clone detection, divide and conquer approach, generic pipeline model}
    }

  • P. Pulkkinen, J. Holvitie, O. S. Nevalainen, and V. Leppänen, “Reusability based program clone detection: case study on large scale healthcare software system,” in Proceedings of the 16th international conference on computer systems and technologies, New York, NY, USA, 2015, pp. 90–97. doi:10.1145/2812428.2812471
    [BibTeX] [PDF]
    @inproceedings{10.1145/2812428.2812471,
    author = {Pulkkinen, Petri and Holvitie, Johannes and Nevalainen, Olli S. and Lepp\"{a}nen, Ville},
    title = {Reusability Based Program Clone Detection: Case Study on Large Scale Healthcare Software System},
    year = {2015},
    isbn = {9781450333573},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2812428.2812471},
    doi = {10.1145/2812428.2812471},
    booktitle = {Proceedings of the 16th International Conference on Computer Systems and Technologies},
    pages = {90–97},
    numpages = {8},
    keywords = {refactoring, reusability, clone detection},
    location = {Dublin, Ireland},
    series = {CompSysTech ’15}
    }

  • E. J. Rapos, A. Stevenson, M. H. Alalfi, and J. R. Cordy, “Simnav: simulink navigation of model clone classes,” in 2015 ieee 15th international working conference on source code analysis and manipulation (scam), 2015, pp. 241-246.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7335420,
    author={E. J. {Rapos} and A. {Stevenson} and M. H. {Alalfi} and J. R. {Cordy}},
    booktitle={2015 IEEE 15th International Working Conference on Source Code Analysis and Manipulation (SCAM)},
    title={SimNav: Simulink navigation of model clone classes},
    year={2015},
    url = {https://ieeexplore.ieee.org/document/7335420},
    volume={},
    number={},
    pages={241-246},}

  • R. Grover and N. Rana, “Various possibilities of Clone Detection in Software’s: A Review,” Ijrit international journal of research in information technology, vol. 3, pp. 407-413, 2015.
    [BibTeX] [Abstract] [PDF]

    Software clone detection involves detection of duplicated code from two source codes. As a result, software systems often contain sections of code that are very similar, called software clones or code clones. In bug detection if a bug is present in one code fragments then it have to checked to all similar copied code fragment and it results in more in bug detection Every clone detection technique requires an intermediate representation of program so that the matching algorithm can accurately detect clone in an efficient manner. Program slicing is one of the most widely used intermediate representations to detect code clones. A program slice is an independent part of the program which does not affect the behavior of remaining program. Thereafter, various algorithm is used that performs a matching between the computed variable dependencies. .The aim of this paper is that to study different type of clones and various possibilities of clone detection in softwares.

    @article{richa_grover_various_2015,
    title = {Various possibilities of {Clone} {Detection} in {Software}'s: {A} {Review}},
    volume = {3},
    issn = {2001-5569},
    url = {www.ijrit.com},
    abstract = {Software clone detection involves detection of duplicated code from two source codes. As a result, software systems often contain sections of code that are very similar, called software clones or code clones. In bug detection if a bug is present in one code fragments then it have to checked to all similar copied code fragment and it results in more in bug detection Every clone detection technique requires an intermediate representation of program so that the matching algorithm can accurately detect clone in an efficient manner. Program slicing is one of the most widely used intermediate representations to detect code clones. A program slice is an independent part of the program which does not affect the behavior of remaining program. Thereafter, various algorithm is used that performs a matching between the computed variable dependencies. .The aim of this paper is that to study different type of clones and various possibilities of clone detection in softwares.},
    journal = {IJRIT International Journal of Research in Information Technology},
    author = {Grover, Richa and Rana, Narender},
    year = {2015},
    keywords = {Software clones, Program slicing, Dead code, Matching algorithm, Variable dependencies},
    pages = {407-413}
    }

  • M. La Rosa, M. Dumas, C. C. Ekanayake, L. García-Bañuelos, J. Recker, and A. H. M. ter Hofstede, “Detecting approximate clones in business process model repositories,” Information systems, vol. 49, pp. 102-125, 2015. doi:10.1016/j.is.2014.11.010
    [BibTeX] [PDF]
    @article{LAROSA2015102,
    title = "Detecting approximate clones in business process model repositories",
    journal = "Information Systems",
    volume = "49",
    pages = "102-125",
    year = "2015",
    issn = "0306-4379",
    doi = "10.1016/j.is.2014.11.010",
    url = "http://www.sciencedirect.com/science/article/pii/S0306437914001860",
    author = "Marcello {La Rosa} and Marlon Dumas and Chathura C. Ekanayake and Luciano García-Bañuelos and Jan Recker and Arthur H.M. {ter Hofstede}",
    keywords = "Business process model, Clone detection, Model collection, Repository, Standardization",
    }

  • M. Shanmughasundaram and S. Subramani, “A Measurement of Similarity to Identify Identical Code Clones,” The international arab journal of information technology, vol. 12, iss. 6A, 2015.
    [BibTeX] [Abstract] [PDF]

    Code clones are described as a part of the program which is completely or partially similar to the other portions. In the earlier research the code clones have been detected using fingerprinting technique. The major challenge in our work was to group the code clones based on similarity measure. The proposed system measures the similarity based on similarity distance. The defined expression considers two parameters for calculating the similarity measure namely the similarity distance and the population of the clone. Thereby the code clones are clustered and ranked on the basis of their similarity measures. Indexing is used to interactively identify the clones which are caused due to inconsistent changes. As a result of this work all the identical clusters for most similar and more similar categories are identified.

    @article{shanmughasundaram_measurement_2015,
    title = {A {Measurement} of {Similarity} to {Identify} {Identical} {Code} {Clones}},
    url = {https://www.researchgate.net/profile/Saeed_Shafieian/publication/267840728_Comparison_of_Clone_Detection_Techniques/links/54b6aa760cf2bd04be324938.pdf},
    abstract = {Code clones are described as a part of the program which is completely or partially similar to the other portions. In the earlier research the code clones have been detected using fingerprinting technique. The major challenge in our work was to group the code clones based on similarity measure. The proposed system measures the similarity based on similarity distance. The defined expression considers two parameters for calculating the similarity measure namely the similarity distance and the population of the clone. Thereby the code clones are clustered and ranked on the basis of their similarity measures. Indexing is used to interactively identify the clones which are caused due to inconsistent changes. As a result of this work all the identical clusters for most similar and more similar categories are identified.},
    number = {6A},
    author = {Shanmughasundaram, Mythili and Subramani, Sarala},
    year = {2015},
    note = {Publication Title: The International Arab Journal of Information Technology Volume: 12},
    volume = {12},
    journal = {The International Arab Journal of Information Technology},
    keywords = {Clone detection, software clones, clustering, reuse, fingerprinting}
    }

  • A. Sheneamer and J. Kalita, “Code clone detection using coarse and fine-grained hybrid approaches,” in 2015 ieee seventh international conference on intelligent computing and information systems (icicis), 2015, pp. 472-480.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7397263,
    author={A. {Sheneamer} and J. {Kalita}},
    booktitle={2015 IEEE Seventh International Conference on Intelligent Computing and Information Systems (ICICIS)},
    title={Code clone detection using coarse and fine-grained hybrid approaches},
    url = {https://ieeexplore.ieee.org/abstract/document/7397263/},
    year={2015},
    volume={},
    number={},
    pages={472-480},
    }

  • M. Stephan and J. R. Cordy, “Identifying instances of model design patterns and antipatterns using model clone detection,” in 2015 ieee/acm 7th international workshop on modeling in software engineering, 2015, pp. 48-53.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7167402,
    author={M. {Stephan} and J. R. {Cordy}},
    booktitle={2015 IEEE/ACM 7th International Workshop on Modeling in Software Engineering},
    title={Identifying Instances of Model Design Patterns and Antipatterns Using Model Clone Detection},
    year={2015},
    url = {https://ieeexplore.ieee.org/document/7167402},
    volume={},
    number={},
    pages={48-53},}

  • N. Tsantalis, D. Mazinanian, and G. P. Krishnan, “Assessing the refactorability of software clones,” Ieee transactions on software engineering, vol. 41, iss. 11, pp. 1055-1090, 2015.
    [BibTeX] [PDF]
    @ARTICLE{7130676,
    author={N. {Tsantalis} and D. {Mazinanian} and G. P. {Krishnan}},
    journal={IEEE Transactions on Software Engineering},
    title={Assessing the Refactorability of Software Clones},
    url = {https://ieeexplore.ieee.org/document/7130676},
    year={2015},
    volume={41},
    number={11},
    pages={1055-1090},}

  • M. S. Uddin, V. Gaur, C. Gutwin, and C. K. Roy, “On the comprehension of code clone visualizations: a controlled study using eye tracking,” in 2015 ieee 15th international working conference on source code analysis and manipulation (scam), 2015, pp. 161-170.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7335412,
    author={M. S. {Uddin} and V. {Gaur} and C. {Gutwin} and C. K. {Roy}},
    booktitle={2015 IEEE 15th International Working Conference on Source Code Analysis and Manipulation (SCAM)},
    title={On the comprehension of code clone visualizations: A controlled study using eye tracking},
    year={2015},
    url = {https://ieeexplore.ieee.org/document/7335412},
    volume={},
    number={},
    pages={161-170},
    }

  • M. S. Uddin, C. K. Roy, and K. A. Schneider, “Towards convenient management of software clone codes in practice: an integrated approach,” in Proceedings of the 25th annual international conference on computer science and software engineering, USA, 2015, pp. 211–220.
    [BibTeX] [PDF]
    @inproceedings{10.5555/2886444.2886475,
    author = {Uddin, Md Sharif and Roy, Chanchal K. and Schneider, Kevin A.},
    title = {Towards Convenient Management of Software Clone Codes in Practice: An Integrated Approach},
    year = {2015},
    publisher = {IBM Corp.},
    address = {USA},
    url = {https://dl.acm.org/doi/10.5555/2886444.2886475},
    booktitle = {Proceedings of the 25th Annual International Conference on Computer Science and Software Engineering},
    pages = {211–220},
    numpages = {10},
    keywords = {software clone, integrated clone management, IDE plugin},
    location = {Markham, Canada},
    series = {CASCON ’15}
    }

  • J. Yang, K. Hotta, Y. Higo, and H. Igaki, “Classification model for code clones based on machine learning,” Empirical software engineering, pp. 1095-1125, 2015.
    [BibTeX] [PDF]
    @article{yang_classification_nodate,
    title = {Classification model for code clones based on machine learning},
    url = {https://doi.org/10.1007/s10664-014-9316-x},
    journal = {Empirical Software Engineering},
    author = {Yang, J and Hotta, K and Higo, Y and Igaki, H},
    pages = {1095-1125},
    year = {2015}
    }

  • M. F. Zibran, “Analysis and visualization for clone refactoring,” in 2015 ieee 9th international workshop on software clones (iwsc), 2015, pp. 47-48.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7069889,
    author={M. F. {Zibran}},
    booktitle={2015 IEEE 9th International Workshop on Software Clones (IWSC)},
    url = {https://ieeexplore.ieee.org/abstract/document/7069889/},
    title={Analysis and visualization for clone refactoring},
    year={2015},
    volume={},
    number={},
    pages={47-48},
    }

2014

  • K. Hotta, J. Yang, Y. Higo, and S. Kusumoto, “How accurate is coarse-grained clone detection?: comparision with fine-grained detectors,” Electronic communications of the easst, vol. 63, 2014.
    [BibTeX] [PDF]
    @article{hotta_proceedings_2014,
    title = {How Accurate Is Coarse-grained Clone Detection?: Comparision with Fine-grained Detectors},
    note = {Proceedings of the Eighth International Workshop on Software Clones (IWSC 2014)},
    volume = {63},
    issn = {1863-2122},
    url = {http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.685.7674&rep=rep1&type=pdf},
    journal = {Electronic Communications of the EASST},
    author = {Hotta, Keisuke and Yang, Jiachen and Higo, Yoshiki and Kusumoto, Shinji},
    year = {2014},
    keywords = {Software evolution, Clone detection, Mining software repositories}
    }

  • D. E. Krutz and W. Le, “A code clone oracle,” in Proceedings of the 11th working conference on mining software repositories, New York, NY, USA, 2014, pp. 388-391. doi:10.1145/2597073.2597127
    [BibTeX] [PDF]
    @inproceedings{10.1145/2597073.2597127,
    author = {Krutz, Daniel E. and Le, Wei},
    title = {A Code Clone Oracle},
    year = {2014},
    isbn = {9781450328630},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2597073.2597127},
    doi = {10.1145/2597073.2597127},
    booktitle = {Proceedings of the 11th Working Conference on Mining Software Repositories},
    pages = {388-391},
    numpages = {4},
    keywords = {Code Clone Detection, Software Engineering, Clone Oracle},
    location = {Hyderabad, India},
    series = {MSR 2014}
    }

  • M. Lillack, C. Bucholdt, and D. Schilling, “Detection of code clones in software generators,” in Proceedings of the 6th international workshop on feature-oriented software development, New York, NY, USA, 2014, pp. 37-44. doi:10.1145/2660190.2662116
    [BibTeX] [PDF]
    @inproceedings{10.1145/2660190.2662116,
    author = {Lillack, Max and Bucholdt, Christian and Schilling, Daniela},
    title = {Detection of Code Clones in Software Generators},
    year = {2014},
    isbn = {9781450329804},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2660190.2662116},
    doi = {10.1145/2660190.2662116},
    booktitle = {Proceedings of the 6th International Workshop on Feature-Oriented Software Development},
    pages = {37-44},
    numpages = {8},
    keywords = {macros, software generators, code clones, feature-oriented refactoring},
    location = {V\"{a}ster\r{a}s, Sweden},
    series = {FOSD ’14}
    }

  • M. Mondal, C. K. Roy, and K. A. Schneider, “Automatic identification of important clones for refactoring and tracking,” in 2014 ieee 14th international working conference on source code analysis and manipulation, 2014, pp. 11-20.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6975631,
    author={M. Mondal and C. K. Roy and K. A. Schneider},
    booktitle={2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation},
    title={Automatic Identification of Important Clones for Refactoring and Tracking},
    year={2014},
    volume={},
    number={},
    pages={11-20},
    url = {https://ieeexplore.ieee.org/document/6975631}
    }

  • M. S. Rahman and C. K. Roy, “A change-type based empirical study on the stability of cloned code,” in 2014 ieee 14th international working conference on source code analysis and manipulation, 2014, pp. 31-40.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6975633,
    author={M. S. Rahman and C. K. Roy},
    booktitle={2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation},
    title={A Change-Type Based Empirical Study on the Stability of Cloned Code},
    year={2014},
    volume={},
    number={},
    pages={31-40},
    url = {https://ieeexplore.ieee.org/abstract/document/6975633/},
    }

  • M. Singh and V. Sharma, “Detection of Behavioural Clone,” International journal of computer applications, vol. 102, iss. 14, 2014.
    [BibTeX] [Abstract] [PDF]

    High level cloning in a system is the aggregation of four classes of high level similarities these four classes are behavioural Clones, concept clones, structural clones and domain model clones. Behavioural clones are used to depict similar run time behavior. This paper presents a method for the detection of behavioural clone. The proposed system detects behavioural type of higher level clones in two file of same or different directories. This is done with the help of clone detection tool designed in DOT NET. The work is implemented as a generalized tool which accepts different programming language as input and the existence of clone can be detected across the source code of different languages. The main distinctive feature of the proposed methodology is the use of command prompt to determine runtime behaviour of cloned code.

    @article{singh_detection_2014,
    title = {Detection of {Behavioural} {Clone}},
    url = {http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.800.907&rep=rep1&type=pdf},
    abstract = {High level cloning in a system is the aggregation of four classes of high level similarities these four classes are behavioural Clones, concept clones, structural clones and domain model clones. Behavioural clones are used to depict similar run time behavior. This paper presents a method for the detection of behavioural clone. The proposed system detects behavioural type of higher level clones in two file of same or different directories. This is done with the help of clone detection tool designed in DOT NET. The work is implemented as a generalized tool which accepts different programming language as input and the existence of clone can be detected across the source code of different languages. The main distinctive feature of the proposed methodology is the use of command prompt to determine runtime behaviour of cloned code.},
    number = {14},
    author = {Singh, Manu and Sharma, Vidushi},
    year = {2014},
    note = {Publication Title: International Journal of Computer Applications, Volume: 102},
    volume = {102},
    journal = {International Journal of Computer Applications},
    keywords = {Behavioural Clones, classification of high level clones, higher level clone, simple clone},
    issn = {0975-8887}
    }

  • Y. Udagawa, “An empirical study on retrieving structural clones using sequence pattern mining algorithms,” in ACM International Conference Proceeding Series, 2014, pp. 270–276. doi:10.1145/2684200.2684290
    [BibTeX] [Abstract] [PDF]

    Many clone detection techniques focus on fragments of duplicated code, i.e., simple clones. Structural clones are simple clones within a syntactic boundary that are good candidates for refactoring. In this paper, a new approach for detection of structural clones in source code is presented. The proposed approach is parse-tree-based and is enhanced by frequent subsequence mining. It comprises three stages: preprocessing, mining frequent statement sequences, and fine-matching for structural clones using a modified longest common subsequence (LCS) algorithm. The lengths of control statements in a programming language and method identifiers differ; thus, a conventional LCS algorithm does not return the expected length of matched identifiers. We propose an encoding algorithm for control statements and method identifiers. Retrieval experiments were conducted using the Java SWING source code. The results show that the proposed data mining algorithm detects clones comprising 51 extracted statements. Our modified LCS algorithm retrieves a number of structural clones with arbitrary statement gaps.

    @inproceedings{udagawa_empirical_2014,
    title = {An empirical study on retrieving structural clones using sequence pattern mining algorithms},
    url = {https://www.researchgate.net/publication/300915649_An_Empirical_Study_on_Retrieving_Structural_Clones_Using_Sequence_Pattern_Mining_Algorithms},
    isbn = {978-1-4503-3001-5},
    doi = {10.1145/2684200.2684290},
    abstract = {Many clone detection techniques focus on fragments of duplicated code, i.e., simple clones. Structural clones are simple clones within a syntactic boundary that are good candidates for refactoring. In this paper, a new approach for detection of structural clones in source code is presented. The proposed approach is parse-tree-based and is enhanced by frequent subsequence mining. It comprises three stages: preprocessing, mining frequent statement sequences, and fine-matching for structural clones using a modified longest common subsequence (LCS) algorithm. The lengths of control statements in a programming language and method identifiers differ; thus, a conventional LCS algorithm does not return the expected length of matched identifiers. We propose an encoding algorithm for control statements and method identifiers. Retrieval experiments were conducted using the Java SWING source code. The results show that the proposed data mining algorithm detects clones comprising 51 extracted statements. Our modified LCS algorithm retrieves a number of structural clones with arbitrary statement gaps.},
    booktitle = {{ACM} {International} {Conference} {Proceeding} {Series}},
    publisher = {Association for Computing Machinery},
    author = {Udagawa, Yoshihisa},
    month = dec,
    year = {2014},
    keywords = {Control statement, Frequent subsequence mining, Java source code, Method identifier, Structural clone},
    pages = {270--276}
    }

  • S. A. Ajila, A. S. Gakhar, and C. Lung, “Aspectualization of code clones-an algorithmic approach,” Information systems frontiers, pp. 835-851, 2014. doi:10.1007/s10796-013-9428-7
    [BibTeX] [Abstract] [PDF]

    System Modularity has positive effects on software maintainability, reusability, and understandability. One factor that can affect system modularity is code tangling due to code clones. Code tangling can have serious cross-cutting effects on the source code and thereby affect maintainability and reusability of the code. In this research we have developed an algorithmic approach to convert code clones to aspects in order to improve modularity and aid maintainability. Firstly, we use an existing code-clone detection tool to identify code clones in a source code. Secondly, we design algorithms to convert the code clones into aspects and do aspect composition with the original source code. Thirdly, we implement a prototype based on the algorithms. Fourthly, we carry out a performance analysis on the aspects composed source code and our analysis shows that the aspect composed code performs as well as the original code and even better in terms of execution times.

    @article{ajila_aspectualization_nodate,
    title = {Aspectualization of code clones-an algorithmic approach},
    url = {https://doi.org/10.1007/s10796-013-9428-7},
    doi = {10.1007/s10796-013-9428-7},
    abstract = {System Modularity has positive effects on software maintainability, reusability, and understandability. One factor that can affect system modularity is code tangling due to code clones. Code tangling can have serious cross-cutting effects on the source code and thereby affect maintainability and reusability of the code. In this research we have developed an algorithmic approach to convert code clones to aspects in order to improve modularity and aid maintainability. Firstly, we use an existing code-clone detection tool to identify code clones in a source code. Secondly, we design algorithms to convert the code clones into aspects and do aspect composition with the original source code. Thirdly, we implement a prototype based on the algorithms. Fourthly, we carry out a performance analysis on the aspects composed source code and our analysis shows that the aspect composed code performs as well as the original code and even better in terms of execution times.},
    journal = {Information Systems Frontiers},
    author = {Ajila, Samuel A and Gakhar, Angad S and Lung, Chung-Horng},
    pages = {835-851},
    year = {2014},
    keywords = {Code clones, Reuse, Algorithm, Aspects mining, Modularity, Performance analysis}
    }

  • M. Aktas, “On the structural code clone detection problem: a survey and software metric based approach,” in Computational science and its applications – iccsa 2014, 2014, pp. 492-507. doi:10.1007/978-3-319-09156-3_35
    [BibTeX] [PDF]
    @inproceedings{inproceedings,
    author = {Aktas, Mehmet},
    year = {2014},
    month = {08},
    booktitle = {Computational Science and Its Applications - ICCSA 2014},
    url = {https://www.researchgate.net/publication/273204076},
    title = {On the Structural Code Clone Detection Problem: A Survey and Software Metric Based Approach},
    doi = {10.1007/978-3-319-09156-3_35},
    pages = {492-507}
    }

  • M. H. Alalfi, J. R. Cordy, and T. R. Dean, “Analysis and clustering of model clones: an automotive industrial experience,” in 2014 software evolution week – ieee conference on software maintenance, reengineering, and reverse engineering (csmr-wcre), 2014, pp. 375-378.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6747198,
    author={M. H. {Alalfi} and J. R. {Cordy} and T. R. {Dean}},
    booktitle={2014 Software Evolution Week - IEEE Conference on Software Maintenance, Reengineering, and Reverse Engineering (CSMR-WCRE)},
    title={Analysis and clustering of model clones: An automotive industrial experience},
    year={2014},
    url = {https://ieeexplore.ieee.org/abstract/document/6227873/},
    pages={375-378},
    }

  • M. A. A. Khan, C. K. Roy, and K. A. Schneider, “Active Clones: Source Code Clones at Runtime,” Electronic communications of the easst (Proceedings of the Eighth International Workshop on Software Clones, IWSC 2014), vol. 63, 2014.
    [BibTeX] [Abstract] [PDF]

    Code cloning is a common programming practice, and there has been a considerable amount of research that investigated the implications of code clones on software maintenance using static analysis. However, little has been done to investigate the runtime implications of code cloning. In this paper we investigate source code clones at runtime, referring to clones as ‘active clones’ if they are invoked when a software system is in use. For example, if a particular use u of a system results in a clone c being invoked, we say that clone c is active with respect to use u. From this definition and given a set of uses {u1, u2, …} and clones {c1, c2, …} we are able to identify the extent clones are active at runtime and analyze active clone resource use (e.g., CPU time) and define and calculate a set of active clone metrics to provide insights into source code cloning implications at runtime. We developed a hybrid static and dynamic analysis technique for detecting and analysing active clones, and conducted an empirical study on five software systems (HSQLDB, JHotDraw, RText, jEdit and UniCentaoPOS) to validate our approach. We found a small portion of clones are active during a typical use of a software system, and that active clones have the potential for guiding a software developer’s code inspection activity during a software maintenance task.

    @article{asif_proceedings_2014,
    title = {Active {Clones}: {Source} {Code} {Clones} at {Runtime}},
    note = {Proceedings of the {Eighth} {International} {Workshop} on {Software} {Clones} ({IWSC} 2014)},
    volume = {63},
    issn = {1863-2122},
    url = {http://mdakhan.weebly.com/index.html},
    abstract = {Code cloning is a common programming practice, and there has been a considerable amount of research that investigated the implications of code clones on software maintenance using static analysis. However, little has been done to investigate the runtime implications of code cloning. In this paper we investigate source code clones at runtime, referring to clones as 'active clones' if they are invoked when a software system is in use. For example, if a particular use u of a system results in a clone c being invoked, we say that clone c is active with respect to use u. From this definition and given a set of uses \{u1, u2, ...\} and clones \{c1, c2, ...\} we are able to identify the extent clones are active at runtime and analyze active clone resource use (e.g., CPU time) and define and calculate a set of active clone metrics to provide insights into source code cloning implications at runtime. We developed a hybrid static and dynamic analysis technique for detecting and analysing active clones, and conducted an empirical study on five software systems (HSQLDB, JHotDraw, RText, jEdit and UniCentaoPOS) to validate our approach. We found a small portion of clones are active during a typical use of a software system, and that active clones have the potential for guiding a software developer's code inspection activity during a software maintenance task.},
    journal = {Electronic Communications of the EASST},
    author = {Khan, Mohammad Asif A and Roy, Chanchal K and Schneider, Kevin A},
    year = {2014},
    keywords = {clone detection, active clones, dynamic analysis}
    }

  • G. Bansal and R. Tekchandani, “Selecting a set of appropriate metrics for detecting code clones,” in 2014 seventh international conference on contemporary computing (ic3), 2014, pp. 484-488.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6897221,
    author={G. Bansal and R. Tekchandani},
    url = {https://ieeexplore.ieee.org/abstract/document/6897221/},
    booktitle={2014 Seventh International Conference on Contemporary Computing (IC3)},
    title={Selecting a set of appropriate metrics for detecting code clones},
    year={2014},
    pages={484-488},
    }

  • X. Chen, A. Y. Wang, and E. Tempero, “A replication and reproduction of code clone detection studies,” in Proceedings of the thirty-seventh australasian computer science conference – volume 147, AUS, 2014, p. 105–114.
    [BibTeX] [PDF]
    @inproceedings{10.5555/2667473.2667486,
    author = {Chen, Xiliang and Wang, Alice Yuchen and Tempero, Ewan},
    title = {A Replication and Reproduction of Code Clone Detection Studies},
    year = {2014},
    isbn = {9781921770302},
    publisher = {Australian Computer Society, Inc.},
    address = {AUS},
    booktitle = {Proceedings of the Thirty-Seventh Australasian Computer Science Conference - Volume 147},
    pages = {105--114},
    url = {https://dl.acm.org/doi/10.5555/2667473.2667486},
    numpages = {10},
    location = {Auckland, New Zealand},
    series = {ACSC ’14}
    }

  • Y. Chen, I. Keivanloo, and C. K. Roy, “Near-miss software clones in open source games: an empirical study,” in 2014 ieee 27th canadian conference on electrical and computer engineering (ccece), 2014, pp. 1-7.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6901018,
    author={Y. {Chen} and I. {Keivanloo} and C. K. {Roy}},
    booktitle={2014 IEEE 27th Canadian Conference on Electrical and Computer Engineering (CCECE)},
    title={Near-miss software clones in open source games: An empirical study},
    year={2014},
    url = {https://ieeexplore.ieee.org/abstract/document/6901018/},
    volume={},
    number={},
    pages={1-7},
    }

  • E. Choi, N. Yoshida, and K. Inoue, “An Investigation into the Characteristics of Merged Code Clones during Software Evolution,” Ieice transactions on information and systems, pp. 1244-1253, 2014. doi:10.1587/transinf.E97.D.1244
    [BibTeX] [Abstract] [PDF]

    Although code clones (i.e. code fragments that have similar or identical code fragments in the source code) are regarded as a factor that increases the complexity of software maintenance, tools for supporting clone refactoring (i.e. merging a set of code clones into a single method or function) are not commonly used. To promote the development of refactoring tools that can be more widely utilized, we present an investigation of clone refactoring carried out in the development of open source software systems. In the investigation, we identified the most frequently used refactoring patterns and discovered how merged code clone token sequences and differences in token sequence lengths vary for each refactoring pattern.

    @article{choi_investigation_2014,
    title = {An {Investigation} into the {Characteristics} of {Merged} {Code} {Clones} during {Software} {Evolution}},
    url = {https://search.ieice.org/bin/summary.php?id=e97-d_5_1244},
    doi = {10.1587/transinf.E97.D.1244},
    abstract = {Although code clones (i.e. code fragments that have similar or identical code fragments in the source code) are regarded as a factor that increases the complexity of software maintenance, tools for supporting clone refactoring (i.e. merging a set of code clones into a single method or function) are not commonly used. To promote the development of refactoring tools that can be more widely utilized, we present an investigation of clone refactoring carried out in the development of open source software systems. In the investigation, we identified the most frequently used refactoring patterns and discovered how merged code clone token sequences and differences in token sequence lengths vary for each refactoring pattern.},
    author = {Choi, Eunjong and Yoshida, Norihiro and Inoue, Katsuro},
    year = {2014},
    journal = {IEICE TRANSACTIONS on Information and Systems},
    pages = {1244-1253},
    keywords = {code clone, refactoring, open source software}
    }

  • M. Deepika and S. Sarala, “Implication of Clone Detection and Refactoring Techniques using Delayed Duplicate Detection Refactoring,” International journal of computer applications, vol. 93, iss. 6, 2014.
    [BibTeX] [Abstract] [PDF]

    Code maintenance has been increased when the similar code fragments is reduced in the software systems. Refactoring is a change made to the internal structure of software to make it easier to understand and cheaper to modify without changing its observable behavior based on code, the refactoring mechanism is used to discover the clone detection. The proposed algorithm insists semantic relevance between files, classes and methods towards C# applications. The delayed duplicate detection refactoring technique uses the code analyzer and semantic graph for quickly detect the duplicate files in the application. The implemented clone refactoring technique enhances the Semantic Relevance Entity Detection algorithm which provides better performance and accurate result for unifying the process of clone detection and refactoring.

    @article{deepika_implication_2014,
    title = {Implication of {Clone} {Detection} and {Refactoring} {Techniques} using {Delayed} {Duplicate} {Detection} {Refactoring}},
    url = {http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.677.9629&rep=rep1&type=pdf},
    abstract = {Code maintenance has been increased when the similar code fragments is reduced in the software systems. Refactoring is a change made to the internal structure of software to make it easier to understand and cheaper to modify without changing its observable behavior based on code, the refactoring mechanism is used to discover the clone detection. The proposed algorithm insists semantic relevance between files, classes and methods towards c\# applications. The delayed duplicate detection refactoring technique uses the code analyzer and semantic graph for quickly detect the duplicate files in the application. The implemented clone refactoring technique enhances the Semantic Relevance Entity Detection algorithm which provides better performance and accurate result for unifying the process of clone detection and refactoring.},
    number = {6},
    journal = {International Journal of Computer Applications},
    author = {Deepika, M and Sarala, S},
    year = {2014},
    volume={93},
    keywords = {Code clones, Clone detection, Refactoring, Parsing, Abstract Syntax Tree (AST), Delayed duplicate detection, Source code fragments},
    issn = {0975-8887}
    }

  • A. F. Desouky, M. D. Beard, and L. H. Etzkorn, “A qualitative analysis of code clones and object oriented runtime complexity based on method access points,” in International conference for convergence for technology-2014, 2014, pp. 1-5.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7092292,
    author={A. F. {Desouky} and M. D. {Beard} and L. H. {Etzkorn}},
    booktitle={International Conference for Convergence for Technology-2014},
    title={A qualitative analysis of code clones and object oriented runtime complexity based on method access points},
    year={2014},
    url = {https://ieeexplore.ieee.org/abstract/document/7092292/},
    pages={1-5},
    }

  • M. R. Farhadi, B. C. M. Fung, P. Charland, and M. Debbabi, “Binclone: detecting code clones in malware,” in 2014 eighth international conference on software security and reliability (sere), 2014, pp. 78-87.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6895418,
    author={M. R. {Farhadi} and B. C. M. {Fung} and P. {Charland} and M. {Debbabi}},
    booktitle={2014 Eighth International Conference on Software Security and Reliability (SERE)},
    title={BinClone: Detecting Code Clones in Malware},
    year={2014},
    url = {https://ieeexplore.ieee.org/document/6895418},
    pages={78-87},
    }

  • R. Garg and R. Tekchandani, “An approach to rank code clones for efficient clone management,” in 2014 international conference on advances in electronics computers and communications, 2014, pp. 1-5.
    [BibTeX] [PDF]
    @INPROCEEDINGS{7002385,
    author={R. {Garg} and R. {Tekchandani}},
    booktitle={2014 International Conference on Advances in Electronics Computers and Communications},
    title={An approach to rank code clones for efficient clone management},
    year={2014},
    url = {https://ieeexplore.ieee.org/abstract/document/7002385/},
    pages={1-5},
    }

  • R. Garg and R. Bhatia, “Code Clone v/s Model Clones: Pros and Cons,” International journal of computer applications, vol. 89, iss. 15, pp. 20-22, 2014.
    [BibTeX] [Abstract] [PDF]

    Every software has time and budget constraints associated with it. The time and budget of the software also depends on the risk and inconsistencies during the software life cycle phases. These risks and inconsistencies can be reduced by detecting clones in form of redundancy between the software systems. This paper provides a brief overview to the detection of these risk and inconsistencies in either of the two phases of software development system, i.e. design phase or the implementation phase, along with their pros and cons.

    @article{garg_code_2014,
    title = {Code {Clone} v/s {Model} {Clones}: {Pros} and {Cons}},
    url = {https://www.researchgate.net/publication/269668531_Code_Clone_vs_Model_Clones_Pros_and_Cons},
    abstract = {Every software has time and budget constraints associated with it. The time and budget of the software also depends on the risk and inconsistencies during the software life cycle phases. These risks and inconsistencies can be reduced by detecting clones in form of redundancy between the software systems. This paper provides a brief overview to the detection of these risk and inconsistencies in either of the two phases of software development system, i.e. design phase or the implementation phase, along with their pros and cons.},
    number = {15},
    author = {Garg, Ritu and Bhatia, Rajesh},
    year = {2014},
    journal = {International Journal of Computer Applications},
    volume = {89},
    keywords = {Clone detection, Code based Clone detection, Model based Clone detection, Software System},
    pages = {20-22}
    }

  • T. Görg, “Incremental detection of parameterized code clones,” Softwaretechnik-trends, vol. 33, pp. 25-26, 2014. doi:10.1007/s40568-013-0031-3
    [BibTeX] [PDF]
    @article{gorg_incremental_2014,
    author = {Görg, Torsten},
    year = {2014},
    month = {05},
    pages = {25-26},
    url = {https://www.researchgate.net/publication/272867793_Incremental_Detection_of_Parameterized_Code_Clones},
    title = {Incremental Detection of Parameterized Code Clones},
    volume = {33},
    journal = {Softwaretechnik-Trends},
    doi = {10.1007/s40568-013-0031-3}
    }

  • A. Hamid and V. Zaytsev, “Detecting refactorable clones by slicing program dependence graphs,” Ceur workshop proceedings, vol. 1354, pp. 37-48, 2014.
    [BibTeX] [PDF]
    @article{hamid_detecting_2014,
    author = {Hamid, A. and Zaytsev, Vadim},
    year = {2014},
    month = {01},
    url ={https://www.researchgate.net/publication/289645965_Detecting_refactorable_clones_by_slicing_program_dependence_graphs},
    pages = {37-48},
    title = {Detecting refactorable clones by slicing program dependence graphs},
    volume = {1354},
    journal = {CEUR Workshop Proceedings}
    }

  • T. Kamiya, “Toward a Code-Clone Search through the Entire Lifecycle of a Software Product - Position Paper,” Proceedings of the eighth international workshop on software clones (iwsc), vol. 63, 2014.
    [BibTeX] [Abstract] [PDF]

    This paper presents a clone-detection method/tool currently under development. This tool is useful as a code-clone search through the entire lifecycle of a software product; the tool searches code examples and analyzes code clones in both preventive and postmortem ways [LRHK10]. The approach is based on a sequence equivalence on execution paths [Kam13] and extends the equivalence to include gaps, thus enabling type-3 [BKA+07] clone detection. Each of the detected clones is a sub-sequence of an execution path of a given program, in other words, a set of code fragments of multiple procedures (methods) which can be executed in a run of the program. The approach is relaxed in terms of adaptability to incomplete (not-yet-finished) code, but also makes use of concrete information such as types (including hierarchy) and dynamic dispatch when such information is available.

    @article{kamiya_toward_2014,
    title = {Toward a {Code}-{Clone} {Search} through the {Entire} {Lifecycle} of a {Software} {Product} - {Position} {Paper}},
    volume = {63},
    issn = {1863-2122},
    url = {http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.685.3059&rep=rep1&type=pdf},
    abstract = {This paper presents a clone-detection method/tool currently under development. This tool is useful as a code-clone search through the entire lifecycle of a software product; the tool searches code examples and analyzes code clones in both preventive and postmortem ways [LRHK10]. The approach is based on a sequence equivalence on execution paths [Kam13] and extends the equivalence to include gaps, thus enabling type-3 [BKA+07] clone detection. Each of the detected clones is a sub-sequence of an execution path of a given program, in other words, a set of code fragments of multiple procedures (methods) which can be executed in a run of the program. The approach is relaxed in terms of adaptability to incomplete (not-yet-finished) code, but also makes use of concrete information such as types (including hierarchy) and dynamic dispatch when such information is available.},
    journal = {Proceedings of the Eighth International Workshop on Software Clones (IWSC)},
    author = {Kamiya, Toshihiro},
    year = {2014},
    keywords = {Code Clone, Code Search, Postmortem Code-Clone Detection, Preventive Code-Clone Detection}
    }

  • H. Kaur and M. Kaur, “Detecting Clones in Class Diagrams Using Suffix Array,” International journal of engineering and advanced technology (ijeat), pp. 243-246, 2014.
    [BibTeX] [Abstract] [PDF]

    Fig. 1.a shows the similar attributes present in two classes, which act as duplication in the class diagram. Fig. 1.b (classes with removal of similar attributes) shows a new superclass that contains the duplicate attributes of the two classes, reducing code size and maintenance effort. Advantages and applications of detecting model clones: to produce a better-designed model; to derive a definition for understanding model clones; to develop an algorithm that detects model clones of actual meaning; resource requirements can be reduced if clones in the models are detected; it is easier for the maintainer to maintain the model when aware of the presence of clones; good knowledge of clones helps to introduce an effective reuse mechanism [3,10,16,20].

    @article{kaur_detecting_nodate,
    title = {Detecting {Clones} in {Class} {Diagrams} {Using} {Suffix} {Array}},
    url = {http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.677.4671&rep=rep1&type=pdf},
    abstract = {Fig. 1.a shows the similar attributes present in two classes, which act as duplication in the class diagram. Fig. 1.b (classes with removal of similar attributes) shows a new superclass that contains the duplicate attributes of the two classes, reducing code size and maintenance effort. Advantages and applications of detecting model clones: to produce a better-designed model; to derive a definition for understanding model clones; to develop an algorithm that detects model clones of actual meaning; resource requirements can be reduced if clones in the models are detected; it is easier for the maintainer to maintain the model when aware of the presence of clones; good knowledge of clones helps to introduce an effective reuse mechanism [3,10,16,20].},
    journal = {International Journal of Engineering and Advanced Technology (IJEAT)},
    year = {2014},
    pages = {243-246},
    author = {Kaur, H and Kaur, M}
    }

  • R. Kaur and S. Singh, “Clone detection in software source code using operational similarity of statements,” Acm sigsoft software engineering notes, vol. 39, iss. 3, pp. 1-5, 2014. doi:10.1145/2597716.2597723
    [BibTeX] [Abstract] [PDF]

    This paper presents a technique to detect clones in source code by comparing the operations performed in the statements comprising a function. The key concept used is that two functions are considered clones if the statements in the functions perform the same operation up to a certain extent. This could be ascertained by categorizing the available statement types based on the operations performed (for instance, addition, multiplication, function invocation, etc). Then, a category is assigned to each statement present in every function in the source code. Comparisons are then made between functions by comparing the categories of the statements to each other. If one function contains exactly the same categories of statement as another (same operations performed in both the functions), or contains a subset of statement categories (operations performed in one function are subset of another), then these functions are judged to be clones.

    @article{kaur_clone_2014,
    title = {Clone detection in software source code using operational similarity of statements},
    volume = {39},
    issn = {0163-5948},
    doi = {10.1145/2597716.2597723},
    url = {https://dl.acm.org/doi/10.1145/2597716.2597723},
    abstract = {This paper presents a technique to detect clones in source code by comparing the operations performed in the statements comprising a function. The key concept used is that two functions are considered clones if the statements in the functions perform the same operation up to a certain extent. This could be ascertained by categorizing the available statement types based on the operations performed (for instance, addition, multiplication, function invocation, etc). Then, a category is assigned to each statement present in every function in the source code. Comparisons are then made between functions by comparing the categories of the statements to each other. If one function contains exactly the same categories of statement as another (same operations performed in both the functions), or contains a subset of statement categories (operations performed in one function are subset of another), then these functions are judged to be clones.},
    number = {3},
    journal = {ACM SIGSOFT Software Engineering Notes},
    author = {Kaur, Raminder and Singh, Satwinder},
    month = jun,
    year = {2014},
    note = {Publisher: Association for Computing Machinery (ACM)},
    pages = {1-5}
    }

  • O. Kononenko, C. Zhang, and M. W. Godfrey, “Compiling clones: what happens?,” in 2014 ieee international conference on software maintenance and evolution, 2014, pp. 481-485.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6976122,
    author={O. {Kononenko} and C. {Zhang} and M. W. {Godfrey}},
    booktitle={2014 IEEE International Conference on Software Maintenance and Evolution},
    title={Compiling Clones: What Happens?},
    year={2014},
    url = {https://ieeexplore.ieee.org/abstract/document/6976122/},
    pages={481-485},}

  • R. Koschke and S. Bazrafshan, “Effect of clone information on the performance of developers fixing cloned bugs,” in 2014 ieee 14th international working conference on source code analysis and manipulation (scam), 2014, pp. 1-10. doi:10.1109/SCAM.2014.10
    [BibTeX] [Abstract] [PDF]

    Duplicated source code (clones) is known to occur frequently in software systems and bears the risk of inconsistent updates of the code. The impact of clones has been investigated mostly by retrospective analysis of software systems. Only little effort has been spent to investigate human interaction when dealing with clones. A previous study by Chatterji and colleagues found that cloned defects are removed significantly more accurately when clone information is provided to the programmers. We conducted a controlled experiment to extend the previous study on the use of clone information by investigating the effect of clone information on the performance of developers in common bug-fixing tasks. The experiment shows that developers are quite capable to compensate missing clone information through testing to provide correct solutions. Clone information does help to detect cloned defects faster, although developers may exploit semantic code relations such as inheritance to uncover cloned defects only slightly slower if they do not have clone information. If cloned defects lurk in semantically unrelated places however, clone information helps to find them faster at statistical significance. Developers without clone information needed 17 minutes longer on average or 140% more time in relative terms to complete the task successfully.

    @inproceedings{koschke_effect_2014,
    title = {Effect of Clone Information on the Performance of Developers Fixing Cloned Bugs},
    url = {https://www.researchgate.net/publication/286668711},
    doi = {10.1109/SCAM.2014.10},
    abstract = {Duplicated source code (clones) is known to occur frequently in software systems and bears the risk of inconsistent updates of the code. The impact of clones has been investigated mostly by retrospective analysis of software systems. Only little effort has been spent to investigate human interaction when dealing with clones. A previous study by Chatterji and colleagues found that cloned defects are removed significantly more accurately when clone information is provided to the programmers. We conducted a controlled experiment to extend the previous study on the use of clone information by investigating the effect of clone information on the performance of developers in common bug-fixing tasks. The experiment shows that developers are quite capable to compensate missing clone information through testing to provide correct solutions. Clone information does help to detect cloned defects faster, although developers may exploit semantic code relations such as inheritance to uncover cloned defects only slightly slower if they do not have clone information. If cloned defects lurk in semantically unrelated places however, clone information helps to find them faster at statistical significance. Developers without clone information needed 17 minutes longer on average or 140 \% more time in relative terms to complete the task successfully.},
    booktitle = {2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation (SCAM)},
    author = {Koschke, Rainer and Bazrafshan, Saman},
    year = {2014},
    pages = {1-10},
    }

  • X. Li, X. Su, P. Ma, and T. Wang, “Refactoring structure semantics similar clones combining standardization with metrics,” in Proceedings of international conference on soft computing techniques and engineering application, New Delhi, 2014, pp. 361-367.
    [BibTeX] [PDF]
    @InProceedings{10.1007/978-81-322-1695-7_41,
    author={Li, Xia and Su, Xiaohong and Ma, Peijun and Wang, Tiantian},
    editor={Patnaik, Srikanta
    and Li, Xiaolong},
    url = {https://link.springer.com/chapter/10.1007/978-81-322-1695-7_41},
    title={Refactoring Structure Semantics Similar Clones Combining Standardization with Metrics},
    booktitle={Proceedings of International Conference on Soft Computing Techniques and Engineering Application},
    year={2014},
    publisher={Springer India},
    address={New Delhi},
    pages={361-367},
    isbn={978-81-322-1695-7},
    }

  • Y. Lin, Z. Xing, X. Peng, Y. Liu, J. Sun, W. Zhao, and J. Dong, “Clonepedia: summarizing code clones by common syntactic context for software maintenance,” in 2014 ieee international conference on software maintenance and evolution, 2014, pp. 341-350.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6976100,
    author={Y. {Lin} and Z. {Xing} and X. {Peng} and Y. {Liu} and J. {Sun} and W. {Zhao} and J. {Dong}},
    booktitle={2014 IEEE International Conference on Software Maintenance and Evolution},
    title={Clonepedia: Summarizing Code Clones by Common Syntactic Context for Software Maintenance},
    year={2014},
    url = {https://ieeexplore.ieee.org/document/6976100},
    volume={},
    number={},
    pages={341-350},}

  • Y. Lin, Z. Xing, Y. Xue, Y. Liu, X. Peng, J. Sun, and W. Zhao, “Detecting differences across multiple instances of code clones,” in Proceedings of International Conference on Software Engineering, 2014, pp. 164-174. doi:10.1145/2568225.2568298
    [BibTeX] [Abstract] [PDF]

    Clone detectors find similar code fragments (i.e., instances of code clones) and report large numbers of them for industrial systems. To maintain or manage code clones, developers often have to investigate differences of multiple cloned code fragments. However, existing program differencing techniques compare only two code fragments at a time. Developers then have to manually combine several pairwise differencing results. In this paper, we present an approach to automatically detecting differences across multiple clone instances. We have implemented our approach as an Eclipse plugin and evaluated its accuracy with three Java software systems. Our evaluation shows that our algorithm has precision over 97.66\% and recall over 95.63\% in three open source Java projects. We also conducted a user study of 18 developers to evaluate the usefulness of our approach for eight clone-related refactoring tasks. Our study shows that our approach can significantly improve developers' performance in refactoring decisions, refactoring details, and task completion time on clone-related refactoring tasks. Automatically detecting differences across multiple clone instances also opens opportunities for building practical applications of code clones in software maintenance, such as auto-generation of application skeleton, intelligent simultaneous code editing.

    @inproceedings{lin_detecting_2014,
    title = {Detecting differences across multiple instances of code clones},
    doi = {10.1145/2568225.2568298},
    url = {https://dl.acm.org/doi/10.1145/2568225.2568298},
    abstract = {Clone detectors find similar code fragments (i.e., instances of code clones) and report large numbers of them for industrial systems. To maintain or manage code clones, developers often have to investigate differences of multiple cloned code fragments. However, existing program differencing techniques compare only two code fragments at a time. Developers then have to manually combine several pairwise differencing results. In this paper, we present an approach to automatically detecting differences across multiple clone instances. We have implemented our approach as an Eclipse plugin and evaluated its accuracy with three Java software systems. Our evaluation shows that our algorithm has precision over 97.66\% and recall over 95.63\% in three open source Java projects. We also conducted a user study of 18 developers to evaluate the usefulness of our approach for eight clone-related refactoring tasks. Our study shows that our approach can significantly improve developers' performance in refactoring decisions, refactoring details, and task completion time on clone-related refactoring tasks. Automatically detecting differences across multiple clone instances also opens opportunities for building practical applications of code clones in software maintenance, such as auto-generation of application skeleton, intelligent simultaneous code editing.},
    booktitle = {Proceedings of {International} {Conference} on {Software} {Engineering}},
    publisher = {IEEE Computer Society},
    author = {Lin, Yun and Xing, Zhenchang and Xue, Yinxing and Liu, Yang and Peng, Xin and Sun, Jun and Zhao, Wenyun},
    month = may,
    year = {2014},
    note = {ISSN: 02705257 Issue: 1},
    keywords = {Code clone, Human study, Program differencing},
    pages = {164-174}
    }

  • A. Lozano, F. Jaafar, K. Mens, and Y. Guéhéneuc, “Clones and macro co-changes,” Electronic communications of the easst, vol. 63, 2014.
    [BibTeX] [Abstract] [PDF]

    Ideally, any change that modifies the similar parts of a cloned code snippet should be propagated to all its duplicates. In practice however, consistent propagation of changes in clones does not always happen. Current evidence indicates that clone families have a 50\% chance of having consistent changes. This paper measures cloning and co-changes at file level as a proxy to assess the frequency of consistent changes. Given that changes to a clone group are not necessarily propagated in the same commit transaction (i.e., late propagations), our analysis uses macro co-changes instead of the traditional definition of co-changes. Macro changes group bursts of changes that are closer among themselves than to other changes, regardless of author or message. Then, macro co-changes are sets of files that change in the same macro changes. Each cloned file is tagged depending on whether any of the files with which it macro co-changes is cloned with it (during the macro change) or not. Contrary to previous results, we discovered that most of the cloned files macro co-change only with files with which they share clones. Thus providing evidence that macro changes are appropriate to study the conjecture of clones requiring co-changes, and indicating that consistent changes might be the norm in cloned code.

    @article{lozano_proceedings_2014,
    title = {{Clones} and {Macro} co-changes},
    volume = {63},
    issn = {1863-2122},
    url = {http://www.easst.org/eceasst/},
    abstract = {Ideally, any change that modifies the similar parts of a cloned code snippet should be propagated to all its duplicates. In practice however, consistent propagation of changes in clones does not always happen. Current evidence indicates that clone families have a 50\% chance of having consistent changes. This paper measures cloning and co-changes at file level as a proxy to assess the frequency of consistent changes. Given that changes to a clone group are not necessarily propagated in the same commit transaction (i.e., late propagations), our analysis uses macro co-changes instead of the traditional definition of co-changes. Macro changes group bursts of changes that are closer among themselves than to other changes, regardless of author or message. Then, macro co-changes are sets of files that change in the same macro changes. Each cloned file is tagged depending on whether any of the files with which it macro co-changes is cloned with it (during the macro change) or not. Contrary to previous results, we discovered that most of the cloned files macro co-change only with files with which they share clones. Thus providing evidence that macro changes are appropriate to study the conjecture of clones requiring co-changes, and indicating that consistent changes might be the norm in cloned code.},
    journal = {Electronic Communications of the EASST},
    author = {Lozano, Angela and Jaafar, Fehmi and Mens, Kim and Guéhéneuc, Yann-Gaël},
    year = {2014},
    keywords = {Maintenance, Mining Software Repositories, Cloning, Empirical Software Engineering, Impact, Macro co-changes, Stability}
    }

  • S. McIntosh, M. Poehlmann, E. Juergens, A. Mockus, B. Adams, A. E. Hassan, B. Haupt, and C. Wagner, “Collecting and leveraging a benchmark of build system clones to aid in quality assessments,” in 36th International Conference on Software Engineering, ICSE Companion 2014 – Proceedings, 2014, pp. 145-154. doi:10.1145/2591062.2591181
    [BibTeX] [Abstract] [PDF]

    Build systems specify how sources are transformed into deliverables, and hence must be carefully maintained to ensure that deliverables are assembled correctly. Similar to source code, build systems tend to grow in complexity unless specifications are refactored. This paper describes how clone detection can aid in quality assessments that determine if and where build refactoring effort should be applied. We gauge cloning rates in build systems by collecting and analyzing a benchmark comprising 3,872 build systems. Analysis of the benchmark reveals that: (1) build systems tend to have higher cloning rates than other software artifacts, (2) recent build technologies tend to be more prone to cloning, especially of configuration details like API dependencies, than older technologies, and (3) build systems that have fewer clones achieve higher levels of reuse via mechanisms not offered by build technologies. Our findings aided in refactoring a large industrial build system containing 1.1 million lines. Copyright © 2014 ACM.

    @inproceedings{mcintosh_collecting_2014,
    title = {Collecting and leveraging a benchmark of build system clones to aid in quality assessments},
    isbn = {978-1-4503-2768-8},
    doi = {10.1145/2591062.2591181},
    url = {https://dl.acm.org/doi/10.1145/2591062.2591181},
    abstract = {Build systems specify how sources are transformed into deliverables, and hence must be carefully maintained to ensure that deliverables are assembled correctly. Similar to source code, build systems tend to grow in complexity unless specifications are refactored. This paper describes how clone detection can aid in quality assessments that determine if and where build refactoring effort should be applied. We gauge cloning rates in build systems by collecting and analyzing a benchmark comprising 3,872 build systems. Analysis of the benchmark reveals that: (1) build systems tend to have higher cloning rates than other software artifacts, (2) recent build technologies tend to be more prone to cloning, especially of configuration details like API dependencies, than older technologies, and (3) build systems that have fewer clones achieve higher levels of reuse via mechanisms not offered by build technologies. Our findings aided in refactoring a large industrial build system containing 1.1 million lines. Copyright © 2014 ACM.},
    booktitle = {36th {International} {Conference} on {Software} {Engineering}, {ICSE} {Companion} 2014 - {Proceedings}},
    publisher = {Association for Computing Machinery},
    author = {McIntosh, Shane and Poehlmann, Martin and Juergens, Elmar and Mockus, Audris and Adams, Bram and Hassan, Ahmed E. and Haupt, Brigitte and Wagner, Christian},
    year = {2014},
    keywords = {Clone detection, Build systems, Quality assessments},
    pages = {145-154}
    }

  • M. Mondal, C. K. Roy, and K. A. Schneider, “A fine-grained analysis on the evolutionary coupling of cloned code,” in 2014 ieee international conference on software maintenance and evolution, 2014, pp. 51-60.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6976071,
    author={M. {Mondal} and C. K. {Roy} and K. A. {Schneider}},
    booktitle={2014 IEEE International Conference on Software Maintenance and Evolution},
    title={A Fine-Grained Analysis on the Evolutionary Coupling of Cloned Code},
    year={2014},
    url = {https://ieeexplore.ieee.org/document/6976071},
    volume={},
    number={},
    pages={51-60},}

  • M. Mondal, C. K. Roy, and K. A. Schneider, “Late Propagation in Near-Miss Clones: An Empirical Study,” Electronic communications of the easst, vol. 63, 2014.
    [BibTeX] [Abstract] [PDF]

    If two or more code fragments in the code-base of a software system are exactly or nearly similar to one another, we call them code clones. It is often important that updates (i.e., changes) in one clone fragment should be propagated to the other similar clone fragments to ensure consistency. However, if there is a delay in this propagation because of unawareness, the system might behave inconsistently. This delay in propagation, also known as late propagation, has been investigated by a number of existing studies. However, the existing studies did not investigate the intensity as well as the effect of late propagation in different types of clones separately. Also, late propagation in Type 3 clones is yet to be investigated. In this research work we investigate late propagation in three types of clones (Type 1, Type 2, and Type 3) separately. According to our experimental results on six subject systems written in three programming languages, late propagation is more intense in Type 3 clones compared to the other two clone-types. Block clones are mostly involved in late propagation instead of method clones. Refactoring of block clones can possibly minimize late propagation. If not refactorable, then the clones that often need to be changed together consistently should be placed in close proximity to one another.

    @article{mondal_late_2014,
    title = {Late {Propagation} in {Near}-{Miss} {Clones}: {An} {Empirical} {Study}},
    volume = {63},
    issn = {1863-2122},
    url = {http://ubsrvweb09.ub.tu-berlin.de/eceasst/article/view/913},
    abstract = {If two or more code fragments in the code-base of a software system are exactly or nearly similar to one another, we call them code clones. It is often important that updates (i.e., changes) in one clone fragment should be propagated to the other similar clone fragments to ensure consistency. However, if there is a delay in this propagation because of unawareness, the system might behave inconsistently. This delay in propagation, also known as late propagation, has been investigated by a number of existing studies. However, the existing studies did not investigate the intensity as well as the effect of late propagation in different types of clones separately. Also, late propagation in Type 3 clones is yet to be investigated. In this research work we investigate late propagation in three types of clones (Type 1, Type 2, and Type 3) separately. According to our experimental results on six subject systems written in three programming languages, late propagation is more intense in Type 3 clones compared to the other two clone-types. Block clones are mostly involved in late propagation instead of method clones. Refactoring of block clones can possibly minimize late propagation. If not refactorable, then the clones that often need to be changed together consistently should be placed in close proximity to one another.},
    journal = {Electronic Communications of the EASST},
    author = {Mondal, Manishankar and Roy, Chanchal K and Schneider, Kevin A},
    year = {2014},
    keywords = {Code Clone, Software Maintenance, Code Evolution, Late Propagation, Method Genealogy}
    }

  • M. Mondal, C. K. Roy, and K. A. Schneider, “Prediction and ranking of co-change candidates for clones,” in Proceedings of the 11th working conference on mining software repositories, New York, NY, USA, 2014, p. 32–41. doi:10.1145/2597073.2597104
    [BibTeX] [PDF]
    @inproceedings{10.1145/2597073.2597104,
    author = {Mondal, Manishankar and Roy, Chanchal K. and Schneider, Kevin A.},
    title = {Prediction and Ranking of Co-Change Candidates for Clones},
    year = {2014},
    isbn = {9781450328630},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2597073.2597104},
    doi = {10.1145/2597073.2597104},
    booktitle = {Proceedings of the 11th Working Conference on Mining Software Repositories},
    pages = {32–41},
    numpages = {10},
    keywords = {Co-change Recency, Co-change Frequency, Evolutionary Coupling, Code Clones, Co-change Candidates, Ranking},
    location = {Hyderabad, India},
    series = {MSR 2014}
    }

  • M. Mondal, C. K. Roy, and K. A. Schneider, “Automatic ranking of clones for refactoring through mining association rules,” in 2014 software evolution week – ieee conference on software maintenance, reengineering, and reverse engineering (csmr-wcre), 2014, pp. 114-123.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6747161,
    author={M. {Mondal} and C. K. {Roy} and K. A. {Schneider}},
    booktitle={2014 Software Evolution Week - IEEE Conference on Software Maintenance, Reengineering, and Reverse Engineering (CSMR-WCRE)},
    title={Automatic ranking of clones for refactoring through mining association rules},
    year={2014},
    url = {https://ieeexplore.ieee.org/document/6747161},
    volume={},
    number={},
    pages={114-123},}

  • A. Mubarak-Ali and S. Sulaiman, “A systematic literature review of code clone prevention approaches,” International journal of software engineering and technology, 2014.
    [BibTeX] [PDF]
    @article{mubarak-ali_systematic_nodate,
    title = {A Systematic Literature Review of Code Clone Prevention Approaches},
    url = {https://www.researchgate.net/publication/270760283},
    journal = {International Journal of Software Engineering and Technology},
    year = {2014},
    author = {Mubarak-Ali, Al-Fahim and Sulaiman, Shahida},
    note = {Publication Title: researchgate.net},
    keywords = {Code Clone, Code Clone Prevention Approach, Systematic Literature Review}
    }

  • H. Murakami, Y. Higo, and S. Kusumoto, “A dataset of clone references with gaps,” in Proceedings of the 11th working conference on mining software repositories, New York, NY, USA, 2014, p. 412–415. doi:10.1145/2597073.2597133
    [BibTeX] [PDF]
    @inproceedings{10.1145/2597073.2597133,
    author = {Murakami, Hiroaki and Higo, Yoshiki and Kusumoto, Shinji},
    title = {A Dataset of Clone References with Gaps},
    year = {2014},
    isbn = {9781450328630},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2597073.2597133},
    doi = {10.1145/2597073.2597133},
    booktitle = {Proceedings of the 11th Working Conference on Mining Software Repositories},
    pages = {412–415},
    numpages = {4},
    keywords = {Dataset, Software maintenance, Code clone},
    location = {Hyderabad, India},
    series = {MSR 2014}
    }

  • G. P. Krishnan and N. Tsantalis, “Unification and refactoring of clones,” in 2014 software evolution week – ieee conference on software maintenance, reengineering, and reverse engineering (csmr-wcre), 2014, pp. 104-113.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6747160,
    author={G. P. {Krishnan} and N. {Tsantalis}},
    booktitle={2014 Software Evolution Week - IEEE Conference on Software Maintenance, Reengineering, and Reverse Engineering (CSMR-WCRE)},
    title={Unification and refactoring of clones},
    url = {https://ieeexplore.ieee.org/abstract/document/6747160/},
    year={2014},
    volume={},
    number={},
    pages={104-113},}

  • D. Poongodi and G. Arasu, “Multi-agent based sequence algorithm for detecting plagiarism and clones in java source code using abstract syntax tree,” International journal of computer applications, vol. 90, 2014. doi:10.5120/15796-4494
    [BibTeX] [PDF]
    @article{article,
    author = {Poongodi, D. and Arasu, G.},
    year = {2014},
    url ={https://www.researchgate.net/publication/262990748_Multi-Agent_based_Sequence_Algorithm_for_Detecting_Plagiarism_and_Clones_in_Java_Source_Code_using_Abstract_Syntax_Tree},
    month = {02},
    pages = {},
    title = {Multi-Agent based Sequence Algorithm for Detecting Plagiarism and Clones in Java Source Code using Abstract Syntax Tree},
    volume = {90},
    journal = {International Journal of Computer Applications},
    doi = {10.5120/15796-4494}
    }

  • A. Lozano, F. Jaafar, K. Mens, and Y. Guéhéneuc, “Clones and macro co-changes,” in Proceedings of the eighth international workshop on software clones (iwsc), 2014.
    [BibTeX] [PDF]
    @inproceedings{inproceedings,
    author = {Lozano, Angela and Jaafar, Fehmi and Mens, Kim and Guéhéneuc, Yann-Gaël},
    year = {2014},
    month = {02},
    pages = {},
    booktitle = {Proceedings of the Eighth International Workshop on Software Clones (IWSC)},
    url = {https://www.researchgate.net/publication/279532581_Clones_and_Macro_co-changes},
    title = {Clones and Macro co-changes}
    }

  • Y. Shao, X. Luo, C. Qian, P. Zhu, and L. Zhang, “Towards a scalable resource-driven approach for detecting repackaged android applications,” in Proceedings of the 30th annual computer security applications conference, New York, NY, USA, 2014, p. 56–65. doi:10.1145/2664243.2664275
    [BibTeX] [PDF]
    @inproceedings{10.1145/2664243.2664275,
    author = {Shao, Yuru and Luo, Xiapu and Qian, Chenxiong and Zhu, Pengfei and Zhang, Lei},
    title = {Towards a Scalable Resource-Driven Approach for Detecting Repackaged Android Applications},
    year = {2014},
    isbn = {9781450330053},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2664243.2664275},
    doi = {10.1145/2664243.2664275},
    booktitle = {Proceedings of the 30th Annual Computer Security Applications Conference},
    pages = {56–65},
    numpages = {10},
    location = {New Orleans, Louisiana, USA},
    series = {ACSAC ’14}
    }

  • M. Sano, E. Choi, N. Yoshida, Y. Yamanaka, and K. Inoue, “Supporting clone analysis with tag cloud visualization,” in International Workshop on Innovative Software Development Methodologies and Practices, InnoSWDev, 2014, pp. 94-99. doi:10.1145/2666581.2666586
    [BibTeX] [Abstract] [PDF]

    So far, a lot of techniques have been developed on the detection of code clones (i.e., duplicated code) in large-scale source code. Currently, the code clone research community is gradually shifting its focus of attention from the detection to the management (e.g., clone refactoring). During clone management, developers need to understand how and why code clones are scattered in source code, and then decide how to handle those code clones. In this paper, we present a clone analysis tool with tag cloud visualization. This tool is aimed at helping to understand why code clones are concentrated in a part of a software system by generating tag clouds from a collection of identifier names in source code.

    @inproceedings{sano_supporting_2014,
    title = {Supporting clone analysis with tag cloud visualization},
    isbn = {978-1-4503-3226-2},
    doi = {10.1145/2666581.2666586},
    url = {https://dl.acm.org/doi/abs/10.1145/2666581.2666586},
    abstract = {So far, a lot of techniques have been developed on the detection of code clones (i.e., duplicated code) in large-scale source code. Currently, the code clone research community is gradually shifting its focus of attention from the detection to the management (e.g., clone refactoring). During clone management, developers need to understand how and why code clones are scattered in source code, and then decide how to handle those code clones. In this paper, we present a clone analysis tool with tag cloud visualization. This tool is aimed at helping to understand why code clones are concentrated in a part of a software system by generating tag clouds from a collection of identifier names in source code.},
    booktitle = {International {Workshop} on {Innovative} {Software} {Development} {Methodologies} and {Practices}, {InnoSWDev}},
    publisher = {Association for Computing Machinery, Inc},
    author = {Sano, Manamu and Choi, Eunjong and Yoshida, Norihiro and Yamanaka, Yuki and Inoue, Katsuro},
    month = nov,
    year = {2014},
    keywords = {Code clone, Tag cloud},
    pages = {94-99}
    }

  • P. Singh and H. Kaur, “Detection Of Clones from UML Diagrams Unified Modeling Language,” 2014.
    [BibTeX] [Abstract] [PDF]

    Model Driven Engineering has become a standard and important framework in the software research field. UML domain models are conceptual models which are used to design and develop software in the software development life cycle. Unexpected copying of model elements leads to many problems. Models contain design-level similarities and are as harmful for software maintenance as code clones are. So clones need to be detected in UML domain models. This paper introduces an approach to detect clones in UML diagrams. UML diagrams contain redundant elements which increase the complexity and need to be removed. Firstly, UML diagrams are encoded as XML files. Tokens are extracted and matched using a programming technique. The approach is based on finding similarities in tokens, known as clones. In this paper, we tried a program to detect clones from UML class diagrams. Class diagrams are converted to XML format and the proposed program is run on that XML file to detect similar classes. Our approach detects only exactly matching classes, i.e., classes which are 100\% similar to one another in terms of attributes and operations.

    @techreport{singh_detection_2014-1,
    title = {Detection {Of} {Clones} from {UML} {Diagrams} {Unified} {Modeling} {Language}},
    url = {https://www.semanticscholar.org/paper/Detection-Of-Clones-from-UML-Diagrams-Unified-Singh-Kaur/d19537795679d2a375eb9d4bc1d9811eea153122},
    abstract = {Model Driven Engineering has become a standard and important framework in the software research field. UML domain models are conceptual models which are used to design and develop software in the software development life cycle. Unexpected copying of model elements leads to many problems. Models contain design-level similarities and are as harmful for software maintenance as code clones are. So clones need to be detected in UML domain models. This paper introduces an approach to detect clones in UML diagrams. UML diagrams contain redundant elements which increase the complexity and need to be removed. Firstly, UML diagrams are encoded as XML files. Tokens are extracted and matched using a programming technique. The approach is based on finding similarities in tokens, known as clones. In this paper, we tried a program to detect clones from UML class diagrams. Class diagrams are converted to XML format and the proposed program is run on that XML file to detect similar classes. Our approach detects only exactly matching classes, i.e., classes which are 100\% similar to one another in terms of attributes and operations.},
    author = {Singh, Pamilpreet and Kaur, Harpreet},
    year = {2014},
    journal = {IJSRD-International Journal for Scientific Research \& Development},
    volume = {2},
    keywords = {Model Clones, StarUMLTool, XML File},
    pages = {2321-2613}
    }

  • J. Svajlenko, J. F. Islam, I. Keivanloo, C. K. Roy, and M. M. Mia, “Towards a big data curated benchmark of inter-project code clones,” in 2014 ieee international conference on software maintenance and evolution, 2014, pp. 476-480.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6976121,
    author={J. {Svajlenko} and J. F. {Islam} and I. {Keivanloo} and C. K. {Roy} and M. M. {Mia}},
    booktitle={2014 IEEE International Conference on Software Maintenance and Evolution},
    title={Towards a Big Data Curated Benchmark of Inter-project Code Clones},
    year={2014},
    url = {https://ieeexplore.ieee.org/abstract/document/6976121/},
    volume={},
    number={},
    pages={476-480},
    }

  • J. Svajlenko and C. K. Roy, “Evaluating modern clone detection tools,” in 2014 ieee international conference on software maintenance and evolution, 2014, pp. 321-330.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6976098,
    author={J. {Svajlenko} and C. K. {Roy}},
    booktitle={2014 IEEE International Conference on Software Maintenance and Evolution},
    title={Evaluating Modern Clone Detection Tools},
    year={2014},
    url = {https://ieeexplore.ieee.org/abstract/document/6976098/},
    volume={},
    number={},
    pages={321-330},
    }

  • L. Voinea and A. C. Telea, “Visual clone analysis with solidsdd,” in 2014 second ieee working conference on software visualization, 2014, pp. 79-82.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6980217,
    author={L. {Voinea} and A. C. {Telea}},
    booktitle={2014 Second IEEE Working Conference on Software Visualization},
    title={Visual Clone Analysis with SolidSDD},
    year={2014},
    url = {https://ieeexplore.ieee.org/abstract/document/6980217/},
    volume={},
    number={},
    pages={79-82},}

  • W. Wang and M. W. Godfrey, “Recommending clones for refactoring using design, context, and history,” in 2014 ieee international conference on software maintenance and evolution, 2014, pp. 331-340.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6976099,
    author={W. {Wang} and M. W. {Godfrey}},
    booktitle={2014 IEEE International Conference on Software Maintenance and Evolution},
    title={Recommending Clones for Refactoring Using Design, Context, and History},
    year={2014},
    url = {https://ieeexplore.ieee.org/document/6976099},
    volume={},
    number={},
    pages={331-340},
    }

  • X. Wang, Y. Dang, L. Zhang, D. Zhang, E. Lan, and H. Mei, “Predicting consistency-maintenance requirement of code clones at copy-and-paste time,” Ieee transactions on software engineering, vol. 40, iss. 8, pp. 773-794, 2014.
    [BibTeX] [PDF]
    @ARTICLE{6815760,
    author={X. {Wang} and Y. {Dang} and L. {Zhang} and D. {Zhang} and E. {Lan} and H. {Mei}},
    journal={IEEE Transactions on Software Engineering},
    title={Predicting Consistency-Maintenance Requirement of Code Clones at Copy-and-Paste Time},
    year={2014},
    url = {https://ieeexplore.ieee.org/abstract/document/6815760/},
    volume={40},
    number={8},
    pages={773-794},}

  • S. Xie, F. Khomh, Y. Zou, and I. Keivanloo, “An empirical study on the fault-proneness of clone migration in clone genealogies,” in 2014 software evolution week – ieee conference on software maintenance, reengineering, and reverse engineering (csmr-wcre), 2014, pp. 94-103.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6747229,
    author={S. {Xie} and F. {Khomh} and Y. {Zou} and I. {Keivanloo}},
    booktitle={2014 Software Evolution Week - IEEE Conference on Software Maintenance, Reengineering, and Reverse Engineering (CSMR-WCRE)},
    title={An empirical study on the fault-proneness of clone migration in clone genealogies},
    year={2014},
    url = {https://ieeexplore.ieee.org/abstract/document/6747229/},
    volume={},
    number={},
    pages={94-103},
    }

  • L. Yin, L. Zhang, M. Hou, and D. Liu, “A novel approach for predicting the probability of inconsistent changes to code clones based lda,” in Proceedings of the 2014 international conference on computer, communications and information technology, 2014, pp. 118-122. doi:10.2991/ccit-14.2014.32
    [BibTeX] [PDF]
    @inproceedings{Yin2014/01,
    title={A Novel Approach for Predicting the Probability of Inconsistent Changes to Code Clones Based LDA},
    author={Lili Yin and Liping Zhang and Min Hou and Dongsheng Liu},
    year={2014},
    booktitle={Proceedings of the 2014 International Conference on Computer, Communications and Information Technology},
    pages={118-122},
    issn={1951-6851},
    isbn={978-90786-77-97-0},
    url={https://doi.org/10.2991/ccit-14.2014.32},
    doi={https://doi.org/10.2991/ccit-14.2014.32},
    publisher={Atlantis Press}
    }

  • W. Qu, Y. Jia, and M. Jiang, “Pattern mining of cloned codes in software systems,” Inf. sci., vol. 259, p. 544–554, 2014. doi:10.1016/j.ins.2010.04.022
    [BibTeX] [PDF]
    @article{10.1016/j.ins.2010.04.022,
    author = {Qu, Wei and Jia, Yuanyuan and Jiang, Michael},
    title = {Pattern Mining of Cloned Codes in Software Systems},
    year = {2014},
    issue_date = {February, 2014},
    publisher = {Elsevier Science Inc.},
    address = {USA},
    volume = {259},
    issn = {0020-0255},
    url = {https://doi.org/10.1016/j.ins.2010.04.022},
    doi = {10.1016/j.ins.2010.04.022},
    journal = {Inf. Sci.},
    month = feb,
    pages = {544–554},
    numpages = {11},
    keywords = {Software reuse detection, Software engineering, Pattern mining, Software clone detection}
    }

2013

  • M. Mondal, C. K. Roy, and K. A. Schneider, “Insight into a method co-change pattern to identify highly coupled methods: an empirical study,” in 2013 21st international conference on program comprehension (icpc), 2013, pp. 103-112.
    [BibTeX]
    @INPROCEEDINGS{6613838,
    author={M. {Mondal} and C. K. {Roy} and K. A. {Schneider}},
    booktitle={2013 21st International Conference on Program Comprehension (ICPC)},
    title={Insight into a method co-change pattern to identify highly coupled methods: An empirical study},
    year={2013},
    url = {https://ieeexplore.ieee.org/abstract/document/6613838},
    volume={},
    number={},
    pages={103-112},}

  • E. P. Antony, M. H. Alalfi, and J. R. Cordy, “An approach to clone detection in behavioural models,” in 2013 20th working conference on reverse engineering (wcre), 2013, pp. 472-476.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6671325,
    author={E. P. Antony and M. H. Alalfi and J. R. Cordy},
    booktitle={2013 20th Working Conference on Reverse Engineering (WCRE)},
    title={An approach to clone detection in behavioural models},
    year={2013},
    volume={},
    number={},
    pages={472-476},
    url = {http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.414.4555&rep=rep1&type=pdf},
    }

  • C. Dandois and W. Vanhoof, “Semantic code clones in logic programs,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2013, p. 35–50. doi:10.1007/978-3-642-38197-3_4
    [BibTeX] [Abstract] [PDF]

    In this paper, we study what is a semantic code clone pair in a logic program. Unlike our earlier work, that focused on simple syntactic equivalence for defining clones, we propose a more general approximation based on operational semantics and transformation rules. This new definition captures a wider set of clones, and allows to formally define the conditions under which a number of refactorings can be applied. © 2013 Springer-Verlag.

    @inproceedings{dandois_semantic_2013,
    title = {Semantic code clones in logic programs},
    volume = {7844 LNCS},
    isbn = {978-3-642-38196-6},
    doi = {10.1007/978-3-642-38197-3_4},
    abstract = {In this paper, we study what is a semantic code clone pair in a logic program. Unlike our earlier work, that focused on simple syntactic equivalence for defining clones, we propose a more general approximation based on operational semantics and transformation rules. This new definition captures a wider set of clones, and allows to formally define the conditions under which a number of refactorings can be applied. © 2013 Springer-Verlag.},
    booktitle = {Lecture {Notes} in {Computer} {Science} (including subseries {Lecture} {Notes} in {Artificial} {Intelligence} and {Lecture} {Notes} in {Bioinformatics})},
    author = {Dandois, Celine and Vanhoof, Wim},
    year = {2013},
    note = {ISSN: 03029743},
    pages = {35--50},
    url = {https://link.springer.com/chapter/10.1007/978-3-642-38197-3_4}
    }

  • F. Gauthier, T. Lavoie, and E. Merlo, “Uncovering access control weaknesses and flaws with security-discordant software clones,” in Proceedings of the 29th annual computer security applications conference, New York, NY, USA, 2013, p. 209–218. doi:10.1145/2523649.2523650
    [BibTeX] [PDF]
    @inproceedings{10.1145/2523649.2523650,
    author = {Gauthier, Fran\c{c}ois and Lavoie, Thierry and Merlo, Ettore},
    title = {Uncovering Access Control Weaknesses and Flaws with Security-Discordant Software Clones},
    year = {2013},
    isbn = {9781450320153},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2523649.2523650},
    doi = {10.1145/2523649.2523650},
    booktitle = {Proceedings of the 29th Annual Computer Security Applications Conference},
    pages = {209–218},
    numpages = {10},
    keywords = {flaws, PHP, security, clones, measurements, access control},
    location = {New Orleans, Louisiana, USA},
    series = {ACSAC ’13}
    }

  • L. Marks, Y. Zou, and I. Keivanloo, “An empirical study of the factors affecting co-change frequency of cloned code,” in Proceedings of the 2013 conference of the center for advanced studies on collaborative research, USA, 2013, p. 161–175.
    [BibTeX] [PDF]
    @inproceedings{10.5555/2555523.2555541,
    author = {Marks, Lionel and Zou, Ying and Keivanloo, Iman},
    title = {An Empirical Study of the Factors Affecting Co-Change Frequency of Cloned Code},
    year = {2013},
    publisher = {IBM Corp.},
    address = {USA},
    booktitle = {Proceedings of the 2013 Conference of the Center for Advanced Studies on Collaborative Research},
    pages = {161–175},
    numpages = {15},
    location = {Ontario, Canada},
    series = {CASCON ’13},
    url = {https://dl.acm.org/doi/10.5555/2555523.2555541}
    }

  • D. Speicher and A. Bremm, “Clone Removal in Java Programs as a Process of Stepwise Unification,” 26th workshop on logic programming, 2013.
    [BibTeX] [Abstract] [PDF]

    Cloned code is one of the most important obstacles against consistent software maintenance and evolution. Although today’s clone detection tools find a variety of clones, they do not offer any advice how to remove such clones. We explain the problems involved in finding a sequence of changes for clone removal and suggest to view this problem as a process of stepwise unification of the clone instances. Consequently the problem can be solved by backtracking over the possible unification steps.

    @article{speicher_clone_2013,
    title = {Clone {Removal} in {Java} {Programs} as a {Process} of {Stepwise} {Unification}},
    url = {http://arxiv.org/abs/1301.2447},
    abstract = {Cloned code is one of the most important obstacles against consistent software maintenance and evolution. Although today's clone detection tools find a variety of clones, they do not offer any advice how to remove such clones. We explain the problems involved in finding a sequence of changes for clone removal and suggest to view this problem as a process of stepwise unification of the clone instances. Consequently the problem can be solved by backtracking over the possible unification steps.},
    author = {Speicher, Daniel and Bremm, Andri},
    month = jan,
    year = {2013},
    journal ={26th Workshop on Logic Programming},
    note = {\_eprint: 1301.2447}
    }

  • J. Svajlenko, I. Keivanloo, and C. K. Roy, “Scaling classical clone detection tools for ultra-large datasets: an exploratory study,” in 2013 7th international workshop on software clones (iwsc), 2013, pp. 16-22.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6613037,
    author={J. {Svajlenko} and I. {Keivanloo} and C. K. {Roy}},
    booktitle={2013 7th International Workshop on Software Clones (IWSC)},
    title={Scaling classical clone detection tools for ultra-large datasets: An exploratory study},
    year={2013},
    volume={},
    number={},
    pages={16-22},
    url={https://ieeexplore.ieee.org/abstract/document/6613037}
    }

  • S. Xie, F. Khomh, and Y. Zou, “An empirical study of the fault-proneness of clone mutation and clone migration,” in 2013 10th working conference on mining software repositories (msr), 2013, pp. 149-158.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6624022,
    author={S. {Xie} and F. {Khomh} and Y. {Zou}},
    booktitle={2013 10th Working Conference on Mining Software Repositories (MSR)},
    title={An empirical study of the fault-proneness of clone mutation and clone migration},
    year={2013},
    url = {https://ieeexplore.ieee.org/abstract/document/6624022/},
    volume={},
    number={},
    pages={149-158}
    }

  • N. Göde and R. Koschke, “Studying clone evolution using incremental clone detection,” Journal of software: Evolution and Process, vol. 25, iss. 2, pp. 165-192, 2013. doi:10.1002/smr.520
    [BibTeX] [Abstract] [PDF]

    Finding, understanding and managing software clones-passages of duplicated source code-is of large interest in research and practice. Analyzing the evolution of clones across multiple versions of a program adds value to both applications. Although there is an abundance of techniques to detect clones, current approaches are limited to a single version of a program. The current techniques to track clones utilize these single-version approaches and map clones of consecutive versions retroactively. This causes an unnecessary overhead in runtime and may lead to an incorrect mapping due to ambiguity. In this paper, we present an incremental clone detection algorithm, which detects clones based on the results of the previous version’s analysis. It creates a mapping between clones of consecutive versions along with the detection. We evaluated our incremental approach regarding its advantage in runtime as well as the usefulness of the mapping for studies on the clone evolution. Copyright ©2010 John Wiley & Sons, Ltd.

    @article{ode_studying_2013,
    title = {Studying clone evolution using incremental clone detection},
    volume = {25},
    number = {2},
    doi = {10.1002/smr.520},
    url = {https://onlinelibrary.wiley.com/doi/abs/10.1002/smr.520},
    abstract = {Finding, understanding and managing software clones-passages of duplicated source code-is of large interest in research and practice. Analyzing the evolution of clones across multiple versions of a program adds value to both applications. Although there is an abundance of techniques to detect clones, current approaches are limited to a single version of a program. The current techniques to track clones utilize these single-version approaches and map clones of consecutive versions retroactively. This causes an unnecessary overhead in runtime and may lead to an incorrect mapping due to ambiguity. In this paper, we present an incremental clone detection algorithm, which detects clones based on the results of the previous version's analysis. It creates a mapping between clones of consecutive versions along with the detection. We evaluated our incremental approach regarding its advantage in runtime as well as the usefulness of the mapping for studies on the clone evolution. Copyright ©2010 John Wiley \& Sons, Ltd.},
    journal = {Journal of {Software}: {Evolution} and {Process}},
    author = {G{\"o}de, Nils and Koschke, Rainer},
    month = feb,
    year = {2013},
    note = {ISSN: 20477481},
    keywords = {Software maintenance, Clone evolution, Incremental clone detection},
    pages = {165-192}
    }

  • H. Abdeen and O. Shata, “Characterizing and evaluating the impact of software interface clones,” International journal of software engineering & applications (ijsea), vol. 4, iss. 1, 2013. doi:10.5121/ijsea.2013.4106
    [BibTeX] [Abstract] [PDF]

    Software Interfaces are meant to describe contracts governing interactions between logic modules. Interfaces, if well designed, significantly reduce software complexity and ease maintainability. However, as software evolves, the organization and the quality of software interfaces gradually deteriorate. As a consequence, this often leads to increased development cost, lower code quality and reduced reusability. Code clones are one of the most known bad smells in source code. This design defect may occur in interfaces by duplicating method/API declarations in several interfaces. Such interfaces are similar from the point of view of public services/APIs they specify, thus they indicate a bad organization of application services. In this paper, we characterize the interface clone design defect and illustrate it via examples taken from real-world open source software applications. We conduct an empirical study covering nine real-world open source software applications to quantify the presence of interface clones and evaluate their impact on interface design quality. The results of the empirical study show that interface clones are widely present in software interfaces. They also show that the presence of interface clones may cause a degradation of interface cohesion and indicate a considerable presence of code clones at implementations level.

    @article{abdeen_characterizing_2013,
    title = {Characterizing and Evaluating the Impact of Software Interface Clones},
    volume = {4},
    url = {https://arxiv.org/abs/1302.1355},
    doi = {10.5121/ijsea.2013.4106},
    abstract = {Software Interfaces are meant to describe contracts governing interactions between logic modules. Interfaces, if well designed, significantly reduce software complexity and ease maintainability. However, as software evolves, the organization and the quality of software interfaces gradually deteriorate. As a consequence, this often leads to increased development cost, lower code quality and reduced reusability. Code clones are one of the most known bad smells in source code. This design defect may occur in interfaces by duplicating method/API declarations in several interfaces. Such interfaces are similar from the point of view of public services/APIs they specify, thus they indicate a bad organization of application services. In this paper, we characterize the interface clone design defect and illustrate it via examples taken from real-world open source software applications. We conduct an empirical study covering nine real-world open source software applications to quantify the presence of interface clones and evaluate their impact on interface design quality. The results of the empirical study show that interface clones are widely present in software interfaces. They also show that the presence of interface clones may cause a degradation of interface cohesion and indicate a considerable presence of code clones at implementations level.},
    number = {1},
    journal = {International Journal of Software Engineering \& Applications (IJSEA)},
    author = {Abdeen, Hani and Shata, Osama},
    year = {2013},
    keywords = {Code Clones, Interface Clones, Interface Cohesion, Interface Design Quality, Software Interfaces}
    }

  • M. Akhin and V. Itsykson, “Tree slicing: Finding intertwined and gapped clones in one simple step,” Automatic control and computer sciences, vol. 47, iss. 7, pp. 427-432, 2013. doi:10.3103/S0146411613070171
    [BibTeX] [Abstract] [PDF]

    Most of software nowadays contain code duplication that leads to serious problems in software maintenance. A lot of different clone detection approaches have been proposed over the years to deal with this problem, but almost all of them do not consider semantic properties of the source code. We propose to reinforce traditional tree-based clone detection algorithm by using additional information about variable slices. This allows to find intertwined/gapped clones on variables; preliminary evaluation confirms applicability of our approach to real-world software. ©Allerton Press, Inc., 2013.

    @article{akhin_tree_2013,
    title = {Tree slicing: {Finding} intertwined and gapped clones in one simple step},
    volume = {47},
    url = {https://link.springer.com/article/10.3103/S0146411613070171},
    doi = {10.3103/S0146411613070171},
    abstract = {Most of software nowadays contain code duplication that leads to serious problems in software maintenance. A lot of different clone detection approaches have been proposed over the years to deal with this problem, but almost all of them do not consider semantic properties of the source code. We propose to reinforce traditional tree-based clone detection algorithm by using additional information about variable slices. This allows to find intertwined/gapped clones on variables; preliminary evaluation confirms applicability of our approach to real-world software. ©Allerton Press, Inc., 2013.},
    number = {7},
    journal = {Automatic Control and Computer Sciences},
    author = {Akhin, M. and Itsykson, V.},
    month = dec,
    year = {2013},
    keywords = {Software maintenance, Clone detection, Tree patterns, Tree slicing},
    pages = {427-432}
    }

  • N. Baliyan, V. Sharma, and Shivani, “A fuzzy model for high-level clones in software,” Sigsoft softw. eng. notes, vol. 38, iss. 3, p. 1–4, 2013. doi:10.1145/2464526.2464531
    [BibTeX] [PDF]
    @article{10.1145/2464526.2464531,
    author = {Baliyan, Niyati and Sharma, Vidushi and Shivani},
    title = {A Fuzzy Model for High-Level Clones in Software},
    year = {2013},
    issue_date = {May 2013},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    volume = {38},
    number = {3},
    issn = {0163-5948},
    url = {https://doi.org/10.1145/2464526.2464531},
    doi = {10.1145/2464526.2464531},
    journal = {SIGSOFT Softw. Eng. Notes},
    month = may,
    pages = {1–4},
    numpages = {4},
    keywords = {collocated simple clones, concept clones, fuzzy model, domain model clones, behavior clones, software similarity}
    }

  • S. Bazrafshan, “Late propagation of type-3 clones,” Softwaretechnik-trends, vol. 32, 2013. doi:10.1007/BF03323465
    [BibTeX] [PDF]
    @article{bazrafshan_late_2013,
    author = {Bazrafshan, Saman},
    year = {2013},
    month = {05},
    pages = {},
    title = {Late propagation of Type-3 Clones},
    volume = {32},
    url = {https://www.researchgate.net/publication/267856869_Late_propagation_of_Type-3_Clones},
    journal = {Softwaretechnik-Trends},
    doi = {10.1007/BF03323465}
    }

  • Y. Bian, G. Koru, X. Su, and P. Ma, “Spape: a semantic-preserving amorphous procedure extraction method for near-miss clones,” Journal of systems and software, pp. 2077-2093, 2013.
    [BibTeX] [PDF]
    @article{bian_spape_nodate,
    title = {SPAPE: A semantic-preserving amorphous procedure extraction method for near-miss clones},
    url = {https://www.sciencedirect.com/science/article/pii/S0164121213000733},
    journal = {Journal of Systems and Software},
    author = {Yixin Bian and Gunes Koru and Xiaohong Su and Peijun Ma},
    pages = {2077-2093},
    year = {2013}
    }

  • M. S. Uddin, C. K. Roy, and K. A. Schneider, “Simcad: an extensible and faster clone detection tool for large scale software systems,” in 2013 21st international conference on program comprehension (icpc), 2013, pp. 236-238.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6613857,
    author={M. S. {Uddin} and C. K. {Roy} and K. A. {Schneider}},
    booktitle={2013 21st International Conference on Program Comprehension (ICPC)},
    title={SimCad: An extensible and faster clone detection tool for large scale software systems},
    year={2013},
    url = {https://ieeexplore.ieee.org/document/6613857},
    volume={},
    number={},
    pages={236-238},
    }

  • D. Chatterji, J. C. Carver, N. A. Kraft, and J. Harder, “Effects of cloned code on software maintainability: a replicated developer study,” in 2013 20th working conference on reverse engineering (wcre), 2013, pp. 112-121.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6671286,
    author={Chatterji, Debarshi and Carver, Jeffrey C and Kraft, Nicholas A and Harder, Jan},
    booktitle={2013 20th Working Conference on Reverse Engineering (WCRE)},
    title={Effects of cloned code on software maintainability: A replicated developer study},
    year={2013},
    url = {https://ieeexplore.ieee.org/abstract/document/6671286/},
    volume={},
    number={},
    pages={112-121},
    }

  • A. Corazza, S. Di Martino, V. Maggio, A. Moschitti, A. Passerini, G. Scanniello, and F. Silvestri, “Using Machine Learning and Information Retrieval Techniques to Improve Software Maintainability,” in Communications in Computer and Information Science, 2013, pp. 117-134. doi:10.1007/978-3-642-45260-4_9
    [BibTeX] [PDF]
    @inproceedings{corazza_using_2013,
    title = {Using {Machine} {Learning} and {Information} {Retrieval} {Techniques} to {Improve} {Software} {Maintainability}},
    volume = {379 CCIS},
    url = {https://link.springer.com/chapter/10.1007/978-3-642-45260-4_9},
    isbn = {978-3-642-45259-8},
    doi = {10.1007/978-3-642-45260-4_9},
    booktitle = {Communications in {Computer} and {Information} {Science}},
    publisher = {Springer Verlag},
    author = {Corazza, Anna and Di Martino, Sergio and Maggio, Valerio and Moschitti, Alessandro and Passerini, Andrea and Scanniello, Giuseppe and Silvestri, Fabrizio},
    year = {2013},
    note = {ISSN: 18650929},
    keywords = {Machine Learning, Information Retrieval, Natural Language Processing, Software Maintenance and Evolution},
    pages = {117-134}
    }

  • A. El-Matarawy, M. El-Ramly, and R. Bahgat, “Parallel and distributed code clone detection using sequential pattern mining,” International journal of computer applications, vol. 62, iss. 10, pp. 975-8887, 2013. doi:10.5120/10118-4792
    [BibTeX] [PDF]
    @article{el-matarawy_parallel_2013,
    title = {Parallel and Distributed Code Clone Detection using Sequential Pattern Mining},
    volume = {62},
    url = {https://www.researchgate.net/publication/258789775_Parallel_and_Distributed_Code_Clone_Detection_using_Sequential_Pattern_Mining},
    doi = {10.5120/10118-4792},
    number = {10},
    journal = {International Journal of Computer Applications},
    author = {El-Matarawy, Ali and El-Ramly, Mohammad and Bahgat, Reem},
    year = {2013},
    keywords = {Code clones, data mining, apriori property, clone relation terminologies, clone types, distributed code clone detector, lexical approach, parallel code clone detector, sequential pattern mining, syntactic approach, textual approach},
    pages = {975-8887},
    }

  • T. Görg, “A model-based approach to type-3 clone elimination,” Softwaretechnik-trends, vol. 32, 2013. doi:10.1007/BF03323467
    [BibTeX] [PDF]
    @article{gorg_model_2013,
    author = {Görg, Torsten},
    year = {2013},
    month = {05},
    pages = {},
    title = {A Model-Based Approach to Type-3 Clone Elimination},
    volume = {32},
    url = {https://www.researchgate.net/publication/268370115_A_Model-Based_Approach_to_Type-3_Clone_Elimination},
    journal = {Softwaretechnik-Trends},
    doi = {10.1007/BF03323467}
    }

  • F. Hermans, B. Sedee, M. Pinzger, and A. Van Deursen, “Data clone detection and visualization in spreadsheets,” in 2013 35th international conference on software engineering (icse), 2013, pp. 292-301.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6606575,
    author={Hermans, Felienne and Sedee, Ben and Pinzger, Martin and Van Deursen, Arie},
    booktitle={2013 35th International Conference on Software Engineering (ICSE)},
    title={Data clone detection and visualization in spreadsheets},
    year={2013},
    url = {https://ieeexplore.ieee.org/document/6606575},
    volume={},
    number={},
    pages={292-301},
    }

  • H. Murakami, K. Hotta, Y. Higo, H. Igaki, and S. Kusumoto, “Gapped code clone detection with lightweight source code analysis,” in 2013 21st international conference on program comprehension (icpc), 2013, pp. 93-102.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6613837,
    author={H. {Murakami} and K. {Hotta} and Y. {Higo} and H. {Igaki} and S. {Kusumoto}},
    booktitle={2013 21st International Conference on Program Comprehension (ICPC)},
    title={Gapped code clone detection with lightweight source code analysis},
    year={2013},
    url = {https://ieeexplore.ieee.org/document/6613837},
    volume={},
    number={},
    pages={93-102},}

  • Y. Higo and S. Kusumoto, “Identifying clone removal opportunities based on co-evolution analysis,” in Proceedings of the 2013 international workshop on principles of software evolution, New York, NY, USA, 2013, p. 63–67. doi:10.1145/2501543.2501552
    [BibTeX] [PDF]
    @inproceedings{10.1145/2501543.2501552,
    author = {Higo, Yoshiki and Kusumoto, Shinji},
    title = {Identifying Clone Removal Opportunities Based on Co-Evolution Analysis},
    year = {2013},
    isbn = {9781450323116},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2501543.2501552},
    doi = {10.1145/2501543.2501552},
    booktitle = {Proceedings of the 2013 International Workshop on Principles of Software Evolution},
    pages = {63–67},
    numpages = {5},
    keywords = {Refactoring, Code clone, Co-evolution analysis},
    location = {Saint Petersburg, Russia},
    series = {IWPSE 2013}
    }

  • T. Kamiya, “Agec: an execution-semantic clone detection tool,” in 2013 21st international conference on program comprehension (icpc), 2013, pp. 227-229.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6613854,
    author={T. {Kamiya}},
    booktitle={2013 21st International Conference on Program Comprehension (ICPC)},
    title={Agec: An execution-semantic clone detection tool},
    year={2013},
    url = {https://ieeexplore.ieee.org/document/6613854},
    volume={},
    number={},
    pages={227-229},
    }

  • J. Kim and B. Moon, “Disguised malware script detection system using hybrid genetic algorithm,” in Proceedings of the 28th annual acm symposium on applied computing, New York, NY, USA, 2013, p. 182–187. doi:10.1145/2480362.2480401
    [BibTeX] [PDF]
    @inproceedings{10.1145/2480362.2480401,
    author = {Kim, Jinhyun and Moon, Byung-Ro},
    title = {Disguised Malware Script Detection System Using Hybrid Genetic Algorithm},
    year = {2013},
    isbn = {9781450316569},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2480362.2480401},
    doi = {10.1145/2480362.2480401},
    booktitle = {Proceedings of the 28th Annual ACM Symposium on Applied Computing},
    pages = {182–187},
    numpages = {6},
    keywords = {metric-based method, malware detection, malware disguise techniques, hybrid genetic algorithm},
    location = {Coimbra, Portugal},
    series = {SAC ’13}
    }

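  • X. Li, T. Wang, X. Su, and P. Ma, “Functionally Equivalent Clone Detection Using IOT-Behavior Algorithm,” in Proceedings of the 2013 international conference on artificial intelligence and software engineering (icaise), 2013.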
    [BibTeX] [Abstract] [PDF]

    This paper presents an algorithm for the detection of functionally equivalent code clones in C code. The functionally equivalent code clones is the forth type of clones, which means that two or more code fragments that do the same calculation but with different syntax. Thus, we can detect the functionally equivalent clones by observing the input-output behavior. We propose the definition of input-output behavior by including not only the values of input-output, but also the number and types of input sets and output sets. We call this as the IOT (input, output and types)-Behavior of C code fragment. Our algorithm has been tested in open source code.

    @inproceedings{li_functionally_2013,
    title = {Functionally {Equivalent} {Clone} {Detection} {Using} {IOT}-{Behavior} {Algorithm}},
    url = {https://www.atlantis-press.com/proceedings/icaise-13/6969},
    abstract = {This paper presents an algorithm for the detection of functionally equivalent code clones in C code. The functionally equivalent code clones is the forth type of clones, which means that two or more code fragments that do the same calculation but with different syntax. Thus, we can detect the functionally equivalent clones by observing the input-output behavior. We propose the definition of input-output behavior by including not only the values of input-output, but also the number and types of input sets and output sets. We call this as the IOT (input, output and types)-Behavior of C code fragment. Our algorithm has been tested in open source code.},
    author = {Li, Xia and Wang, Tiantian and Su, Xiaohong and Ma, Peijun},
    year = {2013},
    booktitle = {Proceedings of the 2013 International Conference on Artificial Intelligence and Software Engineering (ICAISE)},
    keywords = {Clone Detection, Functionally Equivalent Clones, IOT-Behavior}
    }

  • A. Mahmoud and N. Niu, “Supporting requirements to code traceability through refactoring,” Requirements engineering, pp. 309-329, 2013. doi:10.1007/s00766-013-0197-0
    [BibTeX] [Abstract] [PDF]

    In this paper, we hypothesize that the distorted traceability tracks of a software system can be systematically re-established through refactoring, a set of behavior-preserving transformations for keeping the system quality under control during evolution. To test our hypothesis, we conduct an experimental analysis using three requirements-to-code datasets from various application domains. Our objective is to assess the impact of various refactoring methods on the performance of automated tracing tools based on information retrieval. Results show that renaming inconsistently named code identifiers, using RENAME IDENTIFIER refactoring, often leads to improvements in traceability. In contrast, removing code clones, using EXTRACT METHOD (XM) refactoring, is found to be detrimental. In addition, results show that moving misplaced code fragments, using MOVE METHOD refactoring, has no significant impact on trace link retrieval. We further evaluate RENAME IDENTIFIER refactoring by comparing its performance with other strategies often used to overcome the vocabulary mismatch problem in software artifacts. In addition, we propose and evaluate various techniques to mitigate the negative impact of XM refactoring. An effective traceability sign analysis is also conducted to quantify the effect of these refactoring methods on the vocabulary structure of software systems.

    @article{mahmoud_supporting_2013,
    title = {Supporting requirements to code traceability through refactoring},
    url = {https://doi.org/10.1007/s00766-013-0197-0},
    doi = {10.1007/s00766-013-0197-0},
    abstract = {In this paper, we hypothesize that the distorted traceability tracks of a software system can be systematically re-established through refactoring, a set of behavior-preserving transformations for keeping the system quality under control during evolution. To test our hypothesis, we conduct an experimental analysis using three requirements-to-code datasets from various application domains. Our objective is to assess the impact of various refactoring methods on the performance of automated tracing tools based on information retrieval. Results show that renaming inconsistently named code identifiers, using RENAME IDENTIFIER refactoring, often leads to improvements in traceability. In contrast, removing code clones, using EXTRACT METHOD (XM) refactoring, is found to be detrimental. In addition, results show that moving misplaced code fragments, using MOVE METHOD refactoring, has no significant impact on trace link retrieval. We further evaluate RENAME IDENTIFIER refactoring by comparing its performance with other strategies often used to overcome the vocabulary mismatch problem in software artifacts. In addition, we propose and evaluate various techniques to mitigate the negative impact of XM refactoring. An effective traceability sign analysis is also conducted to quantify the effect of these refactoring methods on the vocabulary structure of software systems.},
    journal = {Requirements Engineering},
    author = {Mahmoud, Anas and Niu, Nan},
    year = {2013},
    pages = {309-329},
    keywords = {Refactoring, Information retrieval, Traceability}
    }

  • T. Muhammad, M. F. Zibran, Y. Yamamoto, and C. K. Roy, “Near-miss clone patterns in web applications: an empirical study with industrial systems,” in 2013 26th IEEE Canadian Conference on Electrical and Computer Engineering (CCECE), 2013, pp. 1-6.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6567821,
    author={T. {Muhammad} and M. F. {Zibran} and Y. {Yamamoto} and C. K. {Roy}},
    booktitle={2013 26th IEEE Canadian Conference on Electrical and Computer Engineering (CCECE)},
    title={Near-miss clone patterns in web applications: An empirical study with industrial systems},
    year={2013},
    url = {https://ieeexplore.ieee.org/abstract/document/6567821},
    volume={},
    number={},
    pages={1-6},}

  • A. Hanjalić, “ClonEvol: visualizing software evolution with code clones,” in 2013 First IEEE Working Conference on Software Visualization (VISSOFT), 2013, pp. 1-4.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6650525,
    author={A. {Hanjalić}},
    booktitle={2013 First IEEE Working Conference on Software Visualization (VISSOFT)},
    title={ClonEvol: Visualizing software evolution with code clones},
    year={2013},
    url = {https://ieeexplore.ieee.org/abstract/document/6650525/},
    volume={},
    number={},
    pages={1-4},}

  • J. Harder, “How multiple developers affect the evolution of code clones,” in 2013 IEEE International Conference on Software Maintenance, 2013, pp. 30-39.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6676874,
    author={J. {Harder}},
    booktitle={2013 IEEE International Conference on Software Maintenance},
    title={How Multiple Developers Affect the Evolution of Code Clones},
    year={2013},
    url = {https://ieeexplore.ieee.org/abstract/document/6676874/},
    volume={},
    number={},
    pages={30-39},}

  • K. Raheja and R. K. Tekchandani, “An efficient code clone detection model on Java byte code using hybrid approach,” in IET Conference Publications, 2013, pp. 16-21. doi:10.1049/cp.2013.2287
    [BibTeX] [Abstract] [PDF]

    The objective of this study is to understand and analyse the concept of software Cloning and its detection. Software cloning is a perception in which source code is duplicated. Code clones and its detection is one of the emerging and most dominant areas of research in the field of software engineering. There exist a number of techniques to detect clone in software. The aim of this study will be given on acquiring and analysing the concept of hybrid clone detection technique. An algorithm is devised for detecting duplicity in the software by using hybrid software clone detection technique. This algorithm will first compute the required software metrics that provide sufficient information regarding the software application and then depending on software metrics matches the potential clone will be detected. It uses byte code to calculate the metrics of Java source code instead of using any intermediate representation. The reason of using byte code is that it is platform independent and represents the unified structure of the code. While detecting clones token based approach is applied on potential clones.

    @inproceedings{raheja_efficient_2013,
    title = {An efficient code clone detection model on {Java} byte code using hybrid approach},
    volume = {2013},
    url = {https://ieeexplore.ieee.org/document/6832302},
    isbn = {978-1-84919-846-2},
    doi = {10.1049/cp.2013.2287},
    abstract = {The objective of this study is to understand and analyse the concept of software Cloning and its detection. Software cloning is a perception in which source code is duplicated. Code clones and its detection is one of the emerging and most dominant areas of research in the field of software engineering. There exist a number of techniques to detect clone in software. The aim of this study will be given on acquiring and analysing the concept of hybrid clone detection technique. An algorithm is devised for detecting duplicity in the software by using hybrid software clone detection technique. This algorithm will first compute the required software metrics that provide sufficient information regarding the software application and then depending on software metrics matches the potential clone will be detected. It uses byte code to calculate the metrics of Java source code instead of using any intermediate representation. The reason of using byte code is that it is platform independent and represents the unified structure of the code. While detecting clones token based approach is applied on potential clones.},
    booktitle = {{IET} {Conference} {Publications}},
    publisher = {Institution of Engineering and Technology},
    author = {Raheja, Kanika and Tekchandani, Raj Kumar},
    year = {2013},
    note = {Issue: 647 CP},
    keywords = {Clone detection, Hybrid approach, Byte code, Metrics computation, Potential clones},
    pages = {16-21}
    }

  • M. S. Rahman, A. Aryani, C. K. Roy, and F. Perin, “On the relationships between domain-based coupling and code clones: an exploratory study,” in 2013 35th International Conference on Software Engineering (ICSE), 2013, pp. 1265-1268.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6606694,
    author={M. S. {Rahman} and A. {Aryani} and C. K. {Roy} and F. {Perin}},
    booktitle={2013 35th International Conference on Software Engineering (ICSE)},
    title={On the relationships between domain-based coupling and code clones: An exploratory study},
    year={2013},
    url = {https://ieeexplore.ieee.org/abstract/document/6606694/},
    volume={},
    number={},
    pages={1265-1268},}

  • D. Rattan, R. Bhatia, and M. Singh, “Software clone detection: a systematic review,” Information and Software Technology, vol. 55, iss. 7, pp. 1165-1199, 2013. doi:10.1016/j.infsof.2013.01.008
    [BibTeX] [PDF]
    @article{RATTAN20131165,
    title = "Software clone detection: A systematic review",
    journal = "Information and Software Technology",
    volume = "55",
    number = "7",
    pages = "1165-1199",
    year = "2013",
    issn = "0950-5849",
    doi = "10.1016/j.infsof.2013.01.008",
    url = "http://www.sciencedirect.com/science/article/pii/S0950584913000323",
    author = "Dhavleesh Rattan and Rajesh Bhatia and Maninder Singh",
    keywords = "Software clone, Clone detection, Systematic literature review, Semantic clones, Model based clone",
    }

  • R. K. Saha, C. K. Roy, and K. A. Schneider, “gCad: a near-miss clone genealogy extractor to support clone evolution analysis,” in 2013 IEEE International Conference on Software Maintenance, 2013, pp. 488-491.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6676939,
    author={R. K. {Saha} and C. K. {Roy} and K. A. {Schneider}},
    booktitle={2013 IEEE International Conference on Software Maintenance},
    title={gCad: A Near-Miss Clone Genealogy Extractor to Support Clone Evolution Analysis},
    year={2013},
    url = {https://ieeexplore.ieee.org/abstract/document/6676939/},
    volume={},
    number={},
    pages={488-491},
    }

  • R. K. Saha, C. K. Roy, K. A. Schneider, and D. E. Perry, “Understanding the evolution of Type-3 clones: an exploratory study,” in 2013 10th Working Conference on Mining Software Repositories (MSR), 2013, pp. 139-148.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6624021,
    author={R. K. {Saha} and C. K. {Roy} and K. A. {Schneider} and D. E. {Perry}},
    booktitle={2013 10th Working Conference on Mining Software Repositories (MSR)},
    title={Understanding the evolution of Type-3 clones: An exploratory study},
    year={2013},
    url = {https://ieeexplore.ieee.org/abstract/document/6624021/},
    volume={},
    number={},
    pages={139-148},
    }

  • S. Schulze and D. Meyer, “On the robustness of clone detection to code obfuscation,” 7th International Workshop on Software Clones (IWSC), pp. 62-68, 2013. doi:10.1109/IWSC.2013.6613045
    [BibTeX] [Abstract] [PDF]

    Code clones are a common reuse mechanism in software development. While there is an ongoing discussion about harmfulness and advantages of code cloning, this discussion is mainly centered around aspects of software quality. However, recent research has shown, that code cloning may have legal implications as well such as license violations. From this point of view, a developer may favor to hide his cloning activities. To this end, he could obfuscate the cloned code to deceive clone detectors. However, it is unknown how robust certain clone detection techniques are against code obfuscations. In this paper, we present a framework for semi-automated code obfuscations. Additionally, we present a case study to evaluate the robustness of selected clone detectors against such obfuscations.

    @article{schulze_robustness_2013,
    title = {On the robustness of clone detection to code obfuscation},
    url = {https://www.researchgate.net/publication/261447651},
    doi = {10.1109/IWSC.2013.6613045},
    abstract = {Code clones are a common reuse mechanism in software development. While there is an ongoing discussion about harmfulness and advantages of code cloning, this discussion is mainly centered around aspects of software quality. However, recent research has shown, that code cloning may have legal implications as well such as license violations. From this point of view, a developer may favor to hide his cloning activities. To this end, he could obfuscate the cloned code to deceive clone detectors. However, it is unknown how robust certain clone detection techniques are against code obfuscations. In this paper, we present a framework for semi-automated code obfuscations. Additionally, we present a case study to evaluate the robustness of selected clone detectors against such obfuscations.},
    journal = {7th International Workshop on Software Clones (IWSC)},
    author = {Schulze, Sandro and Meyer, Daniel},
    year = {2013},
    pages = {62-68},
    note = {ISBN: 9781467364454}
    }

  • Qing Qing Shi, Li Ping Zhang, Fan Jun Meng, and Dong Sheng Liu, “A novel detection approach for statement clones,” in 2013 IEEE 4th International Conference on Software Engineering and Service Science, 2013, pp. 27-30.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6615249,
    author={ {Qing Qing Shi} and {Li Ping Zhang} and {Fan Jun Meng} and {Dong Sheng Liu}},
    booktitle={2013 IEEE 4th International Conference on Software Engineering and Service Science},
    title={A novel detection approach for statement clones},
    year={2013},
    url = {https://ieeexplore.ieee.org/abstract/document/6615249/},
    volume={},
    number={},
    pages={27-30},}

  • M. Singh and V. Sharma, “High Level Clones Classification,” International Journal of Engineering and Advanced Technology (IJEAT), no. 2, 2013.
    [BibTeX] [Abstract] [PDF]

    In present time’s High level clones (HLC) is an emerging concept that uses a hierarchical organization of fine-grained clone fragments (Simple clones) to form coarser-grained clones (High Level Clone). Different research groups categorize clones with respect to different contexts. In this paper we review all such available categories of clones and present them in the form of a High Level Clone Classification. Classification can serve various purposes like studying the more frequently occurring high level clones, prioritizing different types of high level clones, devising re-engineering strategies for different types of high level clones etc. For this classification of HLC we develop a fuzzy rule-based system and also visualize the results.

    @techreport{singh_high_2013,
    title = {High {Level} {Clones} {Classification}},
    url = {http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.674.5726&rep=rep1&type=pdf},
    abstract = {In present time's High level clones (HLC) is an emerging concept that uses a hierarchical organization of fine-grained clone fragments (Simple clones) to form coarser-grained clones (High Level Clone). Different research groups categorize clones with respect to different contexts. In this paper we review all such available categories of clones and present them in the form of a High Level Clone Classification. Classification can serve various purposes like studying the more frequently occurring high level clones, prioritizing different types of high level clones, devising re-engineering strategies for different types of high level clones etc. For this classification of HLC we develop a fuzzy rule-based system and also visualize the results.},
    number = {2},
    author = {Singh, Manu and Sharma, Vidushi},
    year = {2013},
    journal = {International Journal of Engineering and Advanced Technology (IJEAT)},
    keywords = {High Level Clones, Fuzzy rule-based system, Fuzzy Inference System, Classification of High Level Clone},
    issn = {2249-8958}
    }

  • D. Steidl and N. Göde, “Feature-based detection of bugs in clones,” in Proceedings of the 7th International Workshop on Software Clones, 2013, pp. 76–82.
    [BibTeX] [PDF]
    @inproceedings{10.5555/2662708.2662724,
    author = {Steidl, Daniela and G\"{o}de, Nils},
    title = {Feature-Based Detection of Bugs in Clones},
    year = {2013},
    isbn = {9781467364454},
    url = {https://dl.acm.org/doi/10.5555/2662708.2662724},
    publisher = {IEEE Press},
    booktitle = {Proceedings of the 7th International Workshop on Software Clones},
    pages = {76–82},
    numpages = {7},
    keywords = {bug detection, code clones, software quality},
    location = {San Francisco, California},
    series = {IWSC ’13}
    }

  • M. Stephan, M. H. Alalfi, A. Stevenson, and J. R. Cordy, “Using mutation analysis for a model-clone detector comparison framework,” in 2013 35th International Conference on Software Engineering (ICSE), 2013, pp. 1261-1264.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6606693,
    author={M. {Stephan} and M. H. {Alalfi} and A. {Stevenson} and J. R. {Cordy}},
    booktitle={2013 35th International Conference on Software Engineering (ICSE)},
    url = {https://ieeexplore.ieee.org/abstract/document/6606693/},
    title={Using mutation analysis for a model-clone detector comparison framework},
    year={2013},
    volume={},
    number={},
    pages={1261-1264},
    }

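  • A. Surendran, P. Samuel, and P. K. Jacob, “Code Clones in Program Test Sequence Identification,” International Journal of Computer Information Systems and Industrial Management Applications, vol. 5, pp. 564-570, 2013.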
    [BibTeX] [Abstract] [PDF]

    Code clones are portions of source code which are similar to the original program code. The presence of code clones is considered as a bad feature of software as the maintenance of software becomes difficult due to the presence of code clones. Methods for code clone detection have gained immense significance in the last few years as they play a significant role in engineering applications such as analysis of program code, program understanding, plagiarism detection, error detection, code compaction and many more similar tasks. Despite of all these facts, several features of code clones if properly utilized can make software development process easier. In this work, we have pointed out such a feature of code clones which highlight the relevance of code clones in test sequence identification. Here program slicing is used in code clone detection. In addition, a classification of code clones is presented and the benefit of using program slicing in code clone detection is also mentioned in this work.

    @techreport{surendran_code_2013,
    title = {Code {Clones} in {Program} {Test} {Sequence} {Identification}},
    url = {www.mirlabs.net/ijcisim/index.html},
    abstract = {Code clones are portions of source code which are similar to the original program code. The presence of code clones is considered as a bad feature of software as the maintenance of software becomes difficult due to the presence of code clones. Methods for code clone detection have gained immense significance in the last few years as they play a significant role in engineering applications such as analysis of program code, program understanding, plagiarism detection, error detection, code compaction and many more similar tasks. Despite of all these facts, several features of code clones if properly utilized can make software development process easier. In this work, we have pointed out such a feature of code clones which highlight the relevance of code clones in test sequence identification. Here program slicing is used in code clone detection. In addition, a classification of code clones is presented and the benefit of using program slicing in code clone detection is also mentioned in this work.},
    author = {Surendran, Anupama and Samuel, Philip and Jacob, K Poulose},
    year = {2013},
    journal = {International Journal of Computer Information Systems and Industrial Management Applications},
    volume = {5},
    keywords = {Code clone, program slicing, static slicing, test sequence},
    pages = {564-570}
    }

  • J. Svajlenko, C. K. Roy, and J. R. Cordy, “A mutation analysis based benchmarking framework for clone detectors,” in 2013 7th International Workshop on Software Clones (IWSC), 2013, pp. 8-9.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6613033,
    author={J. {Svajlenko} and C. K. {Roy} and J. R. {Cordy}},
    booktitle={2013 7th International Workshop on Software Clones (IWSC)},
    title={A mutation analysis based benchmarking framework for clone detectors},
    year={2013},
    url = {https://ieeexplore.ieee.org/abstract/document/6613033/},
    volume={},
    number={},
    pages={8-9},}

  • R. Tekchandani, R. K. Bhatia, and M. Singh, “Semantic code clone detection using parse trees and grammar recovery,” in IET Conference Publications, 2013, pp. 41-46. doi:10.1049/cp.2013.2291
    [BibTeX] [Abstract] [PDF]

    Code cloning is the common requirement for most of the software applications. Code clones are the similar code fragments that exist at different locations in a software system. This type of reuse approach of existing code is called code cloning and the pasted code fragment is called as clone of the original. Code duplication exists in one of the two categories: Syntactic or semantic. Existing techniques of semantic code clone detection deals with program dependence graphs. In this paper, we proposed an algorithm that finds the semantic code clones on the basis of parse trees and formal grammars. This paper finds the similar code fragments those are structurally divergent but semantically equivalent on the basis of parse trees and grammar recovery. It also provides the design and implementation of proposed approach followed by results.

    @inproceedings{tekchandani_semantic_2013,
    title = {Semantic code clone detection using parse trees and grammar recovery},
    volume = {2013},
    isbn = {978-1-84919-846-2},
    doi = {10.1049/cp.2013.2291},
    url = {https://ieeexplore.ieee.org/document/6832306},
    abstract = {Code cloning is the common requirement for most of the software applications. Code clones are the similar code fragments that exist at different locations in a software system. This type of reuse approach of existing code is called code cloning and the pasted code fragment is called as clone of the original. Code duplication exists in one of the two categories: Syntactic or semantic. Existing techniques of semantic code clone detection deals with program dependence graphs. In this paper, we proposed an algorithm that finds the semantic code clones on the basis of parse trees and formal grammars. This paper finds the similar code fragments those are structurally divergent but semantically equivalent on the basis of parse trees and grammar recovery. It also provides the design and implementation of proposed approach followed by results.},
    booktitle = {{IET} {Conference} {Publications}},
    publisher = {Institution of Engineering and Technology},
    author = {Tekchandani, Rajkumar and Bhatia, Rajesh Kumar and Singh, Maninder},
    year = {2013},
    note = {Issue: 647 CP},
    keywords = {Code clones, Semantic code clones, Grammar recovery, Parse trees},
    pages = {41-46}
    }

  • E. Tempero, “Towards a curated collection of code clones,” in 2013 7th International Workshop on Software Clones (IWSC), 2013, pp. 53-59.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6613043,
    author={E. {Tempero}},
    booktitle={2013 7th International Workshop on Software Clones (IWSC)},
    title={Towards a curated collection of code clones},
    year={2013},
    url = {https://ieeexplore.ieee.org/abstract/document/6613043/},
    volume={},
    number={},
    pages={53-59},}

  • Z. Xing, Y. Xue, and S. Jarzabek, “Distilling useful clones by contextual differencing,” in 2013 20th Working Conference on Reverse Engineering (WCRE), 2013, pp. 102-111.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6671285,
    author={Z. {Xing} and Y. {Xue} and S. {Jarzabek}},
    booktitle={2013 20th Working Conference on Reverse Engineering (WCRE)},
    title={Distilling useful clones by contextual differencing},
    url = {https://ieeexplore.ieee.org/abstract/document/6671285/},
    year={2013},
    volume={},
    number={},
    pages={102-111},}

  • M. F. Zibran, R. K. Saha, C. K. Roy, and K. A. Schneider, “Genealogical insights into the facts and fictions of clone removal,” SIGAPP Appl. Comput. Rev., vol. 13, iss. 4, pp. 30–42, 2013. doi:10.1145/2577554.2577559
    [BibTeX] [PDF]
    @article{10.1145/2577554.2577559,
    author = {Zibran, Minhaz F. and Saha, Ripon K. and Roy, Chanchal K. and Schneider, Kevin A.},
    title = {Genealogical Insights into the Facts and Fictions of Clone Removal},
    year = {2013},
    issue_date = {December 2013},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    volume = {13},
    number = {4},
    issn = {1559-6915},
    url = {https://doi.org/10.1145/2577554.2577559},
    doi = {10.1145/2577554.2577559},
    journal = {SIGAPP Appl. Comput. Rev.},
    month = dec,
    pages = {30–42},
    numpages = {13},
    keywords = {clone evolution, clone removal, reengineering, refactoring}
    }

  • M. F. Zibran, R. K. Saha, C. K. Roy, and K. A. Schneider, “Evaluating the conventional wisdom in clone removal: A genealogy-based empirical study,” in Proceedings of the ACM Symposium on Applied Computing, 2013, pp. 1123-1130. doi:10.1145/2480362.2480573
    [BibTeX] [Abstract] [PDF]

    Clone management has drawn immense interest from the research community in recent years. It is recognized that a deep understanding of how code clones change and are refactored is necessary for devising effective clone management tools and techniques. This paper presents an empirical study based on the clone genealogies from a significant number of releases of six software systems, to characterize the patterns of clone change and removal in evolving software systems. With a blend of qualitative analysis, quantitative analysis and statistical tests of significance, we address a number of research questions. Our findings reveal insights into the removal of individual clone fragments and provide empirical evidence in support of conventional clone evolution wisdom. The results can be used to devise informed clone management tools and techniques.

    @inproceedings{zibran_evaluating_2013,
    title = {Evaluating the conventional wisdom in clone removal: {A} genealogy-based empirical study},
    isbn = {978-1-4503-1656-9},
    doi = {10.1145/2480362.2480573},
    url = {https://www.cs.usask.ca/~croy/papers/2013/ZibranCloneChange_SAC2013.pdf},
    abstract = {Clone management has drawn immense interest from the research community in recent years. It is recognized that a deep understanding of how code clones change and are refactored is necessary for devising effective clone management tools and techniques. This paper presents an empirical study based on the clone genealogies from a significant number of releases of six software systems, to characterize the patterns of clone change and removal in evolving software systems. With a blend of qualitative analysis, quantitative analysis and statistical tests of significance, we address a number of research questions. Our findings reveal insights into the removal of individual clone fragments and provide empirical evidence in support of conventional clone evolution wisdom. The results can be used to devise informed clone management tools and techniques.},
    booktitle = {Proceedings of the {ACM} {Symposium} on {Applied} {Computing}},
    author = {Zibran, Minhaz F. and Saha, Ripon K. and Roy, Chanchal K. and Schneider, Kevin A.},
    year = {2013},
    keywords = {Refactoring, Clone evolution, Clone removal, Reengineering},
    pages = {1123-1130}
    }

2012

  • N. Li, M. Shen, S. Li, L. Zhang, and Z. Li, “STVsm: similar structural code detection based on AST and VSM,” in Communications in Computer and Information Science, 2012, pp. 15-21. doi:10.1007/978-3-642-35267-6_3
    [BibTeX] [Abstract] [PDF]

    The potential software defects are most derived from the frequent changes during the development life cycle. It is very helpful to inform developers of the related codes which are affected by the change they are currently performing. In this paper, we propose a new approach STVsm to detect the similar structural code which related to some software changes. The method of STVsm is based on abstract syntax tree and vector space model. Experimental results show that our STVsm method achieves a significant accurate to detect the similar structural codes in C programming language, including exact clones, change code format, renamed codes, reordered codes and add redundancy codes.

    @inproceedings{li_stvsm_2012,
    title = {{STVsm}: Similar structural code detection based on {AST} and {VSM}},
    volume = {340 CCIS},
    url = {https://link.springer.com/chapter/10.1007/978-3-642-35267-6_3},
    isbn = {978-3-642-35266-9},
    doi = {10.1007/978-3-642-35267-6_3},
    abstract = {The potential software defects are most derived from the frequent changes during the development life cycle. It is very helpful to inform developers of the related codes which are affected by the change they are currently performing. In this paper, we propose a new approach STVsm to detect the similar structural code which related to some software changes. The method of STVsm is based on abstract syntax tree and vector space model. Experimental results show that our STVsm method achieves a significant accurate to detect the similar structural codes in C programming language, including exact clones, change code format, renamed codes, reordered codes and add redundancy codes.},
    booktitle = {Communications in {Computer} and {Information} {Science}},
    author = {Li, Ning and Shen, Mingda and Li, Sinan and Zhang, Lijun and Li, Zhanhuai},
    year = {2012},
    note = {ISSN: 18650929},
    keywords = {AST, Change Related Code, Clone Code Detection, Similar Stuctural, VSM},
    pages = {15-21}
    }

  • Y. Yuan, “A scalable and accurate approach based on count matrix for detecting code clones,” in AOSD’12 Companion – Proceedings of the 11th Annual International Conference on Aspect Oriented Software Development, 2012, pp. 21-22. doi:10.1145/2162110.2162126
    [BibTeX] [Abstract] [PDF]

    In this paper, we introduce a new token based algorithm for code clone detection. Count Environment(CE) is certain scenario related to variables. Count Vector(CV) for one variable is consisted of counting occurrences of this variable in different CEs. Count Matrix(CM) for one code fragment is consisted of different CVs of all variables in the code fragment. We use CVs to depict variables, and use CM to represent a code fragment. Two code fragments will be compared by their corresponding CMs, and during the comparison, two heuristics are used. Experimental results show that our algorithm is significantly faster than Deckard, a state-of-the-art syntactic technique for detecting code clones.

    @inproceedings{yuan_scalable_2012,
    title = {A scalable and accurate approach based on count matrix for detecting code clones},
    isbn = {978-1-4503-1222-6},
    doi = {10.1145/2162110.2162126},
    abstract = {In this paper, we introduce a new token based algorithm for code clone detection. Count Environment(CE) is certain scenario related to variables. Count Vector(CV) for one variable is consisted of counting occurrences of this variable in different CEs. Count Matrix(CM) for one code fragment is consisted of different CVs of all variables in the code fragment. We use CVs to depict variables, and use CM to represent a code fragment. Two code fragments will be compared by their corresponding CMs, and during the comparison, two heuristics are used. Experimental results show that our algorithm is significantly faster than Deckard, a state-of-the-art syntactic technique for detecting code clones.},
    booktitle = {{AOSD}'12 {Companion} - {Proceedings} of the 11th {Annual} {International} {Conference} on {Aspect} {Oriented} {Software} {Development}},
    author = {Yuan, Yang},
    year = {2012},
    keywords = {Code clone, Count matrix, Token based},
    pages = {21-22},
    url = {https://dl.acm.org/doi/abs/10.1145/2162110.2162126}
    }

  • M. F. Zibran and C. K. Roy, “IDE-based real-time focused search for near-miss clones,” in Proceedings of the ACM Symposium on Applied Computing, 2012, pp. 1235-1242. doi:10.1145/2245276.2231970
    [BibTeX] [Abstract] [PDF]

    Code clone is a well-known code smell that needs to be detected and managed during the software development process. However, the existing clone detectors have one or more of the three shortcomings: (a) limitation in detecting Type-3 clones, (b) they come as stand-alone tools separate from IDE and thus cannot support clone-aware development, (c) they overwhelm the developer with all clones from the entire code-base, instead of a focused search for clones of a selected code segment of the developer’s interest. This paper presents our IDE-integrated clone search tool, that addresses all the above issues. For clone detection, we adapt a suffix-tree-based hybrid algorithm. Through an asymptotic analysis, we show that our approach for clone detection is both time and memory efficient. Moreover, using three separate empirical studies, we demonstrate that our tool is flexibly usable for searching exact (Type-1) and near-miss (Type-2 and Type-3) clones with high precision and recall.

    @inproceedings{zibran_ide-based_2012,
    title = {{IDE}-based real-time focused search for near-miss clones},
    isbn = {978-1-4503-0857-1},
    url = {https://www.cs.usask.ca/~croy/papers/2012/Zibran_SAC2012_CloneSearch.pdf},
    doi = {10.1145/2245276.2231970},
    abstract = {Code clone is a well-known code smell that needs to be detected and managed during the software development process. However, the existing clone detectors have one or more of the three shortcomings: (a) limitation in detecting Type-3 clones, (b) they come as stand-alone tools separate from IDE and thus cannot support clone-aware development, (c) they overwhelm the developer with all clones from the entire code-base, instead of a focused search for clones of a selected code segment of the developer's interest. This paper presents our IDE-integrated clone search tool, that addresses all the above issues. For clone detection, we adapt a suffix-tree-based hybrid algorithm. Through an asymptotic analysis, we show that our approach for clone detection is both time and memory efficient. Moreover, using three separate empirical studies, we demonstrate that our tool is flexibly usable for searching exact (Type-1) and near-miss (Type-2 and Type-3) clones with high precision and recall.},
    booktitle = {Proceedings of the {ACM} {Symposium} on {Applied} {Computing}},
    author = {Zibran, Minhaz F. and Roy, Chanchal K.},
    year = {2012},
    keywords = {reengineering, clone detection, clone search, maintenance},
    pages = {1235-1242}
    }

  • S. A. Ajila, A. S. Gakhar, C. H. Lung, and M. Zaman, “Reusing and converting code clones to aspects – an algorithmic approach,” in 2012 ieee 13th international conference on information reuse integration (iri), 2012, pp. 9-16.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6302984,
    author={S. A. {Ajila} and A. S. {Gakhar} and C. H. {Lung} and M. {Zaman}},
    booktitle={2012 IEEE 13th International Conference on Information Reuse Integration (IRI)},
    title={Reusing and converting code clones to aspects - An algorithmic approach},
    year={2012},
    url = {https://ieeexplore.ieee.org/abstract/document/6302984/},
    volume={},
    number={},
    pages={9-16},}

  • F. Al-Omari, I. Keivanloo, C. K. Roy, and J. Rilling, “Detecting clones across microsoft .net programming languages,” in 2012 19th working conference on reverse engineering, 2012, pp. 405-414.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6385136,
    author={F. {Al-Omari} and I. {Keivanloo} and C. K. {Roy} and J. {Rilling}},
    booktitle={2012 19th Working Conference on Reverse Engineering},
    title={Detecting Clones Across Microsoft .NET Programming Languages},
    year={2012},
    volume={},
    number={},
    url = {https://ieeexplore.ieee.org/document/6385136},
    pages={405-414},
    }

  • M. H. Alalfi, J. R. Cordy, T. R. Dean, M. Stephan, and A. Stevenson, “Near-miss model clone detection for simulink models,” in 2012 6th international workshop on software clones (iwsc), 2012, pp. 78-79.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6227873,
    author={M. H. {Alalfi} and J. R. {Cordy} and T. R. {Dean} and M. {Stephan} and A. {Stevenson}},
    booktitle={2012 6th International Workshop on Software Clones (IWSC)},
    title={Near-miss model clone detection for Simulink models},
    year={2012},
    url = {https://ieeexplore.ieee.org/abstract/document/6227873/},
    volume={},
    number={},
    pages={78-79},
    }

  • M. Alalfi, J. Cordy, T. Dean, M. Stephan, and A. Stevenson, “Models are code too: near-miss clone detection for simulink models,” Ieee international conference on software maintenance, icsm, 2012. doi:10.1109/ICSM.2012.6405285
    [BibTeX]
    @article{alalfi_models_2012,
    author = {Alalfi, Manar and Cordy, James and Dean, Thomas and Stephan, Matthew and Stevenson, Andrew},
    year = {2012},
    month = {09},
    pages = {},
    title = {Models are Code too: Near-miss Clone Detection for Simulink Models},
    journal = {IEEE International Conference on Software Maintenance, ICSM},
    doi = {10.1109/ICSM.2012.6405285}
    }

  • H. Basit, U. Ali, S. Haque, and S. Jarzabek, “Things structural clones tell that simple clones don’t,” in 28th ieee international conference on software maintenance (icsm), 2012, pp. 275-284. doi:10.1109/ICSM.2012.6405283
    [BibTeX] [PDF]
    @inproceedings{basit_things_2012,
    author = {Basit, Hamid and Ali, Usman and Haque, Sidra and Jarzabek, Stan},
    year = {2012},
    month = {09},
    pages = {275-284},
    title = {Things structural clones tell that simple clones don't},
    isbn = {978-1-4673-2313-0},
    url = {https://www.researchgate.net/publication/261417001},
    booktitle = {28th IEEE International Conference on Software Maintenance (ICSM)},
    doi = {10.1109/ICSM.2012.6405283}
    }

  • M. Beard, N. Kraft, and L. Etzkorn, “Code clones in rhino: a case study,” Proceedings of the iasted international conference on software engineering and applications, sea 2012, 2012. doi:10.2316/P.2012.790-056
    [BibTeX] [PDF]
    @article{beard_code_2012,
    author = {Beard, Matthew and Kraft, Nicholas and Etzkorn, Letha},
    year = {2012},
    month = {12},
    pages = {},
    title = {Code Clones in Rhino: A Case Study},
    journal = {Proceedings of the IASTED International Conference on Software Engineering and Applications, SEA 2012},
    url = {https://www.researchgate.net/publication/266632001_Code_Clones_in_Rhino_A_Case_Study},
    doi = {10.2316/P.2012.790-056}
    }

  • A. Belmabrouk and B. Messabih, “The Reverse Engineering in Oriented Aspect “Detection of semantics clones”,” International journal of scientific & engineering research, vol. 3, iss. 5, 2012.
    [BibTeX] [Abstract] [PDF]

    Attention to the reverse engineering in oriented aspect programming (AOP) is rapidly growing as its benefits in large software system development and maintenance are increasingly recognized. This paper reports on the challenges of using the reverse engineering in oriented aspect to detect the crosscutting concerns. So we present a new idea to detect a clone semantic in code. We first present the principle of the AOP, then, we report on application of reverse engineering in legacy industrial software system. The novel aspect of our approach is the use of program dependence graphs (PDGs), which is one of the important techniques of aspect mining to detect duplicate code in programs. We have extended the definition of a code clone to include semantically related code. We reduced the difficult graph similarity problem to a tree similarity problem by mapping interesting semantic fragments to their related syntax.

    @article{belmabrouk_reverse_2012,
    title = {The {Reverse} {Engineering} in {Oriented} {Aspect} "{Detection} of semantics clones"},
    volume = {3},
    issn = {2229-5518},
    url = {https://www.ijser.org/paper/The-Reverse-Engineering-in-Oriented-Aspect-Detection-of-semantics-clones.html},
    abstract = {Attention to the reverse engineering in oriented aspect programming (AOP) is rapidly growing as its benefits in large software system development and maintenance are increasingly recognized. This paper reports on the challenges of using the reverse engineering in oriented aspect to detect the crosscutting concerns. So we present a new idea to detect a clone semantic in code. We first present the principle of the AOP, then, we report on application of reverse engineering in legacy industrial software system. The novel aspect of our approach is the use of program dependence graphs (PDGs), which is one of the important techniques of aspect mining to detect duplicate code in programs. We have extended the definition of a code clone to include semantically related code. We reduced the difficult graph similarity problem to a tree similarity problem by mapping interesting semantic fragments to their related syntax.},
    number = {5},
    journal = {International Journal of Scientific \& Engineering Research},
    author = {Belmabrouk, Amel and Messabih, Belhadri},
    year = {2012},
    keywords = {Aspect mining, Crosscutting Concern, Oriented Aspect programming, Program Dependence Graphs (PDG's), Reverse engineering}
    }

  • N. Bettenburg, W. Shang, W. M. Ibrahim, B. Adams, Y. Zou, and A. E. Hassan, “An empirical study on inconsistent changes to code clones at the release level,” Science of computer programming, vol. 77, iss. 6, pp. 760-776, 2012. doi:10.1016/j.scico.2010.11.010
    [BibTeX] [PDF]
    @article{BETTENBURG2012760,
    title = {An empirical study on inconsistent changes to code clones at the release level},
    journal = {Science of Computer Programming},
    volume = {77},
    number = {6},
    pages = {760-776},
    year = {2012},
    note = {(1) Coordination 2009 (2) WCRE 2009},
    issn = {0167-6423},
    doi = {10.1016/j.scico.2010.11.010},
    url = {http://www.sciencedirect.com/science/article/pii/S0167642310002091},
    author = {Nicolas Bettenburg and Weiyi Shang and Walid M. Ibrahim and Bram Adams and Ying Zou and Ahmed E. Hassan},
    keywords = {Software engineering, Maintenance management, Reuse models, Clone detection, Maintainability, Software evolution},
    }

  • D. Chatterji, J. C. Carver, and N. A. Kraft, “Claims and beliefs about code clones: do we agree as a community? a survey,” in 2012 6th international workshop on software clones (iwsc), 2012, pp. 15-21.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6227860,
    author={Chatterji, Debarshi and Carver, Jeffrey C. and Kraft, Nicholas A.},
    booktitle={2012 6th International Workshop on Software Clones (IWSC)},
    title={Claims and beliefs about code clones: Do we agree as a community? A survey},
    year={2012},
    volume={},
    number={},
    url = {https://ieeexplore.ieee.org/document/6227860},
    pages={15-21},
    }

  • A. Cuomo, A. Santone, and U. Villano, “A novel approach based on formal methods for clone detection,” in Proceedings of the 6th international workshop on software clones, 2012, p. 8–14.
    [BibTeX] [PDF]
    @inproceedings{10.5555/2664398.2664400,
    author = {Cuomo, Antonio and Santone, Antonella and Villano, Umberto},
    title = {A Novel Approach Based on Formal Methods for Clone Detection},
    year = {2012},
    url = {https://dl.acm.org/doi/10.5555/2664398.2664400},
    isbn = {9781467317955},
    publisher = {IEEE Press},
    booktitle = {Proceedings of the 6th International Workshop on Software Clones},
    pages = {8–14},
    numpages = {7},
    keywords = {CCS, clone detection, formal methods},
    location = {Zurich, Switzerland},
    series = {IWSC ’12}
    }

  • C. Dandois and W. Vanhoof, “Clones in logic programs and how to detect them,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2012, pp. 90-105. doi:10.1007/978-3-642-32211-2_7
    [BibTeX] [Abstract] [PDF]

    In this paper, we propose a theoretical framework that allows us to capture, by program analysis, the notion of code clone in the context of logic programming. Informally, two code fragments are considered as cloned if they implement the same functionality. Clone detection can be advantageous from a software engineering viewpoint, as the presence of code clones inside a program reveals redundancy, broadly considered a “bad smell”. In the paper, we present a detailed definition of a clone in a logic program and provide an efficient detection algorithm able to identify an important subclass of code clones that could be used for various applications such as program refactoring and plagiarism recognition. Our clone detection algorithm is not tied to a particular logic programming language, and can easily be instantiated for different such languages. © 2012 Springer-Verlag.

    @inproceedings{dandois_clones_2012,
    title = {Clones in logic programs and how to detect them},
    volume = {7225 LNCS},
    isbn = {978-3-642-32210-5},
    url = {https://link.springer.com/chapter/10.1007/978-3-642-32211-2_7},
    doi = {10.1007/978-3-642-32211-2_7},
    abstract = {In this paper, we propose a theoretical framework that allows us to capture, by program analysis, the notion of code clone in the context of logic programming. Informally, two code fragments are considered as cloned if they implement the same functionality. Clone detection can be advantageous from a software engineering viewpoint, as the presence of code clones inside a program reveals redundancy, broadly considered a "bad smell". In the paper, we present a detailed definition of a clone in a logic program and provide an efficient detection algorithm able to identify an important subclass of code clones that could be used for various applications such as program refactoring and plagiarism recognition. Our clone detection algorithm is not tied to a particular logic programming language, and can easily be instantiated for different such languages. © 2012 Springer-Verlag.},
    booktitle = {Lecture {Notes} in {Computer} {Science} (including subseries {Lecture} {Notes} in {Artificial} {Intelligence} and {Lecture} {Notes} in {Bioinformatics})},
    author = {Dandois, Céline and Vanhoof, Wim},
    year = {2012},
    note = {ISSN: 03029743},
    keywords = {code clone, code duplication, clone detection algorithm, logic programming languages},
    pages = {90-105}
    }

  • Y. Dang, D. Zhang, S. Ge, C. Chu, Y. Qiu, and T. Xie, “XIAO: Tuning code clones at hands of engineers in practice,” in ACM International Conference Proceeding Series, 2012, pp. 369-378. doi:10.1145/2420950.2421004
    [BibTeX] [Abstract] [PDF]

    During software development, engineers often reuse a code fragment via copy-and-paste with or without modifications or adaptations. Such practices lead to a number of the same or similar code fragments spreading within one or many large codebases. Detecting code clones has been shown to be useful towards security such as detection of similar security bugs and, more generally, quality improvement such as refactoring of code clones. A large number of academic research projects have been carried out on empirical studies or tool supports for detecting code clones. In this paper, we report our experiences of carrying out successful technology transfer of our new approach of code-clone detection, called XIAO. XIAO has been integrated into Microsoft Visual Studio 2012, to be benefiting a huge number of developers in industry. The main success factors of XIAO include its high tunability, scalability, compatibility, and explorability. Based on substantial industrial experiences, we present the XIAO approach with emphasis on these success factors of XIAO. We also present empirical results on applying XIAO on real scenarios within Microsoft for the tasks of security-bug detection and refactoring. Copyright 2012 ACM.

    @inproceedings{dang_xiao_2012,
    title = {{XIAO}: {Tuning} code clones at hands of engineers in practice},
    isbn = {978-1-4503-1312-4},
    doi = {10.1145/2420950.2421004},
    url = {https://dl.acm.org/doi/10.1145/2420950.2421004},
    abstract = {During software development, engineers often reuse a code fragment via copy-and-paste with or without modifications or adaptations. Such practices lead to a number of the same or similar code fragments spreading within one or many large codebases. Detecting code clones has been shown to be useful towards security such as detection of similar security bugs and, more generally, quality improvement such as refactoring of code clones. A large number of academic research projects have been carried out on empirical studies or tool supports for detecting code clones. In this paper, we report our experiences of carrying out successful technology transfer of our new approach of code-clone detection, called XIAO. XIAO has been integrated into Microsoft Visual Studio 2012, to be benefiting a huge number of developers in industry. The main success factors of XIAO include its high tunability, scalability, compatibility, and explorability. Based on substantial industrial experiences, we present the XIAO approach with emphasis on these success factors of XIAO. We also present empirical results on applying XIAO on real scenarios within Microsoft for the tasks of security-bug detection and refactoring. Copyright 2012 ACM.},
    booktitle = {{ACM} {International} {Conference} {Proceeding} {Series}},
    author = {Dang, Yingnong and Zhang, Dongmei and Ge, Song and Chu, Chengyun and Qiu, Yingjun and Xie, Tao},
    year = {2012},
    keywords = {Code clone, Code duplication, Code-clone detection, Code-clone search, Duplicated security vulnerability},
    pages = {369-378}
    }

  • X. L. Dong and D. Srivastava, “Detecting clones, copying and reuse on the web,” in 2012 ieee 28th international conference on data engineering, 2012, pp. 1211-1213.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6228170,
    author={X. L. {Dong} and D. {Srivastava}},
    booktitle={2012 IEEE 28th International Conference on Data Engineering},
    title={Detecting Clones, Copying and Reuse on the Web},
    year={2012},
    url = {https://ieeexplore.ieee.org/abstract/document/6228170/},
    volume={},
    number={},
    pages={1211-1213},
    }

  • C. Forbes, I. Keivanloo, and J. Rilling, “Doppel-code: a clone visualization tool for prioritizing global and local clone impacts,” 2012 ieee 36th annual computer software and applications conference, pp. 366-367, 2012.
    [BibTeX]
    @article{Forbes2012DoppelCodeAC,
    title={Doppel-Code: A Clone Visualization Tool for Prioritizing Global and Local Clone Impacts},
    author={Christopher Forbes and Iman Keivanloo and Juergen Rilling},
    journal={2012 IEEE 36th Annual Computer Software and Applications Conference},
    year={2012},
    pages={366-367}
    }

  • J. Harder and R. Tiarks, “A controlled experiment on software clones,” in 2012 20th ieee international conference on program comprehension (icpc), 2012, pp. 219-228.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6240491,
    author={J. {Harder} and R. {Tiarks}},
    booktitle={2012 20th IEEE International Conference on Program Comprehension (ICPC)},
    title={A controlled experiment on software clones},
    year={2012},
    url = {https://ieeexplore.ieee.org/abstract/document/6240491/},
    volume={},
    number={},
    pages={219-228},
    }

  • I. Keivanloo, C. K. Roy, and J. Rilling, “Java bytecode clone detection via relaxation on code fingerprint and semantic web reasoning,” in 2012 6th international workshop on software clones (iwsc), 2012, pp. 36-42.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6227864,
    author={I. {Keivanloo} and C. K. {Roy} and J. {Rilling}},
    booktitle={2012 6th International Workshop on Software Clones (IWSC)},
    title={Java bytecode clone detection via relaxation on code fingerprint and Semantic Web reasoning},
    year={2012},
    url = {https://ieeexplore.ieee.org/abstract/document/6227864/},
    volume={},
    number={},
    pages={36-42},}

  • I. Keivanloo, C. K. Roy, and J. Rilling, “Sebyte: a semantic clone detection tool for intermediate languages,” in 2012 20th ieee international conference on program comprehension (icpc), 2012, pp. 247-249.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6240495,
    author={I. {Keivanloo} and C. K. {Roy} and J. {Rilling}},
    booktitle={2012 20th IEEE International Conference on Program Comprehension (ICPC)},
    title={SeByte: A semantic clone detection tool for intermediate languages},
    year={2012},
    url = {https://ieeexplore.ieee.org/document/6240495},
    volume={},
    number={},
    pages={247-249},}

  • T. Lavoie, F. Khomh, E. Merlo, and Y. Zou, “Inferring repository file structure modifications using nearest-neighbor clone detection,” in 2012 19th working conference on reverse engineering, 2012, pp. 325-334.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6385128,
    author={T. {Lavoie} and F. {Khomh} and E. {Merlo} and Y. {Zou}},
    booktitle={2012 19th Working Conference on Reverse Engineering},
    title={Inferring Repository File Structure Modifications Using Nearest-Neighbor Clone Detection},
    year={2012},
    url = {https://ieeexplore.ieee.org/abstract/document/6385128/},
    volume={},
    number={},
    pages={325-334},}

  • J. Li and M. D. Ernst, “Cbcd: cloned buggy code detector,” in 2012 34th international conference on software engineering (icse), 2012, pp. 310-320.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6227183,
    author={J. {Li} and M. D. {Ernst}},
    booktitle={2012 34th International Conference on Software Engineering (ICSE)},
    title={CBCD: Cloned buggy code detector},
    url = {https://ieeexplore.ieee.org/abstract/document/6227183/},
    year={2012},
    volume={},
    number={},
    pages={310-320},
    }

  • G. Mahajan, “SOFTWARE CLONING IN EXTREME PROGRAMMING ENVIRONMENT,” Ijreas, vol. 2, iss. 2, 2012.
    [BibTeX] [Abstract] [PDF]

    Software systems are evolving by adding new functions and modifying existing functions over time. Through the evolution, the structure of software is becoming more complex and so the understandability and maintainability of software systems is deteriorating day by day. These are not only important but one of the most expensive activities in software development. Refactoring has often been applied to the software to improve them. One of the targets of refactoring is to limit Code Cloning because it hinders software maintenance and affects its quality. And in order to cope with the constant changes, refactoring is seen as an essential component of Extreme Programming. Agile Methods use refactoring as important key practice and are first choice for developing clone-free code. This paper summarizes my overview talk on software cloning analysis. It first discusses the notion of code cloning, types of clones, reasons, its consequences and analysis. It highlights Code Cloning in Extreme Programming Environment and finds Clone Detection as effective tool for Refactoring.

    @article{mahajan_software_2012,
    title = {{SOFTWARE} {CLONING} {IN} {EXTREME} {PROGRAMMING} {ENVIRONMENT}},
    volume = {2},
    issn = {2249-3905},
    url = {http://www.euroasiapub.org},
    abstract = {Software systems are evolving by adding new functions and modifying existing functions over time. Through the evolution, the structure of software is becoming more complex and so the understandability and maintainability of software systems is deteriorating day by day. These are not only important but one of the most expensive activities in software development. Refactoring has often been applied to the software to improve them. One of the targets of refactoring is to limit Code Cloning because it hinders software maintenance and affects its quality. And in order to cope with the constant changes, refactoring is seen as an essential component of Extreme Programming. Agile Methods use refactoring as important key practice and are first choice for developing clone-free code. This paper summarizes my overview talk on software cloning analysis. It first discusses the notion of code cloning, types of clones, reasons, its consequences and analysis. It highlights Code Cloning in Extreme Programming Environment and finds Clone Detection as effective tool for Refactoring.},
    number = {2},
    journal = {IJREAS},
    author = {Mahajan, Ginika},
    year = {2012}
    }

  • M. Mondal, C. K. Roy, and K. A. Schneider, “Dispersion of changes in cloned and non-cloned code,” in 2012 6th international workshop on software clones (iwsc), 2012, pp. 29-35.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6227863,
    author={M. {Mondal} and C. K. {Roy} and K. A. {Schneider}},
    booktitle={2012 6th International Workshop on Software Clones (IWSC)},
    title={Dispersion of changes in cloned and non-cloned code},
    year={2012},
    url = {https://ieeexplore.ieee.org/abstract/document/6227863/},
    volume={},
    number={},
    pages={29-35},}

  • M. Mondal, C. K. Roy, and K. A. Schneider, “An empirical study on clone stability,” Sigapp appl. comput. rev., vol. 12, iss. 3, p. 20–36, 2012. doi:10.1145/2387358.2387360
    [BibTeX] [PDF]
    @article{10.1145/2387358.2387360,
    author = {Mondal, Manishankar and Roy, Chanchal K. and Schneider, Kevin A.},
    title = {An Empirical Study on Clone Stability},
    year = {2012},
    issue_date = {September 2012},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    volume = {12},
    number = {3},
    issn = {1559-6915},
    url = {https://doi.org/10.1145/2387358.2387360},
    doi = {10.1145/2387358.2387360},
    journal = {SIGAPP Appl. Comput. Rev.},
    month = sep,
    pages = {20–36},
    numpages = {17},
    keywords = {software clones, overall instability, modification frequency, types of clones, code stability, changeability}
    }

  • M. Mondal, C. K. Roy, M. S. Rahman, R. K. Saha, J. Krinke, and K. A. Schneider, “Comparative stability of cloned and non-cloned code: an empirical study,” in Proceedings of the 27th annual acm symposium on applied computing, New York, NY, USA, 2012, p. 1227–1234. doi:10.1145/2245276.2231969
    [BibTeX] [PDF]
    @inproceedings{10.1145/2245276.2231969,
    author = {Mondal, Manishankar and Roy, Chanchal K. and Rahman, Md. Saidur and Saha, Ripon K. and Krinke, Jens and Schneider, Kevin A.},
    title = {Comparative Stability of Cloned and Non-Cloned Code: An Empirical Study},
    year = {2012},
    isbn = {9781450308571},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2245276.2231969},
    doi = {10.1145/2245276.2231969},
    booktitle = {Proceedings of the 27th Annual ACM Symposium on Applied Computing},
    pages = {1227–1234},
    numpages = {8},
    keywords = {modification frequency, code stability, clone types, average last change date, average age},
    location = {Trento, Italy},
    series = {SAC ’12}
    }

  • H. Murakami, K. Hotta, Y. Higo, H. Igaki, and S. Kusumoto, “Folding repeated instructions for improving token-based code clone detection,” in 2012 ieee 12th international working conference on source code analysis and manipulation, 2012, pp. 64-73.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6392103,
    author={H. {Murakami} and K. {Hotta} and Y. {Higo} and H. {Igaki} and S. {Kusumoto}},
    booktitle={2012 IEEE 12th International Working Conference on Source Code Analysis and Manipulation},
    title={Folding Repeated Instructions for Improving Token-Based Code Clone Detection},
    year={2012},
    url = {https://ieeexplore.ieee.org/abstract/document/6392103/},
    volume={},
    number={},
    pages={64-73},
    }

  • H. A. Nguyen, T. T. Nguyen, N. H. Pham, J. Al-Kofahi, and T. N. Nguyen, “Clone management for evolving software,” Ieee transactions on software engineering, vol. 38, iss. 5, pp. 1008-1026, 2012.
    [BibTeX] [PDF]
    @ARTICLE{6007141,
    author={H. A. {Nguyen} and T. T. {Nguyen} and N. H. {Pham} and J. {Al-Kofahi} and T. N. {Nguyen}},
    journal={IEEE Transactions on Software Engineering},
    title={Clone Management for Evolving Software},
    url = {https://ieeexplore.ieee.org/abstract/document/6007141/},
    year={2012},
    volume={38},
    number={5},
    pages={1008-1026},}

  • S. Bazrafshan, “Evolution of near-miss clones,” in 2012 ieee 12th international working conference on source code analysis and manipulation, 2012, pp. 74-83.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6392104,
    author={S. {Bazrafshan}},
    booktitle={2012 IEEE 12th International Working Conference on Source Code Analysis and Manipulation},
    title={Evolution of Near-Miss Clones},
    url = {https://ieeexplore.ieee.org/abstract/document/6392104/},
    year={2012},
    volume={},
    number={},
    pages={74-83},}

  • F. Rahman, C. Bird, and P. Devanbu, “Clones: What is that smell?,” in Empirical Software Engineering, 2012, pp. 503-530. doi:10.1007/s10664-011-9195-3
    [BibTeX] [Abstract] [PDF]

    Clones are generally considered bad programming practice in software engineering folklore. They are identified as a bad smell (Fowler et al. 1999) and a major contributor to project maintenance difficulties. Clones inherently cause code bloat, thus increasing project size and maintenance costs. In this work, we try to validate the conventional wisdom empirically to see whether cloning makes code more defect prone. This paper analyses the relationship between cloning and defect proneness. For the four medium to large open source projects that we studied, we find that, first, the great majority of bugs are not significantly associated with clones. Second, we find that clones may be less defect prone than non-cloned code. Third, we find little evidence that clones with more copies are actually more error prone. Fourth, we find little evidence to support the claim that clone groups that span more than one file or directory are more defect prone than collocated clones. Finally, we find that developers do not need to put a disproportionately higher effort to fix clone dense bugs. Our findings do not support the claim that clones are really a “bad smell” (Fowler et al. 1999). Perhaps we can clone, and breathe easily, at the same time. © Springer Science+Business Media, LLC 2011.

    @inproceedings{rahman_clones_2012,
    title = {Clones: {What} is that smell?},
    volume = {17},
    url = {https://link.springer.com/article/10.1007/s10664-011-9195-3},
    doi = {10.1007/s10664-011-9195-3},
    abstract = {Clones are generally considered bad programming practice in software engineering folklore. They are identified as a bad smell (Fowler et al. 1999) and a major contributor to project maintenance difficulties. Clones inherently cause code bloat, thus increasing project size and maintenance costs. In this work, we try to validate the conventional wisdom empirically to see whether cloning makes code more defect prone. This paper analyses the relationship between cloning and defect proneness. For the four medium to large open source projects that we studied, we find that, first, the great majority of bugs are not significantly associated with clones. Second, we find that clones may be less defect prone than non-cloned code. Third, we find little evidence that clones with more copies are actually more error prone. Fourth, we find little evidence to support the claim that clone groups that span more than one file or directory are more defect prone than collocated clones. Finally, we find that developers do not need to put a disproportionately higher effort to fix clone dense bugs. Our findings do not support the claim that clones are really a "bad smell" (Fowler et al. 1999). Perhaps we can clone, and breathe easily, at the same time. © Springer Science+Business Media, LLC 2011.},
    booktitle = {Empirical {Software} {Engineering}},
    author = {Rahman, Foyzur and Bird, Christian and Devanbu, Premkumar},
    month = aug,
    year = {2012},
    note = {ISSN: 13823256; Issue: 4-5},
    keywords = {Software evolution, Software maintenance, Software clone, Empirical software engineering, Software quality},
    pages = {503-530}
    }

  • I. Keivanloo and J. Rilling, “Clone detection meets semantic web-based transitive closure computation,” in Proceedings of the first international workshop on realizing ai synergies in software engineering, 2012, p. 12–16.
    [BibTeX] [PDF]
    @inproceedings{10.5555/2666527.2666530,
    author = {Keivanloo, Iman and Rilling, Juergen},
    title = {Clone Detection Meets Semantic Web-Based Transitive Closure Computation},
    year = {2012},
    isbn = {9781467317535},
    url = {https://dl.acm.org/doi/10.5555/2666527.2666530},
    publisher = {IEEE Press},
    booktitle = {Proceedings of the First International Workshop on Realizing AI Synergies in Software Engineering},
    pages = {12–16},
    numpages = {5},
    keywords = {object oriented, semantic web, clone detection},
    location = {Zurich, Switzerland},
    series = {RAISE ’12}
    }

  • J. He, “Detecting c source code clones in college students’ homework,” in 2012 international conference on computer science and information processing (csip), 2012, pp. 104-107.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6308805,
    author={He, Jinguo},
    booktitle={2012 International Conference on Computer Science and Information Processing (CSIP)},
    title={Detecting C source code clones in college students' homework},
    year={2012},
    url = {https://ieeexplore.ieee.org/abstract/document/6308805/},
    volume={},
    number={},
    pages={104-107},
    }

  • M. Stephan, M. H. Alalfi, A. Stevenson, and J. R. Cordy, “Towards qualitative comparison of simulink model clone detection approaches,” in Proceedings of the 6th international workshop on software clones, 2012, p. 84–85.
    [BibTeX] [PDF]
    @inproceedings{10.5555/2664398.2664416,
    author = {Stephan, Matthew and Alalfi, Manar H. and Stevenson, Andrew and Cordy, James R.},
    title = {Towards Qualitative Comparison of Simulink Model Clone Detection Approaches},
    year = {2012},
    isbn = {9781467317955},
    publisher = {IEEE Press},
    booktitle = {Proceedings of the 6th International Workshop on Software Clones},
    pages = {84–85},
    url = {https://dl.acm.org/doi/10.5555/2664398.2664416},
    numpages = {2},
    keywords = {simulink, comparison, clone detection},
    location = {Zurich, Switzerland},
    series = {IWSC ’12}
    }

  • E. Tüzün and E. Er, “A case study on applying clone technology to an industrial application framework,” in Proceedings of the 6th international workshop on software clones, 2012, p. 57–61.
    [BibTeX] [PDF]
    @inproceedings{10.5555/2664398.2664407,
    author = {T\"{u}z\"{u}n, Eray and Er, Emre},
    title = {A Case Study on Applying Clone Technology to an Industrial Application Framework},
    year = {2012},
    url = {https://dl.acm.org/doi/10.5555/2664398.2664407},
    isbn = {9781467317955},
    publisher = {IEEE Press},
    booktitle = {Proceedings of the 6th International Workshop on Software Clones},
    pages = {57–61},
    numpages = {5},
    keywords = {software clones, types of clones, application of clone analysis, industrial experiences with clone analysis},
    location = {Zurich, Switzerland},
    series = {IWSC ’12}
    }

  • R. D. Venkatasubramanyam, H. K. Singh, and K. Ravikanth, “A method for proactive moderation of code clones in IDEs,” in 2012 6th international workshop on software clones (iwsc), 2012, pp. 62-66.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6227868,
    author={R. D. {Venkatasubramanyam} and H. K. {Singh} and K. {Ravikanth}},
    booktitle={2012 6th International Workshop on Software Clones (IWSC)},
    title={A method for proactive moderation of code clones in {IDEs}},
    year={2012},
    url = {https://ieeexplore.ieee.org/abstract/document/6227868/},
    volume={},
    number={},
    pages={62-66},
    }

  • X. Wang, Y. Dang, L. Zhang, D. Zhang, E. Lan, and H. Mei, “Can I clone this piece of code here?,” in 2012 27th ieee/acm international conference on automated software engineering (ase), 2012, pp. 170-179. doi:10.1145/2351676.2351701
    [BibTeX] [Abstract] [PDF]

    While code cloning is a convenient way for developers to reuse existing code, it could potentially lead to negative impacts, such as degrading code quality or increasing maintenance costs. Actually, some cloned code pieces are viewed as harmless since they evolve independently, while other cloned code pieces are viewed as harmful since they need to be changed consistently, thus incurring extra maintenance costs. Recent studies demonstrate that neither the percentage of harmful code clones nor that of harmless code clones is negligible. To assist developers in leveraging the benefits of harmless code cloning and/or in avoiding the negative impacts of harmful code cloning, we propose a novel approach that automatically predicts the harmfulness of a code cloning operation at the point of performing copy-and-paste. Our insight is that the potential harmfulness of a code cloning operation may relate to some characteristics of the code to be cloned and the characteristics of its context. Based on a number of features extracted from the cloned code and the context of the code cloning operation, we use Bayesian Networks, a machine-learning technique, to predict the harmfulness of an intended code cloning operation. We evaluated our approach on two large-scale industrial software projects under two usage scenarios: 1) approving only cloning operations predicted to be very likely of no harm, and 2) blocking only cloning operations predicted to be very likely of harm. In the first scenario, our approach is able to approve more than 50% cloning operations with a precision higher than 94.9% in both subjects. In the second scenario, our approach is able to avoid more than 48% of the harmful cloning operations by blocking only 15% of the cloning operations for the first subject, and avoid more than 67% of the cloning operations by blocking only 34% of the cloning operations for the second subject. Copyright 2012 ACM.

    @inproceedings{wang_can_2012,
    title = {Can {I} clone this piece of code here?},
    isbn = {978-1-4503-1204-2},
    doi = {10.1145/2351676.2351701},
    url = {https://ieeexplore.ieee.org/document/6494924},
    abstract = {While code cloning is a convenient way for developers to reuse existing code, it could potentially lead to negative impacts, such as degrading code quality or increasing maintenance costs. Actually, some cloned code pieces are viewed as harmless since they evolve independently, while other cloned code pieces are viewed as harmful since they need to be changed consistently, thus incurring extra maintenance costs. Recent studies demonstrate that neither the percentage of harmful code clones nor that of harmless code clones is negligible. To assist developers in leveraging the benefits of harmless code cloning and/or in avoiding the negative impacts of harmful code cloning, we propose a novel approach that automatically predicts the harmfulness of a code cloning operation at the point of performing copy-and-paste. Our insight is that the potential harmfulness of a code cloning operation may relate to some characteristics of the code to be cloned and the characteristics of its context. Based on a number of features extracted from the cloned code and the context of the code cloning operation, we use Bayesian Networks, a machine-learning technique, to predict the harmfulness of an intended code cloning operation. We evaluated our approach on two large-scale industrial software projects under two usage scenarios: 1) approving only cloning operations predicted to be very likely of no harm, and 2) blocking only cloning operations predicted to be very likely of harm. In the first scenario, our approach is able to approve more than 50\% cloning operations with a precision higher than 94.9\% in both subjects. In the second scenario, our approach is able to avoid more than 48\% of the harmful cloning operations by blocking only 15\% of the cloning operations for the first subject, and avoid more than 67\% of the cloning operations by blocking only 34\% of the cloning operations for the second subject. Copyright 2012 ACM.},
    booktitle = {2012 27th IEEE/ACM International Conference on Automated Software Engineering (ASE)},
    author = {Wang, Xiaoyin and Dang, Yingnong and Zhang, Lu and Zhang, Dongmei and Lan, Erica and Mei, Hong},
    year = {2012},
    keywords = {Code cloning, Harmfulness prediction, Programming aid},
    pages = {170-179}
    }

  • P. Xia, Y. Manabe, N. Yoshida, and K. Inoue, “Development of a Code Clone Search Tool for Open Source Repositories,” Information and media technologies, vol. 7, iss. 4, pp. 181-187, 2012.
    [BibTeX] [Abstract] [PDF]

    Finding code clones in the open source systems is important for efficient and safe reuse of existing open source software. In this paper, we propose a novel search model, open code clone search, to explore code clones in open source repositories on the Internet. Based on this search model, we have designed and implemented a prototype system named OpenCCFinder. This system takes a query code fragment as its input, and returns the code fragments containing the code clones with the query. It utilizes publicly available code search engines as external resources. Using OpenCCFinder, we have conducted several case studies for Java code. These case studies show the applicability of our system.

    @article{xia_development_2012,
    title = {Development of a {Code} {Clone} {Search} {Tool} for {Open} {Source} {Repositories}},
    url = {https://www.jstage.jst.go.jp/article/imt/7/4/7_1370/_article/-char/ja/},
    abstract = {Finding code clones in the open source systems is important for efficient and safe reuse of existing open source software. In this paper, we propose a novel search model, open code clone search, to explore code clones in open source repositories on the Internet. Based on this search model, we have designed and implemented a prototype system named OpenCCFinder. This system takes a query code fragment as its input, and returns the code fragments containing the code clones with the query. It utilizes publicly available code search engines as external resources. Using OpenCCFinder, we have conducted several case studies for Java code. These case studies show the applicability of our system.},
    number = {4},
    author = {Xia, Pei and Manabe, Yuki and Yoshida, Norihiro and Inoue, Katsuro},
    year = {2012},
    journal ={Information and Media Technologies},
    volume = {7},
    pages = {181-187}
    }

  • P. Xia, Y. Manabe, N. Yoshida, and K. Inoue, “Development of a code clone search tool for open source repositories,” Computer software, vol. 29, 2012.
    [BibTeX] [PDF]
    @article{xia_code_2012,
    author = {Xia, Pei and Manabe, Yuki and Yoshida, Norihiro and Inoue, Katsuro},
    year = {2012},
    month = {01},
    pages = {},
    url = {https://www.researchgate.net/publication/265077570_Development_of_a_code_clone_search_tool_for_open_source_repositories},
    title = {Development of a code clone search tool for open source repositories},
    volume = {29},
    journal = {Computer Software}
    }

  • R. Yokomori, H. Siy, N. Yoshida, M. Noro, and K. Inoue, “Evolution of component relationships between framework and application,” Journal of computers, vol. 23, 2012.
    [BibTeX] [PDF]
    @article{yokomori_evolution_2012,
    author = {Yokomori, Reishi and Siy, Harvey and Yoshida, Norihiro and Noro, Masami and Inoue, Katsuro},
    url = {https://www.researchgate.net/publication/267986580_Evolution_of_Component_Relationships_between_Framework_and_Application},
    year = {2012},
    month = {07},
    pages = {},
    title = {Evolution of Component Relationships between Framework and Application},
    volume = {23},
    journal = {Journal of Computers}
    }

  • N. Yoshida, Y. Higo, S. Kusumoto, and K. Inoue, “An experience report on analyzing industrial software systems using code clone detection techniques,” in 2012 19th asia-pacific software engineering conference, 2012, pp. 310-313.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6462669,
    author={N. {Yoshida} and Y. {Higo} and S. {Kusumoto} and K. {Inoue}},
    booktitle={2012 19th Asia-Pacific Software Engineering Conference},
    title={An Experience Report on Analyzing Industrial Software Systems Using Code Clone Detection Techniques},
    year={2012},
    url = {https://ieeexplore.ieee.org/abstract/document/6462669/},
    volume={1},
    number={},
    pages={310-313},
    }

  • Y. Yuan, “A scalable and accurate approach based on count matrix for detecting code clones,” in Proceedings of the 11th annual international conference on aspect-oriented software development companion, New York, NY, USA, 2012, p. 21–22. doi:10.1145/2162110.2162126
    [BibTeX] [PDF]
    @inproceedings{10.1145/2162110.2162126,
    author = {Yuan, Yang},
    title = {A Scalable and Accurate Approach Based on Count Matrix for Detecting Code Clones},
    year = {2012},
    isbn = {9781450312226},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2162110.2162126},
    doi = {10.1145/2162110.2162126},
    booktitle = {Proceedings of the 11th Annual International Conference on Aspect-Oriented Software Development Companion},
    pages = {21–22},
    numpages = {2},
    keywords = {token based, code clone, count matrix},
    location = {Potsdam, Germany},
    series = {AOSD Companion ’12}
    }

  • Y. Yuan and Y. Guo, “Boreas: An accurate and scalable token-based approach to code clone detection,” in 2012 27th IEEE/ACM International Conference on Automated Software Engineering, ASE 2012 – Proceedings, 2012, pp. 286-289. doi:10.1145/2351676.2351725
    [BibTeX] [Abstract] [PDF]

    Detecting code clones in a program has many applications in software engineering and other related fields. In this paper, we present Boreas, an accurate and scalable token-based approach for code clone detection. Boreas introduces a novel counting-based method to define the characteristic matrices, which are able to describe the program segments distinctly and effectively for the purpose of clone detection. We conducted experiments on JDK 7 and Linux kernel 2.6.38.6 source code. Experimental results show that Boreas is able to match the detecting accuracy of a recently proposed syntactic-based tool Deckard, with the execution time reduced by more than an order of magnitude. Copyright 2012 ACM.

    @inproceedings{yuan_boreas_2012,
    title = {Boreas: {An} accurate and scalable token-based approach to code clone detection},
    isbn = {978-1-4503-1204-2},
    doi = {10.1145/2351676.2351725},
    url = {https://ieeexplore.ieee.org/document/6494937},
    abstract = {Detecting code clones in a program has many applications in software engineering and other related fields. In this paper, we present Boreas, an accurate and scalable token-based approach for code clone detection. Boreas introduces a novel counting-based method to define the characteristic matrices, which are able to describe the program segments distinctly and effectively for the purpose of clone detection. We conducted experiments on JDK 7 and Linux kernel 2.6.38.6 source code. Experimental results show that Boreas is able to match the detecting accuracy of a recently proposed syntactic-based tool Deckard, with the execution time reduced by more than an order of magnitude. Copyright 2012 ACM.},
    booktitle = {2012 27th {IEEE}/{ACM} {International} {Conference} on {Automated} {Software} {Engineering}, {ASE} 2012 - {Proceedings}},
    author = {Yuan, Yang and Guo, Yao},
    year = {2012},
    keywords = {Code clone detection, Count matrix, Count vector},
    pages = {286-289}
    }

  • D. Zage, W. Zage, and D. Zage, “Clones: Underlying Patterns throughout the Software Lifecycle,” in 9th working conference on mining software repositories, 2012.
    [BibTeX] [PDF]
    @inproceedings{zage_clones_2012,
    title = {Clones: {Underlying} {Patterns} throughout the {Software} {Lifecycle}},
    url = {https://www.osti.gov/servlets/purl/1145241},
    author = {Zage, DJ and Zage, WM and Zage, D},
    booktitle = {9th Working Conference on Mining Software Repositories},
    year = {2012}
    }

  • G. Zhang, X. Peng, Z. Xing, and W. Zhao, “Cloning practices: why developers clone and what can be changed,” in 2012 28th ieee international conference on software maintenance (icsm), 2012, pp. 285-294.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6405284,
    author={G. {Zhang} and X. {Peng} and Z. {Xing} and W. {Zhao}},
    booktitle={2012 28th IEEE International Conference on Software Maintenance (ICSM)},
    title={Cloning practices: Why developers clone and what can be changed},
    year={2012},
    url = {https://ieeexplore.ieee.org/abstract/document/6405284/},
    volume={},
    number={},
    pages={285-294},
    }

  • M. F. Zibran and C. K. Roy, “IDE-based real-time focused search for near-miss clones,” in Proceedings of the 27th annual acm symposium on applied computing, New York, NY, USA, 2012, p. 1235–1242. doi:10.1145/2245276.2231970
    [BibTeX] [PDF]
    @inproceedings{10.1145/2245276.2231970,
    author = {Zibran, Minhaz F. and Roy, Chanchal K.},
    title = {{IDE}-Based Real-Time Focused Search for Near-Miss Clones},
    year = {2012},
    isbn = {9781450308571},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2245276.2231970},
    doi = {10.1145/2245276.2231970},
    booktitle = {Proceedings of the 27th Annual ACM Symposium on Applied Computing},
    pages = {1235–1242},
    numpages = {8},
    keywords = {clone search, reengineering, maintenance, clone detection},
    location = {Trento, Italy},
    series = {SAC ’12}
    }

2011

  • M. Asaduzzaman, C. K. Roy, and K. A. Schneider, “VisCad: flexible code clone analysis support for NiCad,” in Proceedings of the 5th international workshop on software clones, New York, NY, USA, 2011, pp. 77-78. doi:10.1145/1985404.1985425
    [BibTeX] [PDF]
    @inproceedings{10.1145/1985404.1985425,
    author = {Asaduzzaman, Muhammad and Roy, Chanchal K. and Schneider, Kevin A.},
    title = {{VisCad}: Flexible Code Clone Analysis Support for {NiCad}},
    year = {2011},
    isbn = {9781450305884},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1985404.1985425},
    doi = {10.1145/1985404.1985425},
    booktitle = {Proceedings of the 5th International Workshop on Software Clones},
    pages = {77-78},
    numpages = {2},
    keywords = {analysis, code clones, visualization},
    location = {Waikiki, Honolulu, HI, USA},
    series = {IWSC ’11}
    }

  • J. Carver, D. Chatterji, and N. A. Kraft, “On the need for human-based empirical validation of techniques and tools for code clone analysis,” in Proceedings of the 5th international workshop on software clones, New York, NY, USA, 2011, p. 61–62. doi:10.1145/1985404.1985416
    [BibTeX] [PDF]
    @inproceedings{10.1145/1985404.1985416,
    author = {Carver, Jeffrey and Chatterji, Debarshi and Kraft, Nicholas A.},
    title = {On the Need for Human-Based Empirical Validation of Techniques and Tools for Code Clone Analysis},
    year = {2011},
    isbn = {9781450305884},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1985404.1985416},
    doi = {10.1145/1985404.1985416},
    booktitle = {Proceedings of the 5th International Workshop on Software Clones},
    pages = {61–62},
    numpages = {2},
    keywords = {code clones, clone management, human-based empirical studies},
    location = {Waikiki, Honolulu, HI, USA},
    series = {IWSC ’11}
    }

  • J. R. Cordy and C. K. Roy, “DebCheck: Efficient Checking for Open Source Code Clones in Software Systems,” in 2011 ieee 19th international conference on program comprehension, 2011, pp. 217-218.
    [BibTeX] [Abstract] [PDF]

    The problem of finding code cloned from open source code in software systems is of interest both to the open source community (e.g., for GPL and other open source license enforcement) and the industrial community (e.g., to prevent GPL “contamination” of proprietary commercial software systems). The largest collection of open source software in general distribution is the collection of eight DVDs in the Debian source distribution, and checking for cross-cloning with the Debian source distribution goes a long way towards finding any possible copying from the set of all open source code in the world. The NiCad clone detector is an open source language-sensitive robust clone detector that has been shown to yield both high precision and high recall in detecting syntactically meaningful near-miss clones such as functions and blocks. Given a directory of new source code to check, DebCheck uses NiCad in its incremental mode to efficiently check the system for near-miss clones of C functions in the entire Debian source base in a few minutes on a 2 Gb home computer. The same technique can be used to check systems for cross-clones with any large source collection.

    @inproceedings{cordy_debcheck_2011,
    title = {{DebCheck}: {Efficient} {Checking} for {Open} {Source} {Code} {Clones} in {Software} {Systems}},
    url = {https://ieeexplore.ieee.org/abstract/document/5970188/},
    abstract = {The problem of finding code cloned from open source code in software systems is of interest both to the open source community (e.g., for GPL and other open source license enforcement) and the industrial community (e.g., to prevent GPL "contamination" of proprietary commercial software systems). The largest collection of open source software in general distribution is the collection of eight DVDs in the Debian source distribution, and checking for cross-cloning with the Debian source distribution goes a long way towards finding any possible copying from the set of all open source code in the world. The NiCad clone detector is an open source language-sensitive robust clone detector that has been shown to yield both high precision and high recall in detecting syntactically meaningful near-miss clones such as functions and blocks. Given a directory of new source code to check, DebCheck uses NiCad in its incremental mode to efficiently check the system for near-miss clones of C functions in the entire Debian source base in a few minutes on a 2 Gb home computer. The same technique can be used to check systems for cross-clones with any large source collection.},
    author = {Cordy, James R and Roy, Chanchal K},
    keywords = {clone detection, open source, GPL, licensing},
    year = {2011},
    booktitle={2011 IEEE 19th International Conference on Program Comprehension},
    pages = {217-218}
    }

  • J. R. Cordy, K. Inoue, S. Jarzabek, and R. Koschke, “Fifth international workshop on software clones (IWSC 2011),” in Proceedings of the 33rd international conference on software engineering, New York, NY, USA, 2011, pp. 1210-1211. doi:10.1145/1985793.1986050
    [BibTeX] [PDF]
    @inproceedings{10.1145/1985793.1986050,
    author = {Cordy, James R. and Inoue, Katsuro and Jarzabek, Stanislaw and Koschke, Rainer},
    title = {Fifth International Workshop on Software Clones ({IWSC} 2011)},
    year = {2011},
    isbn = {9781450304450},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1985793.1986050},
    doi = {10.1145/1985793.1986050},
    booktitle = {Proceedings of the 33rd International Conference on Software Engineering},
    pages = {1210-1211},
    numpages = {2},
    keywords = {software clones, software maintenance, clone detection, reverse engineering},
    location = {Waikiki, Honolulu, HI, USA},
    series = {ICSE ’11}
    }

  • T. Lavoie and E. Merlo, “Automated type-3 clone oracle using Levenshtein metric,” in Proceedings of the 5th international workshop on software clones, New York, NY, USA, 2011, pp. 34-40. doi:10.1145/1985404.1985411
    [BibTeX] [PDF]
    @inproceedings{10.1145/1985404.1985411,
    author = {Lavoie, Thierry and Merlo, Ettore},
    title = {Automated Type-3 Clone Oracle Using {Levenshtein} Metric},
    year = {2011},
    isbn = {9781450305884},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1985404.1985411},
    doi = {10.1145/1985404.1985411},
    booktitle = {Proceedings of the 5th International Workshop on Software Clones},
    pages = {34-40},
    numpages = {7},
    keywords = {clone detection, software clones, type-3 clones, clone benchmark},
    location = {Waikiki, Honolulu, HI, USA},
    series = {IWSC ’11}
    }

  • Y. Dang, S. Ge, R. Huang, and D. Zhang, “Code clone detection experience at Microsoft,” in Proceedings of the 5th international workshop on software clones, New York, NY, USA, 2011, p. 63–64. doi:10.1145/1985404.1985417
    [BibTeX] [PDF]
    @inproceedings{10.1145/1985404.1985417,
    author = {Dang, Yingnong and Ge, Song and Huang, Ray and Zhang, Dongmei},
    title = {Code Clone Detection Experience at {Microsoft}},
    year = {2011},
    isbn = {9781450305884},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1985404.1985417},
    doi = {10.1145/1985404.1985417},
    booktitle = {Proceedings of the 5th International Workshop on Software Clones},
    pages = {63–64},
    numpages = {2},
    keywords = {experience, clone detection},
    location = {Waikiki, Honolulu, HI, USA},
    series = {IWSC ’11}
    }

  • M. W. Godfrey, D. M. German, J. Davies, and A. Hindle, “Determining the provenance of software artifacts,” in Proceedings of the 5th international workshop on software clones, New York, NY, USA, 2011, p. 65–66. doi:10.1145/1985404.1985418
    [BibTeX] [PDF]
    @inproceedings{10.1145/1985404.1985418,
    author = {Godfrey, Michael W. and German, Daniel M. and Davies, Julius and Hindle, Abram},
    title = {Determining the Provenance of Software Artifacts},
    year = {2011},
    isbn = {9781450305884},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1985404.1985418},
    doi = {10.1145/1985404.1985418},
    booktitle = {Proceedings of the 5th International Workshop on Software Clones},
    pages = {65–66},
    numpages = {2},
    keywords = {provenance, code evolution, code fingerprints, bertillonage},
    location = {Waikiki, Honolulu, HI, USA},
    series = {IWSC ’11}
    }

  • H. Li and S. Thompson, “Incremental clone detection and elimination for Erlang programs,” in Proceedings of the 14th international conference on fundamental approaches to software engineering: part of the joint european conferences on theory and practice of software, Berlin, Heidelberg, 2011, p. 356–370.
    [BibTeX] [PDF]
    @inproceedings{10.5555/1987434.1987468,
    author = {Li, Huiqing and Thompson, Simon},
    title = {Incremental Clone Detection and Elimination for {Erlang} Programs},
    year = {2011},
    isbn = {9783642198106},
    publisher = {Springer-Verlag},
    address = {Berlin, Heidelberg},
    booktitle = {Proceedings of the 14th International Conference on Fundamental Approaches to Software Engineering: Part of the Joint European Conferences on Theory and Practice of Software},
    pages = {356–370},
    numpages = {15},
    keywords = {program analysis, wrangler, erlang, code clone detection, program transformation, software maintenance, refactoring},
    location = {Saarbr\"{u}cken, Germany},
    series = {FASE’11/ETAPS’11},
    url = {https://dl.acm.org/doi/10.5555/1987434.1987468}
    }

  • R. K. Saha, C. K. Roy, and K. A. Schneider, “Visualizing the evolution of code clones,” in Proceedings of the 5th international workshop on software clones, New York, NY, USA, 2011, p. 71–72. doi:10.1145/1985404.1985421
    [BibTeX] [PDF]
    @inproceedings{10.1145/1985404.1985421,
    author = {Saha, Ripon K. and Roy, Chanchal K. and Schneider, Kevin A.},
    title = {Visualizing the Evolution of Code Clones},
    year = {2011},
    isbn = {9781450305884},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1985404.1985421},
    doi = {10.1145/1985404.1985421},
    booktitle = {Proceedings of the 5th International Workshop on Software Clones},
    pages = {71–72},
    numpages = {2},
    keywords = {scatter plot, clone evolution, visualization},
    location = {Waikiki, Honolulu, HI, USA},
    series = {IWSC ’11}
    }

  • M. F. Zibran, R. K. Saha, M. Asaduzzaman, and C. K. Roy, “Analyzing and forecasting near-miss clones in evolving software: an empirical study,” in 2011 16th ieee international conference on engineering of complex computer systems, 2011, pp. 295-304.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5773403,
    author={M. F. {Zibran} and R. K. {Saha} and M. {Asaduzzaman} and C. K. {Roy}},
    booktitle={2011 16th IEEE International Conference on Engineering of Complex Computer Systems},
    title={Analyzing and Forecasting Near-Miss Clones in Evolving Software: An Empirical Study},
    year={2011},
    url = {https://ieeexplore.ieee.org/abstract/document/5773403/},
    volume={},
    number={},
    pages={295-304},
    }

  • N. Tillmann, M. Fahndrich, M. Moskal, and P. de Halleux, “Code similarity in TouchDevelop: harnessing clones,” Microsoft Research, MSR-TR-2011-103, 2011.
    [BibTeX] [Abstract] [PDF]

    The number of applications available in mobile marketplaces is increasing rapidly. It’s very easy to become overwhelmed by the sheer size of their codebase. We propose to use code clone analysis to help manage existing applications and develop new ones. First, we propose an automatic application ranking scheme based on (dis)similarity. Traditionally, applications in app stores are ranked manually, by user or moderator input. We argue that automatically computed (dis)similarity information can be used to reinforce this ranking and help in dealing with possible application cloning. Second, we consider code snippet search, a task commonly performed by application developers. We view it as a special instance of the clone detection problem which allows us to perform precise search with little to no configuration and completely agnostic of code formatting, variable renamings, etc. We built a prototype of our approach in TouchDevelop, a novel application development environment for Windows Phone, and will use it as a testing ground for future evaluation.

    @techreport{tillmann2011code,
    author = {Tillmann, Nikolai and Fahndrich, Manuel and Moskal, Michal and de Halleux, Peli},
    title = {Code Similarity in {TouchDevelop}: Harnessing Clones},
    year = {2011},
    month = {September},
    abstract = {The number of applications available in mobile marketplaces is increasing rapidly. It's very easy to become overwhelmed by the sheer size of their codebase. We propose to use code clone analysis to help manage existing applications and develop new ones. First, we propose an automatic application ranking scheme based on (dis)similarity. Traditionally, applications in app stores are ranked manually, by user or moderator input. We argue that automatically computed (dis)similarity information can be used to reinforce this ranking and help in dealing with possible application cloning. Second, we consider code snippet search, a task commonly performed by application developers. We view it as a special instance of the clone detection problem which allows us to perform precise search with little to no configuration and completely agnostic of code formatting, variable renamings, etc. We built a prototype of our approach in TouchDevelop, a novel application development environment for Windows Phone, and will use it as a testing ground for future evaluation.},
    institution = {Microsoft Research},
    url = {https://www.microsoft.com/en-us/research/publication/code-similarity-in-touchdevelop-harnessing-clones/},
    number = {MSR-TR-2011-103},
    }

  • B. Hummel, E. Juergens, and D. Steidl, “Index-based model clone detection,” in Proceedings of the 5th international workshop on software clones, New York, NY, USA, 2011, p. 21–27. doi:10.1145/1985404.1985409
    [BibTeX] [PDF]
    @inproceedings{10.1145/1985404.1985409,
    author = {Hummel, Benjamin and Juergens, Elmar and Steidl, Daniela},
    title = {Index-Based Model Clone Detection},
    year = {2011},
    isbn = {9781450305884},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1985404.1985409},
    doi = {10.1145/1985404.1985409},
    booktitle = {Proceedings of the 5th International Workshop on Software Clones},
    pages = {21–27},
    numpages = {7},
    keywords = {matlab/simulink, data-flow, model clone, clone detection},
    location = {Waikiki, Honolulu, HI, USA},
    series = {IWSC ’11}
    }

  • E. Juergens, “Research in cloning beyond code: a first roadmap,” in Proceedings of the 5th international workshop on software clones, New York, NY, USA, 2011, p. 67–68. doi:10.1145/1985404.1985419
    [BibTeX] [PDF]
    @inproceedings{10.1145/1985404.1985419,
    author = {Juergens, Elmar},
    title = {Research in Cloning beyond Code: A First Roadmap},
    year = {2011},
    isbn = {9781450305884},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1985404.1985419},
    doi = {10.1145/1985404.1985419},
    booktitle = {Proceedings of the 5th International Workshop on Software Clones},
    pages = {67–68},
    numpages = {2},
    keywords = {clone detection beyond code, roadmap},
    location = {Waikiki, Honolulu, HI, USA},
    series = {IWSC ’11}
    }

  • F. Beck and S. Diehl, “On the congruence of modularity and code coupling,” in Proceedings of the 19th acm sigsoft symposium and the 13th european conference on foundations of software engineering, New York, NY, USA, 2011, p. 354–364. doi:10.1145/2025113.2025162
    [BibTeX] [PDF]
    @inproceedings{10.1145/2025113.2025162,
    author = {Beck, Fabian and Diehl, Stephan},
    title = {On the Congruence of Modularity and Code Coupling},
    year = {2011},
    isbn = {9781450304436},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/2025113.2025162},
    doi = {10.1145/2025113.2025162},
    booktitle = {Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering},
    pages = {354–364},
    numpages = {11},
    keywords = {package design, modularity, code coupling},
    location = {Szeged, Hungary},
    series = {ESEC/FSE ’11}
    }

  • M. S. Uddin, C. K. Roy, K. A. Schneider, and A. Hindle, “On the effectiveness of simhash for detecting near-miss clones in large scale software systems,” in 2011 18th working conference on reverse engineering, 2011, pp. 13-22.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6079770,
    author={M. S. {Uddin} and C. K. {Roy} and K. A. {Schneider} and A. {Hindle}},
    booktitle={2011 18th Working Conference on Reverse Engineering},
    title={On the Effectiveness of Simhash for Detecting Near-Miss Clones in Large Scale Software Systems},
    year={2011},
    url = {https://ieeexplore.ieee.org/document/6079770},
    volume={},
    number={},
    pages={13-22},
    }

  • E. Choi, N. Yoshida, T. Ishio, K. Inoue, and T. Sano, “Extracting code clones for refactoring using combinations of clone metrics,” in Proceedings – International Conference on Software Engineering, 2011, pp. 7-13. doi:10.1145/1985404.1985407
    [BibTeX] [Abstract] [PDF]

    Code clone detection tools may report a large number of code clones, while software developers are interested in only a subset of code clones that are relevant to software development tasks such as refactoring. Our research group has supported many software developers with the code clone detection tool CCFinder and its GUI front-end Gemini. Gemini shows clone sets (i.e., a set of code clones identical or similar to each other) with several clone metrics including their length and the number of code clones; however, it is not clear how to use those metrics to extract interesting code clones for developers. In this paper, we propose a method combining clone metrics to extract code clones for refactoring activity. We have conducted an empirical study on a web application developed by a Japanese software company. The result indicates that combinations of simple clone metric is more effective to extract refactoring candidates in detected code clones than individual clone metric. ©2011 ACM.

    @inproceedings{choi_extracting_2011,
    title = {Extracting code clones for refactoring using combinations of clone metrics},
    isbn = {978-1-4503-0588-4},
    doi = {10.1145/1985404.1985407},
    url = {https://dl.acm.org/doi/10.1145/1985404.1985407},
    abstract = {Code clone detection tools may report a large number of code clones, while software developers are interested in only a subset of code clones that are relevant to software development tasks such as refactoring. Our research group has supported many software developers with the code clone detection tool CCFinder and its GUI front-end Gemini. Gemini shows clone sets (i.e., a set of code clones identical or similar to each other) with several clone metrics including their length and the number of code clones; however, it is not clear how to use those metrics to extract interesting code clones for developers. In this paper, we propose a method combining clone metrics to extract code clones for refactoring activity. We have conducted an empirical study on a web application developed by a Japanese software company. The result indicates that combinations of simple clone metric is more effective to extract refactoring candidates in detected code clones than individual clone metric. ©2011 ACM.},
    booktitle = {Proceedings - {International} {Conference} on {Software} {Engineering}},
    author = {Choi, Eunjong and Yoshida, Norihiro and Ishio, Takashi and Inoue, Katsuro and Sano, Tateki},
    year = {2011},
    note = {ISSN: 02705257},
    keywords = {Code clone, Refactoring, Industrial case study},
    pages = {7-13}
    }

  • J. R. Cordy, “Live scatterplots,” in Proceedings of the 5th international workshop on software clones, New York, NY, USA, 2011, p. 79–80. doi:10.1145/1985404.1985426
    [BibTeX] [PDF]
    @inproceedings{10.1145/1985404.1985426,
    author = {Cordy, James R.},
    title = {Live Scatterplots},
    year = {2011},
    isbn = {9781450305884},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1985404.1985426},
    doi = {10.1145/1985404.1985426},
    booktitle = {Proceedings of the 5th International Workshop on Software Clones},
    pages = {79–80},
    numpages = {2},
    keywords = {clone analysis, visualization, software clones},
    location = {Waikiki, Honolulu, HI, USA},
    series = {IWSC ’11}
    }

  • J. R. Cordy, “Exploring large-scale system similarity using incremental clone detection and live scatterplots,” in 2011 ieee 19th international conference on program comprehension, 2011, pp. 151-160.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5970149,
    author={J. R. {Cordy}},
    booktitle={2011 IEEE 19th International Conference on Program Comprehension},
    title={Exploring Large-Scale System Similarity Using Incremental Clone Detection and Live Scatterplots},
    year={2011},
    url = {https://ieeexplore.ieee.org/abstract/document/5970149/},
    volume={},
    number={},
    pages={151-160},
    }

  • N. Göde and R. Koschke, “Frequency and risks of changes to clones,” in Proceedings – International Conference on Software Engineering, 2011, pp. 311-320. doi:10.1145/1985793.1985836
    [BibTeX] [Abstract] [PDF]

    Code Clones – duplicated source fragments – are said to increase maintenance effort and to facilitate problems caused by inconsistent changes to identical parts. While this is certainly true for some clones and certainly not true for others, it is unclear how many clones are real threats to the system’s quality and need to be taken care of. Our analysis of clone evolution in mature software projects shows that most clones are rarely changed and the number of unintentional inconsistent changes to clones is small. We thus have to carefully select the clones to be managed to avoid unnecessary effort managing clones with no risk potential. © 2011 ACM.

    @inproceedings{gode_frequency_2011,
    title = {Frequency and risks of changes to clones},
    isbn = {978-1-4503-0445-0},
    doi = {10.1145/1985793.1985836},
    url = {https://ieeexplore.ieee.org/document/6032470},
    abstract = {Code Clones - duplicated source fragments - are said to increase maintenance effort and to facilitate problems caused by inconsistent changes to identical parts. While this is certainly true for some clones and certainly not true for others, it is unclear how many clones are real threats to the system's quality and need to be taken care of. Our analysis of clone evolution in mature software projects shows that most clones are rarely changed and the number of unintentional inconsistent changes to clones is small. We thus have to carefully select the clones to be managed to avoid unnecessary effort managing clones with no risk potential. © 2011 ACM.},
    booktitle = {Proceedings - {International} {Conference} on {Software} {Engineering}},
    author = {Göde, Nils and Koschke, Rainer},
    year = {2011},
    note = {ISSN: 02705257},
    keywords = {clone evolution, clone detection, software maintenance},
    pages = {311-320}
    }

  • Y. Higo and S. Kusumoto, “Code clone detection on specialized pdgs with heuristics,” in Proceedings of the 2011 15th european conference on software maintenance and reengineering, USA, 2011, p. 75–84. doi:10.1109/CSMR.2011.12
    [BibTeX] [PDF]
    @inproceedings{10.1109/CSMR.2011.12,
    author = {Higo, Yoshiki and Kusumoto, Shinji},
    title = {Code Clone Detection on Specialized PDGs with Heuristics},
    year = {2011},
    isbn = {9780769543437},
    publisher = {IEEE Computer Society},
    address = {USA},
    url = {https://doi.org/10.1109/CSMR.2011.12},
    doi = {10.1109/CSMR.2011.12},
    booktitle = {Proceedings of the 2011 15th European Conference on Software Maintenance and Reengineering},
    pages = {75–84},
    numpages = {10},
    keywords = {code clone, program dependency graph},
    series = {CSMR ’11}
    }

  • H. Kim, Y. Jung, S. Kim, and K. Yi, “MeCC: Memory comparison-based clone detector,” in Proceedings – International Conference on Software Engineering, 2011, pp. 301-310. doi:10.1145/1985793.1985835
    [BibTeX] [Abstract] [PDF]

    In this paper, we propose a new semantic clone detection technique by comparing programs’ abstract memory states, which are computed by a semantic-based static analyzer. Our experimental study using three large-scale open source projects shows that our technique can detect semantic clones that existing syntactic- or semantic-based clone detectors miss. Our technique can help developers identify inconsistent clone changes, find refactoring candidates, and understand software evolution related to semantic clones. © 2011 ACM.

    @inproceedings{kim_mecc_2011,
    title = {{MeCC}: {Memory} comparison-based clone detector},
    isbn = {978-1-4503-0445-0},
    doi = {10.1145/1985793.1985835},
    url = {https://ieeexplore.ieee.org/document/6032469},
    abstract = {In this paper, we propose a new semantic clone detection technique by comparing programs' abstract memory states, which are computed by a semantic-based static analyzer. Our experimental study using three large-scale open source projects shows that our technique can detect semantic clones that existing syntactic- or semantic-based clone detectors miss. Our technique can help developers identify inconsistent clone changes, find refactoring candidates, and understand software evolution related to semantic clones. © 2011 ACM.},
    booktitle = {Proceedings - {International} {Conference} on {Software} {Engineering}},
    author = {Kim, Heejung and Jung, Yungbum and Kim, Sunghun and Yi, Kwangkeun},
    year = {2011},
    note = {ISSN: 02705257},
    keywords = {clone detection, software maintenance, static analysis, abstract interpretation},
    pages = {301-310}
    }

  • D. Martin and J. R. Cordy, “Analyzing web service similarity using contextual clones,” in Proceedings of the 5th international workshop on software clones, New York, NY, USA, 2011, p. 41–46. doi:10.1145/1985404.1985412
    [BibTeX] [PDF]
    @inproceedings{10.1145/1985404.1985412,
    author = {Martin, Douglas and Cordy, James R.},
    title = {Analyzing Web Service Similarity Using Contextual Clones},
    year = {2011},
    isbn = {9781450305884},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1985404.1985412},
    doi = {10.1145/1985404.1985412},
    booktitle = {Proceedings of the 5th International Workshop on Software Clones},
    pages = {41–46},
    numpages = {6},
    keywords = {web services, wsdl, clone detection techniques},
    location = {Waikiki, Honolulu, HI, USA},
    series = {IWSC ’11}
    }

  • M. Mondal, M. S. Rahman, R. K. Saha, C. K. Roy, J. Krinke, and K. A. Schneider, “An empirical study of the impacts of clones in software maintenance,” in 2011 ieee 19th international conference on program comprehension, 2011, pp. 242-245.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5970172,
    author={M. {Mondal} and M. S. {Rahman} and R. K. {Saha} and C. K. {Roy} and J. {Krinke} and K. A. {Schneider}},
    booktitle={2011 IEEE 19th International Conference on Program Comprehension},
    title={An Empirical Study of the Impacts of Clones in Software Maintenance},
    year={2011},
    url = {https://ieeexplore.ieee.org/abstract/document/5970172/},
    volume={},
    number={},
    pages={242-245},}

  • I. Keivanloo, J. Rilling, and P. Charland, “Internet-scale real-time code clone search via multi-level indexing,” in 2011 18th working conference on reverse engineering, 2011, pp. 23-27. doi:10.1109/WCRE.2011.13
    [BibTeX] [PDF]
    @inproceedings{inproceedings,
    author = {Keivanloo, Iman and Rilling, Juergen and Charland, Philippe},
    year = {2011},
    month = {10},
    url = {https://www.researchgate.net/publication/221200445_Internet-scale_Real-time_Code_Clone_Search_Via_Multi-level_Indexing},
    pages = {23-27},
    booktitle = {2011 18th Working Conference on Reverse Engineering},
    title = {Internet-scale Real-time Code Clone Search Via Multi-level Indexing},
    doi = {10.1109/WCRE.2011.13}
    }

  • P. Schugerl, J. Rilling, and P. Charland, “Reasoning about global clones: scalable semantic clone detection,” in 2011 ieee 35th annual computer software and applications conference, 2011, pp. 486-491.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6032385,
    author={P. {Schugerl} and J. {Rilling} and P. {Charland}},
    booktitle={2011 IEEE 35th Annual Computer Software and Applications Conference},
    title={Reasoning about Global Clones: Scalable Semantic Clone Detection},
    year={2011},
    url = {https://ieeexplore.ieee.org/abstract/document/6032385},
    volume={},
    number={},
    pages={486-491},}

  • R. K. Saha, C. K. Roy, and K. A. Schneider, “An automatic framework for extracting and classifying near-miss clone genealogies,” in 2011 27th ieee international conference on software maintenance (icsm), 2011, pp. 293-302.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6080796,
    author={R. K. {Saha} and C. K. {Roy} and K. A. {Schneider}},
    booktitle={2011 27th IEEE International Conference on Software Maintenance (ICSM)},
    title={An automatic framework for extracting and classifying near-miss clone genealogies},
    year={2011},
    url = {https://ieeexplore.ieee.org/abstract/document/6080796/},
    volume={},
    number={},
    pages={293-302},
    }

  • P. Schugerl, “Scalable clone detection using description logic,” in Proceedings of the 5th international workshop on software clones, New York, NY, USA, 2011, p. 47–53. doi:10.1145/1985404.1985413
    [BibTeX] [PDF]
    @inproceedings{10.1145/1985404.1985413,
    author = {Schugerl, Philipp},
    title = {Scalable Clone Detection Using Description Logic},
    year = {2011},
    isbn = {9781450305884},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1985404.1985413},
    doi = {10.1145/1985404.1985413},
    booktitle = {Proceedings of the 5th International Workshop on Software Clones},
    pages = {47–53},
    numpages = {7},
    keywords = {code clone detection, semantic-web},
    location = {Waikiki, Honolulu, HI, USA},
    series = {IWSC ’11}
    }

  • S. Schulze, E. Juergens, and J. Feigenspan, “Analyzing the effect of preprocessor annotations on code clones,” in 2011 ieee 11th international working conference on source code analysis and manipulation, 2011, pp. 115-124.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6065170,
    author={S. {Schulze} and E. {Juergens} and J. {Feigenspan}},
    booktitle={2011 IEEE 11th International Working Conference on Source Code Analysis and Manipulation},
    title={Analyzing the Effect of Preprocessor Annotations on Code Clones},
    url = {https://ieeexplore.ieee.org/abstract/document/6065170/},
    year={2011},
    volume={},
    number={},
    pages={115-124},
    }

  • Z. Xing, Y. Xue, and S. Jarzabek, “Clonedifferentiator: analyzing clones by differentiation,” in 2011 26th ieee/acm international conference on automated software engineering (ase 2011), 2011, pp. 576-579.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6100129,
    author={Z. {Xing} and Y. {Xue} and S. {Jarzabek}},
    booktitle={2011 26th IEEE/ACM International Conference on Automated Software Engineering (ASE 2011)},
    title={CloneDifferentiator: Analyzing clones by differentiation},
    year={2011},
    volume={},
    number={},
    pages={576-579},
    url = {https://ieeexplore.ieee.org/abstract/document/6100129/},
    }

  • Y. Yu, T. T. Tun, and B. Nuseibeh, “Specifying and detecting meaningful changes in programs,” in 2011 26th ieee/acm international conference on automated software engineering (ase 2011), 2011, pp. 273-282.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6100063,
    author={Y. {Yu} and T. T. {Tun} and B. {Nuseibeh}},
    booktitle={2011 26th IEEE/ACM International Conference on Automated Software Engineering (ASE 2011)},
    title={Specifying and detecting meaningful changes in programs},
    year={2011},
    url = {https://doi.org/10.1109/ASE.2011.6100063},
    volume={},
    number={},
    pages={273-282},}

  • Y. Yuan and Y. Guo, “Cmcd: count matrix based code clone detection,” in 2011 18th asia-pacific software engineering conference, 2011, pp. 250-257.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6130694,
    author={Y. {Yuan} and Y. {Guo}},
    booktitle={2011 18th Asia-Pacific Software Engineering Conference},
    title={CMCD: Count Matrix Based Code Clone Detection},
    year={2011},
    url = {https://ieeexplore.ieee.org/document/6130694},
    volume={},
    number={},
    pages={250-257},}

  • M. F. Zibran and C. K. Roy, “Conflict-aware optimal scheduling of code clone refactoring: a constraint programming approach,” in 2011 ieee 19th international conference on program comprehension, 2011, pp. 266-269.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5970178,
    author={M. F. {Zibran} and C. K. {Roy}},
    booktitle={2011 IEEE 19th International Conference on Program Comprehension},
    title={Conflict-Aware Optimal Scheduling of Code Clone Refactoring: A Constraint Programming Approach},
    year={2011},
    url = {https://ieeexplore.ieee.org/abstract/document/5970178/},
    volume={},
    number={},
    pages={266-269},
    }

  • M. F. Zibran and C. K. Roy, “Towards flexible code clone detection, management, and refactoring in ide,” in Proceedings of the 5th international workshop on software clones, New York, NY, USA, 2011, p. 75–76. doi:10.1145/1985404.1985423
    [BibTeX] [PDF]
    @inproceedings{10.1145/1985404.1985423,
    author = {Zibran, Minhaz F. and Roy, Chanchal K.},
    title = {Towards Flexible Code Clone Detection, Management, and Refactoring in IDE},
    year = {2011},
    isbn = {9781450305884},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1985404.1985423},
    doi = {10.1145/1985404.1985423},
    booktitle = {Proceedings of the 5th International Workshop on Software Clones},
    pages = {75–76},
    numpages = {2},
    keywords = {maintenance, refactoring, clone analysis, detection},
    location = {Waikiki, Honolulu, HI, USA},
    series = {IWSC ’11}
    }

  • T. Kamiya, “How code skips over revisions,” in Proceedings of the 5th international workshop on software clones, New York, NY, USA, 2011, p. 69–70. doi:10.1145/1985404.1985420
    [BibTeX] [PDF]
    @inproceedings{10.1145/1985404.1985420,
    author = {Kamiya, Toshihiro},
    title = {How Code Skips over Revisions},
    year = {2011},
    isbn = {9781450305884},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1985404.1985420},
    doi = {10.1145/1985404.1985420},
    booktitle = {Proceedings of the 5th International Workshop on Software Clones},
    pages = {69–70},
    numpages = {2},
    keywords = {code clone, mining repositories, code search},
    location = {Waikiki, Honolulu, HI, USA},
    series = {IWSC ’11}
    }

  • J. Carver, D. Chatterji, and N. A. Kraft, “On the need for human-based empirical validation of techniques and tools for code clone analysis,” in Proceedings of the 5th international workshop on software clones, New York, NY, USA, 2011, p. 61–62. doi:10.1145/1985404.1985416
    [BibTeX] [PDF]
    @inproceedings{10.1145/1985404.1985416,
    author = {Carver, Jeffrey and Chatterji, Debarshi and Kraft, Nicholas A.},
    title = {On the Need for Human-Based Empirical Validation of Techniques and Tools for Code Clone Analysis},
    year = {2011},
    isbn = {9781450305884},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1985404.1985416},
    doi = {10.1145/1985404.1985416},
    booktitle = {Proceedings of the 5th International Workshop on Software Clones},
    pages = {61–62},
    numpages = {2},
    keywords = {human-based empirical studies, code clones, clone management},
    location = {Waikiki, Honolulu, HI, USA},
    series = {IWSC ’11}
    }

  • H. A. Basit, U. Ali, and S. Jarzabek, “Viewing simple clones from structural clones’ perspective,” in Proceedings of the 5th international workshop on software clones, New York, NY, USA, 2011, p. 1–6. doi:10.1145/1985404.1985406
    [BibTeX] [PDF]
    @inproceedings{10.1145/1985404.1985406,
    author = {Basit, Hamid Abdul and Ali, Usman and Jarzabek, Stan},
    title = {Viewing Simple Clones from Structural Clones’ Perspective},
    year = {2011},
    isbn = {9781450305884},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1985404.1985406},
    doi = {10.1145/1985404.1985406},
    booktitle = {Proceedings of the 5th International Workshop on Software Clones},
    pages = {1–6},
    numpages = {6},
    keywords = {code clones, high level similarities},
    location = {Waikiki, Honolulu, HI, USA},
    series = {IWSC ’11}
    }

  • N. Göde and J. Harder, “Clone stability,” in 2011 15th european conference on software maintenance and reengineering, 2011, pp. 65-74.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5741247,
    author={N. {Göde} and J. {Harder}},
    booktitle={2011 15th European Conference on Software Maintenance and Reengineering},
    title={Clone Stability},
    url = {https://ieeexplore.ieee.org/document/5741247},
    year={2011},
    volume={},
    number={},
    pages={65-74},}

  • A. Monden, S. Okahara, Y. Manabe, and K. Matsumoto, “Guilty or not guilty: using clone metrics to determine open source licensing violations,” Ieee software, vol. 28, iss. 2, pp. 42-47, 2011.
    [BibTeX] [PDF]
    @ARTICLE{5661763,
    author={A. {Monden} and S. {Okahara} and Y. {Manabe} and K. {Matsumoto}},
    journal={IEEE Software},
    title={Guilty or Not Guilty: Using Clone Metrics to Determine Open Source Licensing Violations},
    year={2011},
    url = {https://ieeexplore.ieee.org/document/5661763},
    volume={28},
    number={2},
    pages={42-47},}

2010

  • E. Juergens and N. Göde, “Achieving accurate clone detection results,” in Proceedings of the 4th international workshop on software clones, New York, NY, USA, 2010, pp. 1-8. doi:10.1145/1808901.1808902
    [BibTeX] [PDF]
    @inproceedings{10.1145/1808901.1808902,
    author = {Juergens, Elmar and Göde, Nils},
    title = {Achieving Accurate Clone Detection Results},
    year = {2010},
    isbn = {9781605589800},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1808901.1808902},
    doi = {10.1145/1808901.1808902},
    booktitle = {Proceedings of the 4th International Workshop on Software Clones},
    pages = {1-8},
    numpages = {8},
    keywords = {software maintenance, clone detection, tailoring, assessment},
    location = {Cape Town, South Africa},
    series = {IWSC ’10}
    }

  • D. Chatterji, B. Massengill, J. Oslin, J. C. Carver, and N. A. Kraft, “Measuring the efficacy of code clone information: an empirical study,” in Evaluation and usability of programming languages and tools, New York, NY, USA, 2010. doi:10.1145/1937117.1937121
    [BibTeX] [PDF]
    @inproceedings{10.1145/1937117.1937121,
    author = {Chatterji, Debarshi and Massengill, Beverly and Oslin, Jason and Carver, Jeffrey C. and Kraft, Nicholas A.},
    title = {Measuring the Efficacy of Code Clone Information: An Empirical Study},
    year = {2010},
    isbn = {9781450305471},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1937117.1937121},
    doi = {10.1145/1937117.1937121},
    booktitle = {Evaluation and Usability of Programming Languages and Tools},
    articleno = {4},
    numpages = {1},
    location = {Reno, Nevada},
    series = {PLATEAU ’10}
    }

  • I. J. Davis and M. W. Godfrey, “From whence it came: detecting source code clones by analyzing assembler,” in Proceedings of working conference on reverse engineering (wcre), 2010, pp. 242-246. doi:10.1109/WCRE.2010.35
    [BibTeX] [PDF]
    @inproceedings{davis_whence_2010,
    title = {From Whence It Came: Detecting Source Code Clones by Analyzing Assembler},
    url = {https://www.researchgate.net/publication/224198233},
    doi = {10.1109/WCRE.2010.35},
    booktitle = {Proceedings of Working Conference on Reverse Engineering (WCRE)},
    author = {Davis, Ian J. and Godfrey, Michael W.},
    year = {2010},
    pages = {242-246}
    }

  • M. Funaro, D. Braga, A. Campi, and C. Ghezzi, “A hybrid approach (syntactic and textual) to clone detection,” in Proceedings of the 4th international workshop on software clones, New York, NY, USA, 2010, p. 79–80. doi:10.1145/1808901.1808914
    [BibTeX] [PDF]
    @inproceedings{10.1145/1808901.1808914,
    author = {Funaro, Marco and Braga, Daniele and Campi, Alessandro and Ghezzi, Carlo},
    title = {A Hybrid Approach (Syntactic and Textual) to Clone Detection},
    year = {2010},
    isbn = {9781605589800},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1808901.1808914},
    doi = {10.1145/1808901.1808914},
    booktitle = {Proceedings of the 4th International Workshop on Software Clones},
    pages = {79–80},
    numpages = {2},
    keywords = {abstract syntax tree, clone detection},
    location = {Cape Town, South Africa},
    series = {IWSC ’10}
    }

  • N. Göde, “Clone removal: fact or fiction?,” in Proceedings of the 4th icse international workshop on software clones (iwsc), 2010, pp. 33-40. doi:10.1145/1808901.1808906
    [BibTeX]
    @inproceedings{inproceedings,
    author = {Göde, Nils},
    year = {2010},
    month = {01},
    pages = {33-40},
    title = {Clone removal: fact or fiction?},
    doi = {10.1145/1808901.1808906},
    booktitle = {Proceedings of the 4th ICSE International Workshop on Software Clones (IWSC)}
    }

  • N. Gold, J. Krinke, M. Harman, and D. Binkley, “Issues in clone classification for dataflow languages,” in Proceedings – International Conference on Software Engineering, 2010, pp. 83-84. doi:10.1145/1808901.1808916
    [BibTeX] [Abstract] [PDF]

    While clone detection and classification research for textual source code is well-established, clones in visual dataflow languages have only recently received attention. The accepted existing clone classification framework does not adequately capture the nature of clones in the latter kind of programs. In this article, we propose a new classification framework for clone types that may be found in dataflow programs. It parallels the scheme for textual languages but accounts for the differences in syntax and semantics present in graphical languages. © 2010 ACM.

    @inproceedings{gold_issues_2010,
    title = {Issues in clone classification for dataflow languages},
    isbn = {978-1-60558-980-0},
    doi = {10.1145/1808901.1808916},
    url = {http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.301.4574&rep=rep1&type=pdf},
    abstract = {While clone detection and classification research for textual source code is well-established, clones in visual dataflow languages have only recently received attention. The accepted existing clone classification framework does not adequately capture the nature of clones in the latter kind of programs. In this article, we propose a new classification framework for clone types that may be found in dataflow programs. It parallels the scheme for textual languages but accounts for the differences in syntax and semantics present in graphical languages. © 2010 ACM.},
    booktitle = {Proceedings - {International} {Conference} on {Software} {Engineering}},
    author = {Gold, Nicolas and Krinke, Jens and Harman, Mark and Binkley, David},
    year = {2010},
    note = {ISSN: 02705257},
    keywords = {clone detection, clone classification},
    pages = {83-84}
    }

  • K. Inoue, S. Jarzabek, J. R. Cordy, and R. Koschke, “Fourth international workshop on software clones (iwsc),” in Proceedings of the 32nd acm/ieee international conference on software engineering – volume 2, New York, NY, USA, 2010, p. 465–466. doi:10.1145/1810295.1810431
    [BibTeX] [PDF]
    @inproceedings{10.1145/1810295.1810431,
    author = {Inoue, Katsuro and Jarzabek, Stanislaw and Cordy, James R. and Koschke, Rainer},
    title = {Fourth International Workshop on Software Clones (IWSC)},
    year = {2010},
    isbn = {9781605587196},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1810295.1810431},
    doi = {10.1145/1810295.1810431},
    booktitle = {Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 2},
    pages = {465–466},
    numpages = {2},
    keywords = {software maintenance, software clone, code clone detection},
    location = {Cape Town, South Africa},
    series = {ICSE ’10}
    }

  • H. Kim, Y. Jung, S. Kim, and K. Yi, “Clone Detection by Comparing Abstract Memory States,” 2010.
    [BibTeX] [Abstract] [PDF]

    In this paper, we propose a new semantic clone detection technique by comparing programs’ abstract memory states, which are computed by a semantic-based static analyzer. Our experimental study using three large-scale open source projects shows that our technique can detect semantic clones that existing syntactic- or semantic-based clone detectors miss. Our technique can help developers identify inconsistent clone changes, find refactoring candidates, and understand software evolution related to semantic clones.

    @techreport{kim_clone_2010,
    title = {Clone {Detection} by {Comparing} {Abstract} {Memory} {States}},
    url = {https://www.semanticscholar.org/paper/Clone-Detection-by-Comparing-Abstract-Memory-States-Kim-Jung/61c0727e3783e456eb22e59f33664785dca5bb69},
    abstract = {In this paper, we propose a new semantic clone detection technique by comparing programs' abstract memory states, which are computed by a semantic-based static analyzer. Our experimental study using three large-scale open source projects shows that our technique can detect semantic clones that existing syntactic- or semantic-based clone detectors miss. Our technique can help developers identify inconsistent clone changes, find refactoring candidates, and understand software evolution related to semantic clones.},
    author = {Kim, Heejung and Jung, Yungbum and Kim, Sunghun and Yi, Kwangkeun},
    year = {2010},
    institution = {Research on Software Analysis for Error-free Computing (ROSAEC)},
    note = {Publication Title: rosaec.snu.ac.kr}
    }

  • J. Krinke, N. Gold, Y. Jia, and D. Binkley, “Distinguishing copies from originals in software clones,” in Proceedings of the 4th international workshop on software clones, New York, NY, USA, 2010, p. 41–48. doi:10.1145/1808901.1808907
    [BibTeX] [PDF]
    @inproceedings{10.1145/1808901.1808907,
    author = {Krinke, Jens and Gold, Nicolas and Jia, Yue and Binkley, David},
    title = {Distinguishing Copies from Originals in Software Clones},
    year = {2010},
    isbn = {9781605589800},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1808901.1808907},
    doi = {10.1145/1808901.1808907},
    booktitle = {Proceedings of the 4th International Workshop on Software Clones},
    pages = {41–48},
    numpages = {8},
    keywords = {clone detection, mining software archives, software evolution},
    location = {Cape Town, South Africa},
    series = {IWSC ’10}
    }

  • M. Lee, J. Roh, S. Hwang, and S. Kim, “Instant code clone search,” in Proceedings of the eighteenth acm sigsoft international symposium on foundations of software engineering, New York, NY, USA, 2010, p. 167–176. doi:10.1145/1882291.1882317
    [BibTeX] [PDF]
    @inproceedings{10.1145/1882291.1882317,
    author = {Lee, Mu-Woong and Roh, Jong-Won and Hwang, Seung-won and Kim, Sunghun},
    title = {Instant Code Clone Search},
    year = {2010},
    isbn = {9781605587912},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1882291.1882317},
    doi = {10.1145/1882291.1882317},
    booktitle = {Proceedings of the Eighteenth ACM SIGSOFT International Symposium on Foundations of Software Engineering},
    pages = {167–176},
    numpages = {10},
    keywords = {code search, clone detection},
    location = {Santa Fe, New Mexico, USA},
    series = {FSE ’10}
    }

  • H. Li and S. Thompson, Similar Code Detection and Elimination for Erlang Programs, Springer, 2010.
    [BibTeX] [Abstract] [PDF]

    A well-known bad code smell in refactoring and software maintenance is duplicated code, that is the existence of code clones, which are code fragments that are identical or similar to one another. Unjustified code clones increase code size, make maintenance and comprehension more difficult, and also indicate design problems such as a lack of encapsulation or abstraction. This paper describes an approach to detecting ‘similar’ code based on the notion of anti-unification, or least-general common abstraction. This mechanism is used for detecting code clones in Erlang programs, and is supplemented by a collection of refactorings to support user-controlled automatic clone removal. The similar code detection algorithm and refactorings are integrated within Wrangler, a tool developed at the University of Kent for interactive refactoring of Erlang programs. We conclude with a report on case studies and comparisons with other tools.

    @book{li_similar_2010,
    title = {Similar {Code} {Detection} and {Elimination} for {Erlang} {Programs}},
    isbn = {978-3-642-11502-8},
    url = {https://link.springer.com/chapter/10.1007/978-3-642-11503-5_10},
    abstract = {A well-known bad code smell in refactoring and software maintenance is duplicated code, that is the existence of code clones, which are code fragments that are identical or similar to one another. Unjustified code clones increase code size, make maintenance and comprehension more difficult, and also indicate design problems such as a lack of encapsulation or abstraction. This paper describes an approach to detecting 'similar' code based on the notion of anti-unification, or least-general common abstraction. This mechanism is used for detecting code clones in Erlang programs, and is supplemented by a collection of refactorings to support user-controlled automatic clone removal. The similar code detection algorithm and refactorings are integrated within Wrangler, a tool developed at the University of Kent for interactive refactoring of Erlang programs. We conclude with a report on case studies and comparisons with other tools.},
    number = {5937},
    publisher = {Springer},
    author = {Li, Huiqing and Thompson, Simon},
    year = {2010},
    pages = {104-118},
    booktitle={Practical Aspects of Declarative Languages},
    note = {Publication Title: Lecture Notes in Computer Science},
    keywords = {Code clone detection, Refactoring, Program analysis, Erlang, Program transformation, Wrangler, Anti-unification, Similar code}
    }

  • G. M. K. Selim, K. C. Foo, and Y. Zou, “Enhancing source-based clone detection using intermediate representation,” in 2010 17th working conference on reverse engineering, 2010, pp. 227-236.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5645563,
    author={G. M. K. {Selim} and K. C. {Foo} and Y. {Zou}},
    booktitle={2010 17th Working Conference on Reverse Engineering},
    title={Enhancing Source-Based Clone Detection Using Intermediate Representation},
    year={2010},
    url = {https://ieeexplore.ieee.org/document/5645563},
    volume={},
    number={},
    pages={227-236},}

  • A. Perumal, S. Kanmani, and E. Kodhai, “Extracting the similarity in detected software clones using metrics,” in 2010 international conference on computer and communication technology (iccct), 2010, pp. 575-579.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5640465,
    author={A. {Perumal} and S. {Kanmani} and E. {Kodhai}},
    booktitle={2010 International Conference on Computer and Communication Technology (ICCCT)},
    title={Extracting the similarity in detected software clones using metrics},
    year={2010},
    url = {https://ieeexplore.ieee.org/abstract/document/5640465/},
    volume={},
    number={},
    pages={575-579},}

  • C. K. Roy and J. R. Cordy, “Near-miss function clones in open source software: An empirical study,” Journal of software maintenance and evolution, vol. 22, iss. 3, pp. 165-189, 2010. doi:10.1002/smr.416
    [BibTeX] [Abstract] [PDF]

    The new hybrid clone detection tool NICAD combines the strengths and overcomes the limitations of both text-based and AST-based clone detection techniques and exploits novel applications of a source transformation system to yield highly accurate identification of cloned code in software systems. In this paper, we present an in-depth study of near-miss function clones in open source software using NICAD. We examine more than 20 open source C, Java and C# systems, including the entire Linux Kernel, Apache httpd, J2SDK-Swing and db4o and compare their use of cloned code in several different dimensions, including language, clone size, clone similarity, clone location and clone density both by proportion of cloned functions and lines of cloned code. We manually verify all detected clones and provide a complete catalogue of different clones in an online repository in a variety of formats. These validated results can be used as a cloning reference for these systems and as a benchmark for evaluating other clone detection tools. Copyright © 2009 John Wiley & Sons, Ltd.

    @article{roy_near-miss_2010,
    title = {Near-miss function clones in open source software: {An} empirical study},
    volume = {22},
    number = {3},
    doi = {10.1002/smr.416},
    url = {https://www.cs.usask.ca/~croy/papers/2010/RC_JSME_OSS_Clones.pdf},
    abstract = {The new hybrid clone detection tool NICAD combines the strengths and overcomes the limitations of both text-based and AST-based clone detection techniques and exploits novel applications of a source transformation system to yield highly accurate identification of cloned code in software systems. In this paper, we present an in-depth study of near-miss function clones in open source software using NICAD. We examine more than 20 open source C, Java and C\# systems, including the entire Linux Kernel, Apache httpd, J2SDK-Swing and db4o and compare their use of cloned code in several different dimensions, including language, clone size, clone similarity, clone location and clone density both by proportion of cloned functions and lines of cloned code. We manually verify all detected clones and provide a complete catalogue of different clones in an online repository in a variety of formats. These validated results can be used as a cloning reference for these systems and as a benchmark for evaluating other clone detection tools. Copyright © 2009 John Wiley \& Sons, Ltd.},
    journal = {Journal of {Software} {Maintenance} and {Evolution}},
    author = {Roy, C. K. and Cordy, J. R.},
    month = apr,
    year = {2010},
    issn = {1532-060X},
    keywords = {Empirical study, Open source software, Near-miss function clones},
    pages = {165-189}
    }

  • C. K. Roy and J. R. Cordy, “Are scripting languages really different?,” in Proceedings of the 4th international workshop on software clones, 2010, pp. 17-24. doi:10.1145/1808901.1808904
    [BibTeX] [Abstract] [PDF]

    Scripting languages such as Python, Perl, Ruby and PHP are increasingly important in new software systems as web technology becomes a dominant force. These languages are often spoken of as having different properties, in particular with respect to cloning, and the question arises whether the observations made based on traditional languages also apply to them. In this paper we present a first experiment in measuring the cloning properties of open source software systems written in the Python scripting language using the NiCad clone detector. We compare our results for Python with previous observations of C, C#, and Java, and discover that perhaps scripting languages are not so different after all. © 2010 ACM.

    @inproceedings{roy_are_2010,
    title = {Are scripting languages really different?},
    isbn = {978-1-60558-980-0},
    doi = {10.1145/1808901.1808904},
    url = {https://dl.acm.org/doi/10.1145/1808901.1808904},
    abstract = {Scripting languages such as Python, Perl, Ruby and PHP are increasingly important in new software systems as web technology becomes a dominant force. These languages are often spoken of as having different properties, in particular with respect to cloning, and the question arises whether the observations made based on traditional languages also apply to them. In this paper we present a first experiment in measuring the cloning properties of open source software systems written in the Python scripting language using the NiCad clone detector. We compare our results for Python with previous observations of C, C\#, and Java, and discover that perhaps scripting languages are not so different after all. © 2010 ACM.},
    booktitle = {Proceedings of the 4th {International} {Workshop} on {Software} {Clones}},
    author = {Roy, Chanchal K. and Cordy, James R.},
    year = {2010},
    note = {ISSN: 02705257},
    keywords = {code clones, empirical study, Python, scripting languages},
    pages = {17-24}
    }

  • R. K. Saha, M. Asaduzzaman, M. F. Zibran, C. K. Roy, and K. A. Schneider, “Evaluating code clone genealogies at release level: an empirical study,” in 2010 10th ieee working conference on source code analysis and manipulation, 2010, pp. 87-96.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5601826,
    author={R. K. {Saha} and M. {Asaduzzaman} and M. F. {Zibran} and C. K. {Roy} and K. A. {Schneider}},
    booktitle={2010 10th IEEE Working Conference on Source Code Analysis and Manipulation},
    title={Evaluating Code Clone Genealogies at Release Level: An Empirical Study},
    year={2010},
    url = {https://ieeexplore.ieee.org/abstract/document/5601826/},
    volume={},
    number={},
    pages={87-96},
    }

  • S. Schulze, S. Apel, and C. Kästner, “Code clones in feature-oriented software product lines,” in Proceedings of the ninth international conference on generative programming and component engineering, New York, NY, USA, 2010, p. 103–112. doi:10.1145/1868294.1868310
    [BibTeX] [PDF]
    @inproceedings{10.1145/1868294.1868310,
    author = {Schulze, Sandro and Apel, Sven and K\"{a}stner, Christian},
    title = {Code Clones in Feature-Oriented Software Product Lines},
    year = {2010},
    isbn = {9781450301541},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1868294.1868310},
    doi = {10.1145/1868294.1868310},
    booktitle = {Proceedings of the Ninth International Conference on Generative Programming and Component Engineering},
    pages = {103–112},
    numpages = {10},
    keywords = {code clones, refactoring, feature-oriented programming, software product lines},
    location = {Eindhoven, The Netherlands},
    series = {GPCE ’10}
    }

  • G. M. K. Selim, L. Barbour, W. Shang, B. Adams, A. E. Hassan, and Y. Zou, “Studying the impact of clones on software defects,” in 2010 17th working conference on reverse engineering, 2010, pp. 13-21.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5645480,
    author={G. M. K. {Selim} and L. {Barbour} and W. {Shang} and B. {Adams} and A. E. {Hassan} and Y. {Zou}},
    booktitle={2010 17th Working Conference on Reverse Engineering},
    title={Studying the Impact of Clones on Software Defects},
    url = {https://ieeexplore.ieee.org/abstract/document/5645480/},
    year={2010},
    volume={},
    number={},
    pages={13-21},
    }

  • D. M. Shawky and A. F. Ali, “Modeling clones evolution in open source systems through chaos theory,” in 2010 2nd international conference on software technology and engineering, 2010, pp. 159-164.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5608893,
    author={D. M. {Shawky} and A. F. {Ali}},
    booktitle={2010 2nd International Conference on Software Technology and Engineering},
    title={Modeling clones evolution in open source systems through chaos theory},
    year={2010},
    url = {https://ieeexplore.ieee.org/abstract/document/5608893/},
    volume={1},
    number={},
    pages={159-164},
    }

  • I. J. Davis and M. W. Godfrey, “Clone detection by exploiting assembler,” in Proceedings of the 4th international workshop on software clones, New York, NY, USA, 2010, p. 77–78. doi:10.1145/1808901.1808913
    [BibTeX] [PDF]
    @inproceedings{10.1145/1808901.1808913,
    author = {Davis, Ian J. and Godfrey, Michael W.},
    title = {Clone Detection by Exploiting Assembler},
    year = {2010},
    isbn = {9781605589800},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1808901.1808913},
    doi = {10.1145/1808901.1808913},
    booktitle = {Proceedings of the 4th International Workshop on Software Clones},
    pages = {77–78},
    numpages = {2},
    keywords = {assembler, C++, C, software, Java, clone detection},
    location = {Cape Town, South Africa},
    series = {IWSC ’10}
    }

  • M. Chilowicz, E. Duris, and G. Roussel, “Towards a multi-scale approach for source code approximate match report,” in Proceedings of the 4th international workshop on software clones, New York, NY, USA, 2010, p. 89–90. doi:10.1145/1808901.1808919
    [BibTeX] [PDF]
    @inproceedings{10.1145/1808901.1808919,
    author = {Chilowicz, Michel and Duris, Etienne and Roussel, Gilles},
    title = {Towards a Multi-Scale Approach for Source Code Approximate Match Report},
    year = {2010},
    isbn = {9781605589800},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1808901.1808919},
    doi = {10.1145/1808901.1808919},
    booktitle = {Proceedings of the 4th International Workshop on Software Clones},
    pages = {89–90},
    numpages = {2},
    keywords = {source code similarity, software plagiarism, clones},
    location = {Cape Town, South Africa},
    series = {IWSC ’10}
    }

  • F. Jacob, D. Hou, and P. Jablonski, “Actively comparing clones inside the code editor,” in Proceedings of the 4th international workshop on software clones, New York, NY, USA, 2010, p. 9–16. doi:10.1145/1808901.1808903
    [BibTeX] [PDF]
    @inproceedings{10.1145/1808901.1808903,
    author = {Jacob, Ferosh and Hou, Daqing and Jablonski, Patricia},
    title = {Actively Comparing Clones inside the Code Editor},
    year = {2010},
    isbn = {9781605589800},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1808901.1808903},
    doi = {10.1145/1808901.1808903},
    booktitle = {Proceedings of the 4th International Workshop on Software Clones},
    pages = {9–16},
    numpages = {8},
    keywords = {code clone, differencing tools, Eclipse integrated development environment, software evolution, software maintenance, copy-and-paste programming, code comparison, Java},
    location = {Cape Town, South Africa},
    series = {IWSC ’10}
    }

  • C. K. Roy and J. R. Cordy, “Are scripting languages really different?,” in Proceedings of the 4th international workshop on software clones, New York, NY, USA, 2010, p. 17–24. doi:10.1145/1808901.1808904
    [BibTeX] [PDF]
    @inproceedings{10.1145/1808901.1808904,
    author = {Roy, Chanchal K. and Cordy, James R.},
    title = {Are Scripting Languages Really Different?},
    year = {2010},
    isbn = {9781605589800},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1808901.1808904},
    doi = {10.1145/1808901.1808904},
    booktitle = {Proceedings of the 4th International Workshop on Software Clones},
    pages = {17–24},
    numpages = {8},
    keywords = {code clones, empirical study, Python, scripting languages},
    location = {Cape Town, South Africa},
    series = {IWSC ’10}
    }

  • A. Lozano and M. Wermelinger, “Tracking clones’ imprint,” in Proceedings of the 4th international workshop on software clones, New York, NY, USA, 2010, p. 65–72. doi:10.1145/1808901.1808910
    [BibTeX] [PDF]
    @inproceedings{10.1145/1808901.1808910,
    author = {Lozano, Angela and Wermelinger, Michel},
    title = {Tracking Clones’ Imprint},
    year = {2010},
    isbn = {9781605589800},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1808901.1808910},
    doi = {10.1145/1808901.1808910},
    booktitle = {Proceedings of the 4th International Workshop on Software Clones},
    pages = {65–72},
    numpages = {8},
    keywords = {extension, empirical software engineering, stability, changeability, maintenance, impact, cloning, persistence, clones, mining software repositories},
    location = {Cape Town, South Africa},
    series = {IWSC ’10}
    }

  • S. Jarzabek and Y. Xue, “Are clones harmful for maintenance?,” in Proceedings of the 4th international workshop on software clones, New York, NY, USA, 2010, p. 73–74. doi:10.1145/1808901.1808911
    [BibTeX] [PDF]
    @inproceedings{10.1145/1808901.1808911,
    author = {Jarzabek, Stan and Xue, Yinxing},
    title = {Are Clones Harmful for Maintenance?},
    year = {2010},
    isbn = {9781605589800},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1808901.1808911},
    doi = {10.1145/1808901.1808911},
    booktitle = {Proceedings of the 4th International Workshop on Software Clones},
    pages = {73–74},
    numpages = {2},
    keywords = {software clones, similarity patterns, clone detection},
    location = {Cape Town, South Africa},
    series = {IWSC ’10}
    }

  • J. Harder and N. Göde, “Quo vadis, clone management?,” in Proceedings of the 4th international workshop on software clones, New York, NY, USA, 2010, p. 85–86. doi:10.1145/1808901.1808917
    [BibTeX] [PDF]
    @inproceedings{10.1145/1808901.1808917,
    author = {Harder, Jan and G\"{o}de, Nils},
    title = {Quo Vadis, Clone Management?},
    year = {2010},
    isbn = {9781605589800},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1808901.1808917},
    doi = {10.1145/1808901.1808917},
    booktitle = {Proceedings of the 4th International Workshop on Software Clones},
    pages = {85–86},
    numpages = {2},
    keywords = {clone management, cost-benefit analysis, software maintenance},
    location = {Cape Town, South Africa},
    series = {IWSC ’10}
    }

  • C. Brown and S. Thompson, “Clone detection and elimination for haskell,” in Proceedings of the 2010 acm sigplan workshop on partial evaluation and program manipulation, New York, NY, USA, 2010, p. 111–120. doi:10.1145/1706356.1706378
    [BibTeX] [PDF]
    @inproceedings{10.1145/1706356.1706378,
    author = {Brown, Christopher and Thompson, Simon},
    title = {Clone Detection and Elimination for Haskell},
    year = {2010},
    isbn = {9781605587271},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1706356.1706378},
    doi = {10.1145/1706356.1706378},
    booktitle = {Proceedings of the 2010 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation},
    pages = {111–120},
    numpages = {10},
    keywords = {haskell, generalisation, refactoring, duplicated code, program analysis, hare, program transformation},
    location = {Madrid, Spain},
    series = {PEPM ’10}
    }

  • D. M. Shawky and A. F. Ali, “An approach for assessing similarity metrics used in metric-based clone detection techniques,” in 2010 3rd international conference on computer science and information technology, 2010, pp. 580-584.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5563834,
    author={D. M. {Shawky} and A. F. {Ali}},
    booktitle={2010 3rd International Conference on Computer Science and Information Technology},
    title={An approach for assessing similarity metrics used in metric-based clone detection techniques},
    year={2010},
    url = {https://ieeexplore.ieee.org/document/5563834},
    volume={1},
    number={},
    pages={580-584},}

  • K. Jalbert and J. S. Bradbury, “Using clone detection to identify bugs in concurrent software,” in 2010 ieee international conference on software maintenance, 2010, pp. 1-5.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5609529,
    author={K. {Jalbert} and J. S. {Bradbury}},
    booktitle={2010 IEEE International Conference on Software Maintenance},
    title={Using clone detection to identify bugs in concurrent software},
    year={2010},
    url = {https://ieeexplore.ieee.org/document/5609529},
    volume={},
    number={},
    pages={1-5},}

  • E. Juergens, F. Deissenboeck, M. Feilkas, B. Hummel, B. Schaetz, S. Wagner, C. Domann, and J. Streit, “Can clone detection support quality assessments of requirements specifications?,” in 2010 acm/ieee 32nd international conference on software engineering, 2010, pp. 79-88.
    [BibTeX] [PDF]
    @INPROCEEDINGS{6062141,
    author={E. {Juergens} and F. {Deissenboeck} and M. {Feilkas} and B. {Hummel} and B. {Schaetz} and S. {Wagner} and C. {Domann} and J. {Streit}},
    booktitle={2010 ACM/IEEE 32nd International Conference on Software Engineering},
    title={Can clone detection support quality assessments of requirements specifications?},
    year={2010},
    url = {https://ieeexplore.ieee.org/document/6062141},
    volume={2},
    number={},
    pages={79-88},}

  • E. Juergens, F. Deissenboeck, and B. Hummel, “Code similarities beyond copy paste,” in 2010 14th european conference on software maintenance and reengineering, 2010, pp. 78-87.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5714422,
    author={E. {Juergens} and F. {Deissenboeck} and B. {Hummel}},
    booktitle={2010 14th European Conference on Software Maintenance and Reengineering},
    title={Code Similarities Beyond Copy Paste},
    year={2010},
    url = {https://ieeexplore.ieee.org/document/5714422},
    volume={},
    number={},
    pages={78-87},}

  • A. Atreya and C. Elkan, “Latent semantic indexing (lsi) fails for trec collections,” Sigkdd explorations, vol. 12, pp. 5-10, 2010. doi:10.1145/1964897.1964900
    [BibTeX] [PDF]
    @article{article,
    author = {Atreya, Avinash and Elkan, Charles},
    year = {2010},
    month = {01},
    pages = {5-10},
    title = {Latent semantic indexing (LSI) fails for TREC collections},
    volume = {12},
    url = {https://www.researchgate.net/publication/220520236_Latent_semantic_indexing_LSI_fails_for_TREC_collections},
    journal = {SIGKDD Explorations},
    doi = {10.1145/1964897.1964900}
    }

2009

  • D. Hou, F. Jacob, and P. Jablonski, “Exploring the design space of proactive tool support for copy-and-paste programming,” in Proceedings of the 2009 conference of the center for advanced studies on collaborative research, USA, 2009, pp. 188-202. doi:10.1145/1723028.1723051
    [BibTeX] [PDF]
    @inproceedings{10.1145/1723028.1723051,
    author = {Hou, Daqing and Jacob, Ferosh and Jablonski, Patricia},
    title = {Exploring the Design Space of Proactive Tool Support for Copy-and-Paste Programming},
    year = {2009},
    publisher = {IBM Corp.},
    address = {USA},
    url = {https://doi.org/10.1145/1723028.1723051},
    doi = {10.1145/1723028.1723051},
    booktitle = {Proceedings of the 2009 Conference of the Center for Advanced Studies on Collaborative Research},
    pages = {188-202},
    numpages = {15},
    location = {Ontario, Canada},
    series = {CASCON ’09}
    }

  • S. Grant and J. R. Cordy, “Vector space analysis of software clones,” in 2009 ieee 17th international conference on program comprehension, 2009, pp. 233-237.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5090048,
    author={S. {Grant} and J. R. {Cordy}},
    booktitle={2009 IEEE 17th International Conference on Program Comprehension},
    url = {https://ieeexplore.ieee.org/abstract/document/5090048/},
    title={Vector space analysis of software clones},
    year={2009},
    volume={},
    number={},
    pages={233-237},
    }

  • D. Hou, P. Jablonski, and F. Jacob, “Cnp: towards an environment for the proactive management of copy-and-paste programming,” in 2009 ieee 17th international conference on program comprehension, 2009, pp. 238-242.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5090049,
    author={D. {Hou} and P. {Jablonski} and F. {Jacob}},
    booktitle={2009 IEEE 17th International Conference on Program Comprehension},
    title={CnP: Towards an environment for the proactive management of copy-and-paste programming},
    year={2009},
    url = {https://ieeexplore.ieee.org/abstract/document/5090049},
    volume={},
    number={},
    pages={238-242},
    }

  • E. Juergens, F. Deissenboeck, B. Hummel, and S. Wagner, “Do code clones matter?,” in 2009 ieee 31st international conference on software engineering, 2009, pp. 485-495.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5070547,
    author={E. {Juergens} and F. {Deissenboeck} and B. {Hummel} and S. {Wagner}},
    booktitle={2009 IEEE 31st International Conference on Software Engineering},
    title={Do code clones matter?},
    year={2009},
    url = {https://ieeexplore.ieee.org/abstract/document/5070547/},
    volume={},
    number={},
    pages={485-495},}

  • R. Koschke, S. Jarzabek, J. Cordy, and K. Inoue, “Third international workshop on software clones (iwsc),” in 2009 13th european conference on software maintenance and reengineering, 2009, pp. 269-270.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4812766,
    author={R. {Koschke} and S. {Jarzabek} and J. {Cordy} and K. {Inoue}},
    booktitle={2009 13th European Conference on Software Maintenance and Reengineering},
    title={Third International Workshop on Software Clones (IWSC)},
    year={2009},
    volume={},
    url = {https://ieeexplore.ieee.org/abstract/document/4812766/},
    number={},
    pages={269-270},}

  • H. Lee and K. Doh, “Tree-pattern-based duplicate code detection,” in Proceedings of the acm first international workshop on data-intensive software management and mining, New York, NY, USA, 2009, p. 7–12. doi:10.1145/1651309.1651312
    [BibTeX] [PDF]
    @inproceedings{10.1145/1651309.1651312,
    author = {Lee, Hyo-Sub and Doh, Kyung-Goo},
    title = {Tree-Pattern-Based Duplicate Code Detection},
    year = {2009},
    isbn = {9781605588100},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1651309.1651312},
    doi = {10.1145/1651309.1651312},
    booktitle = {Proceedings of the ACM First International Workshop on Data-Intensive Software Management and Mining},
    pages = {7–12},
    numpages = {6},
    keywords = {clone detection, software maintenance, reverse engineering, tree-pattern},
    location = {Hong Kong, China},
    series = {DSMM ’09}
    }

  • H. Li and S. Thompson, “Clone detection and removal for erlang/otp within a refactoring environment,” in Proceedings of the 2009 acm sigplan workshop on partial evaluation and program manipulation, New York, NY, USA, 2009, p. 169–178. doi:10.1145/1480945.1480971
    [BibTeX] [PDF]
    @inproceedings{10.1145/1480945.1480971,
    author = {Li, Huiqing and Thompson, Simon},
    title = {Clone Detection and Removal for Erlang/OTP within a Refactoring Environment},
    year = {2009},
    isbn = {9781605583273},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1480945.1480971},
    doi = {10.1145/1480945.1480971},
    booktitle = {Proceedings of the 2009 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation},
    pages = {169–178},
    numpages = {10},
    keywords = {refactoring, erlang, duplicated code, program transformation, wrangler, program analysis},
    location = {Savannah, GA, USA},
    series = {PEPM ’09}
    }

  • C. K. Roy, “Detection and analysis of near-miss software clones,” in 2009 ieee international conference on software maintenance, 2009, pp. 447-450.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5306301,
    author={C. K. {Roy}},
    booktitle={2009 IEEE International Conference on Software Maintenance},
    title={Detection and analysis of near-miss software clones},
    url = {https://qspace.library.queensu.ca/handle/1974/5104},
    year={2009},
    volume={},
    number={},
    pages={447-450},
    }

  • R. Tiarks, R. Koschke, and R. Falke, “An extended assessment of type-3 clones as detected by state-of-the-art tools,” 9th ieee international working conference on source code analysis and manipulation (scam), pp. 67-76, 2009.
    [BibTeX] [PDF]
    @article{tiarks_extended_nodate,
    title = {An extended assessment of type-3 clones as detected by state-of-the-art tools},
    url = {https://www.researchgate.net/publication/220703605_An_Assessment_of_Type-3_Clones_as_Detected_by_State-of-the-Art_Tools},
    journal = {9th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM)},
    year = {2009},
    month = {01},
    pages = {67-76},
    author = {Tiarks, R and Koschke, R and Falke, R}
    }

  • M. Wit, A. Zaidman, and A. Deursen, “Managing code clones using dynamic change tracking and resolution,” in 25th ieee international conference on software maintenance (icsm), 2009, pp. 169-178. doi:10.1109/ICSM.2009.5306336
    [BibTeX] [PDF]
    @inproceedings{inproceedings,
    author = {Wit, Michiel and Zaidman, Andy and Deursen, Arie},
    year = {2009},
    url= {https://www.researchgate.net/publication/221308166_Managing_Code_Clones_Using_Dynamic_Change_Tracking_and_Resolution},
    month = {09},
    booktitle = {25th IEEE International Conference on Software Maintenance (ICSM)},
    pages = {169-178},
    title = {Managing Code Clones Using Dynamic Change Tracking and Resolution},
    doi = {10.1109/ICSM.2009.5306336}
    }

  • E. Juergens, F. Deissenboeck, B. Hummel, and S. Wagner, “Do code clones matter?,” in Proceedings of the 31st international conference on software engineering, USA, 2009, p. 485–495. doi:10.1109/ICSE.2009.5070547
    [BibTeX] [PDF]
    @inproceedings{10.1109/ICSE.2009.5070547,
    author = {Juergens, Elmar and Deissenboeck, Florian and Hummel, Benjamin and Wagner, Stefan},
    title = {Do Code Clones Matter?},
    year = {2009},
    isbn = {9781424434534},
    publisher = {IEEE Computer Society},
    address = {USA},
    url = {https://doi.org/10.1109/ICSE.2009.5070547},
    doi = {10.1109/ICSE.2009.5070547},
    booktitle = {Proceedings of the 31st International Conference on Software Engineering},
    pages = {485–495},
    numpages = {11},
    series = {ICSE ’09}
    }

  • S. Kawaguchi, T. Yamashina, H. Uwano, K. Fushida, Y. Kamei, M. Nagura, and H. Iida, “Shinobi: a tool for automatic code clone detection in the ide,” in 2009 16th working conference on reverse engineering, 2009, pp. 313-314.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5328752,
    author={S. {Kawaguchi} and T. {Yamashina} and H. {Uwano} and K. {Fushida} and Y. {Kamei} and M. {Nagura} and H. {Iida}},
    booktitle={2009 16th Working Conference on Reverse Engineering},
    title={SHINOBI: A Tool for Automatic Code Clone Detection in the IDE},
    year={2009},
    url ={https://ieeexplore.ieee.org/document/5328752},
    volume={},
    number={},
    pages={313-314},}

  • M. Chilowicz, E. Duris, and G. Roussel, “Syntax tree fingerprinting for source code similarity detection,” in 2009 ieee 17th international conference on program comprehension, 2009, pp. 243-247.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5090050,
    author={M. {Chilowicz} and E. {Duris} and G. {Roussel}},
    booktitle={2009 IEEE 17th International Conference on Program Comprehension},
    title={Syntax tree fingerprinting for source code similarity detection},
    year={2009},
    url ={https://ieeexplore.ieee.org/document/5090050},
    volume={},
    number={},
    pages={243-247},}

  • E. Juergens, F. Deissenboeck, and B. Hummel, “Clonedetective – a workbench for clone detection research,” in 2009 ieee 31st international conference on software engineering, 2009, pp. 603-606.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5070566,
    author={E. {Juergens} and F. {Deissenboeck} and B. {Hummel}},
    booktitle={2009 IEEE 31st International Conference on Software Engineering},
    title={CloneDetective - A workbench for clone detection research},
    year={2009},
    url={https://ieeexplore.ieee.org/document/5070566},
    volume={},
    number={},
    pages={603-606},}

  • N. H. Pham, H. A. Nguyen, T. T. Nguyen, J. M. Al-Kofahi, and T. N. Nguyen, “Complete and accurate clone detection in graph-based models,” in 2009 ieee 31st international conference on software engineering, 2009, pp. 276-286.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5070528,
    author={N. H. {Pham} and H. A. {Nguyen} and T. T. {Nguyen} and J. M. {Al-Kofahi} and T. N. {Nguyen}},
    booktitle={2009 IEEE 31st International Conference on Software Engineering},
    title={Complete and accurate clone detection in graph-based models},
    year={2009},
    url ={https://ieeexplore.ieee.org/document/5070528},
    volume={},
    number={},
    pages={276-286},}

  • N. Göde and R. Koschke, “Incremental clone detection,” in 2009 13th european conference on software maintenance and reengineering, 2009, pp. 219-228.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4812755,
    author={N. {Göde} and R. {Koschke}},
    booktitle={2009 13th European Conference on Software Maintenance and Reengineering},
    title={Incremental Clone Detection},
    year={2009},
    volume={},
    url ={https://ieeexplore.ieee.org/document/4812755},
    number={},
    pages={219-228},}

  • H. A. Basit and S. Jarzabek, “A data mining approach for detecting higher-level clones in software,” Ieee transactions on software engineering, vol. 35, iss. 4, pp. 497-514, 2009.
    [BibTeX] [PDF]
    @ARTICLE{4796208,
    author={H. A. {Basit} and S. {Jarzabek}},
    journal={IEEE Transactions on Software Engineering},
    title={A Data Mining Approach for Detecting Higher-Level Clones in Software},
    year={2009},
    url ={https://ieeexplore.ieee.org/document/4796208},
    volume={35},
    number={4},
    pages={497-514},}

  • N. Bettenburg, W. Shang, W. Ibrahim, B. Adams, Y. Zou, and A. E. Hassan, “An empirical study on inconsistent changes to code clones at release level,” in 2009 16th working conference on reverse engineering, 2009, pp. 85-94.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5328705,
    author={N. {Bettenburg} and W. {Shang} and W. {Ibrahim} and B. {Adams} and Y. {Zou} and A. E. {Hassan}},
    booktitle={2009 16th Working Conference on Reverse Engineering},
    title={An Empirical Study on Inconsistent Changes to Code Clones at Release Level},
    year={2009},
    url = {https://ieeexplore.ieee.org/document/5328705},
    volume={},
    number={},
    pages={85-94},}

  • Y. Fukushima, R. Kula, S. Kawaguchi, K. Fushida, M. Nagura, and H. Iida, “Code clone graph metrics for detecting diffused code clones,” in 2009 16th asia-pacific software engineering conference, 2009, pp. 373-380.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5358762,
    author={Y. {Fukushima} and R. {Kula} and S. {Kawaguchi} and K. {Fushida} and M. {Nagura} and H. {Iida}},
    booktitle={2009 16th Asia-Pacific Software Engineering Conference},
    title={Code Clone Graph Metrics for Detecting Diffused Code Clones},
    year={2009},
    url = {https://ieeexplore.ieee.org/document/5358762},
    volume={},
    number={},
    pages={373-380},}

  • Y. Higo, K. Sawa, and S. Kusumoto, “Problematic code clones identification using multiple detection results,” in 2009 16th asia-pacific software engineering conference, 2009, pp. 365-372.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5358749,
    author={Y. {Higo} and K. {Sawa} and S. {Kusumoto}},
    booktitle={2009 16th Asia-Pacific Software Engineering Conference},
    title={Problematic Code Clones Identification Using Multiple Detection Results},
    year={2009},
    url = {https://ieeexplore.ieee.org/document/5358749},
    volume={},
    number={},
    pages={365-372},}

  • E. Merlo and T. Lavoie, “Computing structural types of clone syntactic blocks,” in 2009 16th working conference on reverse engineering, 2009, pp. 274-278.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5328785,
    author={E. {Merlo} and T. {Lavoie}},
    booktitle={2009 16th Working Conference on Reverse Engineering},
    title={Computing Structural Types of Clone Syntactic Blocks},
    year={2009},
    url = {https://ieeexplore.ieee.org/document/5328785},
    volume={},
    number={},
    pages={274-278},}

  • N. Göde, “Evolution of type-1 clones,” in 2009 ninth ieee international working conference on source code analysis and manipulation, 2009, pp. 77-86.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5279977,
    author={N. {Göde}},
    booktitle={2009 Ninth IEEE International Working Conference on Source Code Analysis and Manipulation},
    title={Evolution of Type-1 Clones},
    year={2009},
    url = {https://ieeexplore.ieee.org/document/5279977},
    volume={},
    number={},
    pages={77-86},}

  • R. Tiarks, R. Koschke, and R. Falke, “An assessment of type-3 clones as detected by state-of-the-art tools,” in 2009 ninth ieee international working conference on source code analysis and manipulation, 2009, pp. 67-76.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5279980,
    author={R. {Tiarks} and R. {Koschke} and R. {Falke}},
    booktitle={2009 Ninth IEEE International Working Conference on Source Code Analysis and Manipulation},
    title={An Assessment of Type-3 Clones as Detected by State-of-the-Art Tools},
    year={2009},
    url = {https://ieeexplore.ieee.org/document/5279980},
    volume={},
    number={},
    pages={67-76},}

  • C. K. Roy and J. R. Cordy, “A mutation/injection-based automatic framework for evaluating code clone detection tools,” in 2009 international conference on software testing, verification, and validation workshops, 2009, pp. 157-166.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4976382,
    author={C. K. {Roy} and J. R. {Cordy}},
    booktitle={2009 International Conference on Software Testing, Verification, and Validation Workshops},
    title={A Mutation/Injection-Based Automatic Framework for Evaluating Code Clone Detection Tools},
    year={2009},
    url = {https://ieeexplore.ieee.org/document/4976382},
    volume={},
    number={},
    pages={157-166},}

  • W. S. Evans, C. W. Fraser, and F. Ma, “Clone detection via structural abstraction,” Software quality journal, vol. 17, iss. 4, p. 309–330, 2009. doi:10.1007/s11219-009-9074-y
    [BibTeX] [PDF]
    @article{10.1007/s11219-009-9074-y,
    author = {Evans, William S. and Fraser, Christopher W. and Ma, Fei},
    title = {Clone Detection via Structural Abstraction},
    year = {2009},
    issue_date = {December 2009},
    publisher = {Kluwer Academic Publishers},
    address = {USA},
    volume = {17},
    number = {4},
    issn = {0963-9314},
    url = {https://doi.org/10.1007/s11219-009-9074-y},
    doi = {10.1007/s11219-009-9074-y},
    journal = {Software Quality Journal},
    month = dec,
    pages = {309–330},
    numpages = {22},
    keywords = {Refactoring, Clone detection, Procedural abstraction}
    }

2008

  • C. K. Roy and J. R. Cordy, “Scenario-based comparison of clone detection techniques,” in 2008 16th ieee international conference on program comprehension, 2008, pp. 153-162.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4556127,
    author={C. K. {Roy} and J. R. {Cordy}},
    booktitle={2008 16th IEEE International Conference on Program Comprehension},
    title={Scenario-Based Comparison of Clone Detection Techniques},
    url = {https://ieeexplore.ieee.org/abstract/document/4556127/},
    year={2008},
    volume={},
    number={},
    pages={153-162},
    }

  • C. K. Roy and J. R. Cordy, “Nicad: accurate detection of near-miss intentional clones using flexible pretty-printing and code normalization,” in 2008 16th ieee international conference on program comprehension, 2008, pp. 172-181.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4556129,
    author={C. K. {Roy} and J. R. {Cordy}},
    booktitle={2008 16th IEEE International Conference on Program Comprehension},
    url = {https://ieeexplore.ieee.org/abstract/document/4556129/},
    title={NICAD: Accurate Detection of Near-Miss Intentional Clones Using Flexible Pretty-Printing and Code Normalization},
    year={2008},
    volume={},
    number={},
    pages={172-181},
    }

  • C. K. Roy and J. R. Cordy, “An empirical study of function clones in open source software,” in 2008 15th working conference on reverse engineering, 2008, pp. 81-90.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4656397,
    author={C. K. {Roy} and J. R. {Cordy}},
    booktitle={2008 15th Working Conference on Reverse Engineering},
    title={An Empirical Study of Function Clones in Open Source Software},
    year={2008},
    url = {https://ieeexplore.ieee.org/abstract/document/4656397/},
    volume={},
    number={},
    pages={81-90},
    }

  • M. Gabel, L. Jiang, and Z. Su, “Scalable detection of semantic clones,” in 2008 acm/ieee 30th international conference on software engineering, 2008, pp. 321-330.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4814143,
    author={M. {Gabel} and L. {Jiang} and Z. {Su}},
    booktitle={2008 ACM/IEEE 30th International Conference on Software Engineering},
    title={Scalable detection of semantic clones},
    year={2008},
    volume={},
    number={},
    url = {https://ieeexplore.ieee.org/document/4814143},
    pages={321-330},}

  • R. Falke, P. Frenzel, and R. Koschke, “Empirical evaluation of clone detection using syntax suffix trees,” Empirical software engineering, vol. 13, iss. 6, p. 601–643, 2008. doi:10.1007/s10664-008-9073-9
    [BibTeX] [PDF]
    @article{10.1007/s10664-008-9073-9,
    author = {Falke, Raimar and Frenzel, Pierre and Koschke, Rainer},
    title = {Empirical Evaluation of Clone Detection Using Syntax Suffix Trees},
    year = {2008},
    issue_date = {December 2008},
    publisher = {Kluwer Academic Publishers},
    address = {USA},
    volume = {13},
    number = {6},
    issn = {1382-3256},
    url = {https://doi.org/10.1007/s10664-008-9073-9},
    doi = {10.1007/s10664-008-9073-9},
    journal = {Empirical Software Engineering},
    month = dec,
    pages = {601–643},
    numpages = {43},
    keywords = {Duplication, Software evolution, Redundancy, Program analysis, Software maintenance, Software clone detection}
    }

  • J. Guo and Y. Zou, “Detecting clones in business applications,” in 2008 15th working conference on reverse engineering, 2008, pp. 91-100.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4656398,
    author={J. {Guo} and Y. {Zou}},
    booktitle={2008 15th Working Conference on Reverse Engineering},
    title={Detecting Clones in Business Applications},
    year={2008},
    url ={https://ieeexplore.ieee.org/document/4656398},
    volume={},
    number={},
    pages={91-100},}

  • F. Deissenboeck, B. Hummel, E. Jürgens, B. Schätz, S. Wagner, J. Girard, and S. Teuchert, “Clone detection in automotive model-based development,” in 2008 acm/ieee 30th international conference on software engineering, 2008, pp. 603-612.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4814172,
    author={F. {Deissenboeck} and B. {Hummel} and E. {Jürgens} and B. {Schätz} and S. {Wagner} and J. {Girard} and S. {Teuchert}},
    booktitle={2008 ACM/IEEE 30th International Conference on Software Engineering},
    title={Clone detection in automotive model-based development},
    year={2008},
    url = {https://ieeexplore.ieee.org/abstract/document/4814172},
    volume={},
    number={},
    pages={603-612},}

  • A. Lozano and M. Wermelinger, “Assessing the effect of clones on changeability,” in 2008 ieee international conference on software maintenance, 2008, pp. 227-236.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4658071,
    author={A. {Lozano} and M. {Wermelinger}},
    booktitle={2008 IEEE International Conference on Software Maintenance},
    title={Assessing the effect of clones on changeability},
    year={2008},
    url ={https://ieeexplore.ieee.org/document/4658071},
    volume={},
    number={},
    pages={227-236},}

  • Y. Zhang, H. A. Basit, S. Jarzabek, D. Anh, and M. Low, “Query-based filtering and graphical view generation for clone analysis,” in 2008 ieee international conference on software maintenance, 2008, pp. 376-385.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4658086,
    author={Y. {Zhang} and H. A. {Basit} and S. {Jarzabek} and D. {Anh} and M. {Low}},
    booktitle={2008 IEEE International Conference on Software Maintenance},
    title={Query-based filtering and graphical view generation for clone analysis},
    year={2008},
    url = {https://ieeexplore.ieee.org/document/4658086},
    volume={},
    number={},
    pages={376-385},}

  • J. Krinke, “Is cloned code more stable than non-cloned code?,” in 2008 eighth ieee international working conference on source code analysis and manipulation, 2008, pp. 57-66.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4637539,
    author={J. {Krinke}},
    booktitle={2008 Eighth IEEE International Working Conference on Source Code Analysis and Manipulation},
    title={Is Cloned Code More Stable than Non-cloned Code?},
    year={2008},
    url = {https://ieeexplore.ieee.org/document/4637539},
    volume={},
    number={},
    pages={57-66},}

  • T. Ishio, H. Date, T. Miyake, and K. Inoue, “Mining coding patterns to detect crosscutting concerns in java programs,” in 2008 15th working conference on reverse engineering, 2008, pp. 123-132.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4656401,
    author={T. {Ishio} and H. {Date} and T. {Miyake} and K. {Inoue}},
    booktitle={2008 15th Working Conference on Reverse Engineering},
    title={Mining Coding Patterns to Detect Crosscutting Concerns in Java Programs},
    year={2008},
    url = {https://ieeexplore.ieee.org/document/4656401},
    volume={},
    number={},
    pages={123-132},}

  • J. Zhang, J. Gray, Y. Lin, and R. Tairas, “Aspect mining from a modeling perspective,” International journal of computer applications in technology (ijcat), vol. 31, pp. 74-82, 2008. doi:10.1504/IJCAT.2008.017720
    [BibTeX] [PDF]
    @article{article,
    author = {Zhang, Jing and Gray, Jeff and Lin, Yuehua and Tairas, Robert},
    year = {2008},
    month = {03},
    pages = {74-82},
    title = {Aspect Mining from a Modeling Perspective},
    volume = {31},
    url = {https://www.researchgate.net/publication/220171370_Aspect_Mining_from_a_Modeling_Perspective},
    journal = {International Journal of Computer Applications in Technology (IJCAT)},
    doi = {10.1504/IJCAT.2008.017720}
    }

2007

  • S. R. Jadhav and S. B. Wakurdekar, “Techniques and Algorithm for Clone Detection and Analysis,” International journal of innovative research in computer and communication engineering, vol. 3297, 2007. doi:10.15680/IJIRCCE.2016
    [BibTeX] [Abstract] [PDF]

    For maintenance and development of source code, several studies have exposed that the replica of a code or code clone in software technology are possibly risky. While this is the severe difficulty within software industry, throughout refactoring, there is small bit support for removing software clones. A large demanding difficulty is association and inclusion the replicated code, particularly later starting introduction to the software clone they are going through the numerous alterations in them. This paper presents a novel algorithm in which a couple of clone is mechanically reviewed and exclusive of altering the agenda performance that clone couple is re-factored securely. The differentiations shown in the clones are studied in this approach and those are securely parameterized with no incidence of any side cause. Novel of the gain of this approach is that the insignificant computational expenditure. Lastly, a large-scale experiential study has been carried out on above a million clone couples noticed and this discovery is completed by four dissimilar clone detection tools. This has been conducted in nine open source projects for supplying how re-factorability is exaggerated by dissimilar clone assets and tool arrangement selections.

    @article{jadhav_techniques_2007,
    title = {Techniques and {Algorithm} for {Clone} {Detection} and {Analysis}},
    volume = {3297},
    issn = {2320-9798},
    url = {https://bvucoepune.edu.in/wp-content/uploads/2018/BVUCOEP-DATA/Research_Publications/2015_16/58.pdf},
    doi = {10.15680/IJIRCCE.2016},
    abstract = {For maintenance and development of source code, several studies have exposed that the replica of a code or code clone in software technology are possibly risky. While this is the severe difficulty within software industry, throughout refactoring, there is small bit support for removing software clones. A large demanding difficulty is association and inclusion the replicated code, particularly later starting introduction to the software clone they are going through the numerous alterations in them. This paper presents a novel algorithm in which a couple of clone is mechanically reviewed and exclusive of altering the agenda performance that clone couple is re-factored securely. The differentiations shown in the clones are studied in this approach and those are securely parameterized with no incidence of any side cause. Novel of the gain of this approach is that the insignificant computational expenditure. Lastly, a large-scale experiential study has been carried out on above a million clone couples noticed and this discovery is completed by four dissimilar clone detection tools. This has been conducted in nine open source projects for supplying how re-factorability is exaggerated by dissimilar clone assets and tool arrangement selections.},
    journal = {International Journal of Innovative Research in Computer and Communication Engineering},
    author = {Jadhav, Snehal R and Wakurdekar, Sachin B},
    year = {2007},
    keywords = {Clone refactoring, Empirical study, Code duplication, Re-factorability assessment, Software clone management}
    }

  • L. Jiang, Z. Su, and E. Chiu, “Context-based detection of clone-related bugs,” in Proceedings of the 6th joint meeting of the european software engineering conference and the acm sigsoft symposium on the foundations of software engineering, New York, NY, USA, 2007, p. 55–64. doi:10.1145/1287624.1287634
    [BibTeX] [PDF]
    @inproceedings{10.1145/1287624.1287634,
    author = {Jiang, Lingxiao and Su, Zhendong and Chiu, Edwin},
    title = {Context-Based Detection of Clone-Related Bugs},
    year = {2007},
    isbn = {9781595938114},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1287624.1287634},
    doi = {10.1145/1287624.1287634},
    booktitle = {Proceedings of the 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering},
    pages = {55–64},
    numpages = {10},
    keywords = {context-based bug detection, inconsistencies, code clone detection, code clone-related bugs},
    location = {Dubrovnik, Croatia},
    series = {ESEC-FSE ’07}
    }

  • S. Livieri, Y. Higo, M. Matsushita, and K. Inoue, “Analysis of the linux kernel evolution using code clone coverage,” in Proceedings of the fourth international workshop on mining software repositories, USA, 2007, p. 22. doi:10.1109/MSR.2007.1
    [BibTeX] [PDF]
    @inproceedings{10.1109/MSR.2007.1,
    author = {Livieri, Simone and Higo, Yoshiki and Matsushita, Makoto and Inoue, Katsuro},
    title = {Analysis of the Linux Kernel Evolution Using Code Clone Coverage},
    year = {2007},
    isbn = {076952950X},
    publisher = {IEEE Computer Society},
    address = {USA},
    url = {https://doi.org/10.1109/MSR.2007.1},
    doi = {10.1109/MSR.2007.1},
    booktitle = {Proceedings of the Fourth International Workshop on Mining Software Repositories},
    pages = {22},
    numpages = {4},
    series = {MSR ’07}
    }

  • L. Jiang, G. Misherghi, Z. Su, and S. Glondu, “Deckard: scalable and accurate tree-based detection of code clones,” in 29th international conference on software engineering (icse’07), 2007, pp. 96-105.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4222572,
    author={L. {Jiang} and G. {Misherghi} and Z. {Su} and S. {Glondu}},
    booktitle={29th International Conference on Software Engineering (ICSE'07)},
    title={DECKARD: Scalable and Accurate Tree-Based Detection of Code Clones},
    year={2007},
    url ={https://ieeexplore.ieee.org/document/4222572},
    volume={},
    number={},
    pages={96-105},}

  • Y. Ma and D. Woo, “Applying a code clone detection method to domain analysis of device drivers,” in 14th asia-pacific software engineering conference (apsec’07), 2007, pp. 254-261.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4425862,
    author={Y. {Ma} and D. {Woo}},
    booktitle={14th Asia-Pacific Software Engineering Conference (APSEC'07)},
    title={Applying a Code Clone Detection Method to Domain Analysis of Device Drivers},
    year={2007},
    url = {https://ieeexplore.ieee.org/document/4425862},
    volume={},
    number={},
    pages={254-261},}

  • T. Bakota, R. Ferenc, and T. Gyimothy, “Clone smells in software evolution,” in 2007 ieee international conference on software maintenance, 2007, pp. 24-33.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4362615,
    author={T. {Bakota} and R. {Ferenc} and T. {Gyimothy}},
    booktitle={2007 IEEE International Conference on Software Maintenance},
    title={Clone Smells in Software Evolution},
    year={2007},
    url = {https://ieeexplore.ieee.org/document/4362615},
    volume={},
    number={},
    pages={24-33},}

  • J. Krinke, “A study of consistent and inconsistent changes to code clones,” in 14th working conference on reverse engineering (wcre 2007), 2007, pp. 170-178.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4400163,
    author={J. {Krinke}},
    booktitle={14th Working Conference on Reverse Engineering (WCRE 2007)},
    title={A Study of Consistent and Inconsistent Changes to Code Clones},
    year={2007},
    url = {https://ieeexplore.ieee.org/document/4400163},
    volume={},
    number={},
    pages={170-178},}

  • B. S. Baker, “Finding clones with dup: analysis of an experiment,” Ieee transactions on software engineering, vol. 33, iss. 9, pp. 608-621, 2007.
    [BibTeX] [PDF]
    @ARTICLE{4288194,
    author={B. S. {Baker}},
    journal={IEEE Transactions on Software Engineering},
    title={Finding Clones with Dup: Analysis of an Experiment},
    year={2007},
    url = {https://ieeexplore.ieee.org/document/4288194},
    volume={33},
    number={9},
    pages={608-621},}

  • E. Adar and M. Kim, “Softguess: visualization and exploration of code clones in context,” in 29th international conference on software engineering (icse’07), 2007, pp. 762-766.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4222642,
    author={E. {Adar} and M. {Kim}},
    booktitle={29th International Conference on Software Engineering (ICSE'07)},
    title={SoftGUESS: Visualization and Exploration of Code Clones in Context},
    year={2007},
    url = {https://ieeexplore.ieee.org/document/4222642},
    volume={},
    number={},
    pages={762-766},}

  • A. Lozano, M. Wermelinger, and B. Nuseibeh, “Evaluating the harmfulness of cloning: a change based experiment,” in Fourth international workshop on mining software repositories (msr’07:icse workshops 2007), 2007, p. 18.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4228655,
    author={A. {Lozano} and M. {Wermelinger} and B. {Nuseibeh}},
    booktitle={Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007)},
    title={Evaluating the Harmfulness of Cloning: A Change Based Experiment},
    year={2007},
    url = {https://ieeexplore.ieee.org/document/4228655},
    volume={},
    number={},
    pages={18},}

  • L. Aversano, L. Cerulo, and M. Di Penta, “How clones are maintained: an empirical study,” in 11th european conference on software maintenance and reengineering (csmr’07), 2007, pp. 81-90.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4145027,
    author={L. {Aversano} and L. {Cerulo} and M. {Di Penta}},
    booktitle={11th European Conference on Software Maintenance and Reengineering (CSMR'07)},
    title={How Clones are Maintained: An Empirical Study},
    year={2007},
    url = {https://ieeexplore.ieee.org/document/4145027},
    volume={},
    number={},
    pages={81-90},}

  • Y. Higo, Y. Ueda, S. Kusumoto, and K. Inoue, “Simultaneous modification support based on code clone analysis,” in 14th asia-pacific software engineering conference (apsec’07), 2007, pp. 262-269.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4425863,
    author={Y. {Higo} and Y. {Ueda} and S. {Kusumoto} and K. {Inoue}},
    booktitle={14th Asia-Pacific Software Engineering Conference (APSEC'07)},
    title={Simultaneous Modification Support based on Code Clone Analysis},
    year={2007},
    url = {https://ieeexplore.ieee.org/document/4425863},
    volume={},
    number={},
    pages={262-269},}

  • C. K. Roy, M. Gias Uddin, B. Roy, and T. R. Dean, “Evaluating aspect mining techniques: a case study,” in 15th ieee international conference on program comprehension (icpc ’07), 2007, pp. 167-176.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4268251,
    author={C. K. {Roy} and M. {Gias Uddin} and B. {Roy} and T. R. {Dean}},
    booktitle={15th IEEE International Conference on Program Comprehension (ICPC '07)},
    title={Evaluating Aspect Mining Techniques: A Case Study},
    year={2007},
    url = {https://ieeexplore.ieee.org/document/4268251},
    volume={},
    number={},
    pages={167-176},}

  • H. A. Basit and S. Jarzabek, “Efficient token based clone detection with flexible tokenization,” in Proceedings of the 6th joint meeting of the european software engineering conference and the acm sigsoft symposium on the foundations of software engineering, New York, NY, USA, 2007, p. 513–516. doi:10.1145/1287624.1287698
    [BibTeX] [PDF]
    @inproceedings{10.1145/1287624.1287698,
    author = {Basit, Hamid Abdul and Jarzabek, Stan},
    title = {Efficient Token Based Clone Detection with Flexible Tokenization},
    year = {2007},
    isbn = {9781595938114},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1287624.1287698},
    doi = {10.1145/1287624.1287698},
    booktitle = {Proceedings of the 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering},
    pages = {513–516},
    numpages = {4},
    keywords = {token-based clone detection, clone detection, reverse engineering, software maintenance},
    location = {Dubrovnik, Croatia},
    series = {ESEC-FSE ’07}
    }

  • E. Duala-Ekoko and M. P. Robillard, “Tracking code clones in evolving software,” in Proceedings of the 29th international conference on software engineering, USA, 2007, p. 158–167. doi:10.1109/ICSE.2007.90
    [BibTeX] [PDF]
    @inproceedings{10.1109/ICSE.2007.90,
    author = {Duala-Ekoko, Ekwa and Robillard, Martin P.},
    title = {Tracking Code Clones in Evolving Software},
    year = {2007},
    isbn = {0769528287},
    publisher = {IEEE Computer Society},
    address = {USA},
    url = {https://doi.org/10.1109/ICSE.2007.90},
    doi = {10.1109/ICSE.2007.90},
    booktitle = {Proceedings of the 29th International Conference on Software Engineering},
    pages = {158–167},
    numpages = {10},
    series = {ICSE ’07}
    }

  • M. Harman, “Search based software engineering for program comprehension,” in 15th ieee international conference on program comprehension (icpc ’07), 2007, pp. 3-13.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4268235,
    author={M. {Harman}},
    booktitle={15th IEEE International Conference on Program Comprehension (ICPC '07)},
    title={Search Based Software Engineering for Program Comprehension},
    year={2007},
    url = {https://ieeexplore.ieee.org/document/4268235},
    volume={},
    number={},
    pages={3-13},}

  • P. Deshane and D. Hou, “Cren: a tool for tracking copy-and-paste code clones and renaming identifiers consistently in the ide,” in Proceedings of the 2007 oopsla workshop on eclipse technology exchange (etx), 2007, pp. 16-20. doi:10.1145/1328279.1328283
    [BibTeX] [PDF]
    @inproceedings{inproceedings,
    author = {Deshane, Patricia and Hou, Daqing},
    year = {2007},
    month = {01},
    pages = {16-20},
    title = {CReN: a tool for tracking copy-and-paste code clones and renaming identifiers consistently in the IDE},
    doi = {10.1145/1328279.1328283},
    url = {https://www.researchgate.net/publication/221107989_CReN_a_tool_for_tracking_copy-and-paste_code_clones_and_renaming_identifiers_consistently_in_the_IDE},
    booktitle = {Proceedings of the 2007 OOPSLA workshop on Eclipse Technology eXchange (ETX)},
    }

  • Z. M. Jiang and A. E. Hassan, “A framework for studying clones in large software systems,” in Seventh ieee international working conference on source code analysis and manipulation (scam 2007), 2007, pp. 203-212.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4362914,
    author={Z. M. {Jiang} and A. E. {Hassan}},
    booktitle={Seventh IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2007)},
    title={A Framework for Studying Clones In Large Software Systems},
    year={2007},
    url = {https://ieeexplore.ieee.org/abstract/document/4362914},
    volume={},
    number={},
    pages={203-212},}

  • A. Kellens, K. Mens, and P. Tonella, “A survey of automated code-level aspect mining techniques,” in Transactions on aspect-oriented software development iv, A. Rashid and M. Aksit, Eds., Berlin, Heidelberg: Springer berlin heidelberg, 2007, pp. 143-162. doi:10.1007/978-3-540-77042-8_6
    [BibTeX] [Abstract] [PDF]

    This paper offers a first, in-breadth survey and comparison of current aspect mining tools and techniques. It focuses mainly on automated techniques that mine a program’s static or dynamic structure for candidate aspects. We present an initial comparative framework for distinguishing aspect mining techniques, and assess known techniques against this framework. The results of this assessment may serve as a roadmap to potential users of aspect mining techniques, to help them in selecting an appropriate technique. It also helps aspect mining researchers to identify remaining open research questions, possible avenues for future research, and interesting combinations of existing techniques.

    @Inbook{Kellens2007,
    author="Kellens, Andy
    and Mens, Kim
    and Tonella, Paolo",
    editor="Rashid, Awais
    and Aksit, Mehmet",
    title="A Survey of Automated Code-Level Aspect Mining Techniques",
    booktitle="Transactions on Aspect-Oriented Software Development IV",
    year="2007",
    publisher="Springer Berlin Heidelberg",
    address="Berlin, Heidelberg",
    pages="143-162",
    abstract="This paper offers a first, in-breadth survey and comparison of current aspect mining tools and techniques. It focuses mainly on automated techniques that mine a program's static or dynamic structure for candidate aspects. We present an initial comparative framework for distinguishing aspect mining techniques, and assess known techniques against this framework. The results of this assessment may serve as a roadmap to potential users of aspect mining techniques, to help them in selecting an appropriate technique. It also helps aspect mining researchers to identify remaining open research questions, possible avenues for future research, and interesting combinations of existing techniques.",
    isbn="978-3-540-77042-8",
    doi="10.1007/978-3-540-77042-8_6",
    url="https://doi.org/10.1007/978-3-540-77042-8_6"
    }

  • S. Livieri, Y. Higo, M. Matsushita, and K. Inoue, “Very-large scale code clone analysis and visualization of open source programs using distributed ccfinder: d-ccfinder,” in Proceedings of the 29th international conference on software engineering, USA, 2007, p. 106–115. doi:10.1109/ICSE.2007.97
    [BibTeX] [PDF]
    @inproceedings{10.1109/ICSE.2007.97,
    author = {Livieri, Simone and Higo, Yoshiki and Matsushita, Makoto and Inoue, Katsuro},
    title = {Very-Large Scale Code Clone Analysis and Visualization of Open Source Programs Using Distributed CCFinder: D-CCFinder},
    year = {2007},
    isbn = {0769528287},
    publisher = {IEEE Computer Society},
    address = {USA},
    url = {https://doi.org/10.1109/ICSE.2007.97},
    doi = {10.1109/ICSE.2007.97},
    booktitle = {Proceedings of the 29th International Conference on Software Engineering},
    pages = {106–115},
    numpages = {10},
    series = {ICSE ’07}
    }

  • G. S. Manku, A. Jain, and A. Das Sarma, “Detecting near-duplicates for web crawling,” in Proceedings of the 16th international conference on world wide web, New York, NY, USA, 2007, p. 141–150. doi:10.1145/1242572.1242592
    [BibTeX] [PDF]
    @inproceedings{10.1145/1242572.1242592,
    author = {Manku, Gurmeet Singh and Jain, Arvind and Das Sarma, Anish},
    title = {Detecting Near-Duplicates for Web Crawling},
    year = {2007},
    isbn = {9781595936547},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1242572.1242592},
    doi = {10.1145/1242572.1242592},
    booktitle = {Proceedings of the 16th International Conference on World Wide Web},
    pages = {141–150},
    numpages = {10},
    keywords = {hamming distance, web document, sketch, search, web crawl, similarity, near-duplicate, fingerprint},
    location = {Banff, Alberta, Canada},
    series = {WWW ’07}
    }

  • S. M. Nasehi, G. R. Sotudeh, and M. Gomrokchi, “Source code enhancement using reduction of duplicated code,” in Proceedings of the 25th conference on iasted international multi-conference: software engineering, USA, 2007, p. 192–197.
    [BibTeX] [PDF]
    @inproceedings{10.5555/1332044.1332075,
    author = {Nasehi, Seyyed Mehdi and Sotudeh, Gholam Reza and Gomrokchi, Maziar},
    title = {Source Code Enhancement Using Reduction of Duplicated Code},
    year = {2007},
    publisher = {ACTA Press},
    address = {USA},
    booktitle = {Proceedings of the 25th Conference on IASTED International Multi-Conference: Software Engineering},
    pages = {192–197},
    url = {https://dl.acm.org/doi/10.5555/1332044.1332075},
    numpages = {6},
    keywords = {duplicated code, code smell detection, refactoring},
    location = {Innsbruck, Austria},
    series = {SE’07}
    }

  • D. Poshyvanyk and A. Marcus, “Combining formal concept analysis with information retrieval for concept location in source code,” in Proceedings of the 15th ieee international conference on program comprehension, USA, 2007, p. 37–48. doi:10.1109/ICPC.2007.13
    [BibTeX] [PDF]
    @inproceedings{10.1109/ICPC.2007.13,
    author = {Poshyvanyk, Denys and Marcus, Andrian},
    title = {Combining Formal Concept Analysis with Information Retrieval for Concept Location in Source Code},
    year = {2007},
    isbn = {0769528600},
    publisher = {IEEE Computer Society},
    address = {USA},
    url = {https://doi.org/10.1109/ICPC.2007.13},
    doi = {10.1109/ICPC.2007.13},
    booktitle = {Proceedings of the 15th IEEE International Conference on Program Comprehension},
    pages = {37–48},
    numpages = {12},
    series = {ICPC ’07}
    }

  • D. C. Rajapakse and S. Jarzabek, “Using server pages to unify clones in web applications: a trade-off analysis,” in 29th international conference on software engineering (icse’07), 2007, pp. 116-126.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4222574,
    author={D. C. {Rajapakse} and S. {Jarzabek}},
    booktitle={29th International Conference on Software Engineering (ICSE'07)},
    title={Using Server Pages to Unify Clones in Web Applications: A Trade-Off Analysis},
    year={2007},
    url ={https://ieeexplore.ieee.org/document/4222574},
    volume={},
    number={},
    pages={116-126},}

2006

  • H. Liu, Z. Ma, L. Zhang, and W. Shao, “Detecting duplications in sequence diagrams based on suffix trees,” in 2006 13th asia pacific software engineering conference (apsec’06), 2006, pp. 269-276.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4137427,
    author={H. {Liu} and Z. {Ma} and L. {Zhang} and W. {Shao}},
    booktitle={2006 13th Asia Pacific Software Engineering Conference (APSEC'06)},
    title={Detecting Duplications in Sequence Diagrams Based on Suffix Trees},
    year={2006},
    url ={https://ieeexplore.ieee.org/document/4137427},
    volume={},
    number={},
    pages={269-276},}

  • C. Kapser and M. W. Godfrey, “‘Cloning considered harmful’ considered harmful,” in 2006 13th working conference on reverse engineering, 2006, pp. 19-28.
    [BibTeX] [PDF]
    @INPROCEEDINGS{4023973,
    author={C. {Kapser} and M. W. {Godfrey}},
    booktitle={2006 13th Working Conference on Reverse Engineering},
    title={"Cloning Considered Harmful" Considered Harmful},
    year={2006},
    url = {https://ieeexplore.ieee.org/document/4023973},
    volume={},
    number={},
    pages={19-28},}

  • M. Balint, R. Marinescu, and T. Girba, “How developers copy,” in 14th ieee international conference on program comprehension (icpc’06), 2006, pp. 56-68.
    [BibTeX] [PDF]
    @INPROCEEDINGS{1631105,
    author={M. {Balint} and R. {Marinescu} and T. {Girba}},
    booktitle={14th IEEE International Conference on Program Comprehension (ICPC'06)},
    title={How Developers Copy},
    url = {https://ieeexplore.ieee.org/document/1631105},
    year={2006},
    volume={},
    number={},
    pages={56-68},}

  • E. Adar, “Guess: a language and interface for graph exploration,” in Proceedings of the sigchi conference on human factors in computing systems, New York, NY, USA, 2006, p. 791–800. doi:10.1145/1124772.1124889
    [BibTeX] [PDF]
    @inproceedings{10.1145/1124772.1124889,
    author = {Adar, Eytan},
    title = {GUESS: A Language and Interface for Graph Exploration},
    year = {2006},
    isbn = {1595933727},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1124772.1124889},
    doi = {10.1145/1124772.1124889},
    booktitle = {Proceedings of the SIGCHI Conference on Human Factors in Computing Systems},
    pages = {791–800},
    numpages = {10},
    keywords = {graph layout, graph visualization, domain-specific embedded language},
    location = {Montr\'{e}al, Qu\'{e}bec, Canada},
    series = {CHI ’06}
    }

  • S. Bouktif, G. Antoniol, E. Merlo, and M. Neteler, “A novel approach to optimize clone refactoring activity,” in Proceedings of the 8th annual conference on genetic and evolutionary computation, New York, NY, USA, 2006, p. 1885–1892. doi:10.1145/1143997.1144312
    [BibTeX] [PDF]
    @inproceedings{10.1145/1143997.1144312,
    author = {Bouktif, Salah and Antoniol, Giuliano and Merlo, Ettore and Neteler, Markus},
    title = {A Novel Approach to Optimize Clone Refactoring Activity},
    year = {2006},
    isbn = {1595931864},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1143997.1144312},
    doi = {10.1145/1143997.1144312},
    booktitle = {Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation},
    pages = {1885–1892},
    numpages = {8},
    keywords = {genetic algorithms, multi-objective optimization, effort prediction, evolution modeling, software quality improvement, refactoring effort},
    location = {Seattle, Washington, USA},
    series = {GECCO ’06}
    }

  • G. A. Di Lucca, D. Distante, and M. L. Bernardi, “Recovering conceptual models from web applications,” in Proceedings of the 24th annual acm international conference on design of communication, New York, NY, USA, 2006, p. 113–120. doi:10.1145/1166324.1166351
    [BibTeX] [PDF]
    @inproceedings{10.1145/1166324.1166351,
    author = {Di Lucca, Giuseppe Antonio and Distante, Damiano and Bernardi, Mario Luca},
    title = {Recovering Conceptual Models from Web Applications},
    year = {2006},
    isbn = {1595935231},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1166324.1166351},
    doi = {10.1145/1166324.1166351},
    booktitle = {Proceedings of the 24th Annual ACM International Conference on Design of Communication},
    pages = {113–120},
    numpages = {8},
    keywords = {conceptual modeling, documentation, UWA, reverse engineering, Web applications, design},
    location = {Myrtle Beach, SC, USA},
    series = {SIGDOC ’06}
    }

  • S. Ducasse, O. Nierstrasz, and M. Rieger, “On the effectiveness of clone detection by string matching,” Journal of software maintenance and evolution: research and practice, vol. 18, iss. 1, pp. 37-58, 2006. doi:10.1002/smr.317
    [BibTeX] [Abstract] [PDF]

    Although duplicated code is known to pose severe problems for software maintenance, it is difficult to identify in large systems. Many different techniques have been developed to detect software clones, some of which are very sophisticated, but are also expensive to implement and adapt. Lightweight techniques based on simple string matching are easy to implement, but how effective are they? We present a simple string-based approach which we have successfully applied to a number of different languages such as COBOL, JAVA, C++, PASCAL, PYTHON, SMALLTALK, C and PDP-11 ASSEMBLER. In each case the maximum time to adapt the approach to a new language was less than 45 minutes. In this paper we investigate a number of simple variants of string-based clone detection that normalize differences due to common editing operations, and assess the quality of clone detection for very different case studies. Our results confirm that this inexpensive clone detection technique generally achieves high recall and acceptable precision. Over-zealous normalization of the code before comparison, however, can result in an unacceptable number of false positives. Copyright © 2005 John Wiley & Sons, Ltd.

    @article{doi:10.1002/smr.317,
    author = {Ducasse, Stéphane and Nierstrasz, Oscar and Rieger, Matthias},
    title = {On the effectiveness of clone detection by string matching},
    journal = {Journal of Software Maintenance and Evolution: Research and Practice},
    volume = {18},
    number = {1},
    pages = {37-58},
    keywords = {software maintenance, duplicated code, string matching, clone detection},
    doi = {10.1002/smr.317},
    url = {https://onlinelibrary.wiley.com/doi/abs/10.1002/smr.317},
    eprint = {https://onlinelibrary.wiley.com/doi/pdf/10.1002/smr.317},
    abstract = {Although duplicated code is known to pose severe problems for software maintenance, it is difficult to identify in large systems. Many different techniques have been developed to detect software clones, some of which are very sophisticated, but are also expensive to implement and adapt. Lightweight techniques based on simple string matching are easy to implement, but how effective are they? We present a simple string-based approach which we have successfully applied to a number of different languages such as COBOL, JAVA, C++, PASCAL, PYTHON, SMALLTALK, C and PDP-11 ASSEMBLER. In each case the maximum time to adapt the approach to a new language was less than 45 minutes. In this paper we investigate a number of simple variants of string-based clone detection that normalize differences due to common editing operations, and assess the quality of clone detection for very different case studies. Our results confirm that this inexpensive clone detection technique generally achieves high recall and acceptable precision. Over-zealous normalization of the code before comparison, however, can result in an unacceptable number of false positives. Copyright © 2005 John Wiley \& Sons, Ltd.},
    year = {2006}
    }

  • S. Giesecke, “Generic modelling of code clones,” Proceedings of duplication, redundancy, and similarity in software, 2006.
    [BibTeX] [PDF]
    @article{article,
    author = {Giesecke, Simon},
    year = {2006},
    month = {01},
    url = {https://www.researchgate.net/publication/30815550_Generic_modelling_of_code_clones},
    pages = {},
    journal = {Proceedings of Duplication, Redundancy, and Similarity in Software},
    title = {Generic modelling of code clones}
    }

  • R. Geiger, B. Fluri, H. C. Gall, and M. Pinzger, “Relation of code clones and change couplings,” in Fundamental approaches to software engineering, Berlin, Heidelberg, 2006, pp. 411-425.
    [BibTeX] [Abstract] [PDF]

    Code clones have long been recognized as bad smells in software systems and are considered to cause maintenance problems during evolution. It is broadly assumed that the more clones two files share, the more often they have to be changed together. This relation between clones and change couplings has been postulated but neither demonstrated nor quantified yet. However, given such a relation it would simplify the identification of restructuring candidates and reduce change couplings. In this paper, we examine this relation and discuss if a correlation between code clones and change couplings can be verified. For that, we propose a framework to examine code clones and relate them to change couplings taken from release history analysis. We validated our framework with the open source project Mozilla and the results of the validation show that although the relation is statistically unverifiable it derives a reasonable amount of cases where the relation exists. Therefore, to discover clone candidates for restructuring we additionally propose a set of metrics and a visualization technique. This allows one to spot where a correlation between cloning and change coupling exists and, as a result, which files should be restructured to ease further evolution.

    @InProceedings{10.1007/11693017_31,
    author="Geiger, Reto
    and Fluri, Beat
    and Gall, Harald C.
    and Pinzger, Martin",
    editor="Baresi, Luciano
    and Heckel, Reiko",
    title="Relation of Code Clones and Change Couplings",
    booktitle="Fundamental Approaches to Software Engineering",
    year="2006",
    publisher="Springer Berlin Heidelberg",
    address="Berlin, Heidelberg",
    pages="411-425",
    url = {https://link.springer.com/chapter/10.1007/11693017_31},
    abstract="Code clones have long been recognized as bad smells in software systems and are considered to cause maintenance problems during evolution. It is broadly assumed that the more clones two files share, the more often they have to be changed together. This relation between clones and change couplings has been postulated but neither demonstrated nor quantified yet. However, given such a relation it would simplify the identification of restructuring candidates and reduce change couplings. In this paper, we examine this relation and discuss if a correlation between code clones and change couplings can be verified. For that, we propose a framework to examine code clones and relate them to change couplings taken from release history analysis. We validated our framework with the open source project Mozilla and the results of the validation show that although the relation is statistically unverifiable it derives a reasonable amount of cases where the relation exists. Therefore, to discover clone candidates for restructuring we additionally propose a set of metrics and a visualization technique. This allows one to spot where a correlation between cloning and change coupling exists and, as a result, which files should be restructured to ease further evolution.",
    isbn="978-3-540-33094-3"
    }

  • S. Jarzabek and S. Li, “Unifying clones with a generative programming technique: a case study: practice articles,” J. softw. maint. evol., vol. 18, iss. 4, p. 267–292, 2006.
    [BibTeX] [PDF]
    @article{10.5555/1148461.1148463,
    author = {Jarzabek, Stan and Li, Shubiao},
    title = {Unifying Clones with a Generative Programming Technique: A Case Study: Practice Articles},
    year = {2006},
    issue_date = {July 2006},
    publisher = {John Wiley \& Sons, Inc.},
    address = {USA},
    volume = {18},
    url = {https://dl.acm.org/doi/10.5555/1148461.1148463},
    number = {4},
    issn = {1532-060X},
    journal = {J. Softw. Maint. Evol.},
    month = jul,
    pages = {267–292},
    numpages = {26},
    keywords = {object-oriented methods, reusability, generative programming, maintainability, class libraries}
    }

  • Z. M. Jiang, A. E. Hassan, and R. C. Holt, “Visualizing clone cohesion and coupling,” in Proceedings of the xiii asia pacific software engineering conference, USA, 2006, p. 467–476. doi:10.1109/APSEC.2006.63
    [BibTeX] [PDF]
    @inproceedings{10.1109/APSEC.2006.63,
    author = {Jiang, Zhen Ming and Hassan, Ahmed E. and Holt, Richard C.},
    title = {Visualizing Clone Cohesion and Coupling},
    year = {2006},
    isbn = {0769526853},
    publisher = {IEEE Computer Society},
    address = {USA},
    url = {https://doi.org/10.1109/APSEC.2006.63},
    doi = {10.1109/APSEC.2006.63},
    booktitle = {Proceedings of the XIII Asia Pacific Software Engineering Conference},
    pages = {467–476},
    numpages = {10},
    series = {APSEC ’06}
    }

  • C. J. Kapser and M. W. Godfrey, “Supporting the analysis of clones in software systems: research articles,” J. softw. maint. evol., vol. 18, iss. 2, p. 61–82, 2006.
    [BibTeX] [PDF]
    @article{10.5555/1133105.1133106,
    author = {Kapser, Cory J. and Godfrey, Michael W.},
    title = {Supporting the Analysis of Clones in Software Systems: Research Articles},
    year = {2006},
    issue_date = {March 2006},
    publisher = {John Wiley \& Sons, Inc.},
    address = {USA},
    volume = {18},
    number = {2},
    url = {https://dl.acm.org/doi/10.5555/1133105.1133106},
    issn = {1532-060X},
    journal = {J. Softw. Maint. Evol.},
    month = mar,
    pages = {61–82},
    numpages = {22},
    keywords = {visualization, code clone, duplication, clone detection, software architecture, maintenance}
    }

  • M. Kim and D. Notkin, “Program element matching for multi-version program analyses,” in Proceedings of the 2006 international workshop on mining software repositories, New York, NY, USA, 2006, p. 58–64. doi:10.1145/1137983.1137999
    [BibTeX] [PDF]
    @inproceedings{10.1145/1137983.1137999,
    author = {Kim, Miryung and Notkin, David},
    title = {Program Element Matching for Multi-Version Program Analyses},
    year = {2006},
    isbn = {1595933972},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1137983.1137999},
    doi = {10.1145/1137983.1137999},
    booktitle = {Proceedings of the 2006 International Workshop on Mining Software Repositories},
    pages = {58–64},
    numpages = {7},
    keywords = {matching, software evolution, multi-version analysis},
    location = {Shanghai, China},
    series = {MSR ’06}
    }

  • C. Liu, F. Chen, J. Han, and P. Yu, “Gplag: detection of software plagiarism by program dependence graph analysis,” in Proceedings of the acm sigkdd international conference on knowledge discovery and data mining, 2006, pp. 872-881. doi:10.1145/1150402.1150522
    [BibTeX] [PDF]
    @inproceedings{inproceedings,
    author = {Liu, Chao and Chen, Fen and Han, Jiawei and Yu, Philip},
    year = {2006},
    month = {01},
    pages = {872-881},
    title = {GPLAG: Detection of software plagiarism by program dependence graph analysis},
    volume = {2006},
    url = {https://www.researchgate.net/publication/221653862_GPLAG_Detection_of_software_plagiarism_by_program_dependence_graph_analysis},
    booktitle = {Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
    doi = {10.1145/1150402.1150522}
    }

  • Z. Mann, “Three public enemies: cut, copy, and paste,” Computer, vol. 39, iss. 07, pp. 31-35, 2006. doi:10.1109/MC.2006.246
    [BibTeX] [PDF]
    @ARTICLE{10.1109/MC.2006.246,
    author = {Z. Mann},
    journal = {Computer},
    title = {Three Public Enemies: Cut, Copy, and Paste},
    year = {2006},
    volume = {39},
    url = {https://www.computer.org/csdl/magazine/co/2006/07/r7031/13rRUxly90I},
    number = {07},
    issn = {1558-0814},
    pages = {31-35},
    keywords = {software development},
    doi = {10.1109/MC.2006.246},
    publisher = {IEEE Computer Society},
    address = {Los Alamitos, CA, USA},
    month = {jul}
    }

  • E. Merlo, “Detection of plagiarism in university projects using metrics-based spectral similarity,” Proceedings of dagstuhl seminar 06301: duplication, redundancy, and similarity in software, 2006.
    [BibTeX] [PDF]
    @article{article,
    author = {Merlo, Ettore},
    year = {2006},
    month = {01},
    pages = {},
    url = {https://www.researchgate.net/publication/30815580_Detection_of_Plagiarism_in_University_Projects_Using_Metrics-based_Spectral_Similarity},
    journal = {Proceedings of Dagstuhl Seminar 06301: Duplication, Redundancy, and Similarity in Software},
    title = {Detection of Plagiarism in University Projects Using Metrics-based Spectral Similarity}
    }

  • A. Raza, G. Vogel, and E. Plödereder, “Bauhaus – a tool suite for program analysis and reverse engineering,” in Reliable software technologies – ada-europe 2006, Berlin, Heidelberg, 2006, pp. 71-82.
    [BibTeX] [Abstract] [PDF]

    The maintenance and evolution of critical software with high requirements for reliability is an extremely demanding, time consuming and expensive task. Errors introduced by ad-hoc changes might have disastrous effects on the system and must be prevented under all circumstances, which requires the understanding of the details of source code and system design. This paper describes Bauhaus, a comprehensive tool suite that supports program understanding and reverse engineering on all layers of abstraction, from source code to architecture.

    @InProceedings{10.1007/11767077_6,
    author="Raza, Aoun
    and Vogel, Gunther
    and Pl{\"o}dereder, Erhard",
    editor="Pinho, Lu{\'i}s Miguel
    and Gonz{\'a}lez Harbour, Michael",
    title="Bauhaus -- A Tool Suite for Program Analysis and Reverse Engineering",
    booktitle="Reliable Software Technologies -- Ada-Europe 2006",
    year="2006",
    publisher="Springer Berlin Heidelberg",
    address="Berlin, Heidelberg",
    pages="71-82",
    url = {https://link.springer.com/chapter/10.1007/11767077_6},
    abstract="The maintenance and evolution of critical software with high requirements for reliability is an extremely demanding, time consuming and expensive task. Errors introduced by ad-hoc changes might have disastrous effects on the system and must be prevented under all circumstances, which requires the understanding of the details of source code and system design. This paper describes Bauhaus, a comprehensive tool suite that supports program understanding and reverse engineering on all layers of abstraction, from source code to architecture.",
    isbn="978-3-540-34664-7"
    }

  • T. Sager, A. Bernstein, M. Pinzger, and C. Kiefer, “Detecting similar java classes using tree algorithms,” in Proceedings of the 2006 international workshop on mining software repositories, New York, NY, USA, 2006, p. 65–71. doi:10.1145/1137983.1138000
    [BibTeX] [PDF]
    @inproceedings{10.1145/1137983.1138000,
    author = {Sager, Tobias and Bernstein, Abraham and Pinzger, Martin and Kiefer, Christoph},
    title = {Detecting Similar Java Classes Using Tree Algorithms},
    year = {2006},
    isbn = {1595933972},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1137983.1138000},
    doi = {10.1145/1137983.1138000},
    booktitle = {Proceedings of the 2006 International Workshop on Mining Software Repositories},
    pages = {65–71},
    numpages = {7},
    keywords = {software evolution, change analysis, tree similarity measures, software repositories},
    location = {Shanghai, China},
    series = {MSR ’06}
    }

  • R. Tairas and J. Gray, “Phoenix-based clone detection using suffix trees,” in Proceedings of the 44th annual southeast regional conference, New York, NY, USA, 2006, p. 679–684. doi:10.1145/1185448.1185597
    [BibTeX] [PDF]
    @inproceedings{10.1145/1185448.1185597,
    author = {Tairas, Robert and Gray, Jeff},
    title = {Phoenix-Based Clone Detection Using Suffix Trees},
    year = {2006},
    isbn = {1595933158},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1185448.1185597},
    doi = {10.1145/1185448.1185597},
    booktitle = {Proceedings of the 44th Annual Southeast Regional Conference},
    pages = {679–684},
    numpages = {6},
    keywords = {software analysis, code clones, clone detection, suffix trees},
    location = {Melbourne, Florida},
    series = {ACM-SE 44}
    }

2005

  • H. A. Basit and S. Jarzabek, “Detecting higher-level similarity patterns in programs,” in Proceedings of the 10th european software engineering conference held jointly with 13th acm sigsoft international symposium on foundations of software engineering, New York, NY, USA, 2005, p. 156–165. doi:10.1145/1081706.1081733
    [BibTeX] [PDF]
    @inproceedings{10.1145/1081706.1081733,
    author = {Basit, Hamid Abdul and Jarzabek, Stan},
    title = {Detecting Higher-Level Similarity Patterns in Programs},
    year = {2005},
    isbn = {1595930140},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1081706.1081733},
    doi = {10.1145/1081706.1081733},
    booktitle = {Proceedings of the 10th European Software Engineering Conference Held Jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering},
    pages = {156–165},
    numpages = {10},
    keywords = {software clones, similarity patterns, clone detection},
    location = {Lisbon, Portugal},
    series = {ESEC/FSE-13}
    }

  • M. Kim, V. Sazawal, D. Notkin, and G. Murphy, “An empirical study of code clone genealogies,” in Proceedings of the 10th european software engineering conference held jointly with 13th acm sigsoft international symposium on foundations of software engineering, New York, NY, USA, 2005, p. 187–196. doi:10.1145/1081706.1081737
    [BibTeX] [PDF]
    @inproceedings{10.1145/1081706.1081737,
    author = {Kim, Miryung and Sazawal, Vibha and Notkin, David and Murphy, Gail},
    title = {An Empirical Study of Code Clone Genealogies},
    year = {2005},
    isbn = {1595930140},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1081706.1081737},
    doi = {10.1145/1081706.1081737},
    booktitle = {Proceedings of the 10th European Software Engineering Conference Held Jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering},
    pages = {187–196},
    numpages = {10},
    keywords = {software maintenance, empirical study, software evolution, code clone, refactoring},
    location = {Lisbon, Portugal},
    series = {ESEC/FSE-13}
    }

  • R. Al-Ekram, C. Kapser, R. Holt, and M. Godfrey, “Cloning by accident: an empirical study of source code cloning across software systems,” in 2005 international symposium on empirical software engineering, 2005., 2005, p. 10.
    [BibTeX] [PDF]
    @INPROCEEDINGS{1541846,
    author={R. {Al-Ekram} and C. {Kapser} and R. {Holt} and M. {Godfrey}},
    booktitle={2005 International Symposium on Empirical Software Engineering, 2005.},
    title={Cloning by accident: an empirical study of source code cloning across software systems},
    year={2005},
    url = {https://ieeexplore.ieee.org/document/1541846},
    volume={},
    number={},
    pages={10},}

  • H. A. Basit, D. C. Rajapakse, and S. Jarzabek, “Beyond templates: a study of clones in the stl and some general implications,” in Proceedings. 27th international conference on software engineering, 2005. icse 2005., 2005, pp. 451-459.
    [BibTeX] [PDF]
    @INPROCEEDINGS{1553588,
    author={H. A. {Basit} and D. C. {Rajapakse} and S. {Jarzabek}},
    booktitle={Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005.},
    title={Beyond templates: a study of clones in the STL and some general implications},
    year={2005},
    url = {https://ieeexplore.ieee.org/document/1553588},
    volume={},
    number={},
    pages={451-459},}

  • M. Bruntink, A. van Deursen, R. van Engelen, and T. Tourwe, “On the use of clone detection for identifying crosscutting concern code,” Ieee transactions on software engineering, vol. 31, iss. 10, pp. 804-818, 2005.
    [BibTeX] [PDF]
    @ARTICLE{1542064,
    author={M. {Bruntink} and A. {van Deursen} and R. {van Engelen} and T. {Tourwe}},
    journal={IEEE Transactions on Software Engineering},
    title={On the use of clone detection for identifying crosscutting concern code},
    year={2005},
    url = {https://ieeexplore.ieee.org/document/1542064},
    volume={31},
    number={10},
    pages={804-818},}

  • H. Basit, D. Rajapakse, and S. Jarzabek, “An empirical study on limits of clone unification using generics,” in Proceedings of the 17th international conference on software engineering and knowledge engineering (seke), 2005, pp. 109-114.
    [BibTeX] [PDF]
    @inproceedings{inproceedings,
    author = {Basit, Hamid and Rajapakse, Damith and Jarzabek, Stan},
    year = {2005},
    url = {https://www.researchgate.net/publication/221391160_An_Empirical_Study_on_Limits_of_Clone_Unification_Using_Generics},
    month = {01},
    booktitle = {Proceedings of the 17th International Conference on Software Engineering and Knowledge Engineering (SEKE)},
    pages = {109-114},
    title = {An Empirical Study on Limits of Clone Unification Using Generics}
    }

  • D. C. Rajapakse and S. Jarzabek, “An investigation of cloning in web applications,” in Proceedings of the 5th international conference on web engineering, Berlin, Heidelberg, 2005, p. 252–262. doi:10.1007/11531371_35
    [BibTeX] [PDF]
    @inproceedings{10.1007/11531371_35,
    author = {Rajapakse, Damith C. and Jarzabek, Stan},
    title = {An Investigation of Cloning in Web Applications},
    year = {2005},
    isbn = {3540279962},
    publisher = {Springer-Verlag},
    address = {Berlin, Heidelberg},
    url = {https://doi.org/10.1007/11531371_35},
    doi = {10.1007/11531371_35},
    booktitle = {Proceedings of the 5th International Conference on Web Engineering},
    pages = {252–262},
    numpages = {11},
    location = {Sydney, Australia},
    series = {ICWE’05}
    }

  • A. De Lucia, R. Francese, G. Scanniello, and G. Tortora, “Understanding cloned patterns in web applications,” in 13th international workshop on program comprehension (iwpc’05), 2005, pp. 333-336.
    [BibTeX] [PDF]
    @INPROCEEDINGS{1421049,
    author={A. {De Lucia} and R. {Francese} and G. {Scanniello} and G. {Tortora}},
    booktitle={13th International Workshop on Program Comprehension (IWPC'05)},
    title={Understanding cloned patterns in Web applications},
    year={2005},
    url = {https://ieeexplore.ieee.org/document/1421049},
    volume={},
    number={},
    pages={333-336},}

  • M. W. Godfrey and L. Zou, “Using origin analysis to detect merging and splitting of source code entities,” Ieee transactions on software engineering, vol. 31, iss. 2, pp. 166-181, 2005.
    [BibTeX] [PDF]
    @ARTICLE{1401931,
    author={M. W. {Godfrey} and L. {Zou}},
    journal={IEEE Transactions on Software Engineering},
    title={Using origin analysis to detect merging and splitting of source code entities},
    year={2005},
    url = {https://ieeexplore.ieee.org/document/1401931},
    volume={31},
    number={2},
    pages={166-181},}

  • C. Kapser and M. W. Godfrey, “Improved tool support for the investigation of duplication in software,” in Proceedings of the 21st ieee international conference on software maintenance, USA, 2005, p. 305–314. doi:10.1109/ICSM.2005.52
    [BibTeX] [PDF]
    @inproceedings{10.1109/ICSM.2005.52,
    author = {Kapser, Cory and Godfrey, Michael W.},
    title = {Improved Tool Support for the Investigation of Duplication in Software},
    year = {2005},
    isbn = {0769523684},
    publisher = {IEEE Computer Society},
    address = {USA},
    url = {https://doi.org/10.1109/ICSM.2005.52},
    doi = {10.1109/ICSM.2005.52},
    booktitle = {Proceedings of the 21st IEEE International Conference on Software Maintenance},
    pages = {305–314},
    numpages = {10},
    series = {ICSM ’05}
    }

  • M. Kim and D. Notkin, “Using a clone genealogy extractor for understanding and supporting evolution of code clones,” in Proceedings of the 2005 international workshop on mining software repositories, New York, NY, USA, 2005, p. 1–5. doi:10.1145/1083142.1083146
    [BibTeX] [PDF]
    @inproceedings{10.1145/1083142.1083146,
    author = {Kim, Miryung and Notkin, David},
    title = {Using a Clone Genealogy Extractor for Understanding and Supporting Evolution of Code Clones},
    year = {2005},
    isbn = {1595931236},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1083142.1083146},
    doi = {10.1145/1083142.1083146},
    booktitle = {Proceedings of the 2005 International Workshop on Mining Software Repositories},
    pages = {1–5},
    numpages = {5},
    location = {St. Louis, Missouri},
    series = {MSR ’05}
    }

  • S. Lee and I. Jeong, “Sdd: high performance code clone detection system for large scale source code,” in Proceedings of the object oriented programming systems languages and applications companion to the 20th annual acm sigplan conference on object-oriented programming, systems, languages, and applications (oopsla), 2005, pp. 140-141.
    [BibTeX] [PDF]
    @inproceedings{Lee2005SDDHP,
    title={SDD: high performance code clone detection system for large scale source code},
    author={S. Lee and Iryoung Jeong},
    url = {https://www.semanticscholar.org/paper/SDD%3A-high-performance-code-clone-detection-system-Lee-Jeong/d1828876ce5cf3360228b91da5a19c4c76ceb56d},
    booktitle={Proceedings of the Object Oriented Programming Systems Languages and Applications Companion to the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications (OOPSLA)},
    year={2005},
    pages ={140-141},
    }

  • Z. Li and Y. Zhou, “Pr-miner: automatically extracting implicit programming rules and detecting violations in large software code,” in Proceedings of the 10th european software engineering conference held jointly with 13th acm sigsoft international symposium on foundations of software engineering, New York, NY, USA, 2005, p. 306–315. doi:10.1145/1081706.1081755
    [BibTeX] [PDF]
    @inproceedings{10.1145/1081706.1081755,
    author = {Li, Zhenmin and Zhou, Yuanyuan},
    title = {PR-Miner: Automatically Extracting Implicit Programming Rules and Detecting Violations in Large Software Code},
    year = {2005},
    isbn = {1595930140},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1081706.1081755},
    doi = {10.1145/1081706.1081755},
    booktitle = {Proceedings of the 10th European Software Engineering Conference Held Jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering},
    pages = {306–315},
    numpages = {10},
    keywords = {automated violation detection, automated specification generation, pattern recognition, data mining for software engineering, static analysis, programming rules},
    location = {Lisbon, Portugal},
    series = {ESEC/FSE-13}
    }

  • A. Sutton, H. Kagdi, J. I. Maletic, and G. L. Volkert, “Hybridizing evolutionary algorithms and clustering algorithms to find source-code clones,” in Proceedings of the 7th annual conference on genetic and evolutionary computation, New York, NY, USA, 2005, p. 1079–1080. doi:10.1145/1068009.1068191
    [BibTeX] [PDF]
    @inproceedings{10.1145/1068009.1068191,
    author = {Sutton, Andrew and Kagdi, Huzefa and Maletic, Jonathan I. and Volkert, L. Gwenn},
    title = {Hybridizing Evolutionary Algorithms and Clustering Algorithms to Find Source-Code Clones},
    year = {2005},
    isbn = {1595930108},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/1068009.1068191},
    doi = {10.1145/1068009.1068191},
    booktitle = {Proceedings of the 7th Annual Conference on Genetic and Evolutionary Computation},
    pages = {1079–1080},
    numpages = {2},
    keywords = {clone detection, evolutionary algorithms, software engineering},
    location = {Washington DC, USA},
    series = {GECCO ’05}
    }

  • R. Wettel and R. Marinescu, “Archeology of code duplication: recovering duplication chains from small duplication fragments,” in 7th international symposium on symbolic and numeric algorithms for scientific computing (synasc’05), 2005, p. 8.
    [BibTeX] [PDF]
    @INPROCEEDINGS{1595830,
    author={R. {Wettel} and R. {Marinescu}},
    booktitle={7th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC'05)},
    title={Archeology of code duplication: recovering duplication chains from small duplication fragments},
    year={2005},
    url = {https://ieeexplore.ieee.org/document/1595830},
    volume={},
    number={},
    pages={8},}

  • N. Yoshida, Y. Higo, T. Kamiya, S. Kusumoto, and K. Inoue, “On refactoring support based on code clone dependency relation,” in 11th ieee international software metrics symposium (metrics), 2005, pp. 10-16.
    [BibTeX] [PDF]
    @INPROCEEDINGS{1509294,
    author={N. {Yoshida} and Y. {Higo} and T. {Kamiya} and S. {Kusumoto} and K. {Inoue}},
    booktitle={11th IEEE International Software Metrics Symposium (METRICS)},
    title={On refactoring support based on code clone dependency relation},
    year={2005},
    url = {https://ieeexplore.ieee.org/document/1509294},
    volume={},
    number={},
    pages={10-16},}

2004

  • J. R. Cordy, T. R. Dean, and N. Synytskyy, “Practical language-independent detection of near-miss clones,” in Proceedings of the 2004 conference of the centre for advanced studies on collaborative research, 2004, p. 1–12.
    [BibTeX]
    @inproceedings{10.5555/1034914.1034915,
    author = {Cordy, James R. and Dean, Thomas R. and Synytskyy, Nikita},
    title = {Practical Language-Independent Detection of near-Miss Clones},
    year = {2004},
    publisher = {IBM Press},
    booktitle = {Proceedings of the 2004 Conference of the Centre for Advanced Studies on Collaborative Research},
    pages = {1–12},
    numpages = {12},
    location = {Markham, Ontario, Canada},
    series = {CASCON ’04}
    }

  • E. Merlo, G. Antoniol, M. Di Penta, and V. F. Rollo, “Linear complexity object-oriented similarity for clone detection and software evolution analyses,” in 20th ieee international conference on software maintenance, 2004. proceedings., 2004, pp. 412-416.
    [BibTeX] [PDF]
    @INPROCEEDINGS{1357826,
    author={E. {Merlo} and G. {Antoniol} and M. {Di Penta} and V. F. {Rollo}},
    booktitle={20th IEEE International Conference on Software Maintenance, 2004. Proceedings.},
    title={Linear complexity object-oriented similarity for clone detection and software evolution analyses},
    year={2004},
    url ={https://ieeexplore.ieee.org/document/1357826},
    volume={},
    number={},
    pages={412-416},}

  • M. Rieger, S. Ducasse, and M. Lanza, “Insights into system-wide code duplication,” in 11th working conference on reverse engineering, 2004, pp. 100-109.
    [BibTeX] [PDF]
    @INPROCEEDINGS{1374310,
    author={M. {Rieger} and S. {Ducasse} and M. {Lanza}},
    booktitle={11th Working Conference on Reverse Engineering},
    title={Insights into system-wide code duplication},
    year={2004},
    url = {https://ieeexplore.ieee.org/document/1374310},
    volume={},
    number={},
    pages={100-109},}

  • C. Kapser and M. W. Godfrey, “Aiding comprehension of cloning through categorization,” in Proceedings. 7th international workshop on principles of software evolution, 2004., 2004, pp. 85-94.
    [BibTeX] [PDF]
    @INPROCEEDINGS{1334772,
    author={C. {Kapser} and M. W. {Godfrey}},
    booktitle={Proceedings. 7th International Workshop on Principles of Software Evolution, 2004.},
    title={Aiding comprehension of cloning through categorization},
    year={2004},
    url = {https://ieeexplore.ieee.org/document/1334772},
    volume={},
    number={},
    pages={85-94},}

  • M. Kim, L. Bergman, T. Lau, and D. Notkin, “An ethnographic study of copy and paste programming practices in oopl,” in Proceedings. 2004 international symposium on empirical software engineering, 2004. isese ’04., 2004, pp. 83-92.
    [BibTeX] [PDF]
    @INPROCEEDINGS{1334896,
    author={M. {Kim} and L. {Bergman} and T. {Lau} and D. {Notkin}},
    booktitle={Proceedings. 2004 International Symposium on Empirical Software Engineering, 2004. ISESE '04.},
    title={An ethnographic study of copy and paste programming practices in OOPL},
    year={2004},
    url = {https://ieeexplore.ieee.org/document/1334896},
    volume={},
    number={},
    pages={83-92},}

  • I. Baxter, C. Pidgeon, and M. Mehlich, “Dms®: program transformations for practical scalable software evolution,” in Proceedings of international conference on software engineering, 2004, pp. 625-634. doi:10.1109/ICSE.2004.1317484
    [BibTeX] [PDF]
    @inproceedings{inproceedings,
    author = {Baxter, Ira and Pidgeon, Christopher and Mehlich, Michael},
    year = {2004},
    month = {01},
    pages = {625-634},
    title = {DMS®: Program Transformations for Practical Scalable Software Evolution},
    volume = {26},
    url = {https://www.researchgate.net/publication/221553743_DMSR_Program_Transformations_for_Practical_Scalable_Software_Evolution},
    booktitle = {Proceedings of International Conference on Software Engineering},
    doi = {10.1109/ICSE.2004.1317484}
    }

  • B. Belkhouche, A. Nix, and J. Hassell, “Plagiarism detection in software designs,” in Proceedings of the 42nd annual southeast regional conference, New York, NY, USA, 2004, p. 207–211. doi:10.1145/986537.986585
    [BibTeX] [PDF]
    @inproceedings{10.1145/986537.986585,
    author = {Belkhouche, B. and Nix, Anastasia and Hassell, Johnette},
    title = {Plagiarism Detection in Software Designs},
    year = {2004},
    isbn = {1581138709},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/986537.986585},
    doi = {10.1145/986537.986585},
    booktitle = {Proceedings of the 42nd Annual Southeast Regional Conference},
    pages = {207–211},
    numpages = {5},
    location = {Huntsville, Alabama},
    series = {ACM-SE 42}
    }

  • F. Calefato, F. Lanubile, and T. Mallardo, “Function clone detection in web applications: a semiautomated approach,” J. web eng., vol. 3, iss. 1, p. 3–21, 2004.
    [BibTeX] [PDF]
    @article{10.5555/2011138.2011140,
    author = {Calefato, Fabio and Lanubile, Filippo and Mallardo, Teresa},
    title = {Function Clone Detection in Web Applications: A Semiautomated Approach},
    year = {2004},
    issue_date = {May 2004},
    publisher = {Rinton Press, Incorporated},
    address = {Paramus, NJ},
    volume = {3},
    url = {https://dl.acm.org/doi/10.5555/2011138.2011140},
    number = {1},
    issn = {1540-9589},
    journal = {J. Web Eng.},
    month = may,
    pages = {3–21},
    numpages = {19},
    keywords = {function clones, clone detection, refactoring, web applications, code duplication}
    }

  • M. Datar, N. Immorlica, P. Indyk, and V. S. Mirrokni, “Locality-sensitive hashing scheme based on p-stable distributions,” in Proceedings of the twentieth annual symposium on computational geometry, New York, NY, USA, 2004, p. 253–262. doi:10.1145/997817.997857
    [BibTeX] [PDF]
    @inproceedings{10.1145/997817.997857,
    author = {Datar, Mayur and Immorlica, Nicole and Indyk, Piotr and Mirrokni, Vahab S.},
    title = {Locality-Sensitive Hashing Scheme Based on p-Stable Distributions},
    year = {2004},
    isbn = {1581138857},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/997817.997857},
    doi = {10.1145/997817.997857},
    booktitle = {Proceedings of the Twentieth Annual Symposium on Computational Geometry},
    pages = {253–262},
    numpages = {10},
    keywords = {approximate nearest neighbor, p-stable distributions, sublinear algorithm, locally sensitive hashing},
    location = {Brooklyn, New York, USA},
    series = {SCG ’04}
    }

  • “Reengineering web applications based on cloned pattern analysis,” in Proceedings of the 12th ieee international workshop on program comprehension, USA, 2004, p. 132.
    [BibTeX] [PDF]
    @inproceedings{10.5555/998682.1006829,
    title = {Reengineering Web Applications Based on Cloned Pattern Analysis},
    year = {2004},
    isbn = {0769521495},
    url = {https://dl.acm.org/doi/10.5555/998682.1006829},
    publisher = {IEEE Computer Society},
    address = {USA},
    booktitle = {Proceedings of the 12th IEEE International Workshop on Program Comprehension},
    pages = {132},
    numpages = {1},
    series = {IWPC ’04}
    }

  • G. Di Lucca, A. Fasolino, P. Tramontana, and U. Carlini, “Identifying reusable components in web applications,” in Proceedings of the iasted international conference on software engineering, 2004, pp. 526-531.
    [BibTeX] [PDF]
    @inproceedings{inproceedings,
    author = {Di Lucca, Giuseppe and Fasolino, Anna and Tramontana, Porfirio and Carlini, Ugo},
    year = {2004},
    month = {01},
    pages = {526-531},
    booktitle = {Proceedings of the IASTED International Conference on Software Engineering},
    url={https://www.researchgate.net/publication/220901303_Identifying_reusable_components_in_web_applications},
    title = {Identifying reusable components in web applications}
    }

  • Y. Higo, T. Kamiya, S. Kusumoto, and K. Inoue, “Aries: refactoring support environment based on code clone analysis,” in Proceedings of the iasted conference on software engineering and applications, 2004, pp. 222-229.
    [BibTeX] [PDF]
    @inproceedings{inproceedings,
    author = {Higo, Yoshiki and Kamiya, Toshihiro and Kusumoto, Shinji and Inoue, Katsuro},
    year = {2004},
    month = {01},
    pages = {222-229},
    title = {ARIES: Refactoring support environment based on code clone analysis},
    url = {https://www.researchgate.net/publication/220845849_ARIES_Refactoring_support_environment_based_on_code_clone_analysis},
    booktitle = {Proceedings of the IASTED Conference on Software Engineering and Applications},
    }

  • J. Krinke and S. Breu, “Control-flow-graph-based aspect mining,” in Proceedings of the 1st workshop on aspect reverse engineering (ware), 2004.
    [BibTeX] [PDF]
    @inproceedings{Krinke2004ControlFlowGraphBasedAM,
    title={Control-Flow-Graph-Based Aspect Mining},
    author={Jens Krinke and Silvia Breu},
    url = {https://www.semanticscholar.org/paper/Control-Flow-Graph-Based-Aspect-Mining-Krinke-Breu/62f141c2965e626776411b3dd5ed573860798810},
    booktitle = {Proceedings of the 1st Workshop on Aspect Reverse Engineering (WARE)},
    numpages = {5},
    year={2004}
    }

  • T. Lancaster and F. Culwin, “A comparison of source code plagiarism detection engines,” Computer science education, vol. 14, pp. 101-112, 2004. doi:10.1080/08993400412331363843
    [BibTeX] [PDF]
    @article{article,
    author = {Lancaster, Thomas and Culwin, Fintan},
    year = {2004},
    month = {06},
    pages = {101-112},
    title = {A Comparison of Source Code Plagiarism Detection Engines},
    volume = {14},
    url = {https://www.researchgate.net/publication/234126993_A_Comparison_of_Source_Code_Plagiarism_Detection_Engines},
    journal = {Computer Science Education},
    doi = {10.1080/08993400412331363843}
    }

  • G. Mishne and M. de Rijke, “Source code retrieval using conceptual similarity,” in Coupling approaches, coupling media and coupling languages for information retrieval, Paris, FRA, 2004, p. 539–554.
    [BibTeX] [PDF]
    @inproceedings{10.5555/2816272.2816322,
    author = {Mishne, Gilad and de Rijke, Maarten},
    title = {Source Code Retrieval Using Conceptual Similarity},
    year = {2004},
    isbn = {905450096},
    publisher = {LE CENTRE DE HAUTES ETUDES INTERNATIONALES D’INFORMATIQUE DOCUMENTAIRE},
    address = {Paris, FRA},
    booktitle = {Coupling Approaches, Coupling Media and Coupling Languages for Information Retrieval},
    pages = {539–554},
    url = {https://dl.acm.org/doi/10.5555/2816272.2816322},
    numpages = {16},
    location = {Vaucluse, France},
    series = {RIAO ’04}
    }

  • M. Toomim, A. Begel, and S. L. Graham, “Managing duplicated code with linked editing,” in Proceedings of the 2004 ieee symposium on visual languages – human centric computing, USA, 2004, p. 173–180. doi:10.1109/VLHCC.2004.35
    [BibTeX] [PDF]
    @inproceedings{10.1109/VLHCC.2004.35,
    author = {Toomim, Michael and Begel, Andrew and Graham, Susan L.},
    title = {Managing Duplicated Code with Linked Editing},
    year = {2004},
    isbn = {0780386965},
    publisher = {IEEE Computer Society},
    address = {USA},
    url = {https://doi.org/10.1109/VLHCC.2004.35},
    doi = {10.1109/VLHCC.2004.35},
    booktitle = {Proceedings of the 2004 IEEE Symposium on Visual Languages - Human Centric Computing},
    pages = {173–180},
    numpages = {8},
    series = {VLHCC ’04}
    }

  • V. Wahler, D. Seipel, J. W. v. Gudenberg, and G. Fischer, “Clone detection in source code by frequent itemset techniques,” in Proceedings of the source code analysis and manipulation, fourth ieee international workshop, USA, 2004, p. 128–135. doi:10.1109/SCAM.2004.5
    [BibTeX] [PDF]
    @inproceedings{10.1109/SCAM.2004.5,
    author = {Wahler, Vera and Seipel, Dietmar and Gudenberg, Jurgen Wolff v. and Fischer, Gregor},
    title = {Clone Detection in Source Code by Frequent Itemset Techniques},
    year = {2004},
    isbn = {0769521444},
    publisher = {IEEE Computer Society},
    address = {USA},
    url = {https://doi.org/10.1109/SCAM.2004.5},
    doi = {10.1109/SCAM.2004.5},
    booktitle = {Proceedings of the Source Code Analysis and Manipulation, Fourth IEEE International Workshop},
    pages = {128–135},
    numpages = {8},
    series = {SCAM ’04}
    }

  • A. Walenstein, A. Lakhotia, and R. Koschke, “The second international workshop on detection of software clones: workshop report,” Sigsoft softw. eng. notes, vol. 29, iss. 2, p. 1–5, 2004. doi:10.1145/979743.979752
    [BibTeX] [PDF]
    @article{10.1145/979743.979752,
    author = {Walenstein, Andrew and Lakhotia, Arun and Koschke, Rainer},
    title = {The Second International Workshop on Detection of Software Clones: Workshop Report},
    year = {2004},
    issue_date = {March 2004},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    volume = {29},
    number = {2},
    issn = {0163-5948},
    url = {https://doi.org/10.1145/979743.979752},
    doi = {10.1145/979743.979752},
    journal = {SIGSOFT Softw. Eng. Notes},
    month = mar,
    pages = {1–5},
    numpages = {5}
    }

2003

  • A. Walenstein, N. Jyoti, J. Li, Y. Yang, and A. Lakhotia, “Problems creating task-relevant clone detection reference data,” in Proceedings of the 10th working conference on reverse engineering, USA, 2003, p. 285.
    [BibTeX] [PDF]
    @inproceedings{10.5555/950792.951349,
    author = {Walenstein, Andrew and Jyoti, Nitin and Li, Junwei and Yang, Yun and Lakhotia, Arun},
    title = {Problems Creating Task-Relevant Clone Detection Reference Data},
    year = {2003},
    isbn = {0769520278},
    publisher = {IEEE Computer Society},
    address = {USA},
    url = {https://dl.acm.org/doi/10.5555/950792.951349},
    booktitle = {Proceedings of the 10th Working Conference on Reverse Engineering},
    pages = {285},
    numpages = {1},
    series = {WCRE ’03}
    }

  • F. Van Rysselberghe and S. Demeyer, “Reconstruction of successful software evolution using clone detection,” in Sixth international workshop on principles of software evolution, 2003. proceedings., 2003, pp. 126-130.
    [BibTeX] [PDF]
    @INPROCEEDINGS{1231219,
    author={F. {Van Rysselberghe} and S. {Demeyer}},
    booktitle={Sixth International Workshop on Principles of Software Evolution, 2003. Proceedings.},
    title={Reconstruction of successful software evolution using clone detection},
    year={2003},
    url = {https://ieeexplore.ieee.org/document/1231219},
    volume={},
    number={},
    pages={126-130},}

  • W. Chen, B. Li, and R. Gupta, “Code compaction of matching single-entry multiple-exit regions,” in Proceedings of the 10th international conference on static analysis, Berlin, Heidelberg, 2003, p. 401–417.
    [BibTeX] [PDF]
    @inproceedings{10.5555/1760267.1760299,
    author = {Chen, Wen-Ke and Li, Bengu and Gupta, Rajiv},
    title = {Code Compaction of Matching Single-Entry Multiple-Exit Regions},
    year = {2003},
    isbn = {3540403256},
    url = {https://dl.acm.org/doi/10.5555/1760267.1760299},
    publisher = {Springer-Verlag},
    address = {Berlin, Heidelberg},
    booktitle = {Proceedings of the 10th International Conference on Static Analysis},
    pages = {401–417},
    numpages = {17},
    keywords = {code compaction, single-entry-multiple-exit regions, control flow signature, predicated execution},
    location = {San Diego, CA, USA},
    series = {SAS’03}
    }

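  • B. De Sutter, S. Bruno, B. Bus, and K. De Bosschere, “Sifting out the mud: low level c++ code reuse,” in Proceedings of the 17th acm sigplan conference on object-oriented programming, systems, languages, and applications (oopsla), 2003, pp. 275-291.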
    [BibTeX] [PDF]
    @inproceedings{article,
    author = {De Sutter, Bjorn and Bruno, Sutter and Bus, Bruno and De Bosschere, Koen},
    year = {2003},
    booktitle = {Proceedings of the 17th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA)},
    url = {https://www.researchgate.net/publication/2897841_Sifting_out_the_Mud_Low_Level_C_Code_Reuse},
    month = {11},
    pages = {275-291},
    title = {Sifting out the Mud: Low Level C++ Code Reuse}
    }

  • K. Gallagher and L. Layman, “Are decomposition slices clones?,” in Proceedings of the 11th ieee international workshop on program comprehension, USA, 2003, p. 251.
    [BibTeX] [PDF]
    @inproceedings{10.5555/851042.857059,
    author = {Gallagher, Keith and Layman, Lucas},
    title = {Are Decomposition Slices Clones?},
    year = {2003},
    isbn = {0769518834},
    url = {https://dl.acm.org/doi/10.5555/851042.857059},
    publisher = {IEEE Computer Society},
    address = {USA},
    booktitle = {Proceedings of the 11th IEEE International Workshop on Program Comprehension},
    pages = {251},
    numpages = {1},
    keywords = {Clone Detection, Decomposition Slicing, Software Maintenance, Program Slicing, Software Comprehension},
    series = {IWPC ’03}
    }

  • P. Grubb and A. A. Takang, Software maintenance, 2nd ed., World Scientific, 2003. doi:10.1142/5318
    [BibTeX] [PDF]
    @book{doi:10.1142/5318,
    author = {Grubb, Penny and Takang, Armstrong A},
    title = {Software Maintenance},
    publisher = {World Scientific},
    year = {2003},
    doi = {10.1142/5318},
    address = {},
    edition = {2nd},
    url = {https://www.worldscientific.com/doi/abs/10.1142/5318},
    eprint = {https://www.worldscientific.com/doi/pdf/10.1142/5318}
    }

  • C. Kapser and M. Godfrey, “A taxonomy of clones in source code: the re-engineers most wanted list,” Proceedings of the 2nd international workshop on detection of software clones (iwdsc), 2003.
    [BibTeX] [PDF]
    @article{article,
    author = {Kapser, Cory and Godfrey, Michael},
    year = {2003},
    month = {01},
    pages = {},
    url = {https://www.researchgate.net/publication/244275094_A_Taxonomy_of_Clones_in_Source_Code_The_Re-Engineers_Most_Wanted_List},
    journal = {Proceedings of the 2nd International Workshop on Detection of Software Clones (IWDSC)},
    title = {A Taxonomy of Clones in Source Code: The Re-Engineers Most Wanted List}
    }

  • R. Komondoor and S. Horwitz, “Effective, automatic procedure extraction,” in 11th ieee international workshop on program comprehension, 2003., 2003, pp. 33-42.
    [BibTeX] [PDF]
    @INPROCEEDINGS{1199187,
    author={R. {Komondoor} and S. {Horwitz}},
    booktitle={11th IEEE International Workshop on Program Comprehension, 2003.},
    title={Effective, automatic procedure extraction},
    year={2003},
    url = {https://ieeexplore.ieee.org/document/1199187},
    volume={},
    number={},
    pages={33-42},}

  • A. Lakhotia, J. Li, A. Walenstein, and Y. Yang, “Towards a clone detection benchmark suite and results archive,” in Proceedings of the 11th ieee international workshop on program comprehension, USA, 2003, p. 285.
    [BibTeX] [PDF]
    @inproceedings{10.5555/851042.857044,
    author = {Lakhotia, Arun and Li, Junwei and Walenstein, Andrew and Yang, Yun},
    title = {Towards a Clone Detection Benchmark Suite and Results Archive},
    year = {2003},
    isbn = {0769518834},
    publisher = {IEEE Computer Society},
    address = {USA},
    booktitle = {Proceedings of the 11th IEEE International Workshop on Program Comprehension},
    pages = {285},
    numpages = {1},
    url ={https://dl.acm.org/doi/10.5555/851042.857044},
    series = {IWPC ’03}
    }

  • F. Lanubile and T. Mallardo, “Finding function clones in web applications,” in Proceedings of the seventh european conference on software maintenance and reengineering, USA, 2003, p. 379.
    [BibTeX] [PDF]
    @inproceedings{10.5555/872754.873583,
    author = {Lanubile, Filippo and Mallardo, Teresa},
    title = {Finding Function Clones in Web Applications},
    year = {2003},
    isbn = {0769519024},
    publisher = {IEEE Computer Society},
    address = {USA},
    url = {https://dl.acm.org/doi/10.5555/872754.873583},
    booktitle = {Proceedings of the Seventh European Conference on Software Maintenance and Reengineering},
    pages = {379},
    numpages = {1},
    series = {CSMR ’03}
    }

  • M. Lanza and S. Ducasse, “Polymetric views-a lightweight visual approach to reverse engineering,” Ieee trans. softw. eng., vol. 29, iss. 9, p. 782–795, 2003. doi:10.1109/TSE.2003.1232284
    [BibTeX] [PDF]
    @article{10.1109/TSE.2003.1232284,
    author = {Lanza, Michele and Ducasse, St\'{e}phane},
    title = {Polymetric Views-A Lightweight Visual Approach to Reverse Engineering},
    year = {2003},
    issue_date = {September 2003},
    publisher = {IEEE Press},
    volume = {29},
    number = {9},
    issn = {0098-5589},
    url = {https://doi.org/10.1109/TSE.2003.1232284},
    doi = {10.1109/TSE.2003.1232284},
    journal = {IEEE Trans. Softw. Eng.},
    month = sep,
    pages = {782–795},
    numpages = {14},
    keywords = {object-oriented programming, software visualization, software metrics, reverse engineering}
    }

  • A. Leitão, “Detection of redundant code using r2d2,” in Proceedings third ieee international workshop on source code analysis and manipulation, Los Alamitos, CA, USA, 2003, p. 183. doi:10.1109/SCAM.2003.1238044
    [BibTeX] [PDF]
    @INPROCEEDINGS{10.1109/SCAM.2003.1238044,
    author = {A. Leit\~{a}o},
    booktitle = {Proceedings Third IEEE International Workshop on Source Code Analysis and Manipulation},
    title = {Detection of Redundant Code using R2D2},
    year = {2003},
    volume = {},
    issn = {},
    pages = {183},
    doi = {10.1109/SCAM.2003.1238044},
    url = {https://doi.ieeecomputersociety.org/10.1109/SCAM.2003.1238044},
    publisher = {IEEE Computer Society},
    address = {Los Alamitos, CA, USA},
    month = {sep}
    }

  • L. Prechelt and G. Malpohl, “Finding plagiarisms among a set of programs with jplag,” Journal of universal computer science, vol. 8, 2003.
    [BibTeX] [PDF]
    @article{article,
    author = {Prechelt, Lutz and Malpohl, Guido},
    year = {2003},
    month = {03},
    pages = {},
    title = {Finding Plagiarisms among a Set of Programs with JPlag},
    volume = {8},
    url = {https://www.researchgate.net/publication/2832828_Finding_Plagiarisms_among_a_Set_of_Programs_with_JPlag},
    journal = {Journal of Universal Computer Science}
    }

  • F. Ricca and P. Tonella, “Using clustering to support the migration from static to dynamic web pages,” in 11th ieee international workshop on program comprehension, 2003., 2003, pp. 207-216.
    [BibTeX] [PDF]
    @INPROCEEDINGS{1199204,
    author={F. {Ricca} and P. {Tonella}},
    booktitle={11th IEEE International Workshop on Program Comprehension, 2003.},
    title={Using clustering to support the migration from static to dynamic web pages},
    year={2003},
    url = {https://ieeexplore.ieee.org/document/1199204},
    volume={},
    number={},
    pages={207-216},}

  • S. Schleimer, D. S. Wilkerson, and A. Aiken, “Winnowing: local algorithms for document fingerprinting,” in Proceedings of the 2003 acm sigmod international conference on management of data, New York, NY, USA, 2003, p. 76–85. doi:10.1145/872757.872770
    [BibTeX] [PDF]
    @inproceedings{10.1145/872757.872770,
    author = {Schleimer, Saul and Wilkerson, Daniel S. and Aiken, Alex},
    title = {Winnowing: Local Algorithms for Document Fingerprinting},
    year = {2003},
    isbn = {158113634X},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/872757.872770},
    doi = {10.1145/872757.872770},
    booktitle = {Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data},
    pages = {76–85},
    numpages = {10},
    location = {San Diego, California},
    series = {SIGMOD ’03}
    }

  • N. Synytskyy, J. R. Cordy, and T. Dean, “Resolution of static clones in dynamic web pages,” in Fifth ieee international workshop on web site evolution, 2003. theme: architecture. proceedings., 2003, pp. 49-56.
    [BibTeX] [PDF]
    @INPROCEEDINGS{1234008,
    author={N. {Synytskyy} and J. R. {Cordy} and T. {Dean}},
    booktitle={Fifth IEEE International Workshop on Web Site Evolution, 2003. Theme: Architecture. Proceedings.},
    title={Resolution of static clones in dynamic Web pages},
    year={2003},
    url ={https://ieeexplore.ieee.org/document/1234008},
    volume={},
    number={},
    pages={49-56},}

  • A. Walenstein and A. Lakhotia, “Clone detector evaluation can be improved: ideas from information retrieval,” Proceedings of the 2nd international workshop on detection of software clones (iwdsc), 2003.
    [BibTeX] [PDF]
    @article{article,
    author = {Walenstein, Andrew and Lakhotia, Arun},
    year = {2003},
    month = {01},
    pages = {},
    journal = {Proceedings of the 2nd International Workshop on Detection of Software Clones (IWDSC)},
    url= {https://www.researchgate.net/publication/229018949_Clone_detector_evaluation_can_be_improved_Ideas_from_information_retrieval},
    title = {Clone detector evaluation can be improved: Ideas from information retrieval}
    }

  • L. Zou and M. W. Godfrey, “Detecting merging and splitting using origin analysis,” in Proceedings of the 10th working conference on reverse engineering, USA, 2003, p. 146.
    [BibTeX] [PDF]
    @inproceedings{10.5555/950792.951375,
    author = {Zou, Lijie and Godfrey, Michael W.},
    title = {Detecting Merging and Splitting Using Origin Analysis},
    year = {2003},
    isbn = {0769520278},
    publisher = {IEEE Computer Society},
    address = {USA},
    url = {https://dl.acm.org/doi/10.5555/950792.951375},
    booktitle = {Proceedings of the 10th Working Conference on Reverse Engineering},
    pages = {146},
    numpages = {1},
    series = {WCRE ’03}
    }

2002

  • Y. Ueda, T. Kamiya, S. Kusumoto, and K. Inoue, “On detection of gapped code clones using gap locations,” in Ninth asia-pacific software engineering conference, 2002., 2002, pp. 327-336.
    [BibTeX] [PDF]
    @INPROCEEDINGS{1183002,
    author={Y. {Ueda} and T. {Kamiya} and S. {Kusumoto} and K. {Inoue}},
    booktitle={Ninth Asia-Pacific Software Engineering Conference, 2002.},
    title={On detection of gapped code clones using gap locations},
    year={2002},
    url ={https://ieeexplore.ieee.org/document/1183002},
    volume={},
    number={},
    pages={327-336},}

  • E. Burd and J. Bailey, “Evaluating clone detection tools for use during preventative maintenance,” in Proceedings. second ieee international workshop on source code analysis and manipulation, 2002, pp. 36-43.
    [BibTeX] [PDF]
    @INPROCEEDINGS{1134103,
    author={E. {Burd} and J. {Bailey}},
    booktitle={Proceedings. Second IEEE International Workshop on Source Code Analysis and Manipulation},
    title={Evaluating clone detection tools for use during preventative maintenance},
    year={2002},
    url = {https://ieeexplore.ieee.org/document/1134103},
    volume={},
    number={},
    pages={36-43},}

  • T. Kamiya, S. Kusumoto, and K. Inoue, “Ccfinder: a multilinguistic token-based code clone detection system for large scale source code,” Ieee transactions on software engineering, vol. 28, iss. 7, pp. 654-670, 2002.
    [BibTeX] [PDF]
    @ARTICLE{1019480,
    author={T. {Kamiya} and S. {Kusumoto} and K. {Inoue}},
    journal={IEEE Transactions on Software Engineering},
    title={CCFinder: a multilinguistic token-based code clone detection system for large scale source code},
    year={2002},
    url = {https://ieeexplore.ieee.org/document/1019480},
    volume={28},
    number={7},
    pages={654-670},}

  • G. Antoniol, U. Villano, E. Merlo, and M. Di Penta, “Analyzing cloning evolution in the linux kernel,” Information & software technology, vol. 44, pp. 755-765, 2002. doi:10.1016/S0950-5849(02)00123-4
    [BibTeX] [PDF]
    @article{article,
    author = {Antoniol, Giuliano and Villano, Umberto and Merlo, Ettore and Di Penta, Massimiliano},
    year = {2002},
    month = {10},
    url = {https://www.researchgate.net/publication/220610329_Analyzing_cloning_evolution_in_the_Linux_kernel},
    pages = {755-765},
    title = {Analyzing cloning evolution in the Linux kernel},
    volume = {44},
    journal = {Information \& Software Technology},
    doi = {10.1016/S0950-5849(02)00123-4}
    }

  • G. A. Di Lucca, M. Di Penta, and A. R. Fasolino, “An approach to identify duplicated web pages,” in Proceedings 26th annual international computer software and applications, 2002.
    [BibTeX] [PDF]
    @INPROCEEDINGS{1045051,
    author={G. A. {Di Lucca} and M. {Di Penta} and A. R. {Fasolino}},
    booktitle={Proceedings 26th Annual International Computer Software and Applications},
    title={An approach to identify duplicated web pages},
    year={2002},
    url = {https://ieeexplore.ieee.org/document/1045051},
    volume={},
    number={},
    }

  • Y. Higo, Y. Ueda, T. Kamiya, S. Kusumoto, and K. Inoue, “On software maintenance process improvement based on code clone analysis,” in Proceedings of the 4th international conference on product focused software process improvement, Berlin, Heidelberg, 2002, p. 185–197.
    [BibTeX] [PDF]
    @inproceedings{10.5555/646972.713674,
    author = {Higo, Yoshiki and Ueda, Yasushi and Kamiya, Toshihiro and Kusumoto, Shinji and Inoue, Katsuro},
    title = {On Software Maintenance Process Improvement Based on Code Clone Analysis},
    year = {2002},
    isbn = {3540002340},
    url = {https://dl.acm.org/doi/10.5555/646972.713674},
    publisher = {Springer-Verlag},
    address = {Berlin, Heidelberg},
    booktitle = {Proceedings of the 4th International Conference on Product Focused Software Process Improvement},
    pages = {185–197},
    numpages = {13},
    series = {PROFES ’02}
    }

  • “Extensible language-aware merging,” in Proceedings of the international conference on software maintenance (icsm’02), USA, 2002, p. 511.
    [BibTeX] [PDF]
    @inproceedings{10.5555/876882.879732,
    title = {Extensible Language-Aware Merging},
    year = {2002},
    isbn = {0769518192},
    publisher = {IEEE Computer Society},
    address = {USA},
    url = {https://dl.acm.org/doi/10.5555/876882.879732},
    booktitle = {Proceedings of the International Conference on Software Maintenance (ICSM’02)},
    pages = {511},
    numpages = {1},
    series = {ICSM ’02}
    }

  • E. Merlo, M. Dagenais, P. Bachand, J. S. Sormani, S. Gradara, and G. Antoniol, “Investigating large software system evolution: the linux kernel,” in Proceedings of the 26th international computer software and applications conference on prolonging software life: development and redevelopment, USA, 2002, p. 421–426.
    [BibTeX] [PDF]
    @inproceedings{10.5555/645984.675865,
    author = {Merlo, Ettore and Dagenais, Michel and Bachand, P. and Sormani, J. S. and Gradara, S. and Antoniol, Giuliano},
    title = {Investigating Large Software System Evolution: The Linux Kernel},
    year = {2002},
    isbn = {0769517277},
    publisher = {IEEE Computer Society},
    address = {USA},
    booktitle = {Proceedings of the 26th International Computer Software and Applications Conference on Prolonging Software Life: Development and Redevelopment},
    pages = {421–426},
    url = {https://dl.acm.org/doi/10.5555/645984.675865},
    numpages = {6},
    keywords = {project management, clone analysis, software metrics, software evolution},
    series = {COMPSAC ’02}
    }

  • Q. Tu and M. W. Godfrey, “An integrated approach for studying architectural evolution,” in Proceedings 10th international workshop on program comprehension, 2002, pp. 127-136.
    [BibTeX] [PDF]
    @INPROCEEDINGS{1021334,
    author={Q. {Tu} and M. W. {Godfrey}},
    booktitle={Proceedings 10th International Workshop on Program Comprehension},
    title={An integrated approach for studying architectural evolution},
    year={2002},
    url = {https://ieeexplore.ieee.org/document/1021334},
    volume={},
    number={},
    pages={127-136},}

  • Y. Ueda, T. Kamiya, S. Kusumoto, and K. Inoue, “Gemini: maintenance support environment based on code clone analysis,” in Proceedings eighth ieee symposium on software metrics, 2002, pp. 67-76.
    [BibTeX] [PDF]
    @INPROCEEDINGS{1011326,
    author={Y. {Ueda} and T. {Kamiya} and S. {Kusumoto} and K. {Inoue}},
    booktitle={Proceedings Eighth IEEE Symposium on Software Metrics},
    title={Gemini: maintenance support environment based on code clone analysis},
    year={2002},
    url = {https://ieeexplore.ieee.org/document/1011326},
    volume={},
    number={},
    pages={67-76},}

2001

  • A. Marcus and J. I. Maletic, “Identification of high-level concept clones in source code,” in Proceedings 16th annual international conference on automated software engineering (ase 2001), 2001, pp. 107-114.
    [BibTeX] [PDF]
    @INPROCEEDINGS{989796,
    author={A. {Marcus} and J. I. {Maletic}},
    booktitle={Proceedings 16th Annual International Conference on Automated Software Engineering (ASE 2001)},
    title={Identification of high-level concept clones in source code},
    year={2001},
    url = {https://ieeexplore.ieee.org/document/989796},
    volume={},
    number={},
    pages={107-114},}

  • G. Antoniol, G. Casazza, M. Di Penta, and E. Merlo, “Modeling clones evolution through time series,” in Proceedings ieee international conference on software maintenance. icsm 2001, 2001, pp. 273-280.
    [BibTeX] [PDF]
    @INPROCEEDINGS{972740,
    author={G. {Antoniol} and G. {Casazza} and M. {Di Penta} and E. {Merlo}},
    booktitle={Proceedings IEEE International Conference on Software Maintenance. ICSM 2001},
    title={Modeling clones evolution through time series},
    year={2001},
    url = {https://ieeexplore.ieee.org/document/972740},
    volume={},
    number={},
    pages={273-280},}

  • C. Boldyreff and R. Kewish, “Reverse engineering to achieve maintainable www sites,” in Proceedings of the eighth working conference on reverse engineering (wcre’01), USA, 2001, p. 249.
    [BibTeX] [PDF]
    @inproceedings{10.5555/832308.837121,
    author = {Boldyreff, Cornelia and Kewish, Richard},
    title = {Reverse Engineering to Achieve Maintainable WWW Sites},
    year = {2001},
    url ={https://dl.acm.org/doi/10.5555/832308.837121},
    isbn = {0769513034},
    publisher = {IEEE Computer Society},
    address = {USA},
    booktitle = {Proceedings of the Eighth Working Conference on Reverse Engineering (WCRE’01)},
    pages = {249},
    numpages = {1},
    keywords = {web analysis, re-structuring, detection of duplicated web content, data abstraction, Web site maintenance},
    series = {WCRE ’01}
    }

  • G. Casazza, G. Antoniol, U. Villano, E. Merlo, and M. Di Penta, “Identifying clones in the linux kernel,” in Proceedings. first ieee international workshop on source code analysis and manipulation, 2001, pp. 90-97. doi:10.1109/SCAM.2001.972670
    [BibTeX] [PDF]
    @inproceedings{inproceedings,
    author = {Casazza, G. and Antoniol, Giuliano and Villano, Umberto and Merlo, E. and Di Penta, Massimiliano},
    year = {2001},
    month = {02},
    pages = {90-97},
    title = {Identifying clones in the Linux Kernel},
    isbn = {0-7695-1387-5},
    booktitle = {Proceedings. First IEEE International Workshop on Source Code Analysis and Manipulation},
    url = {https://www.researchgate.net/publication/3929803_Identifying_clones_in_the_Linux_Kernel},
    doi = {10.1109/SCAM.2001.972670}
    }

  • A. Chou, J. Yang, B. Chelf, S. Hallem, and D. Engler, “An empirical study of operating systems errors,” in Proceedings of the eighteenth acm symposium on operating systems principles, New York, NY, USA, 2001, p. 73–88. doi:10.1145/502034.502042
    [BibTeX] [PDF]
    @inproceedings{10.1145/502034.502042,
    author = {Chou, Andy and Yang, Junfeng and Chelf, Benjamin and Hallem, Seth and Engler, Dawson},
    title = {An Empirical Study of Operating Systems Errors},
    year = {2001},
    isbn = {1581133898},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/502034.502042},
    doi = {10.1145/502034.502042},
    booktitle = {Proceedings of the Eighteenth ACM Symposium on Operating Systems Principles},
    pages = {73–88},
    numpages = {16},
    location = {Banff, Alberta, Canada},
    series = {SOSP ’01}
    }

  • F. Di Caprio, G. Casazza, M. Di Penta, and U. Villano, “Clone analysis in the web era: an approach to identify cloned web pages,” in Proceedings of the 7th ieee workshop on empirical studies of software maintenance (wess), 2001, pp. 107-113.
    [BibTeX] [PDF]
    @inproceedings{Caprio2001CloneAI,
    title={Clone Analysis in the Web Era: an Approach to Identify Cloned Web Pages},
    url = {https://www.semanticscholar.org/paper/Clone-Analysis-in-the-Web-Era%3A-an-Approach-to-Web-Caprio-Casazza/ee6e8f646ed0313f5a9ab3529ee2fa32e9a07019},
    author={Di Caprio, Francesco and Casazza, Gerardo and Di Penta, Massimiliano and Villano, Umberto},
    year={2001},
    booktitle = {Proceedings of the 7th IEEE Workshop on Empirical Studies of Software Maintenance (WESS)},
    pages = {107-113}
    }

  • F. Fioravanti, G. Migliarese, and P. Nesi, “Reengineering analysis of object-oriented systems via duplication analysis,” in Proceedings of the 23rd international conference on software engineering. icse 2001, 2001, pp. 577-586.
    [BibTeX] [PDF]
    @INPROCEEDINGS{919132,
    author={F. {Fioravanti} and G. {Migliarese} and P. {Nesi}},
    booktitle={Proceedings of the 23rd International Conference on Software Engineering. ICSE 2001},
    title={Reengineering analysis of object-oriented systems via duplication analysis},
    year={2001},
    url = {https://ieeexplore.ieee.org/document/919132},
    volume={},
    number={},
    pages={577-586},}

  • M. Godfrey and Q. Tu, “Growth, evolution, and structural change in open source software,” in Proceedings of the 4th international workshop on principles of software evolution, New York, NY, USA, 2001, p. 103–106. doi:10.1145/602461.602482
    [BibTeX] [PDF]
    @inproceedings{10.1145/602461.602482,
    author = {Godfrey, Michael and Tu, Qiang},
    title = {Growth, Evolution, and Structural Change in Open Source Software},
    year = {2001},
    isbn = {1581135084},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/602461.602482},
    doi = {10.1145/602461.602482},
    booktitle = {Proceedings of the 4th International Workshop on Principles of Software Evolution},
    pages = {103–106},
    numpages = {4},
    keywords = {software evolution, supporting environments, open source software, software architecture, GCC, structural change, Linux},
    location = {Vienna, Austria},
    series = {IWPSE ’01}
    }

  • R. Komondoor and S. Horwitz, “Tool demonstration: finding duplicated code using program dependences,” in Programming languages and systems, Berlin, Heidelberg, 2001, pp. 383-386.
    [BibTeX] [Abstract] [PDF]

    The results of several studies [1,7,8] indicate that 7-23% of the source code for large programs is duplicated code. Duplication makes programs harder to maintain because when enhancements or bug fixes are made in one instance of the duplicated code, it is necessary to search for the other instances in order to perform the corresponding modification.

    @InProceedings{10.1007/3-540-45309-1_25,
    author="Komondoor, Raghavan
    and Horwitz, Susan",
    editor="Sands, David",
    title="Tool Demonstration: Finding Duplicated Code Using Program Dependences",
    booktitle="Programming Languages and Systems",
    year="2001",
    publisher="Springer Berlin Heidelberg",
    address="Berlin, Heidelberg",
    pages="383-386",
    url = {https://link.springer.com/chapter/10.1007/3-540-45309-1_25},
    abstract="The results of several studies [1,7,8] indicate that 7-23{\%} of the source code for large programs is duplicated code. Duplication makes programs harder to maintain because when enhancements or bug fixes are made in one instance of the duplicated code, it is necessary to search for the other instances in order to perform the corresponding modification.",
    isbn="978-3-540-45309-3"
    }

  • R. Komondoor and S. Horwitz, “Using slicing to identify duplication in source code,” in Proceedings of the 8th international symposium on static analysis, Berlin, Heidelberg, 2001, p. 40–56.
    [BibTeX] [PDF]
    @inproceedings{10.5555/647170.718283,
    author = {Komondoor, Raghavan and Horwitz, Susan},
    title = {Using Slicing to Identify Duplication in Source Code},
    year = {2001},
    isbn = {3540423141},
    publisher = {Springer-Verlag},
    address = {Berlin, Heidelberg},
    booktitle = {Proceedings of the 8th International Symposium on Static Analysis},
    pages = {40–56},
    url ={https://dl.acm.org/doi/10.5555/647170.718283},
    numpages = {17},
    series = {SAS ’01}
    }

  • J. Krinke, “Identifying similar code with program dependence graphs,” in Proceedings of the eighth working conference on reverse engineering (wcre’01), USA, 2001, p. 301.
    [BibTeX] [PDF]
    @inproceedings{10.5555/832308.837142,
    author = {Krinke, Jens},
    title = {Identifying Similar Code with Program Dependence Graphs},
    year = {2001},
    isbn = {0769513034},
    publisher = {IEEE Computer Society},
    address = {USA},
    url = {https://dl.acm.org/doi/10.5555/832308.837142},
    booktitle = {Proceedings of the Eighth Working Conference on Reverse Engineering (WCRE’01)},
    pages = {301},
    numpages = {1},
    series = {WCRE ’01}
    }

  • J. I. Maletic and A. Marcus, “Supporting program comprehension using semantic and structural information,” in Proceedings of the 23rd international conference on software engineering. icse 2001, 2001, pp. 103-112.
    [BibTeX] [PDF]
    @INPROCEEDINGS{919085,
    author={J. I. {Maletic} and A. {Marcus}},
    booktitle={Proceedings of the 23rd International Conference on Software Engineering. ICSE 2001},
    title={Supporting program comprehension using semantic and structural information},
    year={2001},
    url = {https://ieeexplore.ieee.org/document/919085},
    volume={},
    number={},
    pages={103-112},}

  • R. C. Miller and B. A. Myers, “Interactive simultaneous editing of multiple text regions,” in Proceedings of the general track: 2001 usenix annual technical conference, USA, 2001, p. 161–174.
    [BibTeX] [PDF]
    @inproceedings{10.5555/647055.715910,
    author = {Miller, Robert C. and Myers, Brad A.},
    title = {Interactive Simultaneous Editing of Multiple Text Regions},
    year = {2001},
    isbn = {188044609X},
    publisher = {USENIX Association},
    address = {USA},
    url = {https://dl.acm.org/doi/10.5555/647055.715910},
    booktitle = {Proceedings of the General Track: 2001 USENIX Annual Technical Conference},
    pages = {161–174},
    numpages = {14}
    }

  • B. S. Mitchell and S. Mancoridis, “Craft: a framework for evaluating software clustering results in the absence of benchmark decompositions,” in Proceedings of the eighth working conference on reverse engineering (wcre’01), USA, 2001, p. 93.
    [BibTeX] [PDF]
    @inproceedings{10.5555/832308.837151,
    author = {Mitchell, Brian S. and Mancoridis, Spiros},
    title = {CRAFT: A Framework for Evaluating Software Clustering Results in the Absence of Benchmark Decompositions},
    year = {2001},
    isbn = {0769513034},
    publisher = {IEEE Computer Society},
    address = {USA},
    url = {https://dl.acm.org/doi/10.5555/832308.837151},
    booktitle = {Proceedings of the Eighth Working Conference on Reverse Engineering (WCRE’01)},
    pages = {93},
    numpages = {1},
    keywords = {Software Maintenance, Software Clustering, Evaluation},
    series = {WCRE ’01}
    }

  • L. Moonen, “Generating robust parsers using island grammars,” in Proceedings of the eighth working conference on reverse engineering (wcre’01), USA, 2001, p. 13.
    [BibTeX] [PDF]
    @inproceedings{10.5555/832308.837160,
    author = {Moonen, Leon},
    title = {Generating Robust Parsers Using Island Grammars},
    year = {2001},
    isbn = {0769513034},
    publisher = {IEEE Computer Society},
    address = {USA},
    booktitle = {Proceedings of the Eighth Working Conference on Reverse Engineering (WCRE’01)},
    pages = {13},
    numpages = {1},
    url ={https://dl.acm.org/doi/10.5555/832308.837160},
    keywords = {parser generation, reverse engineering, fuzzy parsing, program analysis, source model extraction, Island grammars, partial parsing},
    series = {WCRE ’01}
    }

2000

  • M. Balazinska, E. Merlo, M. Dagenais, B. Lague, and K. Kontogiannis, “Advanced clone-analysis to support object-oriented system refactoring,” in Proceedings seventh working conference on reverse engineering, 2000, pp. 98-107.
    [BibTeX] [PDF]
    @INPROCEEDINGS{891457,
    author={M. {Balazinska} and E. {Merlo} and M. {Dagenais} and B. {Lague} and K. {Kontogiannis}},
    booktitle={Proceedings Seventh Working Conference on Reverse Engineering},
    title={Advanced clone-analysis to support object-oriented system refactoring},
    year={2000},
    url = {https://ieeexplore.ieee.org/document/891457},
    volume={},
    number={},
    pages={98-107},}

  • S. K. Debray, W. Evans, R. Muth, and B. De Sutter, “Compiler techniques for code compaction,” Acm trans. program. lang. syst., vol. 22, iss. 2, p. 378–415, 2000. doi:10.1145/349214.349233
    [BibTeX] [PDF]
    @article{10.1145/349214.349233,
    author = {Debray, Saumya K. and Evans, William and Muth, Robert and De Sutter, Bjorn},
    title = {Compiler Techniques for Code Compaction},
    year = {2000},
    issue_date = {March 2000},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    volume = {22},
    number = {2},
    issn = {0164-0925},
    url = {https://doi.org/10.1145/349214.349233},
    doi = {10.1145/349214.349233},
    journal = {ACM Trans. Program. Lang. Syst.},
    month = mar,
    pages = {378–415},
    numpages = {38},
    keywords = {code compaction, code size reduction, code compression}
    }

  • S. Ducasse, M. Rieger, and S. Demeyer, “A language independent approach for detecting duplicated code,” Conference on software maintenance, 2000.
    [BibTeX] [PDF]
    @article{article,
    author = {Ducasse, Stéphane and Rieger, Matthias and Demeyer, Serge},
    year = {2000},
    month = {12},
    pages = {},
    url = {https://www.researchgate.net/publication/2430208_A_Language_Independent_Approach_for_Detecting_Duplicated_Code},
    title = {A Language Independent Approach for Detecting Duplicated Code},
    journal = {Conference on Software Maintenance}
    }

  • K. Faxen, “The costs and benefits of cloning in a lazy functional language,” in 2nd scottish functional programming workshop (sfp), 2000, pp. 1-12.
    [BibTeX] [PDF]
    @inproceedings{inproceedings,
    author = {Faxen, Karl-Filip},
    year = {2000},
    month = {01},
    pages = {1-12},
    booktitle = {2nd Scottish Functional Programming Workshop (SFP)},
    title = {The costs and benefits of cloning in a lazy functional language},
    url = {https://www.researchgate.net/publication/221335508_The_costs_and_benefits_of_cloning_in_a_lazy_functional_language}
    }

  • M. W. Godfrey and Q. Tu, “Evolution in open source software: a case study,” in Proceedings 2000 international conference on software maintenance, 2000, pp. 131-142.
    [BibTeX] [PDF]
    @INPROCEEDINGS{883030,
    author={M. W. {Godfrey} and Q. {Tu}},
    booktitle={Proceedings 2000 International Conference on Software Maintenance},
    title={Evolution in open source software: a case study},
    year={2000},
    url = {https://ieeexplore.ieee.org/document/883030},
    volume={},
    number={},
    pages={131-142},}

  • T. L. Graves, A. F. Karr, J. S. Marron, and H. Siy, “Predicting fault incidence using software change history,” Ieee trans. softw. eng., vol. 26, iss. 7, p. 653–661, 2000. doi:10.1109/32.859533
    [BibTeX] [PDF]
    @article{10.1109/32.859533,
    author = {Graves, Todd L. and Karr, Alan F. and Marron, J. S. and Siy, Harvey},
    title = {Predicting Fault Incidence Using Software Change History},
    year = {2000},
    issue_date = {July 2000},
    publisher = {IEEE Press},
    volume = {26},
    number = {7},
    issn = {0098-5589},
    url = {https://doi.org/10.1109/32.859533},
    doi = {10.1109/32.859533},
    journal = {IEEE Trans. Softw. Eng.},
    month = jul,
    pages = {653–661},
    numpages = {9},
    keywords = {generalized linear models, code decay, change management data, Fault potential, statistical analysis, metrics}
    }

  • R. Komondoor and S. Horwitz, “Semantics-preserving procedure extraction,” in Proceedings of the 27th acm sigplan-sigact symposium on principles of programming languages, New York, NY, USA, 2000, p. 155–169. doi:10.1145/325694.325713
    [BibTeX] [PDF]
    @inproceedings{10.1145/325694.325713,
    author = {Komondoor, Raghavan and Horwitz, Susan},
    title = {Semantics-Preserving Procedure Extraction},
    year = {2000},
    isbn = {1581131259},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/325694.325713},
    doi = {10.1145/325694.325713},
    booktitle = {Proceedings of the 27th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages},
    pages = {155–169},
    numpages = {15},
    location = {Boston, MA, USA},
    series = {POPL ’00}
    }

1999

  • M. Balazinska, E. Merlo, M. Dagenais, B. Lague, and K. Kontogiannis, “Measuring clone based reengineering opportunities,” in Proceedings sixth international software metrics symposium (cat. no.pr00403), 1999, pp. 292-303.
    [BibTeX] [PDF]
    @INPROCEEDINGS{809750,
    author={M. {Balazinska} and E. {Merlo} and M. {Dagenais} and B. {Lague} and K. {Kontogiannis}},
    booktitle={Proceedings Sixth International Software Metrics Symposium (Cat. No.PR00403)},
    title={Measuring clone based reengineering opportunities},
    year={1999},
    url = {https://ieeexplore.ieee.org/document/809750},
    volume={},
    number={},
    pages={292-303},}

  • M. Balazinska, E. Merlo, M. Dagenais, B. Lague, and K. Kontogiannis, “Partial redesign of java software systems based on clone analysis,” in Sixth working conference on reverse engineering (cat. no.pr00303), 1999, pp. 326-336.
    [BibTeX] [PDF]
    @INPROCEEDINGS{806971,
    author={M. {Balazinska} and E. {Merlo} and M. {Dagenais} and B. {Lague} and K. {Kontogiannis}},
    booktitle={Sixth Working Conference on Reverse Engineering (Cat. No.PR00303)},
    title={Partial redesign of Java software systems based on clone analysis},
    year={1999},
    url = {https://ieeexplore.ieee.org/abstract/document/806971},
    volume={},
    number={},
    pages={326-336},}

  • B. S. Baker, “Parameterized diff,” in Proceedings of the tenth annual acm-siam symposium on discrete algorithms, USA, 1999, p. 854–855.
    [BibTeX] [PDF]
    @inproceedings{10.5555/314500.314968,
    author = {Baker, Brenda S.},
    title = {Parameterized Diff},
    year = {1999},
    isbn = {0898714346},
    url = {https://dl.acm.org/doi/10.5555/314500.314968},
    publisher = {Society for Industrial and Applied Mathematics},
    address = {USA},
    booktitle = {Proceedings of the Tenth Annual ACM-SIAM Symposium on Discrete Algorithms},
    pages = {854–855},
    numpages = {2},
    location = {Baltimore, Maryland, USA},
    series = {SODA ’99}
    }

  • K. D. Cooper and N. McIntosh, “Enhanced code compression for embedded risc processors,” in Proceedings of the acm sigplan 1999 conference on programming language design and implementation, New York, NY, USA, 1999, p. 139–149. doi:10.1145/301618.301655
    [BibTeX] [PDF]
    @inproceedings{10.1145/301618.301655,
    author = {Cooper, Keith D. and McIntosh, Nathaniel},
    title = {Enhanced Code Compression for Embedded RISC Processors},
    year = {1999},
    isbn = {1581130945},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/301618.301655},
    doi = {10.1145/301618.301655},
    booktitle = {Proceedings of the ACM SIGPLAN 1999 Conference on Programming Language Design and Implementation},
    pages = {139–149},
    numpages = {11},
    location = {Atlanta, Georgia, USA},
    series = {PLDI ’99}
    }

  • A. Van Deursen and T. Kuipers, “Building documentation generators,” in Proceedings ieee international conference on software maintenance – 1999 (icsm’99). ‘software maintenance for business change’ (cat. no.99cb36360), 1999, pp. 40-49.
    [BibTeX] [PDF]
    @INPROCEEDINGS{792497,
    author={A. {Van Deursen} and T. {Kuipers}},
    booktitle={Proceedings IEEE International Conference on Software Maintenance - 1999 (ICSM'99). 'Software Maintenance for Business Change' (Cat. No.99CB36360)},
    title={Building documentation generators},
    year={1999},
    url = {https://ieeexplore.ieee.org/document/792497},
    volume={},
    number={},
    pages={40-49},}

  • R. Fanta and V. Rajlich, “Removing clones from the code,” Journal of software maintenance, pp. 223-243, 1999.
    [BibTeX] [PDF]
    @article{article,
    author = {Fanta, Richard and Rajlich, Václav},
    journal = {Journal of Software Maintenance},
    year = {1999},
    pages = {223-243},
    url = {https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.65.6604&rep=rep1&type=pdf},
    title = {Removing Clones from the Code},
    }

  • D. Gitchell and N. Tran, “Sim: a utility for detecting similarity in computer programs,” in The proceedings of the thirtieth sigcse technical symposium on computer science education, New York, NY, USA, 1999, p. 266–270. doi:10.1145/299649.299783
    [BibTeX] [PDF]
    @inproceedings{10.1145/299649.299783,
    author = {Gitchell, David and Tran, Nicholas},
    title = {Sim: A Utility for Detecting Similarity in Computer Programs},
    year = {1999},
    isbn = {1581130856},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/299649.299783},
    doi = {10.1145/299649.299783},
    booktitle = {The Proceedings of the Thirtieth SIGCSE Technical Symposium on Computer Science Education},
    pages = {266–270},
    numpages = {5},
    location = {New Orleans, Louisiana, USA},
    series = {SIGCSE ’99}
    }

  • J. Patenaude, E. Merlo, M. Dagenais, and B. Laguë, “Extending software quality assessment techniques to java systems,” in Proceedings of the 7th international workshop on program comprehension, USA, 1999, p. 49.
    [BibTeX] [PDF]
    @inproceedings{10.5555/520033.858251,
    author = {Patenaude, Jean-Francois and Merlo, Ettore and Dagenais, Michel and Lagu\"{e}, Bruno},
    title = {Extending Software Quality Assessment Techniques to Java Systems},
    year = {1999},
    isbn = {0769501796},
    publisher = {IEEE Computer Society},
    address = {USA},
    booktitle = {Proceedings of the 7th International Workshop on Program Comprehension},
    pages = {49},
    numpages = {1},
    url ={https://dl.acm.org/doi/10.5555/520033.858251},
    keywords = {Java systems, Clone detection, Software evaluation, Quality assessment},
    series = {IWPC ’99}
    }

1998

  • B. S. Baker and U. Manber, “Deducing similarities in java sources from bytecodes,” in Proceedings of the annual conference on usenix annual technical conference, USA, 1998, p. 15.
    [BibTeX] [PDF]
    @inproceedings{10.5555/1268256.1268271,
    author = {Baker, Brenda S. and Manber, Udi},
    title = {Deducing Similarities in Java Sources from Bytecodes},
    year = {1998},
    url = {https://dl.acm.org/doi/10.5555/1268256.1268271},
    publisher = {USENIX Association},
    address = {USA},
    booktitle = {Proceedings of the Annual Conference on USENIX Annual Technical Conference},
    pages = {15},
    numpages = {1},
    location = {New Orleans, Louisiana},
    series = {ATEC ’98}
    }

  • M. Dagenais, E. Merlo, B. Laguë, and D. Proulx, “Clones occurence in large object oriented software packages,” in Proceedings of the 1998 conference of the centre for advanced studies on collaborative research, 1998, p. 10.
    [BibTeX] [PDF]
    @inproceedings{10.5555/783160.783170,
    author = {Dagenais, Michel and Merlo, Ettore and Lagu\"{e}, Bruno and Proulx, Daniel},
    title = {Clones Occurence in Large Object Oriented Software Packages},
    year = {1998},
    url = {https://dl.acm.org/doi/10.5555/783160.783170},
    publisher = {IBM Press},
    booktitle = {Proceedings of the 1998 Conference of the Centre for Advanced Studies on Collaborative Research},
    pages = {10},
    numpages = {9},
    location = {Toronto, Ontario, Canada},
    series = {CASCON ’98}
    }

  • R. Koschke, J.-F. Girard, and M. Würthner, “An intermediate representation for reverse engineering analyses,” in Proceedings of the working conference on reverse engineering (WCRE’98), USA, 1998, p. 241.
    [BibTeX] [PDF]
    @inproceedings{10.5555/832305.837023,
    author = {Koschke, R. and Girard, J.-F. and W\"{u}rthner, M.},
    title = {An Intermediate Representation for Reverse Engineering Analyses},
    year = {1998},
    isbn = {0818689676},
    publisher = {IEEE Computer Society},
    address = {USA},
    url = {https://dl.acm.org/doi/10.5555/832305.837023},
    booktitle = {Proceedings of the Working Conference on Reverse Engineering (WCRE’98)},
    pages = {241},
    numpages = {1},
    keywords = {reverse engineering, program representation, Views},
    series = {WCRE ’98}
    }

1997

  • K. Kontogiannis, “Evaluation experiments on the detection of programming patterns using software metrics,” in Proceedings of the fourth working conference on reverse engineering, 1997, pp. 44-54.
    [BibTeX] [PDF]
    @INPROCEEDINGS{624575,
    author={K. {Kontogiannis}},
    booktitle={Proceedings of the Fourth Working Conference on Reverse Engineering},
    title={Evaluation experiments on the detection of programming patterns using software metrics},
    year={1997},
    url ={https://ieeexplore.ieee.org/document/624575},
    pages={44-54},}

  • B. S. Baker, “Parameterized duplication in strings: algorithms and an application to software maintenance,” SIAM J. Comput., vol. 26, iss. 5, p. 1343–1362, 1997. doi:10.1137/S0097539793246707
    [BibTeX] [PDF]
    @article{10.1137/S0097539793246707,
    author = {Baker, Brenda S.},
    title = {Parameterized Duplication in Strings: Algorithms and an Application to Software Maintenance},
    year = {1997},
    issue_date = {Oct. 1997},
    publisher = {Society for Industrial and Applied Mathematics},
    address = {USA},
    volume = {26},
    number = {5},
    issn = {0097-5397},
    url = {https://doi.org/10.1137/S0097539793246707},
    doi = {10.1137/S0097539793246707},
    journal = {SIAM J. Comput.},
    month = oct,
    pages = {1343--1362},
    numpages = {20},
    keywords = {pattern matching, duplication, string matching}
    }

  • E. Burd and M. Munro, “Investigating the maintenance implications of the replication of code,” in 1997 proceedings international conference on software maintenance, 1997, pp. 322-329.
    [BibTeX] [PDF]
    @INPROCEEDINGS{5726969,
    author={E. {Burd} and M. {Munro}},
    booktitle={1997 Proceedings International Conference on Software Maintenance},
    title={Investigating the maintenance implications of the replication of code},
    year={1997},
    url ={https://ieeexplore.ieee.org/document/5726969},
    pages={322-329},}

  • B. Laguë, D. Proulx, J. Mayrand, E. M. Merlo, and J. Hudepohl, “Assessing the benefits of incorporating function clone detection in a development process,” in Proceedings of the international conference on software maintenance, USA, 1997, p. 314.
    [BibTeX] [PDF]
    @inproceedings{10.5555/645545.853273,
    author = {Lagu\"{e}, Bruno and Proulx, Daniel and Mayrand, Jean and Merlo, Ettore M. and Hudepohl, John},
    title = {Assessing the Benefits of Incorporating Function Clone Detection in a Development Process},
    year = {1997},
    isbn = {081868013X},
    url = {https://dl.acm.org/doi/10.5555/645545.853273},
    publisher = {IEEE Computer Society},
    address = {USA},
    booktitle = {Proceedings of the International Conference on Software Maintenance},
    pages = {314},
    numpages = {1},
    keywords = {source code metrics, product assessment, software evolution, Software clones, software maintenance},
    series = {ICSM ’97}
    }

1996

  • J. Mayrand, C. Leblanc, and E. M. Merlo, “Experiment on the automatic detection of function clones in a software system using metrics,” in 1996 proceedings of international conference on software maintenance, 1996, pp. 244-253.
    [BibTeX] [PDF]
    @INPROCEEDINGS{565012,
    author={J. {Mayrand} and C. {Leblanc} and E. M. {Merlo}},
    booktitle={1996 Proceedings of International Conference on Software Maintenance},
    title={Experiment on the automatic detection of function clones in a software system using metrics},
    year={1996},
    url ={https://ieeexplore.ieee.org/document/565012},
    pages={244-253},}

  • B. S. Baker, “Parameterized pattern matching: algorithms and applications,” Journal of computer and system sciences, vol. 52, iss. 1, pp. 28-42, 1996. doi:10.1006/jcss.1996.0003
    [BibTeX] [Abstract] [PDF]

    The problem of finding sections of code that either are identical or are related by the systematic renaming of variables or constants can be modeled in terms of parameterized strings (p-strings) and parameterized matches (p-matches). P-strings are strings over two alphabets, one of which represents parameters. Two p-strings are a parameterized match (p-match) if one p-string is obtained by renaming the parameters of the other by a one-to-one function. In this paper, we investigate parameterized pattern matching via parameterized suffix trees (p-suffix trees). We give two algorithms for constructing p-suffix trees: one (eager) that runs in linear time for fixed alphabets, and another that uses auxiliary data structures and runs in O(n log(n)) time for variable alphabets, where n is input length. We show that using a p-suffix tree for a pattern p-string P, it is possible to search for all p-matches of P within a text p-string T in space linear in |P| and time linear in |T| for fixed alphabets, or O(|T| log(min(|P|, σ))) time and O(|P|) space for variable alphabets, where σ is the sum of the alphabet sizes. The simpler p-suffix tree construction algorithm eager has been implemented, and experiments show it to be practical. Since it runs faster than predicted by the above worst-case bound, we reanalyze the algorithm and show that eager runs in time O(min(t|S| + m(t, S) | t > 0) log σ), where for an input p-string S, m(t, S) is the number of maximal p-matches of length at least t that occur within S, and σ is the sum of the alphabet sizes. Experiments with the author’s program dup (B. Baker, in “Comput. Sci. Statist.,” Vol. 24, 1992) for finding all maximal p-matches within a p-string have found m(t, S) to be less than |S| in practice unless t is small.

    @article{BAKER199628,
    title = "Parameterized Pattern Matching: Algorithms and Applications",
    journal = "Journal of Computer and System Sciences",
    volume = "52",
    number = "1",
    pages = "28-42",
    year = "1996",
    issn = "0022-0000",
    doi = "10.1006/jcss.1996.0003",
    url = "http://www.sciencedirect.com/science/article/pii/S0022000096900033",
    author = "Brenda S. Baker",
    abstract = "The problem of finding sections of code that either are identical or are related by the systematic renaming of variables or constants can be modeled in terms of parameterized strings (p-strings) and parameterized matches (p-matches). P-strings are strings over two alphabets, one of which represents parameters. Two p-strings are a parameterized match (p-match) if one p-string is obtained by renaming the parameters of the other by a one-to-one function. In this paper, we investigate parameterized pattern matching via parameterized suffix trees (p-suffix trees). We give two algorithms for constructing p-suffix trees: one (eager) that runs in linear time for fixed alphabets, and another that uses auxiliary data structures and runs in O(n log(n)) time for variable alphabets, where n is input length. We show that using a p-suffix tree for a pattern p-string P, it is possible to search for all p-matches of P within a text p-string T in space linear in |P| and time linear in |T| for fixed alphabets, or O(|T| log(min(|P|, σ))) time and O(|P|) space for variable alphabets, where σ is the sum of the alphabet sizes. The simpler p-suffix tree construction algorithm eager has been implemented, and experiments show it to be practical. Since it runs faster than predicted by the above worst-case bound, we reanalyze the algorithm and show that eager runs in time O(min(t|S| + m(t, S) | t > 0) log σ), where for an input p-string S, m(t, S) is the number of maximal p-matches of length at least t that occur within S, and σ is the sum of the alphabet sizes. Experiments with the author's program dup (B. Baker, in “Comput. Sci. Statist.,” Vol. 24, 1992) for finding all maximal p-matches within a p-string have found m(t, S) to be less than |S| in practice unless t is small."
    }

  • G. Flammia, “On the internet, software should be milked, not brewed,” IEEE expert: intelligent systems and their applications, vol. 11, iss. 6, p. 87–88, 1996. doi:10.1109/64.546588
    [BibTeX] [PDF]
    @article{10.1109/64.546588,
    author = {Flammia, Giovanni},
    title = {On the Internet, Software Should Be Milked, Not Brewed},
    year = {1996},
    issue_date = {December 1996},
    publisher = {IEEE Educational Activities Department},
    address = {USA},
    volume = {11},
    number = {6},
    issn = {0885-9000},
    url = {https://doi.org/10.1109/64.546588},
    doi = {10.1109/64.546588},
    journal = {IEEE Expert: Intelligent Systems and Their Applications},
    month = dec,
    pages = {87--88},
    numpages = {2}
    }

  • J. Helfman, “Dotplot patterns: a literal look at pattern languages,” Theor. pract. object syst., vol. 2, iss. 1, p. 31–41, 1996.
    [BibTeX] [PDF]
    @article{10.5555/246277.246285,
    author = {Helfman, Jonathan},
    title = {Dotplot Patterns: A Literal Look at Pattern Languages},
    year = {1996},
    issue_date = {1996},
    publisher = {John Wiley \& Sons, Inc.},
    address = {USA},
    volume = {2},
    url = {https://dl.acm.org/doi/10.5555/246277.246285},
    number = {1},
    issn = {1074-3227},
    journal = {Theor. Pract. Object Syst.},
    month = nov,
    pages = {31--41},
    numpages = {11}
    }

  • J. H. Johnson, “Navigating the textual redundancy web in legacy source,” in Proceedings of the 1996 conference of the centre for advanced studies on collaborative research, 1996, p. 16.
    [BibTeX] [PDF]
    @inproceedings{10.5555/782052.782068,
    author = {Johnson, J. Howard},
    title = {Navigating the Textual Redundancy Web in Legacy Source},
    year = {1996},
    publisher = {IBM Press},
    booktitle = {Proceedings of the 1996 Conference of the Centre for Advanced Studies on Collaborative Research},
    pages = {16},
    url = {https://dl.acm.org/doi/10.5555/782052.782068},
    numpages = {10},
    location = {Toronto, Ontario, Canada},
    series = {CASCON ’96}
    }

  • K. A. Kontogiannis, R. De Mori, E. Merlo, M. Galler, and M. Bernstein, “Pattern matching for clone and concept detection,” in Reverse engineering, USA: Kluwer academic publishers, 1996, p. 77–108.
    [BibTeX] [PDF]
    @inbook{10.5555/265619.265626,
    author = {Kontogiannis, K. A. and De Mori, R. and Merlo, E. and Galler, M. and Bernstein, M.},
    title = {Pattern Matching for Clone and Concept Detection},
    year = {1996},
    isbn = {0792397568},
    publisher = {Kluwer Academic Publishers},
    address = {USA},
    url = {https://dl.acm.org/doi/10.5555/265619.265626},
    booktitle = {Reverse Engineering},
    pages = {77--108},
    numpages = {32}
    }

1995

  • B. S. Baker, “On finding duplication and near-duplication in large software systems,” in Proceedings of 2nd working conference on reverse engineering, 1995, pp. 86-95.
    [BibTeX] [PDF]
    @INPROCEEDINGS{514697,
    author={B. S. {Baker}},
    booktitle={Proceedings of 2nd Working Conference on Reverse Engineering},
    title={On finding duplication and near-duplication in large software systems},
    year={1995},
    url ={https://ieeexplore.ieee.org/document/514697},
    pages={86-95},}

  • N. Davey, P. Barson, S. D. H. Field, R. Frank, and S. Tansley, “The development of a software clone detector,” International journal of applied software technology, vol. 1, 1995.
    [BibTeX] [PDF]
    @article{article,
    author = {Davey, Neil and Barson, Paul and Field, S.D.H. and Frank, Ray and Tansley, Stewart},
    year = {1995},
    month = {01},
    url = {https://www.researchgate.net/publication/30383319_The_Development_of_a_Software_Clone_Detector},
    title = {The Development of a Software Clone Detector},
    volume = {1},
    journal = {International Journal of Applied Software Technology}
    }

  • S. Rao Kosaraju, “Faster algorithms for the construction of parameterized suffix trees,” in Proceedings of the 36th annual symposium on foundations of computer science, USA, 1995, p. 631.
    [BibTeX] [PDF]
    @inproceedings{10.5555/795662.796302,
    author = {Kosaraju, S. Rao},
    title = {Faster Algorithms for the Construction of Parameterized Suffix Trees},
    year = {1995},
    isbn = {0818671831},
    publisher = {IEEE Computer Society},
    address = {USA},
    booktitle = {Proceedings of the 36th Annual Symposium on Foundations of Computer Science},
    pages = {631},
    url = {https://dl.acm.org/doi/10.5555/795662.796302},
    numpages = {1},
    keywords = {suffix tree algorithm, code duplication problem, pattern matching, suffix tree, algorithm theory, parameterized suffix trees, computational complexity, trees (mathematics), string matching},
    series = {FOCS ’95}
    }

  • P. Devanbu, “On “a framework for source code search using program patterns”,” IEEE Trans. Softw. Eng., vol. 21, iss. 12, p. 1009–1010, 1995. doi:10.1109/32.489076
    [BibTeX] [PDF]
    @article{10.1109/32.489076,
    author = {Devanbu, Prem},
    title = {On “A Framework for Source Code Search Using Program Patterns”},
    year = {1995},
    issue_date = {December 1995},
    publisher = {IEEE Press},
    volume = {21},
    number = {12},
    issn = {0098-5589},
    url = {https://doi.org/10.1109/32.489076},
    doi = {10.1109/32.489076},
    journal = {IEEE Trans. Softw. Eng.},
    month = dec,
    pages = {1009--1010},
    numpages = {2}
    }

1994

  • J. H. Johnson, “Substring matching for clone detection and change tracking,” in Proceedings 1994 international conference on software maintenance, 1994, pp. 120-126.
    [BibTeX] [PDF]
    @INPROCEEDINGS{336783,
    author={J. H. {Johnson}},
    booktitle={Proceedings 1994 International Conference on Software Maintenance},
    title={Substring matching for clone detection and change tracking},
    year={1994},
    url ={https://ieeexplore.ieee.org/document/336783},
    pages={120-126},}

  • E. Buss, R. De Mori, W. M. Gentleman, J. Henshaw, H. Johnson, K. Kontogiannis, E. Merlo, H. A. Müller, J. Mylopoulos, S. Paul, A. Prakash, M. Stanley, S. R. Tilley, J. Troster, and K. Wong, “Investigating reverse engineering technologies for the CAS program understanding project,” IBM systems journal, vol. 33, iss. 3, pp. 477-500, 1994.
    [BibTeX] [PDF]
    @ARTICLE{5387326,
    author={E. {Buss} and R. {De Mori} and W. M. {Gentleman} and J. {Henshaw} and H. {Johnson} and K. {Kontogiannis} and E. {Merlo} and H. A. {M\"{u}ller} and J. {Mylopoulos} and S. {Paul} and A. {Prakash} and M. {Stanley} and S. R. {Tilley} and J. {Troster} and K. {Wong}},
    journal={IBM Systems Journal},
    title={Investigating reverse engineering technologies for the CAS program understanding project},
    year={1994},
    url = {https://ieeexplore.ieee.org/document/5387326},
    volume={33},
    number={3},
    pages={477-500},}

  • J. H. Johnson, “Visualizing textual redundancy in legacy source,” in Proceedings of the 1994 conference of the centre for advanced studies on collaborative research, 1994, p. 32.
    [BibTeX] [PDF]
    @inproceedings{10.5555/782185.782217,
    author = {Johnson, J. Howard},
    title = {Visualizing Textual Redundancy in Legacy Source},
    year = {1994},
    publisher = {IBM Press},
    booktitle = {Proceedings of the 1994 Conference of the Centre for Advanced Studies on Collaborative Research},
    pages = {32},
    numpages = {10},
    url = {https://dl.acm.org/doi/10.5555/782185.782217},
    location = {Toronto, Ontario, Canada},
    series = {CASCON ’94}
    }

  • U. Manber, “Finding similar files in a large file system,” in Proceedings of the USENIX winter 1994 technical conference, USA, 1994, p. 2.
    [BibTeX] [PDF]
    @inproceedings{10.5555/1267074.1267076,
    author = {Manber, Udi},
    title = {Finding Similar Files in a Large File System},
    year = {1994},
    publisher = {USENIX Association},
    address = {USA},
    url = {https://dl.acm.org/doi/10.5555/1267074.1267076},
    booktitle = {Proceedings of the USENIX Winter 1994 Technical Conference},
    pages = {2},
    numpages = {1},
    location = {San Francisco, California},
    series = {WTEC’94}
    }

1993

  • K. W. Church and J. I. Helfman, “Dotplot: a program for exploring self-similarity in millions of lines of text and code,” Journal of computational and graphical statistics, vol. 2, iss. 2, pp. 153-174, 1993. doi:10.1080/10618600.1993.10474605
    [BibTeX] [PDF]
    @article{doi:10.1080/10618600.1993.10474605,
    author = { Kenneth Ward Church and Jonathan Isaac Helfman },
    title = {Dotplot: A Program for Exploring Self-Similarity in Millions of Lines of Text and Code},
    journal = {Journal of Computational and Graphical Statistics},
    volume = {2},
    number = {2},
    pages = {153-174},
    year = {1993},
    publisher = {Taylor \& Francis},
    doi = {10.1080/10618600.1993.10474605},
    url = {https://amstat.tandfonline.com/doi/abs/10.1080/10618600.1993.10474605},
    }

  • J. H. Johnson, “Identifying redundancy in source code using fingerprints,” in Proceedings of the 1993 conference of the centre for advanced studies on collaborative research: software engineering – volume 1, 1993, p. 171–183.
    [BibTeX] [PDF]
    @inproceedings{10.5555/962289.962305,
    author = {Johnson, J. Howard},
    title = {Identifying Redundancy in Source Code Using Fingerprints},
    year = {1993},
    publisher = {IBM Press},
    booktitle = {Proceedings of the 1993 Conference of the Centre for Advanced Studies on Collaborative Research: Software Engineering - Volume 1},
    pages = {171--183},
    url = {https://dl.acm.org/doi/10.5555/962289.962305},
    numpages = {13},
    location = {Toronto, Ontario, Canada},
    series = {CASCON ’93}
    }

1992

  • B. S. Baker, “A program for identifying duplicated code,” Computing science and statistics, 1992.
    [BibTeX] [PDF]
    @ARTICLE{Baker92aprogram,
    author = {Brenda S. Baker},
    url ={http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.550.4540},
    title = {A program for identifying duplicated code},
    journal = {Computing Science and Statistics},
    year = {1992}
    }

1991

  • W. Yang, “Identifying syntactic differences between two programs,” S