Fault localization from bug reports

Once a software bug is reported, developers must determine which source files are related to it. This process is called bug localization, and automating it is important for improving developer productivity. This paper proposes an approach called DrewBL that efficiently localizes faulty files for a given bug report using word2vec, a natural language processing tool. In DrewBL, we first build a vector space model named semantic-VSM, which holds a distributed representation of the words in the bug report and the source code files, and then compute the relevance between them by feeding the constructed model to word2vec. We also present an approach called CombBL, which further improves the accuracy of bug localization by combining the proposed DrewBL with existing bug localization techniques, such as BugLocator (based on textual similarity) and Bugspots (based on bug-fixing history). This study reports early experimental results on two open source projects that show the effectiveness and efficiency of the proposed approaches.
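
The idea of scoring files against a bug report in a shared word-vector space, and then blending several scorers, can be illustrated with a small sketch. This is not the DrewBL or CombBL implementation: the three-dimensional vectors in `EMBEDDINGS` are hand-made stand-ins for a trained word2vec model, averaging word vectors and taking cosine similarity is an assumed relevance measure, and the weighted sum in `combine` is only one plausible way to merge scores. All names (`EMBEDDINGS`, `mean_vector`, `rank_files`, `combine`) and the file names are hypothetical.

```python
import math

# Toy 3-dimensional word vectors, hand-made for illustration only.
# Assumption: in practice these would come from a trained word2vec model.
EMBEDDINGS = {
    "crash":   [0.9, 0.1, 0.0],
    "null":    [0.8, 0.2, 0.1],
    "pointer": [0.7, 0.3, 0.0],
    "render":  [0.1, 0.9, 0.2],
    "canvas":  [0.0, 0.8, 0.3],
    "parse":   [0.2, 0.1, 0.9],
    "token":   [0.1, 0.0, 0.8],
}
DIM = 3


def mean_vector(words):
    """Average the vectors of all known words; unknown words are skipped."""
    vecs = [EMBEDDINGS[w] for w in words if w in EMBEDDINGS]
    if not vecs:
        return [0.0] * DIM
    return [sum(component) / len(vecs) for component in zip(*vecs)]


def cosine(u, v):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_files(report_words, files):
    """Return (file name, score) pairs sorted from most to least relevant."""
    report_vec = mean_vector(report_words)
    scores = {
        name: cosine(report_vec, mean_vector(words))
        for name, words in files.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def combine(score_maps, weights):
    """Weighted sum of per-file scores from several techniques.

    A file missing from one technique's scores contributes 0.0 for it.
    """
    names = set().union(*score_maps)
    return {
        name: sum(w * scores.get(name, 0.0)
                  for scores, w in zip(score_maps, weights))
        for name in names
    }


if __name__ == "__main__":
    report = ["crash", "null", "pointer"]
    files = {
        "Renderer.java": ["render", "canvas"],
        "Parser.java": ["parse", "token"],
        "Memory.java": ["null", "pointer", "crash"],
    }
    for name, score in rank_files(report, files):
        print(f"{name}: {score:.3f}")
```

In a combined setting, the per-file scores from a similarity-based scorer and a history-based scorer would each be passed to `combine` as one dictionary, with the weights chosen empirically.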


  • Y. Uneno, O. Mizuno, and E. Choi, "Using a Distributed Representation of Words in Localizing Relevant Files for Bug Reports," In Proc. of the 2016 IEEE International Conference on Software Quality, Reliability & Security (QRS2016), pp. 183-190, August 2016.
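  • Y. Uneno, O. Mizuno, and E. Choi, "Using Word2vec in Localizing Relevant Files for Bug Reports," IEICE Technical Report, 115(SS2015-85), pp. 55-60, March 2016.
  • Y. Uneno, "Identifying Faulty Files from Bug Reports Using a Distributed Representation of Words" (in Japanese), Master's thesis, Graduate School of Science and Technology, Kyoto Institute of Technology, 2016.
  • Y. Uneno and O. Mizuno, "A Method for Recommending Fix Targets for Bug Reports Using Word2Vec" (in Japanese), In Proc. of the Software Reliability Workshop FORCE2015, November 2015.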
  • Y. Uneno and O. Mizuno, "Identifying Bug Injected Files from Bug Description Using Word2vec," In Poster presentation of 6th International Workshop on Empirical Software Engineering in Practice (IWESEP2014), November 2014.