Detecting fault-prone modules in source code is important for assuring software quality. Most previous approaches to fault-prone module detection are based on software metrics. Such approaches, however, face difficulties in collecting the metrics and in constructing mathematical models from them.
To mitigate these difficulties, we propose a novel approach, named Fault-Prone Filtering, that detects fault-prone modules using a spam filtering technique. Driven by the growing need to detect spam e-mail, spam filtering has matured into a convenient and effective technique for text mining. In our approach, source code modules are treated as text files and fed directly to the spam filter, which classifies them as fault-prone or not.
To show the usefulness of our approach, we conducted an experiment using a large source code repository of a Java-based open source project. The results show that our approach classifies about 85% of software modules correctly. They also indicate that fault-prone modules can be detected at relatively low cost at an early stage.
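The idea of treating source modules as text and passing them to a spam filter can be illustrated with a minimal sketch. The code below is not the filter used in the papers listed on this page: it assumes a small multinomial naive Bayes classifier as a stand-in for the spam filter, and the names `tokenize`, `FaultProneFilter`, `train`, and `fault_probability`, the tokenization rule, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter

# Identifiers/keywords, numbers, or any single non-space symbol (assumed rule).
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|\d+|[^\sA-Za-z0-9_]")


def tokenize(source):
    """Treat a source module as plain text and split it into tokens."""
    return TOKEN_RE.findall(source)


class FaultProneFilter:
    """Toy naive Bayes text filter (a stand-in, not the filter from the papers)."""

    LABELS = ("faulty", "clean")

    def __init__(self):
        self.token_counts = {label: Counter() for label in self.LABELS}
        self.module_counts = Counter()

    def train(self, source, label):
        """Learn from one module whose label is already known."""
        if label not in self.LABELS:
            raise ValueError(f"unknown label: {label}")
        self.token_counts[label].update(tokenize(source))
        self.module_counts[label] += 1

    def _log_score(self, tokens, label):
        counts = self.token_counts[label]
        vocab_size = len(set().union(*self.token_counts.values()))
        total = sum(counts.values())
        n_modules = sum(self.module_counts.values())
        # Smoothed class prior, then add-one smoothed token likelihoods.
        score = math.log((self.module_counts[label] + 1) / (n_modules + 2))
        for token in tokens:
            score += math.log((counts[token] + 1) / (total + vocab_size + 1))
        return score

    def fault_probability(self, source):
        """Return the estimated probability that the module is fault-prone."""
        tokens = tokenize(source)
        log_faulty = self._log_score(tokens, "faulty")
        log_clean = self._log_score(tokens, "clean")
        # Shift by the maximum so the exponentials cannot overflow.
        peak = max(log_faulty, log_clean)
        e_faulty = math.exp(log_faulty - peak)
        e_clean = math.exp(log_clean - peak)
        return e_faulty / (e_faulty + e_clean)

    def is_fault_prone(self, source, threshold=0.5):
        return self.fault_probability(source) > threshold


if __name__ == "__main__":
    spam_filter = FaultProneFilter()
    # Two made-up Java fragments serve as the training corpus.
    spam_filter.train("Object o = null; o.toString();", "faulty")
    spam_filter.train("if (o != null) { o.toString(); }", "clean")
    # A new module is scored directly from its text, with no metrics collected.
    print(spam_filter.fault_probability("Object x = null; x.hashCode();"))
```

No software metrics are computed at any point in this sketch. The only input is the raw text of each module and a known label for the training modules, which reflects the low collection cost described above.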
Related papers:
- 西浦, 門田, "Fault-proneモジュール予測における第三者データに基づいた外れ値除去," コンピュータソフトウェア, 40(4), pp. 22-28, 2023年.
- 瀬戸, 西浦, 門田, "相関ルールとランダムフォレストを組合わせたfault-proneモジュール予測の追実験," 情報処理学会論文誌, 63(8), pp. 1352-1360, 2022年8月.
- A. Yamada and O. Mizuno, "Classification of Bug Injected and Fixed Changes Using a Text Discriminator," ACIS International Journal of Software Innovation, 3(1), pp. 50-62, January 2015.
- O. Mizuno, N. Kawashima, and K. Kawamoto, "Fault-Prone Module Prediction Approaches Using Identifiers in Source Code," ACIS International Journal of Software Innovation, 3(1), pp. 36-49, January 2015.
- O. Mizuno and H. Hata, "A Metric to Detect Fault-Prone Software Modules Using Text Classifier," International Journal of Reliability and Safety, 7(1), pp. 17-31, February 2013.
- 畑, 水野, 菊野, "開発履歴メトリクスを用いた細粒度な Fault-prone モジュール予測," 情報処理学会論文誌, 53(6), pp. 1635-1643, 2012年6月.
- O. Mizuno and M. Nakai, "Can Faulty Modules Be Predicted by Warning Messages of Static Code Analyzer?," Advances in Software Engineering, 2012(924923), 8 pages, May 2012.
- 水野, 畑, "スパムフィルタを用いたFault-proneモジュール検出法の予測精度に関する従来法との比較評価," 電子情報通信学会論文誌D, J94-D(1), pp. 409-412, 2011年1月.
- O. Mizuno and H. Hata, "A Hybrid Fault-Proneness Detection Approach Using Text Filtering and Static Code Analysis," International Journal of Advancements in Computing Technology, 2(5), pp. 1-12, December 2010.
- H. Hata, O. Mizuno, and T. Kikuno, "Fault-Prone Module Detection Using Large-Scale Text Features Based on Spam Filtering," Empirical Software Engineering, 15(2), pp. 147-165, April 2010. (JCR: 1.612 (2009))
- O. Mizuno and H. Hata, "Prediction of Fault-Prone Modules Using a Text Filtering Based Metric," International Journal of Software Engineering and Its Application, 4(1), pp. 43-52, January 2010.
- O. Mizuno and T. Kikuno, "Prediction of Fault-Prone Software Modules Using a Generic Text Discriminator," IEICE Trans. on Information and Systems, E91-D(4), pp. 888-896, April 2008. (JCR: 0.369 (2008))
- 水野, 菊野, "Fault-Prone フィルタリング: 不具合を含むモジュールのスパムフィルタを利用した予測手法," SEC journal, 4(1), pp. 6-15, 2008年2月.
- T. Fujiwara, O. Mizuno, and P. Leelaprute, "Fault-Prone Byte-Code Detection Using Text Classifier," In Proc. of 16th International Conference on Product-Focused Software Process Improvement (PROFES2015), 1st International Workshop on Processes, Methods, and Tools for Engineering Embedded Systems, LNCS(9459), pp. 415-430, December 2015. (Bozen-Bolzano, Italy)
- K. Mori and O. Mizuno, "An Implementation of Just-In-Time Fault-Prone Prediction Technique Using Text Classifier," In Proc. of the 39th IEEE Computers, Software, and Applications Conference (COMPSAC 2015), pp. 609-612, July 2015. (Taichung, Taiwan)
- O. Mizuno and Y. Hirata, "A Cross-Project Evaluation of Text-Based Fault-Prone Module Prediction," In Proc. of 6th International Workshop on Empirical Software Engineering in Practice (IWESEP2014), pp. 43-48, November 2014. (Osaka, Japan) (Acceptance rate: 56%, 10/18)
- A. Yamada and O. Mizuno, "A Text Filtering Based Approach to Classify Bug Injected and Fixed Changes," In Proc. of 12th International Conference on Software Engineering Research, Management and Applications (SERA2014), pp. 680-686, August 2014. (Kitakyushu, Japan) (Acceptance rate: 59%, 19/32)
- N. Kawashima and O. Mizuno, "Predicting Fault-Prone Modules by Word Occurrence in Identifiers," In Proc. of 12th International Conference on Software Engineering Research, Management and Applications (SERA2014), Studies in Computational Intelligence, 578, pp. 87-98, August 2014. (Kitakyushu, Japan) (Acceptance rate: 59%, 19/32)
- O. Mizuno, "On Effects of Tokens in Source Code to Accuracy of Fault-Prone Module Prediction," In Proc. of the 17th International Computer Science and Engineering Conference (ICSEC2013), pp. 103-108, September 2013. (Bangkok, Thailand) (Acceptance rate: 57%, 73/128)
- K. Kawamoto and O. Mizuno, "Predicting Fault-Prone Modules Using the Length of Identifiers," In Proc. of 4th International Workshop on Empirical Software Engineering in Practice (IWESEP 2012), pp. 30-34, October 2012. (Osaka, Japan) (Acceptance rate: 57%, 8/14)
- Y. Hirata and O. Mizuno, "Investigating Effects of Tokens on Detecting Fault-Prone Modules by Text Filtering," In Proc. of 22nd International Symposium on Software Reliability Engineering (ISSRE2011), Supplemental proceedings, 3-2, November 2011. (Hiroshima, Japan)
- M. Nakai and O. Mizuno, "Fault-Prone Module Prediction by Filtering Warning Messages of Static Code Analyzer," In Proc. of the Joint Conference of the 21st International Workshop on Software Measurement and the 6th International Conference on Software Process and Product Measurement (IWSM/MENSURA2011), Fast abstracts, pp. 18-21, November 2011. (Nara, Japan)
- Y. Hirata and O. Mizuno, "Do Comments Explain Codes Adequately? -- Investigation by Text Filtering --," In Proc. of 8th Working Conference on Mining Software Repositories (MSR2011), pp. 242-245, May 2011. (Honolulu, HI, USA)
- O. Mizuno and Y. Hirata, "Fault-Prone Module Prediction Using Contents of Comment Lines," In Proc. of International Workshop on Empirical Software Engineering in Practice 2010 (IWESEP2010), pp. 39-44, December 2010. (NAIST, Nara, Japan) (Acceptance rate: 66%)
- O. Mizuno and H. Hata, "An Empirical Comparison of Fault-Prone Module Detection Approaches: Complexity Metrics and Text Feature Metrics," In Proc. of 34th Annual IEEE Computer Software and Applications Conference (COMPSAC2010), pp. 248-249, July 2010. (Seoul, Korea)
- O. Mizuno and H. Hata, "An Integrated Approach to Detect Fault-Prone Modules Using Complexity and Text Feature Metrics," In Proc. of 2010 International Conference on Advanced Science and Technology (AST2010), LNCS 6059, pp. 457-468, June 2010. (Miyazaki, Japan)
- O. Mizuno and H. Hata, "Yet Another Metric for Predicting Fault-Prone Modules," In Proc. of 2009 International Conference on Advanced Software Engineering & Its Applications (ASEA2009), CCIS 59, pp. 296-304, December 2009. (Cheju, Korea)
- H. Hata, O. Mizuno, and T. Kikuno, "Comparative Study of Fault-Proneness Filtering with PMD," In Proc. of 19th International Symposium on Software Reliability Engineering (ISSRE2008), pp. 317-318, November 2008. (Seattle/Redmond, WA, USA) (Acceptance rate: 37%, 11/30)
- H. Hata, O. Mizuno, and T. Kikuno, "An Extension of Fault-Prone Filtering Using Precise Training and a Dynamic Threshold," In Proc. of 5th Working Conference on Mining Software Repositories (MSR2008), pp. 89-97, May 2008. (Leipzig, Germany) (Acceptance rate: 19%)
- T. Kondou, O. Mizuno, and T. Kikuno, "Investigating Factors Affecting Accuracy of Fault-Prone Filtering," In Proc. of 18th International Symposium on Software Reliability Engineering (ISSRE2007), Supplemental proceedings, CD-ROM, November 2007. (Trollhattan, Sweden)
- T. Yagi, O. Mizuno, and T. Kikuno, "Analysing Effect of Pre-Training in Fault-Prone Prediction Using Spam Filter," In Proc. of 18th International Symposium on Software Reliability Engineering (ISSRE2007), Supplemental proceedings, CD-ROM, November 2007. (Trollhattan, Sweden)
- O. Mizuno, S. Ikami, S. Nakaichi, and T. Kikuno, "Fault-Prone Filtering: Detection of Fault-Prone Modules Using Spam Filtering Technique," In Proc. of 1st International Symposium on Empirical Software Engineering and Measurement (ESEM2007), pp. 374-383, September 2007. (Madrid, Spain) (Acceptance rate: 41%, 44/107)
- O. Mizuno and T. Kikuno, "Training on Errors Experiment to Detect Fault-Prone Software Modules by Spam Filter," In The 6th joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE2007), pp. 405-414, September 2007. (Dubrovnik, Croatia) (Acceptance rate: 17%, 43/251)
- M. Kimoto, O. Mizuno, and T. Kikuno, "Extraction of Fault-Prone Modules Based on Fault Tracking Data from Open Source Software Repository," In 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN2007), Supplemental Proceedings, pp. 366-367, June 2007. (Edinburgh, UK)
- O. Mizuno, S. Ikami, S. Nakaichi, and T. Kikuno, "Spam Filter Based Approach for Finding Fault-Prone Software Modules," In Proc. of Fourth International Workshop on Mining Software Repositories (MSR07), p. 4, May 2007. (Minneapolis, MN, USA) (Acceptance rate: 51%)
- 西浦, 門田, "Fault-proneモジュール予測における第三者データに基づいた外れ値除去," 第29回ソフトウェア工学の基礎ワークショップ (FOSE2022)論文集, pp. 153-158, 2022年11月.
- 森, 水野, "スパムフィルタに基づく即時バグ予測ツールの試作," ソフトウェア・シンポジウム2015, pp. 37-46, 2015年6月.
- 藤原, 水野, "バイトコードを用いたテキスト分類による不具合予測," ソフトウェア・シンポジウム2015, pp. 80-88, 2015年6月.
- 川島, 水野, "識別子中の単語情報を用いた Fault-prone モジュール予測," ソフトウェアシンポジウム2014論文集, pp. 72-80, 2014年6月. (秋田市)
- 畑, 水野, 菊野, "開発履歴メトリクスに基づくfault-proneモジュール予測の細粒度モジュールへの適用," ソフトウェアエンジニアリングシンポジウム2011(SES2011), 4, 2011年9月.
- 中井, 水野, "ソースコード静的解析結果を利用した不具合混入モジュールの予測手法の提案," ソフトウェア・シンポジウム2011, 09_研究論文 (Online only), 2011年6月. (長崎市)
- 畑, 水野, 菊野, "負例を用いない機械学習によるfault-proneモジュール検出," ソフトウェアエンジニアリングシンポジウム2009 (SES2009), pp. 133-138, 2009年9月. (東京)
- 水野, "Fault-proneness Filtering: スパムフィルタに基づく不具合混入ソフトウェアモジュールの予測手法," 生産と技術, 61(1), pp. 38-43, 2009年1月.
- 水野, 黒田, 石原, 山下, "テキスト分類による不具合予測システムの実装と企業環境での評価," 電子情報通信学会技術報告, SS2020-17, 電子情報通信学会, 2021年1月.
- 平田, 水野, "テキスト分類に基づくFault-proneモジュール検出法におけるコメント行の影響の分析," 情報処理学会研究報告 ソフトウェア工学(SE), 2010-SE-170(10), pp. 1-8, 2010年11月. (大阪大学)
- 劉, 水野, 菊野, "フォールトプローンモジュール検出手法間の精度比較 〜Fault-proneness filteringとロジスティック回帰〜," 電子情報通信学会技術報告, 108(384, KBSE2008-47), pp. 61-66, 2009年1月. (東京)
- 森井, 水野, 菊野, "ソースコード中に含まれる不具合トークンをテキスト分類に基づいて推定するツールの試作と評価," 電子情報通信学会技術研究報告, 108(64, SS2008-4), pp. 19-24, 2008年5月. (宮崎市)
- 八木, 水野, 菊野, "SPAMフィルタを用いたFault-Proneモジュールの予測 -- 異なるプロジェクトの学習結果を利用した精度評価," ソフトウェア信頼性研究会第4回ワークショップ論文集, pp. 35-43, 2007年6月. (松山市)
- 井神, 中市, 水野, 菊野, "汎用テキストフィルタを利用した不具合を含むソースコードの予測," 電子情報通信学会技術研究報告, 106(522, SS2006-75), pp. 25-30, 2007年2月. (愛知県立大学, 名古屋市)
- 川島, "識別子中の単語情報を用いたFault-proneモジュール予測手法の提案," 卒業研究報告書, 京都工芸繊維大学, 2014年2月.