Fault-proneness Filtering (2006~)

Detecting fault-prone modules in source code is important for assuring software quality. Most previous approaches to fault-prone module detection are based on software metrics. Such approaches, however, face difficulties in collecting the metrics and in constructing mathematical models based on them.

To mitigate these difficulties, we propose a novel approach for detecting fault-prone modules using a spam filtering technique, named Fault-Prone Filtering. Driven by the growing need for spam e-mail detection, spam filtering has matured into a convenient and effective text mining technique. In our approach, source code modules are treated as text files and fed directly to the spam filter to detect fault-prone modules.
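
The idea of treating a module as plain text and passing it to a spam-style filter can be sketched as follows. This is only an illustration under stated assumptions: the summary does not say which spam filter was used, so a small multinomial naive Bayes classifier stands in for it, and the tokenizer, the class name `NaiveBayesFilter`, and the labels `"fault-prone"` and `"clean"` are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical tokenizer: identifiers, numbers, and single punctuation marks.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\s\w]")


def tokenize(source):
    """Split raw source text into tokens, ignoring whitespace."""
    return TOKEN_RE.findall(source)


class NaiveBayesFilter:
    """Stand-in for a spam filter: multinomial naive Bayes over source tokens."""

    LABELS = ("fault-prone", "clean")

    def __init__(self):
        self.token_counts = {label: Counter() for label in self.LABELS}
        self.module_counts = Counter()

    def train(self, source, label):
        """Learn from one module whose fault status is already known."""
        self.token_counts[label].update(tokenize(source))
        self.module_counts[label] += 1

    def classify(self, source):
        """Return the label with the higher log-probability for this module."""
        vocab = set()
        for counts in self.token_counts.values():
            vocab.update(counts)
        total_modules = sum(self.module_counts.values())
        tokens = tokenize(source)
        scores = {}
        for label in self.LABELS:
            counts = self.token_counts[label]
            total_tokens = sum(counts.values())
            # Add-one smoothing on both the prior and the token likelihoods.
            score = math.log((self.module_counts[label] + 1) / (total_modules + 2))
            for tok in tokens:
                score += math.log((counts[tok] + 1) / (total_tokens + len(vocab) + 1))
            scores[label] = score
        return max(scores, key=scores.get)
```

In use, such a filter would be trained on modules whose fault history is known and then asked to classify a new or modified module before testing begins. No metrics collection step is involved, because the raw source text is the only input.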

To show the usefulness of our approach, we conducted an experiment using a large source code repository of a Java-based open source project. The results show that our approach classifies about 85% of software modules correctly. They also indicate that fault-prone modules can be detected at relatively low cost and at an early stage.

Data Mining from Software Project Data in Industries (1998~)

We have applied various data mining techniques to data
collected from industry in Japan. The aim of this study is
to identify the factors that significantly affect the quality
of the software product, the cost of development, and the
duration of development.

So far, we have used association rule mining, Bayesian belief
networks, decision trees, Bayesian classifiers, factor
analysis, and other techniques.