Semi-supervised Bug Triage
This page describes the semi-supervised bug triage project (ssl-triage). It lists the Java code and its configuration. The code is a prototype, not a finished application, and some implementation details have been omitted. You can add them if you like. The details can be found in our paper on semi-supervised bug triage.
To cite this project, please cite the following paper from SEKE 2010. See my publication list for details.
Jifeng Xuan, He Jiang, Zhilei Ren, Jun Yan, Zhongxuan Luo. Automatic Bug Triage using Semi-Supervised Text Classification. Proceedings of the 22nd International Conference on Software Engineering and Knowledge Engineering (SEKE 2010), Redwood City, California, USA. July 1-3, 2010, pp. 209-214.
To run the code, a Java Runtime Environment (JRE) is required (JRE 1.6 is recommended). The input files are Bugzilla bug reports in XML form. To locate the input files, set their paths in the code, in cn.edu.dlut.oscar.ssltriage.data.ConstPath.java:
  • PATH_XML_DIR = the path of the input XML files
  • PATH_XML_TEMP = the path of the temporary file; it does not need to be modified
  • PATH_CORRECT = the path of the map used to correct the XML files, which contain noise
  • PATH_STOP = the path of the stoplist (the set of stop words)
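As an illustration of this configuration step, the sketch below shows one way such a constants class might look, together with a check that the input paths exist before a run. It is not the project's ConstPath.java. Only the four constant names come from this page. The class name ConstPathSketch, the helper methods, and every path value are hypothetical placeholders. The sketch assumes Java 7 or later.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for ConstPath; only the four constant names come from the project page. */
final class ConstPathSketch {
    // Placeholder values (assumptions): point these at your own files.
    static final String PATH_XML_DIR = "data/bugzilla-xml/";
    static final String PATH_XML_TEMP = "data/temp.xml"; // normally left unchanged
    static final String PATH_CORRECT = "data/correct.map";
    static final String PATH_STOP = "data/stoplist.txt";

    /** The paths that must exist before a run (the temp file is written at run time, so it is skipped). */
    static Map<String, String> inputPaths() {
        Map<String, String> paths = new LinkedHashMap<>();
        paths.put("PATH_XML_DIR", PATH_XML_DIR);
        paths.put("PATH_CORRECT", PATH_CORRECT);
        paths.put("PATH_STOP", PATH_STOP);
        return paths;
    }

    /** Returns the names of all entries whose path does not exist on disk. */
    static List<String> missingPaths(Map<String, String> paths) {
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, String> entry : paths.entrySet()) {
            if (!Files.exists(Paths.get(entry.getValue()))) {
                missing.add(entry.getKey());
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> missing = missingPaths(inputPaths());
        if (missing.isEmpty()) {
            System.out.println("All input paths found.");
        } else {
            System.out.println("Missing input paths: " + missing);
        }
    }
}
```

Running the check before any of the algorithms gives a clear message about a wrong path instead of a failure partway through parsing.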
Four approaches are available: naive Bayes (NB), NB + EM (expectation-maximization), NB + EM + weighted recommendation list, and NB + EM + multiple mixture components. They are implemented in the following classes, respectively.
  • cn.edu.dlut.oscar.ssltriage.algo.naivebayes.NaiveBayes.java
  • cn.edu.dlut.oscar.ssltriage.algo.em.EM.java
  • cn.edu.dlut.oscar.ssltriage.algo.em.EM_weightedRecommendList.java
  • cn.edu.dlut.oscar.ssltriage.algo.em.MixEM.java
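To make the second approach (NB + EM) concrete, here is a minimal sketch of the general textbook scheme: train a multinomial naive Bayes classifier on the labeled reports, then alternate between estimating class probabilities for the unlabeled reports (E-step) and re-fitting the classifier on all reports (M-step). It is not the code in EM.java, and it does not cover the weighted recommendation list or the multiple mixture components. The class name NbEmSketch, the term-count input format, the Laplace smoothing, and the fixed iteration count are all assumptions made for illustration. The sketch assumes Java 8 or later.

```java
import java.util.Arrays;

/** Illustrative multinomial naive Bayes with EM over unlabeled documents (not the project's code). */
final class NbEmSketch {
    final int classes; // number of developers
    final int vocab;   // vocabulary size
    double[] logPrior;  // logPrior[c] = log P(class c)
    double[][] logCond; // logCond[c][w] = log P(word w | class c)

    NbEmSketch(int classes, int vocab) {
        this.classes = classes;
        this.vocab = vocab;
    }

    /** M-step: re-estimate parameters from term-count vectors weighted by class responsibilities. */
    void fit(int[][] docs, double[][] resp) {
        logPrior = new double[classes];
        logCond = new double[classes][vocab];
        for (int c = 0; c < classes; c++) {
            double classWeight = 0, totalWords = 0;
            double[] wordWeight = new double[vocab];
            for (int d = 0; d < docs.length; d++) {
                classWeight += resp[d][c];
                for (int w = 0; w < vocab; w++) {
                    wordWeight[w] += resp[d][c] * docs[d][w];
                    totalWords += resp[d][c] * docs[d][w];
                }
            }
            // Laplace (add-one) smoothing for both the prior and the word probabilities.
            logPrior[c] = Math.log((1 + classWeight) / (classes + docs.length));
            for (int w = 0; w < vocab; w++) {
                logCond[c][w] = Math.log((1 + wordWeight[w]) / (vocab + totalWords));
            }
        }
    }

    /** E-step for one document: P(class | doc), normalised in log space for stability. */
    double[] posterior(int[] doc) {
        double[] p = new double[classes];
        double max = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < classes; c++) {
            p[c] = logPrior[c];
            for (int w = 0; w < vocab; w++) p[c] += doc[w] * logCond[c][w];
            max = Math.max(max, p[c]);
        }
        double sum = 0;
        for (int c = 0; c < classes; c++) { p[c] = Math.exp(p[c] - max); sum += p[c]; }
        for (int c = 0; c < classes; c++) p[c] /= sum;
        return p;
    }

    /** Train on labeled docs only, then alternate E and M steps with the unlabeled docs included. */
    void fitSemiSupervised(int[][] labeled, int[] labels, int[][] unlabeled, int iterations) {
        int n = labeled.length, m = unlabeled.length;
        int[][] all = new int[n + m][];
        double[][] resp = new double[n + m][classes];
        for (int d = 0; d < n; d++) { all[d] = labeled[d]; resp[d][labels[d]] = 1.0; }
        for (int d = 0; d < m; d++) all[n + d] = unlabeled[d];
        fit(labeled, Arrays.copyOf(resp, n)); // initial classifier from labeled data
        for (int it = 0; it < iterations; it++) {
            for (int d = 0; d < m; d++) resp[n + d] = posterior(unlabeled[d]); // E-step
            fit(all, resp);                                                    // M-step
        }
    }

    /** Class indices ordered from most to least probable: a simple recommendation list. */
    Integer[] rank(int[] doc) {
        double[] p = posterior(doc);
        Integer[] order = new Integer[classes];
        for (int c = 0; c < classes; c++) order[c] = c;
        Arrays.sort(order, (a, b) -> Double.compare(p[b], p[a]));
        return order;
    }
}
```

To use it, convert each bug report into a term-count vector over a fixed vocabulary, call fitSemiSupervised with the labeled vectors, their developer indices, and the unlabeled vectors, and then call rank on a new report to get a recommendation list of developers. Labeled reports keep their fixed labels throughout. Only the unlabeled reports receive soft labels that change between iterations.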
If you have questions about our code or our paper, please contact xuan (at) mail (dot) dlut (dot) edu (dot) cn.