Reading Group
This reading group is held by the OSCAR Team. We maintain the list of papers on this page. Each participant of this reading group MUST read the coming week's paper before the face-to-face session of the reading group. The papers to be read will be drawn from, but are not limited to, the following areas: software engineering (e.g., empirical software engineering, software maintenance, and software testing) and artificial intelligence (e.g., machine learning and natural language processing).
Andrew Meneely and Laurie Williams. Socio-Technical Developer Networks: Should We Trust Our Measurements? Proceedings of the 33rd International Conference on Software Engineering (ICSE '11), pp. 281-290. [ACM]
Zuoning Yin, Ding Yuan, Yuanyuan Zhou, Shankar Pasupathy and Lakshmi N. Bairavasundaram. How Do Fixes Become Bugs? - A Comprehensive Characteristic Study on Incorrect Fixes in Commercial and Open Source Operating Systems. Proceedings of the 19th ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE '11), to appear. [PDF]
Emad Shihab, Akinori Ihara, Yasutaka Kamei, Walid M. Ibrahim, Masao Ohira, Bram Adams, Ahmed E. Hassan, Ken-ichi Matsumoto. Predicting Re-opened Bugs: A Case Study on the Eclipse Project. Proceedings of the 17th Working Conference on Reverse Engineering (WCRE '10), pp. 249-258. [IEEE]
Abstract. Bug fixing accounts for a large amount of the software maintenance resources. Generally, bugs are reported, fixed, verified and closed. However, in some cases bugs have to be re-opened. Re-opened bugs increase maintenance costs, degrade the overall user-perceived quality of the software and lead to unnecessary rework by busy practitioners. In this paper, we study and predict re-opened bugs through a case study on the Eclipse project. We structure our study along 4 dimensions: 1) the work habits dimension (e.g., the weekday on which the bug was initially closed), 2) the bug report dimension (e.g., the component in which the bug was found), 3) the bug fix dimension (e.g., the amount of time it took to perform the initial fix) and 4) the team dimension (e.g., the experience of the bug fixer). Our case study on the Eclipse Platform 3.0 project shows that the comment and description text, the time it took to fix the bug, and the component the bug was found in are the most important factors in determining whether a bug will be re-opened. Based on these dimensions we create decision trees that predict whether a bug will be re-opened after its closure. Using a combination of our dimensions, we can build explainable prediction models that achieve 62.9% precision and 84.5% recall when predicting whether a bug will be re-opened.
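Note. To make the decision-tree approach and the precision/recall figures concrete, here is a minimal sketch in Python using scikit-learn. The features, labels and data below are invented placeholders for illustration only; the paper derives its features from real Eclipse Platform 3.0 bug reports.

from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical per-bug features: time-to-fix, fixer experience, component id.
X = rng.random((500, 3))
# Hypothetical labels: 1 = the bug was later re-opened (~20% positive).
y = (rng.random(500) < 0.2).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)
clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)
pred = clf.predict(X_test)

# precision = TP / (TP + FP), recall = TP / (TP + FN), for the re-opened class.
print("precision:", precision_score(y_test, pred, zero_division=0))
print("recall:", recall_score(y_test, pred, zero_division=0))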
Jin-woo Park, Mu-woong Lee, Jinhan Kim, Seung-won Hwang, Sunghun Kim. CosTriage: A Cost-Aware Triage Algorithm for Bug Reporting Systems. Proceedings of the 25th Conference on Artificial Intelligence (AAAI '11), to appear. [PDF]
Abstract. 'Who can fix this bug?' is an important question in bug triage to "accurately" assign developers to bug reports. To address this question, recent research treats it as a problem of optimizing recommendation accuracy and proposes a solution that is essentially an instance of content-based recommendation (CBR). However, CBR is well known to cause over-specialization, recommending only the types of bugs that each developer has solved before. This problem is critical in practice, as some experienced developers could be overloaded, and this would slow the bug fixing process. In this paper, we take two directions to address this problem: First, we reformulate the problem as an optimization problem of both accuracy and cost. Second, we adopt content-boosted collaborative filtering (CBCF), combining an existing CBR with a collaborative filtering recommender (CF), which enhances the recommendation quality of either approach alone. However, unlike general recommendation scenarios, bug fix history is extremely sparse. Due to the nature of bug fixes, one bug is fixed by only one developer, which makes it challenging to pursue the above two directions. To address this challenge, we develop a topic model to reduce the sparseness and enhance the quality of CBCF. Our experimental evaluation shows that our solution efficiently reduces the cost by 30% without seriously compromising accuracy.
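Note. A minimal sketch of the accuracy-vs-cost reformulation described above. The linear combination and the alpha weight are illustrative assumptions, not the paper's actual model; the CBR suitability scores and per-developer cost estimates are assumed to be given by upstream components.

def rank_developers(cbr_scores, est_cost, alpha=0.7):
    # cbr_scores: developer -> suitability score in [0, 1] from a CBR/CBCF model.
    # est_cost: developer -> estimated fixing cost, normalized to [0, 1].
    # alpha trades recommendation accuracy against cost; this linear form
    # is an assumption made for illustration.
    combined = {dev: alpha * cbr_scores[dev] - (1 - alpha) * est_cost[dev]
                for dev in cbr_scores}
    return sorted(combined, key=combined.get, reverse=True)

# Toy usage with made-up numbers: carol edges out alice because she is
# nearly as suitable but much cheaper to assign.
print(rank_developers({"alice": 0.9, "bob": 0.6, "carol": 0.7},
                      {"alice": 0.8, "bob": 0.2, "carol": 0.3}))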
Tip. This paper is very good; we suggest reading it more than once.
Raymond P.L. Buse and Westley Weimer. Automatically Documenting Program Changes. Proceedings of the 25th IEEE/ACM International Conference on Automated Software Engineering (ASE '10), pp. 33-42. [ACM]
Abstract. Source code modifications are often documented with log messages. Such messages are a key component of software maintenance: they can help developers validate changes, locate and triage defects, and understand modifications. However, this documentation can be burdensome to create and can be incomplete or inaccurate.
We present an automatic technique for synthesizing succinct human-readable documentation for arbitrary program differences. Our algorithm is based on a combination of symbolic execution and a novel approach to code summarization. The documentation it produces describes the effect of a change on the runtime behavior of a program, including the conditions under which program behavior changes and what the new behavior is.
We compare our documentation to 250 human-written log messages from 5 popular open source projects. Employing a human study, we find that our generated documentation is suitable for supplementing or replacing 89% of existing log messages that directly describe a code change.
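Note. The paper's algorithm combines symbolic execution with code summarization, which is well beyond a short sketch. The toy below only illustrates the shape of the documentation it produces (the condition under which behavior changes, and what the new behavior is); the change facts are supplied by hand and the function is hypothetical.

def describe_change(function, condition, old_behavior, new_behavior):
    # Render one change fact as a human-readable log-message sentence.
    return (f"When {function} is called and {condition}, "
            f"the behavior changes from {old_behavior} to {new_behavior}.")

# Made-up example of the kind of message such a tool could emit.
print(describe_change("parse(data)", "data is empty",
                      "raising ValueError", "returning None"))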
Gerardo Canfora, Luigi Cerulo, Massimiliano Di Penta. Tracking Your Changes: A Language-Independent Approach. IEEE Software, vol. 26, no. 1, pp. 50-57, 2009. [IEEE]
Appendix.
a) The Unix program diff: http://en.wikipedia.org/wiki/Diff
b) CVS: http://ximbiot.com/
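Note. For experimenting with the change-tracking papers above, Python's standard difflib module can produce the same unified-diff format as the Unix diff tool referenced in the appendix. The file names and contents below are made up.

import difflib

old = ["total = price\n", "print(total)\n"]
new = ["total = price * (1 + tax)\n", "print(total)\n"]

# Emit a unified diff between the two hypothetical versions of calc.py.
for line in difflib.unified_diff(old, new, fromfile="a/calc.py",
                                 tofile="b/calc.py"):
    print(line, end="")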