Extracting Structural Information from Bug Reports


In software engineering experiments, the description of a bug report is typically treated as natural language text, although it often contains stack traces, source code, and patches. Neglecting such structural elements loses valuable information; exploiting structure usually improves the performance of machine learning approaches. In this paper, we present a tool called infoZilla that detects structural elements in bug reports with near-perfect accuracy and allows us to extract them. We anticipate that infoZilla can be used to leverage data from bug reports at a different level of granularity, which can facilitate interesting research in the future.
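To give a feel for what separating structural elements from prose means, here is a minimal Python sketch. It is not infoZilla's implementation: the regular expressions, the function name `classify_lines`, and the three output categories are assumptions made for illustration, and the sketch only recognizes Java-style stack traces and unified-diff header lines.

```python
import re
from typing import Dict, List

# Hypothetical patterns for illustration only; not infoZilla's actual rules.
# A Java-style exception line, e.g. "java.lang.NullPointerException: msg".
EXCEPTION_LINE = re.compile(r"^\s*[\w.$]+(Exception|Error)(:.*)?\s*$")
# A Java-style stack frame, e.g. "at org.example.Main.main(Main.java:7)".
STACK_FRAME = re.compile(r"^\s*at\s+[\w.$<>]+\(.*\)\s*$")
# Unified-diff file headers and hunk headers.
DIFF_HEADER = re.compile(r"^(--- |\+\+\+ |@@ .* @@)")


def classify_lines(report: str) -> Dict[str, List[str]]:
    """Sort the lines of a bug report description into rough categories."""
    parts: Dict[str, List[str]] = {"stack_trace": [], "patch": [], "text": []}
    for line in report.splitlines():
        if EXCEPTION_LINE.match(line) or STACK_FRAME.match(line):
            parts["stack_trace"].append(line.strip())
        elif DIFF_HEADER.match(line):
            parts["patch"].append(line)
        elif line.strip():
            # Everything else that is not blank is treated as natural language.
            parts["text"].append(line.strip())
    return parts
```

Calling `classify_lines` on a report description returns a dictionary with the keys `stack_trace`, `patch`, and `text`. A line-by-line classifier like this misses multi-line context (for example, the added and removed lines inside a diff hunk fall into `text`), which is one reason a dedicated tool is needed for reliable extraction.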

Download the Full Paper

To learn more about detecting and extracting bug report contents, download the full paper.


If you would like to cite the research in your own work, please use the following citation:

@inproceedings{bettenburg2008extracting,
  author = "Nicolas Bettenburg and Rahul Premraj and Thomas Zimmermann and Sunghun Kim",
  title = "Extracting structural information from bug reports",
  booktitle = "MSR '08: Proceedings of the 2008 International Working Conference on Mining Software Repositories",
  year = "2008",
  pages = "27--30",
  location = "Leipzig, Germany",
  publisher = "ACM"
}

Legal Disclaimer

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author’s copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.