An Empirical Study on the Risks of Using Off-The-Shelf Techniques for Processing Mailing List Data


Mailing list repositories contain valuable information about the history of a project. Research is starting to mine this information to support developers and maintainers of long-lived software projects. However, such information exists as unstructured data that needs special processing before it can be studied. In this paper, we identify several challenges that arise when using off-the-shelf techniques for processing mailing list data. Our study highlights the importance of proper processing of mailing list data to ensure accurate research results.

Download the Full Paper

The full paper is available for download if you want to learn more about processing mailing list data.


If you would like to cite the research in your own work, please use the following citation:

  author = "Nicolas Bettenburg and Emad Shihab and Ahmed E. Hassan",
  title = "An empirical study on the risks of using off-the-shelf techniques for processing mailing list data",
  booktitle = "ICSM'09: Proceedings of the 25th IEEE International Conference on Software Maintenance",
  year = "2009",
  pages = "539--542",
  publisher = "IEEE Computer Society",
  location = "Edmonton, Alberta"

Legal Disclaimer

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author’s copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.