Where should I comment my code? A dataset and model for predicting locations which need comments

Earl T. Barr
Michael Ernst
Santanu Dash
International Conference on Software Engineering (ICSE) (2020)

Abstract

It is important to write code comments. Programmers should not
comment every line of code: doing so would clutter the code, and
programmers do not have time to do so in any event. Programmers
must judiciously decide where to write code comments.
We have created a machine learning model that suggests locations where a programmer should write a code comment. We
trained it on existing high-quality commented code to learn the locations that developers choose to comment. Once trained, the
model can predict comment locations in new code. We find that our models
can achieve good accuracy on this task, but there is considerable scope
for future improvement.