Faculty Scholarship 2019

A neural model for generating natural language summaries of program subroutines

Alexander LeClair, University of Notre Dame
Siyuan Jiang, Eastern Michigan University
Collin McMillan, University of Notre Dame

Document Type

Conference Proceeding

Publication Date

2019

Department/School

Computer Science

Publication Title

Proceedings - International Conference on Software Engineering

Abstract

Source code summarization, the task of creating natural language descriptions of source code behavior, is a rapidly growing research topic with applications to automatic documentation generation, program comprehension, and software maintenance. Traditional techniques relied on heuristics and templates built manually by human experts. Recently, data-driven approaches based on neural machine translation have largely overtaken template-based systems. But nearly all of these techniques rely almost entirely on programs having good internal documentation; without clear identifier names, the models fail to create good summaries. In this paper, we present a neural model that combines words from code with code structure from an AST. Unlike previous approaches, our model processes each data source as a separate input, which allows the model to learn code structure independent of the text in code. This process helps our approach provide coherent summaries in many cases even when zero internal documentation is provided. We evaluate our technique with a dataset we created from 2.1 million Java methods. We find improvement over two baseline techniques from SE literature and one from NLP literature.

Link to Published Version

10.1109/ICSE.2019.00087

Recommended Citation

LeClair, A., Jiang, S., & McMillan, C. (2019). A neural model for generating natural language summaries of program subroutines. 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE), 795–806. https://doi.org/10.1109/ICSE.2019.00087
