Concepedia

Publication | Closed Access

A neural framework for retrieval and summarization of source code

100

Citations

28

References

2018

Year

Abstract

Code retrieval and summarization are two tasks often employed by software developers to reuse code that spreads over online repositories. In this paper, we present a neural framework that allows bidirectional mapping between source code and natural language to improve these two tasks. Our framework, BVAE, is designed to have two Variational AutoEncoders (VAEs) to model bimodal data: C-VAE for source code and L-VAE for natural language. Both VAEs are trained jointly to reconstruct their input as much as possible with regularization that captures the closeness between the latent variables of code and description. BVAE could learn semantic vector representations for both code and description and generate completely new descriptions for arbitrary code snippets. We design two instance models of BVAE for retrieval and summarization tasks respectively and evaluate their performance on a benchmark which involves two programming languages: C# and SQL. Experiments demonstrate BVAE’s potential on the two tasks.

References

YearCitations

Page 1