Seq2seq Natural Language Generation


I implemented a seq2seq Natural Language Generator using the fastai library and the dataset from the 2017 E2E NLG Challenge.
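To ground the setup: each E2E example pairs a flat meaning representation (MR) of attribute[value] slots with a short restaurant description to be generated. The snippet below is only an illustration of that format (the MR string is made up in the dataset's style, not an actual record):

```python
# Minimal illustration of the E2E MR format: comma-separated
# attribute[value] slots. The example string is made up in the
# dataset's style, not copied from the data.
import re

mr = "name[The Eagle], eatType[coffee shop], food[French], area[riverside]"
slots = dict(re.findall(r"(\w+)\[(.*?)\]", mr))
print(slots)
# {'name': 'The Eagle', 'eatType': 'coffee shop', 'food': 'French', 'area': 'riverside'}
```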

The code can be found here, and a detailed write-up on Medium here.

To summarize, on the plus side, I was able to generate reasonable-sounding short texts that verbalize most of the input, using a sequential model without any of the usual NLG paraphernalia (content ordering, sentence planning, lexicalization, linguistic realization, etc.).
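As a rough illustration of what skipping that pipeline means in practice, here is a minimal encoder-decoder sketch in plain PyTorch. It mirrors the general shape of such a model, not the exact fastai code from the project, and all layer sizes are made up:

```python
# A minimal vanilla seq2seq sketch: encode the linearized MR into a
# hidden state, then decode the target text from it. Hyperparameters
# are illustrative, not the ones actually trained in the project.
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, trg_vocab, emb_sz=256, hid_sz=512):
        super().__init__()
        self.enc_emb = nn.Embedding(src_vocab, emb_sz)
        self.dec_emb = nn.Embedding(trg_vocab, emb_sz)
        self.encoder = nn.GRU(emb_sz, hid_sz, batch_first=True)
        self.decoder = nn.GRU(emb_sz, hid_sz, batch_first=True)
        self.out = nn.Linear(hid_sz, trg_vocab)

    def forward(self, src, trg):
        # Encode the whole input sequence into a final hidden state.
        _, h = self.encoder(self.enc_emb(src))
        # Decode the target conditioned on that state (teacher forcing).
        dec_out, _ = self.decoder(self.dec_emb(trg), h)
        return self.out(dec_out)

model = Seq2Seq(src_vocab=1000, trg_vocab=1000)
logits = model(torch.randint(0, 1000, (2, 12)), torch.randint(0, 1000, (2, 20)))
print(logits.shape)  # torch.Size([2, 20, 1000])
```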

On the downside, the resulting texts showed little linguistic variety, which is a known limitation of this type of model.

Also, I did not find that attention improved the results compared with vanilla seq2seq. And, whilst data augmentation improved the test-set results on the BLEU metric, sampling-based beam search with reranking did not.
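For readers unfamiliar with the decoding step, the toy sketch below shows the general idea of beam search with an optional reranking pass. The `step` and `rerank` functions are hypothetical stand-ins for the project's actual model and scorer, included only to fix the idea:

```python
# Toy beam search with optional reranking. `step` is a hypothetical
# function returning (token, log-probability) pairs for the next token;
# `rerank` is a hypothetical external scorer over finished hypotheses.
import math

def beam_search(step, bos, eos, beam=4, max_len=20, rerank=None):
    beams = [([bos], 0.0)]   # each hypothesis: (tokens, cumulative log-prob)
    done = []
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            for tok, logp in step(tokens):      # expand each live hypothesis
                candidates.append((tokens + [tok], score + logp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam]: # keep the top `beam` expansions
            (done if tokens[-1] == eos else beams).append((tokens, score))
        if not beams:
            break
    hyps = done or beams
    # Rerank finished hypotheses, defaulting to length-normalized log-prob.
    key = rerank if rerank else (lambda h: h[1] / len(h[0]))
    return max(hyps, key=key)

# Degenerate "model" for demonstration: fixed next-token distribution, 1 = EOS.
def step(tokens):
    return [(2, math.log(0.6)), (1, math.log(0.3)), (3, math.log(0.1))]

print(beam_search(step, bos=0, eos=1))
```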