NLG Design Patterns: #1 Conditional Enumerations

Ludan Stoecklé
Sep 10, 2020

This is the first article of a series on NLG (Natural Language Generation) design patterns. These articles describe common situations encountered when implementing NLG systems, and how to deal with them.

A sample implementation is given using the open-source RosaeNLG, but the same approach applies to any template-based NLG system, like CoreNLG or Yseop.


“Conditional Enumerations” Intent

You want to generate a dynamic sentence that enumerates properties or characteristics, as in this sentence:

This car is ideal for a family, is practical for long travels and consumes less than 5 l per 100 km.

or in:

This car is ideal for a young couple and is practical in the city.

You could also have multiple sentences:

This car is ideal for a family. Moreover, it is practical for long travels. At last, it consumes less than 5 l per 100 km.

Use the “Conditional Enumerations” pattern when:

  • you wish to generate a sentence, or a paragraph, listing facts
  • each fact is dynamically triggered by a business condition (or, more generally, is a dynamic text)
  • you don’t know in advance how many properties will be triggered: it could be 2, 3, or more, or 1, or even 0

This pattern is very common, though it may take some effort to identify it in real-life texts.

Problem

The difficulty is managing the separators (often “,” and “and”) and putting them in the right place. When you have only one element, you have no separator; when you have 2 elements, you just use “and” between them; when you have 3 or more, you use “,” between elements and “and” just before the last one.

A custom implementation would be to push each text fragment into a dedicated list and then, depending on the size of the list, trigger different textual templates. This is tedious, and it becomes a nightmare when the elements of the list are themselves dynamically generated using NLG templates, or are lists themselves.
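
As a minimal sketch of the separator rule above, here is how it could be written by hand in Python (this is not RosaeNLG syntax). The function name `join_enumeration` and its parameters are assumptions made for illustration.

```python
def join_enumeration(elements, separator=", ", last_separator=" and "):
    """Join already-generated text fragments with enumeration separators."""
    if not elements:
        # 0 elements: nothing to output.
        return ""
    if len(elements) == 1:
        # 1 element: no separator at all.
        return elements[0]
    # 2 or more: general separator between the first elements,
    # last separator just before the final one.
    return separator.join(elements[:-1]) + last_separator + elements[-1]


facts = [
    "is ideal for a family",
    "is practical for long travels",
    "consumes less than 5 l per 100 km",
]
print("This car " + join_enumeration(facts) + ".")
```

This only covers the easy part: the fragments are plain strings that already exist. It says nothing yet about where the conditions live or what happens when a fragment is itself generated.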

The Solution

Use a dedicated NLG enumeration structure and indicate:

  • how to separate elements: “,” as the general separator, often “and” or “or” as the last separator
  • the textual content of each element to enumerate: it can be anything, such as static text, static text with a condition, or another enumeration
  • what should happen when no element is triggered, how to end the sentence, etc.
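
To make the idea concrete, here is a rough Python sketch of such an enumeration structure. It is not RosaeNLG code: `enumerate_items`, its parameters (`begin`, `separator`, `last_separator`, `end`, `if_empty`), the field names in the car data and the thresholds are all hypothetical.

```python
# Each item pairs a business condition with the text it triggers.
# Field names and thresholds are hypothetical, chosen only for the example.
ITEMS = [
    (lambda car: car["seats"] >= 5, "is ideal for a family"),
    (lambda car: car["seats"] <= 2, "is ideal for a young couple"),
    (lambda car: car["range_km"] >= 800, "is practical for long travels"),
    (lambda car: car["length_m"] < 4.0, "is practical in the city"),
    (lambda car: car["consumption"] < 5.0, "consumes less than 5 l per 100 km"),
]


def enumerate_items(items, data, begin="", separator=", ",
                    last_separator=" and ", end=".", if_empty=""):
    """Render the triggered items as one sentence, or if_empty when none trigger."""
    # Conditions are expressed once, next to the text they control.
    triggered = [text for condition, text in items if condition(data)]
    if not triggered:
        return if_empty
    if len(triggered) == 1:
        body = triggered[0]
    else:
        body = separator.join(triggered[:-1]) + last_separator + triggered[-1]
    # Surface parameters (begin, separators, end) stay apart from the conditions.
    return begin + body + end


family_car = {"seats": 5, "range_km": 900, "length_m": 4.6, "consumption": 4.8}
city_car = {"seats": 2, "range_km": 400, "length_m": 3.5, "consumption": 5.5}

print(enumerate_items(ITEMS, family_car, begin="This car "))
print(enumerate_items(ITEMS, city_car, begin="This car "))
```

With these assumed values, the two calls reproduce the two example sentences from the Intent section.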

Sample implementation using RosaeNLG, with the itemz/item structure:

[Embedded code listing: implementation using RosaeNLG, generating 1 sentence]

This will generate:

This car is ideal for a family, is practical for long travels and consumes less than 5 l per 100 km.

This car is ideal for a young couple and is practical in the city.

Generating separate sentences:

[Embedded code listing: implementation using RosaeNLG, generating multiple sentences]

This will generate:

First, this car is ideal for a family. As well, it is practical for long travels. At last, it consumes less than 5 l per 100 km.

First, this car is ideal for a young couple. At last, it is practical in the city.
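
The same idea extends to separate sentences: instead of separators, each element gets an opener that depends on its position. Here is a hedged Python sketch, where `enumerate_as_sentences` and the `first`, `middle` and `last` parameters are invented names, not RosaeNLG options:

```python
def enumerate_as_sentences(facts, first="First, this car",
                           middle="As well, it", last="At last, it"):
    """Turn each fact into its own sentence, choosing the opener by position."""
    sentences = []
    for index, fact in enumerate(facts):
        if index == 0:
            opener = first            # the first element introduces the subject
        elif index == len(facts) - 1:
            opener = last             # the last element closes the enumeration
        else:
            opener = middle           # everything in between
        sentences.append(f"{opener} {fact}.")
    return " ".join(sentences)


print(enumerate_as_sentences([
    "is ideal for a family",
    "is practical for long travels",
    "consumes less than 5 l per 100 km",
]))
print(enumerate_as_sentences([
    "is ideal for a young couple",
    "is practical in the city",
]))
```

In this sketch a single fact is treated as the first element, so it gets the "First," opener; a real system would probably want a dedicated wording for that case.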

Advantages:

  • you only express business conditions once, close to the text to generate
  • the “surface” parameters (“,”, “and” etc.) are grouped and well separated from business conditions
  • code is concise, readable, and easy to maintain

Behind the Scenes

The NLG engine works in two steps:

  • first, it goes through each item, evaluating its condition (which can be nested) and trying to generate its text, only to determine which elements are empty; at the end of this pass it also knows the size of the list
  • second, knowing which elements are not empty and the total size, it generates the target text with the proper separators in the proper places
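
The two steps above can be sketched in Python as follows. This is an illustration of the principle only, not how RosaeNLG is actually implemented: `render_enumeration` and the car fields are hypothetical, and the sketch reuses the text produced in the first pass, which is a simplification.

```python
def render_enumeration(items, data, separator=", ", last_separator=" and "):
    """Two-pass rendering: find the non-empty items, then place separators."""
    # Pass 1: run every item to see what it produces. An item is any callable
    # returning text, so its condition can be arbitrarily nested inside it.
    generated = [item(data) for item in items]
    non_empty = [text for text in generated if text]

    # Pass 2: the number of non-empty items is now known, so each separator
    # can be chosen from the position of the element it precedes.
    parts = []
    for position, text in enumerate(non_empty):
        if position > 0:
            is_last = position == len(non_empty) - 1
            parts.append(last_separator if is_last else separator)
        parts.append(text)
    return "".join(parts)


# Hypothetical items: each returns its text, or "" when its condition fails.
items = [
    lambda car: "is ideal for a family" if car["seats"] >= 5 else "",
    lambda car: "is practical in the city" if car["length_m"] < 4.0 else "",
    lambda car: "consumes less than 5 l per 100 km" if car["consumption"] < 5.0 else "",
]

car = {"seats": 5, "length_m": 4.6, "consumption": 4.8}
print("This car " + render_enumeration(items, car) + ".")
```

Because an item is just a callable, it can itself call `render_enumeration`, which is how an enumeration nested inside another one would fit into the same scheme.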

About the Author

Ludan Stoecklé has 13 years of experience in the design and implementation of NLG systems, first with NLG vendor Yseop, then with consulting firm Addventa and for BNP Paribas CIB bank. He is also the author of the open-source system RosaeNLG.
