What Happens When You Run SRL Experiments Over a BERT Based Model?

ODSC - Open Data Science
May 20, 2021 · 10 min read

Transformers have made more progress in the past few years than NLP did in the previous generation. Standard NLU approaches first learn syntactical and lexical features to explain the structure of a sentence. Former NLP models would be trained to understand the basic syntax of language before running Semantic Role Labeling (SRL). We will see whether this preliminary training is still necessary with BERT-based models.

In this article, we will use a pretrained BERT-based model provided by the Allen Institute for AI based on the Shi and Lin (2019) paper. Shi and Lin took SRL to the next level by dropping syntactic and lexical training. We will see how this was achieved.

This article is an excerpt from the book, Transformers for Natural Language Processing by Denis Rothman — a comprehensive guide that helps NLP practitioners become an AI language understanding expert by mastering the quantum leap of Transformer neural network models.

Setting up the BERT SRL environment

We will be using a Google Colab notebook and the AllenNLP online visual text representations of SRL, available at https://demo.allennlp.org/reading-comprehension.

We will apply the following method:

1. Open SRL.ipynb, install AllenNLP, and run each sample.
2. Display the raw output of each SRL run.
3. Visualize the output using AllenNLP’s online visualization tools.
4. Display the output using AllenNLP’s online formatted text representations.

Note: The SRL model output may differ when AllenNLP changes the transformer model it uses. AllenNLP models and transformers, in general, are continuously trained and updated. Also, the datasets used for training might change. Finally, these are not rule-based algorithms that produce the same result each time. The outputs might therefore differ from one run to another and from the ones described and shown in the screenshots.

Let’s now run some SRL experiments.

SRL experiments with the BERT-based model

We will run our SRL experiments using the method described in the Setting up the BERT SRL environment section of this article. We will begin with basic samples with various sentence structures. We will then challenge the BERT-based model with some more difficult samples to explore the system’s capacity and limits.

Open SRL.ipynb and run the installation cell:

!pip install allennlp==1.0.0 allennlp-models==1.0.0
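The notebook calls the model from the command line with allennlp predict. If you prefer to stay in Python, the same pretrained archive can be loaded with AllenNLP’s Predictor API. The following is a minimal sketch, assuming the versions installed above (the allennlp_models import path used to register the SRL model may vary between releases):

from allennlp.predictors.predictor import Predictor
import allennlp_models.structured_prediction  # registers the SRL model and predictor classes

# Load the same public BERT-based SRL archive used in the cells below
predictor = Predictor.from_path(
    "https://storage.googleapis.com/allennlp-public-models/bert-base-srl-2020.03.24.tar.gz"
)

# Each entry in result["verbs"] describes one predicate and its labeled arguments
result = predictor.predict(
    sentence="Did Bob really think he could prepare a meal for 50 people in only a few hours?"
)
for frame in result["verbs"]:
    print(frame["verb"], "->", frame["description"])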

We are now ready to warm up with some basic samples.

Basic samples

Basic samples seem intuitively simple but can be tricky to analyze. Compound sentences, adjectives, adverbs, and modals are not easy for non-expert humans to identify.

Let’s begin with an easy sample for the transformer.

Sample 1

The first sample is long but relatively easy for the transformer:

“Did Bob really think he could prepare a meal for 50 people in only a few hours?”

Run the Sample 1 cell in SRL.ipynb:

!echo '{"sentence": "Did Bob really think he could prepare a meal for 50 people in only a few hours?"}' | \
allennlp predict https://storage.googleapis.com/allennlp-public-models/bert-base-srl-2020.03.24.tar.gz

The transformer identified the verb “think,” for example, as we can see in the following excerpt of the raw output of the cell:

prediction: {"verbs": [{"verb": "think", "description": "Did [ARG0: Bob] [ARGM-ADV: really] [V: think] [ARG1: he could prepare a meal for 50 people in only a few hours] ?",

If we run the sample in the AllenNLP online interface, we obtain a visual representation of the SRL task. The first verb identified is “think”:

Figure 1: Identifying the verb “think”

If we take a close look at this representation, we can detect some interesting properties of the simple BERT-based transformer, which:

– Detected the verb “think”
– Avoided the “prepare” trap that could have been interpreted as the main verb. Instead, “prepare” remained part of the argument of “think”
– Detected an adverb and labeled it

The transformer then moved to the verb “prepare,” labeled it, and analyzed its context:

Figure 2: Identifying the verb “prepare”, the arguments, and the modifiers

Again, the simple BERT-based transformer model detected a lot of information on the grammatical structure of the sentence and found:

– The verb “prepare” and isolated it
– The pronoun “he” and labeled it as an argument and did the same for “a meal for 50 people.” Both arguments are correctly related to the verb “prepare”
– That “in only a few hours” is a temporal modifier of “prepare”
– That “could” was a modal modifier that indicates the modality of a verb, such as the likelihood of an event

The text output of AllenNLP sums the analysis up:

think: Did [ARG0: Bob] [ARGM-ADV: really] [V: think] [ARG1: he could prepare a meal for 50 people in only a few hours] ?

could: Did Bob really think he [V: could] prepare a meal for 50 people in only a few hours ?

prepare: Did Bob really think [ARG0: he] [ARGM-MOD: could] [V: prepare] [ARG1: a meal for 50 people] [ARGM-TMP: in only a few hours] ?
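The description strings above are built from per-token BIO tags that the predictor also returns: the raw JSON contains the sentence tokens under “words” and a “tags” list for each verb. If you want the argument spans as data rather than as a formatted string, a small helper along these lines can group the tags back into labeled spans (a sketch based on the standard BIO convention, reusing the result object from the Python example above; it is not part of the original notebook):

def tags_to_spans(words, tags):
    """Group BIO tags such as B-ARG0 / I-ARG0 back into (label, text) spans."""
    spans, label, chunk = [], None, []
    for word, tag in zip(words, tags):
        if tag.startswith("B-"):
            if label:
                spans.append((label, " ".join(chunk)))
            label, chunk = tag[2:], [word]
        elif tag.startswith("I-") and label == tag[2:]:
            chunk.append(word)
        else:  # an "O" tag closes any open span
            if label:
                spans.append((label, " ".join(chunk)))
            label, chunk = None, []
    if label:
        spans.append((label, " ".join(chunk)))
    return spans

# Labeled spans for each frame of the Sample 1 prediction
for frame in result["verbs"]:
    print(frame["verb"], tags_to_spans(result["words"], frame["tags"]))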

We will now analyze another relatively long sentence.

Sample 2

The following sentence seems easy but contains several verbs:

“Mrs. and Mr. Tomaso went to Europe for vacation and visited Paris and first went to visit the Eiffel Tower.”

Will this confusing sentence make the transformer hesitate? Let’s see by running the Sample 2 cell of the SRL.ipynb notebook:

!echo '{"sentence": "Mrs. and Mr. Tomaso went to Europe for vacation and visited Paris and first went to visit the Eiffel Tower."}' | \
allennlp predict https://storage.googleapis.com/allennlp-public-models/bert-base-srl-2020.03.24.tar.gz

The excerpt of the output proves that the transformer correctly identified the verbs in the sentence:

prediction: {"verbs": [{"verb": "went", "description": "[ARG0: Mrs. and Mr. Tomaso] [V: went] [ARG4: to Europe] [ARGM-PRP: for vacation]

Running the sample on AllenNLP online shows that an argument was identified as the purpose of the trip:

Figure 3: Identifying the verb “went,” the arguments, and the modifier

We can interpret the arguments of the verb “went.” However, the transformer found that the modifier of the verb was the purpose of the trip. The result would not be surprising if we did not know that Shi and Lin (2019) had only built a simple BERT model to obtain this high-quality grammatical analysis.

We can also notice that “went” was correctly associated with Europe. The transformer correctly identified the verb “visited” as being related to Paris:

Figure 4: Identifying the verb “visited” and the arguments

The transformer could have associated the verb “visited” directly with the Eiffel Tower. But it didn’t. It stood its ground and made the right decision.

The final task we asked the transformer to do was to identify the context of the second use of the verb “went”. Again, it did not fall into the trap of merging all of the arguments related to the verb “went”, used twice in the sentence. Again, it correctly split the sequence and produced an excellent result:

Figure 5: Identifying the verb “went,” the argument, and the modifiers

The verb “went” was used twice, but the transformer did not fall into the trap. It even found that “first” was a temporal modifier of the verb “went.”

The formatted text output of the AllenNLP online interface sums up the excellent result obtained for this sample:

went: [ARG0: Mrs. and Mr. Tomaso] [V: went] [ARG4: to Europe] [ARGM-PRP: for vacation] and visited Paris and first went to visit the Eiffel Tower .

visited: [ARG0: Mrs. and Mr. Tomaso] went to Europe for vacation and [V: visited] [ARG1: Paris] and first went to visit the Eiffel Tower .

went: [ARG0: Mrs. and Mr. Tomaso] went to Europe for vacation and visited Paris and [ARGM-TMP: first] [V: went] [ARGM-PRP: to visit the Eiffel Tower] .

visit: [ARG0: Mrs. and Mr. Tomaso] went to Europe for vacation and visited Paris and first went to [V: visit] [ARG1: the Eiffel Tower] .
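To check programmatically which predicates received a purpose or temporal modifier, the per-token tags of each frame can be scanned directly. Here is a short sketch reusing the predictor loaded in the earlier Python example; the ARGM-PRP and ARGM-TMP labels follow the output shown above, and, as noted earlier, the exact labels may vary from one model version to another:

# Sample 2: which predicates received a purpose (ARGM-PRP) or temporal (ARGM-TMP) modifier?
sample_2 = ("Mrs. and Mr. Tomaso went to Europe for vacation and visited Paris "
            "and first went to visit the Eiffel Tower.")
result_2 = predictor.predict(sentence=sample_2)
for frame in result_2["verbs"]:
    modifiers = {tag[2:] for tag in frame["tags"] if tag[2:].startswith("ARGM-")}
    print(frame["verb"], sorted(modifiers & {"ARGM-PRP", "ARGM-TMP"}))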

Let’s run a sentence that is a bit more confusing.

Sample 3

Sample 3 will make things more difficult for our transformer model. The following sample contains the verb “drink” four times:

“John wanted to drink tea, Mary likes to drink coffee but Karim drank some cool water and Faiza would like to drink tomato juice.”

Let’s run Sample 3 in the SRL.ipynb notebook:

!echo '{"sentence": "John wanted to drink tea, Mary likes to drink coffee but Karim drank some cool water and Faiza would like to drink tomato juice."}' | \
allennlp predict https://storage.googleapis.com/allennlp-public-models/bert-base-srl-2020.03.24.tar.gz

The transformer found its way around, as shown in the following excerpts of the raw output that contain the verbs:

prediction: {"verbs": [{"verb": "wanted", "description": "[ARG0: John] [V: wanted] [ARG1: to drink tea] , Mary likes to drink coffee but Karim drank some cool water and Faiza would like to drink tomato juice."

{"verb": "likes", "description": "John wanted to drink tea, [ARG0: Mary] [V: likes] [ARG1: to drink coffee] but Karim drank some cool water and Faiza would like to drink tomato juice."

{"verb": "drank", "description": "John wanted to drink tea, Mary likes to drink coffee but [ARG0: Karim] [V: drank] [ARG1: some cool water and Faiza] would like to drink tomato juice."

{"verb": "would", "description": "John wanted to drink tea, Mary likes to drink coffee but Karim drank some cool water and Faiza [V: would] [ARGM-DIS: like] to drink tomato juice."

When we run the sentence on the AllenNLP online interface, we obtain several visual representations. We will examine two of them.

The first one is perfect: it identifies the verb “wanted” and makes the right associations:

Figure 6: Identifying the verb “wanted” and the arguments

However, when it identified the verb “drank,” it slipped “and Faiza” as an argument:

Figure 7: Identifying the verb “drank” and the arguments

The sentence meant that “Karim drank some cool water.” The presence of “and Faiza” as an argument of “drank” is debatable.

The problem has an impact on “Faiza would like to drink tomato juice”:

Figure 8: Identifying the verb “like,” the arguments, and the modifier

However, “some cool water and” is not an argument of “like.” Only “Faiza” is an argument of “like.”

The text output obtained with AllenNLP confirms the problem:

wanted: [ARG0: John] [V: wanted] [ARG1: to drink tea], Mary likes to drink coffee, but Karim drank some cool water and Faiza would like to drink tomato juice.

drink: [ARG0: John] wanted to [V: drink] [ARG1: tea], Mary likes to drink coffee, but Karim drank some cool water and Faiza would like to drink tomato juice.

likes: John wanted to drink tea, [ARG0: Mary] [V: likes] [ARG1: to drink coffee] but Karim drank some cool water and Faiza would like to drink tomato juice.

drink: John wanted to drink tea, [ARG0: Mary] likes to [V: drink] [ARG1: coffee] but Karim drank some cool water and Faiza would like to drink tomato juice.

drank: John wanted to drink tea, Mary likes to drink coffee but [ARG0: Karim] [V: drank] [ARG1: some cool water and Faiza] would like to drink tomato juice.

would: John wanted to drink tea, Mary likes to drink coffee, but Karim drank some cool water and Faiza [V: would] [ARGM-DIS: like] to drink tomato juice.

like: John wanted to drink tea, Mary likes to drink coffee, but Karim drank [ARG0: some cool water and Faiza] [ARGM-MOD: would] [V: like] [ARG1: to drink tomato juice] .

drink: John wanted to drink tea, Mary likes to drink coffee, but Karim drank [ARG0: some cool water and Faiza] would like to [V: drink] [ARG1: tomato juice].

The output is a bit fuzzy. For example, we can see that one of the arguments of the verb “like” is “some cool water and Faiza,” which is confusing:

like: John wanted to drink tea, Mary likes to drink coffee, but Karim drank [ARG0: some cool water and Faiza] [ARGM-MOD: would] [V: like] [ARG1: to drink tomato juice].
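One way to make this kind of mis-scoping easier to spot is to print only the agent (ARG0) span of each frame and compare it with the subject you expect. The following sketch reuses the predictor and the tags_to_spans helper from the earlier examples; it is an illustration, not part of the original notebook:

# Sample 3: print the agent (ARG0) the model assigned to each predicate
sample_3 = ("John wanted to drink tea, Mary likes to drink coffee but Karim drank "
            "some cool water and Faiza would like to drink tomato juice.")
result_3 = predictor.predict(sentence=sample_3)
for frame in result_3["verbs"]:
    spans = dict(tags_to_spans(result_3["words"], frame["tags"]))
    print(f'{frame["verb"]:>8}: ARG0 = {spans.get("ARG0", "-")}')

Listed this way, the questionable “some cool water and Faiza” agent attached to “like” and to the last “drink” stands out immediately.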

Summary

In this article, we explored SRL. SRL tasks are difficult for both humans and machines. Transformer models have shown that human baselines can be reached, to a certain extent, for many NLP tasks. We found that a simple BERT-based transformer can perform predicate sense disambiguation: it identified the meaning of a verb (predicate) without lexical or syntactic labeling. Shi and Lin (2019) used a standard “sentence + verb” input format to train their BERT-based transformer.

About the author

Denis Rothman graduated from Sorbonne University and Paris-Diderot University, patenting one of the very first word2matrix embedding solutions. Denis Rothman is the author of three cutting-edge AI solutions: one of the first AI cognitive chatbots more than 30 years ago; a profit-orientated AI resource optimizing system; and an AI APS (Advanced Planning and Scheduling) solution based on cognitive patterns used worldwide in aerospace, rail, energy, apparel, and many other fields. Designed initially as a cognitive AI bot for IBM, it then went on to become a robust APS solution used to this day.

Original post here.

Read more data science articles on OpenDataScience.com, including tutorials and guides from beginner to advanced levels! Subscribe to our weekly newsletter here and receive the latest news every Thursday. You can also get data science training on-demand wherever you are with our Ai+ Training platform.

