How can you continue to achieve amazing results with limited time and resources?

Writing high quality content that is enlightening and persuasive is still a surefire way to achieve your traffic and conversion goals.

However, the process is a tedious, manual task that is not scalable.

Fortunately, the latest advances in natural language understanding and generation offer promising and exciting results.

In his SEJ eSummit session, Hamlet Batista walked through what is currently possible, using practical examples (and code) that technical SEO professionals can follow and adapt for their own business.

Here is a summary of his presentation.

Automated, high quality content creation

Auto-completion suggestions

How many times have you run into this?

Am I the only one who is sometimes afraid of how specific and relevant Google Doc and Gmail suggestions are?

You write a text and (this whole part can be suggested).

I mean it's awesome. But it is scary. 🤪😱

– Kristina Azarenko (@azarchick) May 11, 2020


You start typing in Gmail and Google automatically completes the rest of the sentence, and it is very accurate.

You know, it's really fascinating, but at the same time, it can be really scary.

You may already be using AI technology in your work without even realizing it.

Automatic completion of Gmail

If you use the Smart Compose feature in Google Docs or Gmail, or the equivalent in Microsoft Word and Outlook, you are already using this technology.

This is part of your day as a marketer when communicating with customers.

The great thing is that this technology is not exclusive to Google.

Visit the Write With Transformer website, start typing, and press Tab to get complete sentence ideas.

Batista demonstrated how the machine starts generating lines once you enter the title and a sentence from a recent SEJ article; all you have to do is trigger the autocomplete command.


Write with Transformer

All of the text highlighted above was generated entirely by a computer.

The cool thing is that the technology that enables it is freely available and accessible to anyone who wants to use it.

Intent-based search

One of the shifts we are currently seeing in search engine optimization is the move to intent-based search.

As Mindy Weinstein puts it in her Search Engine Journal article, How to Go Deeper with Keyword Research:

"We are at a time when intent-based search is more important to us than pure volume."

"You should take the extra step to learn what questions customers ask and how they describe their problems."

"Go from keywords to questions"

This change gives us an opportunity when we write content.

The opportunity

Search engines are answering engines these days.

An effective way to write original and popular content is to answer the key questions of your target audience.

Take a look at this example for the "Python for SEO" query.

The first result shows that we can leverage content that answers questions, in this case by using FAQ schema.

FAQ search snippets take up more real estate in the SERPs.

Python for SEO

However, doing this manually for any content you want to create can be expensive and time consuming.

But what if we can automate it using AI and existing content resources?

Use existing knowledge

Most established companies already have valuable, proprietary knowledge bases that they have built up over time simply through normal interactions with customers.

These are often not yet publicly available (support emails, chats, internal wikis).

Open source AI + proprietary knowledge

Using a technique called transfer learning, we can create original, high quality content by combining proprietary knowledge bases with public deep learning models and datasets.


Transfer learning

There are differences between traditional machine learning (ML) and deep learning.

In traditional ML, you primarily carry out classifications and use existing knowledge to make predictions.

With deep learning, you can now tap into common-sense knowledge that has been built up over time by large companies such as Google, Facebook, Microsoft and others.

During the session, Batista showed how this can be done.

How to automate content creation

Below are the steps we will follow as we review automated approaches to generating questions and answers.

  • Source popular questions using online tools.
  • Answer them using two NLG approaches:
    • A span search approach.
    • A "closed book" approach.
  • Add FAQ schema and validate it with the SDTT (Structured Data Testing Tool).


Sourcing popular questions

Finding popular questions based on your keywords is not a big challenge as you can use free tools to do so.

AnswerThePublic

Just enter a keyword and you will get a lot of questions that users ask.

Question Analyzer from BuzzSumo

It aggregates questions from forums and other places, so you can also find more long-tail questions.


Other tools scrape the People Also Ask questions from Google.

Question answering system

The algorithm

Papers With Code is a great resource for cutting-edge research on question answering.

This gives you free access to the latest research results that are published.

Scientists and researchers publish their research results so they can get feedback from their colleagues.

They always challenge each other to develop a better system.


What's more interesting is that even people like us can access the code we need to answer the questions.

For this task we use T5, the Text-to-Text Transfer Transformer.

The dataset

We also need the training data that the system will use to learn how to answer questions.

The Stanford Question Answering Dataset 2.0 (SQuAD 2.0) is the most popular data set for reading comprehension.

SQuAD 2.0

Now that we have both the data set and the code, let's talk about the two approaches we can use.

  • Open-book question answering: You know where the answer is.
  • Closed-book question answering: You don't know where the answer is.


Approach No. 1: A Span Search Approach (Open Book)

With three simple lines of code, we can get the system to answer our questions.

You can do this in Google Colab.

Create a Colab notebook and enter the following:

!pip install transformers

from transformers import pipeline

# Allocate a pipeline for question answering
nlp = pipeline('question-answering')
nlp({
    'question': 'What is the name of the repository?',
    'context': 'Pipeline have been included in the huggingface/transformers repository'
})

When you run this with a question and a context that you believe contains the answer, the system searches the context for the span of text that answers the question.

{'answer': 'huggingface/transformers',
 'end': 59,
 'score': 0.5135626548884602,
 'start': 35}
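The start and end values in this result are character offsets into the context, so the answer can be recovered by slicing the context string. A minimal sketch, assuming the context is the sentence from the Hugging Face example ("Pipeline have been included in the huggingface/transformers repository"); the variable names are illustrative only.

```python
# Assumed context string (the sentence from the Hugging Face example).
context = "Pipeline have been included in the huggingface/transformers repository"

# The result reported above; start and end are positions in the context.
result = {"answer": "huggingface/transformers", "start": 35, "end": 59}

# Slicing the context with the offsets gives back the answer span.
span = context[result["start"]:result["end"]]
print(span)
```

This is why the approach is called span search: the model does not write new text, it points at a stretch of the text you supplied.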

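The steps are simple: load the Transformers library, allocate a question-answering pipeline, and provide the question and the context.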

How do you get the context?

With a few lines of code.

!pip install requests-html

from requests_html import HTMLSession
session = HTMLSession()

# URL of the article that contains the answer
url = ""

# CSS selector for the element that holds the article body
selector = "#post-328471 > div:nth-child(2) > div > div > div.sej-article-content.gototop-pos"

with session.get(url) as r:
    post = r.html.find(selector, first=True)
    text = post.text

With the requests-html library, session.get fetches the URL, which is the equivalent of navigating to it in the browser, and the selector is the path to the element that holds the block of text on the page.


You simply make one call to pull the content and store it in text, and that becomes the context.

In this case, we ask a question whose answer is contained in an SEJ article.

That means we know where the answer is. We provide the article that contains the answer.
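Once the article text is available as the context, several questions can be asked against it in one pass. The sketch below is one way to do that, not the method shown in the session: `answer_from_context` and the `min_score` cutoff of 0.3 are hypothetical, and `qa` is assumed to be any callable that behaves like the question-answering pipeline above (it takes a dict with a question and a context and returns a dict with an answer and a score).

```python
def answer_from_context(qa, questions, context, min_score=0.3):
    """Ask each question against one context and keep confident answers.

    qa is assumed to behave like the question-answering pipeline:
    qa({"question": ..., "context": ...}) -> {"answer": ..., "score": ...}
    min_score is an arbitrary, hypothetical cutoff; tune it on your own data.
    """
    answers = {}
    for question in questions:
        result = qa({"question": question, "context": context})
        if result["score"] >= min_score:
            answers[question] = result["answer"]
    return answers

# Usage with the objects defined earlier (requires the model download):
# answers = answer_from_context(nlp, ["What is the name of the repository?"], text)
```

Dropping low-scoring answers matters because a span search model always returns some span, even when the context does not really contain the answer.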

But what if we don't know which article contains the answer to the question we want to ask?

Approach No. 2: Exploring the Limits of NLG with T5 & Turing-NLG (Closed Book)

Google's T5 (an 11-billion-parameter model) and Microsoft's Turing-NLG (a 17-billion-parameter model) can answer questions without being given any context.

They are so massive that they can memorize a great many things during training.

The Google T5 team went head-to-head with the 11-billion-parameter model in a pub trivia challenge and lost.

Let's see how easy it is to train T5 to answer our own arbitrary questions.


In this example, one of the questions Batista asked was: "Who is the best search engine optimization company in the world?"

According to a model trained by Google, the best search engine optimization company in the world is SEOmoz.

Train, fine-tune, and use T5

Training T5

We'll train the 3-billion-parameter model using a free Google Colab TPU.

Here is the technical plan for using T5:


  • Copy the Colab notebook to your Google Drive.
  • Change the runtime environment to Cloud TPU.
  • Create a Google Cloud Storage bucket.
  • Enter the bucket path in the notebook.
  • Choose the 3-billion-parameter model.
  • Run the remaining cells up to the prediction step.

And now you have a model that can actually answer questions.

But how do we add your proprietary knowledge so that it can answer questions in your domain or industry from your website?

Add new proprietary training datasets

Here we go into the fine-tuning step.

Simply click on the Fine-tune option in the model.

The code includes some examples of how to create new functions and give the model new capabilities.



  • Preprocess your proprietary knowledge base into a format that works with T5.
  • Adapt the existing code for this purpose (Natural Questions, TriviaQA).

Read Batista's Search Engine Journal article, A Hands-On Introduction to Machine Learning for SEO Professionals, to learn the extract, transform, and load process for machine learning.
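As a rough illustration of the preprocessing step, the sketch below turns question and answer pairs into one tab-separated line per example. The TSV layout, the function names, and the sample pairs are all assumptions made for illustration; check the Natural Questions and TriviaQA preprocessing code in the notebook for the exact format it expects before adapting this.

```python
import io

def clean(text):
    """Collapse tabs, newlines and repeated spaces so a field stays on one line."""
    return " ".join(text.split())

def write_qa_tsv(pairs, handle):
    """Write (question, answer) pairs as 'question<TAB>answer' lines.

    Returns the number of examples written; incomplete pairs are skipped.
    """
    count = 0
    for question, answer in pairs:
        question, answer = clean(question), clean(answer)
        if not question or not answer:
            continue  # skip pairs with a missing question or answer
        handle.write(f"{question}\t{answer}\n")
        count += 1
    return count

# Hypothetical pairs standing in for support emails, chats, or wiki entries.
pairs = [
    ("How do I reset my password?", "Use the reset link\non the login page."),
    ("What is your refund policy?", ""),
]
buffer = io.StringIO()
written = write_qa_tsv(pairs, buffer)
print(written, repr(buffer.getvalue()))
```

In practice the handle would be a file that you then upload to your Cloud Storage bucket.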

Add FAQ schema

This step is straightforward.

The markup is described in the Google documentation for FAQs: Mark up your FAQs with structured data.

Google Developers FAQ markup

To do this, add the JSON-LD structure.


Do you want to do it automatically?


Batista also wrote an article about it: A practical introduction to modern JavaScript for SEOs.

With JavaScript you should be able to generate this JSON-LD.
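The same structure can also be produced on the server side. Here is a minimal Python sketch (rather than JavaScript, to match the other examples) that builds FAQPage JSON-LD from question and answer pairs; the function names and the sample pair are illustrative assumptions, while the property names follow the schema.org FAQPage type.

```python
import json

def faq_jsonld(qa_pairs):
    """Build FAQPage JSON-LD from an iterable of (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }
    # Escape "</" so an answer can never close the script tag early;
    # "\/" is a valid JSON escape for "/".
    return json.dumps(data, indent=2).replace("</", "<\\/")

def faq_script_tag(qa_pairs):
    """Wrap the JSON-LD in the script tag that goes into the page HTML."""
    return (
        '<script type="application/ld+json">\n'
        + faq_jsonld(qa_pairs)
        + "\n</script>"
    )

# Hypothetical pair, for illustration only.
print(faq_script_tag([("What is Python for SEO?", "Using Python to automate SEO tasks.")]))
```

Validate the generated markup with the SDTT before publishing it.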

Resources to learn more:

Watch this presentation

You can now watch Batista's full presentation from SEJ eSummit on June 2.

Image credits

Featured image: Paulo Bobita
All screenshots taken by author, July 2020

