The document discusses using automated text summarization techniques to generate quality content at scale from user-generated content like online product reviews. It proposes a technical plan to download Amazon reviews, remove duplicate sentences using neural semantic textual similarity, and then generate frequently asked questions and corresponding FAQ schema by feeding the review text into a neural question generation model. The goal is to leverage user content and machine learning to automatically create helpful content for websites.
Quality Content at Scale Through Automated Text Summarization of UGC
1. Quality Content at Scale
Through Automated Text
Summarization of UGC
Hamlet Batista
2. You can't connect the dots looking
forward; you can only connect
them looking backwards. So you
have to trust that the dots will
somehow connect in your future.
― Steve Jobs
@hamletbatista
3. The SEO Impact of
Meta Descriptions
1. Meta descriptions don’t
impact rankings and are the
last element on an SEO
implementation list
2. They are often ignored by
Google (over 60% of the
time)
3. Yet, this case study showed
they can make a
measurable difference
MoreBeer Case Study, July 28, 2017
https://bit.ly/3lyqOpb
4. Learnings from
Paid Search
1. Scientific SEO A/B tests
proved the impact of meta
descriptions
2. Bootstrapping the
experiments with paid
search data proved very
successful
3. We have over 90% success
rate with this technique
Martech Advisor Article July 28, 2017
https://bit.ly/2SL7DfI
5. The Two Salesmen
1. Contrasting paid ads vs
organic snippets
2. Paid ads focus on
compelling benefits while
organic snippets focus on
copying text from the page
3. This implies search visitors
stop and read the search
results before clicking
PEC Article, January 4, 2018
https://bit.ly/3jOGRii
7. What if the Ads
Copy is Boring?
1. ColeParmer didn’t have
compelling paid ads to pull
ideas from
2. Our experiments with other
clients taught us we
needed Cialdini’s principles
in the copy
3. ColeParmer had plenty of
product reviews (Social
Proof)
ColeParmer Case Study, February 19, 2019
https://bit.ly/3dgCMAP
9. 1. Reciprocity
2. Commitment and
consistency
3. Social proof
4. Authority
5. Liking
6. Scarcity
― Robert Cialdini’s Influence Principles
https://amzn.to/31az2ML
10. Let’s Generate
FAQs from Reviews
Copy
Here is our Technical Plan:
1. Download Amazon reviews
2. Remove duplicate
sentences
3. Generate FAQs (including
FAQPage schema)
11. Downloading
Amazon Reviews
Amazon provides 130M+
customer reviews for research
purposes.
1. The files are in TSV format
2. We need to skip error rows
(incorrect tab delimiter)
3. Then, filter the reviews we
want to use (for example
purchase-verified, 5-star
reviews with a minimum
number of votes)
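
The three steps above can be sketched with the standard library alone. This is a minimal illustration, not the presenter's code: the column names (star_rating, helpful_votes, verified_purchase, review_body) are assumed to match the public Amazon reviews TSV files, the helper name load_reviews and the vote threshold are hypothetical, and SAMPLE is made-up data standing in for a downloaded file.

```python
import csv
import io


def load_reviews(lines, min_stars=5, min_votes=5):
    """Yield review rows (as dicts) that pass the filters.

    `lines` is any iterable of text lines, such as an open TSV file.
    Column names are an assumption about the dataset layout.
    """
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    for fields in reader:
        # A stray or missing tab changes the field count: skip those rows.
        if len(fields) != len(header):
            continue
        row = dict(zip(header, fields))
        try:
            stars = int(row["star_rating"])
            votes = int(row["helpful_votes"])
        except ValueError:
            continue  # non-numeric rating or vote count: treat as an error row
        if row["verified_purchase"] == "Y" and stars >= min_stars and votes >= min_votes:
            yield row


# Made-up sample standing in for a real reviews file.
SAMPLE = (
    "review_id\tstar_rating\thelpful_votes\tverified_purchase\treview_body\n"
    "R1\t5\t12\tY\tKeeps coffee hot for six hours.\n"
    "R2\t5\t0\tY\tGreat.\n"
    "R3\t5\t9\tN\tLid leaks a little.\n"
    "R4\t5\t20\tY\tbroken\trow with an extra tab\n"
    "R5\t2\t30\tY\tStopped working after a week.\n"
)

kept = list(load_reviews(io.StringIO(SAMPLE)))
print([r["review_id"] for r in kept])
```

With a real file, the same function could be called on an open file handle, for example load_reviews(open(path, encoding="utf-8")), where path points to whichever category file was downloaded.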
14. Removing Duplicate
Sentences
Many reviews have repetitive
information. For example: “the
product is great/awesome/etc.”
These don’t provide any
new/useful insights.
We can use Neural Semantic
Textual Similarity to identify and
remove duplicates, even when
the text is not syntactically the
same.
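
A rough sketch of the deduplication loop follows. It keeps a sentence only if it is not too similar to any sentence already kept. The slides do not name a specific model, so the embedding step is left pluggable: toy_embed below is a plain bag-of-words stand-in (not neural) used only so the example runs without downloads, and the 0.7 threshold and function names are illustrative assumptions. In practice, embed would be replaced by a neural sentence encoder and cosine adapted to its dense vectors.

```python
import math
import re
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def toy_embed(sentence):
    """Stand-in embedding: word counts. NOT a neural model."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def dedupe(sentences, embed=toy_embed, threshold=0.7):
    """Greedily keep sentences whose similarity to every kept one is below threshold."""
    kept, kept_vecs = [], []
    for sentence in sentences:
        vec = embed(sentence)
        if all(cosine(vec, other) < threshold for other in kept_vecs):
            kept.append(sentence)
            kept_vecs.append(vec)
    return kept


# Made-up review sentences for illustration.
sentences = [
    "This product is great!",
    "this product is GREAT",
    "Battery lasts two full days.",
    "The product is great.",
]
unique = dedupe(sentences)
print(unique)
```

The word-count stand-in only catches near-identical wording; the point of a neural encoder is that it would also place paraphrases such as "great" and "awesome" close together.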
15. Generating FAQs
(Including Schema)
Once we have useful review text,
we can use it to feed a Neural
Question Generation model and
produce FAQs.
We can also generate the
corresponding schema.
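
The schema step can be sketched as below, assuming the question generation model has already produced question and answer pairs. The pairs shown are made up for illustration and the helper name faq_schema is hypothetical; the output follows the schema.org FAQPage structure (Question items with an acceptedAnswer).

```python
import json


def faq_schema(pairs):
    """Build a schema.org FAQPage dict from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


# Hypothetical output of a question generation model run over review text.
pairs = [
    ("How long does the battery last?", "Battery lasts two full days."),
    ("Does it keep drinks hot?", "Keeps coffee hot for six hours."),
]

schema = faq_schema(pairs)

# JSON-LD is typically embedded in the page inside a script tag.
print('<script type="application/ld+json">')
print(json.dumps(schema, indent=2))
print("</script>")
```

Generated questions and answers should be reviewed before publishing, since the markup is expected to match FAQ content that is visible on the page.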
17. Resources to Learn More
1. How to Optimize Your Content for Search Questions Using Deep Learning https://blogs.bing.com/webmaster/july-2020/How-to-Optimize-Your-Content-for-Search-Questions-using-Deep-Learning
2. How to Generate Quality FAQs and FAQPage Schemas Automatically with Python https://www.searchenginejournal.com/generate-quality-faqs-faqpage-schemas-with-python/380004/
3. Example Content Brief, MarketMuse First Draft, Final Draft on the topic of Order Management [pdf] https://blog.marketmuse.com/wp-content/uploads/2020/10/Example-Content-Brief-First-Draft-Final-Draft-Order-Management.pdf For early access 👉 https://bit.ly/2GgCD4j
4. OpenAI’s gigantic GPT-3 hints at the limits of language models for AI https://www.zdnet.com/article/openais-gigantic-gpt-3-hints-at-the-limits-of-language-models-for-ai/
5. An Introduction to Python for SEO Pros Using Spreadsheets https://www.searchenginejournal.com/introduction-to-python-seo-spreadsheets/342779/