Advanced SEO
COMPETITIVE INTELLIGENCE,
MODERN WEB SCRAPING, & MORE
Hello!
I am Melissa Sciorra
Sr Manager, SEO @ SmarterTravel (a TripAdvisor Company)
@mel_arroics
#cmc2019
FamilyVacationCritic.com | SmarterTravel.com | Jetsetter.com | WhatToPack.com | AirfareWatchdog.com | Oyster.com
RAISE OF HANDS
How many people in this session work on Search Engine Optimization?
How many people in this session have used Screaming Frog?
How many people in this session have used XPath?
Agenda
How SEOs and Content Teams can work together, SMARTER
Web Scraping technology & intro to XPath
Elements of webpages that can be EXTRACTED
REAL LIFE USE CASES so you can go into work next week and impress your boss
EVOLVE@mel_arroics #cmc2019
Content Strategy Workflow
Idea > Writing & Optimization > Publishing
Ideation Stage = the time to brainstorm topics.
Brainstorming topics can consist of:
- aha moments
- discovering topics through reading
- watching TV
- something you care about
“Web Scraping is a way of automating the
process of gathering information from
different websites on the Internet.”
XPath
A query language that describes a way to find and
process items in XML (and HTML) documents
(Short For XML Path Language)
It’s supported by modern web browsers
In plain ENGLISH:
You can select any element, attribute, table,
content of an element, or meta object in a webpage.
Let’s See an Example
“I want to find all <h3> tags in my blog post.”
Out of the box, SCREAMING FROG extracts the first 2 <h1> and the first 2 <h2> on each page. It doesn’t extract <h3>, and it doesn’t crawl more than those 2 header tag types.
Screaming Frog
Custom Extraction
//h1    extract all <h1>
//h2    extract all <h2>
//h3    extract all <h3>
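Outside Screaming Frog, the same three queries can be tried in a few lines of Python. This is a minimal sketch, assuming a small, well-formed sample page (the markup and the `extract_headings` helper are hypothetical, not from the talk). It uses the standard library's ElementTree, which supports only a subset of XPath and needs well-formed markup, so real pages would usually need a proper HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page (not from the talk).
PAGE = """<html><body>
<h1>Denver Guide</h1>
<h2>Where to Eat</h2>
<h3>Brunch</h3>
<h3>Dinner</h3>
</body></html>"""


def extract_headings(markup, tag):
    """Return the text of every <tag> element, like the XPath //tag."""
    root = ET.fromstring(markup)
    # ".//h3" is how ElementTree spells "//h3" relative to the root element.
    return [el.text for el in root.findall(f".//{tag}")]


h3_tags = extract_headings(PAGE, "h3")
print(h3_tags)
```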
PSA: The internet is a collection of pages. LOTS of pages.
Every website is built differently from the next.
It’s all HTML, CSS, JavaScript, etc.
Some are built well. Some are not.
Inconsistency in coding can make data collection hard.
…XPath can help!
XPath: Location Paths
XPath expressions can begin at the root of the document with /
/ selects the entire document
/html/head selects the head element and its contents only
/html/head/title selects the title element
Going node by node is important for understanding XPath, but it is not necessary in practice:
//title selects the title element no matter where it is
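To see that the node-by-node path and the shortcut land on the same element, here is a small sketch under the same assumptions as any toy example: the document below is made up and well-formed, and Python's ElementTree evaluates paths relative to the `<html>` root element, so `/html/head/title` is written as `head/title`.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document (well-formed so ElementTree can parse it).
DOC = """<html>
<head><title>Cool Things to Do in Denver</title></head>
<body><div><p>Intro</p></div></body>
</html>"""

root = ET.fromstring(DOC)  # root is the <html> element

# Node by node: the equivalent of /html/head/title, relative to <html>.
step_by_step = root.find("head/title")

# Anywhere in the document: the equivalent of //title.
anywhere = root.find(".//title")

same_node = step_by_step is anywhere
title_text = anywhere.text
print(same_node, title_text)
```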
What if you want to extract all of the links on a page?
A link is defined by <a href="www.website.com/example">anchor text</a>
Your XPath syntax should be //a/@href
This is because //@href would give you ALL href attributes, from any line of code, including <link> references to CSS and so on.
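The difference between the two queries is easy to check on a toy page. The sketch below is illustrative only: the markup is hypothetical, and because ElementTree cannot select attributes directly, it finds the elements that carry an href and then reads the attribute in Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical page with one stylesheet reference and two real links.
PAGE = """<html>
<head><link rel="stylesheet" href="/css/site.css" /></head>
<body>
<a href="https://www.website.com/example">Example</a>
<a href="/about">About</a>
</body>
</html>"""

root = ET.fromstring(PAGE)

# Equivalent of //a/@href: href values taken from <a> elements only.
anchor_links = [a.get("href") for a in root.findall(".//a[@href]")]

# Equivalent of //@href: href values from every element, <link> included.
all_hrefs = [el.get("href") for el in root.findall(".//*[@href]")]

print(anchor_links)
print(all_hrefs)
```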
The Tools You Need
Screaming Frog: https://bit.ly/29AEs8Q
Google Chrome: http://bit.ly/2CqZqp7
Scraper for Chrome: http://bit.ly/2W6dbAT
XPath Helper: https://bit.ly/2n8gtTC
Make sure Developer Tools is enabled.
Screaming Frog
SCREAMING FROG XPATH EXTRACTION
• 10 fields allow you to insert XPath, CSSPath, or RegEx to search and extract custom elements
• Includes Syntax Validator
Extract HTML Element: The selected element and all of its inner HTML content.
Extract Inner HTML: The inner HTML content of the selected element; if the selected element contains other HTML elements, they’ll be included.
Extract Text: The text content of a selected element and the text content of any sub elements.
Tip: You choose what you want to extract
Google Chrome
Developer Tools
Scraper For Chrome
XPath Helper For Chrome
How To Use XPath In Your Day-to-Day
1. Extract External Lists
“Airlines + Luggage Policy” = opportunity
1. I need to find all the airlines to create a keyword tree to provide my content team.
2. I find a Ranker.com page that lists out all airlines, but if I copy + paste into Excel it would be messy.
3. Right-click on an airline header > Scrape Similar.
Shorten to //h2/div/a to collect ALL airline <h2>
2. Extract Article Publish Date
Updated Date Schema = HIGHER CTR%
I need to provide my content team with high-performing URLs that need to be reviewed and updated.
1. Identify top pages in Google Search Console, export, and open up a page in your browser:
https://www.jetsetter.com/magazine/cool-things-to-do-in-denver/
2. Find the date of your article and right-click > Inspect.
3. Right-click on the highlighted entry > Copy > Copy XPath:
//*[@id="container-scroll"]/div/div[2]/div[2]/div[1]/div/span[2]/time
4. Close the source code and open XPath Helper. Paste your copied XPath into “Query” and make sure it returns the date result.
5. Open Screaming Frog > Configuration > Custom > Extraction.
6. Paste your XPath expression and name it. Select Extract Inner HTML. Check for the checkmark validation.
7. Paste your top URLs into Screaming Frog and crawl.
Find your extractions under Custom tab > Extraction filter
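What the crawl does with the pasted XPath can be pictured as running one query against every URL in the list. The sketch below is a rough illustration, not Screaming Frog's implementation: the pages, dates, and the `extract_publish_dates` helper are hypothetical, and the query is simplified to "the first `<time>` element" rather than the long copied path.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for crawled pages: URL path -> well-formed markup.
PAGES = {
    "/magazine/a/": "<html><body><div><span>By Staff</span>"
                    "<time>March 1, 2019</time></div></body></html>",
    "/magazine/b/": "<html><body><div><p>No date here</p></div></body></html>",
}


def extract_publish_dates(pages, path=".//time"):
    """Run one query against every page and collect the first match per URL."""
    results = {}
    for url, markup in pages.items():
        node = ET.fromstring(markup).find(path)
        # Pages without a match get None, like an empty extraction cell.
        has_text = node is not None and node.text
        results[url] = node.text.strip() if has_text else None
    return results


dates = extract_publish_dates(PAGES)
print(dates)
```

Pages that come back empty are worth a second look: the date may live in a different element on that template.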
3. Analyze Competitor’s Article Titles
Competitive Analysis = Find content gaps
I want to see the main themes of what they are writing about to begin my competitive analysis.
1. Run a crawl of the competitor’s website, or extract the highest-performing URLs from SEMRush and crawl.
2. Download <H1> or title tags.
3. Paste into a text analyzer, like online-utility.org
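The text-analyzer step is essentially a word-frequency count over the exported titles. A minimal sketch of that idea follows; the titles, the stopword list, and the `top_terms` helper are all made up for illustration, and a real analysis would use the exported <H1> or title column.

```python
import re
from collections import Counter

# Hypothetical exported titles (not real competitor data).
TITLES = [
    "Best Hotels in Denver",
    "Best Things to Do in Denver",
    "Denver Hotels for Families",
]

# Small illustrative stopword list; extend it for real use.
STOPWORDS = {"the", "to", "in", "a", "of", "for", "and", "do"}


def top_terms(titles, n=3):
    """Count non-stopword terms across all titles, most frequent first."""
    words = []
    for title in titles:
        for word in re.findall(r"[a-z']+", title.lower()):
            if word not in STOPWORDS:
                words.append(word)
    return Counter(words).most_common(n)


print(top_terms(TITLES))
```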
4. Extract YouTube Video Titles And Tags
New Video Strategy = More visibility
I need to see where to start with my video SEO strategy.
1. Visit the YouTube channel and load up videos until you can’t load any more under Channel Videos.
2. Right-click on a video title and select Scrape Similar.
3. Export to Google Docs.
4. Add YouTube.com through a CONCATENATE formula onto all URLs:
=CONCATENATE("https://www.youtube.com",B2)
5. Paste the full URLs into Screaming Frog.
6. Export the crawl into Excel to analyze title, meta description, and meta keywords.
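If the scraped paths end up in a script rather than a spreadsheet, the same prefixing step can be sketched in Python. The video paths below are placeholders, not real videos.

```python
from urllib.parse import urljoin

BASE = "https://www.youtube.com"

# Placeholder relative paths, as a scrape of video titles might return them.
scraped_paths = ["/watch?v=VIDEO_ID_1", "/watch?v=VIDEO_ID_2"]

# Same effect as the CONCATENATE formula: prefix each path with the domain.
full_urls = [urljoin(BASE, path) for path in scraped_paths]
print(full_urls)
```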
5. Find Pages With Specific Anchor Text
Extract Certain On-Page Links = More opportunity
I want to see if any of my on-page link anchor text contains “Amazon”.
1. Open Screaming Frog.
2. Enter the below formula into Configuration > Custom > Extraction:
//a[contains(translate(.,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'),'amazon')]/@href
3. Replace 'amazon' with other anchor text you want to search. Keep it lowercase, because translate() lowercases the anchor text before contains() compares it. Extract Inner HTML.
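The translate() call is only there to make the match case-insensitive. The sketch below reproduces that logic in plain Python on a made-up page; it is an emulation, since the standard library's ElementTree does not implement contains() or translate(), and the `links_with_anchor_text` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: two links mention Amazon in different letter cases.
PAGE = """<html><body>
<p><a href="https://example.com/deal">Buy on AMAZON</a></p>
<p><a href="https://example.com/hotel">Book <b>via Amazon</b> partners</a></p>
<p><a href="https://example.com/other">Read more</a></p>
</body></html>"""


def links_with_anchor_text(markup, needle):
    """Return hrefs of <a> elements whose anchor text contains needle."""
    root = ET.fromstring(markup)
    matches = []
    for a in root.iter("a"):
        # itertext() includes nested tags, like "." does in the XPath.
        anchor_text = "".join(a.itertext()).lower()
        if needle.lower() in anchor_text and a.get("href"):
            matches.append(a.get("href"))
    return matches


amazon_links = links_with_anchor_text(PAGE, "amazon")
print(amazon_links)
```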
6. Find Pages That Contain External Links From Specific Sites
Optimize Profitable Pages = More Money
I want to extract a list of all my affiliate URLs (fave.co).
1. Open Screaming Frog.
2. Enter the below formula into Configuration > Custom > Extraction (XPath is case-sensitive, so keep the element and attribute names lowercase):
//a[contains(@href,'fave.co')]/@href
3. Extract Inner HTML and crawl your website to find your URLs that contain fave.co.
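As a rough picture of what this crawl reports, the sketch below checks each page's links for the affiliate domain and lists the pages that contain at least one. The pages, the fave.co link, and the `affiliate_links` helper are hypothetical, and the substring test in Python stands in for contains(), which ElementTree does not support.

```python
import xml.etree.ElementTree as ET

# Hypothetical crawled pages: URL path -> well-formed markup.
PAGES = {
    "/packing-list/": '<html><body><a href="https://fave.co/abc123">Carry-on</a>'
                      '<a href="/about">About</a></body></html>',
    "/news/": '<html><body><a href="/contact">Contact</a></body></html>',
}


def affiliate_links(markup, domain="fave.co"):
    """Return every <a> href on the page that contains the domain."""
    root = ET.fromstring(markup)
    hrefs = [a.get("href") for a in root.findall(".//a[@href]")]
    # Like contains() in XPath, this substring test is case-sensitive.
    return [href for href in hrefs if domain in href]


report = {url: affiliate_links(markup) for url, markup in PAGES.items()}
pages_with_affiliates = [url for url, links in report.items() if links]
print(pages_with_affiliates)
```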
7. Find Your Content Fans For Outreach
Your Fans = interested IN YOU
I want to reach out to people who left comments on my site and let them know about a new piece of content.
Most users who comment on WordPress blogs enter their name and website.
If this is something you or your competitor has enabled, scrape the names and websites of the commenters to reach out and tell them about your content.
8. Analyze Which Of Your Content Performs Best
Finding Valuable Category Types = Content opp’y
I want to find which type of content gets the most organic clicks.
1. Pull the top 100 URLs from Google Search Console and paste them into Screaming Frog.
2. Open a sample URL and find the location of your primary tag.
3. Copy XPath (right-click, Inspect, Copy XPath).
4. Paste the formula into Screaming Frog Custom Extraction:
//*[@id="container-scroll"]/div/div[2]/div[1]/div[1]
5. Combine tag data with Google Search Console data via VLOOKUP and create a pivot table. Create a bar chart.
Clicks by Tag
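The VLOOKUP plus pivot table step amounts to a join on URL followed by a sum per tag. A minimal sketch with made-up numbers (the URLs, tags, click counts, and the `clicks_by_tag` helper are all hypothetical) looks like this:

```python
from collections import defaultdict

# Hypothetical crawl output: URL -> primary tag from the custom extraction.
tags_by_url = {"/a/": "Hotels", "/b/": "Packing", "/c/": "Hotels"}

# Hypothetical Search Console export: URL -> organic clicks.
clicks_by_url = {"/a/": 120, "/b/": 80, "/c/": 30, "/d/": 10}


def clicks_by_tag(tags, clicks):
    """Join the two tables on URL and total the clicks for each tag."""
    totals = defaultdict(int)
    for url, count in clicks.items():
        # URLs missing from the crawl are grouped as "untagged".
        totals[tags.get(url, "untagged")] += count
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked)


summary = clicks_by_tag(tags_by_url, clicks_by_url)
print(summary)
```

The resulting totals are the values the bar chart would plot.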
XPATH                                                  OUTPUT
//h1                                                   Extract all H1 tags
//h3[1]                                                Extract the first H3 tag
//h3[2]                                                Extract the second H3 tag
//div/p                                                Extract any <p> contained within a <div>
//div[@class='author']                                 Extract any <div> with class “author”
//p[@class='bio']                                      Extract any <p> with class “bio”
//*[@class='bio']                                      Extract any element with class “bio”
//ul/li[last()]                                        Extract the last <li> in a <ul>
//ol[@class='cat']/li[1]                               Extract the first <li> in an <ol> with class “cat”
count(//h2)                                            Count the number of H2s (set extraction filter to “Function Value”)
//a[contains(.,'click here')]                          Extract any link with anchor text containing “click here”
//a[starts-with(@title,'Written by')]                  Extract any link with a title starting with “Written by”
//@href                                                Extract all links (every href attribute)
//a[starts-with(@href,'mailto')]/@href                 Extract link that starts with “mailto” (email address)
//meta[@property='article:published_time']/@content    Extract the article publish date (commonly found meta tag on WP)
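A few of these expressions can be tried on a toy page with Python's standard library. This is a sketch under stated assumptions: the markup and its values are invented, ElementTree supports only part of XPath (class predicates, positions, and last() work; count() and attribute selection are done in Python instead), and the input must be well-formed.

```python
import xml.etree.ElementTree as ET

# Hypothetical page containing the pieces the expressions look for.
PAGE = """<html>
<head><meta property="article:published_time" content="2019-04-01T09:00:00+00:00" /></head>
<body>
<div class="author"><p class="bio">Travel editor</p></div>
<ul><li>One</li><li>Two</li><li>Three</li></ul>
<h2>First</h2><h2>Second</h2>
</body></html>"""

root = ET.fromstring(PAGE)

# //p[@class='bio']
bio = root.find(".//p[@class='bio']").text

# //ul/li[last()]
last_item = root.find(".//ul/li[last()]").text

# count(//h2), computed in Python because ElementTree has no count().
h2_count = len(root.findall(".//h2"))

# //meta[@property='article:published_time']/@content
meta = root.find(".//meta[@property='article:published_time']")
published = meta.get("content")

print(bio, last_item, h2_count, published)
```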
Keep Learning!
• https://www.linkedin.com/pulse/secret-increasing-organic-ctr-2019-updating-your-article-sciorra/
• https://builtvisible.com/seo-guide-to-xpath/
• https://www.screamingfrog.co.uk/web-scraping/
• https://www.w3schools.com/xml/xpath_intro.asp
• https://www.pmg.com/blog/how-to-use-xpath-in-screaming-frog/
• https://uproer.com/articles/screaming-frog-custom-extraction-xpath-regex/
• https://ahrefs.com/blog/web-scraping-for-marketers/
Thanks!
Any questions?
@mel_arroics
#cmc2019
Advanced SEO
COMPETITIVE INTELLIGENCE,
MODERN WEB SCRAPING, & MORE

Mais conteúdo relacionado

Mais procurados

SearchLove London 2016 | Dom Woodman | How to Get Insight From Your Logs
SearchLove London 2016 | Dom Woodman | How to Get Insight From Your LogsSearchLove London 2016 | Dom Woodman | How to Get Insight From Your Logs
SearchLove London 2016 | Dom Woodman | How to Get Insight From Your LogsDistilled
 
Solving Complex JavaScript Issues and Leveraging Semantic HTML5
Solving Complex JavaScript Issues and Leveraging Semantic HTML5Solving Complex JavaScript Issues and Leveraging Semantic HTML5
Solving Complex JavaScript Issues and Leveraging Semantic HTML5Hamlet Batista
 
Hey Googlebot, did you cache that ?
Hey Googlebot, did you cache that ?Hey Googlebot, did you cache that ?
Hey Googlebot, did you cache that ?Petra Kis-Herczegh
 
Mauro Cattaneo - Why hreflang is crucial to international SEO success - Brigh...
Mauro Cattaneo - Why hreflang is crucial to international SEO success - Brigh...Mauro Cattaneo - Why hreflang is crucial to international SEO success - Brigh...
Mauro Cattaneo - Why hreflang is crucial to international SEO success - Brigh...Mauro Cattaneo
 
The Ultimate Guide to Scrapebox - The Only Scrapebox Tutorial You Need
The Ultimate Guide to Scrapebox - The Only Scrapebox Tutorial You NeedThe Ultimate Guide to Scrapebox - The Only Scrapebox Tutorial You Need
The Ultimate Guide to Scrapebox - The Only Scrapebox Tutorial You Needfrankmo920
 
Headless SEO: Optimising Next Gen Sites | brightonSEO 2021
Headless SEO: Optimising Next Gen Sites | brightonSEO 2021Headless SEO: Optimising Next Gen Sites | brightonSEO 2021
Headless SEO: Optimising Next Gen Sites | brightonSEO 2021Alex Wright
 
How to scale SEO work NOBODY wants to do (including your competitors) to rapi...
How to scale SEO work NOBODY wants to do (including your competitors) to rapi...How to scale SEO work NOBODY wants to do (including your competitors) to rapi...
How to scale SEO work NOBODY wants to do (including your competitors) to rapi...Hamlet Batista
 
Split Testing for SEO - 9 Months of Learning
Split Testing for SEO - 9 Months of LearningSplit Testing for SEO - 9 Months of Learning
Split Testing for SEO - 9 Months of LearningDominic Woodman
 
So you think you know canonical tags - Sean Butcher Brighton SEO presentation
So you think you know canonical tags -  Sean Butcher Brighton SEO presentationSo you think you know canonical tags -  Sean Butcher Brighton SEO presentation
So you think you know canonical tags - Sean Butcher Brighton SEO presentationSean Butcher
 
TechSEO Boost 2021 - Rendering Strategies: Measuring the Devil’s Details in C...
TechSEO Boost 2021 - Rendering Strategies: Measuring the Devil’s Details in C...TechSEO Boost 2021 - Rendering Strategies: Measuring the Devil’s Details in C...
TechSEO Boost 2021 - Rendering Strategies: Measuring the Devil’s Details in C...Catalyst
 
Technical SEO for international markets - Leonie Mann - Brighton SEO 2021
Technical SEO for international markets- Leonie Mann - Brighton SEO 2021Technical SEO for international markets- Leonie Mann - Brighton SEO 2021
Technical SEO for international markets - Leonie Mann - Brighton SEO 2021Leonie Mann
 
Combatting Crawl Bloat & Pruning Your Content Effectively
Combatting Crawl Bloat & Pruning Your Content EffectivelyCombatting Crawl Bloat & Pruning Your Content Effectively
Combatting Crawl Bloat & Pruning Your Content EffectivelyCharlie Whitworth
 
Deep crawl the chaotic landscape of JavaScript
Deep crawl the chaotic landscape of JavaScript Deep crawl the chaotic landscape of JavaScript
Deep crawl the chaotic landscape of JavaScript Onely
 
Schema.org and the changing world of Rich Results - SEOEdinburgh Meetup
Schema.org and the changing world of Rich Results - SEOEdinburgh MeetupSchema.org and the changing world of Rich Results - SEOEdinburgh Meetup
Schema.org and the changing world of Rich Results - SEOEdinburgh MeetupGeoff Kennedy
 
SearchLove Boston 2018 - Bartosz Goralewicz - JavaScript: Looking Past the ...
SearchLove Boston 2018 -  Bartosz Goralewicz -  JavaScript: Looking Past the ...SearchLove Boston 2018 -  Bartosz Goralewicz -  JavaScript: Looking Past the ...
SearchLove Boston 2018 - Bartosz Goralewicz - JavaScript: Looking Past the ...Distilled
 
SearchLove Boston 2018 - Tom Anthony - Hacking Google: what you can learn fro...
SearchLove Boston 2018 - Tom Anthony - Hacking Google: what you can learn fro...SearchLove Boston 2018 - Tom Anthony - Hacking Google: what you can learn fro...
SearchLove Boston 2018 - Tom Anthony - Hacking Google: what you can learn fro...Distilled
 
How to build simple web apps to automate your SEO tasks - BrightonSEO Spring ...
How to build simple web apps to automate your SEO tasks - BrightonSEO Spring ...How to build simple web apps to automate your SEO tasks - BrightonSEO Spring ...
How to build simple web apps to automate your SEO tasks - BrightonSEO Spring ...Charly Wargnier
 
On-Page SEO EXTREME - SEOZone Istanbul 2013
On-Page SEO EXTREME - SEOZone Istanbul 2013On-Page SEO EXTREME - SEOZone Istanbul 2013
On-Page SEO EXTREME - SEOZone Istanbul 2013Bastian Grimm
 
Single Page Apps - Gerry White @ BrightonSEO
Single Page Apps - Gerry White @ BrightonSEOSingle Page Apps - Gerry White @ BrightonSEO
Single Page Apps - Gerry White @ BrightonSEOGerry White
 
Browser Changes That Will Impact SEO From 2019-2020
Browser Changes That Will Impact SEO From 2019-2020Browser Changes That Will Impact SEO From 2019-2020
Browser Changes That Will Impact SEO From 2019-2020Tom Anthony
 

Mais procurados (20)

SearchLove London 2016 | Dom Woodman | How to Get Insight From Your Logs
SearchLove London 2016 | Dom Woodman | How to Get Insight From Your LogsSearchLove London 2016 | Dom Woodman | How to Get Insight From Your Logs
SearchLove London 2016 | Dom Woodman | How to Get Insight From Your Logs
 
Solving Complex JavaScript Issues and Leveraging Semantic HTML5
Solving Complex JavaScript Issues and Leveraging Semantic HTML5Solving Complex JavaScript Issues and Leveraging Semantic HTML5
Solving Complex JavaScript Issues and Leveraging Semantic HTML5
 
Hey Googlebot, did you cache that ?
Hey Googlebot, did you cache that ?Hey Googlebot, did you cache that ?
Hey Googlebot, did you cache that ?
 
Mauro Cattaneo - Why hreflang is crucial to international SEO success - Brigh...
Mauro Cattaneo - Why hreflang is crucial to international SEO success - Brigh...Mauro Cattaneo - Why hreflang is crucial to international SEO success - Brigh...
Mauro Cattaneo - Why hreflang is crucial to international SEO success - Brigh...
 
The Ultimate Guide to Scrapebox - The Only Scrapebox Tutorial You Need
The Ultimate Guide to Scrapebox - The Only Scrapebox Tutorial You NeedThe Ultimate Guide to Scrapebox - The Only Scrapebox Tutorial You Need
The Ultimate Guide to Scrapebox - The Only Scrapebox Tutorial You Need
 
Headless SEO: Optimising Next Gen Sites | brightonSEO 2021
Headless SEO: Optimising Next Gen Sites | brightonSEO 2021Headless SEO: Optimising Next Gen Sites | brightonSEO 2021
Headless SEO: Optimising Next Gen Sites | brightonSEO 2021
 
How to scale SEO work NOBODY wants to do (including your competitors) to rapi...
How to scale SEO work NOBODY wants to do (including your competitors) to rapi...How to scale SEO work NOBODY wants to do (including your competitors) to rapi...
How to scale SEO work NOBODY wants to do (including your competitors) to rapi...
 
Split Testing for SEO - 9 Months of Learning
Split Testing for SEO - 9 Months of LearningSplit Testing for SEO - 9 Months of Learning
Split Testing for SEO - 9 Months of Learning
 
So you think you know canonical tags - Sean Butcher Brighton SEO presentation
So you think you know canonical tags -  Sean Butcher Brighton SEO presentationSo you think you know canonical tags -  Sean Butcher Brighton SEO presentation
So you think you know canonical tags - Sean Butcher Brighton SEO presentation
 
TechSEO Boost 2021 - Rendering Strategies: Measuring the Devil’s Details in C...
TechSEO Boost 2021 - Rendering Strategies: Measuring the Devil’s Details in C...TechSEO Boost 2021 - Rendering Strategies: Measuring the Devil’s Details in C...
TechSEO Boost 2021 - Rendering Strategies: Measuring the Devil’s Details in C...
 
Technical SEO for international markets - Leonie Mann - Brighton SEO 2021
Technical SEO for international markets- Leonie Mann - Brighton SEO 2021Technical SEO for international markets- Leonie Mann - Brighton SEO 2021
Technical SEO for international markets - Leonie Mann - Brighton SEO 2021
 
Combatting Crawl Bloat & Pruning Your Content Effectively
Combatting Crawl Bloat & Pruning Your Content EffectivelyCombatting Crawl Bloat & Pruning Your Content Effectively
Combatting Crawl Bloat & Pruning Your Content Effectively
 
Deep crawl the chaotic landscape of JavaScript
Deep crawl the chaotic landscape of JavaScript Deep crawl the chaotic landscape of JavaScript
Deep crawl the chaotic landscape of JavaScript
 
Schema.org and the changing world of Rich Results - SEOEdinburgh Meetup
Schema.org and the changing world of Rich Results - SEOEdinburgh MeetupSchema.org and the changing world of Rich Results - SEOEdinburgh Meetup
Schema.org and the changing world of Rich Results - SEOEdinburgh Meetup
 
SearchLove Boston 2018 - Bartosz Goralewicz - JavaScript: Looking Past the ...
SearchLove Boston 2018 -  Bartosz Goralewicz -  JavaScript: Looking Past the ...SearchLove Boston 2018 -  Bartosz Goralewicz -  JavaScript: Looking Past the ...
SearchLove Boston 2018 - Bartosz Goralewicz - JavaScript: Looking Past the ...
 
SearchLove Boston 2018 - Tom Anthony - Hacking Google: what you can learn fro...
SearchLove Boston 2018 - Tom Anthony - Hacking Google: what you can learn fro...SearchLove Boston 2018 - Tom Anthony - Hacking Google: what you can learn fro...
SearchLove Boston 2018 - Tom Anthony - Hacking Google: what you can learn fro...
 
How to build simple web apps to automate your SEO tasks - BrightonSEO Spring ...
How to build simple web apps to automate your SEO tasks - BrightonSEO Spring ...How to build simple web apps to automate your SEO tasks - BrightonSEO Spring ...
How to build simple web apps to automate your SEO tasks - BrightonSEO Spring ...
 
On-Page SEO EXTREME - SEOZone Istanbul 2013
On-Page SEO EXTREME - SEOZone Istanbul 2013On-Page SEO EXTREME - SEOZone Istanbul 2013
On-Page SEO EXTREME - SEOZone Istanbul 2013
 
Single Page Apps - Gerry White @ BrightonSEO
Single Page Apps - Gerry White @ BrightonSEOSingle Page Apps - Gerry White @ BrightonSEO
Single Page Apps - Gerry White @ BrightonSEO
 
Browser Changes That Will Impact SEO From 2019-2020
Browser Changes That Will Impact SEO From 2019-2020Browser Changes That Will Impact SEO From 2019-2020
Browser Changes That Will Impact SEO From 2019-2020
 

Semelhante a #CMC2019: Advanced SEO: Competitive intelligence, Web Scraping, and More.

Link Building at Scale With a Tiny Team - Sam Oh
Link Building at Scale With a Tiny Team - Sam OhLink Building at Scale With a Tiny Team - Sam Oh
Link Building at Scale With a Tiny Team - Sam OhSam Oh
 
The ultimate seo_checklist
The ultimate seo_checklistThe ultimate seo_checklist
The ultimate seo_checklistKenny Mark
 
Link Building at Scale: Big Links with a Tiny Team
Link Building at Scale: Big Links with a Tiny TeamLink Building at Scale: Big Links with a Tiny Team
Link Building at Scale: Big Links with a Tiny Team97th Floor
 
SourceCon Lab- Bookmarklets by Glenn Gutmacher Oct 2014
SourceCon Lab- Bookmarklets by Glenn Gutmacher Oct 2014SourceCon Lab- Bookmarklets by Glenn Gutmacher Oct 2014
SourceCon Lab- Bookmarklets by Glenn Gutmacher Oct 2014Glenn Gutmacher
 
Plug and Play Tools for the Recruiting Empiricist
Plug and Play Tools for the Recruiting EmpiricistPlug and Play Tools for the Recruiting Empiricist
Plug and Play Tools for the Recruiting EmpiricistJung Kim
 
Advanced Web Scraping or How To Make Internet Your Database #seoplus2018
Advanced Web Scraping or How To Make Internet Your Database #seoplus2018Advanced Web Scraping or How To Make Internet Your Database #seoplus2018
Advanced Web Scraping or How To Make Internet Your Database #seoplus2018Esteve Castells
 
SEO Presentation
SEO PresentationSEO Presentation
SEO Presentationganeh17
 
WordPress Development Confoo 2010
WordPress Development Confoo 2010WordPress Development Confoo 2010
WordPress Development Confoo 2010Brendan Sera-Shriar
 
Cut The Crap: Running Content Audits With Crawlers - Sam Marsden, Technical S...
Cut The Crap: Running Content Audits With Crawlers - Sam Marsden, Technical S...Cut The Crap: Running Content Audits With Crawlers - Sam Marsden, Technical S...
Cut The Crap: Running Content Audits With Crawlers - Sam Marsden, Technical S...DeepCrawl
 
Scrape box presentation
Scrape box presentationScrape box presentation
Scrape box presentationElephate1
 
Ultimate Guide to White Hat SEO using Scrapebox
Ultimate Guide to White Hat SEO using ScrapeboxUltimate Guide to White Hat SEO using Scrapebox
Ultimate Guide to White Hat SEO using ScrapeboxŁukasz Rogala
 
How to disrupt established markets with SEO in 2015 - LOGIN 2015
How to disrupt established markets with SEO in 2015 - LOGIN 2015How to disrupt established markets with SEO in 2015 - LOGIN 2015
How to disrupt established markets with SEO in 2015 - LOGIN 2015Yannis Karagiannidis
 
Wordpress SEO
Wordpress SEOWordpress SEO
Wordpress SEOBeFound
 
Week 12 - Search Engine Optimization
Week 12 -  Search Engine OptimizationWeek 12 -  Search Engine Optimization
Week 12 - Search Engine Optimizationhenri_makembe
 
Future of Search Engine Factors, AMP, On-Page Key to Success
Future of Search Engine Factors, AMP, On-Page Key to SuccessFuture of Search Engine Factors, AMP, On-Page Key to Success
Future of Search Engine Factors, AMP, On-Page Key to SuccessAnetwork
 
Demand quest seo training session 2 5.2018
Demand quest seo training session 2 5.2018Demand quest seo training session 2 5.2018
Demand quest seo training session 2 5.2018Nate Plaunt
 
Redefining Technical SEO, #MozCon 2019 by Paul Shapiro
Redefining Technical SEO, #MozCon 2019 by Paul ShapiroRedefining Technical SEO, #MozCon 2019 by Paul Shapiro
Redefining Technical SEO, #MozCon 2019 by Paul ShapiroPaul Shapiro
 
Redefining Technical SEO - Paul Shapiro at MozCon 2019
Redefining Technical SEO - Paul Shapiro at MozCon 2019Redefining Technical SEO - Paul Shapiro at MozCon 2019
Redefining Technical SEO - Paul Shapiro at MozCon 2019Catalyst
 

Semelhante a #CMC2019: Advanced SEO: Competitive intelligence, Web Scraping, and More. (20)

Link Building at Scale With a Tiny Team - Sam Oh
Link Building at Scale With a Tiny Team - Sam OhLink Building at Scale With a Tiny Team - Sam Oh
Link Building at Scale With a Tiny Team - Sam Oh
 
The ultimate seo_checklist
The ultimate seo_checklistThe ultimate seo_checklist
The ultimate seo_checklist
 
Link Building at Scale: Big Links with a Tiny Team
Link Building at Scale: Big Links with a Tiny TeamLink Building at Scale: Big Links with a Tiny Team
Link Building at Scale: Big Links with a Tiny Team
 
SourceCon Lab- Bookmarklets by Glenn Gutmacher Oct 2014
SourceCon Lab- Bookmarklets by Glenn Gutmacher Oct 2014SourceCon Lab- Bookmarklets by Glenn Gutmacher Oct 2014
SourceCon Lab- Bookmarklets by Glenn Gutmacher Oct 2014
 
Plug and Play Tools for the Recruiting Empiricist
Plug and Play Tools for the Recruiting EmpiricistPlug and Play Tools for the Recruiting Empiricist
Plug and Play Tools for the Recruiting Empiricist
 
Advanced Web Scraping or How To Make Internet Your Database #seoplus2018
Advanced Web Scraping or How To Make Internet Your Database #seoplus2018Advanced Web Scraping or How To Make Internet Your Database #seoplus2018
Advanced Web Scraping or How To Make Internet Your Database #seoplus2018
 
SEO Presentation
SEO PresentationSEO Presentation
SEO Presentation
 
WordPress Development Confoo 2010
WordPress Development Confoo 2010WordPress Development Confoo 2010
WordPress Development Confoo 2010
 
Cut The Crap: Running Content Audits With Crawlers - Sam Marsden, Technical S...
Cut The Crap: Running Content Audits With Crawlers - Sam Marsden, Technical S...Cut The Crap: Running Content Audits With Crawlers - Sam Marsden, Technical S...
Cut The Crap: Running Content Audits With Crawlers - Sam Marsden, Technical S...
 
Meta tag creation
Meta tag creationMeta tag creation
Meta tag creation
 
Scrape box presentation
Scrape box presentationScrape box presentation
Scrape box presentation
 
Ultimate Guide to White Hat SEO using Scrapebox
Ultimate Guide to White Hat SEO using ScrapeboxUltimate Guide to White Hat SEO using Scrapebox
Ultimate Guide to White Hat SEO using Scrapebox
 
How to disrupt established markets with SEO in 2015 - LOGIN 2015
How to disrupt established markets with SEO in 2015 - LOGIN 2015How to disrupt established markets with SEO in 2015 - LOGIN 2015
How to disrupt established markets with SEO in 2015 - LOGIN 2015
 
Wordpress SEO
Wordpress SEOWordpress SEO
Wordpress SEO
 
Week 12 - Search Engine Optimization
Week 12 -  Search Engine OptimizationWeek 12 -  Search Engine Optimization
Week 12 - Search Engine Optimization
 
Future of Search Engine Factors, AMP, On-Page Key to Success
Future of Search Engine Factors, AMP, On-Page Key to SuccessFuture of Search Engine Factors, AMP, On-Page Key to Success
Future of Search Engine Factors, AMP, On-Page Key to Success
 
Demand quest seo training session 2 5.2018
Demand quest seo training session 2 5.2018Demand quest seo training session 2 5.2018
Demand quest seo training session 2 5.2018
 
Redefining Technical SEO, #MozCon 2019 by Paul Shapiro
Redefining Technical SEO, #MozCon 2019 by Paul ShapiroRedefining Technical SEO, #MozCon 2019 by Paul Shapiro
Redefining Technical SEO, #MozCon 2019 by Paul Shapiro
 
Flavours of SEO
Flavours of SEOFlavours of SEO
Flavours of SEO
 
Redefining Technical SEO - Paul Shapiro at MozCon 2019
Redefining Technical SEO - Paul Shapiro at MozCon 2019Redefining Technical SEO - Paul Shapiro at MozCon 2019
Redefining Technical SEO - Paul Shapiro at MozCon 2019
 

Último

TAM Sports IPL 17 Advertising Report- M01 - M23
TAM Sports IPL 17 Advertising Report- M01 - M23TAM Sports IPL 17 Advertising Report- M01 - M23
TAM Sports IPL 17 Advertising Report- M01 - M23Social Samosa
 
Digital Marketing in 5G Era - Digital Transformation in 5G Age
Digital Marketing in 5G Era - Digital Transformation in 5G AgeDigital Marketing in 5G Era - Digital Transformation in 5G Age
Digital Marketing in 5G Era - Digital Transformation in 5G AgeDigiKarishma
 
Professional Sales Representative by Sahil Srivastava.pptx
Professional Sales Representative by Sahil Srivastava.pptxProfessional Sales Representative by Sahil Srivastava.pptx
Professional Sales Representative by Sahil Srivastava.pptxSahil Srivastava
 
Gen Z and Millennial Debit Card Use Survey.pdf
Gen Z and Millennial Debit Card Use Survey.pdfGen Z and Millennial Debit Card Use Survey.pdf
Gen Z and Millennial Debit Card Use Survey.pdfMedia Logic
 
Creating a Successful Digital Marketing Campaign.pdf
Creating a Successful Digital Marketing Campaign.pdfCreating a Successful Digital Marketing Campaign.pdf
Creating a Successful Digital Marketing Campaign.pdfgopzzzin
 
Miss Immigrant USA Activity Pageant Program.pdf
Miss Immigrant USA Activity Pageant Program.pdfMiss Immigrant USA Activity Pageant Program.pdf
Miss Immigrant USA Activity Pageant Program.pdfMagdalena Kulisz
 
Content Marketing: How To Find The True Value Of Your Marketing Funnel
Content Marketing: How To Find The True Value Of Your Marketing FunnelContent Marketing: How To Find The True Value Of Your Marketing Funnel
Content Marketing: How To Find The True Value Of Your Marketing FunnelSearch Engine Journal
 
#CMC2019: Advanced SEO: Competitive intelligence, Web Scraping, and More.

Editor's Notes

  1. Welcome to Advanced SEO, competitive intelligence, modern web scraping, and more.
  2. My name is Melissa Sciorra, and I’m currently the senior manager of SEO at SmarterTravel, a TripAdvisor company. We own and operate travel websites that reach nearly 200 million unique visitors each month. You may have heard of some of my sites, including Jetsetter.com, Airfarewatchdog.com, Oyster.com, and our newest site, WhatToPack.com. Feel free to tweet at me using my handle, @mel_arroics, and use the hashtag CMC2019. I want to preface this talk with a disclaimer: I’ve been in SEO for almost 9 years, and I’m by no means a developer who is proficient in Python. We all know that in SEO, things can get a little repetitive, and I’ve discovered ways of fueling my research that save time, automate processes, and provide competitive insights. This topic gets very technical very quickly, so I’m going to try to break it down to a level where anyone can understand and use these functions to make custom extraction easy.
  3. Quick poll: How many people in this session work in SEO full time? How many people in this session work in SEO part time? How many people have used Screaming Frog? How many people have never heard of Screaming Frog? How many people have used XPath? How many people have never heard of XPath?
  4. Today, we are going to learn how SEOs can automate research processes to help fuel their own competitive research, and to help provide insights to content teams. We’re going to dive into web scraping technology as it stands today, and what XPath is. We’ll go over elements of webpages that can be extracted using real-life examples, and by the end of this session, you’ll have takeaways that you can start using at work to impress your boss, your colleagues, your friends, and maybe even your mothers.
  5. Let’s dive in. We know that SEO in 2019 is still about creating really awesome content for our users. This means you and your team must continuously come up with great ideas, or find great ideas from existing posts, search query reports, or competitive analysis and content gaps. Content strategy begins with the ideation stage, and brainstorming topics can consist of aha moments, watching TV, things you are passionate about, and more.
  6. You can also come up with ideas through web scraping. That is, scraping what your competitors are doing, starting with the type of content they are writing about. What is web scraping? A way of automating the process of gathering information from different sites on the internet. The trick with web scraping is that you have to have a basic understanding of how a web page’s markup is laid out. This, plus an understanding of XPath, helps you extract data quickly and easily.
  7. So what is XPath and how can it make my life easier? XPath is a query language for selecting pieces of information in an XML document. It allows you to extract elements, attributes, and objects from the HTML in a webpage. It’s supported by most web browsers. This means that any website, your own website and your competitors’ websites, can be scraped for the information you want based on commands you write in XPath.
  8. Let’s see an example. For those of you who have used Screaming Frog before, we know that the H1 and H2 tags can be pulled automatically with every site crawl, but let’s say we want to also identify and analyze H3 tags.
  9. I’d open the custom extraction field in Screaming Frog and enter the syntax for H3, //h3. The two slashes mean: search the entire document and return every <h3> element, wherever it sits.
  10. When I enter the syntax, I can find the extraction within the custom field in Screaming Frog.
  11. But there is more to it than just copying and pasting expressions. If only it were that simple… The internet is full of webpages, each built differently from the next. The only similarity is that they are all made of HTML, CSS, and JS. XPath can help automate the process of data collection, saving you time at your keyboard to work on more strategic goals.
  12. A single slash begins at the root node and steps through the document node by node. Two slashes search the whole document.
  13. Use XPath to extract any HTML element of a webpage, whether the information you want to scrape is contained in a div, span, p, heading tag, or really any other HTML element.
  14. The Screaming Frog SEO Spider is a website crawler that allows you to crawl websites’ URLs and fetch key elements to analyse and audit technical and onsite SEO.
  15. Inspect and live-edit the HTML and CSS of a page using the Chrome DevTools Elements panel. Google Chrome has a feature that makes writing XPath easier. Using the Inspect tool, you can right-click on any element and copy the XPath syntax. It’ll often be the case that you’ll need to modify what Chrome gives you before pasting the XPath into Screaming Frog, but it at least gets you started.
  16. Scraper for Chrome is a simple and fast tool that allows you to identify and refine XPath expressions.
  17. QA your XPath queries.
  18. Let’s start off with an easy example. Our content team came up with the idea to create a large piece of content that explained luggage policy by airline after doing a few searches on Google and using SEMrush. As an SEO, I have to provide the content team with the highest-volume search terms so they can narrow down their list. I google “list of American airlines” and find a ranker.com page that lists all the airlines in America. I could copy and paste this list into Excel, but I would be left with a really messy spreadsheet that would take time to clean up. Instead, I right-click on an airline header and use my tool, “Scrape similar”.
  19. Right click > Scrape similar
  20. From here, the XPath reference is /html/body/article/h2/div/a, but I remove my root node info and put two slashes in front of my h2 to find all H2s in the document. I can then export these into Excel, put together a CONCATENATE formula based on popular luggage policy terms, and upload the results into Google AdWords to find average monthly search volume.
  21. Let’s see another example. We know that having updated content not only makes Google happy, but it also makes users happy. For example, I search for best shows on Netflix and am presented with the results in position 1 and position 2. One shows me it’s been updated in April and the other has been updated in March. Which one do you think I’m going to click into?
  22. You should make this a normal deliverable to provide to your client or content team. Here’s how you do it. First, identify your top pages in Google Search Console and export them. Open one of those pages in your browser and find the date on the page. Right-click and inspect the element, which brings up the code in the browser’s developer tools. Right-click on the highlighted entry within the code and copy the XPath. For example, on my jetsetter.com URL for cool things to do in Denver, my XPath looks like this. To QA, I’m going to open my XPath Helper and paste the XPath into it.
  23. Analyze competitors’ recent post titles. Plug them into a text analysis tool to see what the posts are about.
  24. We advise being very careful with this strategy. Remember, these people may have left a comment, but they didn’t opt into your email list. That could have been for a number of reasons, but chances are they were only really interested in this post. We, therefore, recommend using this strategy only to tell commenters about updates to the post and/or other new posts that are similar. In other words, don’t email people about stuff they’re unlikely to care about! Use the Hunter.io add-on in Google Sheets to find emails.
  25. CHEAT SHEET
  26. Resources
  27. Questions
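
For anyone who wants to try the expressions from notes 9, 12, and 20 outside Screaming Frog, here is a minimal Python sketch. It is an illustration under stated assumptions, not the workflow from the talk: the page markup, airline names, and luggage terms are made up, and it uses the standard library's `xml.etree.ElementTree`, which only supports a subset of XPath and needs well-formed markup. It also spells `//h3` as `.//h3` because paths are relative to the parsed root.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a scraped page. Real pages are
# rarely valid XML and usually need an HTML-tolerant parser first.
PAGE = """
<html>
  <body>
    <article>
      <h2><div><a href="/a">Alpha Air</a></div></h2>
      <h3>Carry-on rules</h3>
      <h2><div><a href="/b">Beta Airways</a></div></h2>
      <h3>Checked bags</h3>
    </article>
    <aside>
      <h2><div><a href="/c">Gamma Jet</a></div></h2>
    </aside>
  </body>
</html>
"""

root = ET.fromstring(PAGE)  # root is the <html> element

# Equivalent of //h3: every <h3> anywhere in the document.
h3_texts = [h3.text for h3 in root.findall(".//h3")]

# Equivalent of the absolute path /html/body/article/h2/div/a:
# it steps node by node, so it only matches links under <article>.
absolute = [a.text for a in root.findall("./body/article/h2/div/a")]

# Equivalent of //h2/div/a: the same pattern, matched anywhere in the page.
anywhere = [a.text for a in root.findall(".//h2/div/a")]

# Stand-in for the spreadsheet concatenation step: pair each airline
# with a few (hypothetical) luggage-policy terms.
TERMS = ["baggage fees", "carry on size"]
keywords = [f"{name} {term}" for name in anywhere for term in TERMS]

print(h3_texts)
print(absolute)
print(anywhere)
print(keywords)
```

The absolute path misses the link in the sidebar, while the two-slash version picks it up, which is the reason for dropping the root node info before scraping.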
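
The date check in note 22 could be scripted along these lines. This is only a sketch: the URLs, the markup, the date format, and the `span` with `class="date"` are all assumptions made for illustration, since the real XPath has to be copied from each site's own markup as described in the note. It again relies on the standard library's `xml.etree.ElementTree`, so the sample markup is kept well-formed.

```python
import xml.etree.ElementTree as ET
from datetime import date, datetime

# Hypothetical URLs and markup; a real run would fetch each top page
# exported from Google Search Console.
PAGES = {
    "/denver": '<html><body><span class="date">April 2, 2019</span></body></html>',
    "/boston": '<html><body><span class="date">March 5, 2018</span></body></html>',
}

# Assumed location of the date; replace with the XPath copied from the
# browser's developer tools for the site being checked.
DATE_XPATH = ".//span[@class='date']"


def updated_on(markup: str) -> date:
    """Return the on-page date found at DATE_XPATH."""
    node = ET.fromstring(markup).find(DATE_XPATH)
    if node is None or node.text is None:
        raise ValueError("no date found at DATE_XPATH")
    # Assumed format, e.g. "April 2, 2019"
    return datetime.strptime(node.text.strip(), "%B %d, %Y").date()


dated = {url: updated_on(markup) for url, markup in PAGES.items()}

# Oldest pages first, as a rough refresh queue for the content team.
oldest_first = sorted(dated, key=dated.get)
print(oldest_first)
```

Sorting oldest first is one possible way to turn the extracted dates into a deliverable; the talk itself only describes finding and QA-ing the date.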
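
For note 23, a plain word count is about the simplest form of text analysis and can stand in for a dedicated tool while exploring. The titles and the stopword list below are invented for illustration; in practice the titles would come from the scraped competitor pages.

```python
import re
from collections import Counter

# Hypothetical scraped post titles.
TITLES = [
    "10 Best Carry-On Bags for 2019",
    "Best Packing Tips for Europe",
    "Packing Light: The Best Travel Bags",
]

# Small illustrative stopword list; a real one would be longer.
STOPWORDS = {"for", "the", "on", "a", "an", "of", "to"}


def top_terms(titles, n=3):
    """Return the n most frequent non-stopword terms across the titles."""
    words = re.findall(r"[a-z']+", " ".join(titles).lower())
    return Counter(w for w in words if w not in STOPWORDS).most_common(n)


print(top_terms(TITLES))
```

Terms that keep recurring across a competitor's titles give a rough picture of the topics they are investing in.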