Hamlet Batista | @hamletbatista | #TechSEOBoost
Python for SEO
–
Programming As a Superpower
AGENDA
–
Practical SEO applications
of Python >= 3.6 for:
Data extraction
–
Preparation
–
Analysis & Visualization
–
Machine learning
–
Deep learning
INTRO
–
Why program when you can hire
a programmer to do the work for you?
But before…
CHALLENGING SEO PROBLEMS
–
THAT NEED PROGRAMMING WORK
IBM WebSphere => SAP Hybris
IBM WebSphere Site
Category Page
(Links to one or more
Product Listing
Pages)
Product Listing Page
(Links to one or more
Product Pages)
Product Page
(Single SKU)
SAP Hybris Site
Category Page
(Links to one or more
Product Pages)
Product Page
(Single SKU)
Old Site Product Pages (717)
New Site Product Pages (442)
Product Mapping (3431)
Old Site Category Pages (371)
New Site Category Pages (147)
Category Mapping (712)
[Chart: New Users, Revenue, and Page Count by page group: Category, Home, Product, Content, Videos, Other]
[Chart: New Users by page group: Category, Home, Product, Content, Videos, Other]
Winners vs Losers
Launch Jupyter Notebook in Google Colaboratory
https://colab.research.google.com/github/ranksense/open-source/blob/master/Presentations/TechSEOBoost/2018/PythonforSEOTechSEOBoost2018_Hamlet_Batista.ipynb
Ecommerce V3 => Shopify
https://github.com/plotly/plotly.py
Solution Part 1 – Steps
Step 1:
Pull Google Analytics Data
–
Step 2:
Store Data in Pandas DataFrame
–
Step 3:
Perform Data Preparation and
Perform Basic Set Operations
CHALLENGE: Find Which Pages Lost
SEO Traffic
Python – Basics
https://pandas.pydata.org/
Python for Data Science Cheat Sheet
https://s3.amazonaws.com/assets.datacamp.com/blog_assets/PythonForDataScience.pdf
Python – Jupyter
Google Colaboratory
https://colab.research.google.com/notebooks/welcome.ipynb
Python – Pandas
https://pandas.pydata.org/
Cheat Sheet
https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf
10 Minutes to pandas
https://pandas.pydata.org/pandas-docs/stable/10min.html
Intro to Pandas for Excel Super Users
https://towardsdatascience.com/intro-to-pandas-for-excel-super-users-dac1b38f12b0
Python – Requests
WEB SCRAPING REFERENCE:
A Simple Cheat Sheet for Web Scraping with Python
https://blog.hartleybrody.com/web-scraping-cheat-sheet/
http://docs.python-requests.org/en/master/
https://ga-dev-tools.appspot.com/query-explorer/
Pulling Google Analytics Data
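The code on this slide did not survive in the transcript. As a minimal sketch of the kind of request the Query Explorer produces, the snippet below builds a query URL for organic landing-page traffic. The endpoint is assumed to be the Core Reporting API v3, and the view ID, dates, token, and the choice of `ga:newUsers` as the metric are all hypothetical placeholders.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Assumed endpoint: Google Analytics Core Reporting API v3.
GA_ENDPOINT = "https://www.googleapis.com/analytics/v3/data/ga"


def build_ga_query(view_id, start_date, end_date, access_token):
    """Return a request URL for new users per organic landing page."""
    params = {
        "ids": f"ga:{view_id}",
        "start-date": start_date,
        "end-date": end_date,
        "metrics": "ga:newUsers",
        "dimensions": "ga:landingPagePath",
        "filters": "ga:medium==organic",   # keep SEO traffic only
        "sort": "-ga:newUsers",            # biggest pages first
        "max-results": 10000,
        "access_token": access_token,      # hypothetical placeholder
    }
    return GA_ENDPOINT + "?" + urlencode(params)


url = build_ga_query("12345678", "2018-09-01", "2018-09-30", "HYPOTHETICAL_TOKEN")
query = parse_qs(urlsplit(url).query)  # round-trip check of the encoded parameters
```

Running the same query for a period before and a period after the migration gives the two datasets that the later steps compare.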
Storing Data in a DataFrame
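A short sketch of this step, assuming the API response has the `columnHeaders` and `rows` layout of a Core Reporting API v3 result. The sample values are invented for illustration.

```python
import pandas as pd

# Hypothetical response: "columnHeaders" names the columns,
# "rows" holds the values as strings.
response = {
    "columnHeaders": [{"name": "ga:landingPagePath"}, {"name": "ga:newUsers"}],
    "rows": [["/", "1200"], ["/shoes/", "340"], ["/shoes/red-sneaker", "25"]],
}

# Strip the "ga:" prefix so the column names are easier to type.
columns = [header["name"].replace("ga:", "") for header in response["columnHeaders"]]
df = pd.DataFrame(response["rows"], columns=columns)

# Metric values arrive as strings; convert them before doing arithmetic.
df["newUsers"] = df["newUsers"].astype(int)
total_new_users = int(df["newUsers"].sum())
```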
Transforming Data for Analysis
https://www.shanelynn.ie/merge-join-dataframes-python-pandas-index-1/
Join types: Left Join, Full Outer Join, Left Join (if NULL), Inner Join, Right Join, Right Join (if NULL)
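The join types map onto the `how` argument of pandas `DataFrame.merge`. The sketch below uses two invented DataFrames (traffic before and after a migration); the column names and values are assumptions, not data from the talk.

```python
import pandas as pd

before = pd.DataFrame({"page": ["/a", "/b", "/c"], "newUsers": [100, 50, 30]})
after = pd.DataFrame({"page": ["/b", "/c", "/d"], "newUsers": [60, 10, 40]})

# Full outer join keeps every page from both periods; indicator=True adds
# a "_merge" column saying which side each row came from.
merged = before.merge(
    after, on="page", how="outer",
    suffixes=("_before", "_after"), indicator=True,
)

# "Left Join (if NULL)": rows that exist only on the left side.
left_only = merged.loc[merged["_merge"] == "left_only", "page"].tolist()

# Inner join keeps only pages present in both periods.
inner = before.merge(after, on="page", how="inner", suffixes=("_before", "_after"))
```

Here `left_only` holds the pages that received traffic before the change and none after it, which is usually the first list worth inspecting.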
Transforming Data for Analysis
Pages That Lost SEO Traffic
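The "basic set operations" from the steps can be illustrated with plain Python sets. This is a sketch on invented numbers, not the notebook's code: the dictionaries stand in for organic new users per landing page before and after the launch.

```python
# Hypothetical organic new users per landing page.
before = {"/": 1200, "/shoes/": 340, "/shoes/red-sneaker": 25, "/about": 40}
after = {"/": 1150, "/shoes/": 410, "/about": 40, "/sale/": 90}

old_pages, new_pages = set(before), set(after)
disappeared = old_pages - new_pages   # traffic before, none after
appeared = new_pages - old_pages      # traffic only after
shared = old_pages & new_pages        # comparable in both periods

# Change in new users for every old page; missing pages count as zero.
delta = {page: after.get(page, 0) - before[page] for page in before}

# Pages that lost traffic, biggest loss first.
losers = sorted((page for page in delta if delta[page] < 0), key=delta.get)
```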
Solution Part 2 – Steps
Step 1:
We will crawl old pages to follow
redirects
–
Step 2:
We will group pages using regular
expressions
–
Step 3:
Repeat the previous analysis
CHALLENGE: Find Which Page Groups Lost
SEO Traffic (Manually)
Regular Expressions for SEOs and Digital Marketers (with Use Cases)
https://netpeaksoftware.com/blog/regular-expressions-for-seos-and-digital-marketers-with-use-cases
Regex101.com
Crawling Old Pages
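A minimal sketch of following redirects from old URLs, assuming the Requests library is used for the crawl. The function takes a session object so it can be given a `requests.Session()` in practice; the fake session and its redirect map below are hypothetical and exist only so the sketch runs offline.

```python
def follow_redirects(urls, session, timeout=10):
    """Map each old URL to (final URL, status code, redirect hops)."""
    results = {}
    for url in urls:
        try:
            # HEAD requests do not follow redirects unless asked to.
            resp = session.head(url, allow_redirects=True, timeout=timeout)
            results[url] = (resp.url, resp.status_code, len(resp.history))
        except Exception:  # timeouts, DNS failures, malformed URLs
            results[url] = (None, None, 0)
    return results


class FakeResponse:
    """Offline stand-in exposing the three attributes used above."""
    def __init__(self, url, status_code, history):
        self.url, self.status_code, self.history = url, status_code, history


class FakeSession:
    """Offline stand-in for requests.Session(); the redirect map is invented."""
    REDIRECTS = {"https://example.com/old-shoes": "https://example.com/collections/shoes"}

    def head(self, url, allow_redirects=True, timeout=None):
        if url in self.REDIRECTS:
            return FakeResponse(self.REDIRECTS[url], 200, ["301 hop"])
        return FakeResponse(url, 404, [])


final = follow_redirects(
    ["https://example.com/old-shoes", "https://example.com/gone"], FakeSession()
)
```

With the real library, the call would look like `follow_redirects(old_urls, requests.Session())`. Old URLs that end on a 404 or on an unrelated page are candidates for missing or wrong redirects.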
Grouping with Regexes
Lookahead and Lookbehind Zero-Length Assertions
https://www.regular-expressions.info/lookaround.html
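As an illustration of grouping pages with regular expressions, the sketch below assigns a URL path to a page group and uses a negative lookahead to keep product URLs nested under a collection out of the Category group. The URL patterns are hypothetical (loosely modeled on a Shopify-style structure) and would need to be adapted to the actual site.

```python
import re

# Hypothetical URL patterns; the first matching pattern wins.
PAGE_GROUPS = [
    ("Home", re.compile(r"^/$")),
    # Negative lookahead: a collection URL that does NOT contain /products/.
    ("Category", re.compile(r"^/collections/(?!.*/products/)")),
    ("Product", re.compile(r"^/(collections/[^/]+/)?products/[^/]+$")),
    ("Content", re.compile(r"^/(blogs|pages)/")),
]


def page_group(path):
    """Return the name of the first group whose pattern matches the path."""
    for name, pattern in PAGE_GROUPS:
        if pattern.search(path):
            return name
    return "Other"


groups = {
    path: page_group(path)
    for path in ["/", "/collections/shoes", "/collections/shoes/products/red-shoe",
                 "/products/red-shoe", "/blogs/news/launch", "/cart"]
}
```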
https://github.com/plotly/plotly.py
Page Groups That Lost SEO Traffic
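Once every page has a group, the earlier comparison can be repeated per group. A small sketch with invented (group, new users) rows:

```python
from collections import Counter


def traffic_by_group(rows):
    """Sum new users per page group from (group, new_users) pairs."""
    totals = Counter()
    for group, users in rows:
        totals[group] += users
    return totals


# Hypothetical rows: one entry per page, already labeled with its group.
before = traffic_by_group(
    [("Category", 500), ("Product", 300), ("Category", 200), ("Home", 100)]
)
after = traffic_by_group([("Category", 350), ("Product", 320), ("Home", 90)])

change = {group: after[group] - before[group] for group in before}

# Groups that lost traffic, biggest loss first.
lost = sorted((g for g in change if change[g] < 0), key=change.get)
```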
Reverse Engineer Success Too
How Do We Generalize This?
Using Machine Learning!
But before…
Credit: Matt West
Why Are Dominicans So Good
at Baseball?
Hit a Vitilla? Hit Anything
https://www.youtube.com/watch?v=k8Aw2cBer84
Vitilla
https://en.wikipedia.org/wiki/Vitilla
Learn Machine Learning and Solve Any SEO Problem
Regex -> URL Matching
XPath -> Content Matching
Solution Part 3 – Steps
Step 1:
Collect training data
–
Step 2:
Prepare the data and split it into
training and testing sets
–
Step 3:
Find best model
CHALLENGE: Find Which Page Groups Lost
SEO Traffic (Automatically)
Python – BeautifulSoup
BeautifulSoup 4 Cheatsheet
http://akul.me/blog/2016/beautifulsoup-cheatsheet/
https://www.crummy.com/software/BeautifulSoup/bs4/download/
An SEO’s guide to XPath
https://builtvisible.com/seo-guide-to-xpath/
Python – Scikit-learn
https://scikit-learn.org/
Cheat Sheet
https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Scikit_Lear
heat_Sheet_Python.pdf
Hands-On Introduction To Scikit-learn (sklearn)
https://towardsdatascience.com/hands-on-introduction-to-scikit-learn-sklearn-f3df652ff8f2
Efficiently Searching Optimal Tuning Parameters
https://www.ritchieng.com/machine-learning-efficiently-search-tuning-param/
Data Scientist Bottom Up Solution
Inside the BloomReach Algorithm - Using Machine Learning to Understand Page Templates
https://www.bloomreach.com/en/blog/2018/07/using-machine-learning-to-learn-page-templates.html
For most ecommerce sites, the dimensions and quantity of images and input form elements change by page template.
Let’s use that as the feature vector.
Hamlet’s Observation and Simpler Solution
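One way to turn this observation into numbers is sketched below with the standard library's HTML parser. It counts images and form elements and sums the image area declared in `width`/`height` attributes. Which elements count as "input form elements", and the use of declared rather than rendered dimensions, are assumptions of this sketch; the original notebook may extract the features differently.

```python
from html.parser import HTMLParser


class TemplateFeatures(HTMLParser):
    """Count images and form elements and sum the declared image area."""

    FORM_TAGS = {"input", "select", "textarea", "button"}  # assumed set

    def __init__(self):
        super().__init__()
        self.img_count = 0
        self.img_area = 0
        self.input_count = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img":
            self.img_count += 1
            width = attrs.get("width") or "0"
            height = attrs.get("height") or "0"
            # Ignore non-numeric values such as "100%".
            if width.isdigit() and height.isdigit():
                self.img_area += int(width) * int(height)
        elif tag in self.FORM_TAGS:
            self.input_count += 1


def features(html):
    """Return [image count, total declared image area, form element count]."""
    parser = TemplateFeatures()
    parser.feed(html)
    return [parser.img_count, parser.img_area, parser.input_count]


# Hypothetical product-page fragment.
product_html = (
    '<img src="shoe.jpg" width="600" height="400">'
    '<form><input name="qty"><select name="size"></select>'
    "<button>Add to cart</button></form>"
)
vector = features(product_html)
```

A product page typically has one large image and a small add-to-cart form, while a category page has many small thumbnails, so even this short vector tends to separate the two templates.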
Collecting Training Data
What is One Hot Encoding? Why and when do you have to use it?
https://hackernoon.com/what-is-one-hot-encoding-why-and-when-do-you-have-to-use-it-e3c6186d008f
Prepare and Split Data
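To make the two ideas concrete, here is a dependency-free sketch of one-hot encoding and of a shuffled train/test split. In practice pandas (`get_dummies`) and scikit-learn (`train_test_split`) provide both; the helper names, the 25% test ratio, and the sample labels below are assumptions for illustration.

```python
import random


def one_hot(values):
    """Return (categories, rows) with one 0/1 column per distinct value."""
    categories = sorted(set(values))
    rows = [[1 if value == cat else 0 for cat in categories] for value in values]
    return categories, rows


def split_train_test(samples, test_ratio=0.25, seed=42):
    """Shuffle a copy of the samples and cut it into train and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical page-group labels, one per crawled page.
labels = ["Product", "Category", "Product", "Home"]
categories, encoded = one_hot(labels)

train, test = split_train_test(range(8))
```

One-hot encoding avoids implying an order between categories (Category < Home < Product) that a single integer column would suggest to the model.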
Cross Validation and Grid Search For Model Selection in Python
https://stackabuse.com/cross-validation-and-grid-search-for-model-selection-in-python/
Find Best Model
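A minimal sketch of cross-validated grid search with scikit-learn. The synthetic dataset stands in for the page feature vectors, and logistic regression with a small grid over `C` is an arbitrary choice for illustration; the talk does not say which models or parameters were compared.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for page features (X) and page-group labels (y).
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=4, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Try each value of C with 5-fold cross-validation on the training set.
param_grid = {"C": [0.1, 1.0, 10.0]}
search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=5)
search.fit(X_train, y_train)

best_params = search.best_params_
# Accuracy of the refitted best model on the held-out test set.
test_accuracy = search.score(X_test, y_test)
```

The test set is only touched once, after the search, so the reported accuracy is not inflated by the parameter tuning.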
https://github.com/plotly/plotly.py
Simple guide to confusion matrix terminology
https://www.dataschool.io/simple-guide-to-confusion-matrix-terminology/
Confusion Matrix
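For readers who have not met the term, a confusion matrix simply counts how often each actual class was predicted as each class. A small sketch on invented page-group predictions:

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


# Hypothetical labels for five pages.
labels = ["Category", "Product"]
actual = ["Category", "Category", "Product", "Product", "Product"]
predicted = ["Category", "Product", "Product", "Product", "Category"]

matrix = confusion_matrix(actual, predicted, labels)

# Correct predictions sit on the diagonal.
accuracy = sum(matrix[i][i] for i in range(len(labels))) / len(actual)
```

Off-diagonal cells show which page groups the model mixes up, which is more informative than the accuracy figure alone.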
But wait… We can do Better
Using Deep Learning!
Solution Part 4 – Steps
Step 1:
Label a few thousand web page
screenshots with the visual features
you care about
–
Step 2:
Train a computer vision model to
predict more granular page groups
–
Step 3: Find best model
CHALLENGE: Learn More Granular Page
Groups that Lost SEO Traffic (Automatically)
https://www.tensorflow.org/
Keras Cheat Sheet
https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Keras_Cheat_Sheet_Python.pdf
TensorFlow Tutorial For Beginners
https://www.datacamp.com/community/tutorials/tensorflow-tutorial
Python – TensorFlow & Keras
Bottleneck
The “Information Bottleneck” Theory
https://www.quantamagazine.org/new-theory-cracks-open-the-black-box-of-deep-learning-20170921/
[Diagram: AUTOENCODER. Input Image -> Encoder -> Bottleneck (Latent Space Representation) -> Decoder -> Reconstructed Image]
[Diagram: Caption Generator. 1. Input Image; 2. Convolutional Feature Extraction (14 x 14 Feature Map); 3. RNN (LSTM) with attention over the image; 4. Word by word generation. Encoder -> Bottleneck (Latent Space Representation) -> Decoder]
Python – TensorFlow Object Detection API
https://github.com/tensorflow/models/tree/master/research/object_detection
AutoML Vision API Tutorial
https://cloud.google.com/vision/automl/docs/tutorial
Google AutoML
Visually Labeling Screenshots
Don't Take Security Advice from SEO Experts or Psychics
https://www.troyhunt.com/dont-take-security-advice-from-seo-experts-or-psychics-neil-patel/
Launch Jupyter Notebook in Google Colaboratory
https://colab.research.google.com/github/ranksense/open-source/blob/master/Presentations/TechSEOBoost/2018/PythonforSEOTechSEOBoost2018_Hamlet_Batista.ipynb
SUMMARY
–
Summary
Practical applications
of Python >= 3.6
for:
Data extraction
–
Preparation
–
Analysis
–
Machine learning
–
Deep learning
Free Realtime SEO Monitor
–
Ongoing monitoring with no active crawls
–
Receive alerts about critical SEO issues
–
Apply quick, temporary fixes in Cloudflare
–
Create developer tickets for permanent solutions
ABOUT RANKSENSE
– Apply for Beta Access
www.ranksense.com

Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
 

Python for SEO

  • 1. Hamlet Batista | @hamletbatista | #TechSEOBoost Python for SEO – Programming As a Superpower
  • 2. AGENDA: Practical SEO applications of Python >= 3.6 for data extraction, preparation, analysis and visualization, machine learning, and deep learning
  • 3. INTRO: Why program when you can hire a programmer to do the work for you?
  • 4. But before…
  • 12. CHALLENGING SEO PROBLEMS THAT NEED PROGRAMMING WORK
  • 13. IBM WebSphere => SAP Hybris
  • 14. IBM WebSphere Site: Category Page (links to one or more Product Listing Pages); Product Listing Page (links to one or more Product Pages); Product Page (single SKU)
  • 15. SAP Hybris Site: Category Page (links to one or more Product Pages); Product Page (single SKU)
  • 16. Old Site Product Pages (717); New Site Product Pages (442); Product Mapping (3431)
  • 17. Old Site Category Pages (371); New Site Category Pages (147); Category Mapping (712)
  • 19. [Chart: New Users, Revenue, and Page Count by page type (Category, Home, Product, Content, Videos, Other)]
  • 22. [Chart: New Users by page type (Category, Home, Product, Content, Videos, Other)]
  • 23. Winners vs Losers
  • 24. Launch Jupyter Notebook in Google Colaboratory: https://colab.research.google.com/github/ranksense/open-source/blob/master/Presentations/TechSEOBoost/2018/PythonforSEOTechSEOBoost2018_Hamlet_Batista.ipynb
  • 26. Ecommerce V3 => Shopify
  • 27. https://github.com/plotly/plotly.py
  • 28. Solution Part 1 – CHALLENGE: Find Which Pages Lost SEO Traffic. Step 1: Pull Google Analytics Data. Step 2: Store Data in Pandas DataFrame. Step 3: Perform Data Preparation and Basic Set Operations.
  • 29. Python – Basics: https://pandas.pydata.org/; Python for Data Science Cheat Sheet: https://s3.amazonaws.com/assets.datacamp.com/blog_assets/PythonForDataScience.pdf
  • 30. Python – Jupyter: Google Colaboratory: https://colab.research.google.com/notebooks/welcome.ipynb
  • 31. Python – Pandas: https://pandas.pydata.org/; Cheat Sheet: https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf; 10 Minutes to pandas: https://pandas.pydata.org/pandas-docs/stable/10min.html; Intro to Pandas for Excel Super Users: https://towardsdatascience.com/intro-to-pandas-for-excel-super-users-dac1b38f12b0
  • 32. Python – Requests: http://docs.python-requests.org/en/master/; Web scraping reference: A Simple Cheat Sheet for Web Scraping with Python: https://blog.hartleybrody.com/web-scraping-cheat-sheet/
  • 33. https://ga-dev-tools.appspot.com/query-explorer/
  • 34. Pulling Google Analytics Data
  • 35. Storing Data in a DataFrame
  • 36. Transforming Data for Analysis: https://www.shanelynn.ie/merge-join-dataframes-python-pandas-index-1/ Join types shown: Left Join, Right Join, Inner Join, Full Outer Join, Left Join (if NULL), Right Join (if NULL)
  • 37. Transforming Data for Analysis
  • 38. Pages That Lost SEO Traffic
  • 39. Solution Part 2 – CHALLENGE: Find Which Page Groups Lost SEO Traffic (Manually). Step 1: We will crawl old pages to follow redirects. Step 2: We will group pages using regular expressions. Step 3: Repeat the previous analysis.
  • 40. Regular Expressions for SEOs and Digital Marketers (with Use Cases): https://netpeaksoftware.com/blog/regular-expressions-for-seos-and-digital-marketers-with-use-cases; Regex101.com
  • 41. Crawling Old Pages
  • 42. Grouping with Regexes. Lookahead and Lookbehind Zero-Length Assertions: https://www.regular-expressions.info/lookaround.html
  • 43. https://github.com/plotly/plotly.py
  • 44. Page Groups That Lost SEO Traffic
  • 45. Reverse Engineer Success Too
  • 46. How Do We Generalize This?
  • 47. Using Machine Learning!
  • 48. But before…
  • 49. Why Are Dominicans So Good at Baseball? (Credit: Matt West)
  • 50. Hit a Vitilla? Hit Anything: https://www.youtube.com/watch?v=k8Aw2cBer84
  • 51. Vitilla: https://en.wikipedia.org/wiki/Vitilla
  • 52. Learn Machine Learning and Solve Any SEO Problem
  • 54. Regex -> URL Matching; XPath -> Content Matching
  • 55. Solution Part 3 – CHALLENGE: Find Which Page Groups Lost SEO Traffic (Automatically). Step 1: Collect training data. Step 2: Prepare the data and split it into training and testing sets. Step 3: Find the best model.
  • 56. Python – BeautifulSoup: https://www.crummy.com/software/BeautifulSoup/bs4/download/; BeautifulSoup 4 Cheatsheet: http://akul.me/blog/2016/beautifulsoup-cheatsheet/; An SEO’s guide to XPath: https://builtvisible.com/seo-guide-to-xpath/
  • 57. Python – Scikit-learn: https://scikit-learn.org/; Cheat Sheet: https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Scikit_Lear heat_Sheet_Python.pdf; Hands-On Introduction To Scikit-learn (sklearn): https://towardsdatascience.com/hands-on-introduction-to-scikit-learn-sklearn-f3df652ff8f2; Efficiently Searching Optimal Tuning Parameters: https://www.ritchieng.com/machine-learning-efficiently-search-tuning-param/
  • 58. Data Scientist Bottom Up Solution. Inside the BloomReach Algorithm - Using Machine Learning to Understand Page Templates: https://www.bloomreach.com/en/blog/2018/07/using-machine-learning-to-learn-page-templates.html
  • 59. Hamlet’s Observation and Simpler Solution: For most Ecommerce sites, the dimensions and quantity of images and input form elements change by page template. Let’s use that as the features vector.
  • 60. Hamlet’s Observation and Simpler Solution
  • 61. Hamlet’s Observation and Simpler Solution
  • 62. Collecting Training Data
  • 63. Prepare and Split Data. What is One Hot Encoding? Why and when do you have to use it?: https://hackernoon.com/what-is-one-hot-encoding-why-and-when-do-you-have-to-use-it-e3c6186d008f
  • 64. Find Best Model. Cross Validation and Grid Search For Model Selection in Python: https://stackabuse.com/cross-validation-and-grid-search-for-model-selection-in-python/
  • 65. https://github.com/plotly/plotly.py
  • 66. Confusion Matrix. Simple guide to confusion matrix terminology: https://www.dataschool.io/simple-guide-to-confusion-matrix-terminology/
  • 67. But wait… We can do Better
  • 68. Using Deep Learning!
  • 69. Solution Part 4 – CHALLENGE: Learn More Granular Page Groups that Lost SEO Traffic (Automatically). Step 1: Label a few thousand web page screenshots with the visual features you care about. Step 2: Train a computer vision model to predict more granular page groups. Step 3: Find best model.
  • 70. Python – Tensorflow & Keras: https://www.tensorflow.org/; Keras Cheat Sheet: https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Keras_Cheat_Sheet_Python.pdf; TensorFlow Tutorial For Beginners: https://www.datacamp.com/community/tutorials/tensorflow-tutorial
  • 71. Bottleneck. The “Information Bottleneck” Theory: https://www.quantamagazine.org/new-theory-cracks-open-the-black-box-of-deep-learning-20170921/
  • 72. AUTOENCODER: Input Image; Encoder; Bottleneck (Latent Space Representation); Decoder; Reconstructed Image
  • 73. Caption Generator: 1. Input Image; 2. Convolutional Feature Extraction (14 x 14 Feature Map); 3. RNN with attention over the image (LSTM); 4. Word by word generation. Encoder; Bottleneck (Latent Space Representation); Decoder
  • 74. Python – Tensorflow Object Detection API: https://github.com/tensorflow/models/tree/master/research/object_detection
  • 75. Google AutoML. AutoML Vision API Tutorial: https://cloud.google.com/vision/automl/docs/tutorial
  • 76. Visually Labeling Screenshots
  • 77. Don't Take Security Advice from SEO Experts or Psychics: https://www.troyhunt.com/dont-take-security-advice-from-seo-experts-or-psychics-neil-patel/
  • 78. Launch Jupyter Notebook in Google Colaboratory: https://colab.research.google.com/github/ranksense/open-source/blob/master/Presentations/TechSEOBoost/2018/PythonforSEOTechSEOBoost2018_Hamlet_Batista.ipynb
  • 79. SUMMARY
  • 80. Summary: Practical applications of Python >= 3.6 for data extraction, preparation, analysis, machine learning, and deep learning
  • 81. ABOUT RANKSENSE: Free Realtime SEO Monitor. Ongoing monitoring with no active crawls; receive alerts about critical SEO issues; apply quick, temporary fixes in Cloudflare; create developer tickets for permanent solutions. Apply for Beta Access: www.ranksense.com
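
Slides 28 and 36 describe storing two Google Analytics pulls in pandas DataFrames and joining them to find pages that lost SEO traffic. The following is a minimal sketch of that idea, not the code from the talk's notebook. The column names (`page`, `sessions`), the sample rows, and the helper `find_lost_pages` are all hypothetical stand-ins.

```python
import pandas as pd

# Hypothetical sample rows standing in for two Google Analytics exports
# (before and after a site migration). Real data would come from the API.
before = pd.DataFrame({"page": ["/a", "/b", "/c"], "sessions": [120, 80, 40]})
after = pd.DataFrame({"page": ["/a", "/c", "/d"], "sessions": [130, 10, 25]})


def find_lost_pages(before: pd.DataFrame, after: pd.DataFrame) -> pd.DataFrame:
    """Return pages whose sessions dropped, biggest loss first."""
    # A full outer join keeps pages that exist in only one period.
    # indicator=True adds a "_merge" column: left_only, right_only, or both.
    merged = before.merge(
        after,
        on="page",
        how="outer",
        suffixes=("_before", "_after"),
        indicator=True,
    )
    # A page missing from one period had no recorded sessions in it.
    cols = ["sessions_before", "sessions_after"]
    merged[cols] = merged[cols].fillna(0)
    merged["delta"] = merged["sessions_after"] - merged["sessions_before"]
    return merged[merged["delta"] < 0].sort_values("delta")


lost = find_lost_pages(before, after)
print(lost[["page", "sessions_before", "sessions_after", "delta", "_merge"]])
```

Rows marked `left_only` in `_merge` are pages that received traffic before but none after, which is the "Left Join (if NULL)" case from slide 36.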

Editor's Notes

  1. This is what we will do to correct that: Step 1: We will crawl each page from the first set and record the status code (and the final URL of any redirects). Step 2: Repeat the analysis.
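
The crawl step in the note above can be sketched as follows. This is an illustration under stated assumptions, not the talk's implementation: it uses the standard library's `urllib` instead of the Requests library referenced on slide 32, and the names `crawl`, `CrawlResult`, and the `opener` parameter are hypothetical. `urlopen` follows redirects automatically, so the recorded status is that of the final response rather than of the intermediate 301/302 hops.

```python
import urllib.error
import urllib.request
from typing import NamedTuple


class CrawlResult(NamedTuple):
    url: str        # the old URL we started from
    status: int     # status code of the final response
    final_url: str  # where the redirect chain ended


def crawl(url: str, opener=urllib.request.urlopen) -> CrawlResult:
    """Fetch one URL, following redirects, and record where it lands."""
    # Assumption: HEAD is enough because only status and final URL are needed.
    # Some servers reject HEAD; drop method="HEAD" to fall back to GET.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request) as response:
            return CrawlResult(url, response.status, response.geturl())
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses raise; keep the code so broken pages are recorded.
        return CrawlResult(url, error.code, error.filename)
```

Calling `crawl` on every URL in the old page set and collecting the results gives a table of status codes and final URLs that can be joined back onto the traffic data before repeating the analysis. The `opener` parameter exists only so the function can be exercised without network access.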