SlideShare uma empresa Scribd logo
1 de 3
BeautifulSoup/Selenium
Deep dive
6th May, 2020 SAKURA Internet, Inc. Research Center SR / Naoto MATSUMOTO
(C) Copyright 1996-2020 SAKURA Internet Inc
BeautifulSoup sample code
2
# apt install python3-pip
# pip3 install BeautifulSoup4
# pip3 install lxml
# vi bs4.py
#!/usr/bin/env python3
import re
import sys
import requests
from bs4 import BeautifulSoup as bs4
url = sys.argv[1]
key = sys.argv[2]
html = requests.get(url)
soup = bs4(html.content,'lxml',from_encoding='utf-8')
for script in soup(["script", "style"]): script.decompose()
text = soup.get_text()
regex = re.compile(key)
for line in text.splitlines():
if line:
match = regex.search(line)
if match:
print("%s, %s" % (url, line))
# python3 bs4.py http://www.sakura.ad.jp VPS
http://www.sakura.ad.jp, さくらのVPS
SOURCE: SAKURA Internet Research Center (2020/05)
Selenium sample code
3
# apt install python-pip curl -y
# apt install -y unzip xvfb libxi6 libgconf-2-4 -y
# apt install chromium-chromedriver -y
# pip install selenium
# pip install pyvirtualdisplay
# vi se.py
# encoding: utf-8
import sys
import time
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument('--headless')
options.add_argument('disable-infobars')
options.add_argument('--no-sandbox')
driver = webdriver.Chrome(chrome_options=options)
word = "site:" + sys.argv[1] + " " + sys.argv[2]
url= "https://www.google.com/search?q={}&safe=off".format(word)
driver.get(url)
time.sleep(1)
for i, g in enumerate(driver.find_elements_by_class_name("g")):
s = g.find_element_by_class_name("s")
x = s.find_element_by_class_name("st").text
print(x)
driver.quit()
SOURCE: SAKURA Internet Research Center (2020/05)

Mais conteúdo relacionado

Mais procurados

gunicorn introduction
gunicorn introductiongunicorn introduction
gunicorn introductionAdam Lowry
 
AWS ロボを作ろう JAWSUG Kobe
AWS ロボを作ろう JAWSUG KobeAWS ロボを作ろう JAWSUG Kobe
AWS ロボを作ろう JAWSUG Kobe崇之 清水
 
Sensu wrapper-sensu-summit
Sensu wrapper-sensu-summitSensu wrapper-sensu-summit
Sensu wrapper-sensu-summitLee Briggs
 
Cool Git Tricks (That I Learn When Things Go Badly) [1/2]
Cool Git Tricks (That I Learn When Things Go Badly) [1/2]Cool Git Tricks (That I Learn When Things Go Badly) [1/2]
Cool Git Tricks (That I Learn When Things Go Badly) [1/2]Carina C. Zona
 
Redis is the answer, what's the question - Tech Nottingham
Redis is the answer, what's the question - Tech NottinghamRedis is the answer, what's the question - Tech Nottingham
Redis is the answer, what's the question - Tech NottinghamGarry Shutler
 
IRIS-HEP Retreat: Boost-Histogram Roadmap
IRIS-HEP Retreat: Boost-Histogram RoadmapIRIS-HEP Retreat: Boost-Histogram Roadmap
IRIS-HEP Retreat: Boost-Histogram RoadmapHenry Schreiner
 
Introduction of Python & Pygame
Introduction of  Python & PygameIntroduction of  Python & Pygame
Introduction of Python & PygameTakaki Yoneyama
 
用 Bitbar Tool 寫 Script 自動擷取外幣
用 Bitbar Tool 寫 Script 自動擷取外幣用 Bitbar Tool 寫 Script 自動擷取外幣
用 Bitbar Tool 寫 Script 自動擷取外幣Win Yu
 
Python And GIS - Beyond Modelbuilder And Pythonwin
Python And GIS - Beyond Modelbuilder And PythonwinPython And GIS - Beyond Modelbuilder And Pythonwin
Python And GIS - Beyond Modelbuilder And PythonwinChad Cooper
 
Altering the real world with JavaScript - Framsia
Altering the real world with JavaScript - FramsiaAltering the real world with JavaScript - Framsia
Altering the real world with JavaScript - FramsiaJan Jongboom
 
Digital RSE: automated code quality checks - RSE group meeting
Digital RSE: automated code quality checks - RSE group meetingDigital RSE: automated code quality checks - RSE group meeting
Digital RSE: automated code quality checks - RSE group meetingHenry Schreiner
 
Golang Arg / CABA Meetup #5 - go-carbon
Golang Arg / CABA Meetup #5 - go-carbonGolang Arg / CABA Meetup #5 - go-carbon
Golang Arg / CABA Meetup #5 - go-carbonEzequiel Maraschio
 
忙しい人のためのSphinx 入門 demo
忙しい人のためのSphinx 入門 demo忙しい人のためのSphinx 入門 demo
忙しい人のためのSphinx 入門 demoFumihito Yokoyama
 

Mais procurados (20)

Evolution and AI
Evolution and AIEvolution and AI
Evolution and AI
 
gunicorn introduction
gunicorn introductiongunicorn introduction
gunicorn introduction
 
AWS ロボを作ろう JAWSUG Kobe
AWS ロボを作ろう JAWSUG KobeAWS ロボを作ろう JAWSUG Kobe
AWS ロボを作ろう JAWSUG Kobe
 
Sensu wrapper-sensu-summit
Sensu wrapper-sensu-summitSensu wrapper-sensu-summit
Sensu wrapper-sensu-summit
 
GoLang & GoatCore
GoLang & GoatCore GoLang & GoatCore
GoLang & GoatCore
 
Cool Git Tricks (That I Learn When Things Go Badly) [1/2]
Cool Git Tricks (That I Learn When Things Go Badly) [1/2]Cool Git Tricks (That I Learn When Things Go Badly) [1/2]
Cool Git Tricks (That I Learn When Things Go Badly) [1/2]
 
Redis is the answer, what's the question - Tech Nottingham
Redis is the answer, what's the question - Tech NottinghamRedis is the answer, what's the question - Tech Nottingham
Redis is the answer, what's the question - Tech Nottingham
 
Arfon-Smith-feb25
Arfon-Smith-feb25Arfon-Smith-feb25
Arfon-Smith-feb25
 
RingoJS
RingoJSRingoJS
RingoJS
 
IRIS-HEP Retreat: Boost-Histogram Roadmap
IRIS-HEP Retreat: Boost-Histogram RoadmapIRIS-HEP Retreat: Boost-Histogram Roadmap
IRIS-HEP Retreat: Boost-Histogram Roadmap
 
PyHEP 2019: Python 3.8
PyHEP 2019: Python 3.8PyHEP 2019: Python 3.8
PyHEP 2019: Python 3.8
 
C++
C++C++
C++
 
Practica54
Practica54Practica54
Practica54
 
Introduction of Python & Pygame
Introduction of  Python & PygameIntroduction of  Python & Pygame
Introduction of Python & Pygame
 
用 Bitbar Tool 寫 Script 自動擷取外幣
用 Bitbar Tool 寫 Script 自動擷取外幣用 Bitbar Tool 寫 Script 自動擷取外幣
用 Bitbar Tool 寫 Script 自動擷取外幣
 
Python And GIS - Beyond Modelbuilder And Pythonwin
Python And GIS - Beyond Modelbuilder And PythonwinPython And GIS - Beyond Modelbuilder And Pythonwin
Python And GIS - Beyond Modelbuilder And Pythonwin
 
Altering the real world with JavaScript - Framsia
Altering the real world with JavaScript - FramsiaAltering the real world with JavaScript - Framsia
Altering the real world with JavaScript - Framsia
 
Digital RSE: automated code quality checks - RSE group meeting
Digital RSE: automated code quality checks - RSE group meetingDigital RSE: automated code quality checks - RSE group meeting
Digital RSE: automated code quality checks - RSE group meeting
 
Golang Arg / CABA Meetup #5 - go-carbon
Golang Arg / CABA Meetup #5 - go-carbonGolang Arg / CABA Meetup #5 - go-carbon
Golang Arg / CABA Meetup #5 - go-carbon
 
忙しい人のためのSphinx 入門 demo
忙しい人のためのSphinx 入門 demo忙しい人のためのSphinx 入門 demo
忙しい人のためのSphinx 入門 demo
 

Semelhante a BeautifulSoup / selenium Deep dive

05 - Bypassing DEP, or why ASLR matters
05 - Bypassing DEP, or why ASLR matters05 - Bypassing DEP, or why ASLR matters
05 - Bypassing DEP, or why ASLR mattersAlexandre Moneger
 
Introduction to Assembly Language
Introduction to Assembly LanguageIntroduction to Assembly Language
Introduction to Assembly LanguageMotaz Saad
 
Will iPython replace Bash?
Will iPython replace Bash?Will iPython replace Bash?
Will iPython replace Bash?Babel
 
Will iPython replace bash?
Will iPython replace bash?Will iPython replace bash?
Will iPython replace bash?Roberto Polli
 
Sphinx autodoc - automated API documentation (PyCon APAC 2015 in Taiwan)
Sphinx autodoc - automated API documentation (PyCon APAC 2015 in Taiwan)Sphinx autodoc - automated API documentation (PyCon APAC 2015 in Taiwan)
Sphinx autodoc - automated API documentation (PyCon APAC 2015 in Taiwan)Takayuki Shimizukawa
 
What is the best full text search engine for Python?
What is the best full text search engine for Python?What is the best full text search engine for Python?
What is the best full text search engine for Python?Andrii Soldatenko
 
Sphinx autodoc - automated api documentation - PyCon.MY 2015
Sphinx autodoc - automated api documentation - PyCon.MY 2015Sphinx autodoc - automated api documentation - PyCon.MY 2015
Sphinx autodoc - automated api documentation - PyCon.MY 2015Takayuki Shimizukawa
 
How to write rust instead of c and get away with it
How to write rust instead of c and get away with itHow to write rust instead of c and get away with it
How to write rust instead of c and get away with itFlavien Raynaud
 
Could you take a look at this Python code Towards the botto.pdf
Could you take a look at this Python code Towards the botto.pdfCould you take a look at this Python code Towards the botto.pdf
Could you take a look at this Python code Towards the botto.pdfdevangmittal4
 
2018 RubyHACK: put git to work - increase the quality of your rails project...
2018 RubyHACK:  put git to work -  increase the quality of your rails project...2018 RubyHACK:  put git to work -  increase the quality of your rails project...
2018 RubyHACK: put git to work - increase the quality of your rails project...Rodrigo Urubatan
 
Homer - Workshop at Kamailio World 2017
Homer - Workshop at Kamailio World 2017Homer - Workshop at Kamailio World 2017
Homer - Workshop at Kamailio World 2017Giacomo Vacca
 
Efficient System Monitoring in Cloud Native Environments
Efficient System Monitoring in Cloud Native EnvironmentsEfficient System Monitoring in Cloud Native Environments
Efficient System Monitoring in Cloud Native EnvironmentsGergely Szabó
 
Collaboration hack with slackbot - PyCon HK 2018 - 2018.11.24
Collaboration hack with slackbot - PyCon HK 2018 - 2018.11.24Collaboration hack with slackbot - PyCon HK 2018 - 2018.11.24
Collaboration hack with slackbot - PyCon HK 2018 - 2018.11.24Kei IWASAKI
 
SECCOM 2017 - Conan.io o gerente de pacote para C e C++
SECCOM 2017 - Conan.io o gerente de pacote para C e C++SECCOM 2017 - Conan.io o gerente de pacote para C e C++
SECCOM 2017 - Conan.io o gerente de pacote para C e C++Uilian Ries
 
Dev8d 2011-pipe2 py
Dev8d 2011-pipe2 pyDev8d 2011-pipe2 py
Dev8d 2011-pipe2 pyTony Hirst
 
Sphinx autodoc - automated API documentation (EuroPython 2015 in Bilbao)
Sphinx autodoc - automated API documentation (EuroPython 2015 in Bilbao)Sphinx autodoc - automated API documentation (EuroPython 2015 in Bilbao)
Sphinx autodoc - automated API documentation (EuroPython 2015 in Bilbao)Takayuki Shimizukawa
 
PVS-Studio for Linux (CoreHard presentation)
PVS-Studio for Linux (CoreHard presentation)PVS-Studio for Linux (CoreHard presentation)
PVS-Studio for Linux (CoreHard presentation)Andrey Karpov
 
10 Excellent Ways to Secure Your Spring Boot Application - Devoxx Belgium 2019
10 Excellent Ways to Secure Your Spring Boot Application - Devoxx Belgium 201910 Excellent Ways to Secure Your Spring Boot Application - Devoxx Belgium 2019
10 Excellent Ways to Secure Your Spring Boot Application - Devoxx Belgium 2019Matt Raible
 
Secure Programming Practices in C++ (NDC Oslo 2018)
Secure Programming Practices in C++ (NDC Oslo 2018)Secure Programming Practices in C++ (NDC Oslo 2018)
Secure Programming Practices in C++ (NDC Oslo 2018)Patricia Aas
 

Semelhante a BeautifulSoup / selenium Deep dive (20)

05 - Bypassing DEP, or why ASLR matters
05 - Bypassing DEP, or why ASLR matters05 - Bypassing DEP, or why ASLR matters
05 - Bypassing DEP, or why ASLR matters
 
Introduction to Assembly Language
Introduction to Assembly LanguageIntroduction to Assembly Language
Introduction to Assembly Language
 
Will iPython replace Bash?
Will iPython replace Bash?Will iPython replace Bash?
Will iPython replace Bash?
 
Will iPython replace bash?
Will iPython replace bash?Will iPython replace bash?
Will iPython replace bash?
 
Sphinx autodoc - automated API documentation (PyCon APAC 2015 in Taiwan)
Sphinx autodoc - automated API documentation (PyCon APAC 2015 in Taiwan)Sphinx autodoc - automated API documentation (PyCon APAC 2015 in Taiwan)
Sphinx autodoc - automated API documentation (PyCon APAC 2015 in Taiwan)
 
What is the best full text search engine for Python?
What is the best full text search engine for Python?What is the best full text search engine for Python?
What is the best full text search engine for Python?
 
Sphinx autodoc - automated api documentation - PyCon.MY 2015
Sphinx autodoc - automated api documentation - PyCon.MY 2015Sphinx autodoc - automated api documentation - PyCon.MY 2015
Sphinx autodoc - automated api documentation - PyCon.MY 2015
 
How to write rust instead of c and get away with it
How to write rust instead of c and get away with itHow to write rust instead of c and get away with it
How to write rust instead of c and get away with it
 
Could you take a look at this Python code Towards the botto.pdf
Could you take a look at this Python code Towards the botto.pdfCould you take a look at this Python code Towards the botto.pdf
Could you take a look at this Python code Towards the botto.pdf
 
2018 RubyHACK: put git to work - increase the quality of your rails project...
2018 RubyHACK:  put git to work -  increase the quality of your rails project...2018 RubyHACK:  put git to work -  increase the quality of your rails project...
2018 RubyHACK: put git to work - increase the quality of your rails project...
 
Homer - Workshop at Kamailio World 2017
Homer - Workshop at Kamailio World 2017Homer - Workshop at Kamailio World 2017
Homer - Workshop at Kamailio World 2017
 
Python para equipos de ciberseguridad
Python para equipos de ciberseguridad Python para equipos de ciberseguridad
Python para equipos de ciberseguridad
 
Efficient System Monitoring in Cloud Native Environments
Efficient System Monitoring in Cloud Native EnvironmentsEfficient System Monitoring in Cloud Native Environments
Efficient System Monitoring in Cloud Native Environments
 
Collaboration hack with slackbot - PyCon HK 2018 - 2018.11.24
Collaboration hack with slackbot - PyCon HK 2018 - 2018.11.24Collaboration hack with slackbot - PyCon HK 2018 - 2018.11.24
Collaboration hack with slackbot - PyCon HK 2018 - 2018.11.24
 
SECCOM 2017 - Conan.io o gerente de pacote para C e C++
SECCOM 2017 - Conan.io o gerente de pacote para C e C++SECCOM 2017 - Conan.io o gerente de pacote para C e C++
SECCOM 2017 - Conan.io o gerente de pacote para C e C++
 
Dev8d 2011-pipe2 py
Dev8d 2011-pipe2 pyDev8d 2011-pipe2 py
Dev8d 2011-pipe2 py
 
Sphinx autodoc - automated API documentation (EuroPython 2015 in Bilbao)
Sphinx autodoc - automated API documentation (EuroPython 2015 in Bilbao)Sphinx autodoc - automated API documentation (EuroPython 2015 in Bilbao)
Sphinx autodoc - automated API documentation (EuroPython 2015 in Bilbao)
 
PVS-Studio for Linux (CoreHard presentation)
PVS-Studio for Linux (CoreHard presentation)PVS-Studio for Linux (CoreHard presentation)
PVS-Studio for Linux (CoreHard presentation)
 
10 Excellent Ways to Secure Your Spring Boot Application - Devoxx Belgium 2019
10 Excellent Ways to Secure Your Spring Boot Application - Devoxx Belgium 201910 Excellent Ways to Secure Your Spring Boot Application - Devoxx Belgium 2019
10 Excellent Ways to Secure Your Spring Boot Application - Devoxx Belgium 2019
 
Secure Programming Practices in C++ (NDC Oslo 2018)
Secure Programming Practices in C++ (NDC Oslo 2018)Secure Programming Practices in C++ (NDC Oslo 2018)
Secure Programming Practices in C++ (NDC Oslo 2018)
 

Mais de Naoto MATSUMOTO

Alder Lake-S CPU Temperature Monitoring
Alder Lake-S CPU Temperature MonitoringAlder Lake-S CPU Temperature Monitoring
Alder Lake-S CPU Temperature MonitoringNaoto MATSUMOTO
 
CPU製品出荷状況と消費電力の見える化
CPU製品出荷状況と消費電力の見える化CPU製品出荷状況と消費電力の見える化
CPU製品出荷状況と消費電力の見える化Naoto MATSUMOTO
 
2023年以降のサーバークラスタリング設計(メモ)
2023年以降のサーバークラスタリング設計(メモ)2023年以降のサーバークラスタリング設計(メモ)
2023年以降のサーバークラスタリング設計(メモ)Naoto MATSUMOTO
 
防災を考慮した水中調査の一考察
防災を考慮した水中調査の一考察防災を考慮した水中調査の一考察
防災を考慮した水中調査の一考察Naoto MATSUMOTO
 
旅するパケットの見える化
旅するパケットの見える化旅するパケットの見える化
旅するパケットの見える化Naoto MATSUMOTO
 
LTE-M/NB IoTを試してみる nRF9160/Thingy:91
LTE-M/NB IoTを試してみる nRF9160/Thingy:91LTE-M/NB IoTを試してみる nRF9160/Thingy:91
LTE-M/NB IoTを試してみる nRF9160/Thingy:91Naoto MATSUMOTO
 
災害時における無線モニタリングによる社会インフラの見える化
災害時における無線モニタリングによる社会インフラの見える化災害時における無線モニタリングによる社会インフラの見える化
災害時における無線モニタリングによる社会インフラの見える化Naoto MATSUMOTO
 
Network Adapter Deep dive
Network Adapter Deep diveNetwork Adapter Deep dive
Network Adapter Deep diveNaoto MATSUMOTO
 
x86_64 Hardware Deep dive
x86_64 Hardware Deep divex86_64 Hardware Deep dive
x86_64 Hardware Deep diveNaoto MATSUMOTO
 
ADS-B, AIS, APRS cheatsheet
ADS-B, AIS, APRS cheatsheetADS-B, AIS, APRS cheatsheet
ADS-B, AIS, APRS cheatsheetNaoto MATSUMOTO
 
3/4G USB modem Cheat Sheet
3/4G USB modem Cheat Sheet3/4G USB modem Cheat Sheet
3/4G USB modem Cheat SheetNaoto MATSUMOTO
 
How To Train Your ARM(SBC)
How To  Train Your ARM(SBC)How To  Train Your ARM(SBC)
How To Train Your ARM(SBC)Naoto MATSUMOTO
 
全国におけるCOVID-19対策の見える化 ~宿泊業の場合~
全国におけるCOVID-19対策の見える化 ~宿泊業の場合~全国におけるCOVID-19対策の見える化 ~宿泊業の場合~
全国におけるCOVID-19対策の見える化 ~宿泊業の場合~Naoto MATSUMOTO
 
我が国の電波の使用状況/携帯電話向け割当 (2019年3月1日現在)
我が国の電波の使用状況/携帯電話向け割当 (2019年3月1日現在)我が国の電波の使用状況/携帯電話向け割当 (2019年3月1日現在)
我が国の電波の使用状況/携帯電話向け割当 (2019年3月1日現在)Naoto MATSUMOTO
 
私たちに訪れる(かもしれない)未来と計算機によるモノコトの見える化
私たちに訪れる(かもしれない)未来と計算機によるモノコトの見える化私たちに訪れる(かもしれない)未来と計算機によるモノコトの見える化
私たちに訪れる(かもしれない)未来と計算機によるモノコトの見える化Naoto MATSUMOTO
 
仮想化環境におけるバイナリー・ポータビリティの考察 (WebAssemblyの場合)
仮想化環境におけるバイナリー・ポータビリティの考察 (WebAssemblyの場合)仮想化環境におけるバイナリー・ポータビリティの考察 (WebAssemblyの場合)
仮想化環境におけるバイナリー・ポータビリティの考察 (WebAssemblyの場合)Naoto MATSUMOTO
 

Mais de Naoto MATSUMOTO (20)

Alder Lake-S CPU Temperature Monitoring
Alder Lake-S CPU Temperature MonitoringAlder Lake-S CPU Temperature Monitoring
Alder Lake-S CPU Temperature Monitoring
 
CPU製品出荷状況と消費電力の見える化
CPU製品出荷状況と消費電力の見える化CPU製品出荷状況と消費電力の見える化
CPU製品出荷状況と消費電力の見える化
 
5Gの見える化
5Gの見える化5Gの見える化
5Gの見える化
 
2023年以降のサーバークラスタリング設計(メモ)
2023年以降のサーバークラスタリング設計(メモ)2023年以降のサーバークラスタリング設計(メモ)
2023年以降のサーバークラスタリング設計(メモ)
 
防災を考慮した水中調査の一考察
防災を考慮した水中調査の一考察防災を考慮した水中調査の一考察
防災を考慮した水中調査の一考察
 
旅するパケットの見える化
旅するパケットの見える化旅するパケットの見える化
旅するパケットの見える化
 
LTE-M/NB IoTを試してみる nRF9160/Thingy:91
LTE-M/NB IoTを試してみる nRF9160/Thingy:91LTE-M/NB IoTを試してみる nRF9160/Thingy:91
LTE-M/NB IoTを試してみる nRF9160/Thingy:91
 
災害時における無線モニタリングによる社会インフラの見える化
災害時における無線モニタリングによる社会インフラの見える化災害時における無線モニタリングによる社会インフラの見える化
災害時における無線モニタリングによる社会インフラの見える化
 
AMDGPU ROCm Deep dive
AMDGPU ROCm Deep diveAMDGPU ROCm Deep dive
AMDGPU ROCm Deep dive
 
Network Adapter Deep dive
Network Adapter Deep diveNetwork Adapter Deep dive
Network Adapter Deep dive
 
RTL2838 DVB-T Deep dive
RTL2838 DVB-T Deep diveRTL2838 DVB-T Deep dive
RTL2838 DVB-T Deep dive
 
x86_64 Hardware Deep dive
x86_64 Hardware Deep divex86_64 Hardware Deep dive
x86_64 Hardware Deep dive
 
ADS-B, AIS, APRS cheatsheet
ADS-B, AIS, APRS cheatsheetADS-B, AIS, APRS cheatsheet
ADS-B, AIS, APRS cheatsheet
 
curl --http3 cheatsheet
curl --http3 cheatsheetcurl --http3 cheatsheet
curl --http3 cheatsheet
 
3/4G USB modem Cheat Sheet
3/4G USB modem Cheat Sheet3/4G USB modem Cheat Sheet
3/4G USB modem Cheat Sheet
 
How To Train Your ARM(SBC)
How To  Train Your ARM(SBC)How To  Train Your ARM(SBC)
How To Train Your ARM(SBC)
 
全国におけるCOVID-19対策の見える化 ~宿泊業の場合~
全国におけるCOVID-19対策の見える化 ~宿泊業の場合~全国におけるCOVID-19対策の見える化 ~宿泊業の場合~
全国におけるCOVID-19対策の見える化 ~宿泊業の場合~
 
我が国の電波の使用状況/携帯電話向け割当 (2019年3月1日現在)
我が国の電波の使用状況/携帯電話向け割当 (2019年3月1日現在)我が国の電波の使用状況/携帯電話向け割当 (2019年3月1日現在)
我が国の電波の使用状況/携帯電話向け割当 (2019年3月1日現在)
 
私たちに訪れる(かもしれない)未来と計算機によるモノコトの見える化
私たちに訪れる(かもしれない)未来と計算機によるモノコトの見える化私たちに訪れる(かもしれない)未来と計算機によるモノコトの見える化
私たちに訪れる(かもしれない)未来と計算機によるモノコトの見える化
 
仮想化環境におけるバイナリー・ポータビリティの考察 (WebAssemblyの場合)
仮想化環境におけるバイナリー・ポータビリティの考察 (WebAssemblyの場合)仮想化環境におけるバイナリー・ポータビリティの考察 (WebAssemblyの場合)
仮想化環境におけるバイナリー・ポータビリティの考察 (WebAssemblyの場合)
 

Último

WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 

Último (20)

WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 

BeautifulSoup / selenium Deep dive

  • 1. BeautifulSoup/Selenium Deep dive 6th May, 2020 SAKURA Internet, Inc. Research Center SR / Naoto MATSUMOTO (C) Copyright 1996-2020 SAKURA Internet Inc
  • 2. BeautifulSoup sample code 2 # apt install python3-pip # pip3 install BeautifulSoup4 # pip3 install lxml # vi bs4.py #!/usr/bin/env python3 import re import sys import requests from bs4 import BeautifulSoup as bs4 url = sys.argv[1] key = sys.argv[2] html = requests.get(url) soup = bs4(html.content,'lxml',from_encoding='utf-8') for script in soup(["script", "style"]): script.decompose() text = soup.get_text() regex = re.compile(key) for line in text.splitlines(): if line: match = regex.search(line) if match: print("%s, %s" % (url, line)) # python3 bs4.py http://www.sakura.ad.jp VPS http://www.sakura.ad.jp, さくらのVPS SOURCE: SAKURA Internet Research Center (2020/05)
  • 3. Selenium sample code 3 # apt install python-pip curl -y # apt install -y unzip xvfb libxi6 libgconf-2-4 -y # apt install chromium-chromedriver -y # pip install selenium # pip install pyvirtualdisplay # vi se.py # encoding: utf-8 import sys import time from selenium import webdriver from selenium.webdriver.chrome.options import Options options = Options() options.add_argument('--headless') options.add_argument('disable-infobars') options.add_argument('--no-sandbox') driver = webdriver.Chrome(chrome_options=options) word = "site:" + sys.argv[1] + " " + sys.argv[2] url= "https://www.google.com/search?q={}&safe=off".format(word) driver.get(url) time.sleep(1) for i, g in enumerate(driver.find_elements_by_class_name("g")): s = g.find_element_by_class_name("s") x = s.find_element_by_class_name("st").text print(x) driver.quit() SOURCE: SAKURA Internet Research Center (2020/05)