Internetlivestats.com
Coreference resolution
Question answering (QA)
Part-of-speech (POS) tagging
Word sense disambiguation (WSD)
Paraphrase
Named entity recognition (NER)
Parsing
Summarization
Information extraction (IE)
Machine translation (MT)
Dialog
Sentiment analysis
mostly solved
making good progress
still really hard
Spam detection (Classification)
Let’s go to Agra!
Buy V1AGRA …
✓
✗
Colorless green ideas sleep furiously.
ADJ ADJ NOUN VERB ADV
Einstein met with UN officials in Princeton
PERSON ORG LOC
You’re invited to our dinner
party, Friday May 27 at 8:30
Party
May 27
add
Best roast chicken in San Francisco!
The waiter ignored us for 20 minutes.
Carter told Mubarak he shouldn’t run again.
I need new batteries for my mouse.
The 13th Shanghai International Film Festival…
第13届上海国际电影节开幕…
The Dow Jones is up
Housing prices rose
Economy is
good
Q. How effective is ibuprofen in reducing
fever in patients with acute febrile illness?
I can see Alcatraz from the window!
XYZ acquired ABC yesterday
ABC has been taken over by XYZ
Where is Citizen Kane playing in SF?
Castro Theatre at 7:30. Do
you want a ticket?
The S&P500 jumped
Source: Dan Jurafsky
non-standard English
Great job @justinbieber! Were
SOO PROUD of what youve
accomplished! U taught us 2
#neversaynever & you yourself
should never give up either♥
segmentation issues idioms
dark horse
get cold feet
lose face
throw in the towel
neologisms
unfriend
Retweet
bromance
tricky entity names
Where is A Bug’s Life playing …
Let It Be was recorded …
… a mutation on the for gene …
the New York-New Haven Railroad
the New York-New Haven Railroad
Source: Dan Jurafsky (modified)
sarcasm
A: I love Justin Bieber. Do you
like him too?
B: Yeah. Sure. I absolutely love
him.
http://www.alchemyapi.com/
https://www.congress.gov/resources/display/content/The+Federalist+Papers#TheFederalistPapers-10
1:10pm
Non-Stop
Adair, Mosteller & Wallace, Fung, Collins et al.
Corpus
Document
Term
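The corpus, document, and term hierarchy above is commonly represented as a document-term matrix: one row per document, one column per term, and each cell a count. The following is a minimal Python sketch of that idea, not the workshop's R code; the two-document corpus and the function name `document_term_matrix` are hypothetical.

```python
from collections import Counter

def document_term_matrix(corpus):
    """Return (terms, rows): the sorted vocabulary and one count row per document."""
    # Count whitespace-separated, lowercased terms in each document.
    counts = [Counter(doc.lower().split()) for doc in corpus]
    # The vocabulary is the union of all terms seen in any document.
    terms = sorted(set().union(*counts))
    # Each row holds the count of every vocabulary term in one document.
    rows = [[c[t] for t in terms] for c in counts]
    return terms, rows

# Hypothetical toy corpus, for illustration only.
corpus = ["the people the union", "the union of states"]
terms, rows = document_term_matrix(corpus)
print(terms)
for row in rows:
    print(row)
```

Real corpora produce very sparse matrices, so libraries usually store only the non-zero cells.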
Source:
Chris Manning
Tokenize Clean Stem Filter
Then a hurricane came, and devastation reigned
then a hurricane came and devastation reigned
then a hurricane came and devastation reigned
then a hurricane came and devastation reigned
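The Tokenize, Clean, Stem, Filter stages can be sketched as a chain of small functions. This is a Python illustration under stated assumptions rather than the workshop's R code: the stop-word list is a tiny hypothetical one, and the stemmer is a crude suffix stripper, not the Porter stemmer.

```python
import re

# Hypothetical stop-word list for illustration; real lists are much longer.
STOP_WORDS = {"a", "an", "and", "the", "then", "of", "to", "in"}

def tokenize(text):
    """Split raw text into whitespace-separated tokens."""
    return text.split()

def clean(tokens):
    """Lowercase each token and drop every character that is not a-z."""
    cleaned = (re.sub(r"[^a-z]", "", tok.lower()) for tok in tokens)
    return [tok for tok in cleaned if tok]

def stem(tokens):
    """Strip one common suffix per token (an assumption, not a real stemmer)."""
    stemmed = []
    for tok in tokens:
        for suffix in ("ation", "ing", "ed", "s"):
            # Keep at least three characters of the original token.
            if tok.endswith(suffix) and len(tok) - len(suffix) >= 3:
                tok = tok[: -len(suffix)]
                break
        stemmed.append(tok)
    return stemmed

def filter_stop_words(tokens):
    """Remove tokens that appear in the stop-word list."""
    return [tok for tok in tokens if tok not in STOP_WORDS]

def preprocess(text):
    """Run the four stages in order: tokenize, clean, stem, filter."""
    return filter_stop_words(stem(clean(tokenize(text))))

print(preprocess("Then a hurricane came, and devastation reigned"))
# prints ['hurricane', 'came', 'devast', 'reign']
```

The order of the stages matters: filtering before cleaning would miss capitalised stop words such as "Then".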
GitHub site
1:20pm
Code Lines: 1-49
Code Lines: 50-79
Federalist Paper 1: Before Federalist Paper 1: After
Code Lines: 71-88
Federalist Paper 1: After
Code Lines: 89-104
Code Lines: 142-149
Code Lines: 151-165
1:30pm
Code Lines: 167-171
Code Lines: 173-188
Code Lines: 189-201
Code Lines: 202-207
Uncomment (Ctrl + Shift + C) and run lines 107-139
Code Lines: 107-139
then rerun lines 141-206
1:50pm - 2pm
Bayes' Theorem
these slides
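For reference, Bayes' theorem applied to authorship attribution can be written as follows. The symbols are assumptions introduced here for illustration: $a$ is a candidate author and $w_1, \dots, w_n$ are the words observed in a disputed paper. The second line is the "naive" conditional-independence assumption that gives the Naive Bayes classifier its name.

```latex
P(a \mid w_1, \dots, w_n) = \frac{P(w_1, \dots, w_n \mid a)\, P(a)}{P(w_1, \dots, w_n)}

P(w_1, \dots, w_n \mid a) \approx \prod_{i=1}^{n} P(w_i \mid a)
```

Because the denominator is the same for every author, a classifier only needs to compare the numerators.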
Code Lines: 208-219
Update
Code Lines: 231-241
Code Lines: 242-248
Code Lines: 250-273
Code Lines: 275-290
This will take about 4 mins, depending on the computer you run it on
Code Lines: 295-308
Source: David Blei (link to article)
Code Lines: 295-308
Index.html file in the “Federalist” folder in your working directory.
Open with Firefox; it is not supported by Chrome or IE.
Code Lines: 321-349
Code Lines: 350-370
• Naïve Bayes predicts 9 of the 12 papers
as written by Madison.
• K-NN predicts only 4 of the 12 papers
as written by Madison.
• Why? How stable are these results?
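One reason the two classifiers can disagree is that they use different decision rules: Naive Bayes compares smoothed per-author word probabilities, while K-NN takes a vote among the most similar training documents. The sketch below shows both rules in plain Python. It is not the workshop's R code, and the four training documents are hypothetical toy data, so it does not reproduce the 9-of-12 and 4-of-12 results.

```python
import math
from collections import Counter

def train_naive_bayes(docs, labels):
    """Collect the counts a multinomial Naive Bayes model needs."""
    vocab = {w for doc in docs for w in doc}
    class_docs = Counter(labels)
    word_counts = {c: Counter() for c in class_docs}
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc)
    return vocab, class_docs, word_counts

def predict_naive_bayes(model, doc):
    """Return the class with the highest log posterior (add-one smoothing)."""
    vocab, class_docs, word_counts = model
    total_docs = sum(class_docs.values())
    scores = {}
    for c in class_docs:
        total_words = sum(word_counts[c].values())
        score = math.log(class_docs[c] / total_docs)  # log prior
        for w in doc:
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[c][w] + 1) / (total_words + len(vocab)))
        scores[c] = score
    return max(scores, key=scores.get)

def cosine(a, b):
    """Cosine similarity between two token lists, using term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

def predict_knn(docs, labels, doc, k=3):
    """Return the majority label among the k most similar training documents."""
    ranked = sorted(zip(docs, labels), key=lambda p: cosine(p[0], doc), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy training data, for illustration only.
train_docs = [
    ["upon", "the", "while", "upon"],
    ["upon", "while", "of", "the"],
    ["whilst", "the", "on", "of"],
    ["whilst", "on", "the", "whilst"],
]
train_labels = ["hamilton", "hamilton", "madison", "madison"]

model = train_naive_bayes(train_docs, train_labels)
disputed = ["whilst", "on", "the"]
print(predict_naive_bayes(model, disputed))
print(predict_knn(train_docs, train_labels, disputed, k=3))
```

On real data, K-NN predictions also depend on the choice of k and the similarity measure, so rerunning with different settings or different train/test splits is one way to probe how stable the results are.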
Code Lines: 371-373
2:30pm
Source: Richard Heimann
The Beige Book
GitHub
https://github.com/wesslen/BeigeBookSentimentAnalysis
First six records of BB.sentiment
First six records of BB.sentiment (updated)
Raw Scored Sentiment Scaled Scored Sentiment
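One plausible reading of "raw" versus "scaled" sentiment is sketched below in Python. It is an assumption, not the BeigeBookSentimentAnalysis code itself: the raw score is taken to be the number of positive lexicon words minus the number of negative ones, and the scaled score is the raw score standardised to zero mean and unit standard deviation across documents. The lexicons and report snippets are hypothetical.

```python
import statistics

# Hypothetical mini-lexicons for illustration only.
POSITIVE = {"strong", "growth", "improved", "gains"}
NEGATIVE = {"weak", "decline", "slowed", "losses"}

def raw_score(text):
    """Count positive lexicon words minus negative lexicon words."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def scale(scores):
    """Standardise scores to zero mean and unit sample standard deviation.

    Needs at least two scores that are not all identical.
    """
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    return [(s - mean) / sd for s in scores]

# Hypothetical report snippets, for illustration only.
reports = [
    "strong growth and improved hiring",
    "sales slowed and prices were flat",
    "weak demand led to losses and a decline in orders",
]
raw = [raw_score(r) for r in reports]
scaled = scale(raw)
print(raw)  # prints [3, -1, -3]
print(scaled)
```

Scaling makes scores comparable across reports of different length or tone, at the cost of making each value depend on the whole collection.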
Stanford Deep Learning NLP class materials
https://projectmosaic.uncc.edu/events-list/
GNIP access
http://www.r-bloggers.com/setting-up-the-twitter-r-package-for-text-analytics/
AlchemyAPI
Taste Analytics Signals
SAS Enterprise Miner
SAS Sentiment Analysis
Hamilton Soundtrack Amazon Reviews
R tm package
Python nltk package
Python gensim package
Mallet
Introductory Text Mining Class
Coursera Natural Language Processing Class
Coursera Text Mining & Analytics Course
Deep Learning for Natural Language Processing
https://www.kaggle.com/c/word2vec-nlp-tutorial/details/part-1-for-beginners-bag-of-words
http://www.alchemyapi.com/developers/getting-started-guide/twitter-sentiment-analysis
https://eight2late.wordpress.com/2015/09/29/a-gentle-introduction-to-topic-modeling-using-r/
http://www.r-bloggers.com/sentiment-analysis-on-donald-trump-using-r-and-tableau/
Follow this link for all R “text” blogs on the R-bloggers website


Text Mining with R for Social Science Research

Editor's Notes

  1. Three questions: Experience with R
  2. Set the bedrock for the United States government… if you want to know the original structure of the U.S. government, this is your document.
  3. https://eight2late.wordpress.com/2015/05/27/a-gentle-introduction-to-text-mining-using-r/