Emotional behaviour seems to be ubiquitous on the web. Predictably, social media web genres such as tweets, blog posts and blog comments show high emotional involvement. What about other genres on the web? In this talk, the focus is on the search query log genre. According to recent IR research, searchers’ behaviour is not limited to traditional informational, navigational and transactional needs. A novel hypothesis is that seeking behaviour is driven by emotion. But can emotion be detected by analysing the queries typed by users in a search box? In this talk, I will present the results of some experiments carried out to investigate whether it is possible to identify emotion in the query log genre, and discuss how emotion could be utilized to improve the relevance of retrieved documents in searches. These experiments are part of SearchInFocus, a study centred on search.
How Emotional Are Users' Needs? Emotion in Query Logs
1. How Emotional are Users’ Needs?
Exploring Emotion in Query Logs
Marina Santini
29 Jan 2013
Marina Santini - CyberEmotions2013
Warsaw University of Technology 1
29-30 Jan 2013
2. Outline
• Inspirational Triggers:
o The Big Unstructured Textual Data Issue
o Emotion in IR
o Research hypothesis
• Genre- and Emotion- Profiling of Query Logs
o Characterization of genre
o Definition of emotion
o Benefits of genre and emotion awareness in query log analysis
• Experiments
o Query Logs from GenitoriCrescono thematic blog (in Italian)
o Query Logs from Västra Götalands Region (in Swedish)
• Conclusions
3. Inspirational Trigger 1
BIG UNSTRUCTURED TEXTUAL DATA
4. Big Unstructured Textual Data
“Merrill Lynch estimates that more than 85 percent of
all business information exists as unstructured data –
commonly appearing in e‐mails, memos, notes from
call centers and support operations, news, user
groups, chats, reports, letters, surveys, white
papers, marketing material, research, presentations
and web pages.” [DM Review Magazine, February
2003 Issue]
ECONOMIC LOSS!
Lots of different genres!
5. Simple search is not
enough…
• Of course, it is possible to use simple search. But
simple search is unrewarding, because it is based on
single terms.
o ”a search is made on the term felony. In a simple search, the term felony
is used, and everywhere there is a reference to felony, a hit to an
unstructured document is made. But a simple search is crude. It does not
find references to crime, arson, murder, embezzlement, vehicular
homicide, and such, even though these crimes are types of felonies”
[Source: Inmon, B. & A. Nesavich, "Unstructured Textual Data in the
Organization", from "Managing Unstructured Data in the
Organization", Prentice Hall 2008, pp. 1–13]
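The felony example can be sketched in a few lines of Python. This is only an illustration of the difference between a literal term match and a match expanded with related terms; the tiny TAXONOMY table, the sample documents and both function names are assumptions made for this sketch, not part of the cited book.

```python
# Hypothetical mini-taxonomy built from the terms in the quoted example.
# A real system would load such relations from a thesaurus or ontology.
TAXONOMY = {
    "felony": ["crime", "arson", "murder", "embezzlement", "vehicular homicide"],
}

# Made-up documents, for illustration only.
DOCS = [
    "The suspect was charged with a felony.",
    "Police are investigating an arson case downtown.",
    "Quarterly marketing report for the sales team.",
]


def simple_search(term, documents):
    """Return the documents that contain the literal term (case-insensitive)."""
    term = term.lower()
    return [doc for doc in documents if term in doc.lower()]


def expanded_search(term, documents):
    """Return the documents that contain the term or any related term."""
    terms = [term.lower()] + TAXONOMY.get(term.lower(), [])
    return [doc for doc in documents if any(t in doc.lower() for t in terms)]
```

With these sample documents, the simple search for "felony" finds only the first document, while the expanded search also finds the arson report.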
6. Text Analytics
• A set of NLP techniques that provide some structure
to textual documents.
• Common components:
o Tokenization
o Morphological Analysis
o Syntactic Analysis
o Named Entity Recognition
o Sentiment Analysis
o Automatic Summarization
o Etc.
7. Text Analytics Products and Frameworks
• Commercial:
o Attensity
o Clarabridge
o Temis
o Lexalytics
o Texify
o SAS
o IBM Cognos
o etc.
• Open Source:
o GATE
o NLTK
o UIMA
o etc.
Business Intelligence (BI)
Customer Experience Management (CEM)
8. Actionable Intelligence
• Business Intelligence (BI) + Customer Experience
Management (CEM) = Actionable Intelligence
• Actionable Intelligence is information that:
1. must be accurate and verifiable
2. must be timely
3. must be comprehensive
4. must be comprehensible
5. must give the power to make decisions and to act
straight away
9. In 2003, Merrill Lynch pointed out
that it was too difficult to automatically
extract usable intelligence
from the following genres:
o e‐mails
o memos
o notes from call centers and support operations
o news
o user groups
o chats
o reports
o letters
o surveys
o white papers
o marketing material
o research
o presentations
o web pages
Today…
Previous genres plus
• Blogs
• Tweets
• FB microposts
• FB comments
• Many other social network textual ”interactions”
10. From Big Data to Query Logs
Current State of Affairs
1. Big Unstructured Textual Data
2. Text Analytics (commercial products and frameworks)
3. Structured information for BI and CEM
Viable Alternative
• Query Logs
• Genre- & Context-aware Text Analytics
• Actionable Information (BI, CEM, sentiment, emerging topics…)
The main advantage of using query logs (when they are
available) instead of other genres consists in
REDUCED DATA SIZE, REDUCED PRE-PROCESSING,
REDUCED NOISE, REDUCED DATA CLEANING!
Typical Use Case
A company managing:
• Website
• Blog
• eMails
• Facebook Page
• Twitter account
11. Exploratory Query-log
Analysis Workshop
Organized by
Findwise, AB – Sweden
SearchInFocus SLTC 2012
Exploratory Study on Query Logs
and Actionable Intelligence
Query Logs provide
Actionable Intelligence for:
- search providers
- clients
- end-users
12. Inspirational Trigger 2
EMOTION IN INFORMATION RETRIEVAL (IR)
13. Emotion in IR
o Three concepts:
• Emotion need
• Emotion object
• Emotion relevance
Role of Emotion in Information Retrieval
by Yashar Moshfeghi
PhD Thesis at University of Glasgow, 2012
”uncover social situations where emotion is the primary
factor (i.e., source of motivation) in an IR&S
process.” (from the abstract)
14. Emotion Need
• The whole IR&S behaviour is driven by an emotion
need.
• An emotion need is more fundamental than an
information need in the sense that if an information
need exists it implies that there is an underlying
emotion need to satisfy this information need.
• Emotion needs, even when they do not lead to a
particular information need, can motivate
searchers to use an IR system.
15. Research Hypothesis for the
exploration of emotion in
query logs
It is plausible that much of the IR&S behaviour is driven by an
emotion need and that users’ emotions are expressed in the
queries that are typed in search boxes and stored in query
logs.
If this is true, emotion extraction from query logs also provides
actionable intelligence, because extracted emotions can be
used to improve decision making and to make better-grounded
future choices.
16. Research Questions
• Is it possible to extract emotion from query logs?
• If so, is it possible to use emotion from query logs for
actionable intelligence?
17. Genre Profiling of Query Logs
18. What characterizes a
genre?
1. Must have a name
2. Must be recognized within a community
3. Must be produced during a task
4. Must have conventions
5. Must raise expectations
6. Can change over time. It is a cultural artifact
(culture here includes
society, media, technology, etc.)
19. The query log genre is…
a newly acknowledged but fully emerged web genre
1. Name: in line with other digital genres (ex: web log, blog)
2. Community: internet users, IR practitioners
3. Task: to express searchers’ needs in a search engine
4. Conventions: short texts written in ”keywordese”
5. Expectations: to find information relevant to the
query
6. Cultural artifact: a product of internet-based
society OR a subproduct of search engines
20. The query log genre:
Linguistic and Textual
Conventions
• Length: short text (a query log can be seen as a
corpus of very short texts, shorter than
tweets, mobile text messages, chat logs, etc.)
• Sublanguage/Jargon: ”keywordese”
• Register: neutral
• Morphology: REDUCED
• Syntax : REDUCED (usually no subclauses, etc.)
21. The Query Log Genre:
Benefits
• wrt discourse analysis:
o Conceptually lean and essential jargon
• reduced morphology
• reduced syntax
• short texts
• mostly nouns and verbs
Benefit 1: Predictable Sublanguage
• wrt BIG UNSTRUCTURED TEXTUAL DATA
BENEFIT 2: REDUCED SIZE, REDUCED PRE-
PROCESSING; LITTLE DATA CLEANING!
22. Emotion Profiling of Query Logs
23. What is emotion?
BROAD DEFINITION: ANY DEGREE OF JUDGEMENTAL EVALUATION.
LIKE SENTISTRENGTH’S SCALE:
DUAL 5-POINT SYSTEM FOR POSITIVE [1; 2; 3; 4; 5]
AND NEGATIVE [-1; -2; -3; -4; -5] EMOTIONS
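A dual scale of this kind can be sketched as follows. This is a toy scorer that only illustrates the two-number output format; it is not SentiStrength's actual lexicon or algorithm. The LEXICON values are invented, and two assumptions are made for the sketch: the strongest word on each side determines the score, and 1 and -1 stand for the absence of positive or negative emotion.

```python
# Toy affect lexicon with invented strengths (illustration only).
LEXICON = {"love": 3, "happy": 2, "aggressive": -3, "hate": -4}


def dual_score(text):
    """Return (positive, negative) for a text.

    positive is in 1..5 and negative is in -5..-1; in this sketch,
    1 and -1 mean that no emotion was found on that side.
    """
    strengths = [LEXICON.get(word, 0) for word in text.lower().split()]
    positive = max([1] + [s for s in strengths if s > 0])
    negative = min([-1] + [s for s in strengths if s < 0])
    return positive, negative
```

Under these assumptions, a query made only of words missing from the lexicon scores (1, -1), while a query containing "aggressive" scores (1, -3).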
25. Thematic Blog – Italian
Logs from Google
Analytics
26. • Parents Grow Up: Genitori Crescono
http://genitoricrescono.com/
learning the parent profession together
belongs to: FattoreMammaNetwork
(gathers websites targeted at mothers and written by mothers)
• About: parenthood, childcare, maternity, upbringing, behaviours during
childhood…
27. Queries from Google Analytics
www.genitoricrescono.com - Search Overview 2009-01-01 to 2012-11-10
togliere il pannolino = stop wearing nappies/stop using diapers
genitori crescono = the website's name
Nopron = the name of a controversial syrup to make children
sleep all night long
Tracy Hogg = maternity nurse to Hollywood stars, known as
'the baby whisperer' for her skill in calming unruly infants
nanna = (familiar) bye-byes (Brit), beddy-byes
neonato 4 mesi = 4-month-old baby
io mi svezzo da solo = I wean by myself
nulla osta = certificate of no impediment
aborto terapeutico = therapeutic abortion
28. Zipf’s
distribution
“… much research has shown
that query term frequency
distributions conform to the
power law, or long tail
distribution curves. That is, a
small portion of the terms
observed in a large query log
(e.g. > 100 million queries) are
used most often, while the
remaining terms are used less
often individually."
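The long-tail shape described in the quote can be checked on any query log with a rank-frequency table. The sketch below assumes that queries are plain strings split on whitespace; the function names and the small QUERIES sample (taken from queries shown elsewhere in these slides) are only illustrative.

```python
from collections import Counter

# A few queries from the slides, used here as a tiny sample log.
QUERIES = [
    "bambini aggressivi",
    "bambini che non mangiano",
    "quando i bambini non dormono",
    "metodo estivill",
]


def rank_frequency(queries):
    """Return (rank, term, count) tuples, most frequent term first."""
    counts = Counter(term for q in queries for term in q.lower().split())
    return [
        (rank, term, n)
        for rank, (term, n) in enumerate(counts.most_common(), start=1)
    ]


def head_share(queries, top_k):
    """Fraction of all term occurrences covered by the top_k terms."""
    table = rank_frequency(queries)
    total = sum(n for _, _, n in table)
    return sum(n for _, _, n in table[:top_k]) / total
```

On a real log, a head share that stays high for a small top_k would be consistent with the long-tail distribution; a sample this small only shows the mechanics.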
29. Parts of Speech
NOUNS
VERBS
ADJECTIVES AND ADVERBS
ARTICLES AND PREPOSITIONS
1.9
30. Most Frequent Syntactic Patterns
inserimento al nido
bambini aggressivi
metodo estivill
31. Average Lengths
“The average length of a
search query was 2.4 terms"
“in a recent study in 2011 it
was found that the average
length of queries has grown
steadily over time and
average length of non-English
languages queries
had increased more than
English queries."
32. Long query, informal
syntax
How to stop breastfeeding and make it sleep alone i am planning second pregnacy
33. Queries’ Emotional Strength (i)
SentiStrength (basic options)
34. The power of genre and
the importance of the
communicative situation
• ”bambini aggressivi”
• Refinement of the concept presented in ”Topic-
based Sentiment Analysis in the Social Media …”
(Thelwall and Buckley, 2012): the polarity of affect
words might flip according to genre and the
communicative situation, and not only according
to the topic.
36. Emotional Strength:
Basic vs. Boosted
37. Negation
Query                                                        Basic Options   Boosted
bambini che non mangiano (children who do not eat)           2 -1            1 -1
quando i bambini non dormono (when children do not sleep)    2 -1            1 -1
38. Most frequent word trigrams
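A word trigram frequency list like the one on this slide can be produced with a few lines of Python. This is a minimal sketch that assumes whitespace tokenization and lower-casing; the function names are illustrative and do not refer to the tool actually used for the slide.

```python
from collections import Counter


def word_ngrams(query, n=3):
    """Return the word n-grams of a query as space-joined strings."""
    tokens = query.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams(queries, n=3, k=10):
    """Return the k most frequent word n-grams with their counts."""
    counts = Counter(gram for q in queries for gram in word_ngrams(q, n))
    return counts.most_common(k)
```

Queries shorter than three words contribute no trigrams, so one-word queries such as "nanna" simply drop out of the list.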
39. Query ”Normalization”
• Stopword removal
• Lemmatization
• And ideally synonym expansion
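The first two normalization steps can be sketched as below. The STOPWORDS set and the LEMMAS table are tiny hand-made examples, assumed here for illustration; a real pipeline would rely on a full Italian stopword list and a lemmatizer. The negation word "non" is deliberately kept out of the stopword set in this sketch, since removing it would change the meaning of a query.

```python
# Hand-made resources, for illustration only.
STOPWORDS = {"il", "i", "la", "al", "che", "quando", "da"}
LEMMAS = {"bambini": "bambino", "mangiano": "mangiare", "dormono": "dormire"}


def normalize(query):
    """Lower-case a query, drop stopwords, and map words to their lemmas."""
    tokens = [t for t in query.lower().split() if t not in STOPWORDS]
    return [LEMMAS.get(t, t) for t in tokens]
```

After normalization, "bambini che non mangiano" becomes the token list bambino, non, mangiare, which makes repeated queries easier to group in a frequency list.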
40. Use emotion needs as Actionable Intelligence
Ex: for increasing
traffic to a
website
Increase emotion relevance:
• Be empathetic to
searchers’ problems by
sympathising and by
converting the negative
words into more neutral
concepts
• Give heart and hope and
offer many solutions…
• In a few words: offer a new
communication strategy…
41. Public organization website
Enterprise search and log
server by Findwise, AB.
42. Within the Västra Götaland
Region website…
43. …hittavård [find health
care center]
Regional
HealthCare
44. VGR Corpus Description
• Corpus Time frame: 2010-2011 (2 years)
• Description: “These logs come from the search
at hittavard.vgregion.se. The biggest bulk should come
from 1177.se. The rest should be from vgregion.se. The
target audience are both VGR (Västra Götalands
Region) users/employees as well as the general
public, as it is a public site. The internal files are
searches made from within the VGR…”
• Corpus size:
o size = 3,167 KB (only queries) (BIG DATA is usually > 1TB)
o number of queries = 249,243
o number of words = 306,453
• Average query length: 1.23 words
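The reported average follows directly from the two corpus counts above: 306,453 words divided by 249,243 queries. The short sketch below reproduces that arithmetic and adds a helper, with a hypothetical name, that computes the same figure from a list of raw queries split on whitespace.

```python
# Figures reported on the slide.
NUM_QUERIES = 249_243
NUM_WORDS = 306_453

# 306,453 / 249,243 is roughly 1.2295, which rounds to 1.23.
REPORTED_AVERAGE = round(NUM_WORDS / NUM_QUERIES, 2)


def average_query_length(queries):
    """Mean number of whitespace-separated words per query."""
    return sum(len(q.split()) for q in queries) / len(queries)
```

An average this close to one word per query fits the observation that VGR users mostly type single nouns or compounds.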
45. VGR Top Queries
egenremiss=self-certification
mina vårdkontakter=my healthcare contacts
webbisar=an invented word referring to newborn
babies whose pictures have been published on the web
sjukresa/or=trip to the hospital
46. Linguistic Remarks
• At the top of the frequency list:
o Simple nouns
o Compounds
o V+N
Simple nouns
• feber
• influensa
• klamydia
• …
Compounds
• urinvägsinfektion
• öroninflammation
• Reseersättning
• …
V+N
• byta vårdcentral
• avboka tid
• boka tid
• …
47. More complex constructions
at the bottom
49. It seems that no emotion is
conveyed by VGR users…
• Are Swedes less emotional than Italians?
• Is the ”healthcare” topic less emotional than the
”childcare” topic?
50. It might be that…
There is a difference in users’ emotional
behaviour when specifying queries to a
web search engine OR when using the
search engine of a specialized website.
51. Emotion Interpretation…
is not straightforward…
• There are several factors to be accounted for:
o One important factor is the context of communication: similar words or
sentences can convey positive emotion in a query and negative emotion
in a Facebook post, for example.
52. different communicative contexts = different genres
53. Genre Awareness
• In practical terms, genre awareness is important in
text analytics and sentiment analysis because, all
things being equal, it:
o lets you choose the easiest and least problematic texts to
process;
o helps interpret and disambiguate the real meanings of
words and sentences according to the different
communicative contexts in which they appear.
55. Is it possible to identify
and extract emotion from
query logs?
• It is possible to identify and extract emotion from web
query logs.
• It seems more difficult to extract emotion from
enterprise search engine query logs.
56. Is it possible to use
emotion from query logs
for actionable
intelligence?
• If present, query log emotion can be used for
actionable intelligence.
57. What do you think?
Thank you for your attention!
59. Benefits for the Search
Provider
• Mining query logs to extract user-created knowledge, i.e.
queries that can be used as tags (metadata)
• Quickly create domain-specific taxonomies you can
capitalize upon, especially for new client companies
working in related fields
• Enhancements of current search products
• Inexpensive creation of annotated corpora: document
annotation through query logs is a simple technique that
in a short time will build massive annotated corpora
to use for machine learning, which will allow more
sophisticated search refinements.
60. Benefits for Clients & End
Users
• Somebody said: SEARCH MUST BE A MIND READER!
• BUT ALSO faster, more friendly, more exhaustive and more
accurate.
• If this happens, clients will spend less for customer care. If
the end user finds what s/he needs online and
quickly, there is no need to call a helpdesk or customer
care service.
• Through the analysis of query logs, log analysts can spot
the least ”satisfied” queries (i.e. users’ needs). Companies
can use this information to plan future products or
product enhancement or marketing strategies, etc. (BI)
Editor's Notes
In this talk, I would like to share and discuss with you the preliminary results of query logs’ emotional analysis. There is still much to be investigated in this field and to be capitalized on.
Big dataset: Hadoop, R etc.
Merrill Lynch (financial management and advisory, www.ml.com/): Merrill Lynch is one of the world's leading financial management and advisory companies, providing financial advice and investment banking services.
E‐mails, memos, notes from call centers and support operations, news, user groups, chats, reports, letters, surveys, white papers, marketing material, research, presentations, etc. are different genres, i.e. different types of text. For example, emails and white papers are both textual genres but they differ a lot from each other. They might deal with the same topic, but in a completely different way. So the type of information related to the same topic can vary according to genre.
Business intelligence (BI) is the ability of an organization to collect, maintain, and organize data. This produces large amounts of information that can help develop new opportunities. Identifying these opportunities, and implementing an effective strategy, can provide a competitive market advantage and long-term stability. BI technologies provide historical, current and predictive views of business operations.
Customer Experience Management (CEM) is the practice of actively listening to the Voice of the Customer through a variety of listening posts, analyzing customer feedback to create a basis for acting on better business decisions and then measuring the impact of those decisions to drive even greater operational performance and customer loyalty. Through this process, a company strategically organizes itself to manage a customer's entire experience with its product, service or company. Companies invest in CEM to improve customer retention.
If this company wants to analyse the interaction with clients/customers/fans/complainers to identify main topics, main problems, main sentiment, and decide which directions are profitable for the future, I would suggest starting from query log analysis.
Findwise….
So if we have genre awareness, we can decide which genre is better than another for our purposes. I tried to advocate the use of query logs (when available) because they might be easier to mine. Can we also find emotions in query logs?
It has been shown in previous research that emotion plays an important role in the success of an IR&S process which has the purpose of satisfying an information need. However, these previous studies do not give a sufficiently prominent position to emotion in IR, since they limit the role of emotion to a secondary factor, by assuming that a lack of knowledge (the need for information) is the primary factor (the motivation of the search).
Yashar proposes to treat emotion as the principal factor in the system of needs of a searcher, and therefore one that ought to be considered by the retrieval algorithms. He presents a view of searchers’ needs by considering not only theories from information retrieval and science, but also from psychology, philosophy, and sociology. We extensively report on the role of emotion in every aspect of human behaviour, both at an individual and social level. This serves not only to modify the current IR views of emotion, but more importantly to uncover social situations where emotion is the primary factor (i.e., source of motivation) in an IR&S process.
Emotion need: An individual or group’s desire to be in a particular emotion state by means of acquiring information and/or emotion. P. 52
Emotion object: emotion extracted from the content of a document that represents the emotion of the document creator, the emotion of document viewer. P. 56
Emotion relevance: an IR system must know about searchers’ emotion need as well as their information needs. P. 60
Previous research: the role of emotion in the information seeking process is to alleviate and/or diminish the negative feelings experienced because of uncertainty, so the emotion need here is to experience positive feelings of satisfaction via obtaining information.
Yashar argues that physiological needs are not directly satisfied through an information seeking process, but that they instead lead either to an emotion or information need that initiates the information seeking behaviour which goes on to satisfy these needs. For example, hunger (i.e., physiological need) can lead to either searching for close-by restaurants (i.e., information need) or negative emotion states (e.g., frustration) needing to be resolved by watching funny clips (i.e., emotion need). Due to this delegation of physiological need to information or emotion need, we do not further investigate physiological need. Therefore, all we need is to investigate the relationship between information need and emotion need.
Yashar argues that an emotion need is more fundamental than an information need in the sense that if an information need exists it implies that there is an underlying emotion need to satisfy this information need. The whole IR&S behaviour is thus driven by an emotion need. However, the converse may not necessarily be true, e.g., a user could want to be happy/sad/angry but without having a well-defined information need (IN). Thus, whenever information need is discussed, an emotion need is preexistent. In the case when the emotion need of the searcher is to diminish the negative feelings associated with a lack of knowledge (i.e., an IN), the emotion need would be satisfied if the IN associated with it is resolved. For example, if a searcher’s IN is to know about topic x, the searcher must believe that information about x has been acquired, in order for their emotion need to be satisfied.
Thus, the emotion need will not be resolved unless the underlying information need is resolved, since in this context, the information need is the dominant one.
There are in fact emotion needs that do not imply an information need in the way we have defined information above. An example of such needs are the scenarios explained in Section 3.5.4, i.e., users who are stressed and look at some clips that they know will help to relieve their stress. Of course, one way of remembering these clips is by employing the associative nature of the relationship between emotion and memory (see Section 3.4.4). Other ways include looking at the popular (most viewed/highly recommended) objects. In all these scenarios there is no particular information need to be resolved, but only an emotion need, e.g., when searchers are seeking for funny clips in YouTube. In these scenarios, it is argued that the emotion aspect of information objects is more important than their information aspect, and we label them as extreme emotion need scenarios. Thus, one can present emotion need as a continuous spectrum ranging from informational needs to extreme emotional needs.
It has been shown that information need motivates searchers to engage with an IR system. An emotion need can be a motive for searchers to use an IR system when it manifests itself as information need. It is our belief that emotion needs, even when they do not lead to a particular information need, can motivate searchers to use an IR system.
Yashar says that emotion is the driving force of Information Retrieval and Seeking. If this is true, it is plausible that we can find emotion in the queries that the user writes in a search box, and consequently in query logs.
Written in “keywordese”, i.e. the kind of sublanguage/jargon we use to communicate with search engines (that is, a language without articles, without prepositions and other stop words, without much syntax or hedges, etc.), query logs are skimmed texts that require no cleaning from redundancies or rhetorical ornaments, and reduced pre-processing.
One of the many blogs focussing on maternity and child-related issues. There is a wide network of similar blogs and websites in Italy called FattoreMammaNetwork: http://fattoremamma.com/network/
Repetitions
Use of function words (stopwords)
Proper names
One word & multi-words
Book titles: Io mi svezzo da solo, written by Lucio Piermarini, pediatrician
Wean = accustom an infant to food other than its mother's milk.
Nulla osta = certificate of no impediment
Aborto terapeutico = therapeutic abortion = miscarriage
Dr. Estivill, a pediatrician and neurophysiologist, is the director of the Sleep Clinic at the Institut Dexeus in Barcelona, Spain. He is also the coordinator of the Sleep Unit at Catalonia General Hospital and Incosol Clinic in Marbella. Dr. Estivill has written many popular books on sleep and other habits, including 5 Days to a Perfect Night’s Sleep for your Child.
Togliere il pannolino = stop wearing nappies/stop using diapers = article ”i” masculine plural has been used
Nopron: syrup = psicofarmaco = drug used in treatment of mental conditions
Fare la nanna = nanna (fam.) bye-byes (Brit), beddy-byes
Neonato 4 mesi
Nulla osta
Io mi svezzo da solo
Tracy Hogg
(not provided): About a year ago, Google decided that all users logged into Google (such as Google+, Gmail) would be redirected from http://google.com to https://google.com, thereby encrypting data. Google claims this move was done to protect users’ privacy. While the jury is still out on whether or not this move protects anyone’s privacy, one thing is for certain: web-based businesses and SEO experts have to jump through some hoops in order to get around the dreaded "(not provided)" keyword.
High frequency of nouns and verbs indicates density of information (Biber, 1988: 105). Adjectives elaborate nominal information. A density of content words,
Inserimento al nido = ”settling-in phase” (period in which children are gradually introduced to the nursery)
When we know the genre of texts, we can surmise the purpose of the text producer. When we have a high frequency of a string such as ”bambini aggressivi” in a specialized blog for parents, it probably means that users have a problem with aggressive children and they want to find suggestions on how to solve the problem and also find some empathy (they are not the only ones having this problem). So the negative adjective aggressivi does not convey a negative feeling but a positive energy to solve a problem. So this query conveys a positive emotion and the hope that something negative (an aggressive behaviour) can be changed or solved. These types of adjectives, which are negative at face value, should become positive because they express an emotion need that is a positive attitude.
What I am proposing here is a refinement of the concept presented in ”Topic-based Sentiment Analysis in the Social Media …” (Thelwall and Buckley, 2012): the polarity of affect words might flip according to genre (which gives us a hint of the purpose for which a text has been produced) and the communicative situation, and not only the topic. If somebody writes ”bambini aggressivi” in a FB post, the purpose is probably different, like witnessing children who behave aggressively.
Tweaking is not enough.
Not sure how to interpret this
Instead of applying general affect vocabulary, one easy way to identify the core affect vocabulary is to make an n-gram frequency list. In this example you can see the most frequent trigrams from genitoricrescono. You can see many repetitions because in these queries there are many function words (articles, prepositions, etc.). In order to get a cleaner list.
National HEALTH CARE / NHC
Not big data
Self-certification
My medical center contact
Mammography
German measles
Change healthcare ward
Webbisar = regulation on how to publish pictures of newborn babies on the web
I think that the VGR website genre is less emotional… There is possibly a difference in users’ emotional behaviour when specifying queries to a web search engine and when using a specialized website. In the first case, users communicate their emotions more extensively. In the second case, they just specify the single word that best represents their information need.
The expectation from a public service website is to be informative. VGR queries are specified by users who are in the website and use the internal search engine. Therefore they go straight to the point by specifying a single word without any descriptive part of speech. On the web, by contrast, users must describe in a more detailed way what they are looking for. It would have been interesting to compare how the same topics expressed in the VGR queries were specified in a forum or a medical blog.…