Aspect Level Sentiment Analysis for Arabic Language
Cairo University
Institute of Statistical Studies and Research
Department of Computer and Information Science
BY:
Mahmoud Mohamed Hassan Mahmoud El Razzaz
A proposal for a thesis to be submitted in fulfillment of the requirements for the
M.Sc. degree in Computer Science.
Supervised by
Prof. Dr. Hesham Hefny
Dr. Mohamed Farouk
Cairo, Egypt
October 2013
1. INTRODUCTION
Sentiment analysis, also called opinion mining, is the field of study that analyzes
people’s opinions, sentiments, evaluations, appraisals, attitudes, and emotions towards
entities such as products, services, organizations, individuals, issues, events, topics, and
their attributes. It represents a large problem space. There are also many names and
slightly different tasks, e.g., sentiment analysis, opinion mining, opinion extraction,
sentiment mining, subjectivity analysis, affect analysis, emotion analysis, review mining,
etc. However, they are now all under the umbrella of sentiment analysis or opinion
mining. In industry, the term sentiment analysis is more commonly used, while in
academia both sentiment analysis and opinion mining are frequently employed. They
basically represent the same field of study.
Although linguistics and natural language processing (NLP) have a long history, little
research had been done about people’s opinions and sentiments before the year 2000.
Since then, the field has become a very active research area. There are several reasons
for this. First, it has a wide range of applications, in almost every domain. The industry
surrounding sentiment analysis has also flourished due to the proliferation of
commercial applications. This provides a strong motivation for research. Second, it
offers many challenging research problems, which had never been studied before.
Third, for the first time in human history, we now have a huge volume of opinionated
data in the social media on the Web. Without this data, a lot of research would not have
been possible. Not surprisingly, the inception and the rapid growth of sentiment analysis
coincide with those of the social media. In fact, sentiment analysis is now right at the
center of the social media research. Hence, research in sentiment analysis not only has
an important impact on NLP, but may also have a profound impact on management
sciences, political science, economics, and social sciences as they are all affected by
people’s opinions.
2. The research problem (problem definition)
With the explosive growth of social media (e.g., reviews, forum discussions, blogs,
micro-blogs, Twitter, comments, and postings in social network sites) on the Web,
individuals and organizations are increasingly using the content in these media for
decision making. Nowadays, if one wants to buy a consumer product, one is no longer
limited to asking one’s friends and family for opinions because there are many user
reviews and discussions in public forums on the Web about the product. For an
organization, it may no longer be necessary to conduct surveys, opinion polls, and focus
groups in order to gather public opinions because there is an abundance of such
information publicly available. However, finding and monitoring opinion sites on the
Web and distilling the information contained in them remains a formidable task because
of the proliferation of diverse sites. Each site typically contains a huge volume of opinion
text that is not always easily deciphered in long blogs and forum postings. The average
human reader will have difficulty identifying relevant sites and extracting and
summarizing the opinions in them.
3. The objectives of the study
This study aims to assess the feasibility of constructing an automated sentiment
classification system, to determine the best accuracy that can be obtained from such a
system if it is feasible, and to examine the effect of the domain of the data on the
accuracy of the classification.
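The effect of the data domain on classification accuracy can be examined by computing accuracy separately for each domain. The following is only a minimal sketch under stated assumptions, not the evaluation procedure of this study: the function name accuracy_by_domain, the record layout (domain, gold label, predicted label), and the toy records are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


def accuracy_by_domain(records):
    """Return {domain: accuracy} for (domain, gold, predicted) records.

    The record layout is an assumption made for this sketch.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for domain, gold, predicted in records:
        total[domain] += 1
        if gold == predicted:
            correct[domain] += 1
    # Accuracy = correctly classified items / all items, per domain.
    return {domain: correct[domain] / total[domain] for domain in total}


# Hypothetical toy predictions, not real experimental results.
toy_records = [
    ("books", "positive", "positive"),
    ("books", "negative", "positive"),
    ("hotels", "negative", "negative"),
    ("hotels", "positive", "positive"),
]

per_domain = accuracy_by_domain(toy_records)
print(per_domain)
```

Comparing the per-domain values produced this way would show whether a classifier that performs well in one domain degrades in another.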
4. Related work / Literature review
Various machine learning and non-machine learning techniques have been used for
classifying sentiment in English-language texts. Many of these techniques are discussed
in Bing Liu, “Sentiment Analysis and Opinion Mining”.
In the past few years, many researchers have started to apply sentiment classification
to the Arabic language, for example:
Document Level Sentiment Classification for Arabic Language:
Mohamed Elarnaoty et al., who provided “A machine learning approach for opinion
holder extraction in Arabic language”, 2012 [1], and Mohamed Aly and Amir Atiya, who
provided “A Large Scale Arabic Book reviews Data Set”, 2013 [2].
Sentence Level Sentiment Classification for Arabic Language:
N. Farra et al., in “Sentence-Level and Document-Level Sentiment mining for Arabic Texts”,
in Proceedings of the International Conference on Data Mining Workshops, pages 1114-1119, IEEE, 2010 [3].
Aspect Level Sentiment Classification for Arabic Language:
Some researchers have built aspect-level sentiment classifiers for the English language, as
in Tun Thura Thet et al., “Aspect-based sentiment analysis of movie reviews on
discussion boards”, Journal of Information Science, 2010 [4].
However, to the best of our knowledge, aspect-level sentiment classification has not
yet been examined for the Arabic language.
Finally, some researchers have surveyed the work done so far on subjectivity and
sentiment analysis (SSA) of Arabic and its key issues:
A survey of subjectivity and sentiment analysis of Arabic was presented by Mohamed
Korayem et al. in “Subjectivity and Sentiment Analysis of Arabic: A Survey”, 2012 [5].
Furthermore, the difficulties of applying sentiment classification to the Arabic language were
discussed by Soha Ahmed et al. in “Key Issues in Conducting Sentiment Analysis on Arabic
Social Media Text”, 2012 [6].
SSA for the Arabic language has also been applied in the domain of social media by
Muhammad Abdul-Mageed et al. in “SAMAR: Subjectivity and sentiment analysis for
Arabic social media” [7].
5. Work plan
1. Overview of data collection
2. Overview of data preprocessing (entity extraction, entity categorization, feature
selection, and feature extraction)
3. Overview of sentiment analysis levels and techniques
4. The proposed approach for sentiment analysis: aspect-level sentiment classification.
5. Testing the proposed approach and comparing the results with related work.
6. Conclusion and future work.
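To make the idea of aspect-level classification in step 4 concrete, the following is a minimal lexicon-based sketch. It is not the approach proposed in this study, which is yet to be developed. The names ASPECT_TERMS, SENTIMENT_LEXICON, and classify_aspects, the tiny word lists, and the one-token context window are all assumptions made for illustration; a real system would need proper Arabic tokenization, morphological analysis, and curated lexicons.

```python
# Hypothetical toy resources for illustration only.
ASPECT_TERMS = {"الكاميرا": "camera", "البطارية": "battery"}
SENTIMENT_LEXICON = {"ممتازة": 1, "سيئة": -1}  # "excellent", "bad"


def classify_aspects(sentence, window=1):
    """Assign a polarity to each aspect term found in the sentence.

    Sums the lexicon scores of tokens within `window` positions of the
    aspect term. This is a crude heuristic, not a trained classifier.
    """
    tokens = sentence.split()  # naive whitespace tokenization
    results = {}
    for i, token in enumerate(tokens):
        if token not in ASPECT_TERMS:
            continue
        start = max(0, i - window)
        nearby = tokens[start:i + window + 1]
        score = sum(SENTIMENT_LEXICON.get(t, 0) for t in nearby)
        if score > 0:
            label = "positive"
        elif score < 0:
            label = "negative"
        else:
            label = "neutral"
        results[ASPECT_TERMS[token]] = label
    return results


# "The camera is excellent but the battery is bad."
review = "الكاميرا ممتازة لكن البطارية سيئة"
aspect_sentiments = classify_aspects(review)
print(aspect_sentiments)
```

A document-level classifier would have to give this review a single label, whereas the aspect-level view keeps the positive opinion about the camera separate from the negative opinion about the battery.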
References:
[1] Mohamed Elarnaoty, Samir AbdelRahman, and Aly Fahmy: “A machine learning
approach for opinion holder extraction in Arabic language”, 2012.
[2] Mohamed Aly and Amir Atiya: “A Large Scale Arabic Book reviews Data Set”, 2013.
[3] N. Farra, E. Challita, R. Assi, and H. Hajj: “Sentence-Level and Document-Level
Sentiment mining for Arabic Texts”, in Proceedings of the International Conference on Data
Mining Workshops, pages 1114-1119, IEEE, 2010.
[4] Tun Thura Thet, Jin-Cheon Na, and Christopher S.G. Khoo: “Aspect-based sentiment
analysis of movie reviews on discussion boards”, Journal of Information Science, 2010.
[5] Mohamed Korayem et al.: “Subjectivity and Sentiment Analysis of Arabic: A
Survey”, 2012.
[6] Soha Ahmed, Michel Pasquier, and Ghassan Qadah: “Key Issues in Conducting
Sentiment Analysis on Arabic Social Media Text”, 2012.
[7] Muhammad Abdul-Mageed, Mona Diab, and Sandra Kübler: “SAMAR: Subjectivity and
sentiment analysis for Arabic social media”.