SENTIMENT ANALYSIS IN E-COMMERCE USING
DEEP LEARNING
Aytaj Abdullayeva
Baku Higher Oil School
Baku, Azerbaijan
aytaj.abdulayeva.std@bhos.edu.az
Supervisor: Ph.D Associate Professor Kamala Pashayeva
Keywords: sentiment analysis, deep learning in e-commerce, review
analysis, e-commerce product reviews.
A myriad of people share their views online about the products they
bought or the restaurants they visited, on review websites or social
media. All this data stays on the Internet, and it becomes very
challenging to sift through it all to find out what is being said about an
organization or a product. The amount of available data keeps increasing
with the arrival of the big data era. The time retailers and customers
need to read and classify such large volumes of reviews is high, which
makes manual review inefficient.
THE 3rd INTERNATIONAL SCIENTIFIC CONFERENCES OF STUDENTS AND YOUNG RESEARCHERS
dedicated to the 99th anniversary of the National Leader of Azerbaijan Heydar Aliyev
The e-commerce industry should pay more attention to sentiment
analysis, which has a tremendous effect on understanding customer
needs. Sentiment analysis involves identifying the emotions, opinions,
and tone expressed in a text. The most important task is to determine
whether the feedback is positive or negative. An integral part of the
analysis is to understand not only the content or emotion of the feedback
but also which aspect of the product it concerns. Nowadays, the number of
e-commerce platforms using sentiment analysis is increasing rapidly due to
the need to understand consumers. Furthermore, sentiment analysis
techniques offer numerous benefits, such as processing exceedingly large
amounts of data within a short time and finding the weak points of
products (size mismatch for fashion items is one example) [1].
Traditional customer surveys still exist in the market, but the
growth of e-commerce has brought sentiment analysis to a more innovative
phase using NLP (Natural Language Processing). The first step is data
cleaning: removing missing values, stop words, digits, and unnecessary
symbols, followed by converting the whole text to lowercase.
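The cleaning steps listed above can be sketched as follows. This is a minimal illustration, not the pipeline used in this work: the function names `clean_review` and `clean_reviews` and the very short stop-word list are assumptions made for the example, and the text is lowercased first (rather than last) so that stop-word matching is case-insensitive.

```python
import re

# Hypothetical, deliberately tiny stop-word list; a real pipeline would
# use a fuller list from an NLP library.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "and", "or",
              "it", "this", "to", "of"}


def clean_review(text):
    """Return the cleaned review text, or None for a missing value."""
    if text is None or not text.strip():
        return None                          # missing value
    text = text.lower()                      # convert to lowercase
    text = re.sub(r"[^a-z\s]", " ", text)    # drop digits and symbols
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return " ".join(tokens)


def clean_reviews(reviews):
    """Clean a list of reviews, dropping missing or empty results."""
    cleaned = (clean_review(r) for r in reviews)
    return [c for c in cleaned if c]


sample = ["Great fit & fabric :)", None, "  ", "It was 100% worth it"]
print(clean_reviews(sample))
```

Reviews that are missing, blank, or made up only of stop words are dropped from the output list.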
There are various methods used for NLP; however, after testing several
of them (Bag of Words, BERT, Hugging Face libraries), XLNet was selected as
the best model. XLNet uses the advantages of auto-regressive (AR) methods
for training and does not rely on corrupting the input data (as BERT's
masking does), which allows it to outperform BERT. AR models are known to
perform well on NLP tasks. The dataset used to train the model contains
the review title, the review text, a binary value indicating whether the
product is recommended, and the star rating. The model is trained using
XLNet for 4 epochs, where one epoch is a complete pass of the whole
training data through the model. Throughout training, a decrease in the
loss (prediction error) was observed, which indicates that the model's
performance improved.
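The notions of an epoch and a falling loss can be illustrated without the actual model. The sketch below is not XLNet and not the training code used in this work: it swaps in a tiny logistic regression on made-up one-dimensional data (the feature, the data values, the learning rate, and the function names are all hypothetical) only to show one loss value being recorded per full pass over the training data.

```python
import math

# Hypothetical toy data: (feature, label). The feature could be read as
# "positive minus negative word count"; label 1 = recommended.
DATA = [(2.0, 1), (1.0, 1), (3.0, 1), (0.5, 1),
        (-1.0, 0), (-2.0, 0), (-0.5, 0), (-3.0, 0)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(data, epochs=4, lr=0.5):
    """Full-batch gradient descent; returns weights and per-epoch loss."""
    w, b = 0.0, 0.0
    losses = []
    n = len(data)
    for _ in range(epochs):                  # one epoch = one full pass
        grad_w = grad_b = total_loss = 0.0
        for x, y in data:
            p = sigmoid(w * x + b)
            # Binary cross-entropy for this example (prediction error)
            total_loss += -(y * math.log(p) + (1 - y) * math.log(1 - p))
            grad_w += (p - y) * x
            grad_b += (p - y)
        losses.append(total_loss / n)        # mean loss before the update
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b, losses


w, b, losses = train(DATA)
for epoch, loss in enumerate(losses, start=1):
    print(f"epoch {epoch}: mean loss = {loss:.4f}")
```

On this toy data the printed loss is smaller at every epoch, which is the kind of behavior described above for the real model.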
As our dataset is unbalanced, precision, recall, and F1-score (the
harmonic mean of the previous two) are more appropriate than accuracy for
evaluating model performance. Accuracy is dominated by the True
Negatives/Positives (correct predictions for the negative and positive
classes), which are not the focus. The key point is to highlight the
model's behavior on the False Negatives/Positives (wrong predictions for
the negative and positive classes) so that the cost of misclassification
can be reduced [2].
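A small numerical sketch shows why accuracy can mislead on unbalanced data. The labels below are invented for illustration (90 positive and 10 negative reviews, with a model that misses 8 of the 10 negatives) and are not results from this work; the function names are likewise assumptions. The minority negative class is treated as the class of interest.

```python
def confusion_counts(y_true, y_pred, target):
    """Count TP, FP, FN, TN with `target` as the class of interest."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == target and t == target:
            tp += 1
        elif p == target:
            fp += 1
        elif t == target:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and their harmonic mean (0.0 if undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical labels: 1 = positive review, 0 = negative review.
y_true = [1] * 90 + [0] * 10
y_pred = [1] * 90 + [1] * 8 + [0] * 2     # only 2 negatives detected

tp, fp, fn, tn = confusion_counts(y_true, y_pred, target=0)
accuracy = (tp + tn) / len(y_true)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

Here accuracy is 0.92 even though recall on the negative class is only 0.20, so the false negatives are hidden by the large number of correct predictions on the majority class. The two ratios used in the code are defined formally as follows.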
\[
\text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}},
\qquad
\text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}},
\]