
BIG DATA EMOTION CLASSIFICATION

Abstract

Emotion makes up a major part of our daily lives: it shapes our decisions and influences our moods and behaviors. Persistent negative emotions can adversely affect people's physical and mental health and their work, whereas positive emotions can enhance subjective well-being and promote physical and mental health. This motivates the question of whether positive and negative emotion can be classified automatically by a model, i.e., whether a sentence contains enough information for a computer to make a sentiment judgment. Using a large amount of text data, I build an automatic emotion classification model that reads sentences and distinguishes those with negative emotions from those with positive emotions. Specifically, I study a penalized logistic regression model with the Stanford movie review data as input. The model is evaluated with the AUC metric and achieves a promising out-of-sample score of 0.96.
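As a rough illustration of the approach described above, the following minimal sketch fits a penalized logistic regression on text and scores it with AUC using scikit-learn. Several things here are assumptions rather than details from the study: the TF-IDF features, the L2 penalty (scikit-learn's default) with `C=1.0`, and the tiny hand-written corpus, which stands in for the Stanford movie review data only so that the example runs on its own.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import make_pipeline

# Hypothetical toy corpus (NOT the Stanford data); 1 = positive, 0 = negative.
train_texts = [
    "a wonderful and moving film",
    "great acting and a great story",
    "I loved every minute of it",
    "a terrible and boring film",
    "awful acting and a dull story",
    "I hated every minute of it",
]
train_labels = [1, 1, 1, 0, 0, 0]

test_texts = [
    "a wonderful story with great acting",
    "a boring film with awful acting",
    "I loved it",
    "I hated it",
]
test_labels = [1, 0, 1, 0]

# Assumed feature choice: TF-IDF bag of words.
# LogisticRegression applies an L2 penalty by default; C is the inverse
# regularization strength (smaller C means a stronger penalty).
model = make_pipeline(
    TfidfVectorizer(),
    LogisticRegression(C=1.0, max_iter=1000),
)
model.fit(train_texts, train_labels)

# AUC is computed from the predicted probability of the positive class,
# not from hard 0/1 predictions.
scores = model.predict_proba(test_texts)[:, 1]
auc = roc_auc_score(test_labels, scores)
print(f"out-of-sample AUC on toy data: {auc:.2f}")
```

On real data, the penalty strength `C` would typically be chosen by cross-validation on the training set, and the reported AUC would be computed on a held-out test set, as in the out-of-sample score quoted above.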

Keywords

Logistic regression
big data
sentiment classification

References:

  1. Larry A. Wasserman. All of Statistics: A Concise Course in Statistical Inference. 2004.
  2. David J. Lilja. Linear Regression Using R: An Introduction to Data Modeling. 2016.
  3. Sentiment Analysis. https://ai.stanford.edu/~amaas/data/sentiment/
  4. Learning Word Vectors for Sentiment Analysis. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2011.
  5. Scikit-Learn: Machine Learning in Python. Scikit-Learn 1.0 Documentation. https://scikit-learn.org/stable
  6. Classification: ROC Curve and AUC. Machine Learning Crash Course, Google Developers. https://developers.google.com/machine-learning/crash-course/classification/roc-and-auc
