GoEmotions is a corpus of 58k carefully curated comments extracted from Reddit, annotated by human raters with 27 emotion categories or Neutral.

  • Number of examples: 58,009
  • Number of labels: 27 + Neutral
  • Maximum sequence length in training and evaluation datasets: 30

On top of the raw data, we include a version filtered based on rater agreement, which contains a train/test/validation split:

  • Size of training dataset: 43,410
  • Size of test dataset: 5,427
  • Size of validation dataset: 5,426
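
As a quick sanity check on the figures above, the following sketch adds up the three splits and compares the total with the raw example count. The variable names are illustrative only and are not part of the dataset release.

```python
# Split sizes and raw example count, copied from the figures listed above.
SPLIT_SIZES = {"train": 43_410, "test": 5_427, "validation": 5_426}
RAW_EXAMPLES = 58_009

# Total number of examples that survive the rater-agreement filter.
filtered_total = sum(SPLIT_SIZES.values())

# Examples present in the raw data but absent from the filtered version.
removed_by_filter = RAW_EXAMPLES - filtered_total

# Share of the filtered data assigned to each split.
split_fractions = {name: size / filtered_total for name, size in SPLIT_SIZES.items()}

print(f"filtered total: {filtered_total}")        # 54263
print(f"removed by filter: {removed_by_filter}")  # 3746
for name, fraction in split_fractions.items():
    print(f"{name}: {fraction:.1%}")
```

The splits come out at roughly 80% / 10% / 10% of the filtered data.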

The emotion categories are: admiration, amusement, anger, annoyance, approval, caring, confusion, curiosity, desire, disappointment, disapproval, disgust, embarrassment, excitement, fear, gratitude, grief, joy, love, nervousness, optimism, pride, realization, relief, remorse, sadness, surprise.
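
For classification code it is often convenient to map each category name to an integer id. The sketch below builds such a mapping from the list above. It assumes ids follow the listed (alphabetical) order with Neutral placed last; that ordering and the names `EMOTIONS`, `LABELS`, `LABEL_TO_ID`, and `ID_TO_LABEL` are assumptions for illustration, so check the label file shipped with the data before relying on specific ids.

```python
# The 27 emotion categories, in the order listed above.
EMOTIONS = [
    "admiration", "amusement", "anger", "annoyance", "approval", "caring",
    "confusion", "curiosity", "desire", "disappointment", "disapproval",
    "disgust", "embarrassment", "excitement", "fear", "gratitude", "grief",
    "joy", "love", "nervousness", "optimism", "pride", "realization",
    "relief", "remorse", "sadness", "surprise",
]

# Assumption: Neutral is appended after the 27 emotions.
LABELS = EMOTIONS + ["neutral"]

# Hypothetical name <-> id lookups derived from that ordering.
LABEL_TO_ID = {name: index for index, name in enumerate(LABELS)}
ID_TO_LABEL = {index: name for name, index in LABEL_TO_ID.items()}

print(len(EMOTIONS), len(LABELS))
print(LABEL_TO_ID["gratitude"], ID_TO_LABEL[0])
```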