SardiStance

stance detection in Italian tweets sardistance@evalita 2020

SardiStance is the first shared task on Stance Detection in Italian tweets. We have collected tweets written in Italian about the Sardines movement, and we invite participants to automatically detect their stance.

The task will be organized within EVALITA 2020, the 7th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian, which will be held **ONLINE on December 16th and 17th, 2020**, co-located with CLiC-it 2020 (Bologna, Italy, November 30th to December 3rd, 2020).

Organizers

Mirko Lai1

Alessandra T. Cignarella1,2

Cristina Bosco1

Viviana Patti1

Paolo Rosso2

1. Dipartimento di Informatica, Università degli Studi di Torino, Italy

2. PRHLT Research Center, Universitat Politècnica de València, Spain

Credits

Sergio Rabellino

ICT Staff, Dipartimento di Informatica, Università degli Studi di Torino, Italy


Introduction and Motivation

Recently, interest in monitoring people's stance towards particular targets has grown, leading to the creation of a novel area of investigation named Stance Detection (SD). Research on this topic has an impact on different areas such as public administration, policy-making, marketing strategies, and security. Through the constant monitoring of people's opinions, desires, complaints, and beliefs about the political agenda or public services, administrators could better meet the population's needs. For example, a practical application of SD could improve the automatic identification of extremist tendencies, or help understand and prevent citizens' dissatisfaction and frustration.

Stance Detection (target-specific stance classification) is the task of automatically determining whether the author of a text is in favor of, against, or neither (neutral/none) towards a given target. The first shared task on SD was held for English at SemEval 2016, i.e. Task 6 "Detecting Stance in Tweets" (Mohammad et al., 2016). It consisted of detecting the orientation, in favor or against, towards six different targets of interest: "Hillary Clinton", "Feminist Movement", "Legalization of Abortion", "Atheism", "Donald Trump", and "Climate Change is a Real Concern". A more recent evaluation of SD systems was proposed at IberEval 2017 for both Catalan and Spanish (Taulé et al., 2017), with a single target, "Independence of Catalonia". The same organizers proposed a similar shared task the following year at the IberEval 2018 evaluation campaign, on the target "Catalan First of October Referendum" (Taulé et al., 2018), encouraging multimodal approaches to Stance Detection.

Another, more general-purpose type of stance classification is open stance classification, usually indicated with the acronym SDQC, referring to the four categories used to indicate the attitude of a message with respect to a rumour: Support (S), Deny (D), Query (Q) and Comment (C) (Aker et al., 2017). Two shared tasks of this type were organized in recent years at SemEval, in the 2017 and 2019 editions, considering tweets along with Reddit posts in English (Derczynski et al., 2017; Gorrell et al., 2019).


Target Audience

The task is open to everyone from industry and academia. We particularly encourage the participation of researchers, industrial teams, and students.


Task Description

With this task proposal we invite participants to explore features based on the textual content of the tweet, such as structural, stylistic, and affective features, but also features based on contextual information that does not emerge directly from the text, such as knowledge about the domain of the political debate or information about the user's community.
Overall, we propose two different subtasks:

Task A - Textual Stance Detection:

The first subtask is a three-class classification task in which the system has to predict whether a tweet is in favour of, against, or neutral/none towards the given target (following the guidelines below), exploiting only textual information, i.e. the text of the tweet.

Task B - Contextual Stance Detection:

The second subtask is the same three-class classification task: the system has to predict whether a tweet is in favour of, against, or neutral/none towards the given target. Here, however, participants will also have access to a wider range of contextual information about the post, such as the number of retweets, the number of favourites, the type of posting source (e.g. iOS or Android), and the date of posting.

Furthermore, we will share (and encourage the exploitation of) contextual information about the user, such as the number of tweets ever posted, the user's bio (emojis only), the user's number of followers, and the user's number of friends. Additionally, we will share contextual information about the users' social network: friend, reply, retweet, and quote relations.
The users' personal IDs will be anonymized, but their network structures will be kept intact.
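To make the contextual release more concrete, one anonymized record might be organized roughly along these lines. This is only an illustrative sketch: the field names and nesting are hypothetical and are not the official distribution format.

```python
# Hypothetical shape of one anonymized record for Task B.
# Field names are illustrative, NOT the official distribution format.
record = {
    "tweet_id": "t_001",
    "user_id": "u_anon_42",           # anonymized user identifier
    "retweet_count": 17,
    "favorite_count": 53,
    "source": "Twitter for Android",  # posting client
    "created_at": "2020-01-15",
    "user": {
        "statuses_count": 1204,       # tweets ever posted
        "bio_emojis": ["🐟"],         # only the emojis from the user's bio
        "followers_count": 310,
        "friends_count": 287,
    },
    # network relations, given as lists of anonymized user ids
    "friends": ["u_anon_7", "u_anon_19"],
    "replies": ["u_anon_3"],
    "retweets": [],
    "quotes": [],
}
print(sorted(record["user"].keys()))
```

Note how the network relations reference only anonymized ids, so the graph structure is preserved while the users' identities are not.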



Data

The dataset will include short documents taken from Twitter. From May 29th it will be possible to download it from here: data repository. The files are encrypted. Please register to our Google Group to obtain the passwords.

The test set will be available in the same repository from September 18th, 2020. 


Evaluation

Each participating team will initially have access to the training data only. Later, the unlabelled test data will be released (see the timeframe below). After the assessment, the labels for the test data will be released as well.

The evaluation will be performed according to the standard metrics known in the literature (accuracy, precision, recall, and F1-score). We will provide official rankings for Task A and for Task B, with two separate rankings for constrained and unconstrained runs. Systems will be evaluated using the F1-score computed over the two main classes (favour and against): submissions will be ranked by the F1-score averaged over these two classes.
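The ranking score described above can be computed, for instance, as follows. This is a minimal sketch, not the official scorer; the function names and label strings are illustrative.

```python
def f1(gold, pred, label):
    """Per-class F1: harmonic mean of precision and recall for one label."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    pred_pos = sum(1 for p in pred if p == label)   # predicted as `label`
    gold_pos = sum(1 for g in gold if g == label)   # actually `label`
    if tp == 0:
        return 0.0
    precision = tp / pred_pos
    recall = tp / gold_pos
    return 2 * precision * recall / (precision + recall)

def f_avg(gold, pred):
    """Ranking score: mean of F1(favour) and F1(against).
    The neutral/none class is ignored in the average."""
    return (f1(gold, pred, "FAVOUR") + f1(gold, pred, "AGAINST")) / 2

gold = ["FAVOUR", "AGAINST", "NONE", "AGAINST"]
pred = ["FAVOUR", "AGAINST", "AGAINST", "NONE"]
print(f_avg(gold, pred))  # -> 0.75
```

Note that a system can still be penalized indirectly on the neutral/none class: misclassifying a neutral tweet as favour or against lowers the precision of the two scored classes.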

Baselines

For both subtasks, we will compute a first baseline using a simple machine learning model based on an SVM with unigram features. As a second baseline, we will provide a measure computed by a model based on the authors' previous work (MultiTACOS) [Lai, M., Cignarella, A. T., Hernández Farías, D. I., Bosco, C., Patti, V., and Rosso, P. Multilingual Stance Detection in Social Media: Four Political Debates on Twitter. Computer Speech and Language 63 (2020)].
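An SVM over unigram counts can be put together in a few lines with scikit-learn; the sketch below is illustrative and is not the organizers' exact implementation. The toy tweets and labels stand in for the real training data.

```python
# Sketch of an SVM + unigram baseline (illustrative, not the official one).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy examples standing in for the real training tweets.
texts = ["viva le sardine", "contro le sardine", "oggi piove",
         "sardine in piazza", "basta sardine", "che giornata"]
labels = ["FAVOUR", "AGAINST", "NONE", "FAVOUR", "AGAINST", "NONE"]

# Unigram bag-of-words features fed into a linear SVM.
baseline = make_pipeline(CountVectorizer(ngram_range=(1, 1)), LinearSVC())
baseline.fit(texts, labels)

print(baseline.predict(["viva le sardine in piazza"]))
```

A constrained run of this kind uses only the distributed training data; richer variants (character n-grams, affective lexicons, contextual features for Task B) follow the same pipeline pattern.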


How To Participate

Register your team by using the registration web form at http://www.evalita.it/2020 (available soon, see timeframe below).

Information about the submission of results and their format will be available in the Task guidelines.

We invite potential participants to subscribe to our Google Group in order to be kept up to date with the latest news about the task. Please share comments and questions on the mailing list; the organizers will assist you with any issues that may arise.

Participants will be required to provide an abstract and a technical report including a brief description of their approach, an illustration of their experiments (in particular the techniques and resources used), and an analysis of their results, for publication in the task proceedings.

Important Dates

29th May 2020: training data available (data repository)

18th September 2020: test data available

18th September 2020 - 2nd October 2020: evaluation window

2nd October 2020: systems results due to the organizers

16th October 2020: deadline for submission of system description papers

6th November 2020: technical reports due to organizers (camera-ready)

27th November 2020: video presentations sent to the EVALITA chair


16th-17th December 2020: final workshop (online)

Official Rankings

Task A

| ranking | team name | framework | f-avg | precision_against | precision_favour | precision_none | recall_against | recall_favour | recall_none | f_against | f_favour | f_none |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | UNITOR | unconstrained | 0.6853 | 0.8150 | 0.5916 | 0.3436 | 0.7601 | 0.5765 | 0.4535 | 0.7866 | 0.5840 | 0.3910 |
| 2 | UNITOR | constrained | 0.6801 | 0.8343 | 0.5187 | 0.3659 | 0.7466 | 0.6378 | 0.4360 | 0.7881 | 0.5721 | 0.3979 |
| 3 | UNITOR | constrained | 0.6793 | 0.8183 | 0.5240 | 0.3571 | 0.7709 | 0.6122 | 0.3779 | 0.7939 | 0.5647 | 0.3672 |
| 4 | DeepReading | constrained | 0.6621 | 0.8469 | 0.5060 | 0.3500 | 0.6860 | 0.6429 | 0.5291 | 0.7580 | 0.5663 | 0.4213 |
| 5 | UNITOR | unconstrained | 0.6606 | 0.8167 | 0.5388 | 0.3156 | 0.7264 | 0.5663 | 0.4477 | 0.7689 | 0.5522 | 0.3702 |
| 6 | IXA | constrained | 0.6473 | 0.8209 | 0.5117 | 0.3255 | 0.7102 | 0.5561 | 0.4826 | 0.7616 | 0.5330 | 0.3888 |
| 7 | GhostWriter | constrained | 0.6257 | 0.8106 | 0.4709 | 0.3226 | 0.6981 | 0.5357 | 0.4651 | 0.7502 | 0.5012 | 0.3810 |
| 8 | IXA | constrained | 0.6171 | 0.8459 | 0.3947 | 0.3349 | 0.6806 | 0.6122 | 0.4070 | 0.7543 | 0.4800 | 0.3675 |
| 9 | SSNCSE-NLP | constrained | 0.6067 | 0.7506 | 0.4245 | 0.2679 | 0.7951 | 0.4592 | 0.1744 | 0.7723 | 0.4412 | 0.2113 |
| 10 | DeepReading | constrained | 0.6004 | 0.8387 | 0.4286 | 0.3069 | 0.5957 | 0.6122 | 0.5407 | 0.6966 | 0.5042 | 0.3916 |
| 11 | GhostWriter | constrained | 0.6004 | 0.8094 | 0.4772 | 0.2921 | 0.6523 | 0.4796 | 0.5349 | 0.7224 | 0.4784 | 0.3778 |
| 12 | UninaStudents | constrained | 0.5886 | 0.7300 | 0.4348 | 0.3488 | 0.8491 | 0.3571 | 0.1744 | 0.7850 | 0.3922 | 0.2326 |
| | baseline | | 0.5784 | 0.7549 | 0.3975 | 0.2589 | 0.6806 | 0.4949 | 0.2965 | 0.7158 | 0.4409 | 0.2764 |
| 13 | TextWiller | constrained | 0.5773 | 0.7306 | 0.3707 | 0.3333 | 0.8261 | 0.3878 | 0.1279 | 0.7755 | 0.3791 | 0.1849 |
| 14 | SSNCSE-NLP | constrained | 0.5749 | 0.7798 | 0.3664 | 0.3196 | 0.6873 | 0.4898 | 0.3605 | 0.7307 | 0.4192 | 0.3388 |
| 15 | QMUL-SDS | constrained | 0.5595 | 0.7990 | 0.3135 | 0.2500 | 0.6375 | 0.5918 | 0.2151 | 0.7091 | 0.4099 | 0.2313 |
| 16 | QMUL-SDS | constrained | 0.5329 | 0.8114 | 0.3109 | 0.2744 | 0.5391 | 0.6378 | 0.3430 | 0.6478 | 0.4181 | 0.3049 |
| 17 | MeSoVe | constrained | 0.4989 | 0.7287 | 0.2684 | 0.3155 | 0.7385 | 0.2602 | 0.3081 | 0.7336 | 0.2642 | 0.3118 |
| 18 | TextWiller | constrained | 0.4715 | 0.7804 | 0.4286 | 0.1983 | 0.5889 | 0.1990 | 0.5291 | 0.6713 | 0.2718 | 0.2884 |
| 19 | SSN_NLP | constrained | 0.4707 | 0.7763 | 0.3030 | 0.2453 | 0.4582 | 0.4592 | 0.5349 | 0.5763 | 0.3651 | 0.3364 |
| 20 | SSN_NLP | constrained | 0.4473 | 0.6848 | 0.2194 | 0.1804 | 0.6267 | 0.2653 | 0.2035 | 0.6545 | 0.2402 | 0.1913 |
| 21 | Venses | constrained | 0.3882 | 0.6574 | 0.1800 | 0.1907 | 0.4474 | 0.3776 | 0.2151 | 0.5325 | 0.2438 | 0.2022 |
| 22 | Venses | constrained | 0.3637 | 0.7011 | 0.1881 | 0.2024 | 0.3383 | 0.4847 | 0.2907 | 0.4564 | 0.2710 | 0.2387 |

Task B

| ranking | team name | framework | f-avg | precision_against | precision_favour | precision_none | recall_against | recall_favour | recall_none | f_against | f_favour | f_none |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | IXA | constrained | 0.7445 | 0.8539 | 0.6009 | 0.4589 | 0.8585 | 0.6684 | 0.3895 | 0.8562 | 0.6329 | 0.4214 |
| 2 | TextWiller | constrained | 0.7309 | 0.8617 | 0.5344 | 0.3520 | 0.8396 | 0.7143 | 0.2558 | 0.8505 | 0.6114 | 0.2963 |
| 3 | DeepReading | constrained | 0.7230 | 0.8594 | 0.5370 | 0.3624 | 0.8154 | 0.7041 | 0.3140 | 0.8368 | 0.6093 | 0.3364 |
| 4 | DeepReading | constrained | 0.7222 | 0.8481 | 0.5612 | 0.4383 | 0.8127 | 0.6786 | 0.4128 | 0.8300 | 0.6143 | 0.4251 |
| 5 | TextWiller | constrained | 0.7147 | 0.8967 | 0.6243 | 0.2931 | 0.7722 | 0.5765 | 0.4942 | 0.8298 | 0.5995 | 0.3680 |
| 6 | QMUL-SDS | constrained | 0.7088 | 0.8441 | 0.4852 | 0.2581 | 0.8100 | 0.7551 | 0.1395 | 0.8267 | 0.5908 | 0.1811 |
| 7 | UNED | constrained | 0.6888 | 0.8458 | 0.4961 | 0.2531 | 0.7911 | 0.6429 | 0.2384 | 0.8175 | 0.5600 | 0.2455 |
| 8 | QMUL-SDS | constrained | 0.6765 | 0.8282 | 0.4167 | 0.4706 | 0.7992 | 0.7653 | 0.0930 | 0.8134 | 0.5396 | 0.1553 |
| 9 | SSNCSE-NLP | constrained | 0.6582 | 0.8321 | 0.4715 | 0.3508 | 0.7547 | 0.5918 | 0.3895 | 0.7915 | 0.5249 | 0.3691 |
| 10 | SSNCSE-NLP | constrained | 0.6556 | 0.8419 | 0.4574 | 0.3660 | 0.7466 | 0.6020 | 0.4128 | 0.7914 | 0.5198 | 0.3880 |
| | baseline | constrained | 0.6284 | 0.7845 | 0.4506 | 0.3054 | 0.7507 | 0.5357 | 0.2965 | 0.7672 | 0.4895 | 0.3009 |
| 11 | GhostWriter | constrained | 0.6257 | 0.8106 | 0.4709 | 0.3226 | 0.6981 | 0.5357 | 0.4651 | 0.7502 | 0.5012 | 0.3810 |
| 12 | GhostWriter | constrained | 0.6004 | 0.8094 | 0.4772 | 0.2921 | 0.6523 | 0.4796 | 0.5349 | 0.7224 | 0.4784 | 0.3778 |
| 13 | UNED | constrained | 0.5313 | 0.7148 | 0.3409 | 0.2246 | 0.7668 | 0.3061 | 0.1802 | 0.7399 | 0.3226 | 0.2000 |

Contact Us

Write to us on the Google Group, or contact the two main organizers: Mirko Lai, mirko.lai@unito.it and Alessandra T. Cignarella, cigna@di.unito.it.