SardiStance

stance detection in Italian tweets sardistance@evalita 2020

SardiStance is the first shared task on Stance Detection in Italian tweets. We have collected tweets written in Italian about the Sardines movement, and we invite participants to automatically detect their stance.

The task will be organized within EVALITA 2020, the 7th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian, which will be held **ONLINE on December 16th and 17th, 2020**, co-located with CLiC-it 2020 (Bologna, Italy, November 30th to December 3rd, 2020).

Organizers

Mirko Lai1

Alessandra T. Cignarella1,2

Cristina Bosco1

Viviana Patti1

Paolo Rosso2

1. Dipartimento di Informatica, Università degli Studi di Torino, Italy

2. PRHLT Research Center, Universitat Politècnica de València, Spain

Credits

Sergio Rabellino

ICT Staff, Dipartimento di Informatica, Università degli Studi di Torino, Italy


Introduction and Motivation

Recently, interest in monitoring people's stance towards particular targets has grown, leading to the creation of a novel area of investigation named Stance Detection (SD). Research on this topic has an impact on different areas such as public administration, policy-making, marketing strategies, and security. Through the constant monitoring of people's opinions, desires, complaints, and beliefs about the political agenda or public services, administrators could better meet the population's needs. For example, a practical application of SD could improve the automatic identification of extremist tendencies, or help understand and prevent citizens' dissatisfaction and frustration.

Stance Detection (target-specific stance classification) is the task of automatically determining whether the author of a text is in favor of, against, or neither (neutral/none) towards a given target. The first shared task on SD was held for English at SemEval 2016, i.e. Task 6 "Detecting Stance in Tweets" (Mohammad et al., 2016). It consisted of detecting the orientation, in favor or against, towards six different targets of interest: "Hillary Clinton", "Feminist Movement", "Legalization of Abortion", "Atheism", "Donald Trump", and "Climate Change is a Real Concern". A more recent evaluation of SD systems was proposed at IberEval 2017 for both Catalan and Spanish (Taulé et al., 2017), with a single target, "Independence of Catalonia". The same organizers proposed a similar shared task the following year at the IberEval 2018 evaluation campaign, on the target "Catalan First of October Referendum" (Taulé et al., 2018), encouraging multimodal approaches to Stance Detection.

Another, more general-purpose type of stance classification is open stance classification, usually indicated with the acronym SDQC, referring to the four categories used to indicate the attitude of a message with respect to a rumour: Support (S), Deny (D), Query (Q) and Comment (C) (Aker et al., 2017). Two shared tasks of this type were organized in recent years at SemEval, in the 2017 and 2019 editions, considering tweets along with Reddit posts in English (Derczynski et al., 2017; Gorrell et al., 2019).


Target Audience

The task is open to everyone from industry and academia. We particularly encourage the participation of researchers, industrial teams, and students.


Task Description

With this task proposal we invite participants to explore features based on the textual content of the tweet, such as structural, stylistic, and affective features, but also features based on contextual information that does not emerge directly from the text, such as knowledge about the domain of the political debate or information about the user's community.
Overall, we propose two different subtasks:

Task A - Textual Stance Detection:

The first subtask is a three-class classification task in which the system has to predict whether a tweet is in favour of, against, or neutral/none towards the given target (following the guidelines below), exploiting only textual information, i.e. the text of the tweet.

Task B - Contextual Stance Detection:

The second subtask is the same three-class classification task: the system has to predict whether a tweet is in favour of, against, or neutral/none towards the given target. Here, however, participants will also have access to a wider range of contextual information about the post, such as the number of retweets, the number of favourites, the type of posting source (e.g. iOS or Android), and the date of posting.

Furthermore, we will share (and encourage the exploitation of) contextual information about the user, such as the number of tweets ever posted, the user's bio (emojis only), the user's number of followers, and the user's number of friends. Additionally, we will share contextual information about the users' social network: friend, reply, retweet, and quote relations.
The users' personal IDs will be anonymized, but their network structures will be kept intact.
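To make the contextual release more concrete, one anonymized record might be organized roughly along these lines. This is only an illustrative sketch: the field names and nesting are hypothetical and are not the official distribution format.

```python
# Hypothetical shape of one anonymized record for Task B.
# Field names are illustrative, NOT the official distribution format.
record = {
    "tweet_id": "t_001",
    "user_id": "u_anon_42",           # anonymized user identifier
    "retweet_count": 17,
    "favorite_count": 53,
    "source": "Twitter for Android",  # posting client
    "created_at": "2020-01-15",
    "user": {
        "statuses_count": 1204,       # tweets ever posted
        "bio_emojis": ["🐟"],         # only the emojis from the user's bio
        "followers_count": 310,
        "friends_count": 287,
    },
    # network relations, given as lists of anonymized user ids
    "friends": ["u_anon_7", "u_anon_19"],
    "replies": ["u_anon_3"],
    "retweets": [],
    "quotes": [],
}
print(sorted(record["user"].keys()))
```

Note how the network relations reference only anonymized ids, so the graph structure is preserved while the users' identities are not.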



Data

The dataset will include short documents taken from Twitter. From May 29th it will be possible to download it from here: data repository. The files are encrypted. Please register to our Google Group to obtain the passwords.

The test set will be available in the same repository from September 18th, 2020. 


Evaluation

Each participating team will initially have access to the training data only. Later, the unlabelled test data will be released (see the timeframe below). After the assessment, the labels for the test data will be released as well.

The evaluation will be performed according to the standard metrics known in the literature (accuracy, precision, recall, and F1-score). We will provide official rankings for Task A and for Task B, with two separate rankings for constrained and unconstrained runs. Systems will be evaluated using the F1-score computed over the two main classes (favour and against): submissions will be ranked by the F1-score averaged over these two classes.
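The ranking score described above can be computed, for instance, as follows. This is a minimal sketch, not the official scorer; the function names and label strings are illustrative.

```python
def f1(gold, pred, label):
    """Per-class F1: harmonic mean of precision and recall for one label."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    pred_pos = sum(1 for p in pred if p == label)   # predicted as `label`
    gold_pos = sum(1 for g in gold if g == label)   # actually `label`
    if tp == 0:
        return 0.0
    precision = tp / pred_pos
    recall = tp / gold_pos
    return 2 * precision * recall / (precision + recall)

def f_avg(gold, pred):
    """Ranking score: mean of F1(favour) and F1(against).
    The neutral/none class is ignored in the average."""
    return (f1(gold, pred, "FAVOUR") + f1(gold, pred, "AGAINST")) / 2

gold = ["FAVOUR", "AGAINST", "NONE", "AGAINST"]
pred = ["FAVOUR", "AGAINST", "AGAINST", "NONE"]
print(f_avg(gold, pred))  # -> 0.75
```

Note that a system can still be penalized indirectly on the neutral/none class: misclassifying a neutral tweet as favour or against lowers the precision of the two scored classes.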

Baselines

For both subtasks, we will compute a first baseline using a simple machine learning model based on an SVM with unigram features. As a second baseline, we will provide a measure computed by a model based on the authors' previous work (MultiTACOS) [Lai, M., Cignarella, A. T., Hernández Farías, D. I., Bosco, C., Patti, V., and Rosso, P. Multilingual Stance Detection in Social Media: Four Political Debates on Twitter. Computer Speech and Language 63 (2020)].
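An SVM over unigram counts can be put together in a few lines with scikit-learn; the sketch below is illustrative and is not the organizers' exact implementation. The toy tweets and labels stand in for the real training data.

```python
# Sketch of an SVM + unigram baseline (illustrative, not the official one).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy examples standing in for the real training tweets.
texts = ["viva le sardine", "contro le sardine", "oggi piove",
         "sardine in piazza", "basta sardine", "che giornata"]
labels = ["FAVOUR", "AGAINST", "NONE", "FAVOUR", "AGAINST", "NONE"]

# Unigram bag-of-words features fed into a linear SVM.
baseline = make_pipeline(CountVectorizer(ngram_range=(1, 1)), LinearSVC())
baseline.fit(texts, labels)

print(baseline.predict(["viva le sardine in piazza"]))
```

A constrained run of this kind uses only the distributed training data; richer variants (character n-grams, affective lexicons, contextual features for Task B) follow the same pipeline pattern.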


How To Participate

Register your team by using the registration web form at http://www.evalita.it/2020 (available soon, see timeframe below).

Information about the submission of results and their format will be available in the Task guidelines.

We invite potential participants to subscribe to our Google Group in order to be kept up to date with the latest news about the task. Please share comments and questions on the mailing list; the organizers will assist you with any issues that may arise.

Participants will be required to provide an abstract and a technical report including a brief description of their approach, an illustration of their experiments (in particular the techniques and resources used), and an analysis of their results, for publication in the task proceedings.

Important Dates

29th May 2020: training data available (data repository)

18th September 2020: test data available

18th September 2020 - 2nd October 2020: evaluation window

2nd October 2020: systems results due to the organizers

16th October 2020: deadline for submission of system description papers

6th November 2020: technical reports due to organizers (camera-ready)

27th November 2020: video presentations sent to the EVALITA chair


16th-17th December 2020: final workshop (online)

Official Rankings

Task A

| ranking | team name | framework | f-avg | precision_against | precision_favour | precision_none | recall_against | recall_favour | recall_none | f_against | f_favour | f_none |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | UNITOR | unconstrained | 0.6853 | 0.8150 | 0.5916 | 0.3436 | 0.7601 | 0.5765 | 0.4535 | 0.7866 | 0.5840 | 0.3910 |
| 2 | UNITOR | constrained | 0.6801 | 0.8343 | 0.5187 | 0.3659 | 0.7466 | 0.6378 | 0.4360 | 0.7881 | 0.5721 | 0.3979 |
| 3 | UNITOR | constrained | 0.6793 | 0.8183 | 0.5240 | 0.3571 | 0.7709 | 0.6122 | 0.3779 | 0.7939 | 0.5647 | 0.3672 |
| 4 | DeepReading | constrained | 0.6621 | 0.8469 | 0.5060 | 0.3500 | 0.6860 | 0.6429 | 0.5291 | 0.7580 | 0.5663 | 0.4213 |
| 5 | UNITOR | unconstrained | 0.6606 | 0.8167 | 0.5388 | 0.3156 | 0.7264 | 0.5663 | 0.4477 | 0.7689 | 0.5522 | 0.3702 |
| 6 | IXA | constrained | 0.6473 | 0.8209 | 0.5117 | 0.3255 | 0.7102 | 0.5561 | 0.4826 | 0.7616 | 0.5330 | 0.3888 |
| 7 | GhostWriter | constrained | 0.6257 | 0.8106 | 0.4709 | 0.3226 | 0.6981 | 0.5357 | 0.4651 | 0.7502 | 0.5012 | 0.3810 |
| 8 | IXA | constrained | 0.6171 | 0.8459 | 0.3947 | 0.3349 | 0.6806 | 0.6122 | 0.4070 | 0.7543 | 0.4800 | 0.3675 |
| 9 | SSNCSE-NLP | constrained | 0.6067 | 0.7506 | 0.4245 | 0.2679 | 0.7951 | 0.4592 | 0.1744 | 0.7723 | 0.4412 | 0.2113 |
| 10 | DeepReading | constrained | 0.6004 | 0.8387 | 0.4286 | 0.3069 | 0.5957 | 0.6122 | 0.5407 | 0.6966 | 0.5042 | 0.3916 |
| 11 | GhostWriter | constrained | 0.6004 | 0.8094 | 0.4772 | 0.2921 | 0.6523 | 0.4796 | 0.5349 | 0.7224 | 0.4784 | 0.3778 |
| 12 | UninaStudents | constrained | 0.5886 | 0.7300 | 0.4348 | 0.3488 | 0.8491 | 0.3571 | 0.1744 | 0.7850 | 0.3922 | 0.2326 |
| | baseline | | 0.5784 | 0.7549 | 0.3975 | 0.2589 | 0.6806 | 0.4949 | 0.2965 | 0.7158 | 0.4409 | 0.2764 |
| 13 | TextWiller | constrained | 0.5773 | 0.7306 | 0.3707 | 0.3333 | 0.8261 | 0.3878 | 0.1279 | 0.7755 | 0.3791 | 0.1849 |
| 14 | SSNCSE-NLP | constrained | 0.5749 | 0.7798 | 0.3664 | 0.3196 | 0.6873 | 0.4898 | 0.3605 | 0.7307 | 0.4192 | 0.3388 |
| 15 | QMUL-SDS | constrained | 0.5595 | 0.7990 | 0.3135 | 0.2500 | 0.6375 | 0.5918 | 0.2151 | 0.7091 | 0.4099 | 0.2313 |
| 16 | QMUL-SDS | constrained | 0.5329 | 0.8114 | 0.3109 | 0.2744 | 0.5391 | 0.6378 | 0.3430 | 0.6478 | 0.4181 | 0.3049 |
| 17 | MeSoVe | constrained | 0.4989 | 0.7287 | 0.2684 | 0.3155 | 0.7385 | 0.2602 | 0.3081 | 0.7336 | 0.2642 | 0.3118 |
| 18 | TextWiller | constrained | 0.4715 | 0.7804 | 0.4286 | 0.1983 | 0.5889 | 0.1990 | 0.5291 | 0.6713 | 0.2718 | 0.2884 |
| 19 | SSN_NLP | constrained | 0.4707 | 0.7763 | 0.3030 | 0.2453 | 0.4582 | 0.4592 | 0.5349 | 0.5763 | 0.3651 | 0.3364 |
| 20 | SSN_NLP | constrained | 0.4473 | 0.6848 | 0.2194 | 0.1804 | 0.6267 | 0.2653 | 0.2035 | 0.6545 | 0.2402 | 0.1913 |
| 21 | Venses | constrained | 0.3882 | 0.6574 | 0.1800 | 0.1907 | 0.4474 | 0.3776 | 0.2151 | 0.5325 | 0.2438 | 0.2022 |
| 22 | Venses | constrained | 0.3637 | 0.7011 | 0.1881 | 0.2024 | 0.3383 | 0.4847 | 0.2907 | 0.4564 | 0.2710 | 0.2387 |

Task B

| ranking | team name | framework | f-avg | precision_against | precision_favour | precision_none | recall_against | recall_favour | recall_none | f_against | f_favour | f_none |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | IXA | constrained | 0.7445 | 0.8539 | 0.6009 | 0.4589 | 0.8585 | 0.6684 | 0.3895 | 0.8562 | 0.6329 | 0.4214 |
| 2 | TextWiller | constrained | 0.7309 | 0.8617 | 0.5344 | 0.3520 | 0.8396 | 0.7143 | 0.2558 | 0.8505 | 0.6114 | 0.2963 |
| 3 | DeepReading | constrained | 0.7230 | 0.8594 | 0.5370 | 0.3624 | 0.8154 | 0.7041 | 0.3140 | 0.8368 | 0.6093 | 0.3364 |
| 4 | DeepReading | constrained | 0.7222 | 0.8481 | 0.5612 | 0.4383 | 0.8127 | 0.6786 | 0.4128 | 0.8300 | 0.6143 | 0.4251 |
| 5 | TextWiller | constrained | 0.7147 | 0.8967 | 0.6243 | 0.2931 | 0.7722 | 0.5765 | 0.4942 | 0.8298 | 0.5995 | 0.3680 |
| 6 | QMUL-SDS | constrained | 0.7088 | 0.8441 | 0.4852 | 0.2581 | 0.8100 | 0.7551 | 0.1395 | 0.8267 | 0.5908 | 0.1811 |
| 7 | UNED | constrained | 0.6888 | 0.8458 | 0.4961 | 0.2531 | 0.7911 | 0.6429 | 0.2384 | 0.8175 | 0.5600 | 0.2455 |
| 8 | QMUL-SDS | constrained | 0.6765 | 0.8282 | 0.4167 | 0.4706 | 0.7992 | 0.7653 | 0.0930 | 0.8134 | 0.5396 | 0.1553 |
| 9 | SSNCSE-NLP | constrained | 0.6582 | 0.8321 | 0.4715 | 0.3508 | 0.7547 | 0.5918 | 0.3895 | 0.7915 | 0.5249 | 0.3691 |
| 10 | SSNCSE-NLP | constrained | 0.6556 | 0.8419 | 0.4574 | 0.3660 | 0.7466 | 0.6020 | 0.4128 | 0.7914 | 0.5198 | 0.3880 |
| | baseline | constrained | 0.6284 | 0.7845 | 0.4506 | 0.3054 | 0.7507 | 0.5357 | 0.2965 | 0.7672 | 0.4895 | 0.3009 |
| 11 | GhostWriter | constrained | 0.6257 | 0.8106 | 0.4709 | 0.3226 | 0.6981 | 0.5357 | 0.4651 | 0.7502 | 0.5012 | 0.3810 |
| 12 | GhostWriter | constrained | 0.6004 | 0.8094 | 0.4772 | 0.2921 | 0.6523 | 0.4796 | 0.5349 | 0.7224 | 0.4784 | 0.3778 |
| 13 | UNED | constrained | 0.5313 | 0.7148 | 0.3409 | 0.2246 | 0.7668 | 0.3061 | 0.1802 | 0.7399 | 0.3226 | 0.2000 |

Contact Us

Write to us on the Google Group, or contact the two main organizers: Mirko Lai, mirko.lai@unito.it and Alessandra T. Cignarella, cigna@di.unito.it.