DC8: Hybrid data generation and retrieval for the detection of online harassment

Open Doctoral Candidate/PhD Position at GESIS – Leibniz Institute for Social Sciences (GESIS), Germany, for the HYBRIDS project


Reference number: HYBRIDS-DC8

PhD research topic: Hybrid data generation and retrieval for the detection of online harassment

Host institution: GESIS – Leibniz Institute for Social Sciences (GESIS), Germany

PhD Enrolment: University of Santiago de Compostela (USC), Spain

Main Supervisor: Dr Claudia Wagner, GESIS Leibniz Institute for Social Sciences (GESIS), claudia.wagner@gesis.org


Co-supervisors:

  • Dr. José Alonso, University of Santiago de Compostela (USC).
  • Dr. Arkaitz Zubiaga, Queen Mary University of London (QMUL).
  • Dr. Dennis Assenmacher, GESIS Leibniz Institute for Social Sciences (GESIS).

Inter-sectoral Supervisors: Mr. Ettore Di Cesare, Fondazione OPENPOLIS.


Objectives:

  1. To design a hybrid strategy to ensure that datasets used to train and test harassment detection systems cover different groups of victims, different styles and languages, as well as the different aspects of online harassment that are described in the social sciences and have been measured with survey instruments in the past;
  2. To explore data collection and data generation strategies (such as language generation models and style transfer models) to create high quality datasets for online harassment detection;
  3. To define metrics to evaluate the quality of datasets for online harassment detection;
  4. To develop algorithms to detect online harassment in texts from different domains and use those algorithms to detect and generate new examples automatically.

Expected outcomes:

  1. High-quality training and test data for online harassment detection systems.
  2. Development of automated tools to detect online harassment and generate/collect data that is similar and/or complementary.
  3. Methodology and tool validation in use cases on women, LGBTQ people and migrants.

Planned secondments:

Eligibility Criteria:

  • Mobility: At the time of recruitment, the researcher must not have resided or carried out his/her main activity (work, studies, etc.) in Germany for more than 12 months in the 36 months immediately before the recruitment date. Compulsory national service and time spent as part of a procedure for obtaining refugee status under the Geneva Convention are not taken into account.
  • The candidate must be a doctoral candidate at the date of recruitment (i.e. not already in possession of a doctoral degree). Researchers who have successfully defended their doctoral thesis but who have not yet formally been awarded the doctoral degree will not be considered eligible.
  • The candidate must agree to work exclusively for the action.

Specific requirements:

  • Degree: MSc degree in Computer Science (or a related discipline).
  • Programming skills: Excellent programming skills, ideally in Python.
  • Language: Excellent command of English, together with good academic writing and presentation skills.

Desirable skills:

  • Strong problem-solving skills, and ability to work independently and collaboratively.
  • Experience in developing and implementing NLP and machine learning algorithms for text analysis, including but not limited to text classification, sentiment analysis, named entity recognition, and topic modeling or clustering.
  • Experience in working with large-scale text corpora and handling unstructured data using preprocessing techniques such as tokenization, stemming, lemmatization, and part-of-speech tagging.
  • Familiarity with neural network architectures and techniques, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), transformers, and attention mechanisms.
  • Strong programming skills in Python, with experience in using popular libraries for data analysis and visualization, such as pandas, numpy, matplotlib, and seaborn.

Estimated starting date: 1st July 2023

Contract: Full-time contract

Duration: 36 months, including 4 secondments of 2-3 months each at other consortium members’ premises (see Secondment section)

Salary: In line with the MSCA standard rates.

The successful candidate will receive an attractive salary in accordance with the MSCA regulations for Doctoral Candidates. The exact salary will be confirmed upon appointment and is dependent on local tax regulations and on the country correction factor (to allow for the difference in cost of living in different EU Member States). The salary includes a living allowance, a mobility allowance and a family allowance (if applicable).

Application Documents:

  • Europass CV (template available in the following link), including the names and contact details of two academic references, in English, highlighting the merits that are established as evaluation criteria;
  • Scans of Bachelor’s and Master’s transcripts, with a certified translation into English (if the degree qualification is not in English). If you have not yet completed your Master’s, you must submit a provisional academic transcript;
  • A motivation letter in English (max. 700 words), highlighting the consistency between your profile and the DC position/s for which you are applying, and describing why you wish to be a HYBRIDS Doctoral Candidate and carry out a PhD;
  • Scanned copy of your ID card, resident’s card or passport currently in force;
  • Proof of excellent command of English (e.g., IELTS, TOEFL, Cambridge or equivalent). This is not required if you are a native English speaker (i.e., English is your mother tongue).

In addition, you can add any other documents that you find relevant to the application, such as a Master’s thesis, publications or project reports.

Evaluation criteria:

  • Academic background (up to 40 points)
  • Knowledge and specific achievements (up to 35 points)
  • Interview (up to 25 points): Shortlisted candidates will be invited for an interview in which the selection committee will assess the applicant’s communication skills, initiative, and motivation to pursue a PhD.

Deadline: April 26, 2023, at 23h59 CET (UTC+01:00)

Candidates are encouraged to contact the HYBRIDS Project Manager (info@hybridsproject.eu) for assistance or for any information related to the application process. When contacting, please indicate the position reference in the subject line.

Enquiries about research content must be sent to the main PhD supervisor via email (see contact details in Supervisors section).