MULTI-WORD GREEK EXPRESSIONS DATASET

COMPREHENSIVE ANALYSIS AND RELEVANT RESOURCES.

Polylex is a website that aims to provide a useful methodological tool for working with Multiword Expression (MWE) data in Modern Greek.

The website currently comprises:

  • A substantial linguistic resource of 6,000 entries of multiword verbal expressions, with emphasis on fixedness. It constitutes the first syntactic lexicon of compound predicates, classified into 20 Tables of expressions in which the lexical, semantic, and syntactic properties of the phrases are coded. The data follow the principles of the Lexicon – Grammar and come from the PhD theses of Aggeliki Fotopoulou (1993) and Marianna Mini (2009).
  • Bibliographic documentation of multiword expressions, the Lexicon – Grammar, and the methodology followed in classifying and coding the corpora.
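
As a rough illustration of what "coded properties" can look like in practice, the sketch below models a table as a list of entries, each carrying a set of yes/no properties, and filters entries by property. It assumes the common Lexicon – Grammar layout of one row per expression and one column per property; the class name, the function name, the table identifiers, the property names, and the property values are all hypothetical and are not taken from the actual Polylex tables.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MweEntry:
    """One table row: an expression plus its coded properties (hypothetical layout)."""
    expression: str                                             # transliterated expression
    table: str                                                  # hypothetical table identifier
    properties: Dict[str, bool] = field(default_factory=dict)   # True = "+", False = "-"


def select(entries: List[MweEntry], **required: bool) -> List[MweEntry]:
    """Return the entries whose coded properties match every required value."""
    return [
        e for e in entries
        if all(e.properties.get(name) == value for name, value in required.items())
    ]


# Placeholder rows: the expressions are quoted on this page, but the table
# identifiers and property values are invented for illustration only.
entries = [
    MweEntry("ton ipia", "T1", {"has_clitic": True, "allows_passive": False}),
    MweEntry("tin akuo", "T1", {"has_clitic": True, "allows_passive": False}),
    MweEntry("Giannis kernai ke Giannis pini", "T2",
             {"has_clitic": False, "allows_passive": False}),
]

with_clitic = select(entries, has_clitic=True)
print([e.expression for e in with_clitic])
```

A property that is not coded for an entry is treated as a non-match, so querying an unknown property name returns an empty list rather than raising an error.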

Aims

  • The linguistic corpora presented here can already be used in various educational, lexicographic, and computational projects (subject to the relevant permissions and copyright), and can be updated and enriched through projects or exchanges with users.
  • The most immediate aim is the development of robust, well-organised syntactic dictionaries, built on linguistic principles, that will evolve and be updated over time. We intend to establish the website as a validation framework and a supporting methodological tool for lexicographic and educational applications. It is aimed at students, linguists, teachers, and researchers of Modern Greek in general.
  • Last but not least, one of our key goals is to link this linguistic data with useful tools and text corpora in order to support Natural Language Processing (NLP) applications and put the data at the service of virtual agencies and networks.

Addition of new linguistic resources

With the permission of their creators, the website aims to host additional relevant resources. The following are currently being developed:

  • Tables (categories) of verbal MWEs that include expressions with non-referential clitics, e.g. ton ipia (= I drank it / I’m screwed), tin akuo (= I hear it / I get high) (Panagiota Kyriazi)
  • Tables (categories) of verbal compound expressions, e.g. Giannis kernai ke Giannis pini (= John treats and John drinks / said of someone who looks after his or her own interest while appearing to serve the public good) (Evaggelia Kazana)
  • Association with the database of nominal and verbal MWEs designed within the framework of the Program Action ‘DRASIS’, created by Aggeliki Fotopoulou and Voula Giouli (2021)

Licence

  • Polylex is unencumbered and may be used for educational and research purposes in accordance with the following licence agreement.
  • The corpus will be available under the CC-BY-NC licence.

Terms and conditions

By using, modifying, or copying the database material and documentation, you agree that you understand and will comply with the corresponding copyright statements and the terms and conditions, including the disclaimer.