Ariel Gutman
Ariel Gutman defended his PhD thesis at the University of Konstanz in 2016, where he was researching Neo-Aramaic dialects as an associate fellow of the Zukunftskolleg interdisciplinary institute. His curriculum includes a master's degree in Linguistics awarded by the Université Sorbonne Nouvelle and a master's degree in Computer Science awarded by the École Normale Supérieure, following a B.Sc. from the Hebrew University in Jerusalem. He has conducted fieldwork on Neo-Aramaic in France and Israel, as well as fieldwork on an Austronesian language in West Papua, Indonesia. He has published numerous articles about Neo-Aramaic and language acquisition. His first book (co-authored with Wido van Peursen), entitled The Two Syriac Versions of the Prayer of Manasseh, was published by Gorgias Press in 2011. His second book, Attributive Constructions in North-Eastern Neo-Aramaic, was published by Language Science Press in 2018. He currently works as a software engineer specialized in computational linguistics at Google, Zurich.
Authored Publications
We propose a templatic Natural Language Generation system, which uses a dependency grammar together with feature structure unification to guide the generation process. Feature structures are unified across dependency arcs, licensing the selection of correct lexical forms. From a practical perspective, the system has numerous advantages, such as the possibility to easily mix static and dynamic content. From a theoretical point of view, the templates can be seen as linguistic constructions, of which the relevant grammar is specified in terms of dependency grammar. In this paper we present the architecture of the system, and two case studies: verbal agreement in French, including the object-agreement pattern of past participles, and definiteness spreading in Scandinavian languages. The latter case study also exemplifies how this framework can be used for cross-lingual comparison and generation.
Using Dependency Grammars in guiding Natural Language Generation
Anton Ivanov
The Israeli Seminar of Computational Linguistics, IBM Research, Haifa (2019)
We propose a templatic Natural Language Generation system, which uses a dependency grammar together with feature structure unification to guide the generation process. Feature structures are unified across dependency arcs, licensing the selection of correct lexical forms. From a practical perspective, the system offers numerous advantages, such as the possibility to easily mix static and dynamic content. From a theoretical point of view, the templates can be seen as linguistic constructions, of which the relevant grammar is specified in terms of dependency grammar.
Attributive Constructions in North-Eastern Neo-Aramaic
Language Science Press (2018)
This study is the first wide-scope morpho-syntactic comparative study of North-Eastern Neo-Aramaic dialects to date. Given the historical depth of Aramaic (almost 3 millennia) and the geographic span of the modern dialects, coming in contact with various Iranian, Turkic and Semitic languages, these dialects provide an almost pristine "laboratory" setting for examining language change from areal, typological and historical perspectives. While the study has a very wide coverage of dialects, including also contact languages (and especially Kurdish dialects), it focuses on a specific grammatical domain, namely attributive constructions, giving a theoretically motivated and empirically grounded account of their variation, distribution and development. The results will be enlightening not only to Semitists seeking to learn about this fascinating modern Semitic language group, but also for typologists and general linguists interested in the dynamics of noun phrase morphosyntax.
Crafting a Lexicon of Referential Expressions for NLG Applications
Alexandros Andre Chaaraoui
Pascal Fleury
Proceedings of the LREC 2018 Workshop “Globalex 2018 – Lexicography & WordNets”
To engage users, a natural language generation system must produce grammatically correct and eloquent sentences. A simple NLG architecture may consist of a template repository coupled with a lexicon containing grammatically-annotated lexical expressions referring to the entities that are present in the domain of the system. The morphosyntactic features associated with these expressions are crucial to render grammatical and natural-sounding sentences. Existing electronic resources, like dictionaries or thesauri, lack wide-scale coverage of such referential expressions. In this work, we focus on the creation of a large-scale lexicon of referential expressions, relying on n-gram models, morpho-syntactic parsing, and non-linguistic knowledge. We describe the collected linguistic information and the techniques used to perform automatic extraction from large text corpora in a way that scales across languages and over millions of entities.
Crafting a lexicon of referential expressions for NLG applications
Alexandros Chaaraoui
Pascal Fleury
The 2017 Israeli Seminar of Computational Linguistics, Rachel and Selim Benin School of Computer Science and Engineering, Edmond J. Safra Campus, Jerusalem (2017)
To be perfectly conversational, an agent needs to produce grammatically correct and eloquent sentences. To reach this goal, we use templatic systems with linguistically-aware specifications to generate idiomatic utterances, coupled with annotated lexical entities. The morphosyntactic features of the lexical entities are crucial to render grammatical and natural-sounding sentences.
Existing electronic resources, like dictionaries or thesauri, lack wide-scale information about referential expressions (i.e. proper names). In this work, we focus on the creation of a large-scale lexicon of such referential expressions, relying on n-gram models, morpho-syntactic parsing, and non-linguistic knowledge. We describe the linguistic information we collect and the techniques we use to automatically extract this from large text corpora in a way that scales across languages and over millions of entities.