
Using cross-lingual projections to generate semantic role labeled corpus for Urdu - A resource poor language

  • SUNY Buffalo
  • Thomson Reuters

Research output: Contribution to conference › Paper › peer-review

11 Scopus citations

Abstract

In this paper we explore the possibility of using cross-lingual projections to automatically induce role-semantic annotations in the PropBank paradigm for Urdu, a resource-poor language. This technique projects annotations across a parallel corpus based on word alignments. It is relatively inexpensive and has the potential to reduce the human effort involved in creating semantic role resources. The projection model exploits lexical as well as syntactic information on an English-Urdu parallel corpus. We show that our method generates reasonably good annotations, with an accuracy of 92% on short structured sentences. Using the automatically generated annotated corpus, we conduct preliminary experiments to create a semantic role labeler for Urdu. The results of the labeler, though modest, are promising and indicate the potential of our technique to generate large-scale annotations for Urdu.
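As a rough illustration of projecting annotations through word alignments, the sketch below copies role labels from English tokens onto the Urdu tokens aligned to them. It is a minimal toy under stated assumptions, not the paper's projection model: the function name `project_roles`, the `PRED` label for the predicate, the example sentence, and its alignment are all invented here, and the lexical and syntactic information the abstract mentions is not modeled.

```python
from typing import Dict, Set


def project_roles(
    source_roles: Dict[int, str],
    alignment: Dict[int, Set[int]],
) -> Dict[int, str]:
    """Copy each source-token role label onto the target tokens aligned to it.

    source_roles maps a source token index to its role label.
    alignment maps a source token index to the set of aligned target indices.
    Both structures are hypothetical and chosen only for this sketch.
    """
    target_roles: Dict[int, str] = {}
    for src_idx, role in sorted(source_roles.items()):
        for tgt_idx in sorted(alignment.get(src_idx, set())):
            # Keep the first label that reaches a target token;
            # any later conflicting label is ignored.
            target_roles.setdefault(tgt_idx, role)
    return target_roles


# Toy sentence pair, invented for illustration (not from the paper's corpus).
english = ["Ali", "ate", "an", "apple"]
urdu = ["Ali", "ne", "seb", "khaya"]  # romanized Urdu, verb-final order

# PropBank-style labels on the English side; "PRED" is an assumed predicate tag.
english_roles = {0: "ARG0", 1: "PRED", 3: "ARG1"}

# Assumed word alignment: English index -> set of Urdu indices.
alignment = {0: {0}, 1: {3}, 3: {2}}

projected = project_roles(english_roles, alignment)
for idx in sorted(projected):
    print(urdu[idx], projected[idx])
```

Unaligned tokens (here the English article and the Urdu case marker) simply receive no label, which is one reason a real system would need more than direct label copying.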

Original language: English
Pages: 797-805
Number of pages: 9
State: Published - 2010
Event: 23rd International Conference on Computational Linguistics, Coling 2010 - Beijing, China
Duration: Aug 23 2010 → Aug 27 2010

Conference

Conference: 23rd International Conference on Computational Linguistics, Coling 2010
Country/Territory: China
City: Beijing
Period: 08/23/10 → 08/27/10
