
MESS: A multilingual error based string similarity measure for transliterated name variants

  • SUNY Buffalo

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

1 Scopus citation

Abstract

Cross-lingual name matching is an important problem in machine translation and data mining. Though well studied, it lacks a generic solution, largely because of language-specific nuances, resource scarcity, and similar issues. Most proposed unsupervised approaches focus on a small subset of languages, mostly English and its derivatives, and employ specific handcrafted rules that do not port well to other languages. In this paper, we propose a generic multilingual solution that instead adds simple probabilistic extensions to existing string similarity methods. Our solution depends only on freely available open-source resources, and we demonstrate the superiority of our approach on 60 language pairs drawn from across language families.

Original language: English
Title of host publication: FIRE 2015 - Proceedings of the 7th Annual Meeting of the Forum for Information Retrieval Evaluation
Editors: Prasenjit Majumder, Mandar Mitra, Madhulika Agrawal, Parth Mehta
Publisher: Association for Computing Machinery
Pages: 47-50
Number of pages: 4
ISBN (Electronic): 9781450340045
State: Published - Dec 4 2015
Event: 7th Annual Meeting of the Forum for Information Retrieval Evaluation, FIRE 2015 - Gandhinagar, India
Duration: Dec 4 2015 to Dec 6 2015

Publication series

Name: ACM International Conference Proceeding Series
Volume: 04-06-December-2015

Conference

Conference: 7th Annual Meeting of the Forum for Information Retrieval Evaluation, FIRE 2015
Country/Territory: India
City: Gandhinagar
Period: 12/4/15 to 12/6/15
