
Use of multimedia input in automated image annotation and content-based retrieval

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

8 Scopus citations

Abstract

This research explores the interaction of linguistic and photographic information in an integrated text/image database. By utilizing linguistic descriptions of a picture (speech and text input) coordinated with pointing references to the picture, we extract information useful in two aspects: image interpretation and image retrieval. In the image interpretation phase, objects and regions mentioned in the text are identified; the annotated image is stored in a database for future use. We incorporate techniques from our previous research on photo understanding using accompanying text: a system, PICTION, which identifies human faces in a newspaper photograph based on the caption. In the image retrieval phase, images matching natural language queries are presented to a user in a ranked order. This phase combines the output of (1) the image interpretation/annotation phase, (2) statistical text retrieval methods, and (3) image retrieval methods (e.g., color indexing). The system allows both point and click querying on a given image as well as intelligent querying across the entire text/image database.
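The abstract does not say how the three sources in the retrieval phase are combined into a single ranking. As a minimal sketch only, one common option is a weighted sum of per-image scores. Everything below is an assumption for illustration and is not taken from the paper: the function name `rank_images`, the score tables, the weights, and the premise that each source yields a score in [0, 1].

```python
from typing import Dict, List, Tuple

Scores = Dict[str, float]  # image id -> score, assumed to lie in [0, 1]


def rank_images(
    annotation: Scores,
    text: Scores,
    color: Scores,
    weights: Tuple[float, float, float] = (0.5, 0.3, 0.2),  # hypothetical weights
) -> List[Tuple[str, float]]:
    """Fuse three per-image score tables into one ranked list.

    Hypothetical weighted-sum fusion; the paper's actual combination
    method is not described in the abstract.
    """
    w_ann, w_txt, w_col = weights
    # Consider every image scored by at least one source.
    image_ids = set(annotation) | set(text) | set(color)
    fused = {
        image_id: w_ann * annotation.get(image_id, 0.0)
        + w_txt * text.get(image_id, 0.0)
        + w_col * color.get(image_id, 0.0)
        for image_id in image_ids
    }
    # Highest fused score first; ties are broken by image id for determinism.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))
```

An image missing from a source simply contributes zero for that source, so an image found only by color indexing can still appear in the ranking, just lower down.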

Original language: English
Title of host publication: Proceedings of SPIE - The International Society for Optical Engineering
Editors: Wayne Niblack, Ramesh C. Jain
Pages: 249-260
Number of pages: 12
State: Published - 1995
Event: Storage and Retrieval for Image and Video Databases III - San Jose, CA, USA
Duration: Feb 9 1995 → Feb 10 1995

Publication series

Name: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 2420
ISSN (Print): 0277-786X

Conference

Conference: Storage and Retrieval for Image and Video Databases III
City: San Jose, CA, USA
Period: 02/9/95 → 02/10/95
