Combining linguistic and pictorial information: Using captions to interpret newspaper photographs

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

8 Scopus citations

Abstract

There are many situations where linguistic and pictorial data are jointly presented to communicate information. A computer model for synthesising information from the two sources requires an initial interpretation of both the text and the picture, followed by consolidation of the information. The problem of performing general-purpose vision (without a priori knowledge) would make this a nearly impossible task. However, in some situations, the text describes salient aspects of the picture. In such situations, it is possible to extract visual information from the text, resulting in a relational graph describing the structure of the accompanying picture. This graph can then be used by a computer vision system to guide the interpretation of the picture. This paper discusses an application in which information obtained from parsing the caption of a newspaper photograph is used to identify human faces in the photograph. Heuristics are described for extracting information from the caption that contributes to the hypothesised structure of the picture. The top-down processing of the image using this information is discussed.
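
The abstract does not give the paper's heuristics, so the following is only a minimal Python sketch of the general idea, under stated assumptions: a toy caption pattern ("Name, left/center/right") yields a small relational graph of `left_of` edges, and that graph is then used top-down to label face candidates supplied by some separate detector. The function names `caption_to_graph` and `label_faces`, the regular expression, and the use of x-coordinates as face candidates are all hypothetical and chosen for illustration; they are not taken from the paper or from SNePS.

```python
import re
from itertools import permutations

# Assumed toy caption convention: "<First Last>, left|center|right".
# Real captions are far more varied; this pattern is illustrative only.
CUE = re.compile(r"([A-Z][a-z]+(?: [A-Z][a-z]+)+), (left|center|right)")
RANK = {"left": 0, "center": 1, "right": 2}

def caption_to_graph(caption):
    """Return (names, edges); each edge (a, 'left_of', b) is a predicted relation."""
    people = CUE.findall(caption)
    names = [name for name, _ in people]
    edges = [(a, "left_of", b)
             for a, pos_a in people
             for b, pos_b in people
             if RANK[pos_a] < RANK[pos_b]]
    return names, edges

def label_faces(names, edges, face_xs):
    """Assign each name to a face candidate (given by the x-coordinate of its
    centre) so that every left_of edge holds; return None if no assignment fits."""
    for perm in permutations(range(len(face_xs)), len(names)):
        assign = dict(zip(names, perm))
        if all(face_xs[assign[a]] < face_xs[assign[b]] for a, _, b in edges):
            return {name: face_xs[i] for name, i in assign.items()}
    return None

if __name__ == "__main__":
    caption = "Jane Smith, left, and John Doe, right, at the opening ceremony."
    names, edges = caption_to_graph(caption)
    # Two hypothetical face candidates, listed in arbitrary detector order.
    print(label_faces(names, edges, [310, 120]))
```

The exhaustive search over assignments is only workable for the handful of people a caption typically names; it stands in for whatever constraint-satisfaction step a real system would use.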

Original language: English
Title of host publication: Current Trends in SNePS - Semantic Network Processing System - 1st Annual SNePS Workshop, Proceedings
Editors: Deepak Kumar
Publisher: Springer Verlag
Pages: 85-96
Number of pages: 12
ISBN (Print): 9783540526261
State: Published - 1990
Event: 1st Annual Semantic Network Processing System Workshop, SNePS 1989 - Buffalo, United States
Duration: Nov 13 1989 to Nov 13 1989

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 437 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 1st Annual Semantic Network Processing System Workshop, SNePS 1989
Country/Territory: United States
City: Buffalo
Period: 11/13/89 to 11/13/89
