Research Article Open Access

Exploiting Surrounding Text for Retrieving Web Images

S. A. Noah, A. Azilawati, T. M.T. Sembok and T. W.T.S. Meriam

Abstract

Web documents contain useful textual information that can be exploited for describing images. Research has focused on representing images by means of their content (low-level) descriptions such as color, shape and texture, while little research has been directed at exploiting such textual information. The aim of this research was to systematically exploit the textual content of HTML documents for automatically indexing and ranking images embedded in web documents. A heuristic approach for locating and assigning weights to the text surrounding web images, together with a modified tf.idf weighting scheme, was proposed. A precision-recall evaluation was conducted for ten queries, and promising results were achieved. The proposed approach showed slightly better precision than a popular search engine, with average relative precision of 0.63 and 0.55, respectively.
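
As an illustration of the general idea described in the abstract, the following is a minimal sketch of a region-weighted tf.idf index for images, assuming each image comes with text already extracted from a few HTML regions. The region names, the weight values, and the function names (`weighted_tf`, `index_images`, `rank`) are hypothetical and chosen only for this example; the abstract does not specify the heuristic or the modified weighting scheme, so this should not be read as the authors' actual method.

```python
import math
from collections import Counter

# Hypothetical region weights: illustrative assumptions only,
# not the weights used in the paper.
REGION_WEIGHTS = {"alt": 3.0, "caption": 2.0, "title": 1.5, "body": 1.0}


def weighted_tf(regions):
    """Count terms for one image, scaling each occurrence by its region weight."""
    tf = Counter()
    for region, text in regions.items():
        weight = REGION_WEIGHTS.get(region, 1.0)
        for term in text.lower().split():
            tf[term] += weight
    return tf


def index_images(images):
    """Build a weighted tf.idf vector per image from its surrounding text."""
    tfs = {name: weighted_tf(regions) for name, regions in images.items()}
    n = len(tfs)
    # Document frequency: number of images whose text contains the term.
    df = Counter(term for tf in tfs.values() for term in tf)
    return {
        name: {t: w * math.log(n / df[t]) for t, w in tf.items()}
        for name, tf in tfs.items()
    }


def rank(index, query):
    """Return image names with a positive score, best match first."""
    terms = query.lower().split()
    scores = {
        name: sum(vec.get(t, 0.0) for t in terms) for name, vec in index.items()
    }
    return sorted((n for n, s in scores.items() if s > 0), key=lambda n: -scores[n])
```

In this sketch, a term found in an image's `alt` text contributes more to that image's score than the same term found in ordinary body text, which is one simple way to express the intuition that text closer to an image describes it better.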

Journal of Computer Science
Volume 4 No. 10, 2008, 842-846

DOI: https://doi.org/10.3844/jcssp.2008.842.846

Submitted On: 6 November 2008 Published On: 31 October 2008

How to Cite: Noah, S. A., Azilawati, A., Sembok, T. M. & Meriam, T. W. (2008). Exploiting Surrounding Text for Retrieving Web Images. Journal of Computer Science, 4(10), 842-846. https://doi.org/10.3844/jcssp.2008.842.846

Keywords

  • Information retrieval
  • Image retrieval
  • Precision-recall