
Classification of Natural Scenes using Global Image Statistics




The algorithmic classification of complex natural scenes is generally considered difficult because of the large amount of information that natural images convey. Work by Simon Thorpe and colleagues showed that humans can detect animals in novel natural scenes with remarkable speed and accuracy. This suggests that the information relevant for classification can be extracted at comparatively low computational cost. One hypothesis is that global image statistics such as the amplitude spectrum could underlie fast image classification (Johnson & Olshausen, Journal of Vision, 2003; Torralba & Oliva, Network: Comput. Neural Syst., 2003). We used linear discriminant analysis to classify a set of 11,000 images into animal and non-animal images. After applying a discrete Fourier transform (DFT) to each image, we pooled its Fourier spectrum into 48 bins (8 orientations × 6 spatial frequency bands). Using all of these bins, classification performance on the Fourier spectrum reached 70%. In an iterative procedure, we then removed, one bin per iteration, the bin whose absence caused the smallest drop in classification performance. Notably, performance stayed at about 70% until fewer than 6 bins were left. A detailed analysis of the classification weights showed that a comparatively high level of performance (67%) could also be obtained with only 2 bins, namely the vertical orientations at the highest spatial frequency band. When using only a single frequency band (8 bins), classification performance reached 67% for the highest spatial frequency band and decreased steadily toward lower spatial frequencies, reaching a minimum (50%) for the lowest band. Similar results were obtained when all bins were used on spatially pre-filtered images.
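As an illustration of the binning step described above, the following is a minimal sketch of pooling a DFT amplitude spectrum into 8 orientations × 6 frequency bands. The abstract does not specify the band edges, the wedge layout, or the pooling rule, so the log-spaced bands, equal-width orientation wedges, and mean pooling used here are assumptions, and the function name `spectrum_bins` is hypothetical.

```python
import numpy as np

def spectrum_bins(image, n_orient=8, n_bands=6):
    """Pool the DFT amplitude spectrum of a 2-D grayscale image into
    n_orient * n_bands bins (orientation x spatial-frequency band)."""
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    # Centred amplitude spectrum.
    amp = np.abs(np.fft.fftshift(np.fft.fft2(image)))
    # Frequency coordinates in cycles per pixel, in fftshift order.
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.hypot(fx, fy)
    # A real image has a point-symmetric spectrum, so fold angles into [0, pi).
    theta = np.mod(np.arctan2(fy, fx), np.pi)

    # Assumption: log-spaced bands from the lowest nonzero frequency up to
    # 0.5 cycles/pixel; frequencies at or beyond 0.5 are ignored.
    band_edges = np.geomspace(1.0 / max(h, w), 0.5, n_bands + 1)
    band = np.digitize(radius, band_edges) - 1
    # Assumption: equal-width orientation wedges starting at angle 0.
    orient = np.floor(theta / (np.pi / n_orient)).astype(int) % n_orient

    features = np.zeros((n_orient, n_bands))
    valid = (band >= 0) & (band < n_bands)
    for o in range(n_orient):
        for b in range(n_bands):
            mask = valid & (orient == o) & (band == b)
            if mask.any():
                # Assumption: mean amplitude per bin.
                features[o, b] = amp[mask].mean()
    return features.ravel()  # index = orientation * n_bands + band
```

Each image then yields a 48-dimensional feature vector that can be passed to a linear classifier.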
Our results show that, in the absence of sophisticated machine learning techniques, animal detection in natural scenes is limited to rather modest levels of performance, far below those of human observers. When the classifier is restricted to global image statistics such as those derived from the DFT, it is mostly the information at the highest spatial frequencies that is useful for the task. This is analogous to the results obtained with human observers on filtered images (Kirchner et al., VSS 2004).
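The iterative bin-removal procedure from the abstract can be sketched as a greedy backward elimination around a two-class linear discriminant. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how performance was estimated, so the held-out validation split, the equal-prior threshold, the small ridge term, and the names `lda_accuracy` and `backward_elimination` are all hypothetical.

```python
import numpy as np

def lda_accuracy(X_train, y_train, X_val, y_val, ridge=1e-6):
    """Fit a two-class Fisher linear discriminant (labels 0/1) and
    return its accuracy on the validation data."""
    X0, X1 = X_train[y_train == 0], X_train[y_train == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Summed within-class covariance; atleast_2d handles the 1-feature case.
    Sw = (np.atleast_2d(np.cov(X0, rowvar=False))
          + np.atleast_2d(np.cov(X1, rowvar=False)))
    # Assumption: small ridge term for numerical stability.
    Sw = Sw + ridge * np.eye(Sw.shape[0])
    w = np.linalg.solve(Sw, m1 - m0)
    # Assumption: equal priors, so the threshold is the projected midpoint.
    threshold = w @ (m0 + m1) / 2.0
    pred = (X_val @ w > threshold).astype(int)
    return float((pred == y_val).mean())

def backward_elimination(X_train, y_train, X_val, y_val):
    """Greedily remove one feature per iteration: the one whose absence
    leaves validation accuracy highest. Returns a list of
    (remaining_feature_indices, accuracy) tuples, one per step."""
    remaining = list(range(X_train.shape[1]))
    history = [(tuple(remaining),
                lda_accuracy(X_train, y_train, X_val, y_val))]
    while len(remaining) > 1:
        best_acc, best_drop = -1.0, None
        for f in remaining:
            keep = [g for g in remaining if g != f]
            acc = lda_accuracy(X_train[:, keep], y_train,
                               X_val[:, keep], y_val)
            if acc > best_acc:
                best_acc, best_drop = acc, f
        remaining.remove(best_drop)
        history.append((tuple(remaining), best_acc))
    return history
```

With 48 spectral bins as columns of `X_train`, the returned history traces how accuracy changes as bins are removed. Selecting bins and reporting accuracy on the same validation set gives optimistic estimates, so a separate test set would be needed for an unbiased figure.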

Author(s): Drewes, J. and Wichmann, FA. and Gegenfurtner, KR.
Volume: 8
Pages: 88
Year: 2005
Month: February
Day: 0

Department(s): Empirical Inference
Bibtex Type: Poster (poster)

Digital: 0
Event Name: 8th Tübingen Perception Conference (TWK 2005)
Event Place: Tübingen, Germany



  title = {Classification of Natural Scenes using Global Image Statistics},
  author = {Drewes, J. and Wichmann, FA. and Gegenfurtner, KR.},
  volume = {8},
  pages = {88},
  month = feb,
  year = {2005},
  month_numeric = {2}