[Abstract] Several structural scene cues such as gist, layout, horizontal line, openness, and depth have been shown to guide scene perception (e.g., Oliva & Torralba, 2001; Ross & Oliva, 2009). Here, to investigate whether the vanishing point (VP) plays a significant role in gaze guidance, we ran two experiments. In the first, we recorded fixations of 10 observers (six male, four female; mean age 22,
SD = 0.84) freely viewing 532 images, 319 of which had a VP (shuffled presentation; each image shown for 4 s). We found that the average number of fixations in a local region (80 × 80 pixels) centered at the VP was significantly higher than the average number of fixations at random locations (t test; n = 319; p < 0.001). To address the confounding factor of saliency, we learned a combined model of bottom-up saliency and VP. The area under the curve (AUC) score of our model (0.85;
SD = 0.01) is significantly higher than that of the base saliency model (e.g., 0.8 using the attention for information maximization (AIM) model by Bruce & Tsotsos, 2005; t test; p = 3.14e-16) and that of the VP-only model (0.64; t test; p < 0.001). In the second experiment, we asked 14 subjects (10 male, four female; mean age 23.07,
SD = 1.26) to search for a target character (T or L) placed randomly on an imaginary 3 × 3 grid overlaid on an image. Subjects reported their answers by pressing one of two keys. Stimuli consisted of 270 color images (180 with a single VP, 90 without). The target appeared with equal probability in each cell (15 times as L, 15 times as T). We found that subjects were significantly faster (and more accurate) when the target appeared inside the cell containing the VP than when it appeared in cells without the VP (median across 14 subjects 1.34 s vs. 1.96 s; Wilcoxon rank-sum test;
p = 0.0014). These findings support the hypothesis that the vanishing point, like faces, text (Cerf, Frady, & Koch, 2009), and gaze direction (Borji, Parks, & Itti, 2014), guides attention in free-viewing and visual search tasks.
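
The local-region measure from the first experiment can be illustrated with a minimal sketch. It counts fixations inside an 80 × 80 pixel window centered at the VP and compares that count with the mean count over windows centered at random locations. The function names, the uniform sampling of random centers, the number of random windows, and the toy data are all assumptions made for illustration; the abstract does not specify how the random locations were drawn, and this is not the authors' analysis code.

```python
import random


def count_in_window(fixations, center, size=80):
    """Count fixations inside a size x size pixel window centered at `center`.

    Window edges are treated as inclusive (an assumption of this sketch).
    """
    half = size / 2
    cx, cy = center
    return sum(
        1 for (x, y) in fixations
        if abs(x - cx) <= half and abs(y - cy) <= half
    )


def vp_vs_random(fixations, vp, image_size, size=80, n_random=100, seed=0):
    """Return (count at the VP window, mean count over random windows).

    Random window centers are drawn uniformly over the image; this sampling
    scheme is hypothetical, not taken from the abstract.
    """
    rng = random.Random(seed)
    width, height = image_size
    vp_count = count_in_window(fixations, vp, size)
    random_counts = [
        count_in_window(
            fixations,
            (rng.uniform(0, width), rng.uniform(0, height)),
            size,
        )
        for _ in range(n_random)
    ]
    return vp_count, sum(random_counts) / n_random


# Toy fixations (x, y) in pixels for one hypothetical 800 x 600 image.
fixations = [(400, 300), (410, 295), (395, 310), (100, 500), (700, 80)]
vp_count, random_mean = vp_vs_random(
    fixations, vp=(400, 300), image_size=(800, 600)
)
print(vp_count, random_mean)
```

Repeating this per image would give one VP count and one random-location baseline for each of the 319 VP images, which is the kind of paired data a t test could then compare.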