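Abstract (translated from Chinese):
A video saliency detection algorithm that combines the spatial and temporal domains is proposed. For single frames, inspired by the hierarchical perception of the visual cortex and by Gestalt visual psychology, a hierarchical method for detecting static saliency maps is proposed. At the bottom layer, feature images are synthesized by a nonlinear simplified model of biologically plausible feature images (double-opponent color feature images and a luminance feature image), forming multiple candidate salient regions. At the middle layer, the most competitive candidate salient regions are selected as local salient regions according to the minimum Frobenius norm (F-norm) property of matrices. At the top layer, the core theory of Gestalt visual psychology is used to integrate the local salient regions obtained at the middle layer, yielding a spatial saliency map with holistic perception. For frame sequences, under the assumption that a moving object is consistent in position, motion amplitude, and motion direction, the optical flow points detected by the Lucas-Kanade algorithm are classified into two classes to exclude noise points, and the motion amplitude of the optical flow points is used to measure the motion saliency of the moving object. Finally, based on the difference in human visual sensitivity to dynamic and static information, a general model for fusing the spatial and temporal saliency maps is proposed. Experimental results show that the method suppresses noise in the video background, resolves problems such as sparse moving objects, and detects the salient regions in videos of complex scenes well.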
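The middle-layer selection step can be illustrated with a minimal sketch. The abstract only states that the choice relies on the minimum F-norm property of matrices, so the rule used here (keep the candidate map with the smallest Frobenius norm) is an assumption, and the function name `select_local_salient_region` and the toy candidate maps are hypothetical.

```python
import numpy as np


def select_local_salient_region(candidates):
    """Return (index, map) of the candidate with the smallest Frobenius norm.

    Assumption: "most competitive" is read as "minimum F-norm"; the exact
    selection criterion is not given in the abstract.
    """
    norms = [np.linalg.norm(c, ord="fro") for c in candidates]
    best = int(np.argmin(norms))
    return best, candidates[best]


# Toy candidate maps (hypothetical data, for illustration only).
candidates = [
    np.full((4, 4), 0.5),   # F-norm = 2.0
    0.5 * np.eye(4),        # F-norm = 1.0
    np.ones((4, 4)),        # F-norm = 4.0
]
best_index, best_map = select_local_salient_region(candidates)
```

In this toy case the scaled identity map has the smallest norm and is kept as the local salient region.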
Abstract:
To address the difficulty of video saliency detection and the poor quality of saliency fusion, a video saliency detection model and a fusion model are proposed. Video saliency detection is divided into spatial saliency detection and temporal saliency detection. In the spatial domain, inspired by the hierarchical perception of the visual cortex and by Gestalt visual psychology, we propose a three-layer hierarchical saliency detection model for single frames. Each frame is simplified layer by layer, and the results are then combined into a perceptually whole visual object that is easier to process. At the bottom layer, candidate salient regions are formed by a nonlinear simplified model of biologically plausible feature images (double-opponent color feature images and a luminance feature image). At the middle layer, the most competitive candidate regions are selected as the local salient regions according to the minimum Frobenius norm (F-norm) property of matrices. At the top layer, the local salient regions are integrated using the core theory of Gestalt visual psychology to obtain the spatial saliency map. In the temporal domain, under the assumption that a moving object is consistent in position, motion amplitude, and motion direction, the optical flow points detected by the Lucas-Kanade method are classified into two classes to eliminate noise points, and the motion saliency of the moving object is then measured by the motion amplitude of the optical flow points. Finally, based on the difference in human visual sensitivity between dynamic and static information and between color and gray-level information, a general model for fusing the temporal and spatial saliency maps is proposed. The saliency detection results for single frames and for video sequence frames are represented with the gray color model and the Munsell color system, respectively. Experimental results show that the proposed saliency detection method suppresses background noise, resolves the sparse-pixel problem of moving objects, and effectively detects the salient regions in video. The proposed fusion model displays both kinds of saliency results simultaneously in a single image of a complex scene, and it ensures that the display does not become chaotic even when the detection results are complicated.
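The temporal step can be sketched as follows, assuming the optical flow points and their flow vectors have already been computed (for example by a Lucas-Kanade tracker). The abstract does not specify the classifier, so the median-based consistency test below is an assumption, and the function name `motion_saliency`, the thresholds `mag_tol`, `ang_tol`, and `radius`, and the toy data are all hypothetical.

```python
import numpy as np


def motion_saliency(points, flows, mag_tol=0.5, ang_tol=np.pi / 6, radius=50.0):
    """Split flow points into object/noise, then score the object points.

    A point is kept when its position, motion amplitude, and motion
    direction all agree with the median of the point set (assumed rule).
    Returns a boolean keep-mask and a saliency value per point in [0, 1].
    """
    points = np.asarray(points, dtype=float)
    flows = np.asarray(flows, dtype=float)

    mags = np.linalg.norm(flows, axis=1)
    angles = np.arctan2(flows[:, 1], flows[:, 0])

    # Reference motion and position taken from component-wise medians.
    ref_flow = np.median(flows, axis=0)
    ref_mag = np.linalg.norm(ref_flow)
    ref_angle = np.arctan2(ref_flow[1], ref_flow[0])
    ref_pos = np.median(points, axis=0)

    # Angular difference wrapped into [0, pi].
    ang_diff = np.abs(np.angle(np.exp(1j * (angles - ref_angle))))
    dist = np.linalg.norm(points - ref_pos, axis=1)

    keep = (
        (np.abs(mags - ref_mag) <= mag_tol * ref_mag)
        & (ang_diff <= ang_tol)
        & (dist <= radius)
    )

    # Motion saliency: amplitude of kept points, normalized to [0, 1].
    saliency = np.zeros_like(mags)
    if keep.any() and mags[keep].max() > 0:
        saliency[keep] = mags[keep] / mags[keep].max()
    return keep, saliency


# Toy data: three consistent points and one far-away outlier (hypothetical).
points = [[10, 10], [12, 11], [11, 13], [200, 200]]
flows = [[2.0, 0.0], [2.1, 0.1], [1.9, -0.1], [-3.0, 4.0]]
keep, saliency = motion_saliency(points, flows)
```

Here the outlier is rejected on all three criteria and receives zero saliency, while the consistent points are scored by their relative motion amplitude.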