Background subtraction is often considered to be a required stage of any video surveillance system being used to detect objects in a single frame and/or track objects across multiple frames in a video sequence. Most current state-of-the-art techniques for object detection and tracking utilize some form of background subtraction that involves developing a model of the background at a pixel, region, or frame level and designating any elements that deviate from the background model as foreground. However, most existing approaches are capable of segmenting a number of distinct components but unable to distinguish between the desired object of interest and complex, dynamic background such as moving water and high reflections. In this paper, we propose a technique to integrate spatiotemporal signatures of an object of interest from different sensing modalities into a video segmentation method in order to improve object detection and tracking in dynamic, complex scenes. Our proposed algorithm utilizes the dynamic interaction information between the object of interest and background to differentiate between mistakenly segmented components and the desired component. Experimental results on two complex data sets demonstrate that our proposed technique significantly improves the accuracy and utility of state-of-the-art video segmentation technique.