Ever-expanding camera networks make it difficult to find a suspect in lengthy video records. This paper proposes a target-driven video summarization framework that provides a two-step Filtered Summarized Video (FSV) for tracing suspects. Before the target is identified, users can search for it efficiently using the first-step FSV of any camera. The first-step FSV filters the video by the target's attributes, including time information and object category. Once the target is identified, the second-step FSV, which adds spatio-temporal and appearance cues, is triggered in the neighboring cameras. To improve the accuracy of object classification for the FSV, we propose a Perspective Dependent Model (PDM), which consists of many grid-based models. Experimental results show that the grid-based model is more robust than general detectors, and a user study demonstrates better performance for finding and tracking targets in a surveillance camera network.
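The two-step filtering described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Detection` record, the function names, the forward-only travel-time window used as the spatio-temporal cue, and the cosine similarity over a toy feature vector used as the appearance cue are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Detection:
    """Hypothetical record for one detected object (not from the paper)."""
    camera_id: str
    timestamp: float      # seconds since the start of the recording
    category: str         # e.g. "person", "car"
    appearance: tuple     # toy appearance feature vector (assumption)


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def first_step_fsv(detections, camera_id, start, end, categories):
    """Step 1: keep detections from one camera matching time window and category."""
    return [d for d in detections
            if d.camera_id == camera_id
            and start <= d.timestamp <= end
            and d.category in categories]


def second_step_fsv(detections, target, neighbors, max_travel_s, min_similarity):
    """Step 2: keep detections in neighboring cameras consistent with the target.

    Assumed cues: same category, appears within max_travel_s seconds after the
    target was seen, and appearance similarity of at least min_similarity.
    """
    return [d for d in detections
            if d.camera_id in neighbors
            and d.category == target.category
            and 0 <= d.timestamp - target.timestamp <= max_travel_s
            and cosine_similarity(d.appearance, target.appearance) >= min_similarity]


# Toy data: one target in cam1, three candidates in neighboring cameras.
records = [
    Detection("cam1", 10.0, "person", (1.0, 0.0)),
    Detection("cam1", 12.0, "car", (0.0, 1.0)),
    Detection("cam2", 40.0, "person", (0.9, 0.1)),   # similar look, plausible time
    Detection("cam2", 45.0, "person", (0.0, 1.0)),   # different appearance
    Detection("cam3", 500.0, "person", (1.0, 0.0)),  # too late to be reachable
]

candidates = first_step_fsv(records, "cam1", 0.0, 60.0, {"person"})
target = candidates[0]  # in practice the user picks the target from the summary
followups = second_step_fsv(records, target, {"cam2", "cam3"}, 120.0, 0.8)
```

In this toy run the first step leaves a single candidate in `cam1`, and the second step keeps only the `cam2` detection at 40 s. The thresholds (`max_travel_s`, `min_similarity`) are arbitrary placeholders rather than values from the paper.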