With the exponential growth of web multimedia content, the Internet is rife with near-duplicate videos, that is, copies of a video that have undergone visual or temporal transformations and/or post-production editing. Two critical issues arise as a result: copyright infringement and redundancy in search results. To address these problems, this paper proposes a spatiotemporal pattern-based approach, built on a hierarchical filter-and-refine framework, for efficient and effective near-duplicate video retrieval and localization. First, non-near-duplicate videos are quickly filtered out using a computationally efficient data structure termed the pattern-based index tree (PI-tree). Then, an m-pattern-based dynamic programming (mPDP) algorithm localizes the near-duplicate segments and re-ranks the retrieved videos. The influence of time-shift misalignment is alleviated by a time-shift m-pattern similarity (TPS) measure. Comprehensive experiments on five datasets are conducted to verify the effectiveness, efficiency, robustness, and scalability of the proposed approach. The results show that the proposed approach outperforms state-of-the-art approaches in terms of mean average precision (MAP) and normalized detection cost rate (NDCR) on the test datasets. Furthermore, the proposed approach achieves high-quality near-duplicate video localization as measured by quality frames (QF) and mean F1.
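
For readers unfamiliar with the retrieval metric named above, the following is a minimal sketch of how mean average precision (MAP) is commonly computed over ranked result lists. It is an illustration only, not the evaluation code used in this paper; the function names `average_precision` and `mean_average_precision` and the input layout (a list of relevance flags per query plus the number of ground-truth near-duplicates) are assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def average_precision(ranked_relevance: Sequence[bool], num_relevant: int) -> float:
    """Average precision for one query.

    ranked_relevance[i] is True if the video at rank i+1 is a true
    near-duplicate of the query; num_relevant is the total number of
    ground-truth near-duplicates for that query (hypothetical input format).
    """
    if num_relevant == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision measured at the rank of each relevant result.
            precision_sum += hits / rank
    return precision_sum / num_relevant


def mean_average_precision(queries: List[Tuple[Sequence[bool], int]]) -> float:
    """Mean of the per-query average precision values."""
    if not queries:
        return 0.0
    return sum(average_precision(r, n) for r, n in queries) / len(queries)


if __name__ == "__main__":
    # Two toy queries: relevance flags in ranked order, and the ground-truth count.
    toy_queries = [([True, False, True], 2), ([False, True], 1)]
    print(mean_average_precision(toy_queries))
```

Under this definition, a ranking that places all true near-duplicates ahead of every other video yields an average precision of 1.0 for that query, so re-ranking that moves near-duplicates upward raises MAP.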