The identification of changing or moving areas in the field of view of a camera is a fundamental pre-processing step in computer vision and video processing. Example applications include visual surveillance (e.g., people counting, action recognition, anomaly detection, post-event forensics), smart environments (e.g., room occupancy monitoring, fall detection, parking occupancy detection), and video retrieval (e.g., activity localization and tracking). Although subsequent processing may be different in each case, typically one has to start with the identification of regions of interest which, in the case of video, are either short-term changes, i.e., video dynamics (motion), or long-term changes, i.e., appearing/disappearing objects and structural changes. Clearly, motion and change detection are only pre-processing steps for subsequent tracking, classification, or estimation, albeit important ones.

To date, many motion and change detection algorithms have been developed that perform well on some types of videos, but most are sensitive to sudden illumination changes, environmental conditions (night, rain, snow, air turbulence), background/camera motion, shadows, and camouflage effects (photometric similarity of object and background). No single algorithm available today seems able to simultaneously address all the key challenges that accompany real-world (non-synthetic) videos. In fact, no single, realistic, large-scale dataset exists that covers the range of challenges present in the real world and includes accurate ground truths.

This website encapsulates a rigorous and comprehensive academic benchmarking effort for testing and ranking existing and new algorithms for change and motion detection. It will be revised/expanded from time to time based on received feedback, and will maintain a comprehensive ranking of submitted methods for years to come.


Two datasets are available: 2012 DATASET and 2014 DATASET. Both provide a realistic, camera-captured (no CGI), diverse set of videos. They have been selected to cover a wide range of detection challenges and are representative of typical indoor and outdoor visual data captured today in surveillance, smart environment, and video database scenarios. The 2012 DATASET includes the following challenges: dynamic background, camera jitter, intermittent object motion, shadows and thermal signatures. The 2014 DATASET includes all the 2012 videos plus additional ones with the following difficulties: challenging weather, low frame-rate, acquisition at night, PTZ capture and air turbulence. Each dataset is accompanied by accurate ground-truth segmentation and annotation of change/motion areas for each video frame. Please see the OVERVIEW of the dataset for a detailed description of included video categories and examples of ground truth.

Performance evaluation

In addition to providing a fine-grained and accurate annotation of videos, we also provide tools to compute performance metrics and thus identify algorithms that are robust across various challenges. The source code to compute all performance metrics is provided in UTILITIES. These metrics are reported under the RESULTS tab separately for each dataset. Details of evaluation methodology and specific metrics used can be found in EVALUATION under the RESULTS tab.
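The core metrics are standard pixel-wise measures computed against the ground truth within a region of interest. As a minimal sketch of the idea (not the official UTILITIES code; the function name and return format here are illustrative assumptions), the per-frame counts and the Recall, Precision, F-Measure, and PWC scores can be computed as follows:

```python
import numpy as np

def change_detection_metrics(result, groundtruth, roi):
    """Pixel-wise metrics of a binary change mask against ground truth,
    restricted to a region-of-interest mask.

    Illustrative helper, not the official CDnet utility: assumes
    `result` and `groundtruth` are boolean arrays (True = change/motion)
    and `roi` is a boolean array marking the pixels to evaluate.
    """
    r = result[roi]
    g = groundtruth[roi]
    tp = np.sum(r & g)      # true positives
    fp = np.sum(r & ~g)     # false positives
    fn = np.sum(~r & g)     # false negatives
    tn = np.sum(~r & ~g)    # true negatives
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * recall / (precision + recall)
    # PWC: percentage of wrong classifications
    pwc = 100.0 * (fn + fp) / (tp + fn + fp + tn)
    return {"recall": recall, "precision": precision,
            "f_measure": f_measure, "pwc": pwc}
```

The official scripts aggregate such per-frame counts over all frames of a sequence before computing the ratios, which avoids instability on frames with few or no change pixels.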


Researchers from both academia and industry are invited to test their change and motion detection algorithms on one or both datasets, and to report their methodology and results (please read the rules and instructions below). Results from all submissions that meet certain minimum quality standards will be reported and maintained on this website.

Instructions for prospective participants:

  • The 2012 DATASET and 2014 DATASET contain several video categories with 4 to 6 video sequences in each category. Results can be reported for one, multiple, or all video categories; however, in any one category, results must be reported for all sequences in that category.
  • Only one set of tuning parameters should be used for all videos.
  • Numerical scores can be computed using Matlab or Python programs available in UTILITIES. Both programs take the output produced by an algorithm, the ground truth, and a region-of-interest mask (see OVERVIEW under the DATASETS tab) and compute performance metrics described in EVALUATION under the RESULTS tab.
  • In order for a method to be ranked on this website, upload your results via the UPLOAD FOR 2012 DATASET and UPLOAD FOR 2014 DATASET pages.
  • If you use this facility to test and report results in any publication, we request that you acknowledge this website and cite the following overview paper (see CDW-2012 page) that summarizes the dataset, performance evaluation metrics, and the results and findings for 19 state-of-the-art methods:



The 2012 dataset, original website, and utilities associated with this benchmarking facility would not have materialized without the tireless efforts of Master's student Nil Goyette at Université de Sherbrooke, who has given his heart and soul to this marathon undertaking. We would also like to recognize the following individuals for their contributions to this effort:

  • Yi Wang, Université de Sherbrooke, Canada
    Webmaster, software developer, captured footage, helped with ground truthing
  • Nil Goyette, Université de Sherbrooke, Canada
    Former webmaster, software developer, captured footage, helped with ground truthing
  • Yannick Bénézeth, Université de Bourgogne, France
    Provided video footage, helped with ground truthing
  • Dotan Asselmann, Tel Aviv University, Israel
    Provided every video in the Turbulence category, helped with ground truthing
  • Shaozi Li, Weiyuan Zhuang, and Yaping You, Xiamen University, China
    Helped with ground truthing
  • Luke Sorenson and Lucas Liang, Boston University, USA
    Helped with ground truthing
  • Marc Vandroogenbroeck, Université de Liège, Belgium
    Provided video footage
  • ul. Inżynierska 4, 03-422 Warszawa, Poland
    Provided video footage