Region Based Convolutional Neural Networks

Region-based Convolutional Neural Networks (R-CNN) are a family of machine learning models for <a href="/facts/Computer_vision/Tl2Yyk66">computer vision</a>, and specifically for <a href="/facts/Object_detection/sCx3FK44">object detection</a> and localization. The original goal of R-CNN was to take an input image and produce a set of <a href="/facts/Minimum_bounding_box/oFHL8bxI">bounding boxes</a> as output, where each bounding box contains an object and is labeled with the object's category (e.g. car or pedestrian). In general, R-CNN architectures first generate candidate regions (in the original R-CNN, by selective search over the image) and then classify each region using features computed by a CNN.
R-CNN has been extended to perform other computer vision tasks, such as tracking objects from a drone-mounted camera, locating text in an image, and enabling object detection in <a href="/facts/Google_Lens/wxUfPztU">Google Lens</a>.
Mask R-CNN is also one of seven tasks in the MLPerf Training Benchmark, which is a competition to speed up the training of neural networks.
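
The input and output described above (an image in, a set of labeled bounding boxes out) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `propose_regions` uses fixed sliding windows as a stand-in for a real region proposal method such as selective search, and `toy_classifier` uses mean brightness as a stand-in for CNN features and a trained classifier. None of the names come from an actual R-CNN implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), max is exclusive
Image = List[List[float]]        # grayscale image as rows of pixel intensities


@dataclass
class Detection:
    box: Box
    category: str
    score: float


def propose_regions(image: Image, size: int, stride: int) -> List[Box]:
    """Hypothetical stand-in for region proposal: fixed-size sliding windows."""
    height, width = len(image), len(image[0])
    return [
        (x, y, x + size, y + size)
        for y in range(0, height - size + 1, stride)
        for x in range(0, width - size + 1, stride)
    ]


def crop(image: Image, box: Box) -> Image:
    """Cut the pixels covered by a box out of the image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def toy_classifier(region: Image) -> Tuple[str, float]:
    """Hypothetical stand-in for CNN features plus a classifier: mean brightness."""
    pixels = [p for row in region for p in row]
    mean = sum(pixels) / len(pixels)
    return ("object", mean) if mean >= 0.5 else ("background", 1.0 - mean)


def detect(
    image: Image,
    classify: Callable[[Image], Tuple[str, float]],
    size: int = 2,
    stride: int = 2,
) -> List[Detection]:
    """Propose regions, classify each one, and keep the non-background boxes."""
    detections = []
    for box in propose_regions(image, size, stride):
        category, score = classify(crop(image, box))
        if category != "background":
            detections.append(Detection(box, category, score))
    return detections


# A 4x4 image with a bright 2x2 patch in the lower-right corner.
image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
detections = detect(image, toy_classifier)
```

Here `detections` holds a single `Detection` whose box covers the bright patch. A real detector would replace both stand-ins with learned components and would typically also refine the box coordinates.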
