Fine-Grained Categorization
Fine-grained categorization, a sub-field of object recognition, aims to distinguish subordinate categories within entry-level categories. Examples include recognizing bird species such as “northern cardinal” or “indigo bunting”, or flowers such as “tulip” or “cherry blossom”. Compared with generic object recognition, fine-grained categorization often demands different approaches to data, algorithms, and the use of human expertise.
At SE(3), we are mainly interested in three fundamental problems of fine-grained categorization: 1) building large-scale, high-quality datasets for benchmarking fine-grained categorization methods; 2) designing algorithms that are better suited to fine-grained recognition tasks; 3) exploring ways to bring human expertise into fine-grained categorization.
Works
- Kernel Pooling for Convolutional Neural Networks
Convolutional Neural Networks (CNNs) with Bilinear Pooling, initially in their full form and later using compact representations, have yielded impressive performance gains on a wide range of visual tasks, including fine-grained visual categorization, visual question answering, face recognition, and description of texture and style. The key to their success lies in the spatially invariant modeling ...
- Boosted Convolutional Neural Networks
In this work, we propose a new algorithm for boosting Deep Convolutional Neural Networks (BoostCNN) to combine the merits of boosting and modern neural networks. To learn this new model, we propose a novel algorithm to incorporate boosting weights into the deep learning architecture based on a least squares objective function. We also show that it ...
- Fine-Grained Categorization and Dataset Bootstrapping using Deep Metric Learning with Humans in the Loop
Existing fine-grained visual categorization methods often suffer from three challenges: lack of training data, large number of fine-grained categories, and high intra-class vs. low inter-class variance. In this work we propose a generic iterative framework for fine-grained categorization and dataset bootstrapping that handles these three challenges. Using deep metric learning with humans in the loop, ...
- Building a bird recognition app and large scale dataset with citizen scientists: The fine print in fine-grained dataset collection
We introduce tools and methodologies to collect high quality, large scale fine-grained computer vision datasets using citizen scientists – crowd annotators who are passionate and knowledgeable about specific domains such as birds or airplanes. We worked with citizen scientists and domain experts to collect NABirds, a new high quality dataset containing 48,562 images of North ...
- Learning Localized Perceptual Similarity Metrics for Interactive Categorization
Current similarity-based approaches to interactive fine-grained categorization rely on learning metrics from holistic perceptual measurements of similarity between objects or images. However, making a single judgment of similarity at the object level can be a difficult or overwhelming task for the human user to perform. Secondly, a single general metric of similarity may not be ...
- Bird Species Categorization Using Pose Normalized Deep Convolutional Nets
We propose an architecture for fine-grained visual categorization that approaches expert human performance in the classification of bird species. Our architecture first computes an estimate of the object’s pose; this is used to compute local image features which are, in turn, used for classification. The features are computed by applying deep convolutional nets to image ...
- The Ignorant Led by the Blind: A Hybrid Human–Machine Vision System for Fine-Grained Categorization
We present a visual recognition system for fine-grained visual categorization. The system is composed of a human and a machine working together and combines the complementary strengths of computer vision algorithms and (non-expert) human users. The human users provide two heterogeneous forms of information: object part clicks and answers to multiple choice questions. The machine ...
- Similarity Comparisons for Interactive Fine-Grained Categorization
Current human-in-the-loop fine-grained visual categorization systems depend on a predefined vocabulary of attributes and parts, usually determined by experts. In this work, we move away from that expert-driven and attribute-centric paradigm and present a novel interactive classification system that incorporates computer vision and perceptual similarity metrics in a unified framework. At test time, users ...
- Attribute-Based Detection of Unfamiliar Classes with Humans in the Loop
Recent work in computer vision has addressed zero-shot learning or unseen class detection, which involves categorizing objects without observing any training examples. However, these problems assume that attributes or defining characteristics of these unobserved classes are known, leveraging this information at test time to detect an unseen class. We address the more realistic problem of ...
- Style Finder: Fine-Grained Clothing Style Recognition and Retrieval
With the rapid proliferation of smartphones and tablet computers, search has moved beyond text to other modalities like images and voice. For many applications like Fashion, visual search offers a compelling interface that can capture stylistic visual elements beyond color and pattern that cannot be as easily described using text. However, extracting and matching such ...
- Bootstrapping Fine-Grained Classifiers: Active Learning with a Crowd in the Loop
We propose an iterative crowd-enabled active learning algorithm for building high-precision visual classifiers from unlabeled images. Our method employs domain experts to identify a small number of examples of a specific visual event. These expert-labeled examples seed a classifier, which is then iteratively trained by active querying of a non-expert crowd. These non-experts actively refine ...
- Multiclass Recognition and Part Localization with Humans in the Loop
We propose a visual recognition system that is designed for fine-grained visual categorization. The system is composed of a machine and a human user. The user, who is unable to carry out the recognition task by himself, is interactively asked to provide two heterogeneous forms of information: clicking on object parts and answering binary questions. ...
- Visual Recognition with Humans in the Loop
We present an interactive, hybrid human-computer method for object classification. The method applies to classes of objects that are recognizable by people with appropriate expertise (e.g., animal species or airplane model), but not (in general) by people without such expertise. It can be seen as a visual version of the 20 questions game, where questions ...
- Caltech-UCSD Birds 200
Caltech-UCSD Birds 200 (CUB-200) is a challenging image dataset annotated with 200 bird species (mostly North American). It was created to enable the study of subordinate categorization, which is not possible with other popular datasets that focus on basic level categories (such as PASCAL VOC, Caltech-101, etc.). The images were downloaded from the website Flickr ...