The world is full of objects: cups, phones, computers, books, and
countless other things. For many tasks, robots need to understand that
this object is a stapler, that object is a textbook, and this other
object is a gallon of milk. The classic approach to this problem is
object recognition, which classifies each observation into one of
several previously defined classes. Although modern object recognition
algorithms perform well, they require extensive supervised training:
in a standard benchmark, the training data average more than four
hundred images of each object class.
The cost of manually labeling the training data prohibits these
techniques from scaling to general environments. Homes and workplaces
can contain hundreds of unique objects, and the objects in one
environment may not appear in another.
We propose a different approach: object discovery. Rather than rely on
manual labeling, we describe unsupervised algorithms that leverage the
unique capabilities of a mobile robot to discover the objects (and
classes of objects) in an environment. Because our algorithms are
unsupervised, they scale gracefully to large, general environments
over long periods of time. To validate our approach, we collected data
from 67 robot runs through a large office environment. This dataset, which
we have made available to the community, is the largest of its kind.
At each step, we treat the problem as one of robotics, not disembodied
computer vision. The scale and quality of our results demonstrate the
merit of this perspective and prove the practicality of long-term,
large-scale object discovery.