Large-Scale Learning For Media Understanding: Editorial Open Access
DOI 10.1186/s13640-015-0080-7
© 2015 Rocha and Scheirer. This is an Open Access article distributed under the terms of the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any
medium, provided the original work is properly credited.
will yield a quick answer as to whether or not it is consistent with human behavior.
3. Move away from a strict adherence to data sets. Related to the above observation, we
have observed that posting good numbers on a benchmark data set is no longer a means to
an end, but an end in itself. It is generally not true that a good result on a particular
data set means that the algorithm which produced it will always perform well on images
from outside of that data set. Well before all of the excitement over the performance of
deep learning architectures on the ImageNet challenge [11], Torralba and Efros [12]
questioned the field's singular focus on such narrow problems, arguing that all data sets
in computer vision contain some measure of easily learned bias that can inevitably lead
to false conclusions. Bias becomes evident when testing an algorithm's cross-data set
generalization ability, or, in other words, training a model on one data set and applying
it to another. We recommend that researchers go even further by testing their algorithms
on data from sources external to any data set: if an algorithm fails when presented with
frames from a live camera, more work needs to be done.
4. Avoid dogma (but do so in a principled way). Like any academic field, machine learning
has its share of subdisciplines, each with its own prescriptions for problem solving.
Sometimes, these subdiscipline-specific views become stumbling blocks to general progress.
An example of this is the topic of convex optimization, which has come to be the dominant
mode of optimization for visual recognition problems. More often than not, it is frowned
upon to propose an algorithm that may get trapped in local minima, even if it demonstrates
superior empirical performance over the state-of-the-art. Thankfully, the reemergence of
artificial neural networks, which are non-convex, has loosened this tension by
demonstrating the utility of complex and hierarchical network structures that are not
amenable to convex optimization [13]. Hence, strive to design an algorithm that works to
your performance specification, and not one that is unnecessarily constrained by theory.
However, if a theory does lead you to a good solution for particular cases, take
advantage of it.
5. Seek different evidence when characterizing visual data. There is no silver bullet to
solve all problems, especially when describing images. Different problems often demand
different forms of image description. However, even within a single problem, it is hard
to think of a simple descriptor that captures all the nuances and cues present in an
image. Consider the example of content-based image retrieval. Using a color descriptor is
not enough to capture all possible class variability. Including other complementary
features, such as shape and texture, is key for a successful retrieval system. In a
remote-sensing image-classification system, the RGB color channels are just one way to
capture image information. Infrared channels can also play an important role, and each
channel can have its own custom-tailored descriptors. Therefore, we recommend thinking of
possible complementary features when dealing with visual problems, along with innovative
ways for combining them. Sometimes what seems unsolvable using just one piece of visual
evidence becomes much easier when considering evidence from different and complementary
features and sensors.
6. Be aware of machine-learning black-boxes. With the ever-increasing need for processing
vast amounts of data, researchers often rely on off-the-shelf machine learning solutions
to tackle their problems using so-called black-boxes. Although it is quick and easy to
turn to such solutions, this comes at a price; if the underlying problem is poorly
understood, the default parameters of a chosen black-box model will likely result in poor
performance. Hence, we recommend that researchers pay close attention to the intrinsic
properties of their problems and to carefully choose the learning algorithm and its
parameters when actually implementing a solution. Sometimes just a small amount of
parameter tuning can save weeks of processing and yield very good classification results.
7. Think of new useful applications. Researchers these days concentrate on just a handful
of well-known applications. Digital photo tagging is, without a doubt, a great
application, but it is not the only one we should be working on. Get creative when
demonstrating the capabilities of a new algorithm. Some interesting applications that we
have seen lately include the following: shellfish detection for the protection of
fisheries [14], digital restoration of historical documents [15], and steering headlight
beams around raindrops [16]. These are a good start, but there is certainly much more
over the horizon.

With the above advice setting the stage, this special issue examines emerging questions
and algorithms related to complex visual processing tasks where machine learning is
applicable. This spans a number of important problems at multiple stages of the image
analysis pipeline, from features to decision-making strategies, all the way through to
end-user applications. This issue brings together seven articles describing original
research that is closely matched to these stages.
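
As a small illustration of the cross-data set test recommended in item 3, the following Python sketch trains a classifier on one synthetic data set and scores it both on held-out samples from the same source and on a second source. Everything in it is an assumption made for illustration and none of it comes from the articles cited above: the two-feature data, the nearest-centroid classifier, and the names `make_dataset`, `fit_centroids`, and `evaluate_generalization` are all hypothetical. Feature 0 stands in for a weak but genuine visual cue, and feature 1 for a collection artifact that tracks the label only in the training source.

```python
import random


def make_dataset(n, bias_strength, rng):
    """Generate n labeled 2-D samples (synthetic, for illustration only).

    Feature 0 is a weak but genuine cue. Feature 1 is a collection artifact
    whose correlation with the label is controlled by bias_strength.
    """
    samples = []
    for _ in range(n):
        label = rng.choice([0, 1])
        sign = 1.0 if label == 1 else -1.0
        genuine = rng.gauss(0.5 * sign, 1.0)
        artifact = rng.gauss(bias_strength * sign, 0.5)
        samples.append(((genuine, artifact), label))
    return samples


def fit_centroids(samples):
    """Return the mean feature vector of each class."""
    sums = {0: [0.0, 0.0], 1: [0.0, 0.0]}
    counts = {0: 0, 1: 0}
    for (x0, x1), label in samples:
        sums[label][0] += x0
        sums[label][1] += x1
        counts[label] += 1
    return {c: (sums[c][0] / counts[c], sums[c][1] / counts[c]) for c in sums}


def predict(centroids, x):
    """Assign x to the class with the nearest centroid."""
    def sq_dist(c):
        return (x[0] - c[0]) ** 2 + (x[1] - c[1]) ** 2
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, samples):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(predict(centroids, x) == y for x, y in samples)
    return correct / len(samples)


def evaluate_generalization(seed=0, n=500):
    """Train on source A, then score on held-out A and on source B."""
    rng = random.Random(seed)
    train_a = make_dataset(n, 3.0, rng)  # artifact strongly tied to the label
    test_a = make_dataset(n, 3.0, rng)   # same source, same bias
    test_b = make_dataset(n, 0.0, rng)   # different source, artifact is noise
    model = fit_centroids(train_a)
    return accuracy(model, test_a), accuracy(model, test_b)


within, cross = evaluate_generalization()
print(f"within-data set accuracy: {within:.2f}")
print(f"cross-data set accuracy:  {cross:.2f}")
```

Under these assumptions, the model leans on the artifact, so the within-data set score should look excellent while the cross-data set score drops toward chance. A benchmark number taken on its own would hide that gap, which is why the second evaluation is worth running.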
