CPL Seminar

April 17

Nebojsa Jojic

Understanding multimedia using generative models

Most of the research on understanding natural signals is based on some
sort of model of the world. These models have typically been highly
specific about one aspect of the world, for instance, the appearance of
a human face, the motion type of a layer, or the spectral characteristics
of speech, but addressing other, "non-interesting" parts of the scene is
avoided or left to a separate integration module. The limited flow of
information and limited adaptivity of such systems make them very
brittle in realistic applications. In order to build more robust
understanding algorithms, models need to be capable of capturing various
aspects of the data at the same time, be fairly simple, but adapt to the
data.

Generative models, as defined by the machine learning community, are
flexible models that describe the data of interest through a feasible
generation process, starting only from a minimal number of parameters
and using sampling from appropriate probability distributions to
introduce variability. While the generative process itself is rarely
used directly, the descriptive power of the model is used for inference,
classification, and data manipulation.
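
As a rough illustration of the idea above, here is a minimal sketch of a toy generative model. It is not taken from the talk: the two-class Gaussian setup, the parameter values, and the function names are all assumptions chosen for brevity. The generate function runs the generation process (pick a class, then sample an observation), while posterior and classify use the same model in the other direction, for inference.

```python
import math
import random

# Hypothetical parameters of a toy two-class model (illustrative only).
PRIORS = [0.5, 0.5]   # p(class)
MEANS = [-2.0, 2.0]   # mean of p(x | class)
STDS = [1.0, 1.0]     # standard deviation of p(x | class)


def generate(n, seed=0):
    """Run the generative process: pick a class, then sample an observation."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        c = rng.choices([0, 1], weights=PRIORS)[0]
        x = rng.gauss(MEANS[c], STDS[c])
        samples.append((c, x))
    return samples


def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior(x):
    """Invert the model with Bayes' rule to get p(class | x)."""
    joint = [p * gaussian_pdf(x, m, s) for p, m, s in zip(PRIORS, MEANS, STDS)]
    total = sum(joint)
    return [j / total for j in joint]


def classify(x):
    """Return the index of the most probable class for observation x."""
    post = posterior(x)
    return max(range(len(post)), key=post.__getitem__)
```

In this sketch the sampling code is never needed to classify a point; only the probabilities it implies are used, which mirrors the remark that the generative process itself is rarely run directly.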

In this talk, I will give an overview of the generative approach to
multimedia understanding, and report some of our recent results on
audio-visual tracking; multimedia clustering, search, and retrieval; and
video editing tasks such as object extraction, illumination correction,
and stabilization. This is joint work with Brendan Frey and Hagai Attias.