Title

Timescale Disentanglement

Research Mentor(s)

Scott Wehrwein

Description

Long-term video streams, such as those available from live-streaming webcams, provide a unique view into scene changes that unfold over multiple timescales. While timelapses and other non-learning-based techniques can reveal some of these changes, they struggle to simultaneously represent changes occurring at different timescales. In our approach, we represent the frames of a video as latent vectors produced by learned models. We focus on two major directions: (1) learning models that encode and decode frames to and from a latent vector, and (2) learning a model that directly generates frames from latent vectors conditioned on user-specified properties. In both cases, we aim to enforce that the latent vector represents timescale-related content in a separable fashion, either directly as in case (2) or indirectly via a novel loss function as in case (1). The ability to analyze and manipulate these latent representations has the potential to provide insights into long-term video that extend beyond what non-learned approaches can reveal. We refer to this process of separating out timescale-related content through learned latent-space representations as "timescale disentanglement."
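To make the encode/decode direction (case 1) concrete, here is a minimal sketch assuming a PyTorch autoencoder whose latent vector is partitioned into one chunk per timescale, with a simple invariance penalty standing in for the project's novel loss function. Every name, dimension, and the frame-pairing strategy below is a hypothetical illustration, not the project's actual code.

```python
# Illustrative sketch only: a toy autoencoder whose latent vector is split
# into per-timescale chunks, with a simple invariance loss standing in for
# the "novel loss function" of case (1). All names, sizes, and the pairing
# strategy are assumptions made for illustration.
import torch
import torch.nn as nn

class TimescaleAutoencoder(nn.Module):
    def __init__(self, frame_dim=3 * 64 * 64, chunk_dims=(16, 16)):
        # chunk_dims: latent sizes assigned to each timescale,
        # e.g. (diurnal, seasonal) -- an assumed two-timescale setup.
        super().__init__()
        self.chunk_dims = chunk_dims
        latent_dim = sum(chunk_dims)
        self.encoder = nn.Sequential(
            nn.Linear(frame_dim, 256), nn.ReLU(), nn.Linear(256, latent_dim))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, frame_dim))

    def forward(self, frames):                  # frames: (batch, frame_dim)
        z = self.encoder(frames)                # one latent vector per frame
        return self.decoder(z), z

def invariance_loss(z_a, z_b, chunk_dims, static_chunk):
    # For a pair of frames that differ only at one timescale, the chunk
    # tied to the *other* timescale should stay constant; penalize change.
    chunks_a = torch.split(z_a, list(chunk_dims), dim=1)
    chunks_b = torch.split(z_b, list(chunk_dims), dim=1)
    return ((chunks_a[static_chunk] - chunks_b[static_chunk]) ** 2).mean()

# One hypothetical training step: frames_a and frames_b are assumed to be
# sampled from the same day of the year but different hours, so the
# seasonal chunk (index 1) should be invariant across the pair.
model = TimescaleAutoencoder()
frames_a = torch.randn(8, 3 * 64 * 64)
frames_b = torch.randn(8, 3 * 64 * 64)
recon_a, z_a = model(frames_a)
_, z_b = model(frames_b)
loss = ((recon_a - frames_a) ** 2).mean() \
     + 0.1 * invariance_loss(z_a, z_b, model.chunk_dims, static_chunk=1)
loss.backward()
```

Partitioning the latent vector this way is one way to make "separable fashion" concrete: if disentanglement succeeds, editing or interpolating one chunk while holding the others fixed would change the decoded frame only at that chunk's timescale.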

Document Type

Event

Start Date

May 18, 2022, 9:00 AM

End Date

May 18, 2022, 5:00 PM

Location

Carver Gym (Bellingham, Wash.)

Department

CSE - Computer Science

Genre/Form

student projects; posters

Type

Image

Rights

Copying of this document in whole or in part is allowable only for scholarly purposes. It is understood, however, that any copying or publication of this document for commercial purposes, or for financial gain, shall not be allowed without the author’s written permission.

Language

English

Format

application/pdf
