ROVIA: Automated Underwater Video Highlight Generation using Deep Neural Network

Document Type

Conference Proceeding

Date of Original Version



Deep-sea video is one of the most important data sources in deep-sea science, but it is also extremely challenging to use and archive. With technological advances, underwater videos collected by HOVs (Human Occupied Vehicles), AUVs (Autonomous Underwater Vehicles), and ROVs (Remotely Operated Vehicles) produce extreme volumes of data. For general use, however, underwater dive videos can be sparse, with only a few high-value clips interspersed among hours of footage relevant only to specific domains. Condensing such high-volume datasets is time-consuming because human annotators must review and clip the videos manually to identify highlights. Our study develops a portable, field-deployable CNN (Convolutional Neural Network) model to identify potential biological, geological, and operational highlights in long dive videos. ROVIA is a deep-sea video highlight generator that extracts spatiotemporal features linked to camera zooming, organism movement, and changes in optical flow to identify highlights accurately. This automated highlight generator makes condensing deep-sea video more efficient, which aids archiving and enhances the use of the clips for scientific and educational purposes.

Publication Title, e.g., Journal

Oceans Conference Record (IEEE)