The freest of lunches:
Using out-of-domain data to boost
oceanographic image classification

Eric Orenstein

University of California, San Diego

May 23, 2018
Pacific Forum—11:00 a.m.

Over the past decade, the biological oceanographic community has increasingly relied on in situ digital imaging to sample the denizens of the sea. These data sets have grown intractably large, requiring countless hours of human labor for analysis. Oceanographers have begun to leverage advances in machine learning to automate the process. In this talk, I will outline ongoing efforts in the Jaffe Laboratory for Underwater Imaging to speed classification of data from the Scripps Plankton Camera. I will focus on experiments using out-of-domain data to boost the performance of machine classifiers and present two early time series analyses—one examining a parasite-host relationship, the other tracking the occurrence of chain-forming diatoms. Data from other sources do indeed improve accuracy, but more work remains to judge the quality of the labels and develop consistent annotation pipelines. There is, after all, no such thing as a free lunch.


