
Staging Data for FathomNet

Jonathan Takahashi

CVision AI is proud to be a founding member and contributor to FathomNet. This post will cover how we used Tator to help our collaborators contribute to FathomNet.

FathomNet.org

FathomNet is a collection of curated underwater images and localizations. Localization data is stored alongside additional metadata and public image URLs. Because FathomNet is intended to be a high-quality dataset, metadata and localizations are expected to have undergone quality control and review before submission.

Staging Data on Tator for FathomNet Submissions

Tator serves as a staging area before submission to FathomNet: a place where raw data can be organized and reviewed before it is exported.

Our collaborators typically start with video data, not images. Biologists select frames where organisms are present, draw boxes around them, and label them with the correct species and other metadata. Our collaborators also want to choose which data they share. To handle both needs, we set up two projects: a private video project and a public image project. A frame extraction script that we wrote with tator-py extracts video frames from selected media sections and clones them to the public project along with their localizations and metadata. Metadata in the public project is configured to match the FathomNet specification. Mappings between attributes in the private and public projects can be defined, so the two projects don't need to share the same configuration.
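To make the attribute mapping idea concrete, here is a minimal sketch in plain Python. It is not our extraction script and it does not call tator-py. The attribute names on both sides and the `map_attributes` helper are hypothetical, chosen only for illustration. The sketch renames private-project attributes to their public-project counterparts and drops anything that has no mapping.

```python
from typing import Any, Dict

# Hypothetical mapping from private-project attribute names to
# public-project attribute names. These names are illustrative only;
# they are not taken from the FathomNet specification.
ATTRIBUTE_MAP: Dict[str, str] = {
    "Species": "concept",
    "Depth (m)": "depth_meters",
    "Observer": "observer",
}


def map_attributes(
    private_attrs: Dict[str, Any],
    attribute_map: Dict[str, str] = ATTRIBUTE_MAP,
) -> Dict[str, Any]:
    """Rename mapped attributes; unmapped attributes are not copied."""
    return {
        public_name: private_attrs[private_name]
        for private_name, public_name in attribute_map.items()
        if private_name in private_attrs
    }


# Example: "Internal notes" has no mapping, so it stays private.
private = {"Species": "Sebastes", "Depth (m)": 412.5, "Internal notes": "recheck"}
public = map_attributes(private)
```

One consequence of an explicit mapping like this is that only the attributes listed in it are copied, so anything left out stays in the private project.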

Localization metadata input form on Tator

Localizations Approval Workflow

Using this workflow, localizations can be created, reviewed, and approved for export prior to making anything public.

Once data is in the public project, there needs to be an easy way to export it to FathomNet. To facilitate this, we created an applet for FathomNet export. The applet lets users apply filters on the public project (by section, by date, etc.) and export the matching data in the FathomNet CSV format from within Tator. The CSV is downloaded to the user's computer and can be reviewed one more time before submitting.
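The filter-then-export idea can be sketched with Python's standard `csv` module. This is an illustration, not the applet's code. The column names in `COLUMNS`, the record fields, and the `export_csv` helper are assumptions made for this example, so consult the FathomNet documentation for the actual CSV columns.

```python
import csv
import io
from datetime import date

# Hypothetical column names for illustration; the real FathomNet CSV
# format may use different or additional columns.
COLUMNS = ["concept", "url", "x", "y", "width", "height"]


def export_csv(records, section=None, since=None):
    """Write records that pass the optional filters to a CSV string."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=COLUMNS,
        extrasaction="ignore",  # skip fields that are not export columns
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        if section is not None and record["section"] != section:
            continue
        if since is not None and record["created"] < since:
            continue
        writer.writerow(record)
    return buffer.getvalue()


records = [
    {"concept": "Sebastes", "url": "https://example.org/a.png",
     "x": 10, "y": 20, "width": 30, "height": 40,
     "section": "Dive 1", "created": date(2022, 5, 1)},
    {"concept": "Octopus", "url": "https://example.org/b.png",
     "x": 5, "y": 6, "width": 7, "height": 8,
     "section": "Dive 2", "created": date(2022, 6, 1)},
]

# Export only the records in one section.
csv_text = export_csv(records, section="Dive 1")
```

Writing to a string buffer rather than straight to disk keeps the sketch easy to inspect, in the same spirit as reviewing the downloaded CSV before submitting it.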

Private to public project workflow: Upload to a Private Video Project. Extract frames to a Public Image Project, and Export to FathomNet.

These steps result in a simple workflow for users:

Step 1 - Upload video or images to a private project. Video or images can be uploaded using scripts or through the web interface.

Step 2 - Annotate videos. Localizations are drawn using Tator's frame-accurate annotation view.

Step 3 - Extract video frames. Video frames containing localizations are extracted to a public project along with localizations.

Step 4 - Export to your computer. The FathomNet export applet is used to download a CSV in a format ready for upload.

Step 5 - Upload CSV to FathomNet. Create a user account on FathomNet.org and upload the resulting CSV file as a Collection.
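Step 3 above depends on knowing which video frames contain localizations. The sketch below shows one way to group localizations by frame and convert box coordinates to pixels. It is plain Python and not our tator-py script. The dictionary field names and both helper functions are hypothetical, and the sketch assumes box coordinates are stored as fractions of the image size, which may not match your data.

```python
from collections import defaultdict


def group_by_frame(localizations):
    """Group localizations by (media ID, frame number)."""
    frames = defaultdict(list)
    for loc in localizations:
        frames[(loc["media"], loc["frame"])].append(loc)
    return dict(frames)


def to_pixels(loc, image_width, image_height):
    """Convert a box with fractional coordinates to integer pixels.

    Assumes x, y, width, and height are fractions in [0, 1] of the
    image dimensions (an assumption made for this sketch).
    """
    return {
        "x": round(loc["x"] * image_width),
        "y": round(loc["y"] * image_height),
        "width": round(loc["width"] * image_width),
        "height": round(loc["height"] * image_height),
    }


locs = [
    {"media": 1, "frame": 30, "x": 0.25, "y": 0.5, "width": 0.5, "height": 0.25},
    {"media": 1, "frame": 30, "x": 0.1, "y": 0.1, "width": 0.2, "height": 0.2},
    {"media": 1, "frame": 90, "x": 0.0, "y": 0.0, "width": 1.0, "height": 1.0},
]

# Each key is a frame that would need to be extracted as an image.
frames = group_by_frame(locs)
first_box = to_pixels(locs[0], image_width=1920, image_height=1080)
```

Grouping first means each frame is extracted once, even when it carries several localizations.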

The UI for the FathomNet export applet on Tator allows users to match metadata to supported columns and export to a CSV for FathomNet Collections.

If you are a researcher looking to contribute to FathomNet, Tator is a free and open source tool that can be used for staging, annotating, hosting, and exporting your data. Please feel free to contact us with any questions, or if you are interested in using our cloud-based deployment of Tator.