Introduction

Some of the most exciting opportunities come from the intersection of unrelated fields. Using artificial intelligence to improve the life of surfers is one such example.

In this post we'll leverage the techniques from YOLACT ("You Only Look At CoefficienTs"), a 2019 paper on instance segmentation by Bolya, Zhou, Xiao and Lee, to build two proofs of concept:

  1. Next Gen of Surf Report Websites
    a. Real time data on crowd sizes
    b. More accurate reporting of surf quality
  2. Problem 1 of our AI surfing judge project
YOLACT: Real-time Instance Segmentation
We present a simple, fully-convolutional model for real-time instance segmentation that achieves 29.8 mAP on MS COCO at 33.5 fps evaluated on a single Titan Xp, which is significantly faster than any previous competitive approach. Moreover, we obtain this result after training on only one GPU. We ac…
AI surfing judge
Providing individuals the ability to have their surfing scored by WSL level judges in near real time for all their sessions with only a camera.

What is Instance Segmentation and YOLACT?

Instance segmentation is a branch of computer vision that focuses on the identification and delineation of like pixels into a bounded group. To make this definition less abstract it's helpful to contrast this approach with object detection and semantic segmentation.

When most people think of computer vision they picture images with labelled boxes for dogs, cats etc. This is classic object detection, which typically involves proposing a bounding box and then passing its contents to a classifier to assign the box a label.

Semantic segmentation differs in that each pixel within an image is assigned to a category. This segmentation facilitates an understanding of the interactions between objects. It does not, however, distinguish between separate objects of the same category.

Instance segmentation can therefore be viewed as a refined object detection method whereby bounding boxes are first established and segmentation is then performed within each one, giving every detected object its own pixel mask.

YOLACT is a parallelized instance segmentation framework capable of real-time detection. The speed of YOLACT is its primary contribution and unlocks several exciting use cases. At a high level, the model works by dividing the workload into two parallel tasks:

  1. Generation of instance-independent prototype masks ("protonet") using Fully Convolutional Networks
  2. Prediction of mask coefficients via an object detection branch

The final mask for each instance is then created by linearly combining the prototype masks, weighted by that instance's predicted coefficients.
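To make the linear combination concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function name `assemble_masks`, the nested-list data layout and the 0.5 threshold are my own assumptions for illustration, and the cropping to the predicted bounding box described in the paper is left out. A real implementation would do this as a single matrix multiplication on the GPU.

```python
import math


def assemble_masks(prototypes, coefficients, threshold=0.5):
    """Combine k prototype masks into one binary mask per instance.

    prototypes:   list of k masks, each an H x W grid (list of rows) of floats
    coefficients: list of n coefficient vectors, each of length k
    returns:      list of n binary H x W masks

    Hypothetical helper for illustration only.
    """
    height = len(prototypes[0])
    width = len(prototypes[0][0])
    masks = []
    for coeffs in coefficients:
        mask = []
        for y in range(height):
            row = []
            for x in range(width):
                # Linear combination of the prototypes at this pixel.
                activation = sum(c * p[y][x] for c, p in zip(coeffs, prototypes))
                # Sigmoid squashes the activation into (0, 1).
                probability = 1.0 / (1.0 + math.exp(-activation))
                row.append(1 if probability > threshold else 0)
            mask.append(row)
        masks.append(mask)
    return masks


# Two toy 2x2 prototypes: one fires everywhere, one fires on the right column.
prototypes = [
    [[3.0, 3.0], [3.0, 3.0]],
    [[-3.0, 6.0], [-3.0, 6.0]],
]
# Instance A subtracts the second prototype, instance B uses it directly.
coefficients = [[1.0, -1.0], [0.0, 1.0]]

masks = assemble_masks(prototypes, coefficients)
print(masks[0])  # [[1, 0], [1, 0]]  (left column only)
print(masks[1])  # [[0, 1], [0, 1]]  (right column only)
```

Note how instance A is isolated by subtracting one prototype from another. The prototypes are shared across the whole image, and only the small coefficient vector differs per instance, which is what keeps the per-instance cost low.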

Next Gen Surf Report Websites

There are two primary questions I ask before going for a surf: how good is it, and how crowded is it?

In Australia our two primary websites for surf reports, Swellnet and Surfline (formerly CoastalWatch), are woeful at answering these questions for a specific location. Their camera networks are, however, great, enabling patient viewers to answer the questions themselves. Recently, though, the cameras, which have been free for years, are being locked behind paywalls. This move seems to rely quite heavily on the absence of competition and has led to many frustrated surfers.

Our next generation of surf websites can answer these questions with AI whilst keeping human labor costs down. For example, let's use YOLACT to answer, in real time, the question: how crowded is the lineup?

Model Output - Final max detections at once: 6 

The segmentation is shown above to illustrate the YOLACT implementation; however, it would not need to be displayed to a user. The crowd size can be inferred from the maximum number of surfers simultaneously identified over a brief time window. This implicitly also provides a proxy for crowd concentration, as surfers may leave or enter the frame. It also helps protect against momentary shortcomings in the segmentation.
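As a rough sketch of that windowed maximum, assuming each frame's detections arrive as a list of dictionaries with a `label` and a `score` (a made-up format, not the output format of any particular YOLACT implementation), the crowd estimate could be computed like this. The function names, the 0.5 score cutoff and the window size are all assumptions. COCO has no "surfer" class, so the sketch counts `person` detections.

```python
from collections import deque


def count_surfers(detections, min_score=0.5, label="person"):
    """Count detections of the given label above a confidence cutoff.

    `detections` is a hypothetical per-frame list of
    {"label": str, "score": float} dictionaries.
    """
    return sum(
        1 for d in detections if d["label"] == label and d["score"] >= min_score
    )


def crowd_estimates(per_frame_counts, window_size):
    """Return, for each frame, the max count seen over the last `window_size` frames."""
    window = deque(maxlen=window_size)  # old frames fall off automatically
    estimates = []
    for count in per_frame_counts:
        window.append(count)
        estimates.append(max(window))
    return estimates


frame = [
    {"label": "person", "score": 0.91},
    {"label": "surfboard", "score": 0.88},
    {"label": "person", "score": 0.32},  # too uncertain, ignored
]
print(count_surfers(frame))  # 1

# Per-frame counts flicker as surfers duck-dive or the model misses someone.
counts = [3, 5, 4, 6, 2, 1, 1]
print(crowd_estimates(counts, window_size=3))  # [3, 5, 5, 6, 6, 6, 2]
```

Taking the maximum rather than the mean is a deliberate choice here: missed detections only ever push the count down, so the peak over a short window is the less biased estimate.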

Thanks to the COCO dataset, which has over 1.5 million object instances, we have all the data we need to accurately segment surfers and answer "How crowded is it?". Note that a generic object detection method could also fit this use case.

COCO - Common Objects in Context

"How good is the surf?" is a more challenging question given the lack of data to train on. Our primary requirement is a dataset of footage for a surf location with corresponding labels ranking the quality.

For an incumbent trying to build this dataset, I think a natural approach is to gamify the collection. Instead of locking the cameras behind paywalls, prompt users to rate the surf conditions they just watched in order to extend their view time. This may also lead to increased sign-ups by fostering a sense of contribution to the improvement of the site.

Building off these features, there's a good chance to disrupt the incumbents in regional areas. Whilst Surfline boasts of its large camera networks, for most users, to put it bluntly, this is completely irrelevant. If I'm a surfer in Sydney or the Gold Coast I couldn't care less about a camera at Haggerty's in the Palos Verdes Estates of LA County, and vice versa.

Looking at the Australian market, there is a passionate network of over 200 Boardriders clubs. These are groups of local surfers that hold monthly contests and are committed to their break. Historically these groups have been notoriously hard to crack for our incumbents. Some of their cameras, for example, have been prone to 'disappear' from heavily localized areas from time to time.

A solution could be to cater specifically for these groups by creating specialized cameras that are accessible to members of these clubs, and through which they directly contribute to the development of our model for surf quality. This would enable access to better locations for the camera and promote significantly less noise in the training data. As these local experts are also the primary beneficiaries, they are incentivized to accurately rank the footage they watch. This approach has parallels to the 'data dignity' concept discussed by Lanier and Weyl.

The IP ultimately developed could then be generalized to more surf breaks, either to build a direct challenger to the incumbents or to be sold.

Problem 1 of the AI Surf Judge

In developing our judge, accurate real-time segmentation of the surfer and their surfboard, such as that provided by YOLACT, is an essential building block. This facilitates both speed and duration analysis without the need for sensors. Relative positioning data and its evolution between frames can also be inferred, enabling the identification of turns. These can be used as variables in our model for scoring the surfer's performance.
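As a simple illustration of how speed and duration might fall out of the masks, the sketch below tracks the centroid of the surfer's mask from frame to frame. Everything here is an assumption for illustration rather than part of the judge itself: the helper names `mask_centroid` and `ride_stats`, the binary nested-list masks, and the fact that speed is reported in pixels per second (converting to real-world units would need camera calibration, which is not covered here).

```python
import math


def mask_centroid(mask):
    """Return the (x, y) centroid of the non-zero pixels in a binary mask, or None."""
    total_x, total_y, pixel_count = 0, 0, 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                total_x += x
                total_y += y
                pixel_count += 1
    if pixel_count == 0:
        return None  # surfer not visible in this frame
    return (total_x / pixel_count, total_y / pixel_count)


def ride_stats(centroids, fps):
    """Estimate ride duration (seconds) and speed (pixels/second).

    `centroids` holds one (x, y) point per consecutive frame of a single ride.
    """
    if len(centroids) < 2:
        return {"duration_s": 0.0, "speeds_px_s": [], "mean_speed_px_s": 0.0}
    # Distance moved between consecutive frames, scaled by the frame rate.
    speeds = [
        math.dist(a, b) * fps for a, b in zip(centroids, centroids[1:])
    ]
    return {
        "duration_s": (len(centroids) - 1) / fps,
        "speeds_px_s": speeds,
        "mean_speed_px_s": sum(speeds) / len(speeds),
    }


print(mask_centroid([[0, 1, 1], [0, 1, 1]]))  # (1.5, 0.5)

stats = ride_stats([(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)], fps=10)
print(stats["duration_s"])       # 0.2
print(stats["mean_speed_px_s"])  # 50.0
```

The same sequence of centroids could, in principle, feed a turn detector by looking at how the direction of travel changes between frames, but that is beyond this sketch.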

More detail on the surfing judge can be found here:

AI surfing judge
Providing individuals the ability to have their surfing scored by WSL level judges in near real time for all their sessions with only a camera.

Summary

There are many elements of the surfing ecosystem that can benefit from artificial intelligence. In this post we demonstrated using instance segmentation to improve surf report websites by identifying in real time how crowded a lineup is, and proposed a disruptive model for evaluating the quality of the conditions. We also showed how YOLACT could be used to support problem 1 of our AI surfing judge.

Continue the discussion

Feel free to send me an email!
📧 will@willbosler.com