Real-time Action Detection for Sports
Tensorway developed advanced computer vision models that significantly improved real-time hit detection accuracy through both audio- and video-based tracking.

Story behind
Our client’s product is a fitness game involving a small ball attached to a headband by an elastic string, designed to improve coordination, reflexes, and cardiovascular fitness.
Players wear the headband and punch the ball, attempting to keep it in motion without letting it fall, making it a fun and engaging way to exercise.
The client’s mobile app gamifies the experience with competitions where users track the number of ball hits within a set time frame or race to reach a target hit count.
Goal
Tensorway’s task was to create an AI model that counts ball hits in real time with maximum accuracy, eliminating the need for manual checks. The client already had a solution for this, but it failed to deliver accurate results due to technological limitations.
Challenges
The existing audio-based solution delivered inaccurate results due to background noise, thus requiring manual score verification.
The resulting AI model also needed to be lightweight enough to run smoothly on mobile devices.

Tensorway’s solution
Audio classification model
Our first approach used a computer vision model to enhance audio classification. By computing the spectrogram of the audio, which transforms sound into a visual representation, the model could classify hits far more accurately. Optimizing the algorithms halved the Mean Absolute Error (MAE) compared to the previous solution.
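To make the approach concrete, here is a minimal sketch of a spectrogram-based hit classifier, assuming torchaudio for the transform and a small CNN; the sample rate, mel parameters, and layer sizes are illustrative assumptions, not the production configuration.

```python
import torch
import torch.nn as nn
import torchaudio

# Turn a short audio window into a log-mel spectrogram "image"
# (16 kHz input and 64 mel bins are illustrative choices).
mel = torchaudio.transforms.MelSpectrogram(
    sample_rate=16000, n_fft=512, hop_length=128, n_mels=64
)
to_db = torchaudio.transforms.AmplitudeToDB()

class HitClassifier(nn.Module):
    """Tiny CNN over spectrograms: hit vs. no-hit per audio window."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 2),
        )

    def forward(self, waveform):             # waveform: (batch, samples)
        spec = to_db(mel(waveform))          # (batch, n_mels, time)
        return self.net(spec.unsqueeze(1))   # add channel dim -> logits

model = HitClassifier()
window = torch.randn(1, 16000)               # one second of audio
logits = model(window)                        # hit / no-hit scores
```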
Transition to action detection using CV
To address the background noise and cheating inherent in audio models, we switched from audio analysis to a computer vision model for action detection. This model analyzes consecutive video frames to detect hits, which required sophisticated labeling and dataset cleaning. The optimized lightweight model significantly improved accuracy, effectively overcoming the background noise limitations.
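One lightweight way to realize such a detector, sketched below, is to stack a short window of grayscale frames as input channels to a 2D CNN, a common alternative to heavier 3D architectures; the 8-frame window and layer sizes are illustrative assumptions, not Tensorway’s exact design.

```python
import torch
import torch.nn as nn

class HitDetector(nn.Module):
    """Classify a short clip by stacking consecutive grayscale frames
    as input channels (hit vs. no-hit)."""
    def __init__(self, num_frames=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(num_frames, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, 2),                 # hit / no-hit logits
        )

    def forward(self, frames):                # frames: (batch, T, H, W)
        return self.net(frames)

model = HitDetector(num_frames=8)
clip = torch.randn(1, 8, 112, 112)            # sliding window of 8 frames
logits = model(clip)
```

In a live setting, this window would slide over the camera feed, with consecutive positive detections debounced so that a single hit is not counted twice.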
Multistep training pipelines
We developed multistep training pipelines for both the audio- and video-based approaches from scratch. This involved data gathering, labeling, and the development of complex audio/image/video labeling setups to ensure the models were trained on high-quality data.
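As an illustration of the labeling side, the record below shows one hypothetical annotation schema storing per-session hit timestamps that both the audio and video pipelines could consume; all field names here are invented for this sketch.

```python
# One possible per-session annotation record (hypothetical schema).
annotation = {
    "session_id": "2023-05-12_user42",
    "video": "sessions/2023-05-12_user42.mp4",
    "audio": "sessions/2023-05-12_user42.wav",
    "fps": 30,
    "hits": [1.84, 2.31, 2.79, 3.40],  # hit timestamps in seconds
}

# Expand hit timestamps into per-frame labels for the video model.
def frame_labels(hits, fps, num_frames, tolerance=0.1):
    labels = [0] * num_frames
    for t in hits:
        for f in range(num_frames):
            if abs(f / fps - t) <= tolerance:
                labels[f] = 1
    return labels

labels = frame_labels(annotation["hits"], annotation["fps"], num_frames=150)
```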
Model deployment and application development
We converted the models for deployment on Android and iOS platforms, building demo applications to showcase the improved hit-counting accuracy in real-time scenarios. This step ensured that the models were not only theoretically sound but also practically applicable.
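The case study does not name the conversion toolchain; the sketch below shows one plausible route from a trained PyTorch model like the ones sketched above: trace it with TorchScript, then produce a Core ML package for iOS via coremltools and reuse the TorchScript artifact for PyTorch Mobile on Android.

```python
import torch
import coremltools as ct

# HitDetector is the class from the earlier sketch; trained weights
# are assumed to be loaded.
model = HitDetector(num_frames=8).eval()
example = torch.randn(1, 8, 112, 112)
traced = torch.jit.trace(model, example)      # portable TorchScript graph

# iOS: convert the traced model to a Core ML package.
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(shape=example.shape)],
    convert_to="mlprogram",
)
mlmodel.save("HitDetector.mlpackage")

# Android: ship the TorchScript file and load it with PyTorch Mobile
# (re-exporting through ONNX is another common option).
traced.save("hit_detector.pt")
```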

As a result...
Developed lightweight computer vision models, significantly improving real-time hit-counting accuracy.
Created multistep training pipelines for both the audio- and video-based approaches from scratch.
Managed data gathering, labeling, and the development of complex audio/image/video labeling setups.
Converted models for deployment on Android and iOS, building demo applications.
The audio model halved the error (MAE) of the previous solution, and the video-based model delivered even better precision.
Project team, steps, and timeline
Team
Initial research with different POCs
Audio POC
Additional data cleaning and labeling
Audio training pipeline
Model conversion, mobile deployment, and robustness improvements
Switch to video approach (POCs)
Initial data labeling
Video training pipeline and testing different approaches
Additional data labeling
Demo apps
Total project duration
Other possible applications
The Tensorway team is exploring a unified model that combines the audio and video approaches for the best performance. Our goal is to overcome limitations such as background noise in audio, video quality issues, and camera positioning, creating a solution that delivers robust performance in any conditions.
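One simple way to realize such a unified model, reusing the hypothetical HitClassifier and HitDetector from the sketches above, is late fusion: run both branches and combine their logits with a small learned head. This is a sketch of the idea, not a committed design.

```python
import torch
import torch.nn as nn

class FusedHitDetector(nn.Module):
    """Late fusion of the audio and video branches: each branch emits
    hit / no-hit logits, and a small head combines them."""
    def __init__(self, audio_model, video_model):
        super().__init__()
        self.audio_model = audio_model
        self.video_model = video_model
        self.head = nn.Linear(4, 2)            # 2 logits from each branch

    def forward(self, waveform, frames):
        a = self.audio_model(waveform)         # audio logits
        v = self.video_model(frames)           # video logits
        return self.head(torch.cat([a, v], dim=1))
```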
We create state-of-the-art AI solutions for sports & entertainment. Tensorway’s custom models transform how users experience sports.
Interested in exploring the possibilities?
Reach out to us to discuss more.