Project Background and Objective

Respiratory rate (RR) is an important indicator of the health and welfare status of dairy cows. In recent years, progress has been made in monitoring the RR of dairy cows using video data and learning-based methods. However, existing approaches often chain multiple processing modules, such as region-of-interest detection and tracking, so errors introduced at one step propagate through the steps that follow. The objective of this project was to develop an end-to-end computer vision method that predicts the RR of dairy cows continuously and automatically.

Model

The method leverages the capabilities of a state-of-the-art Transformer model, VideoMAE, which divides video frames into patches as input tokens, enabling the automated selection and featurization of relevant regions, such as a cow’s abdomen, for predicting RR. The original encoder of VideoMAE was retained, and a classification head was added on top of it.
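To make the patch-as-token idea concrete, here is a minimal sketch of how a video clip maps to tokens. It assumes the input geometry of the published VideoMAE base configuration (16 frames of 224 x 224 pixels, 16 x 16 spatial patches, tubelets spanning 2 frames) and a time-major token ordering. The clip size actually used in this project is not stated here, and the function names are hypothetical.

```python
def videomae_token_count(num_frames=16, height=224, width=224,
                         patch_size=16, tubelet_size=2):
    """Number of input tokens for one clip.

    Defaults follow the published VideoMAE base configuration; they are an
    assumption, not the confirmed settings of this project.
    """
    if num_frames % tubelet_size or height % patch_size or width % patch_size:
        raise ValueError("clip dimensions must be divisible by the patch dimensions")
    return (num_frames // tubelet_size) * (height // patch_size) * (width // patch_size)


def token_to_region(index, num_frames=16, height=224, width=224,
                    patch_size=16, tubelet_size=2):
    """Map a token index to the (frame, row, column) ranges it covers.

    Assumes time-major ordering: time step first, then patch row, then
    patch column. Ranges are half-open: (start, stop).
    """
    total = videomae_token_count(num_frames, height, width, patch_size, tubelet_size)
    if not 0 <= index < total:
        raise IndexError("token index out of range")
    rows = height // patch_size
    cols = width // patch_size
    t, rest = divmod(index, rows * cols)
    r, c = divmod(rest, cols)
    return (
        (t * tubelet_size, (t + 1) * tubelet_size),  # frames covered
        (r * patch_size, (r + 1) * patch_size),      # pixel rows covered
        (c * patch_size, (c + 1) * patch_size),      # pixel columns covered
    )


if __name__ == "__main__":
    print(videomae_token_count())  # tokens per clip under the assumed defaults
    print(token_to_region(100))    # region covered by one example token
```

Each token covers a small space-time block of the clip, so the encoder can weight the blocks that overlap the cow's abdomen without a separate detection or tracking module.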

Data Collection

We collected data from 6 dairy cows (1 Holstein Friesian and 5 Brown Swiss) across 3 periods (period 1: November 2022; period 2: May 2023; period 3: June 2023). Each period involved 2 cows, and each cow was kept individually in her designated stall in a tie-stall barn. For each cow, two 2D cameras (DAHUA, model: DH-SD1A404XBGNR, 2.8 mm–12 mm lens) were installed, one at the side and one overhead, to capture abdominal movements related to respiration. The cameras were connected to a DAHUA network video recorder (NVR, model: DHINVR4216–16P-4KS2/L). The positions of the cameras and the infrastructure of the data collection system are illustrated in the picture below. During the data collection periods, each cow wore an Embla XactTrace Respiration Belt (product reference: Single Use Cut-To-Fit Universal Respiratory Inductance Plethysmography Belt) and a recording device (Embletta® MPR PG) to collect the RR measurements that served as the ground truth (GT) for the computer vision method.

Results and Demos

Compared with the ground truth obtained from the respiratory belt, the method achieved a mean absolute error (MAE) of 2.58 breaths/minute (bpm), a root mean squared error (RMSE) of 3.52 bpm, a root mean squared prediction error (RMSPE) of 15.03%, and a Pearson correlation coefficient of 0.86, averaged across the 6 test sets. Compared with the conventional method, which involves multiple processing modules, the end-to-end method performed better in terms of MAE, RMSE, and RMSPE. The videos below demonstrate how the model monitors a cow's respiratory rate continuously from RGB videos, especially when the cow moves little. You can find the paper about this work at this link.
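For readers who want to reproduce this kind of comparison on their own data, the sketch below computes the four metrics for paired belt and model RR values in bpm. The text above does not define RMSPE, so the code assumes it is the RMSE expressed as a percentage of the mean ground-truth RR; a mean of squared relative errors is another common definition. The function names and the example values are hypothetical and are not data from this project.

```python
import math


def _check(y_true, y_pred):
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")


def mae(y_true, y_pred):
    """Mean absolute error, in the unit of the inputs (bpm here)."""
    _check(y_true, y_pred)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, in the unit of the inputs."""
    _check(y_true, y_pred)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def rmspe_percent(y_true, y_pred):
    """ASSUMED definition: RMSE as a percentage of the mean ground truth."""
    mean_true = sum(y_true) / len(y_true)
    return 100.0 * rmse(y_true, y_pred) / mean_true


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between ground truth and prediction."""
    _check(y_true, y_pred)
    mt = sum(y_true) / len(y_true)
    mp = sum(y_pred) / len(y_pred)
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_t * var_p)


if __name__ == "__main__":
    # Hypothetical belt (ground truth) and model RR values in bpm.
    belt = [20.0, 30.0, 40.0]
    model = [22.0, 28.0, 43.0]
    print(mae(belt, model), rmse(belt, model),
          rmspe_percent(belt, model), pearson_r(belt, model))
```

To mirror the reporting above, each metric would be computed once per test set and then averaged across the test sets.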