
Deep learning accurately predicts white shark locomotor activity from depth data



The study of bioenergetics, kinematics, and behavior in free-ranging animals has been transformed by the increasing use of biologging devices that intensively sample motion with high-resolution sensors. Overall dynamic body acceleration (ODBA), derived from biologging tags, has been validated as a proxy of locomotor energy expenditure and calibrated in a range of terrestrial and aquatic taxa. The increased temporal resolution required to discern fine-scale processes and infer energy expenditure, however, comes with increased power and memory requirements, as well as the logistical challenges of recovering data from archival instruments. This limits the duration and spatial extent of studies, potentially excluding relevant ecological processes that occur over larger scales.


Here, we present a procedure that uses deep learning to estimate locomotor activity solely from vertical movement patterns. We trained artificial neural networks (ANNs) to predict ODBA from univariate depth (pressure) data from two free-swimming white sharks (Carcharodon carcharias).


Following 1 h of training data from an individual shark, ANNs enabled robust predictions of ODBA from 1 Hz pressure sensor data at multiple temporal scales. These predictions consistently outperformed a null central-tendency model and generalized more accurately across individuals than the other machine learning techniques tested. The ANN prediction accuracy of ODBA integrated over time periods ≥ 10 min was consistently high (~ 90% accuracy, > 10% improvement over null) for the same shark and equivalently generalizable across individuals (> 75% accuracy). Instantaneous ODBA estimates were more variable (R2 = 0.54 for shark 1, 0.24 for shark 2). Prediction accuracy was insensitive to the volume of training data; no observable gains were achieved in predicting 6 h of test data beyond 1–3 h of training.


Augmenting simple depth metrics with energetic and kinematic information from comparatively short-lived, high-resolution datasets greatly expands the potential inference that can be drawn from more common and widely deployed time-depth recorder (TDR) datasets. Future research efforts will focus on building a broadly generalized model that leverages archives of full motion sensor biologging data sets with the greatest number of individuals encompassing diverse habitats, behaviors, and attachment methods.


Biologging tag technologies capable of recording tri-axial motion at increasingly fine resolutions have transformed quantitative studies of biomechanics, energy expenditure, and behavior in free-ranging animals [1,2,3,4]. Ensuing datasets are highly detailed, but can be limited in scope by their expense, short deployment durations and challenging data retrieval [5, 6]. Unlike many other tagging technologies, high resolution (> 5 Hz) motion-sensitive biologgers are currently exclusively archival. These tags need to be recovered to access the memory, which can prove difficult in wide-ranging species [6, 7]. Widely used time-depth recorders (TDRs) [8,9,10], are less affected by these constraints due to lower required sampling frequencies, facilitating data transmissions to satellites [11,12,13]. However, without motion-sensitive logging, they have lacked the ability to elucidate fine-scale behavior, locomotor-kinematics and bioenergetics [14].

The rates at which wild animals expend and acquire energy ultimately determine vital rates that are influential on survival and fitness [1, 15, 16]. Measuring patterns of energy expenditure at an individual scale, thus, informs life history [17], foraging [18], biogeography [19], behavioral strategies [20, 21], and ultimately conservation [14]. Prior to the advent and expanding use of animal-borne biologging sensors [22, 23], energy use and metabolic rates were approximated in the laboratory using direct and indirect calorimetry, in the field using doubly labeled water [24], or heart rate monitoring [25]. In the last decade, motion-sensing biologging tags have emerged as an effective tool for approximating metabolic rate, with overall dynamic body acceleration (ODBA) emerging as a common proxy of locomotory energy expenditure that has been calibrated in numerous taxa [26,27,28,29,30]. Recent work has shown ODBA is particularly well suited to estimating energy expenditure in fishes [31,32,33].

Sampling dynamic body motion, for ODBA calculation, requires infra-second sampling rates and storing these data consumes disproportionate amounts of on-board power reserves [6]. Large volumes of high-resolution data are difficult to relay via satellite or acoustic telemetry due to bandwidth restrictions, and the power draw of transmissions [6]. As such, standard practice mandates device retrieval for full data acquisition, especially for many marine animals that surface infrequently and/or travel beyond land-based transmission infrastructure [34]. For species that do not reliably return to locations where they can be recaptured, marine scientists primarily use remote release mechanisms (e.g., corrodible wires, suction release, etc.) to ensure device retrieval within an accessible area [34]. While remote release methods are fruitful especially when combined with a localizing VHF or satellite beacon [7, 34,35,36], this approach leads to abbreviated tag deployments and largely limits data collection to areas close to the site of capture.

Biologging studies often require tags to condense or simplify the data collected in a process called data abstraction, which is designed to best represent the original data in the fewest number of data points. A secondary approach, often used during post-processing, is data augmentation, in which one dataset is used to impute a separate, not directly measured, variable. These techniques are a fruitful way to circumvent constraints on data richness [37]. Machine learning (ML) methodologies may be particularly useful in data augmentation. ML techniques are capable of a wide variety of linear and nonlinear approximation and offer advantages in modeling correlative relationships with complex and interactive behavior, with minimal underlying assumptions [38]. ML techniques have been applied in movement ecology [39,40,41] and accelerometry studies [42,43,44,45], primarily for behavioral state or classification tasks [46, 47]. Leveraging biologging’s data richness, ML could be applied to augment new and existing economically sampled data streams.

Locomotor activity in swimming animals has been shown to vary with the rate of change of depth, and this relationship is evident in the dive patterns of diverse taxa including pinnipeds, sharks [48], and teleosts that do not rely on gas bladders for buoyancy [49]. A number of mechanisms likely contribute to this relationship. First, for organisms with negative buoyancy, increased work will be associated with moving against gravity during ascent relative to descent at a given rate [50, 51]. For organisms with net-positive buoyancy [52], this relationship will be reversed, as work is now against the buoyant force. Second, acceleration associated with changes in vertical direction and velocity incurs locomotor cost, and this should be reflected in ODBA. Third, hydrodynamic resistance is a squared function of speed, and changes in depth reflect the vertical component of the animal’s swimming speed.

Overall, the relationship between vertical movement and locomotor cost is grounded in first principles. Nevertheless, at first glance vertical displacement alone seems an insufficient predictor of ODBA (Fig. 1), since it represents only a single dimension of overall movement while the two horizontal planes remain unknown. However, this unidimensional view can be further informed by patterns evident in the depth time series, such as repeated behavioral patterns exhibited by the tagged organism. Additionally, by including depth data preceding and/or following a moment in time, the dynamics of vertical movement can be highly informative, much as animating 2-dimensional representations (i.e., multiple images of a rotated object) lends perception of an unobserved third dimension, volume. From these secondary signals, a better picture of the unobserved dimensions, and their integrated metric, ODBA, might be inferred.

Fig. 1

Depth and locomotor activity of a free-swimming white shark. Sample traces (a) of smoothed overall dynamic body acceleration (ODBA) (red) derived from tri-axial acceleration, and vertical movement (black) data for shark 1 show how raw data is subdivided into contiguous blocks of training (shaded) and testing sets. Inset (b) shows an expanded 1-h view of the two signals

Here, we sought to use ANNs and other machine learning methods to estimate the energetics of free-swimming sharks from time-depth measurements of vertical movement alone. Our approach used archival biologging tags sampling tri-axial acceleration and depth data from white sharks (Carcharodon carcharias). We aimed for simple model designs that minimized the data consumed and required minimal model tuning. Our goals were simply to (a) test whether artificial neural networks (ANNs), in comparison to other approaches, could provide an accurate locomotor energy expenditure estimate with a reasonable ratio of training data to test data from a single individual, and (b) determine whether the resulting models and performance were robust to generalization when deployed on data from other individuals without additional training data. This proof of concept could offer a pathway for overcoming constraints that limit activity-tracking at extended scales (e.g., over a season or year, or the full migratory range of an animal’s movement), and for enriching large volumes of historical TDR data with novel insights into animal activity rates.


Biologging data collection

Accelerometry and vertical movement data were extracted from fin-mounted biologging tags deployed on two individual white sharks, referred to here as shark 1 and shark 2. After attracting sharks to a research boat using a seal decoy, tags were attached to the dorsal fins of two free-swimming white sharks (shark 1: 4 m male; shark 2: 3.4 m female) using a pole-mounted spring-loaded clamp [35, 36] with a programmable release mechanism. Data were collected from deployments in November 2015 (shark 1) and November 2016 (shark 2) at Tomales Point in central California. Tags were deployed for 27 and 29 h, respectively. For this study, depth and tri-axial accelerations were truncated to a standard 24-h continuous record. Raw acceleration and depth were sampled at 40 and 20 Hz, respectively. Static acceleration was calculated as a 5-s running mean of the raw acceleration data, and dynamic acceleration was calculated by subtracting the static acceleration from the raw acceleration. ODBA was calculated as the sum of the absolute values of the smoothed tri-axial dynamic accelerations [53]. Both depth and ODBA were down-sampled to 1 Hz for model input.
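As a minimal sketch of the processing steps above, assuming NumPy/pandas conventions (the function names and rolling-mean implementation are ours, not the study's code):

```python
import numpy as np
import pandas as pd

def compute_odba(accel, fs=40, window_s=5):
    """ODBA from raw tri-axial acceleration sampled at fs Hz.

    Static acceleration is a 5-s running mean of each axis; dynamic
    acceleration is raw minus static; ODBA is the sum of the absolute
    dynamic components [53].
    """
    acc = pd.DataFrame(accel, columns=["x", "y", "z"])
    static = acc.rolling(fs * window_s, center=True, min_periods=1).mean()
    dynamic = acc - static
    return dynamic.abs().sum(axis=1).to_numpy()

def downsample(x, fs_in, fs_out=1):
    """Block-average a 1-D signal from fs_in Hz down to fs_out Hz."""
    step = fs_in // fs_out
    n = (len(x) // step) * step
    return x[:n].reshape(-1, step).mean(axis=1)
```

With 40 Hz input, `downsample(odba, fs_in=40)` yields the 1 Hz series used as model output.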

Feed-forward artificial neural networks (ANNs)

Feed-forward artificial neural networks consist of interconnected computational units referred to as neurons. Simply represented, input data are passed through an input layer and subsequently propagated through a defined number of hidden layers, whereby the sum of the products of the connection weights from each layer approximates a function that estimates the observed output values [54]. Under repeated iteration and adjustment of the connection weights, the function between input (depth) and output (ODBA) is estimated as closely as possible given the parameter space available in the network [55,56,57]. This ability to approximate a wide variety of continuous functions when given appropriate parameter space is known as the universal approximation theorem [38]. Detailed development of model architecture lies in the selection of node functions (i.e., activation functions), layer sizes (number of hidden layers and number of nodes in each layer), learning rate, regularization parameters, and parameter dropout.

The workflow for tuning ANNs consisted of two stages: (1) training and validation, and (2) testing. As described above, the neural network used the input parameters as the first layer of neurons, and the last layer of neurons represented the predicted output values. During the training and validation phase, the cost (or loss) function, in this case the mean squared error (MSE), was used to evaluate the performance of the ANNs by comparing the instantaneous ODBA data to the output values predicted by the ANNs. Stochastic gradient descent, a common optimization method for ANNs, was then used to iteratively adjust the weights and biases for each neuron so that the ANNs best approximated the training data output. At each iteration, a backpropagation algorithm estimated the partial derivatives of the cost function with respect to incremental changes of all the weights and biases, to determine the gradient descent directions for the next iteration. Note that in our model, the neurons of each hidden layer were composed of rectified linear units (i.e., a ReLU activation function), to avoid vanishing gradients and to improve training speed [58]. Validation data were not used in the optimization or backpropagation algorithms. Instead, the cost function evaluated over the validation data served as an independent tuning metric of ANN performance; if the cost function of the validation data increased with each iteration, it would suggest that the neural net was overfitting the training data.

We used the Python Keras library, which provides a high-level application programming interface to Google’s TensorFlow deep learning library. For the examples chosen here, we used the adaptive moment estimation (Adam) optimization method, a stochastic gradient descent algorithm that computes adaptive learning rates [59].
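A sketch of this setup in Keras, assuming the standard architecture selected in this study (30 depth inputs, three hidden layers of 40 ReLU units, a linear output for ODBA, MSE loss, Adam optimizer); the dropout rate and early-stopping patience are illustrative values, not reported ones:

```python
from tensorflow import keras

def build_ann(window=30, hidden_layers=3, units=40, dropout=0.2):
    """Feed-forward ANN mapping a depth window to instantaneous ODBA."""
    model = keras.Sequential([keras.Input(shape=(window,))])
    for _ in range(hidden_layers):
        model.add(keras.layers.Dense(units, activation="relu"))  # ReLU [58]
        model.add(keras.layers.Dropout(dropout))  # dropout regularization [62]
    model.add(keras.layers.Dense(1))  # linear output: predicted ODBA
    model.compile(optimizer=keras.optimizers.Adam(), loss="mse")  # Adam [59]
    return model

# Early stopping on the validation loss guards against overfitting [60, 61].
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=10, restore_best_weights=True)
```

Passing `callbacks=[early_stop]` and `validation_data=(X_val, y_val)` to `model.fit` halts training once the validation cost stops improving, matching the validation role described above.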

ANN model development

ANNs were tuned across a range of training data volume, while tested on a standardized 6-h set of 1 Hz depth data (n = 21,600 data points) withheld from tuning and training procedures (Fig. 1). Initially, ANNs were trained exhaustively with all 18 h of data remaining following the train-test split (i.e., all data independent of the standard 6-h test set) while optimal ANN architectures were tuned and evaluated. Following an initial evaluation, training datasets consisted of 1-h increments of 1 Hz measurements of depth and ODBA, with 20% withheld from training for a hold-out cross-validation/development set (Fig. 1).

Tuning ANN input features and structures involved varying and evaluating a range of model architectures. Input features were passed to the neural net within moving windows consisting of depth data from t = 1:n (n = 5:60 input data points) to predict ODBA output at t = 1 (Fig. 2). Similarly, we tested a range of “shallow” to “deep” structures, varying the number of interconnected neurons per hidden layer and the number of hidden layers (Additional file 1: Table S1). Following initial exploration of model architecture, architectures with good performance were fine-tuned individually to improve results on each deployment’s test set. We used k-fold cross-validation (k = 10) to ensure consistent predictive performance in the test set and prevent overfitting. Tuning proceeded to identify minimally trained model structures that produced acceptable R2 values in the test set; this was the basis for selecting a moving window of 30 depth inputs and three hidden layers of 40 nodes as the standard architecture for this proof-of-concept study. We then investigated the sensitivity of model results to the volume of training data, tailoring development towards leaner approaches (minimal training) that maintained comparable predictive performance on the standard test set. Common techniques were used to minimize overfitting, such as early stopping [60, 61] and dropout [62].
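The moving-window inputs and the hold-out split can be sketched as follows (hypothetical helpers, assuming 1 Hz arrays of depth and ODBA):

```python
import numpy as np

def make_windows(depth, odba, window=30):
    """Each row of X is depth at t .. t+window-1; the target y is the
    ODBA value at the window's first time step, as described above."""
    X = np.lib.stride_tricks.sliding_window_view(depth, window)
    y = odba[: len(X)]
    return X, y

def train_val_split(X, y, val_frac=0.2, seed=0):
    """Randomly hold out a fraction of windows for validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_val = int(len(X) * val_frac)
    return X[idx[n_val:]], y[idx[n_val:]], X[idx[:n_val]], y[idx[:n_val]]
```

One hour of 1 Hz data (3600 points) yields 3571 overlapping 30-s windows, of which 20% would be withheld for the validation/development set.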

Fig. 2

Structure of the feed-forward artificial neural network (ANN) model. Best performing parameters and hyperparameters to obtain the best prediction outputs were as follows: (1) input features = 30 (t = 1 − t = 30), (2) hidden layers = 3, (3) neurons = 40 in each layer, and (4) connection and bias weightings

ANN benchmarking

Additionally, we benchmarked the ANN formulations against other common modeling approaches, including tree-based algorithms such as random forests [63] and gradient boosting [64], as well as support vector machines [65] and linear regression. Here we applied the same workflow to predict ODBA and compared the performance with the ANN approach. Brief descriptions of each method and its implementation are provided below, along with similar applications in the ecological literature [66,67,68].

Random forest analysis is a supervised ensemble method that generates unpruned decision trees to predict a response. To address issues of overfitting, random forests implement bootstrap sampling of the data and randomized subsets of predictors [63]. Final predictions are ensembled across the forest of trees (n = 100) by averaging the prediction of each tree. No maximums were set for tree depth, number of leaf nodes, or number of features, in order to weight prediction over interpretability, as with the ANNs.

Gradient boosting is another tree-based method that uses a forward stage-wise additive model [64] to iteratively develop predictions from previous “shallower” trees. At each boosting stage (n = 100, learning rate = 0.1), subsequent trees are fit to the negative gradients of the loss function to improve prediction and optimize parameters [69]. Again, no maximum was set for tree depth, number of estimators, or number of features, to encourage maximal prediction.

Support vector machines (SVMs) are supervised discriminative classifiers defined by a separating hyperplane [65]. Given labeled training data, the algorithm categorizes new examples according to optimal hyperplanes that maximize the distance separating the nearest training data of any class. The method has also been used in regression problems (“support vector regression” [70]) and, as with the other methods, was allowed to operate as freely as possible to maximize prediction (degree of polynomial kernel = 5).

Linear regression (LR) is a common method that estimates a predictive relationship between variables by fitting a linear equation. Ordinary least squares was used to estimate the parameters defining a linear relationship between the explanatory and response variables.
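One way to reproduce this benchmark suite is with scikit-learn regressors mirroring the settings reported above (100 trees/stages, learning rate 0.1, degree-5 polynomial kernel, OLS); this is a sketch under those assumptions, not the study's exact implementation, and arguments left unset fall back to scikit-learn defaults:

```python
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.svm import SVR
from sklearn.linear_model import LinearRegression

benchmarks = {
    "random_forest": RandomForestRegressor(n_estimators=100),  # no depth/leaf caps [63]
    "gradient_boosting": GradientBoostingRegressor(
        n_estimators=100, learning_rate=0.1),                  # [64, 69]
    "svr": SVR(kernel="poly", degree=5),                       # support vector regression [65, 70]
    "linear_regression": LinearRegression(),                   # OLS
}

def fit_and_score(models, X_train, y_train, X_test, y_test):
    """Fit each benchmark and return its R^2 on the held-out test set."""
    return {name: m.fit(X_train, y_train).score(X_test, y_test)
            for name, m in models.items()}
```

Applying `fit_and_score` to the same windowed depth features used by the ANN yields the R2 comparison reported in the results.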

Evaluation metrics

Model performance in the context of real-world use cases depends on the selection of an appropriate evaluation metric. A range of options exist, and the selection should be consistent with the estimation needs. Two evaluation metrics were used to understand model performance in the test set: a point estimate and a cumulative, or “time-integrated,” measure. The coefficient of determination (R2) was used as a straightforward metric of the pointwise fit between predicted and observed ODBA at each 1 Hz time step. While point-estimate performance is valuable for assessing model reliability in predicting instantaneous kinematics and short bursts of activity, we also sought to evaluate the models on broader time scales more relevant to understanding energy expenditure over ecological temporal and spatial scales. We therefore developed a metric of the performance of time-integrated accumulation of predicted and observed ODBA. For intervals of increasing width (5–10,000 s at 5-s increments), we calculated the area under the curve (AUC) by summing the 1 Hz measurements of predicted and observed ODBA. Resampling was used to evaluate overall performance throughout the test set, with 2000 randomly placed replicates of each interval width. For each replicate, we calculated the AUC of predicted and observed ODBA and then computed the percentage error. The model accuracy of time-integrated ODBA at a given interval is then defined as:

$${\text{Accuracy}} = 1 - \frac{{ \left| { {\text{observed AUC }} - {\text{predicted AUC}}} \right|}}{\text{observed AUC}}$$

We also used this metric to test the generalizability of ANNs trained on one individual to predict ODBA from the depth only data of a second individual. Finally, we compared these results to a null model comprising the median computed ODBA value over the same integrated time-scale.
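The resampling procedure and the accuracy definition above can be sketched as follows (a hypothetical helper, assuming 1 Hz arrays of observed and predicted ODBA):

```python
import numpy as np

def integrated_accuracy(obs, pred, width, n_reps=2000, seed=0):
    """Mean time-integrated accuracy over randomly placed intervals of
    `width` seconds: 1 - |AUC_obs - AUC_pred| / AUC_obs, where AUC is
    the sum of 1 Hz ODBA across the interval."""
    rng = np.random.default_rng(seed)
    starts = rng.integers(0, len(obs) - width + 1, size=n_reps)
    acc = np.empty(n_reps)
    for i, s in enumerate(starts):
        auc_obs = obs[s:s + width].sum()
        auc_pred = pred[s:s + width].sum()
        acc[i] = 1.0 - abs(auc_obs - auc_pred) / auc_obs
    return acc.mean()
```

Sweeping `width` from 5 to 10,000 s at 5-s increments traces out the accuracy-versus-integration-interval curves shown in the results.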


Pointwise estimates of ODBA provided an initial metric of model accuracy at an instantaneous timescale. Following 1 h of training, the standard ANN model resulted in an R2 of 0.54 for shark 1 and 0.25 for shark 2 (Fig. 3). For the time-integrated ODBA predictions, accuracy increased as a function of increasing observation window. Test accuracy approached 90%, and the range of errors narrowed in both datasets, with ODBA binned between 15 and 30 min (Fig. 4). At finer resolutions (e.g., intervals between < 1 and 15 min), model accuracy exceeded 70% and 85%, respectively, in both datasets. Asymptotic performance was evident in both datasets after binning ODBA over 90 min, but occurred earlier in shark 1 (Fig. 4 and Additional file 1: Figure S1). Model accuracy was relatively insensitive to training duration over 1 h (Fig. 4; see Additional file 1: Figure S1 for the full suite of model accuracy plots from ANNs trained with 1–17 h of data). In a test of generalizability, the model trained on each shark and used to predict ODBA in the other produced similar time-integrated results, with model accuracy ranging from 80% to 85% between < 1 and 30 min bins, respectively (Fig. 5). Moreover, the 10-fold cross-validation tests showed consistent predictive performance with no apparent overfitting in our model (Additional file 1: Tables S1 and S2).

Fig. 3

a Predicted locomotor activity of a white shark following deep learning. The observed (blue) overall dynamic body acceleration (ODBA) measured from tri-axial acceleration data is overlaid by the predicted (red) values over 6 h obtained from the artificial neural network (ANN) model trained with 1 h of data. b The distribution of observed (blue) and predicted (red) ODBA values over the 6-h test set

Fig. 4

Model prediction accuracy over increasing integrated time periods. Resampled estimates of the time-integrated accuracy metric for locomotor activity predictions from artificial neural network (ANN) model for shark 1 (red) and shark 2 (blue) following (a, b) 1 h, (c, d) 3 h, and (e, f) 12 h of training data. A dashed line (a, b) traces the performance of a null model—the median calculated value of ODBA across increasing integrated time periods. Overall performance was insensitive to increased training above 1 h. Accuracy increased with time over which ODBA was integrated (af, x axes) in all cases. Generally, only marginal gains were achieved above time scales of 5 to 10 min

Fig. 5

Generalized versus native model performance. Comparable model predictions resulted when artificial neural networks (ANNs) were trained on vertical movements from one shark and applied to estimate the locomotor activity of both the same individual and a second shark, for which there was no training. Observed (black line) overall dynamic body acceleration (ODBA) for a shark 1 and b shark 2 were estimated following training on the same (blue; native) and the other (red; generalized) individual. Residual plots (shark 1 and 2; c and d, respectively) from the observed/predicted comparisons for the same (blue) and the other (red) individual demonstrate no bias when generalizing the model to predict locomotor behavior across individuals. Both native and generalized models outperformed a null model—the median calculated value of ODBA across increasing integrated time periods

At all timescales, the ANN model outperformed the null median-ODBA model for both native and generalized model predictions. In comparison to other common ML algorithms, the final ANN models performed similarly in native applications (Additional file 1: Figure S2) and markedly better in the generalized case (Additional file 1: Figure S3). Instantaneous performance of the ANN in the test set (R2 = 0.54) was similar to random forest (R2 = 0.57) and gradient boosting techniques (R2 = 0.55; Additional file 1: Table S3). Each of these methods also demonstrated greater than 85% accuracy across temporal scales (Additional file 1: Figure S2). Unlike the ANN’s robust performance in the generalized case (Additional file 1: Figure S3, R2 = 0.22), these methods failed when deployed on data wholly independent of their training (Additional file 1: Figure S3; \(R^{2}_{\text{RF}}\) = 0.001, \(R^{2}_{\text{XGB}}\) = 0.001, \(R^{2}_{\text{SVM}}\) = 0.004, \(R^{2}_{\text{LR}}\) = 0.002), confirming other findings that tree-based methods are less generalizable [71]. Linear regression failed to produce acceptable results in both native and generalized cases (Additional file 1: Table S3, Additional file 1: Figure S3).


Our results demonstrate the ability of supervised machine learning techniques to extract reliable predictions of ODBA from vertical movement data (Figs. 3, 4). ODBA is a valuable energetic proxy derived from accelerometry data, which are generally more challenging to collect than vertical movement data. Our approach was particularly suited to predicting ODBA aggregated over timescales of minutes (Fig. 4). At these integrated time scales, accuracy neared 90% after 1 h of training data from a single individual and modest model tuning.

The predictive power of deep learning techniques generally improves with the increasing diversity of data [72], leaving further room for increasing predictive accuracy and generalizing more broadly across individuals, and perhaps taxa, as training sets accumulate. Gains in predictive power through more systematic model architecture selection, tuning, and model ensembles could also improve performance. Here we consider the implications of this ANN-enabled data augmentation procedure for the broader amplification of biologging data from sharks and other taxa swimming or flying in 3-dimensional environments.

Vertical movement and the temporal scale of ODBA

We demonstrate that deep learning in the ANN can be adequately trained to predict locomotor activity in sharks from vertical displacement and may be generalizable to other swimming or flying organisms. Animals moving in fluid environments (i.e., swimming, flying) share a common set of energetic tradeoffs [73] and exhibit convergent properties in gait and locomotion related to optimal energetic efficiency [48, 49]. The strength of the deep learning approach in predicting ODBA relies on the physics of flying/swimming [74], directional acceleration, and pattern recognition [75]; the approach may therefore be transferable cross-taxonomically with further development. For swimming, the constant buoyancy of sharks is responsible for the strong link between locomotor activity and the kinematics of vertical movement [52, 73, 74]. This link should also apply to other swimming organisms with limited or no buoyancy compensation (i.e., no gas bladder), for example, ram-ventilating teleosts [49]. Applicability to organisms with compressible volume (e.g., breath-holding organisms) should also be feasible [48], since in this case volume (and therefore buoyancy) will be a predictable function of pressure (vertical position). However, since these animals can alter gas volume between breath-holds, it may be necessary to include training data from a broader set of conditions.

The increasing accuracy of our prediction as a function of ODBA time-aggregation (Fig. 4) suggests that this approach is optimally suited for questions and processes on these corresponding time-scales. For example, the locomotor activity of prey acquisition and handling in white sharks can be visualized when ODBA is integrated over minutes [76], and sustained unidirectional migration is reflected in distinctive dive patterns on the scale of weeks to months [77, 78]. On the other hand, studies that require resolution of ODBA on scales of seconds may be less suited for this approach. For example, resolving peak bursts of locomotor activity or individual tailbeat strokes [76] requires sub-second resolution. Indeed, in our results, the areas of mismatch in prediction were largely attributed to short bursts and peaks in ODBA (Fig. 3). Such fine-scale dynamics, however, often can be addressed with short-term studies, where there are few limitations for using the full suite of available biologging tools.

Our preliminary assessment of generalizability suggests this approach is capable of ODBA predictions for individuals wholly independent of the ANN’s training set. Our simple exploration deployed the ANN trained on one individual’s training set on the other individual’s test set (Fig. 5). Distributions of the residuals were unbiased relative to the native instance, and time-integrated performance was comparable (Fig. 5 and Additional file 1: Figure S1). As a proof of concept, this initial evaluation demonstrates feasibility and, importantly, distinguishes the ANN approach from the ML alternatives (Additional file 1: Figure S2, Additional file 1: Figure S3). Though comparatively less interpretable, the ANN’s unmatched performance in predicting on data wholly independent of the training source (Additional file 1: Figure S3) demonstrates its broader utility as an augmentation tool. Ultimately, the applicability of these methods will be limited by the comprehensiveness (diversity) of their training datasets, and further development should focus on expanding the individuals, behaviors, and habitats accounted for during training.

Data augmentation through artificial intelligence

The advent of diminutive motion-sensing loggers has revolutionized activity tracking in wild animals and greatly advanced ecological understanding in natural settings. However, given the current state of technology, power, memory, and device placement and size constraints still limit the temporal and spatial scale, as well as the size of subjects, in current studies. As a result, the advances these sensors promise have yet to be broadly realized at landscape-level scales (e.g., across the full migratory range of a subject, or for a full year). Data augmentation procedures can operate powerfully in tandem with animal-borne instrumentation to bridge these constraints, extending their use in future studies and potentially leveraging novel information from large volumes of historical TDR data.

Our results suggest that ANN models could enable efficient duty cycling of motion-sensing loggers’ sensors with reduced informational loss regarding bioenergetic proxies. We show that between duty cycles, ODBA can be reasonably predicted with an inexpensive (in power and data) pressure transducer continually logging at ≥ 1-s intervals. Full motion-sensor duty cycles could then be minimized to provide adequate amounts of training data. For estimating ODBA, we found ANNs to be relatively insensitive to the volume of training data above 1 h, and models were robust even when predicting six times as much data as they were trained with (Fig. 4 and Additional file 1: Figure S1). Such lean augmentation procedures provide promising duty-cycling approaches that make efficient use of tag resources with minimal overt information loss. We anticipate that the cumulative addition of more, and more diverse, training data sets over time will vastly increase this ratio while improving prediction accuracy.

Augmentation procedures that leverage deep learning could also be generalized to apply to independent datasets lacking the associated motion-sensing data needed to measure ODBA, such as historical TDR data. Our initial generalization found comparable predictive performance for an ANN trained on a different shark of similar size (Fig. 5). Where computation is not a constraint, training sets can be enlarged to encompass the widest breadth of individuals, behaviors, and habitats available [45], contributing to an ever-growing library and the development of a powerful ensemble model. Leveraging this information in a deep learning context holds great potential for augmenting decades’ worth of existing TDR datasets once cross-generalization has been thoroughly validated. An entire biologging database with deep ANN structures [79, 80] implementing transfer learning [81] thus holds great promise as a powerful approach for augmenting biologging data relevant to larger ecological and spatiotemporal scales. This broadly generalizable approach would be much in the spirit of well-known image recognition models trained on a web database of over 14 million labeled images, or word vector models trained on large volumes of text scraped from vast breadths of the internet [82, 83].
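As a purely illustrative sketch of how transfer learning might be applied here, assuming a Keras network pretrained on a pooled biologging library (`base_model` and all other names are hypothetical, not part of this study):

```python
from tensorflow import keras

def fine_tune(base_model, X_new, y_new, epochs=10):
    """Freeze all but the output layer of a pretrained ODBA network and
    fine-tune it on a new individual's short training set."""
    for layer in base_model.layers[:-1]:
        layer.trainable = False  # keep the shared learned representations
    base_model.compile(optimizer=keras.optimizers.Adam(1e-4), loss="mse")
    base_model.fit(X_new, y_new, epochs=epochs, verbose=0)
    return base_model
```

Freezing the early layers preserves generic depth-to-activity features learned from the full library, while the small learning rate adapts only the output mapping to the new individual.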

Future directions

We leveraged machine learning to augment sparse vertical movement data with ecologically valuable proxies measured by costly and sophisticated biologging technologies. By using these post-processing techniques to bridge complementary vertical movement and ODBA data, biologging studies can exploit the strengths of various tagging technologies to generate greater understanding of activity rates and underlying bioenergetics at broader scales. Energy landscapes, for instance, which are mechanistic frameworks connecting animal movement, behavior, and energetic costs [84], have deepened our understanding of cost-effective movement, resource acquisition, and behavioral decisions (e.g., selection of tail-winds in soaring birds [85]), but they require extending our ability to estimate locomotor activity over increased spatio-temporal scales.

Following this proof-of-concept study, future work seeking the greatest leverage in prediction accuracy, augmentation ratio, and generalizability (including to historical data) should focus on assembling the maximum number of full motion-sensor biologging data sets from the greatest number of individuals, encompassing diverse habitats, behaviors, and attachment methods. This, coupled with a systematic approach to optimal model tuning, will maximize utility. A more in-depth validation of this technique should help determine optimal duty-cycle ratios for augmentation to guide future tag programming and experimental design. Determining the relationship between TDR sampling rate and ODBA predictive accuracy will also help establish the minimum data resolution needed to estimate locomotor activity.
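One way the trade-off between integration window and predictive accuracy could be examined is sketched below. The accuracy metric (one minus the relative error of window-summed ODBA) is our assumption for illustration, and the 1 Hz series are synthetic; with real data, `obs` and `pred` would be the measured and ANN-predicted ODBA traces.

```python
import numpy as np

rng = np.random.default_rng(2)
obs = 0.3 + np.abs(rng.normal(0, 0.1, 6 * 3600))   # 6 h of "observed" 1 Hz ODBA
pred = obs + rng.normal(0, 0.05, obs.size)         # noisy instantaneous predictions

def integrated_accuracy(obs, pred, window_s):
    """Mean accuracy of ODBA summed over non-overlapping windows of window_s seconds."""
    n = (obs.size // window_s) * window_s            # drop any trailing partial window
    o = obs[:n].reshape(-1, window_s).sum(axis=1)
    p = pred[:n].reshape(-1, window_s).sum(axis=1)
    return np.mean(1 - np.abs(p - o) / o)

for w in (60, 600, 3600):   # 1 min, 10 min, 1 h integration windows
    print(f"{w:>5d} s windows: accuracy = {integrated_accuracy(obs, pred, w):.3f}")
```

Because instantaneous errors partially cancel when summed, accuracy rises with window length, mirroring the pattern of noisier instantaneous estimates but ~90% accuracy at ≥ 10 min reported here.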

Alternate deep learning techniques and structures could improve on the relatively simple formulation we implemented in this study. Future work could investigate techniques specialized for time series data, such as recurrent neural networks (RNNs) and long short-term memory (LSTM) networks. RNNs have proved highly effective at modeling the dynamic temporal behavior of sequences, and LSTMs maintain a memory of values over arbitrary time intervals [86] and can be implemented as a layer within an RNN. Such approaches have found success in speech recognition, handwriting recognition, and polyphonic music modeling, as well as in financial forecasting [87,88,89]. Other convolutional and recurrent network structures are finding increased traction in ecological and behavioral studies [45, 90]. Despite the inherent time-series nature of our data, we consider our simple network structure an ideal first step in applying these techniques to biologging data augmentation schemes.
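To make the LSTM "memory over arbitrary intervals" idea [86] concrete, a single LSTM cell step can be written in a few lines of NumPy. This is a forward-pass sketch only: the weights are random placeholders, not a trained model, and the 60 s input trace is synthetic.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM update: gates computed from input x and previous hidden state h."""
    z = W @ x + U @ h + b                           # stacked pre-activations (4 * hidden)
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)    # input, forget, output gates
    c_new = f * c + i * np.tanh(g)                  # cell state carries long-term memory
    h_new = o * np.tanh(c_new)                      # hidden state emitted each step
    return h_new, c_new

rng = np.random.default_rng(3)
hidden, n_in = 8, 1
W = rng.normal(0, 0.1, (4 * hidden, n_in))          # input weights (random placeholder)
U = rng.normal(0, 0.1, (4 * hidden, hidden))        # recurrent weights
b = np.zeros(4 * hidden)

h = c = np.zeros(hidden)
depth_trace = rng.normal(0, 1, 60)                  # 1 min of synthetic 1 Hz depth
for d in depth_trace:                               # unroll the cell over the sequence
    h, c = lstm_step(np.array([d]), h, c, W, U, b)
print(h.shape)                                      # hidden state summarizing the sequence
```

The forget gate `f` is what lets the cell retain (or discard) information across arbitrarily long lags, in contrast to the fixed sliding windows a feedforward ANN sees.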


Conclusions

Here, we have presented a deep learning approach for predicting ODBA from vertical movement data alone and applied the resulting neural networks to approximate the energetic expenditure of tagged white sharks. For each individual, the resulting networks proved highly capable of recognizing and learning patterns in vertical movement data that were predictive of ODBA measurements calculated from tri-axial accelerometry. Testing these trained networks against withheld data demonstrated their ability to estimate energy expenditure, particularly over broader temporal intervals, and performance was robust to generalization across individuals. Along with other pioneering ecological studies capitalizing on artificially intelligent data processing [45, 90, 91], these approaches can take full advantage of the power of machine learning to extend ecological inference from animal-borne instrumentation to new scales.

Availability of data and materials

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.


References

1. Brown DD, Kays R, Wikelski M, Wilson R, Klimley AP. Observing the unwatchable through acceleration logging of animal behavior. Anim Biotelem. 2013;1(1):20.
2. Hussey NE, et al. Aquatic animal telemetry: a panoramic window into the underwater world. Science. 2015;348(6240):1255642.
3. Wilmers CC, Nickel B, Bryce CM, Smith JA, Wheat RE, Yovovich V. The golden age of bio-logging: how animal-borne sensors are advancing the frontiers of ecology. Ecology. 2015;96(7):1741–53.
4. Cooke SJ, et al. Remote bioenergetics measurements in wild fish: opportunities and challenges. Comp Biochem Physiol A: Mol Integr Physiol. 2016;202:23–37.
5. Rutz C, Hays GC. New frontiers in biologging science. London: The Royal Society; 2009.
6. Whitney NM, Papastamatiou YP, Gleiss AC. Integrative multi-sensor tagging: emerging techniques to link elasmobranch behavior, physiology and ecology. Biol Sharks Relat. 2012;1:265–90.
7. Whitmore BM, White CF, Gleiss AC, Whitney NM. A float-release package for recovering data-loggers from wild sharks. J Exp Mar Biol Ecol. 2016;475:49–53.
8. Thums M, Bradshaw CJ, Hindell MA. A validated approach for supervised dive classification in diving vertebrates. J Exp Mar Biol Ecol. 2008;363(1–2):75–83.
9. Musyl MK, et al. Postrelease survival, vertical and horizontal movements, and thermal habitats of five species of pelagic sharks in the central Pacific Ocean. Fish Bull. 2011;109(4):341–68.
10. Jepsen N, Thorstad EB, Havn T, Lucas MC. The use of external electronic tags on fish: an evaluation of tag retention and tagging effects. Anim Biotelem. 2015;3(1):49.
11. Arnold G, Dewar H. Electronic tags in marine fisheries research: a 30-year perspective. In: Electronic tagging and tracking in marine fisheries. New York: Springer; 2001. p. 7–64.
12. Boustany AM, Marcinek DJ, Keen J, Dewar H, Block BA. Movements and temperature preferences of Atlantic bluefin tuna (Thunnus thynnus) off North Carolina: a comparison of acoustic, archival and pop-up satellite tags. In: Electronic tagging and tracking in marine fisheries. New York: Springer; 2001. p. 89–108.
13. Block BA, et al. Toward a national animal telemetry network for aquatic observations in the United States. Anim Biotelem. 2016;4(1):6.
14. Wilson AD, Wikelski M, Wilson RP, Cooke SJ. Utility of biological sensor tags in animal conservation. Conserv Biol. 2015;29(4):1065–75.
15. Brown JH, Gillooly JF, Allen AP, Savage VM, West GB. Toward a metabolic theory of ecology. Ecology. 2004;85(7):1771–89.
16. Wilson RP, Shepard E, Liebsch N. Prying into the intimate details of animal lives: use of a daily diary on animals. Endanger Species Res. 2008;4(1–2):123–37.
17. Zera AJ, Harshman LG. The physiology of life history trade-offs in animals. Annu Rev Ecol Syst. 2001;32(1):95–126.
18. Lowe CG. Bioenergetics of free-ranging juvenile scalloped hammerhead sharks (Sphyrna lewini) in Kāne'ohe Bay, Ō'ahu, HI. J Exp Mar Biol Ecol. 2002;278(2):141–56.
19. McNab BK. Minimizing energy expenditure facilitates vertebrate persistence on oceanic islands. Ecol Lett. 2002;5(5):693–704.
20. Hinch SG, Rand PS. Swim speeds and energy use of upriver-migrating sockeye salmon (Oncorhynchus nerka): role of local environment and fish characteristics. Can J Fish Aquat Sci. 1998;55(8):1821–31.
21. Costa DP. Reproductive and foraging energetics of pinnipeds: implications for life history patterns. In: The behaviour of pinnipeds. New York: Springer; 1991. p. 300–44.
22. Costa DP. Methods for studying the energetics of freely diving animals. Can J Zool. 1988;66(1):45–52.
23. Kooyman GL. Genesis and evolution of bio-logging devices: 1963–2002. 2004.
24. Speakman JR, Racey PA. The doubly-labelled water technique for measurement of energy expenditure in free-living animals. Sci Progr. 1988;1:227–37.
25. Butler P, Woakes A. Heart rate and aerobic metabolism in Humboldt penguins, Spheniscus humboldti, during voluntary dives. J Exp Biol. 1984;108(1):419–28.
26. Wilson RP, et al. Moving towards acceleration for estimates of activity-specific metabolic rate in free-living animals: the case of the cormorant. J Anim Ecol. 2006;75(5):1081–90.
27. Wilson RP, et al. A spherical-plot solution to linking acceleration metrics with animal performance, state, behaviour and lifestyle. Mov Ecol. 2016;4(1):22.
28. Gleiss AC, Wilson RP, Shepard EL. Making overall dynamic body acceleration work: on the theory of acceleration as a proxy for energy expenditure. Methods Ecol Evol. 2011;2(1):23–33.
29. Fahlman A, Wilson R, Svärd C, Rosen DA, Trites AW. Activity and diving metabolism correlate in Steller sea lion Eumetopias jubatus. Aquat Biol. 2008;2(1):75–84.
30. Halsey L, Shepard E, Quintana F, Laich AG, Green J, Wilson R. The relationship between oxygen consumption and body acceleration in a range of species. Comp Biochem Physiol A: Mol Integr Physiol. 2009;152(2):197–202.
31. Metcalfe J, Wright S, Tudorache C, Wilson R. Recent advances in telemetry for estimating the energy metabolism of wild fishes. J Fish Biol. 2016;88(1):284–97.
32. Lear KO, Whitney NM, Brewster LR, Morris JJ, Hueter RE, Gleiss AC. Correlations of metabolic rate and body acceleration in three species of coastal sharks under contrasting temperature regimes. J Exp Biol. 2017;220(3):397–407.
33. Wright S, Metcalfe S, Hetherington S, Wilson R. Estimating activity-specific energy expenditure in a teleost fish, using accelerometer loggers. Mar Ecol Progr Ser. 2014;220(3):397–407.
34. Lear KO, Whitney NM. Bringing data to the surface: recovering data loggers for large sample sizes from marine vertebrates. Anim Biotelem. 2016;4(1):12.
35. Gleiss AC, Norman B, Liebsch N, Francis C, Wilson RP. A new prospect for tagging large free-swimming sharks with motion-sensitive data-loggers. Fish Res. 2009;97(1–2):11–6.
36. Chapple TK, Gleiss AC, Jewell OJ, Wikelski M, Block BA. Tracking sharks without teeth: a non-invasive rigid tag attachment for large predatory sharks. Anim Biotelem. 2015;3(1):14.
37. Fedak M, Lovell P, McConnell B, Hunter C. Overcoming the constraints of long range radio telemetry from animals: getting more useful data from smaller packages. Integr Comp Biol. 2002;42(1):3–10.
38. Hornik K, Stinchcombe M, White H. Multilayer feedforward networks are universal approximators. Neural Netw. 1989;2(5):359–66.
39. Breed GA, Costa DP, Jonsen ID, Robinson PW, Mills-Flemming J. State-space methods for more completely capturing behavioral dynamics from animal tracks. Ecol Model. 2012;235:49–58.
40. Langrock R, King R, Matthiopoulos J, Thomas L, Fortin D, Morales JM. Flexible and practical modeling of animal telemetry data: hidden Markov models and extensions. Ecology. 2012;93(11):2336–42.
41. Michelot T, Langrock R, Patterson TA. moveHMM: an R package for the statistical modelling of animal movement data using hidden Markov models. Methods Ecol Evol. 2016;7(11):1308–15.
42. Nathan R, Spiegel O, Fortmann-Roe S, Harel R, Wikelski M, Getz WM. Using tri-axial acceleration data to identify behavioral modes of free-ranging animals: general concepts and tools illustrated for griffon vultures. J Exp Biol. 2012;215(6):986–96.
43. Carroll G, Slip D, Jonsen I, Harcourt R. Supervised accelerometry analysis can identify prey capture by penguins at sea. J Exp Biol. 2014;217:113076.
44. Ladds MA, Thompson AP, Kadar JP, Slip DJ. Super machine learning: improving accuracy and reducing variance of behaviour classification from accelerometry. Anim Biotelem. 2017;5(1):8.
45. Brewster LR, et al. Development and application of a machine learning algorithm for classification of elasmobranch behaviour from accelerometry data. Mar Biol. 2018;165(4):62.
46. Wilson RP, et al. Give the machine a hand: a Boolean time-based decision-tree template for rapidly finding animal behaviours in multisensor data. 2018.
47. Wang G. Machine learning for inferring animal behavior from location and movement data. Ecol Inform. 2019;49:69–76.
48. Gleiss AC, et al. Convergent evolution in locomotory patterns of flying and swimming animals. Nat Commun. 2011;2:352.
49. Noda T, Fujioka K, Fukuda H, Mitamura H, Ichikawa K, Arai N. The influence of body size on the intermittent locomotion of a pelagic schooling fish. Proc R Soc B: Biol Sci. 2016;283(1832):20153019.
50. Gleiss AC, Norman B, Wilson RP. Moved by that sinking feeling: variable diving geometry underlies movement strategies in whale sharks. Funct Ecol. 2011;25(3):595–607.
51. Iosilevskii G, Papastamatiou YP. Relations between morphology, buoyancy and energetics of requiem sharks. R Soc Open Sci. 2016;3(10):160406.
52. Nakamura I, Meyer CG, Sato K. Unexpected positive buoyancy in deep sea sharks, Hexanchus griseus, and Echinorhinus cookei. PLoS ONE. 2015;10(6):e0127667.
53. Qasem L, et al. Tri-axial dynamic acceleration as a proxy for animal energy expenditure; should we be summing values or calculating the vector? PLoS ONE. 2012;7(2):e31187.
54. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436.
55. Rosenblatt F. Principles of neurodynamics: perceptrons and the theory of brain mechanisms. Buffalo: Cornell Aeronautical Lab Inc; 1961.
56. Nielsen MA. Neural networks and deep learning. San Francisco: Determination Press; 2015.
57. Goodfellow I, Bengio Y, Courville A. Deep learning. Cambridge: MIT Press; 2016.
58. Nair V, Hinton GE. Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th international conference on machine learning (ICML-10). 2010. p. 807–14.
59. Kingma DP, Ba J. Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980. 2014.
60. Prechelt L. Automatic early stopping using cross validation: quantifying the criteria. Neural Netw. 1998;11(4):761–7.
61. Caruana R, Lawrence S, Giles CL. Overfitting in neural nets: backpropagation, conjugate gradient, and early stopping. In: Advances in neural information processing systems. 2001. p. 402–8.
62. Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R. Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res. 2014;15(1):1929–58.
63. Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.
64. Friedman JH. Greedy function approximation: a gradient boosting machine. Ann Stat. 2001;29(5):1189–232.
65. Suykens JA, Vandewalle J. Least squares support vector machine classifiers. Neural Process Lett. 1999;9(3):293–300.
66. Peters DP, Havstad KM, Cushing J, Tweedie C, Fuentes O, Villanueva-Rosales N. Harnessing the power of big data: infusing the scientific method with machine learning to transform ecology. Ecosphere. 2014;5(6):1–15.
67. Crisci C, Ghattas B, Perera G. A review of supervised machine learning algorithms and their applications to ecological data. Ecol Model. 2012;240:113–22.
68. Cutler DR, et al. Random forests for classification in ecology. Ecology. 2007;88(11):2783–92.
69. Hastie T, Tibshirani R, Friedman J. The elements of statistical learning. New York: Springer; 2009.
70. Drucker H, Burges CJ, Kaufman L, Smola AJ, Vapnik V. Support vector regression machines. In: Advances in neural information processing systems. 1997. p. 155–61.
71. Tang C, Garreau D, von Luxburg U. When do random forests fail? In: Advances in neural information processing systems. 2018. p. 2987–97.
72. Najafabadi MM, Villanustre F, Khoshgoftaar TM, Seliya N, Wald R, Muharemagic E. Deep learning applications and challenges in big data analytics. J Big Data. 2015;2(1):1.
73. Gleiss AC, Potvin J, Goldbogen JA. Physical trade-offs shape the evolution of buoyancy control in sharks. Proc R Soc B: Biol Sci. 2017;284(1866):20171345.
74. Weihs D, Keyes RS, Stalls DM. Voluntary swimming speeds of two species of large carcharhinid sharks. Copeia. 1981;1981(1):219–22.
75. Bishop CM. Pattern recognition and machine learning. New York: Springer; 2006.
76. Jorgensen SJ, et al. In the belly of the beast: resolving stomach tag data to link temperature, acceleration and feeding in white sharks (Carcharodon carcharias). Anim Biotelem. 2015;3(1):52.
77. Jorgensen SJ, et al. Eating or meeting? Cluster analysis reveals intricacies of white shark (Carcharodon carcharias) migration and offshore behavior. PLoS ONE. 2012;7(10):e47819.
78. Del Raye G, Jorgensen SJ, Krumhansl K, Ezcurra JM, Block BA. Travelling light: white sharks (Carcharodon carcharias) rely on body lipid stores to power ocean-basin scale migration. Proc R Soc B: Biol Sci. 2013;280(1766):20130836.
79. Larochelle H, Erhan D, Courville A, Bergstra J, Bengio Y. An empirical evaluation of deep architectures on problems with many factors of variation. In: Proceedings of the 24th international conference on machine learning. ACM; 2007. p. 473–80.
80. Bergstra J, Bengio Y. Random search for hyper-parameter optimization. J Mach Learn Res. 2012;13:281–305.
81. Yosinski J, Clune J, Bengio Y, Lipson H. How transferable are features in deep neural networks? In: Advances in neural information processing systems. 2014. p. 3320–8.
82. Oquab M, Bottou L, Laptev I, Sivic J. Learning and transferring mid-level image representations using convolutional neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2014. p. 1717–24.
83. Shin HC, et al. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans Med Imaging. 2016;35(5):1285.
84. Shepard EL, Wilson RP, Rees WG, Grundy E, Lambertucci SA, Vosper SB. Energy landscapes shape animal movement ecology. Am Nat. 2013;182(3):298–312.
85. Wilson RP, Quintana F, Hobson VJ. Construction of energy landscapes can clarify the movement and distribution of foraging animals. Proc R Soc B: Biol Sci. 2011;279(1730):975–80.
86. Hochreiter S, Schmidhuber J. LSTM can solve hard long time lag problems. In: Advances in neural information processing systems. 1997. p. 473–9.
87. Zaremba W, Sutskever I, Vinyals O. Recurrent neural network regularization. arXiv preprint arXiv:1409.2329. 2014.
88. Luong MT, Pham H, Manning CD. Effective approaches to attention-based neural machine translation. arXiv preprint arXiv:1508.04025. 2015.
89. Greff K, Srivastava RK, Koutník J, Steunebrink BR, Schmidhuber J. LSTM: a search space odyssey. IEEE Trans Neural Netw Learn Syst. 2017;28(10):2222–32.
90. Browning E, Bolton M, Owen E, Shoji A, Guilford T, Freeman R. Predicting animal behaviour using deep learning: GPS data alone accurately predict diving in seabirds. Methods Ecol Evol. 2018;9(3):681–92.
91. Valletta JJ, Torney C, Kings M, Thornton A, Madden J. Applications of machine learning in animal behaviour studies. Anim Behav. 2017;124:203–20.



Acknowledgements

We thank Francisco Chavez, George Matsumoto, Sue Lisin, Ben Becker, Jim Scholfield, Dale Graves, Jon Erickson, Farley Shane, Jose Rosal, Jim Montgomery, Erik Trauschke, Hans Thomas, Peter Kraft, and Nikolai Liebsch for their assistance. Field data collection was permitted by Point Reyes National Seashore and the California Department of Fish and Wildlife. Computational resources were based on CyVerse, whose work is supported by the National Science Foundation under Award Numbers DBI-0735191 and DBI-1265383. The code and data are available in our public GitHub repository:


Funding

ACG is funded by an Australian Research Council DECRA fellowship (150100321). OJDJ's collaboration with Monterey Bay Aquarium is supported by a Company of Biologists, Journal of Experimental Biology, Travelling Fellowship (JEBTF-180517) and a Veterinary and Life Sciences Small Grant from Murdoch University. Further support was generously provided by the Monterey Bay Aquarium and the David and Lucile Packard Foundation.

Author information




SJJ designed the study. PK, SJJ, JHM, TM, LB, SA developed field equipment and conducted field work. ZYCL, TG, CFW, and JHM analyzed the data and drafted the figures. JHM, ZYCL, SJJ, AG wrote the manuscript. All authors contributed data and edited the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Zac Yung-Chun Liu or Jerry H. Moxley.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional files

Additional file 1: Figure S1. Effect of training data volume on model performance. Consistent response curves and marginal performance gains with increasing training data volume support minimizing training datasets and maximizing prediction periods. Figure S2. Native model performance compared between the manuscript's ANN (a) and alternate methods (b–d), which produced similar levels of accuracy over longer time-integrated intervals. In particular, tree-based models (c–d) produced results equivalent to the ANN when predicting data in the native case (i.e., the same individual provided the training and testing data). Figure S3. Generalized model performance comparisons show that the manuscript's ANN approach outperforms other methods when tested on data wholly independent (i.e., generalized) from the data used in training. ANN predictions from a native use (a; i.e., the same individual's data used in training and testing) were slightly more accurate than predictions tested on data from an entirely different deployment (b). No other method (c–e) produced similar results, in spite of strong performance in the native case (Figure S2). Table S1. Preliminary results on the ANN's sensitivity to the volume of training data show marginal improvements (and, in some cases, losses) beyond 1 h of training. Table S2. Metrics evaluating the effects of different ANN model designs and tuning on test set prediction through k-fold validation. Table S3. Accuracy of predictions within the training and testing sets for various model implementations in a native case, where the model is trained and tested on the same 1 h of data from the same individual.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.


About this article


Cite this article

Liu, Z.Y., Moxley, J.H., Kanive, P. et al. Deep learning accurately predicts white shark locomotor activity from depth data. Anim Biotelemetry 7, 14 (2019).



Keywords

  • White shark
  • Biologging
  • Machine learning
  • Locomotion
  • Dynamic body acceleration
  • Data augmentation
  • Neural networks