Taking the time for range testing: an approach to account for temporal resolution in acoustic telemetry detection range assessments

Background: In acoustic telemetry studies, detection range is usually evaluated as the relationship between the probability of detecting an individual transmission and the distance between the transmitter and receiver. When investigating animal presence, however, few detections will suffice to establish an animal's presence within a certain time frame. In this study, we assess detection range and its impacting factors with a novel approach aimed towards studies making use of binary presence/absence metrics. The probability of determining presence of an acoustic transmitter within a certain time frame is calculated as the probability of detecting a set minimum number of transmissions within that time frame. We illustrate this method for hourly and daily time bins with an extensive empirical dataset of sentinel transmissions and detections in a receiver array in a Belgian offshore wind farm.
Results: The accuracy and specificity of over 84% for both temporal resolutions showed the developed approach performs adequately. Using this approach, we found important differences in the predictive performance of distinct hypothetical range testing scenarios. Finally, our results demonstrated that the probability of determining presence over distance to a receiver did not solely depend on environmental and technical conditions, but would also relate to the temporal resolution of the analysis, the programmed transmitting interval and the movement behaviour of the tagged animal. The probability of determining presence differed distinctly from a single transmission's detectability, with an increase of up to 266 m for the estimated distance at 50% detection probability (D50).
Conclusion: When few detections of multiple transmissions suffice to ascertain presence within a time bin, predicted range differs distinctly from the probability of detecting a single transmission within that time bin. We recommend the use of more rigorous range testing methodologies for acoustic telemetry applications where the assessment of detection range is an integral part of the study design, the data analysis and the interpretation of results.

transmitter signals by an acoustic receiver set-up [1]. This relationship is subject to the transmitter-receiver distance, environmental conditions and technical features, in addition to the behaviour of the tagged animal itself. Environmental impacts include static features, such as habitat type and bottom depth [2,3], as well as system dynamics that vary over time, such as wind, water currents, precipitation, biogenic and anthropogenic noise, temperature and stratification [4][5][6]. The detection range can also be dependent on the specifications of the equipment used, including transmitter type, transmitting power output and transmitter placement [7][8][9], as well as receiver depth, orientation and deployment method [5,10,11]. Biofouling on the receiver can significantly decrease receiver performance over time [12]. The tagged animal's behaviour can influence the detectability, e.g. through the occupancy of a specific depth or a propensity to hide or burrow [13,14]. Spatiotemporal variability in detection range is commonly investigated with a range test [1,4,5], where these patterns are evaluated against a relevant subset of factors of potential interference to transmissions.
Whether to optimize the design of a receiver array or to account for variability in detection probability during a study, a range test must be tailored to a study's specific application [1,15,16]. Before and/or during a telemetry study, the detection range is generally evaluated by means of sentinel transmitters at a known, generally fixed, position. Detection range is then typically assessed as the probability of detecting a single transmission at the known distance between receiver and transmitter. This individual detection probability is estimated either for every single transmission [3,10,17], or as the probability of detecting a single transmission within a period of time (e.g. for a daily resolution, this represents the probability of detecting a transmission given that day's conditions) [5,6,18]. However, many telemetry analyses do not build on single detections as a response variable, but rely on a binary presence/absence metric within a specified time bin (e.g. residency) [19][20][21]. For these studies, one detection (or at most a few) within a period of time, generally one hour or day, will suffice to classify the animal as present in that time bin. The probability of determining presence, i.e. detecting at least one or a few transmitted signals within a period of time, thus differs distinctly from the probability of detecting a single transmission [22].
For studies investigating presence of a tagged animal within a specified time bin, the assessment of range has to take into account the temporal resolution of interest. Environmental variables may impact detection range differently on distinct temporal scales [23]. The effect of tidal currents for example, can differ between hourly and daily resolutions. Moreover, the probability of determining presence of a tagged animal will increase if multiple transmissions can be detected. The number of potentially detectable transmissions is related to the chosen time bin and the transmitting interval settings, as well as the behaviour of the animal itself. A larger time bin and shorter transmitting interval result in a higher number of transmissions that can be detected by a receiver and thus in a higher probability that a fish is effectively observed as present within the specified time bin. Fish movement behaviour will also influence the probability of determining presence. An animal passing by a receiver location is expected to spend less time within range of a receiver than an animal that resides at that location. Telemetry researchers already adapt transmitter settings in line with the expectations of residency and movement behaviour to increase the detection probability (e.g. a shorter transmitting interval during the expected migration along a receiver curtain) [15] or reduce the risk of collisions [24]. However, assumptions on movement behaviour are rarely taken into account explicitly in detection range assessments.
In this study, we propose an approach to assess factors that impact the detection range, suitable for studies making use of binary presence/absence metrics. Our conceptual approach builds on the detection probability of a single transmission within a certain time frame to calculate the probability of detecting a given minimum number of transmissions within that time frame. The method can be applied to any receiver array equipped with sentinel transmitters. When investigating the probability of determining the presence of a tagged animal, the number of potentially detectable transmissions is estimated as a function of the chosen time bin, the transmitting interval settings and the behaviour of the animal itself. By applying the method to an extensive data set, the objectives of the current study are to (1) evaluate the predictive performance of the new approach; (2) compare different hypothetical range testing scenarios using this method, and (3) investigate the implications for detection range in study designs with different transmitter settings and animal species.

Analytical protocol
Firstly, data are prepared to model the detection probability of individual transmissions π at a given temporal scale (e.g. hourly or daily). For every receiver-sentinel transmitter combination, the numbers of transmissions and detections are calculated for the relevant time bin and fitted in a binomial generalized linear model (using a frequentist or Bayesian approach) to predict π in relation to ambient and technical variables. The probability P of discerning k or more detections out of n transmissions throughout that time bin is then calculated from the binomial cumulative distribution function:

$$P(X \geq k) = 1 - \sum_{i=0}^{k-1} \binom{n}{i} p^{i} (1-p)^{n-i} \qquad (1)$$

with p representing the individual detection probability, obtained as the predicted π from the logistic model. In Eq. 1, X denotes the number of detections and n the number of transmissions within the considered time frame. The detection threshold k is the minimum number of detections (X) for a transmitter to be ascertained as present. Therefore, P amounts to the probability of detecting a transmitter at least k times out of the n transmitted signals within a period of time, given the probability π of detecting a single transmission under the prevailing circumstances within that time frame (Fig. 1).
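The cumulative probability described above (Eq. 1) is the upper tail of a binomial distribution. A minimal Python sketch, assuming independent transmissions with a common detection probability p within the time bin; the function name `prob_presence` is hypothetical and not part of the study's code:

```python
from math import comb


def prob_presence(p: float, n: int, k: int = 2) -> float:
    """Probability of at least k detections out of n transmissions.

    Assumes X ~ Binomial(n, p), i.e. independent transmissions that all
    share the individual detection probability p within the time bin.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if k <= 0:
        return 1.0  # zero detections required: presence is trivially met
    if k > n:
        return 0.0  # fewer transmissions than required detections
    # P(X <= k - 1): probability of fewer than k detections
    below_k = sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k))
    return 1.0 - below_k


# Same low individual probability, hourly-like versus daily-like n
p_single = 0.05
print(prob_presence(p_single, n=6))    # few transmissions per bin
print(prob_presence(p_single, n=143))  # many transmissions per bin
```

With the same low p, the larger n yields a much higher P, which is the behaviour the zero threshold in the next subsection is meant to keep in check.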

Zero threshold
To address the risk of overestimating P, we propose to set a zero threshold for the modelled probability π. The 'zero-corrected' individual probability π 0 is defined as 0 below a set threshold value for π and rescaled to values between 0 and 1 for the remaining range of the predicted π. Even an extremely low individual probability π can generate a high cumulative probability P if n is high (Fig. 1). The zero threshold deals with the concern of cumulating low predicted probabilities. A logistic model can never render a predicted probability of zero, as the logarithm of zero is not defined. The predicted probability π is also associated with uncertainty, which will propagate with the summing and multiplication operations in Eq. 1 [28]. Setting the zero threshold should be a study-specific consideration, where one evaluates the confidence in the logistic model on the one hand and weighs the risk of overestimating versus underestimating π on the other.
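The zero correction can be sketched as follows. The text specifies only that π is set to 0 below the threshold and rescaled to between 0 and 1 above it; the linear rescaling used here is an assumption, as is the function name `zero_correct`:

```python
def zero_correct(pi: float, threshold: float = 0.05) -> float:
    """Zero-corrected individual detection probability.

    Values below the threshold become 0; the remaining range
    [threshold, 1] is mapped onto [0, 1]. The linear mapping is an
    assumption of this sketch, not a detail stated in the text.
    """
    if not 0.0 <= threshold < 1.0:
        raise ValueError("threshold must lie in [0, 1)")
    if pi < threshold:
        return 0.0
    return (pi - threshold) / (1.0 - threshold)


for pi in (0.01, 0.05, 0.30, 1.00):
    print(pi, zero_correct(pi))
```

Under this mapping a predicted π of exactly 1 stays at 1, and predictions just above the threshold start near 0 rather than jumping to the threshold value.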

Defining n
In Eq. 1, n represents the number of transmissions that can be detected by a receiver. For a fixed sentinel transmitter, n is defined as the number of executed transmissions within the considered time bin. For a nonstationary animal-borne transmitter, however, n needs to reflect the number of transmissions broadcasted while the tagged animal is within a certain range around a receiver. Therefore, the value of n will depend on the programmed transmitting interval and the time bin, in addition to the movement behaviour of the tagged animal.
Here, we calculate the integer n as

$$n = \frac{t_{min}}{T} \qquad (2)$$

where T is the mean transmitting interval and t min is the minimum time an animal is hypothesized to spend within range of the receiver. When defining t min , we make assumptions based on the expected movement behaviour (e.g. speed or residency) of the species of interest. For example, high residency or low activity would result in a higher estimate for t min than for migrating behaviour. The less is known about a study species and/or area, the more conservatively low t min should be set.
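As a worked illustration of Eq. 2, the sketch below converts a hypothesized minimum time in range and a mean transmitting interval into an integer n. Rounding down is an assumption made here to stay on the conservative side; the text only states that n is an integer. The function name is hypothetical:

```python
from math import floor


def detectable_transmissions(t_min_s: float, interval_s: float) -> int:
    """Number of transmissions emitted while the animal is in range.

    t_min_s: hypothesized minimum time within range, in seconds.
    interval_s: mean transmitting interval T, in seconds.
    Rounding down (floor) is an assumption of this sketch.
    """
    if interval_s <= 0:
        raise ValueError("transmitting interval must be positive")
    if t_min_s < 0:
        raise ValueError("t_min cannot be negative")
    return floor(t_min_s / interval_s)


# 30 min in range per day with a 180 s mean interval gives n = 10
print(detectable_transmissions(30 * 60, 180))
```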

Empirical data set
Between 13 May and 12 October 2020, an array of 27 VR2AR receivers (InnovaSea Systems Inc., USA) was set up in the Belwind offshore wind farm in the Belgian part of the North Sea. Receivers were deployed with tripod moorings [10], with distance between receivers ranging from 125 to 1628 m (Fig. 2). The array design was purposed to investigate presence and fine-scale movement patterns of plaice (Pleuronectes platessa), Atlantic cod (Gadus morhua) and European seabass (Dicentrarchus labrax) in the framework of ongoing studies, for which the VR2AR receivers' built-in transmitters (mean transmitting interval of 10 min) served as synchronization tags for a fine-scale positioning application. Transmitting power output was set as high (154 dB) for the entire study period for all built-in transmitters, except for three (Fig. 2) that were programmed as low (142 dB) before 16 June 2020 in the interest of assessing the effect of power output on detection range. Detections on the dates of receiver installation, receiver recovery and power setting changes were excluded from the analysis, making for a total of 150 days of detection data. Ambient and technical conditions taken into account consisted of wind and current speed and azimuth, noise, receiver tilt angle, temperature and days since deployment (Table 1). Wind measurements were obtained from 'Meetnet Vlaamse Banken' from station Westhinder (51.38°N, 2.44°E). Modelled current data originated from a forecast model [29]. From the hourly wind and current velocities, daily median current and direction were calculated using trigonometry principles. For both wind and current, the azimuth was calculated as the angle between the transmitter-receiver bearing and the direction. Noise (mV), tilt angle (°) and temperature (°C) were drawn from the VR2AR built-in sensors. The hourly measurements were linearly interpolated to the stroke of every hour, from which daily medians were calculated. Before inclusion in the model, all continuous variables were standardized.

Fig. 2 Built-in transmitters were either set to transmit at high power output for the entire study period (purple) or at low power output before 16 June 2020 and high power output afterwards (blue). Hypothetical range testing scenarios included either all receivers and built-in transmitters or those within a North-South or East-West axis (pink dotted lines)
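The azimuth variable described above is an angle between two compass directions, which has to be computed on a circle rather than by plain subtraction. A minimal sketch, assuming both directions are given in degrees and that the unsigned angle between 0° and 180° is wanted (the text does not state whether a signed or unsigned angle was used); the function name is hypothetical:

```python
def angle_between(bearing_deg: float, direction_deg: float) -> float:
    """Smallest unsigned angle (0-180 degrees) between two compass directions.

    bearing_deg: transmitter-receiver bearing.
    direction_deg: wind or current direction.
    """
    diff = (direction_deg - bearing_deg) % 360.0  # wrap into [0, 360)
    return 360.0 - diff if diff > 180.0 else diff


print(angle_between(350.0, 10.0))  # directions on either side of north
print(angle_between(90.0, 270.0))  # opposite directions
```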

Application of the approach
The described protocol was applied to the empirical data set to assess the detection range for determining presence in hourly and daily time bins.

Logistic model
We evaluated for every sentinel transmission whether it was detected by the receivers in the study set-up. To account for internal clock drift of the acoustic receivers, the recorded time of detection had to be within 100 s before or after the registration of the successful transmission on the built-in transmitter's receiver (D. Webber, pers. comm.), after applying a linear time correction on the offloaded receiver data (VUE software, InnovaSea Systems Inc., USA). For every transmitter-receiver combination, the hourly and daily numbers of transmissions and detections were calculated. Transmitter-receiver combinations spaced more than 1100 m apart were excluded from the analysis. Generalized linear models with a binomial distribution were applied to predict π hour and π day . Response variables were the hourly and daily numbers of transmissions successfully detected versus those undetected. The inclusion of different explanatory variables was evaluated for (1) relevance by data exploration [30]; (2) statistical significance by backwards model selection using the Akaike information criterion (AIC) and likelihood ratio test (LRT) [31]; and (3) practical significance on the basis of effect size [32,33], whereby factors were excluded from the model if the absolute value of the effect estimate was below 0.2.

Cumulative detection probability
Cumulative detection probabilities P hour and P day were then calculated (Eq. 1) and validated for the entire study period. The detection threshold k was set at 2, as applied by many studies [19][20][21]. The number of tries n was set as the registered number of sentinel transmissions within the hour or day. Individual detection probabilities π hour and π day were obtained using the logistic model formulae. P hour and P day were then calculated with individual detection probabilities π 0 hour and π 0 day at a zero threshold of 0.05. If P ≥ 0.5, sentinel transmitters were classified as present versus not present for P < 0.5 [34]. These binary predictions were compared with the determined presence throughout every day and hour (0/1, with 1 meaning at least 2 (k) transmissions were detected). To assess the predictive performance, a confusion matrix was inspected from which the performance metrics sensitivity, specificity and accuracy were calculated, in addition to the computation of area under the curve (AUC) [35]. High values for accuracy and AUC suggested a good overall performance, whereas sensitivity and specificity depicted the model's ability to correctly predict positive and negative values, respectively. For range testing, we favoured high scores for specificity over sensitivity, as a high number of false positives would indicate an overestimation of range.
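The validation step above (classification at P ≥ 0.5, followed by a confusion matrix) can be sketched in a few lines. The data are made-up toy values, not results from the study, and the function names are hypothetical; AUC is left out for brevity:

```python
def classify_presence(P: float, cutoff: float = 0.5) -> int:
    """Binary prediction: 1 (present) if P >= cutoff, else 0."""
    return 1 if P >= cutoff else 0


def confusion_metrics(observed, predicted):
    """Sensitivity, specificity and accuracy from 0/1 sequences."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "accuracy": ratio(tp + tn, len(pairs)),
    }


# Toy example: observed presence (>= k detections) vs. cumulative P
observed = [1, 1, 0, 0, 1, 0]
P_values = [0.9, 0.4, 0.2, 0.6, 0.7, 0.1]
predicted = [classify_presence(P) for P in P_values]
metrics = confusion_metrics(observed, predicted)
print(metrics)
```

A false positive here is a time bin predicted as present in which fewer than k transmissions were actually detected, which is why low specificity signals an overestimated range.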

Scenarios for detection range assessment
Using our empirical data set, we evaluated different scenarios for detection range assessment with a cross-validation approach. To this end, we split the full data set of sentinel transmissions and detections into different training and test subsets (Table 2), as if we were assessing detection range (training set) for an actual telemetry study (test set). For each of the test subsets, we considered 16 June 2020 as the start of the hypothesized study. Training sets contained either 'range test' data from before this date, 'reference tag' data from during the study, or both. 'Range test' training data comprised the data of 8, 16, 24 or 32 days before the start of the hypothetical study. Spatially, these training sets consisted of either all 27 receiver-sentinel transmitter combinations, a North-South axis (8 receivers) or an East-West axis (9 receivers), approximately parallel and perpendicular to the dominant current direction, respectively. For the cross-validation, logistic models were trained on each of the specified training sets. The included variables were drawn from the model selection based on the full hourly and daily data sets. As sentinel transmitters were all set to transmit at high power output after 16 June 2020, power output was not included in the logistic models for the 'reference tag' training data. Using the logistic model formulae from the training model, π hour and π day were predicted for the test data. Cumulative probabilities P hour and P day were calculated with Eq. 1, with k set as 2 and n as the number of registered sentinel transmissions in each specific hour or day. Transmitters were thus predicted as detected (1) in that hour or day if P ≥ 0.5 and as not detected (0) if P < 0.5.
The predictive capacity of these models was assessed by calculating the root mean square error (RMSE) between the true detection percentage and the predicted π, and by calculating specificity, AUC and the Brier Skill Score (BSS) for the binary predictions based on the cumulative probability P (Table 2). For the calculation of BSS, the Brier score of the full model was used as the reference Brier score [36].
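The error metrics named above follow standard definitions, sketched here with hypothetical function names. The BSS is written in its usual form, 1 − BS/BS_ref, with the reference Brier score passed in explicitly (in the study, that of the full model):

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def brier_score(outcomes, probabilities):
    """Mean squared difference between 0/1 outcomes and predicted probabilities."""
    if len(outcomes) != len(probabilities) or not outcomes:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((p - o) ** 2 for o, p in zip(outcomes, probabilities)) / len(outcomes)


def brier_skill_score(bs_model: float, bs_reference: float) -> float:
    """Skill relative to a reference: 0 = equal, > 0 = better, < 0 = worse."""
    if bs_reference == 0:
        raise ValueError("reference Brier score must be non-zero")
    return 1.0 - bs_model / bs_reference


# Toy values only, not results from the study
bs_scenario = brier_score([1, 0, 1, 0], [0.8, 0.3, 0.6, 0.1])
bs_reference = brier_score([1, 0, 1, 0], [0.9, 0.2, 0.7, 0.1])
print(brier_skill_score(bs_scenario, bs_reference))
```

A negative BSS in this set-up means the training scenario predicts presence less well than the full model used as reference.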

Assessing range for different study species
Detection range in our study area was estimated in the context of ongoing telemetry studies investigating hourly or daily presence of different species. The expected minimum time t min was hypothesized to be 15 min per hour and 30 min per day for very mobile species (e.g. European seabass), 30 and 60 min for less active species (e.g. Atlantic cod) and 1 and 3 h for species that would mostly stay put (e.g. plaice). Using these t min estimates in Eq. 2, n was calculated for the different species at mean transmitting intervals T of 90, 180 and 360 s. P hour and P day were calculated (Eq. 1) for distances from 100 to 1100 m with k = 2 and the predicted π 0 hour and π 0 day at median hourly and daily conditions, respectively. The distance at which detection probability was predicted to be 50% (D50) was calculated using one-dimensional root-finding.
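The D50 calculation can be illustrated with a simple bisection search. The logistic curve below uses made-up coefficients (`INTERCEPT`, `SLOPE`) chosen only for illustration; they are not the fitted model of this study, which also includes noise, power output and a squared distance term. For brevity the sketch applies Eq. 1 to the uncorrected π, and all function names are hypothetical:

```python
from math import comb, exp

# Hypothetical logistic coefficients, for illustration only
INTERCEPT = 3.0
SLOPE = -0.01  # per metre


def pi_single(distance_m: float) -> float:
    """Individual detection probability from a toy logistic curve."""
    return 1.0 / (1.0 + exp(-(INTERCEPT + SLOPE * distance_m)))


def prob_presence(p: float, n: int, k: int = 2) -> float:
    """P(X >= k) for X ~ Binomial(n, p), as in Eq. 1."""
    return 1.0 - sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k))


def bisect_root(f, lo: float, hi: float, tol: float = 1e-6) -> float:
    """Root of f on [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change on the interval")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2.0


# D50 for a single transmission and for presence with n = 10, k = 2
d50_single = bisect_root(lambda d: pi_single(d) - 0.5, 100.0, 1100.0)
d50_presence = bisect_root(
    lambda d: prob_presence(pi_single(d), n=10, k=2) - 0.5, 100.0, 1100.0
)
print(round(d50_single), round(d50_presence))
```

Even with these toy coefficients, the D50 for determining presence lies well beyond the D50 of a single transmission, which mirrors the qualitative pattern reported in the results.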

Logistic model
After variable selection, the final logistic regression models for both hourly and daily response variables included the explanatory variables distance, noise and power output, and the interactions distance-noise and distance-power output (Table 3). Visual inspection of the relationship with distance led us to include distance transformed to the second power [37], which contributed to an improved model fit. In summary, high levels of ambient noise and low transmitting power output significantly reduced the probability of a transmission being detected, whereby these negative effects were exacerbated at greater distance (Fig. 3). At short distances (< 300 m) from a receiver, the detection probability of a low power output transmitter exceeded that of one with high power output, which was likely due to close proximity detection interference [2,26]. Details of the model selection are fully described in Additional file 1.

Cumulative detection probability
Performance metrics were compared for calculations of P hour and P day (k = 2, n: median 143 per day, 6 per hour) using π and π 0 (Table 4). While the predictive performance differed only slightly for P hour , it markedly improved with the zero threshold for P day . Aside from a higher overall performance (accuracy and AUC), specificity increased by 30.3% for the daily model (2.2% for the hourly model). Whereas P hour was overestimated at short distance (< 600 m), the accuracy of the daily predictions was more consistent over distance (Fig. 3).

Scenarios for detection range assessment
The performance of distinct scenarios for the assessment of detection range varied considerably (Fig. 4). When models were trained exclusively with 'range test' data before the hypothesized start of the study, the performance of the scenarios using the full receiver set-up and the East-West axis was comparable. Training sets with receivers located parallel to the dominant current direction, along the North-South axis, resulted in a lower performance (higher RMSE and lower specificity and AUC). The variation in performance between different [...] and 'reference tag' data yielded much more consistency in the performance metrics. Yet, specificity for 'reference tag' training sets excluding the 'range test' data was often higher than for those where it was included, therefore seemingly resulting in improved predictions.

Table 3 Summary of the GLM with binomial distribution for individual detection probability π hour (left) and π day (right). Hourly noise measurements were linearly interpolated to the stroke of every hour (left), from which daily medians were calculated (right)
To understand the variation in the performance metrics, AUC and BSS were plotted against specificity and RMSE (Fig. 5). AUC and BSS displayed a parabolic relationship with specificity, meaning higher specificity came at the cost of lower overall prediction performance. An optimal approach for range assessment should be found at the trade-off between specificity and general performance, i.e. at the top of the parabola. Importantly, the training models combining 'range test' and 'reference tag' data were all found to be comparable in this relationship. Finally, low RMSE values for individual detection probability π produced more accurate cumulative probability predictions, as could be expected.

Assessing range for different study species
Using hypothesized t min values for species with distinct movement patterns, we calculated n at different mean transmitting intervals (Table 5). For a fast-moving species, thought to spend at least 30 min throughout a day around a receiver if present that day, and equipped with a tag transmitting on average once every 180 s, n would amount to a minimum of 10 transmissions that could be detected by that receiver throughout the day. Note that different values for t min can result in a similar n, depending on the transmitting interval.
Using these values for n, detection probabilities P hour and P day were calculated (Eq. 1; k = 2) using the logistic model predictions of π over distance for median noise conditions and high transmitting power output. The visualizations in Figs. 6 and 7 illustrate the impact of temporal resolution, transmitter interval settings and (expected) movement behaviour on detection range. Detection range as predicted by P hour and P day markedly exceeded π hour and π day . The estimated D50 increased by 84 to 266 m, depending on n. These results illustrate the distinction between the probability π of detecting an individual transmission in a given time frame versus the probability P of determining presence during that time frame.

Fig. 6 [...] (Table 5). The intersection of the curves with a probability of 0.5 (white line) indicates the D50. The intersection of the curves of π and P was a result of setting the detection threshold k at 2, whereas π and P at k = 1 would never intersect (Fig. 1)

Importance of considering time
Our results stress the importance of explicitly accounting for time when assessing detection range. When few detections of multiple transmissions suffice to ascertain presence within a time bin, predicted range differs distinctly from the probability of detecting a single transmission within that time bin. Our results showed that detection range might be severely underestimated when applying the individual detection probability for studies making use of binary presence/absence metrics. Moreover, a single receiver station can result in different detection ranges for animals occupying the space at that location differently. High values of t min , e.g. for animals known to move slowly and/or to exhibit high residency (or for transmitters set at short transmitting intervals), were demonstrated to result in a higher estimated range.

Evaluation of the proposed method
To our knowledge, this study offers the first framework to quantify the detection range for presence/absence metrics within a given time frame. The proposed formula (Eq. 1) provides a mathematically straightforward tool that builds on the commonly estimated probability of detecting a single transmission π. The accuracy and specificity of over 84% show that the developed approach performs adequately. However, the performance of the hourly model varied with distance, whereas the accuracy of the daily predictions was more consistent. The formula's parameters (zero threshold, detection threshold k and number of tries n) should therefore be set and evaluated according to the specific needs of a study. The zero threshold can explicitly deal with the risk of cumulating low logistic probabilities. The selected value for this threshold depends on the confidence in the binomial model predictions and the trade-off between the risks of over- and underestimating detection range. We believe that the relatively simple concept of a zero threshold ("below what threshold value do I not trust my logistic model outcome to exceed zero?") is to be preferred over a more sophisticated, yet mathematically exceedingly complex alternative of calculating the logistic error propagation [28]. For the purpose of understanding hourly and daily presence within the study area, we explicitly wanted to limit the number of false positives so as not to overestimate detection range. In contrast, telemetry studies that build on a smaller detection range [38] need to favour higher sensitivity. Applying the zero threshold in our study improved the daily predictions more dramatically than it did for the hourly model. This was in part attributed to a larger n, which made for a steeper curve than P hour (Fig. 1). When setting a zero threshold, therefore, the number of transmissions n, as well as the detection threshold k, should always be taken into consideration.

Fig. 7 Predicted detection probabilities over distance around a receiver for high transmitting power at median noise conditions for an hourly (upper) and daily (lower) resolution, as calculated with different numbers of detectable transmissions n (Table 5). The D50 distance is marked for each probability (white line and text) with probabilities over and under 0.5 coloured in red and blue, respectively
In addition to the estimated π 0 , the proposed approach requires values for n and k that are tailored to the telemetry study. Firstly, though a minimum number of detections (generally 2) is often applied to qualify a time bin with fish presence [19][20][21], this detection threshold k has never been considered in range assessments. Secondly, the formula obliges a researcher to consider the presumed number of detectable transmissions n in an animal study. Reflecting the hypothesized minimum time an animal would be in range of a receiver, t min depends on the animal's behaviour in a certain habitat (e.g. proneness to residency or a tendency to burrow) and the considered time bin. Depending on the species, t min may even be assumed to vary over time, for example if an animal is only seasonally resident [19] or exhibits diel variation in movement behaviour [41]. If little is known about the animals, researchers can opt to set precautionary low values for t min and therefore n. Likewise, if a study requires picking up nearly every transmission of a tagged animal in a certain area (e.g. during migration), researchers have to program the transmitting interval settings and/or space the receivers in the array accordingly [15]. The predicted cumulative probability P would then reach values similar to or even lower than the individual detection probability π (Fig. 1). In many cases, however, information is available on the expected movement behaviour (e.g. if the species was tagged before), which can be used for a more adequate assessment of range. Intuitively, one may resist the idea of seemingly imposing a bias on the analysis. In practice, however, the formula for calculating n (Eq. 2) builds on parameters that are otherwise left implicit when designing a telemetry study (e.g. for the choice of transmitting interval settings) [15,39,40]. By specifying how these parameters relate (Eqs. 1 and 2), they can explicitly be taken into account in the assessment of detection range and in the design of a telemetry study.

Accounting for range
Despite an increasing recognition in the telemetry community of the need for range testing, only a few range test studies [38,42] evaluate their own design or the applicability to the telemetry study and analytical application. As a standard practice, receivers and sentinel transmitters are placed on a line to investigate range [4,5,43]. In this study, we show that the orientation of that line can influence the estimation of detection range, likely in relation to the direction of the dominant currents [23]. Likewise, detections of sentinel transmitters used during this study were not necessarily representative of the performance of the entire array. In our case, the optimal strategy to obtain reliable detection errors was to assess range before the study using the entire receiver array, in addition to sentinel transmitter data during the study.
Aside from the range test itself, the method to account for detection error must be tailored to the analytical application and its temporal resolution. From the method elaborated in this study, the cumulative probability P enables the calculation of detection error at the same temporal resolution of the presence metric of interest. When analysing patterns in presence, this measurement error can be directly included either as a Bayesian error structure in a generalized model [44] or in a state-space modelling framework [45][46][47]. For telemetry analyses that do not build on presence/absence as a response variable, different methods have been developed to account for range or detection efficiency [16]. Detection counts for example can be directly recalibrated using a correction factor [25], whereas error can also be included in the calculation of centres of activity based on detection counts [48,49]. When investigating the sequence of detections in space, range can be assessed specifically for migratory routes [50] or network analysis [38]. For fine-scale positioning, horizontal position errors would be quantified within an entire receiver array [8], potentially accounting for individual receiver's contributions [51] and system settings [52].

Implications for study design
We strongly argue for considering the assessment of range as a fundamental aspect of the study design, the data analysis and the interpretation of results. Aside from factors beyond a researcher's control, such as environmental conditions and movement behaviour [15], range is an interplay of distance to a receiver [1], the deployment set-up [10] and receiver type [38], tag attachment [9], transmitting power output [2,7] and, depending on the application, transmitting interval and temporal resolution of the analysis. Therefore, researchers can fine-tune more aspects in the design of a telemetry study than simply the lay-out of a receiver array. Understanding the effect of these factors on detection range is also advantageous for budget management of expensive telemetry equipment. Adequate range assessments may optimize transmitter battery lifetimes, e.g. by carefully deciding on transmitting interval and power output [2], or reduce the number of receivers required in an array [53][54][55]. Building on the multitude of detection range studies, this study can serve as a plea to rethink detection range as a spatiotemporal interplay of many factors.