An at-sea assessment of Argos location accuracy for three species of large whales, and the effect of deep-diving behavior on location error

Argos satellite telemetry is used globally to track terrestrial and aquatic megafauna, yet the accuracy of this system has been described empirically only for a limited number of species. We used Argos-linked archival tags with Fastloc GPS deployed on free-ranging sperm (Physeter macrocephalus), blue (Balaenoptera musculus), and fin (B. physalus) whales to derive empirical estimates of Argos location errors for these species, examine possible behavior-related differences, and test the effect of incorporating species-specific error parameters on performance of a commonly used movement model. Argos location errors for blue and fin whale tags were similar and were combined (n = 1712 locations) for comparison against sperm whale tags (n = 1206 locations). Location error magnitudes for tags attached to sperm whales were significantly larger than blue/fin whale tags for almost all Argos location classes (LC), ranging from 964 m versus 647 m for LC 3, respectively, to 10,569 m versus 5589 m for LC B, respectively. However, these differences were not seen while tags floated at the surface after release. Sperm whale tags were significantly colder than ambient temperature when surfacing from a dive, compared to blue/fin whale tags (16.9 °C versus 2.3 °C, respectively), leading to larger changes in tag temperature during post-dive intervals. The increased rate of tag temperature change while at the surface was correlated with increased error magnitude for sperm whales but not blue/fin whales. Movement model performance was not significantly improved by incorporating species-specific error parameters. Location error estimates for blue/fin whales were within the range estimated for other marine megafauna, but were higher for sperm whales. Thermal inertia from deep, long-duration dives likely caused transmission frequency drift and greater Argos location error in sperm whales, as tags warmed at the surface during post-dive intervals.
Thus, tracks of deep-diving species may be less accurate than for other species. However, differences in calculated error magnitude between species were less than typical scales of movement and had limited effect on movement model performance. Therefore, broad-scale interpretation of Argos tracking data will likely be unaffected, although fine-scale interpretation should be made with more caution for deep-diving species inhabiting warm regions.

moving tens to hundreds of kilometers in a single day [1,2]. The advent of electronic tracking and bio-logging devices has allowed for monitoring of marine animal movements over periods of months or even years [3][4][5][6] and has provided the foundation for a host of discoveries across a range of scientific disciplines. For example, tracking data have been used to identify persistent areas of multi-species aggregation in the high seas [7] as well as previously unknown breeding and feeding areas [8][9][10]. Other discoveries include unexpected reproductive connections between endangered and non-endangered populations [11] and the use of possible navigational cues during long-distance migrations [12]. These data also have informed models developed to better understand the environmental drivers behind the animals' movements and to better predict their distribution and possible responses to future environmental changes [13][14][15][16][17][18]. Further, bio-telemetry and bio-logging approaches have led to useful management applications such as the identification of potential anthropogenic conflicts [19,20], the development of new mitigation tools [21][22][23], and the generation of information critical for conservation and management policy at a global scale [24,25].
Two established technologies for tracking the large-scale movements of marine megafauna via satellite include Doppler-based positioning with the Argos system and rapid-fixing global positioning system (GPS) with Fastloc GPS [4,26,27,28]. The Argos satellite system, operated by Collecte Localisation Satellites (CLS), consists of modules attached to polar-orbiting NOAA and Eumetsat satellites, which record UHF radio transmissions from an Argos platform transmitter terminal (PTT) and then transmit those signals to land-based receiving stations for processing and location estimation [29]. The position of the PTT on the globe is estimated from the Doppler shift in the frequency of transmissions received by the satellite during a pass. Least-squares analysis was historically used to optimize location estimation; however, in 2011 CLS incorporated a Kalman filter algorithm to provide more positions and better accuracy [29][30][31][32]. PTT positions are assigned one of seven location classes (LC; 3, 2, 1, 0, A, B, and Z in descending order of quality) based on the number of messages received during a pass and the estimated error associated with the calculated position. The nominal error radii associated with each numeric location class range from < 250 m for LC 3 to < 1500 m for LC 0, with no accuracy assessment of the lettered classes, as the error estimation is unbounded [29]. Empirically, Argos location error is best described by an ellipse with greater error in the longitudinal direction and highly skewed distributions along both longitude and latitude axes [29,33,34,35]. Tracks of marine megafauna typically contain a high proportion of locations with no Argos accuracy assessment (i.e., LCs A and B) due to the limited time these animals spend at the surface [33,34,36].
Measured Argos location errors for marine species have been described for tagged animals within an enclosure of known location [34], as well as by comparing Argos locations to temporally proximate Fastloc GPS locations collected from tagged free-ranging animals [33]. These empirically derived error estimates have been comparable to or slightly larger than the nominal values given by Argos, while estimated errors for location classes with no accuracy assessment (LCs A and B) have ranged from 1 to 10 km depending on the study and species [33,35,37,38,39,40,41].
Fastloc GPS is an adaptation of traditional GPS that captures a "snapshot" of the GPS satellite constellation in less than 1 s and saves the information onboard for later reconstruction of the tag's position when matched with temporally coincident satellite ephemeris data [27]. The fast acquisition time allows for highly accurate locations to be collected for marine species that may only surface briefly [39]. Fastloc GPS location accuracy improves as the number of satellites in view increases and can range from < 170 m (five satellites) to < 30 m (eight satellites) [28,37,38,42], making it a preferred method compared to Argos. Fastloc GPS locations can be recovered by direct download after tag recovery or as data contained within an Argos transmission, although bandwidth limitations mean each Argos transmission can only contain one Fastloc GPS location. Thus, the number of Fastloc GPS locations that can be transmitted via Argos may be limited, especially if they are collected at a fine temporal resolution or if other behavior-related data are also being transmitted.
To date, large whales have been tracked primarily using the Argos system [4,12,43,44]. However, estimates of Argos error among marine mammals are only available for species amenable to capture or captivity, like pinnipeds [33,34,37,39,40,41], and no empirical error estimates have been made for cetaceans. In addition, Argos location error has been found to be higher for northern elephant seals (Mirounga angustirostris) in what was speculated to be a temperature effect related to the species' deeper, long-duration dives [33]. The discovery of a possible diving behavior effect on Argos location accuracy prompted Costa et al. [33] to suggest that, depending on the behavior of the study species, investigators may want to derive their own error distributions. Thus, developing estimates of Argos location error specific to whale species with different diving behavior is an area of interest we address with this study.
Animal movement models are often fit to Argos tracking data as a way to generate regularly spaced locations with reduced error [39,45,46]. While there are a variety of methodologies, these techniques generally model a track's step length and turning angle while linking Argos error estimates associated with observed locations to account for location uncertainty [41,47]. They typically use all but the worst-quality locations (LC Z), although additional filtering of the input data can result in further improvement to the accuracy of model output [41]. These movement models have been applied across a wide range of marine taxa, with the estimates of Argos error used in the models almost exclusively coming from a study of captive seals [34]. However, few studies have explored the effects of filtering input data and using alternative Argos error estimates on the true error associated with the locations predicted by these models [37,40,41].
Here, we used Argos-linked archival tags with Fastloc GPS deployed on three free-ranging large-whale species, sperm (Physeter macrocephalus), blue (Balaenoptera musculus), and fin (B. physalus) whales to characterize Argos location errors for these species in comparison to theoretical values provided by the Argos system [29] as well as to empirical values obtained for other marine animals. Prior to recovery, tags released from the whales and floated at the surface for days or weeks, providing data for a comparison of location error between periods when they were attached versus when they were floating, allowing the investigation of any behavior-related effects. Considering that larger Argos errors have been observed in northern elephant seals compared to other pinnipeds [33], we hypothesized there would be inter-species differences in Argos location error related to whale behavior, with locations for deep-diving sperm whales having larger errors than in the shallower-diving blue and fin whales. We then investigated the impact of possible interspecies differences in Argos error on tracks generated with animal movement models using the derived error estimates as parametric input to a commonly used movement model. Accuracy of the model-estimated locations was assessed under different Argos error values and data processing regimes. Our hypothesis was that accuracy of model estimates would improve when using empirical Argos error estimates derived specifically for large whales relative to the default model parameters.

Results
Median tag attachment duration was 19.5 days (range = 0-49.6 days, n = 20 tags) for sperm whales, 22.4 days (range = 18.3-28.9 days, n = 8 tags) for blue whales, and 14.2 days (range = 4.9-16.0 days, n = 5 tags) for fin whales. A total of 8502 Argos locations and 20,852 Fastloc GPS locations were generated while tags were attached to whales, and 2300 Argos locations and 1079 Fastloc GPS locations were generated after the tags had released from whales and were floating at the surface prior to recovery (Table 1). Pre-processing removed 22 Argos LC-Z locations and 2139 Fastloc GPS locations from attached tags and removed 1 Argos LC-Z and 76 Fastloc GPS locations from floating tags to form the "complete" dataset (see Methods: Data pre-processing). A custom-editing protocol, implemented to identify and remove spurious or redundant locations, removed an additional 1511 Argos locations from the complete Argos dataset to produce the "edited" Argos dataset (Table 1).
A total of 2923 Argos locations for attached tags were temporally proximate ("matched") to Fastloc GPS locations after removal of duplicates (see Methods: Computation of Argos location errors), with 1-378 matched locations per track. The magnitude of location error for matched locations was < 124 km with the exception of five values > 634 km, which were removed as outliers based on their MAD z-score, for a final dataset of 2918 Argos locations with matched Fastloc GPS locations. LC 1 was the dominant Argos location class for sperm whales (30% of all matched locations; Table 1, Fig. 1), while LC B was the most abundant for blue and fin whales (58%; Table 1, Fig. 1). Fastloc GPS locations using six or more satellites (95th percentile of error < 70 m; [42]) accounted for 66.3% of matched locations for sperm whales and 94.9% of matched locations for blue/fin whales (Additional file 1: Table S1).
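The matching and outlier screening described above can be illustrated with a minimal sketch. It is not the authors' code: it pairs each Argos fix with the nearest-in-time GPS fix (the Methods also describe interpolating along the GPS track, which is omitted here), assumes a spherical Earth, and uses the common modified z-score form with the 0.6745 constant, which the text does not specify. The function names and the tuple layout `(unix_time_s, lat, lon)` are hypothetical.

```python
import math
from statistics import median

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation (assumption)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def match_errors(argos, gps, max_gap_s=300):
    """Pair each Argos fix with the nearest-in-time GPS fix within max_gap_s.

    Fixes are (unix_time_s, lat, lon) tuples. Returns error magnitudes in meters
    for the Argos fixes that found a GPS fix within the time window (5 min here).
    """
    errors = []
    for t, lat, lon in argos:
        nearest = min(gps, key=lambda g: abs(g[0] - t))
        if abs(nearest[0] - t) <= max_gap_s:
            errors.append(haversine_m(lat, lon, nearest[1], nearest[2]))
    return errors


def mad_z_scores(values):
    """Modified z-scores based on the median absolute deviation (MAD).

    The 0.6745 scaling is the conventional choice, assumed here.
    Raises ZeroDivisionError if the MAD is zero.
    """
    values = list(values)
    med = median(values)
    mad = median(abs(v - med) for v in values)
    return [0.6745 * (v - med) / mad for v in values]
```

Values whose absolute score exceeds a chosen cutoff (3.5 is a common convention, not a value taken from this study) would then be dropped before summarizing errors by location class.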
Error magnitudes for tags attached to sperm whales were not log-normally distributed (p < 0.001, Chi squared test) but were nevertheless log-transformed to provide variance stabilization and a better comparison with the blue/fin data using non-parametric methods. Log-transformed error magnitudes for attached blue and fin whale tags were normally distributed, although evidence of normality was slightly weaker for blue whales (p = 0.09 for blue whales and p = 0.23 for fin whales, Chi squared test). The transformed error magnitudes were not significantly different between these two species by location class (p = 0.43, GLM), so the data were combined into a single "blue/fin" category.
There was strong evidence that median Argos location error magnitudes of attached sperm whale tags were larger than those of blue/fin whales for all location classes except for LC 3 (p < 0.01, Mood's median test; Fig. 1). For attached tags, the 68th percentile of error magnitudes for LC 3 locations was 964 m for sperm whales and 647 m for blue/fin whales, while for LC B it was 10,569 m for sperm whales compared to 5589 m for blue/fin whales (Table 2). The longitude and latitude error components for sperm whales were 1.5-2 times greater than those of blue/fin whales. The 68th percentile values of longitude and latitude error components for sperm whales ranged

Errors of bearing for Argos locations while attached to whales appeared bimodally distributed, with larger error in the east-west direction for both species (Fig. 4). Bearing errors were not from a uniformly circular distribution for either species according to evidence from Kuiper's test of uniformity (V = 3.90, p < 0.01 for sperm whales, V = 3.96, p < 0.01 for blue/fin whales), Watson's test for circular uniformity (U² = 1.08, p < 0.001 for sperm whales, U² = 1.55, p < 0.01 for blue/fin whales), and Rao's spacing test of uniformity (t = 141.6, p < 0.001 for sperm whales, t = 145.3, p < 0.001 for blue/fin whales).
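As an illustration of the quantities summarized above, the following sketch decomposes a location error into east-west and north-south components, derives its bearing, and computes a ranked percentile. It assumes a local flat-Earth approximation (reasonable at the kilometer scale of these errors) and linear interpolation between ranks; neither choice is stated in the text, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # spherical approximation (assumption)


def error_components_m(lat_true, lon_true, lat_est, lon_est):
    """Return (east, north) error in meters of the estimate relative to the truth."""
    north = math.radians(lat_est - lat_true) * EARTH_RADIUS_M
    east = (math.radians(lon_est - lon_true) * EARTH_RADIUS_M
            * math.cos(math.radians(lat_true)))
    return east, north


def error_bearing_deg(east, north):
    """Compass bearing (0 = north, 90 = east) from the true to the estimated position."""
    return math.degrees(math.atan2(east, north)) % 360


def percentile(values, q):
    """q-th percentile (0-100) of values, with linear interpolation between ranks."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo = math.floor(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)
```

Applying `percentile(abs_errors, 68)` to the absolute longitude or latitude components within one location class would give a value comparable in kind to those tabulated for each class.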
In contrast to attached tags, Argos location error magnitudes for floating tags were similar between sperm and blue/fin whales, with overlapping 95% CIs of log-transformed medians of the same location class, and with differences in error magnitude between location classes (p < 0.01, Mood's median test; Fig. 1). The 68th percentile error magnitude for floating tags ranged from 408 m for LC 3 to 1257 m for LC B (Table 3). Median error magnitudes for sperm whale tags were significantly different for locations during attachment compared to floating for all six location classes (p < 0.01, Mood's median test), indicating error was higher while the tags were attached to the whales. For blue/fin whale tags, median error magnitudes were significantly different for locations during periods when tags were attached compared to floating for only LC B and LC 3 locations (p < 0.01, Mood's median test), and there was no difference for the other four location classes (p > 0.1, Mood's median test), indicating little difference in error magnitude when tags were floating compared to attached.
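For readers unfamiliar with the median comparisons reported above, a two-sample Mood's median test can be sketched as follows. This is a minimal illustration, not the authors' implementation: it applies no continuity correction, counts values equal to the grand median as "at or below", and relies on the identity that the upper tail of a chi-square distribution with one degree of freedom equals erfc(sqrt(x/2)). The function name is hypothetical.

```python
import math
from statistics import median


def moods_median_test(a, b):
    """Two-sample Mood's median test without continuity correction.

    Returns (chi2, p) for a chi-square statistic with 1 degree of freedom.
    Raises ZeroDivisionError if no value lies above the grand median.
    """
    a, b = list(a), list(b)
    grand = median(a + b)
    # 2x2 table: counts above vs. at-or-below the grand median, per sample
    table = [[sum(x > grand for x in s), sum(x <= grand for x in s)] for s in (a, b)]
    n = len(a) + len(b)
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    chi2 = 0.0
    for i, s in enumerate((a, b)):
        for j in range(2):
            expected = len(s) * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Upper-tail probability of chi-square with 1 df
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p
```

Because the test depends only on which side of the grand median each value falls, it gives the same result for raw and log-transformed error magnitudes.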
Sensors onboard the tags recorded internal and ambient temperatures, allowing us to quantify their evolution throughout dives and subsequent post-dive intervals (PDI). Water temperature during sperm whale dives ranged from 31 °C at the surface to as low as 4 °C at depth, while for blue/fin whale dives it ranged from 23 °C at the surface to 6 °C at depth (Additional file 4: Fig. S3). The distributions of most metrics comparing internal tag temperature to ambient temperature were bimodal for sperm whale tags, with the secondary peak near 0 °C, representing very small differences occurring during short-duration, shallow dives. Thus, only values related to the larger peak in bimodal data will be discussed as the "primary mode" of the sperm whale data. In contrast, the corresponding distributions for blue/fin whale tags were unimodal (but skewed), so median values are reported (Fig. 5).

Table 1 Number of Argos and Fastloc GPS locations collected from three free-ranging whale species
Each row represents the number of locations after the corresponding step of data processing. Location Class = LC; attached is when tag is physically on the whale; floating is when a tag has detached from the whale and is floating freely; Argos pre-processing (pre) is removal of LC Z; GPS pre-processing (pre) is removal of locations derived from fewer than five satellites, with residual value > 30, or speed > 30 km/h; matched is when an Argos location is within 5 min of a Fastloc GPS location; interpolated is an Argos location time stamp used for interpolating a position along the Fastloc GPS track to match with the Argos location; raw are unprocessed Fastloc GPS locations. A total of 23 LC-Z locations were removed during pre-processing (13 attached, 0 floating from sperm whales, 9 attached and 1 floating from blue/fin whales), which are not represented in the table. Numbers presented represent location data from 20 ADB tags deployed on sperm whales, eight ADB tags deployed on blue whales, and five ADB tags deployed on fin whales. Data were collected from sperm whales tracked in the Gulf of Mexico during summer 2011 and 2013, and from blue and fin whales tracked off southern California during late summer 2014 and 2015
For tags attached to sperm whales, the primary mode of internal temperature change during a dive was 19.7 °C (Fig. 5, top panel), and tags were 16.9 °C colder than the ambient water temperature when surfacing from a dive (Fig. 5, middle panel). In contrast, tags attached to blue/fin whales experienced a median internal temperature change of just 1.3 °C during a dive (Fig. 5, top panel) and were a median of 2.3 °C colder when surfacing from a dive (Fig. 5, middle panel). During the PDI, the internal temperature of tags attached to sperm whales rose at more than three times the rate of blue/fin whale tags (mode = 1.34 °C/min and median = 0.37 °C/min, respectively; Fig. 5, bottom panel). This, together with a longer time spent at the surface for sperm whales (median PDI = 9.5 min for sperm whales versus 2.1 min for blue/fin whales; Additional file 5: Fig. S4), resulted in tags from both species typically recovering from their internal temperature deficit during the PDI, as median internal tag temperature was, respectively, 1.7 °C and 1.8 °C colder than ambient temperature at the start of the next dive (Additional file 6: Fig. S5).
Possible relationships between derived tag temperature metrics (see Methods: Temperature covariates affecting Argos error) and Argos error magnitude were explored using generalized linear model (GLM) regression, while controlling for location class and whale species combinations ("SpeciesLC"). There was strong support for a correlation between absolute rate of internal tag temperature change during the PDI ("TempRate") and the log-transformed Argos error magnitude, with additional interaction between SpeciesLC and TempRate (p < 0.0001 for both covariates and the interaction term, R² = 32.3, GLM). Additional temperature covariates yielded only minor improvements to model performance and were discarded in favor of parsimony. Examination of the estimated interaction term coefficients showed support for a linear relationship between TempRate and log-transformed error magnitude for all sperm whale location classes (p values ranged from < 0.01 to 0.06, GLM) except for LC B (p = 0.13, GLM), while blue/fin whale coefficients showed no evidence of a linear relationship (p values ranged from 0.10 to 0.83, GLM).
For a model fit only to the sperm whale data, there was only weak support for an interaction between TempRate and location class (p = 0.08, GLM) so a regression model constraining slopes for all location classes to be equal was performed. The final model indicated that a unit increase in TempRate corresponded to 2.01 times greater Argos error magnitude (95% CI 1.81-2.24) for sperm whale tags after accounting for differences between location class. There was little support for an effect of TempRate on error magnitude in a model fit only to the blue/fin whale data (p = 0.088, GLM) and location class was the only strongly significant covariate in that model (p < 0.0001, GLM).
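The multiplicative interpretation above follows from fitting the model to log-transformed error: a slope b on the log scale corresponds to a factor of exp(b) per unit of the covariate on the original scale. A minimal sketch, assuming the transformation used the natural logarithm (the base is not stated in the text) and using a hypothetical function name:

```python
import math


def multiplicative_effect(slope, delta=1.0):
    """Factor by which error magnitude changes when the covariate rises by delta,
    given the slope of a linear model for the natural log of error magnitude."""
    return math.exp(slope * delta)


# Log-scale slope implied by the reported 2.01-fold increase per unit TempRate
implied_slope = math.log(2.01)

# Effect of a half-unit (0.5 deg C/min) increase under the same slope
half_unit_factor = multiplicative_effect(implied_slope, delta=0.5)
```

Confidence limits transform the same way: exponentiating the lower and upper limits of the slope gives the limits of the multiplicative factor, which is why the reported interval (1.81-2.24) is not symmetric around 2.01.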
Four separate implementations of hierarchical Bayesian state-space models (hSSM) using different Argos error estimates or input data editing methods were fit to 16 sperm whale and 13 blue/fin whale Argos tracks (see Methods: Effect of error on movement model outputs). For sperm whale tracks, each model estimated 1748 locations, of which 132 were within 5 min of a Fastloc GPS location, with 1-34 matches per track. True error magnitudes for model-estimated locations were not significantly different between models using the default VMRF02 and our derived sperm whale Argos error parameter values (Table 4) when fitted to the complete Argos input dataset (median difference = 0.26 km, p = 0.21, signed-rank test). However, the model using sperm whale Argos error parameters performed significantly better compared to one using VMRF02 error parameters when fitted to the edited Argos input dataset (median difference = 0.19 km, p = 0.0017, signed-rank test). No further significant differences in model-estimated location error magnitudes were found when comparing hSSM models using the complete versus the edited Argos input datasets with either VMRF02 (median difference = −0.01 km, p = 0.71, signed-rank test) or our derived sperm whale Argos error parameter values (median difference = −0.03 km, p = 0.25, signed-rank test).

Table 2 Percentiles of ranked absolute Argos location errors calculated for three free-ranging whale species
The 68th and 95th percentiles of theoretical error for each location class (LC) as reported by the Argos Service [29] compared to the error magnitude calculated in this study for tags attached to sperm and blue/fin whales. Also presented are the ranked absolute errors in longitude and latitude for each location class as reported by

hSSM models fitted to blue/fin whale Argos tracks each estimated 986 locations, of which 390 were within 5 min of a Fastloc GPS location, with 1-70 matches per track. True error magnitudes of model-estimated locations compared to matched Fastloc GPS locations were not significantly different between hSSM models using the VMRF02 Argos error parameter values and our derived blue/fin whale Argos error parameter values (Table 4), both when fitted to the complete Argos input dataset (median difference = −0.22 km, p = 0.93, signed-rank test) and when implemented on the edited dataset (median difference = 0.02 km, p = 0.98, signed-rank test). Differences in model-estimated location error magnitudes were also not significant when comparing hSSM models using the complete versus the edited Argos input datasets with either VMRF02 (median difference = −0.002 km, p = 0.93, signed-rank test) or our derived blue/fin whale error values (median difference = −0.0003 km, p = 0.48, signed-rank test).

Discussion
This is the first study to characterize Argos location error for tags attached to free-ranging whales using concurrently collected Fastloc GPS locations to represent the whales' true location. The observed similarity of calculated errors for blue and fin whale locations was expected based on the similar dive behavior of the two species [48], supporting the combined reporting of their results. Calculated error magnitudes for blue/fin whales were comparable to values reported for sea turtles and pinnipeds (68th percentile error ranging from LC 3 = 400-580 m to LC B = 2000-30,500 m; [33,35,37,38]). However, Argos location error magnitude was significantly larger for sperm whales (Figs. 2 and 3). This difference in error values between species was not present during the period when these tags had released and were floating prior to recovery, indicating that the observed error differences while attached to whales were not due to latitudinal differences in satellite trajectories, pass quality between study areas (California versus Gulf of Mexico waters), or differences in tag manufacturing runs. Thus, the results support our initial hypothesis that there would be inter-species differences in Argos error and indicate that species-specific differences in behavior while tags were attached affected the accuracy of calculated Argos locations.
Sperm whales produced a larger proportion of good-quality Argos locations (LC 1 or higher) compared to blue and fin whales, for whom poor-quality location classes (LC 0 or lower) dominated, as often occurs in marine species [33,40,49]. The larger proportion of high-quality locations for sperm whales is likely due to the extended time (5-10+ min; [50,51]) they spend at the surface following a dive, allowing more transmissions to occur during a satellite pass. Thus, sperm whale Argos tracks give the appearance of having greater accuracy than those of other marine species, although the larger error magnitudes arising from dives spanning wide temperature gradients indicate their tracks may not actually be more accurate. Bearings of calculated Argos errors were similar between species, indicating that the observed differences in error were limited to its magnitude. The observed longitudinal bias of calculated error bearings has been described in a variety of studies on other species and results from the polar orbit of the Argos satellites ([33,34,36]; but see [37]).

Table 3 Percentiles of ranked absolute Argos location errors calculated for floating whale tags
The 68th and 95th percentiles of theoretical error for each location class (LC) as reported by the Argos Service [29] compared to the error magnitude calculated in this study for tags floating at the surface after having released from sperm and blue/fin whales. Also presented are the ranked absolute errors in longitude and latitude for each location class as reported by VMRF02 and as calculated in this study. All values are in meters. Data were collected from sperm whales tracked in the Gulf of Mexico during summer 2011 and 2013, and from blue and fin whales tracked off southern California during late summer 2014 and 2015

Argos error magnitude was found to increase with an increasing rate of tag temperature change while at the surface across all location classes for sperm whale tags. The rate of tag temperature change was driven by the larger differences between internal tag and ambient water temperatures experienced by tags when surfacing from a dive. Sperm whales tagged in this study inhabit a region (the Gulf of Mexico) with very warm surface waters and make longer, deeper dives (> 500 m depth and > 30 min duration; [50,51]) than the more shallow-diving blue and fin whales (< 350 m depth and < 15 min duration; [48]), which were tagged in the temperate/sub-tropical waters of southern California. Thus, sperm whale tags experienced a wider range of water temperatures during dives compared to blue and fin whales (4-31 °C versus 6-23 °C). Due to thermal inertia, tags will cool or warm more slowly as ambient temperature changes, so internal tag temperatures during, or after, a dive may often be different from ambient temperatures depending on how much time has been spent at a given temperature. The longer dive durations of sperm whales allowed tags more time to equalize with cold water temperatures at depth, resulting in significantly larger internal tag temperature changes during dives compared to blue and fin whales (19.7 °C versus 1.3 °C), while also resulting in a significantly larger temperature deficit upon returning to the warm surface waters (16.9 °C versus 2.3 °C). Thus, the much larger tag temperature differentials when surfacing from a dive in sperm whales were driven by the longer occupancy of cold waters at depth combined with exposure to the warm surface layer characteristic of the Gulf of Mexico.
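The link between temperature deficit and warming rate described above can be illustrated with a first-order (Newtonian) thermal lag, in which the tag's internal temperature relaxes exponentially toward ambient. This is a conceptual sketch only: the study did not fit such a model, the time constant `tau_min` is a hypothetical parameter, and real tags may respond differently depending on construction and attachment.

```python
import math


def tag_temperature(t_min, start_temp_c, ambient_c, tau_min):
    """Internal temperature (deg C) after t_min minutes in water at ambient_c,
    for a first-order lag with time constant tau_min (hypothetical parameter)."""
    return ambient_c + (start_temp_c - ambient_c) * math.exp(-t_min / tau_min)


def warming_rate(t_min, start_temp_c, ambient_c, tau_min):
    """Instantaneous rate of internal temperature change (deg C per minute)."""
    current = tag_temperature(t_min, start_temp_c, ambient_c, tau_min)
    return (ambient_c - current) / tau_min
```

Under this simplification the initial warming rate at the surface is proportional to the temperature deficit on surfacing, so a tag that surfaces much colder than the surface water warms faster during the post-dive interval than one that surfaces only slightly colder, all else being equal.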
Changes in temperature can affect transmitter frequency stability [52], which is an important component of Argos location error [29,33]. We surmise that transmission frequencies of sperm whale tags likely changed while at the surface as tags warmed from large temperature deficits. However, the magnitude of these changes is unknown at present, as the transmitter frequency reported by Argos is an estimate of the true frequency based on all received transmissions during a pass. Instability of an Argos transmitter's frequency can lead to larger uncertainties in estimated locations [30,53], so even relatively small changes in transmitter frequency as the tags warmed at the surface may have had a significant effect on error magnitude if they occurred during a satellite pass.
Sensitivity of a tag to temperature gradients will likely vary with the transmitter used as well as thermal properties of the tag. Tags mounted externally [54,55] will likely have different thermal responses compared to those that are implanted in blubber and muscle [4] based on the surface area of the tag in contact with water. Overall tag size, materials used, and placement of the transmitter within the body of the tag will also play a role by varying the level of insulation around the transmitter. Data on a tag's internal temperature versus ambient are not available for most studies, precluding the identification of a specific temperature threshold where an increase in error might occur. Species- or individual-level behavior differences affecting the time spent in different temperature regimes will also have an effect.
Despite wide, repeated temperature changes during dives, temperature effects on Argos error magnitudes appear to have been short-lived, as they returned to a common range across species after the tags released from whales and floated at the surface. This indicates that, while Argos location error may be larger for some species as a consequence of their dive behavior, it will not result in a permanent shift in a transmitter's frequency over time, which would make long-duration tracking of deep-diving animals unreliable. Further, although interpretation of tracking data becomes difficult when measurement error exceeds biological stochasticity [56], the magnitude of Argos errors estimated for the whale species in our study remained well within their typical scales of horizontal movement (~ 3-4 km/h; [8,50]). Thus, the interpretation of broad-scale movements and occupancy for deep-diving species, as estimated from Argos tracking, should not be affected. However, interpretation of fine-scale location differences should be treated with more caution if they occur in close temporal proximity. Warm surface waters, which appeared to contribute to increased Argos error in this study, are characteristic of low-latitude regions globally. These regions also offer fewer opportunities to receive Argos locations, as satellite coverage declines with decreasing latitude [29]. Thus, telemetry studies in low-latitude regions will be affected by both larger Argos error magnitude (at least in deep-diving species) and sparser location data collection, indicating the need to augment recovery of locations in these regions through alternative systems [57,58].

Table 4 T-distribution parameter estimates derived from estimated Argos location error distributions for three free-ranging whale species
Maximum-likelihood estimates of t-distribution parameters for scale (km) and degrees of freedom (df), with standard errors (se) fit to Argos location error estimates for each location class (LC). Values are presented for VMRF02 (as estimated by [45]) and as obtained in this study for sperm and blue/fin whale datasets
Contrary to our second hypothesis, there was little difference in hSSM movement model performance when using the complete or edited datasets as input, or when using either the VMRF02 or our empirically calculated error structures to describe Argos locations within the model, suggesting the current hSSM formulation is robust to a variety of potential sources of error and behavior of the study animals. Other work has found the sensitivity of hSSM output to input data filtering to be variable, with little effect of the application of various speed filters for some species and a significant effect for others [37,41]. Precision of the input data can impact the accuracy and precision of locations estimated by the model [39,40], but decisions about when to discard or filter data should be made after consideration of the temporal coverage of available locations, as discarding even poor-quality locations can have serious effects on the output of sparse datasets [41,59]. This is the likely explanation for the better accuracy of the hSSM when using edited input data and our derived sperm whale error specification, as locations were frequent enough to discard severely erroneous ones, allowing the remaining locations to be better modeled by the species-specific error parameters in the model.

Conclusions
The species-specific Argos location error characterization we have conducted for three large-whale species will better inform a wide range of future telemetry studies on cetaceans and other marine megafauna. Our empirical estimates of Argos location accuracy for blue/fin whales were broadly within the range of values estimated for other marine megafauna, while estimates for sperm whales exceeded those ranges for some location classes.
The dive behavior of blue and fin whales, for which no effect on error magnitude was observed, encompasses the typical range of dive depths and durations made by a large number of marine species. However, researchers studying deep-diving species like beaked whales (Ziphiidae) or elephant seals (Mirounga spp.), which experience large temperature ranges during the course of regular diving, should be aware of the possibility that received Argos locations may not be as accurate as the nominal location classes indicate, especially in low-latitude regions. Tracking studies in regions with cooler surface waters may be less affected due to a reduced tag-versus-ambient temperature differential when surfacing from a dive. In addition, regional differences in diving behavior within a species may also have to be considered, as in at least one example a blue whale off Australia was recorded diving to much greater depth (≥ 500 m; [60]) than those observed in this study.
We have described how behavior-induced changes to tag temperature can affect Argos location error for one style of tag. Thermal properties of tags will likely vary by manufacturer and tag style due to differences in thermal inertia, transmitter positioning, and different temperature responses of transmitters used in the tags, with corresponding differences in effect on Argos error magnitude. Thus, further work is needed to better characterize the effect of animal behavior on location error across a range of tag styles currently in use on marine megafauna. In the absence of hardware-based solutions to this issue, the similar performance of hSSMs using different error values suggests that applying movement models to Argos tracks may be a post hoc strategy to mitigate increased location error for deep-diving species.

Data collection
For this study we used the Advanced Dive Behavior (ADB) tag [55], an Argos-linked bio-telemetry and biologging device manufactured by Wildlife Computers, Inc. (Seattle, Washington, USA). All tags in this study were deployed at close range (1.5-4 m) from a 6.4-m rigid-hulled inflatable boat as previously described in [4,55]. ADB tags were attached to sperm whales in the Gulf of Mexico in 2011 (n = 11) and 2013 (n = 9), and to blue and fin whales off southern California in 2014 (n = 4 blue whales; n = 3 fin whales) and 2015 (n = 4 blue whales; n = 2 fin whales). The tags recorded Fastloc GPS locations [28] every 7 min (blue and fin whales) or after surfacing from dives > 10 m depth and > 10 min in duration (sperm whales). Onboard sensors collected additional data (as described in [55]) including internal tag temperature (every 10 s) and ambient water temperature (every 1 s). Sensor data and successful Fastloc GPS fixes were stored in the tag's archive for download after recovery. Tags were fixed to a semi-implantable, stainless-steel housing with a corrodible link wire during deployment [55]. They remained attached to the whales until release criteria were met, at which time the tags separated from the housing to float at the surface for recovery and data download [55]. Argos transmissions were attempted every 45 s (controlled by a saltwater conductivity switch on the tag) throughout the day while attached and every 60 s while floating. An Argos transmission contained either one Fastloc GPS location or a dive data summary (not used in this study). Thus, each tag generated both an Argos track and a Fastloc GPS track (Additional file 7: Fig. S6). When possible, Fastloc GPS locations were retrieved from the archive of recovered tags, but only Argos-transmitted Fastloc GPS locations were available for unrecovered tags. Further details of tag design, application, and transmission cycles are available in [55].
Additional long-duration, location-only tags with Argos satellite transmitters [4] were attached simultaneously with ADB tags to a subset of sperm whales (termed "double-tagged" individuals; n = 7 in 2011 and n = 1 in 2013) for purposes of linking fine-scale dive behavior to broad-scale movements. These tags only generated Argos locations and transmitted every 10 s when at the surface during four 1-h periods each day.

Data pre-processing and editing
Fastloc GPS location accuracy diminishes as the number of available satellites decreases and/or with higher residual values [33,38,42,50], so locations derived from four or fewer satellites and/or with a residual value > 30 were removed prior to analysis [49,61]. A speed filter was also implemented to remove Fastloc GPS locations resulting in a speed > 30 km/h between consecutive locations. The retained Fastloc GPS locations were then considered to be the true location of the whales while tags were attached, and of the tags when they were floating at the surface after detaching from the whales (see below).
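The Fastloc GPS screening described above can be sketched in a few lines. The following Python fragment is only an illustration, not the code used in this study: the `Fix` record, the function names, and the use of a spherical (haversine) distance are assumptions, as is the choice to drop the later fix of a pair that exceeds the speed threshold, which the text does not specify.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt


@dataclass
class Fix:  # hypothetical record layout for one Fastloc GPS location
    time: datetime
    lat: float       # decimal degrees
    lon: float       # decimal degrees
    n_sats: int      # satellites used in the solution
    residual: float  # Fastloc residual value


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    p1, p2 = radians(lat1), radians(lat2)
    a = sin((p2 - p1) / 2) ** 2 + cos(p1) * cos(p2) * sin(radians(lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def filter_fastloc(fixes, min_sats=5, max_residual=30.0, max_speed_kmh=30.0):
    """Apply the satellite-count, residual, and speed criteria in sequence."""
    # Quality screen: remove fixes from four or fewer satellites or residual > 30.
    good = sorted(
        (f for f in fixes if f.n_sats >= min_sats and f.residual <= max_residual),
        key=lambda f: f.time,
    )
    kept = []
    for f in good:
        if kept:
            prev = kept[-1]
            hours = (f.time - prev.time).total_seconds() / 3600
            # Assumption: the later fix of a too-fast pair is the one dropped.
            if hours <= 0 or haversine_km(prev.lat, prev.lon, f.lat, f.lon) / hours > max_speed_kmh:
                continue
        kept.append(f)
    return kept
```

Because each fix is compared with the last retained fix, a single aberrant location does not also cause the valid location after it to be discarded.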
Argos locations in 2013, 2014, and 2015 were processed by CLS using the standard Kalman filter algorithm [29][30][31], while data from 2011 were originally generated using the older least-squares method and subsequently re-processed with the standard Kalman filter algorithm by CLS to ensure equivalence in location estimation methodology for all tags in this study. Argos locations with a quality designation of LC Z were removed from our analyses as they represent locations that failed plausibility tests [29]. The set of all Argos locations, with LC Z excluded, was termed the "complete" Argos dataset.
These data were then processed using the following custom-editing protocol to create an "edited" Argos dataset:
• If more than one Argos satellite is in view of the transmitter, multiple locations close in time (< 20 min) can be produced. The speeds between these locations are frequently high due to the larger influence of location errors over a short period than if locations were far apart in time. In these instances, low-quality locations (LC 0, A, or B) were removed if they were received within 20 min of a high-quality location (LC 1, 2, or 3).
• After removal, speeds between remaining locations were computed, and if a speed between two consecutive locations exceeded 12 km/h, one of the two locations was removed, with the location resulting in a shorter overall track length being retained to finalize the edited Argos dataset.
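The two editing steps can be illustrated with a short Python sketch. This is a hedged reconstruction rather than the protocol's actual implementation: the dictionary keys, the function names, the spherical distance, and the left-to-right order in which offending pairs are resolved are all assumptions made for the example.

```python
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt

HIGH_LC = {"3", "2", "1"}
LOW_LC = {"0", "A", "B"}


def dist_km(a, b):
    """Spherical (haversine) distance in km between two location dicts."""
    p1, p2 = radians(a["lat"]), radians(b["lat"])
    h = sin((p2 - p1) / 2) ** 2 + cos(p1) * cos(p2) * sin(radians(b["lon"] - a["lon"]) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def track_length_km(locs):
    return sum(dist_km(p, q) for p, q in zip(locs, locs[1:]))


def edit_argos(locs, window_min=20, max_speed_kmh=12.0):
    """locs: list of dicts with keys 'time', 'lat', 'lon', 'lc' (hypothetical layout)."""
    locs = sorted(locs, key=lambda d: d["time"])
    window = timedelta(minutes=window_min)
    high_times = [d["time"] for d in locs if d["lc"] in HIGH_LC]
    # Step 1: drop LC 0/A/B locations received within 20 min of an LC 1/2/3 location.
    kept = [
        d for d in locs
        if not (d["lc"] in LOW_LC and any(abs(d["time"] - t) < window for t in high_times))
    ]
    # Step 2: for each consecutive pair faster than 12 km/h, remove whichever
    # member leaves the shorter overall track, then re-check the neighbors.
    i = 0
    while i < len(kept) - 1:
        hours = (kept[i + 1]["time"] - kept[i]["time"]).total_seconds() / 3600
        speed = dist_km(kept[i], kept[i + 1]) / hours if hours > 0 else float("inf")
        if speed > max_speed_kmh:
            without_first = kept[:i] + kept[i + 1:]
            without_second = kept[:i + 1] + kept[i + 2:]
            if track_length_km(without_first) <= track_length_km(without_second):
                kept = without_first
            else:
                kept = without_second
            i = max(i - 1, 0)
        else:
            i += 1
    return kept
```

Stepping back one position after each removal means a newly adjacent pair is re-tested, so the output contains no consecutive pair exceeding the speed threshold.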

Computation of Argos location errors
Argos and Fastloc GPS tracks were separated into periods when tags were attached to a whale and when they were floating at the surface prior to recovery. For attached periods we extracted temporally proximate ("matched") Argos and Fastloc GPS locations occurring within 5 min of each other [33,37]. If an Argos location was within 5 min of more than one Fastloc GPS location, only the Fastloc GPS location closest in time to the Argos location was retained to avoid pseudo-replication of the same Argos location. In the case of double-tagged whales, often both tags produced Argos locations for the same satellite pass, resulting in two separate estimates of the whale's position. For this situation, both Argos locations were used as they were considered independent observations. For floating tags, presumed linear drifting between more regularly spaced Fastloc GPS locations allowed for a more precise estimate of the tag's true location than when attached. True tag locations were estimated for the time of received Argos locations using linear interpolation between the two temporally closest Fastloc GPS locations. Argos locations prior to the first floating and after the last floating Fastloc GPS location were not used. These interpolated Fastloc GPS locations and corresponding Argos locations were considered the matched pair.
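The two matching rules (nearest Fastloc GPS fix within 5 min for attached tags, linear interpolation between bracketing fixes for floating tags) can be sketched as follows. The function names and the tuple layout `(time, lat, lon)` are illustrative assumptions, and the interpolation is done directly in latitude and longitude, which is reasonable only for short drifts that do not cross the antimeridian.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def nearest_gps(argos_time, gps, tol=timedelta(minutes=5)):
    """Attached tags: return the GPS fix closest in time, or None if none is within tol.

    gps is a list of (time, lat, lon) tuples.
    """
    best = min(gps, key=lambda g: abs(g[0] - argos_time), default=None)
    if best is None or abs(best[0] - argos_time) > tol:
        return None
    return best


def interpolate_gps(argos_time, gps):
    """Floating tags: linearly interpolate (lat, lon) at the Argos time.

    gps must be sorted by time. Returns None for Argos times before the
    first or after the last floating GPS fix, which are not used.
    """
    if not gps or argos_time < gps[0][0] or argos_time > gps[-1][0]:
        return None
    times = [g[0] for g in gps]
    j = bisect_left(times, argos_time)
    if times[j] == argos_time:
        return (gps[j][1], gps[j][2])
    (t0, lat0, lon0), (t1, lat1, lon1) = gps[j - 1], gps[j]
    w = (argos_time - t0) / (t1 - t0)  # fraction of the interval elapsed
    return (lat0 + w * (lat1 - lat0), lon0 + w * (lon1 - lon0))
```

Selecting only the single nearest fix per Argos location mirrors the rule above that avoids pseudo-replication of the same Argos location.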
Although it is generally small, Fastloc GPS location error will influence the calculated Argos error magnitude of matched locations. When tags were attached, this influence could not be accounted for as it was confounded by the whales' true movement. However, when tags were floating at the surface prior to recovery, ocean currents moved the tags in a consistent direction and speed, making short departures in the Fastloc GPS track due to location error more recognizable. Although it would have been possible to more accurately recreate the tags' true locations for floating periods by implementing a more restrictive filtering protocol for those portions of tracks, we chose to use the same protocol for both floating and attached tag tracks so that Fastloc GPS location errors were implicitly incorporated into the Argos error estimates in an equivalent way. The only exception was for one blue whale tag that produced 16 Argos locations with suspiciously large error magnitudes (> 10 km) while it was floating, seven of which were LC 1 or better. Examination of the Fastloc GPS locations used as true locations revealed the large error magnitudes were all driven by a single Fastloc GPS location at the end of the track that also produced a suspiciously high speed from the previous location (19 km/h) after an extended gap in locations (> 24 h). As tags generally floated at < 2 km/h, this Fastloc GPS location was removed to address the suspiciously high Argos error magnitudes.
Location error was computed in terms of distance and bearing between matched Fastloc GPS and Argos locations using the package geosphere v. 1.5-10 [62] for the R software v. 3.6.1 [63]. The Vincenty ellipsoid formula was used to calculate the distance between the two locations as the overall magnitude of the error vector, as well as its separate longitudinal and latitudinal components. Error magnitudes were examined for extreme values using outlier identification in the software package Statgraphics v. Centurion 18 based on the median absolute deviation (MAD) z-score (http://cdn2.hubspot.net/hubfs/402067/PDFs/Outlier_Identification-1.pdf). Observations with modified median absolute deviation z-score values > 200 were considered outliers and removed from the analyses.
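As a rough illustration of these calculations, the Python sketch below computes an error magnitude and bearing for one matched pair and modified z-scores for a set of error magnitudes. It is not the study's code: a spherical haversine distance is substituted for the Vincenty ellipsoid formula to keep the example short, and the modified z-score uses the common definition 0.6745 (x − median) / MAD, which is assumed, not confirmed, to match the Statgraphics procedure. Function names are hypothetical.

```python
from math import asin, atan2, cos, degrees, radians, sin, sqrt
from statistics import median


def error_vector(gps, argos):
    """Return (distance in m, initial bearing in degrees) from the GPS to the Argos location.

    gps and argos are (lat, lon) tuples in decimal degrees. Uses a sphere of
    radius 6371008.8 m, not the ellipsoid used by the Vincenty formula.
    """
    lat1, lon1 = map(radians, gps)
    lat2, lon2 = map(radians, argos)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    dist_m = 2 * 6371008.8 * asin(sqrt(a))
    y = sin(dlon) * cos(lat2)
    x = cos(lat1) * sin(lat2) - sin(lat1) * cos(lat2) * cos(dlon)
    bearing = (degrees(atan2(y, x)) + 360.0) % 360.0
    return dist_m, bearing


def modified_z_scores(values):
    """Modified z-scores based on the median absolute deviation (MAD)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; modified z-scores are undefined")
    return [0.6745 * (v - med) / mad for v in values]
```

With these helpers, the outlier rule would amount to discarding error magnitudes whose modified z-score exceeds 200.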
If approximately normal (according to Shapiro-Wilk W test and equal-variances Levene's test), means of error magnitudes were compared by location class and species in Statgraphics using generalized linear models (GLM). If data were not normally distributed, medians were compared non-parametrically using Mood's median test and Kruskal-Wallis rank test in Statgraphics. Bearing errors were also tested for departure from uniformity, as they have been found to concentrate in the east-west directions for pinnipeds [33,34,36]. Tests for circular uniformity assume any departures are due to unimodal distributions, so using multiple tests is suggested to validate departures from uniformity [64]. Therefore, Kuiper's V, Watson's U², and Rao's spacing tests were performed using the R package circular v. 0.4-93 [65] to explore whether bearings were uniformly distributed.
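To give a sense of what one of these uniformity tests measures, the sketch below computes the Rao's spacing statistic, U = ½ Σ |T_i − 360/n|, where T_i are the arcs between successive sorted bearings (including the arc that wraps past 360°). It is an illustration only: the function name is hypothetical, and the statistic still has to be compared against tabulated critical values (as the circular package does) to obtain a test result.

```python
def rao_spacing_statistic(bearings_deg):
    """Rao's spacing statistic U in degrees; 0 means perfectly even spacing."""
    n = len(bearings_deg)
    if n < 2:
        raise ValueError("need at least two bearings")
    b = sorted(x % 360.0 for x in bearings_deg)
    # Arcs between neighbors, plus the arc closing the circle.
    spacings = [b[i + 1] - b[i] for i in range(n - 1)] + [360.0 - b[-1] + b[0]]
    expected = 360.0 / n
    return 0.5 * sum(abs(t - expected) for t in spacings)
```

Larger values of U indicate bearings clustered more tightly than a uniform distribution would produce, with a maximum of 360(1 − 1/n) when all bearings coincide.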

Temperature covariates affecting Argos error
Internal tag and ambient water temperature records were isolated from the data archives for dives > 10 m in depth and for the following PDI. Metrics describing the temperature regimes experienced by the tags were derived for each dive and PDI, including the maximum internal tag temperature difference from the start of the dive to the tag's coldest point, the difference between internal and ambient tag temperature at the end of a dive and the end of the PDI, and the rate of change of internal tag temperature during the PDI, calculated as the difference in internal tag temperature from the start to the end of the PDI divided by the duration of the PDI. The timing of Argos locations and associated error magnitude data were matched to the temperature metrics for the temporally closest PDI and its corresponding dive. The temperature metrics were used as covariates to test for an effect on calculated Argos error magnitude using GLM in Statgraphics. A "SpeciesLC" indicator variable was used to control for all combinations of Argos location classes for each species in one variable, as a way to examine interactions of temperature metrics with species and location class covariates in a readily interpretable fashion.
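The temperature metrics for a single dive and its PDI reduce to simple differences, as the hedged Python sketch below shows. The function and key names, the sign conventions (internal minus ambient, end minus start), and the choice of °C per minute for the rate are assumptions made for the example, since the text does not fix them.

```python
def temperature_metrics(dive_internal, pdi_internal, ambient_end_dive, ambient_end_pdi, pdi_minutes):
    """Summarize tag temperature for one dive and its post-dive interval (PDI).

    dive_internal, pdi_internal: internal tag temperatures (deg C) in time order.
    ambient_end_dive, ambient_end_pdi: ambient water temperature (deg C) at those times.
    pdi_minutes: PDI duration in minutes (must be > 0).
    """
    return {
        # Largest cooling from the start of the dive to the tag's coldest point.
        "max_dive_drop": dive_internal[0] - min(dive_internal),
        # Internal minus ambient temperature at the end of the dive and of the PDI.
        "diff_end_dive": dive_internal[-1] - ambient_end_dive,
        "diff_end_pdi": pdi_internal[-1] - ambient_end_pdi,
        # Net internal temperature change over the PDI divided by its duration.
        "pdi_rate": (pdi_internal[-1] - pdi_internal[0]) / pdi_minutes,
    }
```

For example, a tag that surfaces at 11 °C into 28 °C water and warms to 21 °C over a 10-min PDI would yield a rate of 1 °C per minute under these conventions.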

Effect of error on movement model output
A Bayesian hierarchical switching state-space model (hSSM [45]) was used to model both the complete and edited Argos datasets from attached portions of tracks using two different Argos error regimes: values reported by Vincent et al. [34] (hereafter referred to as "VMRF02" in the context of these comparisons) and the species-specific errors calculated by this study. The estimated error magnitudes were fit to a Student's t-distribution using the function fitdistr from R package fitdistrplus v. 1.0-14 [66], which provided estimates for degrees of freedom and scale parameters for use in the movement models.
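The idea behind the t-distribution fit can be sketched as a maximum-likelihood search over scale and degrees of freedom. The Python fragment below is a simplified stand-in and not the R routine used here: it assumes a t-distribution centered at zero, and it replaces numerical optimization with a coarse search over user-supplied candidate values; the function names are hypothetical.

```python
from math import lgamma, log, pi


def t_loglik(errors, scale, df):
    """Log-likelihood of errors under a zero-centered Student's t with the given scale and df."""
    const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi) - log(scale)
    return sum(const - (df + 1) / 2 * log(1 + (x / scale) ** 2 / df) for x in errors)


def fit_t_grid(errors, scales, dfs):
    """Return the (scale, df) pair from the candidate grids with the highest log-likelihood."""
    return max(
        ((s, d) for s in scales for d in dfs),
        key=lambda p: t_loglik(errors, p[0], p[1]),
    )
```

A grid search only locates the best candidate among those supplied, so in practice the grids would need to be fine enough, or be replaced by a proper optimizer, to approach the maximum-likelihood estimates and their standard errors.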
The two-part hSSM model fits a first-difference correlated random walk model to each track while using observed locations to account for location uncertainty. The hierarchical nature of the model estimates state variables like position and behavior state individually (by track), but movement parameters are assumed to be shared across individuals [45]. Tracks were modeled separately by species to account for possible differences in movement parameters. Models were fitted using the R package bsam v. 1.1.2 [45], with the code modified to accept our empirically calculated Argos error parameters when not using VMRF02 values. In all cases, locations were estimated at 6-h intervals for tracks lasting ≥ 3 days. Model performance was assessed by comparing locations estimated by the hSSM to corresponding Fastloc GPS locations meeting the same 5-min criterion for temporal proximity used to investigate Argos error. True error magnitudes for model estimates were then compared in a pairwise manner by error input parameters (VMRF02 and our derived values) and by input dataset (complete and edited), where the median difference in error between two models was tested for difference from zero using a signed-rank test in Statgraphics (one-variable analysis). For the two models being compared, significant positive median differences indicated better performance for one model and significant negative median differences indicated better performance for the other model.
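The pairwise comparison can be illustrated with the quantities underlying a signed-rank test. The Python sketch below returns the median paired difference and the sums of positive and negative ranks for two sets of model errors evaluated at the same Fastloc GPS locations. It is an illustration under assumptions, not the Statgraphics procedure: the function name is hypothetical, zero differences are dropped before ranking, tied absolute differences receive average ranks, and no p-value is computed.

```python
from statistics import median


def signed_rank_summary(errors_a, errors_b):
    """Return (median difference, W+, W-) for paired errors from models A and B."""
    all_diffs = [a - b for a, b in zip(errors_a, errors_b)]
    diffs = [d for d in all_diffs if d != 0]  # zero differences carry no sign
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Find the run of tied absolute differences and give each the average rank.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return median(all_diffs), w_plus, w_minus
```

With differences defined as A minus B, a positive median difference and a large W+ relative to W− would point to smaller errors, and hence better performance, for model B.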