Potential utility of geomagnetic data for geolocation of demersal fishes in the North Pacific Ocean

Archival tags that measure the Earth’s magnetic field could provide a new geolocation method for demersal fishes in the North Pacific Ocean. However, the presence of local magnetic field anomalies caused by geological formations such as volcanic rock and temporal fluctuations from solar storms could complicate its use in some high-latitude areas of the North Pacific Ocean. We assessed the potential value of adding geomagnetic data to a depth-based state-space model for geolocation of demersal fishes in Glacier Bay National Park, USA, a high-latitude magnetic anomaly area. We developed a high-resolution (100 m) magnetic field map of the study area and assessed in situ tag resolution by deploying 5 geomagnetic archival tags on a stationary mooring for 8 months. We compared performance of 4 theoretical geomagnetic tag measurement resolutions (low = ± 1000 nT, medium = ± 500 nT, high = ± 300 nT, and very high = ± 150 nT), 2 map resolutions (coarse- or fine-scale), and 5 methods of geomagnetic variance specification by estimating locations of simulated random walk trajectories under the different treatment scenarios using a hidden Markov model. Geomagnetic data improved model performance for both fine-scale and coarse-scale magnetic maps when tag resolutions were medium to very high and geomagnetic variance specification was based on error between measured and mapped values instead of study area attributes such as slope or roughness. Overall, the best model performance was observed for the highest tag resolution, the fine-scale map, and variance based on anomaly magnitudes. However, the coarse-scale map with a constant variance of 165 nT resulted in improvements over depth alone for all tag resolutions. In situ testing of mooring data suggests that the precision of the geomagnetic archival tags was comparable to the low and medium tag measurement resolutions tested in simulations, but variation in performance was high among tags. 
Our results suggest that inclusion of geomagnetic data could improve geolocation of demersal fishes in the North Pacific Ocean, but improvements to geomagnetic tags and additional information on magnetic field values measured at the seafloor compared to the sea surface are needed to ensure its utility.

large-scale migrations of up to thousands of kilometers [1][2][3]. Detailed information on large-scale movements, such as migration timing, pathways, and the proportion of populations that migrate, is important for management of these species. However, this type of information is scarce because it is difficult to obtain from fishery-dependent conventional mark-recapture tags that provide tag release and recovery locations alone [4].
Electronic archival tags that collect information such as depth, temperature, and light intensity while a tagged fish is at liberty can be used to provide daily position estimates of demersal fish species over large time scales. The values recorded by the tag each day are matched to maps of the measured variables in the study area to identify the most likely location of the fish on each day, a process known as geolocation. For example, ambient light intensity measurements vary spatially and seasonally and can provide daily estimates of latitude and longitude based on the time of local noon and day length recorded by the tag each day [5]. Light-based geolocation works well for reconstructing the movement paths of highly mobile pelagic fishes in lower latitudes, such as tunas and billfishes [6,7], but not high-latitude demersal fishes. Demersal fishes can occupy depths > 150 m, where light intensity is too limited for geolocation [8] or locations where light intensity is occluded by silt or phytoplankton. For demersal fishes, matching tidal amplitude and phase from depth records of stationary fish to predicted tidal amplitude and phase in the study area can be an effective geolocation method depending on the characteristics of the study area [9]. However, this method is not practical for demersal fishes in the North Pacific Ocean because hydrodynamic models are not accurate for most areas in the region, and tidal signals in depth records can be diminished for deep-water fishes. Therefore, geolocation based on maximum daily depth is a more robust method of geolocation for demersal fishes in the North Pacific Ocean that can be assumed to be in close proximity to the seafloor at least once per day [10]. 
However, in areas where depth gradients are weak or for fish species that cannot be assumed to visit the seafloor on a daily basis, such as sablefish [11], additional geolocation variables would be expected to greatly improve the accuracy and precision of geolocation estimates for these fishes in the North Pacific Ocean.
One option for an additional geolocation variable for demersal fishes in the North Pacific Ocean is provided by recently developed electronic tags that measure the Earth's magnetic field. Because the magnetic field varies over space and can be mapped, its use for fish geolocation has been proposed [12,13]. Earth's three-dimensional magnetic field can be described in terms of individual X, Y, and Z dimensions, horizontal and vertical components, or as the total magnetic field (the vector sum of all three dimensions). The magnitude of the magnetic field can be predicted at any location on Earth by three-dimensional models such as the International Geomagnetic Reference Field [IGRF,14]. These global models are based on satellite measurements, vessel surveys, and magnetic observatory data. Because the main field changes slowly over time, the models need to be updated approximately every 5 years.
Two approaches for fish geolocation using magnetic field data have been introduced so far [12,13]. Total magnetic field strength can serve as an indication of latitude because it increases from approximately 26,000 nanoTeslas (S.I. unit, nT) at the equator to 66,000 nT at the poles [12]. However, because magnetic field gradients and orientation vary geographically, and are not always parallel to latitude, the potential usefulness of using magnetic data to determine the latitude of tagged fish varies by geographic region. Another theoretical approach to geomagnetic geolocation features the use of separate horizontal and vertical components, rather than the total magnetic field, because horizontal and vertical component gradients can differ within the same geographic region and thus locations may be estimated by intersecting horizontal and vertical magnetic field gradients [13].
Both of these approaches, which rely on data from large-scale models of the Earth's main field, are complicated by sources of small-scale spatial and temporal variation. Small-scale spatial variation is caused by geological formations such as igneous or ferromagnetic rocks that possess their own magnetic fields. The large-scale pattern of the Earth's main field is disrupted by the magnetic field intensity of these crustal features, which can vary greatly over small spatial scales, and thus they are referred to as local magnetic anomalies. Global and regional maps of local magnetic anomalies have been produced based on satellite measurements in addition to aerial and marine vessel survey data [15]. However, magnetic anomaly map resolution may be coarse compared to the spatial scale at which some local magnetic anomalies occur. In addition, the magnitude of magnetic field anomalies may be underestimated when measured several kilometers above an anomalous feature (e.g., during aerial surveys), as magnetic field intensity declines as an inverse cubic function with distance from the source [16].
Small-scale temporal variation results from solar and atmospheric processes. In low latitudes, charged particles in the ionosphere move more when heated by the sun during the day, resulting in diel variation in the magnetic field that occurs at a scale of up to 100 nT [16]. In high latitudes, solar storms may increase or decrease the measured magnetic field at a specific location by 2000 nT or more and can last several days [16]. A global network of observatories provides precise information on temporal changes at different locations; these data are freely available at www.intermagnet.org.
Because gradient strength and orientation of the main field, incidence of local magnetic anomalies, and magnitude of temporal fluctuations vary by geographic region, the potential utility of geomagnetic geolocation should be assessed regionally. In the North Pacific Ocean near Alaska, gradients in the total magnetic field run NW to SE (Fig. 1a). For a fish moving from SW to NE, perpendicularly to the magnetic gradient orientation, gradient strength will decrease and total magnetic field values will increase. Therefore, the method would be most effective at detecting fish movement that is oriented roughly parallel to the shoreline and occurs between the Aleutian Islands and Prince William Sound, or on the eastern Bering Sea continental shelf between the eastern Aleutian Islands and the Bering Strait. However, many local magnetic anomalies exist in Alaska, given the region's volcanic nature (Fig. 1b). In addition, high latitudes are exposed to high temporal variability associated with the effects of solar storms (Fig. 2). Therefore, the process of geolocation with geomagnetic data may be more difficult in Alaska (USA) compared to low-latitude regions without these phenomena.
We explored the potential value of geomagnetic geolocation for demersal fishes in the North Pacific Ocean by incorporating geomagnetic data into a discrete state-space model that can account for local magnetic anomalies. This model is based on a hidden Markov model (HMM) developed for the geolocation of Atlantic cod in the North Sea [17] and adapted for geolocation of demersal fishes in the North Pacific Ocean based primarily on depth data [10]. In this study, we expanded the HMM by developing a data likelihood model that combines depth and geomagnetic data, and we hypothesized that it would perform better than a data likelihood model based on depth alone. We tested model performance under different conditions in a high-latitude magnetic anomaly area that, while small in spatial scale, allowed a mechanistic understanding of map accuracy, tag resolution, and geolocation model parameterization.
To assess the potential utility of geomagnetic geolocation with the HMM, we quantified magnetic field map accuracy, in situ tag resolution, and performance of different data likelihood models. We produced a fine-scale map of the magnetic field in the study area for comparison with a large-scale magnetic field map available for the broader region. We simulated movement trajectories of demersal fishes in the study area with four different values of magnetic tag measurement resolution and compared location estimation results from a model based on depth alone to results from six models that combined depth and geomagnetic data. We collected information on in situ tag resolution and temporal fluctuations from archival tags deployed on a stationary mooring in the study area and inferred potential utility of these tags for geolocation based on performance of simulated data with different measurement resolutions. We discuss the potential value of including geomagnetic data in geolocation models for demersal fishes in magnetic anomaly areas such as the North Pacific Ocean as well as general procedures and best practices for working with geomagnetic data.
Fig. 1 Magnetic field values in Alaska, USA. a The main field at sea surface elevation modeled by the International Geomagnetic Reference Field (IGRF) increases from the southwest to the northeast (contour lines of 500 nT are shown). Four magnetic observatories in Alaska (see Fig. 2) are indicated by yellow crosses. b Magnetic field anomalies (red represents large positive anomalies, blue represents large negative anomalies, and green represents non-anomaly areas) occur throughout the region. Information on anomaly magnitude is available from a map with 1-km resolution by the North American Magnetic Anomaly Group (NAMAG). Numbered areas identify characteristics of the anomaly map referred to in the discussion

Study area
Our study was conducted in Glacier Bay National Park, a glacial fjord located in the northern portion of southeastern Alaska (Fig. 1a). The heterogeneous glacial topography in the study area is composed of shallow (approximately 50 m) sills and deep (to 450 m) trenches (Fig. 3a). Glacier Bay also has a heterogeneous geological composition formed of four distinct geological terranes produced by collision of the North American and Pacific plates [18]. Magnetic anomalies in Glacier Bay are largely produced by granitic rocks from the Cretaceous and Tertiary ages [18]. The magnitude of geomagnetic anomalies in Glacier Bay (range 2000 nT) produces changes in magnetic field values over distances of a few hundred meters that are equivalent to the change in the main magnetic field over 1000 km in the vicinity of Kodiak Island, Alaska (Fig. 1a).

Fine-scale map
Because magnetic field values can vary greatly over very small spatial scales in anomaly areas, a fine-scale map of the magnetic field in the study area (Fig. 3b) was developed for the purposes of conducting simulations based on higher resolution magnetic field data than would be provided by larger scale maps. To obtain high-resolution magnetic field data in the study area, a GEM (Markham, Ontario, Canada) GSM-17 Overhauser magnetometer/gradiometer was attached to the bow of an aluminum vessel and data were recorded at a frequency of 1 Hz (Additional file 1: Figure S1 Figure S1-3). SMO measurements were then used to account for temporal fluctuations in the magnetic field during vessel surveys. A linear temporal trend in the main field (− 85 nT/year) was removed from individual observations so that the map represented magnetic field values on July 1, 2013. All data were divided into 100-m grid cells and a mixed effects model was used to obtain cell means that accounted for autocorrelation present for each transit through the grid cell and for each tracking trip (Additional file 1). Magnetic field values at locations in the study area not visited during the vessel survey were estimated by co-Kriging the vessel survey data with fine-scale aerial survey data that were available for the entire study area [18,19]. Detailed methods for the construction of this map, subsequently referred to as the "fine-scale map", are available in Additional file 1.

Coarse-scale map
A relatively coarse, large-scale map that provides information on local magnetic anomalies was obtained for the study area (Fig. 3c). This map consisted of the North American Magnetic Anomaly Group [NAMAG,20] anomaly grid (1-km resolution) added to main field predictions from the International Geomagnetic Reference Field [IGRF,14]. This map, referred to subsequently as the coarse-scale map, is available over a larger area in the North Pacific Ocean (Fig. 1b) and may thus be utilized for geolocation over broader areas within the North Pacific Ocean region.

Coarse-scale map accuracy
We quantified the coarse-scale map error by comparing it to the values measured by the magnetometer on board the vessel prior to co-Kriging (Additional file 1: Figure S1-4). This avoided inclusion of potentially inaccurate fine-scale map values in grid cells that were not measured directly by the vessel. We calculated the root mean square error for all vessel measurements (100 m grid resolution) within each coarse-scale map cell (1 km resolution). To determine whether map error was larger in grid cells with higher magnetic anomalies, we tested for a relationship between root mean square map error vs. anomaly magnitude (NAMAG grid) using a Generalized Additive Model.

Geomagnetic geolocation

Geolocation model
The HMM [10,17] features a study area divided into discrete grid cells and ultimately estimates a probability that the tagged fish occupied a given grid cell at a given time step. Briefly, the model consists of a movement model (random walk) coupled with a data likelihood model that matches geolocation data recorded by the tag on the fish to maps of geolocation variables in the study area. At each time step, the movement model iteratively advances the probable location of the tagged fish and then the data likelihood model updates the probability that the tagged fish is located within each grid cell for that time step. Once the last geolocation record is reached, backward smoothing is conducted to re-estimate probabilities based on all geolocation records. Daily position estimates can be expressed either in terms of overall grid cell probabilities or as a single location corresponding to the mean or mode of the probability distribution across all grid cells. Uncertainty at each time step can be quantified by polygons that encompass a desired level of the probability distribution at each time step (e.g., 99%). The data likelihood model for demersal fishes in the North Pacific Ocean is based primarily on the maximum daily depth recorded by a fish at each time step [10]. Because demersal fishes are assumed to be in close contact with the seafloor at least once per time step, the maximum depth can be linked to bathymetric maps of the study area. The likelihood value for each grid cell and time step is determined by integrating the probability distribution of grid cell depth values between the limits of the maximum tag depth at each time step plus and minus tag resolution [21]. Likelihoods for additional geolocation variables, such as geomagnetic data, can be incorporated when available by cell-wise multiplication of the likelihoods for each variable at each time step.
The likelihood value for geomagnetic data is calculated in the same manner as the depth likelihood (i.e., integrating the probability distribution of the total magnetic field values in each grid cell between the limits of the tag magnetic field measurement plus and minus tag measurement resolution at each time step).
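The likelihood calculation described above can be illustrated with a minimal Python sketch. It assumes that the values within each grid cell follow a normal distribution defined by a cell mean and standard deviation; the text does not state the distributional form, so this is an assumption. The function names are hypothetical, and the sketch is not the code used in the study.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def cell_likelihood(tag_value, resolution, cell_mean, cell_sd):
    """Probability mass of the cell distribution (assumed normal) that lies
    between tag_value - resolution and tag_value + resolution."""
    upper = normal_cdf(tag_value + resolution, cell_mean, cell_sd)
    lower = normal_cdf(tag_value - resolution, cell_mean, cell_sd)
    return upper - lower


def combined_likelihood(depth_like, mag_like):
    """Cell-wise product of two likelihood grids (lists of rows)."""
    return [
        [d * m for d, m in zip(depth_row, mag_row)]
        for depth_row, mag_row in zip(depth_like, mag_like)
    ]
```

The same `cell_likelihood` function serves for depth (metres) and total magnetic field (nT); only the map values, the cell standard deviations, and the tag resolution differ.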

Simulated trajectories
To assess the performance of different tag resolutions and HMM data likelihood models, we simulated 1000 random walks in the study area using the formula:
$$x_t = x_{t-1} + \varepsilon_t, \qquad y_t = y_{t-1} + \varepsilon_t \tag{1}$$
where $x_t$ and $y_t$ are the X and Y coordinates (here, representing changes in longitude and latitude, respectively) at time $t$, $x_{t-1}$ and $y_{t-1}$ are the X and Y coordinates at the previous time step, and $\varepsilon_t$ is a normally distributed error with a mean of 0 and standard deviation of σ, drawn independently for the x and y components. Trajectories were simulated with 100 steps and σ = 1000 m. At each step, the depth value was extracted from the 20 m bathymetry grid and recorded; if the location fell on land, it was discarded and a new location was selected. Uniform error between + 1 and − 1 m was added to depth data to simulate depth measurement resolution. Magnetic field values were extracted from the 100 m fine-scale grid at each simulated location. Four magnitude levels of Gaussian error (s.d. 75, 150, 300, and 500 nT) were added to extracted values to simulate very high (± 150 nT), high (± 300 nT), medium (± 500 nT), and low (± 1000 nT) tag resolutions. To avoid effects of convoluted coastlines when using a diffusion kernel movement model [22], we altered the bathymetry for the purpose of these simulations and modeling exercises to remove inlets and convert islands to shallow areas (island and inlet cells were replaced with depth values of 5 m with random Gaussian error of s.d. = 1 m added).
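A minimal Python sketch of this simulation procedure follows. The `is_water` callback stands in for a lookup in the bathymetry grid and is hypothetical, as are the function names. The error standard deviations are copied from the text in the order given. The sketch treats "100 steps" as 100 recorded locations including the start, which is an assumption.

```python
import random

# Gaussian error s.d. (nT) per simulated tag resolution, as listed in the text.
TAG_ERROR_SD_NT = {"very high": 75.0, "high": 150.0, "medium": 300.0, "low": 500.0}


def simulate_walk(n_locations, sigma, start, is_water, rng):
    """Random walk with independent normal steps in x and y.

    Proposed locations that fall on land are discarded and redrawn.
    """
    x, y = start
    path = [(x, y)]
    while len(path) < n_locations:
        new_x = x + rng.gauss(0.0, sigma)
        new_y = y + rng.gauss(0.0, sigma)
        if not is_water(new_x, new_y):
            continue  # discard the land location and select a new one
        x, y = new_x, new_y
        path.append((x, y))
    return path


def observe(depth_m, field_nt, resolution, rng):
    """Add uniform (+/- 1 m) depth error and Gaussian magnetic error."""
    noisy_depth = depth_m + rng.uniform(-1.0, 1.0)
    noisy_field = field_nt + rng.gauss(0.0, TAG_ERROR_SD_NT[resolution])
    return noisy_depth, noisy_field
```

For example, `simulate_walk(100, 1000.0, (10000.0, 10000.0), is_water, random.Random(1))` would produce one trajectory with σ = 1000 m, given a suitable `is_water` function for the study area.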

Data likelihood treatments
The HMM was used to estimate locations of each of the 1000 synthesized archival data sets for 7 data likelihood model treatments. Treatments consisted of a depth-only data likelihood model and six data likelihood models that combined depth and magnetic data (Table 1). Data likelihood models were based on either the coarse-scale map, which has potential for large-scale application in the North Pacific Ocean, or the fine-scale map, which represents the best available magnetic field data for the study area. For the coarse-scale map, four methods of determining grid cell variance were tested (Fig. 4). First, the "roughness" method assigns cell variance based on the standard deviation of values in all adjacent cells and is commonly used to assign grid cell variance in other HMM applications [21][22][23]. Second, the "slope" method assigns variance based on linking magnetic field gradients to expected variation in the grid cell from available fine-scale maps [10]. Third, the "anomaly" method assigns standard deviation values to model grid cells based on the root mean square difference between magnetic field values measured by the vessel and the coarse-scale map in each coarse-scale grid cell. Fourth, the "constant" method assigns the same value of standard deviation to all model grid cells; the value is derived from the 80% quantile value of rms map errors (the difference between vessel-based measurements and coarse-scale model prediction) in the large-scale grid cells. For the fine-scale map, two methods of assigning variance were tested. First, the "aggregated" method consisted of the standard deviation of the high-resolution (100 m) values aggregated to form the 1 km model grid [10]; this combination of map and variance specification method represents the best available magnetic field information. Second, the "anomaly" method, described above for the coarse-scale map, was applied to the fine-scale map as well.
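Two of these variance specifications, "roughness" and "aggregated", can be sketched in Python as follows. The sketch assumes that grids are stored as lists of rows, that "adjacent cells" means the up to eight neighbours excluding the focal cell, and that the population standard deviation is used. None of these details is stated in the text, and the function names are hypothetical.

```python
import statistics


def roughness_sd(grid):
    """Standard deviation of the values in the cells adjacent to each cell."""
    n_rows, n_cols = len(grid), len(grid[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            neighbours = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(n_rows, r + 2))
                for cc in range(max(0, c - 1), min(n_cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            out[r][c] = statistics.pstdev(neighbours) if neighbours else 0.0
    return out


def aggregated_sd(fine_grid, factor):
    """Standard deviation of the fine-grid values inside each coarse cell.

    With a 100 m fine grid and a 1 km model grid, factor would be 10.
    Grid dimensions are assumed to be exact multiples of factor.
    """
    out = []
    for r0 in range(0, len(fine_grid), factor):
        row = []
        for c0 in range(0, len(fine_grid[0]), factor):
            block = [
                fine_grid[r][c]
                for r in range(r0, r0 + factor)
                for c in range(c0, c0 + factor)
            ]
            row.append(statistics.pstdev(block))
        out.append(row)
    return out
```

The "constant" method needs no computation beyond filling every model cell with a single value, and the "anomaly" method assigns one of a small number of values according to the anomaly magnitude of each cell.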

Model estimation
The HMM was run entirely in R [24] using a combination of code provided by Martin Pedersen (DTU, Denmark) translated from Matlab to R [10] and the R package HMMoce [23]. Each of the 6 data likelihood treatments that combined depth and geomagnetic data likelihood models was run with the four different tag resolution scenarios. For the movement model, we used the same value of σ that was used to create the trajectories (1000 m). The size of the diffusion (movement) kernel was 9 × 9 cells, which allowed a maximum movement of 5.66 km per time step. Pathways were reconstructed by the weighted mean method [10], consisting of the mean location of the smoothed probability distribution at each time step.
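The filtering and smoothing recursions can be sketched in Python for a small grid. This is an illustration only: the study used R code and the HMMoce package, and all function names here are hypothetical. The sketch assumes a truncated Gaussian movement kernel, 1 km model cells (so σ = 1000 m corresponds to 1 cell), and probability that is simply lost at the grid edges during the movement step.

```python
import math


def gaussian_kernel(half_width, sigma_cells):
    """Normalised Gaussian kernel of size (2 * half_width + 1) squared."""
    span = range(-half_width, half_width + 1)
    w = [[math.exp(-(dr * dr + dc * dc) / (2.0 * sigma_cells ** 2)) for dc in span]
         for dr in span]
    total = sum(map(sum, w))
    return [[v / total for v in row] for row in w]


def convolve(grid, kernel):
    """Spread each cell's value over its neighbourhood (movement step)."""
    n_rows, n_cols, h = len(grid), len(grid[0]), len(kernel) // 2
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            if grid[r][c] == 0.0:
                continue
            for dr in range(-h, h + 1):
                for dc in range(-h, h + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n_rows and 0 <= cc < n_cols:
                        out[rr][cc] += grid[r][c] * kernel[dr + h][dc + h]
    return out


def multiply(a, b):
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def normalise(grid):
    total = sum(map(sum, grid))
    return [[v / total for v in row] for row in grid]


def filter_and_smooth(start, likelihoods, kernel):
    """Forward filter (predict, then update) followed by backward smoothing."""
    filtered, predicted = [normalise(start)], [None]
    for like in likelihoods:
        pred = convolve(filtered[-1], kernel)
        predicted.append(pred)
        filtered.append(normalise(multiply(pred, like)))
    smoothed = [None] * len(filtered)
    smoothed[-1] = filtered[-1]
    for t in range(len(filtered) - 2, -1, -1):
        ratio = [[s / p if p > 0.0 else 0.0 for s, p in zip(rs, rp)]
                 for rs, rp in zip(smoothed[t + 1], predicted[t + 1])]
        # The kernel is symmetric, so the backward step reuses convolve().
        smoothed[t] = normalise(multiply(filtered[t], convolve(ratio, kernel)))
    return filtered, smoothed


def weighted_mean_position(prob):
    """Mean (row, column) of a probability grid, in cell units."""
    row = sum(r * v for r, vals in enumerate(prob) for v in vals)
    col = sum(c * v for vals in prob for c, v in enumerate(vals))
    return row, col
```

A 9 × 9 kernel corresponds to `gaussian_kernel(4, 1.0)`: the largest possible displacement is 4 cells in each direction, and with 1 km cells the diagonal is 4√2 ≈ 5.66 km, matching the maximum movement per time step given above.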

Performance assessment
Performance of each treatment on a simulated data set was assessed by calculating the mean absolute distance between each known (simulated) location and the location estimated by the model. This quantity is referred to as the mean absolute error (MAE). Box-and-whisker plots were constructed to visualize results from all data likelihood/tag resolution treatment combinations. The Wilcoxon rank-sum test was used to determine whether the median value of MAE for each depth/magnetic likelihood treatment differed from depth alone.
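As a minimal illustration of the MAE calculation, the following Python sketch assumes that known and estimated locations are given as (x, y) pairs in a projected coordinate system with units of metres. The function name is hypothetical.

```python
import math


def mean_absolute_error(true_path, estimated_path):
    """Mean straight-line distance between paired known and estimated locations."""
    distances = [
        math.hypot(tx - ex, ty - ey)
        for (tx, ty), (ex, ey) in zip(true_path, estimated_path)
    ]
    return sum(distances) / len(distances)
```

One MAE value is obtained per simulated trajectory and treatment, giving the samples that are compared between treatments.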

Geomagnetic archival tag resolution
To understand in situ archival tag magnetic measurement resolution and assess temporal change in magnetic field values due to solar storms, we deployed five geomagnetic archival tags on a stationary mooring in the study area (Fig. 3a). Geomagnetic tags from two different manufacturers (Desert Star Systems SeaTag-MOD, n = 3, and Star Oddi DSTmagnetic, n = 2) were rigidly attached to a mooring line at depths ranging from 134 to 138 m from October 10, 2013 to July 1, 2014. Desert Star tags recorded measurements every 4 min and Star Oddi tags recorded every 20 min. Both types of tags recorded tri-axial magnetic field, tri-axial acceleration, depth, and temperature data. The mooring consisted of a concrete anchor and nylon mooring line and thimbles. An aluminum acoustic release Oceano 500 (iXblue, Saint-Germain en Laye, France) was mounted 2 m above the anchor and 2 m below the lower tag. Prior to deployment, a G-857 proton precession magnetometer (Geometrics, San Jose, CA, USA) was used to verify that the acoustic release did not influence the magnetic field at a distance of 2 m. Tags were spaced 1 m apart and attached to the line with plastic fasteners. Three plastic trawl floats (buoyancy 12.5 kg each) were used for flotation and were attached 1 m above the upper tag. Tags were deployed October 9, 2013 and were recovered July 1, 2014.
[Figure caption, partial] Islands and convoluted shorelines have been removed to simplify the simulation exercise
To assess tag resolution and accuracy, we first calculated an offset that linked the total magnetic field value measured by the tag to the measured value of the total magnetic field at the mooring location because neither tag type recorded absolute magnetic field values. For fish geolocation, the offset is obtained by finding the difference between the daily mean of the total magnetic field values on the first (or last) day at liberty and the known value of the magnetic field at the release location (or tag recovery location). This offset is then applied to every record in the data set. To be consistent with geolocation methods for fishes, we calculated the magnetic field value at the mooring location from the fine-scale map and subtracted the mean for the first day. The offset for each tag was then added to each recorded magnetic field value.
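The offset procedure can be sketched in Python as follows. The function names and the numbers in the usage note are hypothetical and serve only to show the arithmetic.

```python
def tag_offset(first_day_values, known_field_nt):
    """Offset = known field at the reference location minus the tag's first-day mean."""
    first_day_mean = sum(first_day_values) / len(first_day_values)
    return known_field_nt - first_day_mean


def apply_offset(values, offset):
    """Add the offset to every recorded total magnetic field value."""
    return [v + offset for v in values]
```

For example, a tag whose first-day readings average 101 (in its own relative units) at a location with a mapped value of 55,000 nT receives an offset of 54,899 nT, which is then added to every record in the data set.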
To assess temporal trends and determine whether solar storms could be detected by the tags, we superimposed observatory data from SMO over detailed data sets. Observatory data from SMO were found to be representative of temporal changes in the magnetic field in the study area based on short-term data that were obtained from stationary tags deployed during the mapping portion of this project (Additional file 1: Figure S1-3). Observatory data were obtained at a frequency of 1 min and adjusted to the magnitude of the magnetic field at the mooring location from the fine-scale (100 m resolution) map by applying an offset of 206 nT. Detailed records were visually examined at weekly timescales to determine whether obvious solar storm events were evident in the magnetic field measurements recorded by the stationary tags.
Tag measurement resolution was determined by calculating the standard deviation of all total magnetic field daily means for each tag. To visualize in situ tag resolution and accuracy in the context of potential HMM performance and allow comparisons with HMM estimation of simulated data sets, we generated histograms of the difference between tag daily means and the fine-scale map value at the mooring location. Histograms were plotted over polygons that represented the four levels of tag measurement resolutions used to calculate likelihoods (very high = ± 150 nT, high = ± 300 nT, medium = ± 500 nT, and low = ± 1000 nT). A Wilcoxon rank-sum test was performed to determine whether a bias existed in the difference between measured and mapped values for each tag.
To assess and visualize the effects of stationary tag in situ bias and measurement resolution, we applied the HMM to the stationary tag data as if it were obtained from a stationary fish [22]. Only magnetic field data were included in the data likelihood model in order to assess the geolocation without the influence of the more-precise depth information. The best-performing data likelihood model from the simulation and the same fixed parameters (grid size, diffusion, etc.) were applied to data from all 5 stationary tags. A residence distribution that summarizes the estimated probability of location over the entire deployment period was calculated for each tag. Daily positions were calculated using the weighted mean probability of all grid cells in the study area on each day and mean absolute error (MAE) between the mooring location and estimated daily locations was calculated. Polygons representing 99% of the probability distribution on each day were created and the area of each polygon calculated.

Coarse-scale map accuracy
Coarse-scale map error (the difference between values measured by the vessel and values predicted by the coarse-scale map) varied by magnetic anomaly value. Map error was generally low for grid cell anomaly values between − 150 and 150 nT (Fig. 5a). There were some exceptions within this range, including several large negative values measured near the Bartlett Cove fuel dock. However, both positive and negative differences began to increase at anomaly values greater than 150 nT, and at anomaly values greater than approximately 400 nT, differences were mostly positive (i.e., measured values tended to be greater than mapped values in grid cells with higher anomaly values).
The GAM for rms map difference vs. anomaly magnitude was significantly different from linear (smooth term: p < 2e−16, 4.13 estimated degrees of freedom; the basis dimension, k, was limited to 6 to prevent overfitting) and explained 34% of the deviance. The fitted model was approximately flat for anomaly values between − 150 nT and 150 nT, began to increase above 150 nT, and began to level off at anomaly magnitudes of approximately 300 nT (Fig. 5b). Based on these results, we developed three categories of variance for use in the anomaly method of variance specification by calculating the 80% quantile value of rms map differences in three categories: − 150-150 nT, 150-300 nT, and > 300 nT. These quantile values were 110, 191, and 347 nT, respectively. The variance for the constant method of variance specification, 165 nT, was obtained from the 80% quantile of all rms map differences regardless of anomaly magnitude. This quantile value was chosen to represent the majority of observed rms values without being unduly influenced by higher rms values that were sometimes observed.
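The quantile and category steps can be sketched in Python as follows. The three category values are taken from the text. The quantile definition (linear interpolation, the `inclusive` method of `statistics.quantiles`) is an assumption, as is the decision to raise an error for anomalies below −150 nT, for which the text gives no category. The function names are hypothetical.

```python
import statistics


def quantile_80(rms_values):
    """80% quantile of rms map differences (inclusive, interpolated definition)."""
    return statistics.quantiles(rms_values, n=5, method="inclusive")[3]


def anomaly_sd_nt(anomaly_nt):
    """Standard deviation (nT) assigned by the anomaly method of variance specification."""
    if -150.0 <= anomaly_nt <= 150.0:
        return 110.0
    if 150.0 < anomaly_nt <= 300.0:
        return 191.0
    if anomaly_nt > 300.0:
        return 347.0
    # Anomalies below -150 nT are not covered by the categories in the text.
    raise ValueError("no variance category given for anomalies below -150 nT")


# Constant method: one value for every model grid cell.
CONSTANT_SD_NT = 165.0
```

Applying `quantile_80` to the rms differences within each anomaly category would yield the category values, and applying it to all rms differences would yield the constant value.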

Model performance
Including geomagnetic data increased model performance compared to the depth-only likelihood model for both fine-scale and coarse-scale magnetic maps, but only for certain tag resolutions and variance specification methods (Fig. 6, Table 2). Overall, the likelihood treatments that featured the fine-scale magnetic map had the greatest performance increases compared to depth-only when tag resolution was high or very high. The anomaly method of variance specification performed better than the aggregated method for likelihoods based on the fine-scale map. The likelihood treatments that featured the coarse-scale magnetic map performed better than depth alone as long as variance specification was based on map error (i.e., either the anomaly or constant method of variance specification) and tag resolution was medium or higher. The constant method of variance specification had better performance than the anomaly method, and performance of this treatment was similar to the fine-scale map treatments for medium and high tag resolutions. For the low-resolution tags, the coarse-scale map with the constant method of variance performed better than either of the two fine-scale map treatments. However, the coarse-scale map with variance specifications based on roughness and slope had much poorer performance compared to depth alone.
Within each likelihood treatment, tag resolution had a strong effect on tag performance. Performance increased as tag magnetic field resolution increased for both fine-scale map treatments and coarse-scale map treatments with the anomaly and constant methods of variance specification. However, a trend toward better performance with increasing tag resolution was not observed for the coarse-scale map with the roughness or slope methods of variance specification.

Geomagnetic archival tag resolution
All five archival tags on the stationary mooring exhibited temporal patterns in detailed (4-min interval) and daily average total magnetic field measurements that were similar to each other, but not related to fluctuation in the magnetic field from solar storms (Fig. 7). The total magnetic field measured by the SMO rarely varied by more than several hundred nT throughout the tag deployment period, though several solar storms produced variations of greater magnitudes. The range of 1-min values measured by the SMO during the deployment period (1191 nT) was smaller than ranges observed for detailed magnetic field measurements from the stationary tags (Table 3). Tag DS-2 had the smallest range of detailed measurements (1496 nT) and tag SO-2 the largest (9828 nT) over the deployment period. The lowest magnetic field values recorded by SMO during the deployment period occurred during a solar storm on February 20, 2014. This storm produced fluctuations in the magnetic field with a range of 1081 nT over a 24-h period. Changes in total magnetic field values for some tags during this time were similar to the pattern of the solar storm (Fig. 8; note that tag SO-2 was not included due to its much larger range of variation), and the lowest measurement recorded in the course of the entire deployment period for one of the tags (DS-1) occurred in conjunction with this storm. However, in general the daily variations in tag measurements were larger than the range of the storm and similar patterns were observed in tag data when no storms were evident in the SMO data. After a visual inspection of weekly data for all tags, we concluded that solar storm patterns could not be distinguished reliably in the stationary archival tag data.
[Figure/table legend, partial] Fine-scale and coarse-scale indicate which map was used for the treatment, and variance specification method is indicated by "aggregated" (Fig. 4a), "anomaly" (Fig. 4d), "constant" (165 nT for each grid cell), "roughness" (Fig. 4b), and "slope" (Fig. 4c)
All five stationary tags recorded daily changes in the magnetic field that appeared to be much more strongly related to tidal action than to solar storms (Fig. 9a). Oscillations in the magnetic field measurements were clearly related to tidal oscillations in depth (Fig. 9b) and were greater during flood tides compared to neap. For example, total magnetic field values for tag DS-3 (Fig. 9b) tended to increase sharply when the tidal cycle switched from a slack high tide to an outgoing tide, then decreased steadily during the following incoming tide. Changes in tag orientation that could result in distortion of the magnetic field (Additional file 2) were also associated with tidal patterns (Fig. 9c). However, restricting magnetic field data to only those records collected when the tag was in the same orientation did not remove the oscillating magnetic field values associated with tidal variations, though it did dampen them considerably (Additional file 2: Figure S2-3). A more detailed description of factors that could produce artifacts in magnetic field values measured by the tags on the stationary mooring is available in Additional file 2.
Geomagnetic tag measurement resolution of daily means varied within and between manufacturers (Figs. 7, 10). In general, Desert Star tags were more precise than Star Oddi tags. Standard deviation of daily means for Desert Star tags ranged from 159 to 303 nT compared to 543 to 2601 nT for Star Oddi tags (Table 3). In comparison to archival tag measurements, the standard deviation of daily means recorded by the SMO was much smaller (26.2 nT). Tags also varied in accuracy (difference from the known value at the mooring location; Table 3, Fig. 10), and all tags except SO-1 exhibited significant measurement bias. For comparison with the tag resolution levels used to generate likelihoods for the geolocation simulation studies, Desert Star tag resolution would be consistent with low (DS-1 and DS-3) to medium (DS-2) resolution (Fig. 10). Star Oddi tag resolution ranged from low (SO-1) to so low that the data would presumably be unsuitable for geolocation (SO-2).
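The precision and bias summaries described above can be illustrated with a minimal Python sketch. It assumes that precision is the sample standard deviation of daily means, that bias is the median difference from the mapped value, and that a tag is assigned to the narrowest simulated resolution window (± 150, 300, 500, or 1000 nT) that contains all of its daily differences. The function names and input values are hypothetical, and the significance test for bias is not reproduced here.

```python
import statistics

# Half-widths (nT) of the four simulated tag resolutions, narrowest first.
RESOLUTION_WINDOWS_NT = (("very high", 150), ("high", 300), ("medium", 500), ("low", 1000))


def tag_resolution_summary(daily_means_nt, mapped_value_nt):
    """Summarise precision and bias of a stationary tag from its daily means (nT)."""
    diffs = [m - mapped_value_nt for m in daily_means_nt]
    return {
        "sd_daily_means": statistics.stdev(daily_means_nt),  # precision
        "median_difference": statistics.median(diffs),       # bias relative to the map
        "max_abs_difference": max(abs(d) for d in diffs),
    }


def resolution_class(max_abs_difference_nt):
    """Narrowest simulated window that contains every daily difference (assumed rule)."""
    for label, half_width in RESOLUTION_WINDOWS_NT:
        if max_abs_difference_nt <= half_width:
            return label
    return "below low"


# Hypothetical daily means (nT) for one tag and a hypothetical mapped value.
summary = tag_resolution_summary([55900, 56100, 56350, 55800], mapped_value_nt=56000)
print(summary, resolution_class(summary["max_abs_difference"]))
```

Under this assumed rule, a single extreme daily mean is enough to move a tag into a coarser class, which is one reason the classification of real tags should be read from the full histogram rather than from a summary statistic alone.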
Four of the five stationary archival tags provided HMM-estimated locations from stationary tag magnetic field data (Fig. 11). The data from tag SO-2 were too extreme to be found in the study area, thus the model could not function. The data would allow for estimation if a larger study area were used, however. Model reconstructed pathways for all tags appeared to wander around the study area, with the mooring location found at the edge of the estimated residency distributions of all four tags. Mean absolute error (MAE) of estimated locations, where lower values indicate higher accuracy, was similar for tags SO-1, DS-2, and DS-3 but much larger for tag DS-1 (Table 3). The mean size of daily 99% probability polygons (where lower values indicate higher precision) ranged from approximately 200 to 350 km².
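As a rough illustration of the accuracy metric, the sketch below computes MAE as the mean straight-line distance between the known mooring position and each daily estimated position. It assumes positions are given as projected (x, y) coordinates in kilometres; the function name and coordinates are hypothetical and the calculation used in the study may differ in detail.

```python
import math


def mean_absolute_error_km(estimates_xy_km, mooring_xy_km):
    """Mean straight-line distance (km) between daily estimates and the mooring."""
    mx, my = mooring_xy_km
    distances = [math.hypot(x - mx, y - my) for x, y in estimates_xy_km]
    return sum(distances) / len(distances)


# Hypothetical daily position estimates relative to a mooring at the origin.
daily_estimates = [(3.0, 4.0), (0.0, 0.0), (6.0, 8.0)]
print(mean_absolute_error_km(daily_estimates, (0.0, 0.0)))  # 5.0
```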

Geomagnetic geolocation with the HMM
Our simulations suggest that, despite the presence of geomagnetic anomalies in the region, geomagnetic data could improve the geolocation of demersal fishes in the North Pacific Ocean when combined with depth data in a hidden Markov model framework. However, the degree of potential improvement depends on the resolution and accuracy of both geomagnetic archival tags and magnetic field maps available for specific study areas as well as the variance specification [...] poor-quality maps, or mis-specified data likelihood models are employed. On the other hand, if fine-scale maps are available for a given study area, as they are for the Glacier Bay study site, considerable improvement in geolocation over depth alone would be expected from tags with high or very high resolution. In regions where local magnetic anomalies exist, the use of the hidden Markov model framework that can explicitly account for magnetic anomalies may be preferable to existing methods that rely solely on main field gradients. For example, when using a method that intersects latitude derived from the Earth's main field with longitude derived from light intensity, large geolocation errors were observed in the Galapagos region where local magnetic anomalies are prominent [12]. This method is unsuitable for demersal fishes in the North Pacific Ocean even in the absence of the errors caused by local magnetic anomalies because it requires light-based longitude data that are either too sporadic or non-existent [11]. For the method of intersecting horizontal and vertical gradients of the main field [13], problems would be anticipated in anomaly areas because large-scale anomaly maps are only available for the total magnetic field, not for horizontal and vertical components, separately.
The HMM framework proposed here that explicitly incorporates magnetic anomalies has advantages beyond accounting for map error. For data likelihood models based on depth alone, increased bathymetric heterogeneity and small-scale depth gradients can improve geolocation as long as grid size is small enough to satisfy the assumption of a normal distribution in each grid cell [10]. Therefore, the increased study area heterogeneity (e.g., Fig. 1b, areas 2 and 4) caused by magnetic anomalies could also improve geolocation performance compared to non-anomaly areas. In addition, magnetic anomalies can have large-scale patterns such as the alternating swaths of positive and negative anomalies associated with seafloor spreading [16] in the Gulf of Alaska (e.g., Fig. 1b, areas 1 and 3). Therefore, magnetic field gradients at larger spatial scales may also be stronger in anomaly areas compared to non-anomaly areas, and geolocation performance would likely increase for fish that move perpendicular to those gradients (e.g., east-west movement in area 1 or north-south movement in area 3, Fig. 1b).
Table 3 In situ geomagnetic archival tag resolution. Tag mooring depth, offset that links tag measured total magnetic field values to mapped value at mooring location, range of detailed (4-min resolution) total magnetic field measurements, standard deviation of total magnetic field daily means, median difference between daily means and mapped value, Wilcoxon rank sums p-value for detecting bias, mean absolute error (MAE) between mooring and estimated locations from hidden Markov model (HMM), and mean size of daily error estimates from HMM. No HMM results were available for tag SO-2 because values measured were outside of the range of values in the study area. Column headers: Tag ID, Depth (m), Magnetic offset (nT), Range detailed (nT).

Geomagnetic anomaly maps
The similarities between the fine-scale and coarse-scale magnetic field maps in this study were encouraging for the use of the coarse-scale map (IGRF + NAMAG) over large areas. The key differences between map scales were (1) the magnitude of anomalies tended to be lower in the coarse-scale map, and (2) man-made structures, such as the fuel dock at Bartlett Cove, were not included in the coarse-scale map. The differences in the magnitude of anomaly values can be addressed by a data likelihood model that specifies a larger variance for grid cells with larger anomaly values (e.g., Fig. 4d). However, the presence of man-made structures is more difficult to address and may be an important source of map error. Man-made structures such as offshore petroleum platforms, wind farms, and shipwrecks can have strong magnetic field signatures due to steel structural components and electromagnetic emissions. For example, shipwrecks can produce magnetic anomalies of 10,000 nT or more [25]. Both demersal and pelagic fish species can have increased abundance in the vicinity of these structures [26,27] and behaviors such as site fidelity and homing at scales of less than 50 m to these structures have been observed [28][29][30][31]. Therefore, if a specific study area is known to contain major man-made structures that could attract tagged fish, efforts should be made to determine their typical magnitude and include that information on the magnetic field maps. Associations between tagged fish such as plaice, cod, and skates and man-made structures have been determined based on grid cells that represent the presence of structures such as shipwrecks and undersea cables [32], so perhaps such maps could be extended or augmented to represent potential differences between measured and mapped values that could occur in different parts of the study area.
Fig. 9 Variation in total magnetic field measurements recorded by archival tag DS-3 during a period of solar storm activity (magnitude 1000 nT) in Glacier Bay National Park, Alaska, USA. a Detailed (4-min resolution) total magnetic field measurements (black line) from tags compared to 1-min resolution total magnetic field data from the Sitka Magnetic Observatory (pink line; offset by − 600 nT to allow visual comparison). b Detailed total magnetic field measurements (black line) from tags compared to depth (blue line) reflect a distinct tidal pattern in magnetic field data. c Detailed total magnetic field measurements (black line) compared to tag orientation (tri-axial acceleration) over the time period, where orientation along the X-axis is shown in red, Y-axis in blue, and Z (vertical) axis in green.
Fig. 10 Histograms of the difference between the daily magnetic field measurements recorded by the five moored archival tags over the course of the 8-month deployment and the value of the fine-scale magnetic field map at the mooring location in Glacier Bay National Park, Alaska, USA. Colored polygons indicate range of tag resolutions used to define likelihoods for four levels of tag resolution in fish movement trajectories simulated via hidden Markov modeling. For example, to calculate the likelihood using the lowest tag resolution (blue), grid cell magnetic field probability density is integrated by limits of the daily measurement ± 1000 nT. Medium resolution (green) integrates the cell probability by ± 500 nT, high resolution (orange) by ± 300 nT, and very high resolution (red) by ± 150 nT. DS-1 would be considered a low-resolution tag because the histogram falls within the blue polygon, whereas DS-2 would be equivalent to a medium-resolution tag in the simulations because the histogram falls within the green polygon.
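The likelihood integration described in the Fig. 10 caption can be sketched in Python as follows. This is a minimal illustration that assumes the magnetic field in each grid cell follows a normal density with a cell-specific mean and standard deviation; the 165 nT default is borrowed from the "constant" variance treatment, and the function names and input values are hypothetical rather than taken from the study's code.

```python
import math

# Half-widths (nT) of the four simulated tag resolutions.
RESOLUTIONS_NT = {"low": 1000, "medium": 500, "high": 300, "very high": 150}


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal density."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def magnetic_likelihood(measured_nt, cell_mean_nt, resolution_nt, cell_sd_nt=165.0):
    """Integrate the cell's (assumed normal) density from measured - r to measured + r."""
    upper = normal_cdf(measured_nt + resolution_nt, cell_mean_nt, cell_sd_nt)
    lower = normal_cdf(measured_nt - resolution_nt, cell_mean_nt, cell_sd_nt)
    return upper - lower


# Hypothetical daily measurement that is 400 nT above a cell's mapped value.
for label, half_width in RESOLUTIONS_NT.items():
    print(label, magnetic_likelihood(56400.0, 56000.0, half_width))
```

In this sketch a wide integration window assigns a high likelihood to many cells, so a low-resolution tag discriminates poorly among locations, whereas a narrow window penalizes cells whose mapped value differs from the measurement.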
Knowledge of magnetic anomaly magnitudes associated with man-made structures such as bridges could greatly improve geolocation for species that occupy certain bays and estuaries or that migrate along river corridors [33]. One potential challenge that could accompany the use of the NAMAG anomaly map is the gaps in coverage (Fig. 1b). These gaps could be filled with information from the EMAG2v3 anomaly grid, which has worldwide coverage at a scale of 2 arc-minutes [15]. In addition, the Enhanced Magnetic Model (EMM) combines main field and anomaly data to spatial scales of approximately 50 km [34]. Future studies should test model performance with these additional sources of magnetic field maps that also explicitly include anomaly information.

Geomagnetic variance specification methods
The differences in performance for different methods of variance specification were striking, particularly for the coarse-scale map. The data likelihood treatments that were based on the roughness and slope methods performed much worse than those based on depth alone. This result may be due to the smaller values of variance produced by these methods compared to the methods that were based on map error. This is an important consideration for combining multiple types of geolocation data, and it may be advisable to set higher values of variance for geolocation variables that have low gradients and low map accuracy.

Geomagnetic archival tag resolution and accuracy
Although in general the addition of geomagnetic data improved model performance, the simulation results suggest that tag resolution plays an important role in the magnitude of the improvement. The lowest resolution (± 1000 nT) did not result in performance improvement, and in one case the performance was worse than using depth alone. However, the range of anomaly values in the study area was 2000 nT, so it is possible that performance may be improved even for the lowest resolution tags in areas where gradient strength is more than twice the tag resolution. In addition, the heterogeneity of depth in the study area is greater than in other locations in the North Pacific Ocean (e.g., along shallowly sloping expanses of continental shelf habitat), so low-resolution tags could still improve geolocation over depth alone in areas where magnetic field gradients are stronger than depth gradients. The simulation results also suggest that very-high-resolution tags may not improve geolocation if coarse maps are used. This is an important point from the standpoint of tag manufacturing and tag expense, as high-resolution tags are more difficult to produce and would therefore cost more. In addition, a great deal of magnetic field noise (e.g., solar fluctuation) occurs below a level of approximately ± 100 nT [16], so producing a tag with resolution greater than this would not be expected to result in further improvement of geolocation unless the fluctuations can be taken into account.

In situ geomagnetic archival tag performance
Our results from the five geomagnetic archival tags deployed on a stationary mooring suggest several implications for geolocation performance. First, magnetic field measurement artifacts that could have been related to some aspect of attachment to the mooring line resulted in gradual increases or decreases in daily magnetic field values relative to the known value at the mooring location. Tags deployed on land that were rigidly fixed to one orientation did not exhibit temporal variation that is tidal in nature; instead, they recorded changes consistent with temporal fluctuations measured by observatories (Additional file 1: Figure S1-3) or changes in temperature (Additional file 2: Figure S2-4). However, given the observed changes in recorded magnetic field values resulting from changes in tag orientation (Additional file 2: Figure S2-1, and discussion below), it seems likely that slight changes in tag orientation on the mooring line either with tidal action or other physical action on the mooring line over time are responsible for the tag measurements that differ markedly from observatory (SMO) data. Because the patterns in the daily means varied at much larger time scales than tidal action (e.g., for weeks the measured value would be lower than the known value in the study area, then become higher for weeks), such temporal patterns could be mistaken for tag movement in a data set not known to be from a stationary tag. For example, extended periods of time when measured values were lower than mapped values resulted in apparent tag movement to a region of low values for tag DS-1 (Fig. 11). However, similar periods of time when recorded values were higher than the known value at the mooring location for tags SO-1 and DS-3 did not result in as much error because positive anomaly areas were much closer to the mooring location.
The sub-daily patterns in total magnetic field data recorded by all five of the archival tags on the stationary mooring are likely related to a change in orientation or aspect of the mooring line during tide changes. The magnetic field sensors in the tags are vulnerable to a host of magnetic field distortions that can cause a change in recorded magnetic field values when the tag is rotated (Additional file 2: Figure S2-1). These include hard and soft iron effects from other components in the tags, such as batteries, or errors in sensor alignment [35,36] for which neither type of tag was calibrated. Although magnetic field sensors are also sensitive to temperature change (Additional file 2: Figure S2-4), the change in temperature associated with changing tides was not great enough to explain the magnitude of daily variation in magnetic field values that was observed. The motion of ions in seawater is known to produce a magnetic field in coastal areas and could perhaps produce a tidal signal in the magnetic field data, but the magnitude of magnetic fields caused by tides is typically less than 100 nT [16].
Second, the requirement for calculating an offset based on linking tag measurements to mapped values on the day of release and/or recovery can lead to tag bias that can have adverse effects on geolocation performance. In our case, the first daily mean was not necessarily representative of the true offset that could be applied with perfect knowledge of the entire data set (e.g., the mean of all daily means). This is a concern for interpretation of magnetic data from tagged fishes, as only two points in time are presumed known (tag release and tag recovery locations). Thus, the bias results reported for the tags are somewhat arbitrary, but they point to a potential systemic source of error in tag measurement that could be eliminated if the tags were to record absolute rather than relative magnitudes. In anomaly areas, an additional bias may occur due to assigning the wrong mapped value, which would then propagate the error throughout the entire data set. To ensure that this does not happen, an accurate and precise measurement of the magnetic field could be obtained at the release location.
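The effect of the offset choice can be shown with a small worked example in Python. The sketch assumes the tag records relative values and that the offset is simply the mapped value minus a reference daily mean; the function names and numbers are hypothetical and chosen only to show how an unrepresentative first day carries a bias through the whole record.

```python
import statistics


def offset_from_release_day(daily_means, mapped_value_nt):
    """Offset that matches the first daily mean to the mapped value (assumed convention)."""
    return mapped_value_nt - daily_means[0]


def offset_from_all_days(daily_means, mapped_value_nt):
    """Offset available only with full knowledge of a stationary record."""
    return mapped_value_nt - statistics.mean(daily_means)


def residual_bias(daily_means, mapped_value_nt, offset_nt):
    """Median difference between offset-corrected daily means and the mapped value."""
    differences = [m + offset_nt - mapped_value_nt for m in daily_means]
    return statistics.median(differences)


# Hypothetical relative daily means; the first day is unusually high.
raw = [1200, 1000, 1100, 900, 800]
mapped = 56000
print(residual_bias(raw, mapped, offset_from_release_day(raw, mapped)))  # -200
print(residual_bias(raw, mapped, offset_from_all_days(raw, mapped)))     # 0
```

For a tag on a fish only the release-day (and possibly recovery-day) offset is available, so the second calculation serves here only as a reference for how much bias the first one introduces.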
Third, we found substantial variation in precision and accuracy among tags. This is a concern because the geomagnetic data likelihood specification relies on quantification of in situ tag resolution values from tags known to be stationary. To ensure that the value used will represent all possible tag resolutions for tags deployed on fish, the lowest resolution observed for stationary tags should be used to specify the likelihood. In this case, tags that have higher resolution than other tags, such as DS-2, would be penalized. The poor performance of SO-2 is troubling in this context, as it was not suitable for geolocation in the study area (values were much lower than values in the study area). An alternative approach would be to test tag precision prior to deployment so that specific resolutions could be used to specify a resolution for each tag. However, the issue of changing magnetic field values with tag orientation should be solved before users can obtain accurate resolution values during pre-deployment testing.

Caveats
Although this research has provided a basis for considering the potential utility of geomagnetic geolocation in the North Pacific Ocean, several caveats should be mentioned. First, our study site was small in comparison to the scales of movement that would be expected for demersal fishes over long time periods. However, because large-scale movements are composed of a series of daily movements, geolocation models that perform well at scales of daily movement should also perform well over longer time scales. Our small study area allowed a mechanistic understanding of the characteristics of local magnetic anomaly areas and corresponding insights into ways to specify a geomagnetic data likelihood model that accounts for them, and to consider and contextualize relative tag performance within that framework, and we expect our results to be applicable over larger scales in space and time.
Second, the use of the same value of diffusion for the HMM movement model as was used to simulate the trajectories likely led to better performance than if the diffusion coefficient was estimated by the model or from the literature on fish behavior. In this study, we decided to hold the diffusion constant so that differences in performance could be attributed to data likelihood treatments or tag resolution. However, the sensitivity of the HMM to different values of diffusion for different applications should be investigated.
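To make the role of the diffusion coefficient concrete, the following Python sketch shows one predict-and-update step of a grid-based hidden Markov filter on a one-dimensional grid. It is a simplified illustration under stated assumptions (a Gaussian movement kernel with variance 2Dt, probability mass clamped at the grid edges, hypothetical function names and values) and is not the two-dimensional model used in this study.

```python
import math


def diffusion_kernel(diffusion_km2_per_day, cell_size_km, half_width_cells):
    """Discrete, normalised Gaussian kernel for one day of diffusive movement."""
    sd_km = math.sqrt(2.0 * diffusion_km2_per_day)  # 1-D displacement SD after one day
    weights = [math.exp(-0.5 * ((k * cell_size_km) / sd_km) ** 2)
               for k in range(-half_width_cells, half_width_cells + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def predict(prior, kernel):
    """Movement step: spread yesterday's probabilities with the diffusion kernel."""
    n, half = len(prior), len(kernel) // 2
    out = [0.0] * n
    for i, p in enumerate(prior):
        for k, w in enumerate(kernel):
            j = min(max(i + k - half, 0), n - 1)  # mass leaving the grid stays in the edge cell
            out[j] += p * w
    return out


def update(predicted, likelihood):
    """Data step: weight each cell by the data likelihood and renormalise."""
    posterior = [p * l for p, l in zip(predicted, likelihood)]
    total = sum(posterior)
    if total == 0.0:
        # Measurement is incompatible with every cell in the study area.
        raise ValueError("observation has zero likelihood in every grid cell")
    return [x / total for x in posterior]


# Hypothetical example: fish known to be in cell 3, then one day of data.
kernel = diffusion_kernel(diffusion_km2_per_day=2.0, cell_size_km=1.0, half_width_cells=3)
predicted = predict([0, 0, 0, 1, 0, 0, 0], kernel)
posterior = update(predicted, [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0])
print(posterior)
```

A larger diffusion value widens the kernel and lets the data likelihood dominate, whereas a smaller value keeps estimates close to the previous day's position; this is why a mismatch between the simulated and assumed diffusion could change model performance.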
Third, our fine-scale map of the study area may contain errors and may not fully represent magnetic field values that would be measured by a tag attached to a demersal fish in the study area. The fine-scale magnetic field data used to create the map were collected as part of acoustic tracking trips for tagged fish, so the spatial and temporal distribution of survey effort to collect the data was not ideal for producing a high-resolution, comprehensive magnetic field map (Additional file 1). We feel the map is sufficiently accurate to represent the main features of the fine-scale anomalies for demonstrative purposes, but it is possible that the distributions of magnetic field values in the 1-km aggregated model grid cells are more skewed than suggested by our fine-scale map. Further, our magnetic data were collected at sea level, but in anomaly areas values could be much higher on the seafloor where demersal fish are located. Thus, important work remains to compare magnetic field values at the seafloor to values at sea level as part of the validation of geomagnetic geolocation for demersal species.
Fourth, the archival tags deployed on the stationary moorings were early versions of magnetic archival tags available from each manufacturer, and current versions may perform better than the results reported in our research. The Desert Star tags were not factory-calibrated with the batteries on them, so calibration of the tag with the battery section attached was performed manually prior to deployment and it is possible that errors in calibration were introduced in this step. Therefore, our findings of in situ tag resolution and accuracy should not be taken as representative of current tag models. Instead, the tag resolution and