The development and maintenance of these e-infrastructures require significant economic investment and a coordinated effort to address the many political, technical, and funding challenges. A key step in the growth and sustainability of these information systems, and their capacity to support the research community, is the ease with which data can be transferred from device to repository. The assortment of sensors now available, combined with the range of methods for the automated download of these data (ARGOS, Bluetooth, email, Iridium, Globalstar, GSM, SMS) and the sheer number of companies producing the devices, is resulting in many different attribute names, definitions, units, and file formats for reporting the same types of data. For example, different manufacturers currently report the time and date in numerous disparate formats (e.g. dd/MM/yyyy h:mm; h:mm:ss.s MM.dd.yyyy; yyyy.MM.dd hh:mm:ss; yyyy-MM-dd hh:mm:ss.s, etc.). In some cases, data files and documentation provide no reference to the time zone, even when allowing researchers to choose (and sometimes change on the fly) the time zone in which their data are provided. In addition, data generated by individual manufacturers can change over time, leading to outputs that misleadingly appear to be in the same format; for example, the development of a new device model or software version that changes the units for reporting instantaneous ground speed without any related change in data output files. Such changes may or may not be discovered during import to an e-infrastructure and data analysis. The quality of metadata describing data output varies widely: whilst some manufacturers provide thorough and up-to-date manuals, data for many devices lack complete written documentation that defines variables, units, accuracy/precision, etc., and includes a history of changes.
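To make the timestamp problem concrete, the following minimal Python sketch shows the kind of per-manufacturer import logic a repository currently has to maintain. The vendor labels (`vendor_a` to `vendor_d`), the function name `to_utc_iso`, and the `utc_offset_hours` argument are illustrative assumptions, not part of any existing e-infrastructure; the four patterns are Python equivalents of the formats listed above.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical vendor labels mapped to strptime patterns that correspond
# to the four date/time formats quoted in the text.
VENDOR_FORMATS = {
    "vendor_a": "%d/%m/%Y %H:%M",        # dd/MM/yyyy h:mm
    "vendor_b": "%H:%M:%S.%f %m.%d.%Y",  # h:mm:ss.s MM.dd.yyyy
    "vendor_c": "%Y.%m.%d %H:%M:%S",     # yyyy.MM.dd hh:mm:ss
    "vendor_d": "%Y-%m-%d %H:%M:%S.%f",  # yyyy-MM-dd hh:mm:ss.s
}


def to_utc_iso(raw, vendor, utc_offset_hours):
    """Parse a vendor timestamp and return it as an ISO 8601 string in UTC.

    The offset must be supplied by the researcher (an assumption of this
    sketch), because the files themselves often do not state the time zone.
    """
    local = datetime.strptime(raw.strip(), VENDOR_FORMATS[vendor])
    tz = timezone(timedelta(hours=utc_offset_hours))
    return local.replace(tzinfo=tz).astimezone(timezone.utc).isoformat()


print(to_utc_iso("03/04/2015 7:05", "vendor_a", 2))
# -> 2015-04-03T05:05:00+00:00
print(to_utc_iso("7:05:09.5 04.03.2015", "vendor_b", 0))
# -> 2015-04-03T07:05:09.500000+00:00
```

Note that the two example strings describe the same calendar day (3 April 2015) even though the day and month appear in opposite order; nothing in the data distinguishes the two conventions, so the correct pattern can only be chosen from external documentation.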
The resulting ambiguities in how data, and even the same variables, are reported make it almost impossible for data upload into the repositories to be a fully automated process and are creating serious challenges for those wishing to store and integrate data from multiple telemetry devices within a single database. All existing e-infrastructures are severely affected by this problem because the creation of dedicated import procedures, each tailored to the device, manufacturer, and model, is a time-consuming and error-prone task that also requires continual monitoring and updating as new models and manufacturers enter the research arena. The current approach is expensive and will become increasingly infeasible if manufacturer proliferation and device diversification continue at the current rate without globally adopted data standards.
Here, we request that the device manufacturers and scientific community take steps to develop and instigate standards for the reporting and documentation of data collected by a growing range of animal-borne telemetry devices. We understand that sensors differ in design and purpose; however, we believe that most of the scientifically relevant data collected can be described by a finite set of variables. When stored along with information about the sensor (manufacturer, model, etc.), shared data standards can allow data from a wide range of sensors to be properly archived and analysed together. In particular, we ask that manufacturers and the research community design and implement:
Standard variable names, definitions, data types, and units for commonly used data attributes.
A standard and documented file format, such as an Extensible Markup Language (XML) schema, as an option for receiving data. There are already widely accepted standards for reporting many of the variables most commonly used by biotelemetry devices (for example, providing timestamps in Coordinated Universal Time (UTC) and geographic coordinates in the WGS84 reference system). Several standards already exist to describe ecological and geospatial data that can be used as a preliminary reference, for example: Darwin Core and ABCD from TDWG for biological collections; Ecological Metadata Language (EML) from the Knowledge Network for Biocomplexity; the US Federal Geographic Data Committee’s (FGDC) standards for geographic information (with an extension for biological information); ISO 19115 from the International Organization for Standardization for geographic information; and GPX (GPS Exchange Format).
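As an illustration of the two requests above, the sketch below maps one manufacturer-specific row onto a shared record with fixed variable names, data types, and units. All field names (`timestamp_utc`, `latitude_wgs84`, `ground_speed_m_s`, etc.), the `Fix` class, and the `standardize` helper are hypothetical and chosen for this example only; they are not drawn from Darwin Core, EML, or any other existing standard.

```python
from dataclasses import dataclass

# Conversion factors from commonly reported speed units to metres per second.
SPEED_TO_M_S = {"m/s": 1.0, "km/h": 1000 / 3600, "knots": 1852 / 3600}


@dataclass(frozen=True)
class Fix:
    """One location record using hypothetical standard names and units."""
    timestamp_utc: str        # ISO 8601 string, always in UTC
    latitude_wgs84: float     # decimal degrees, WGS84
    longitude_wgs84: float    # decimal degrees, WGS84
    ground_speed_m_s: float   # metres per second
    manufacturer: str
    model: str


def standardize(row, mapping, speed_unit, manufacturer, model):
    """Translate a vendor row into a Fix.

    `mapping` gives, for each standard attribute, the vendor's column name.
    """
    lat = float(row[mapping["latitude"]])
    lon = float(row[mapping["longitude"]])
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError(f"coordinates out of range: {lat}, {lon}")
    speed = float(row[mapping["ground_speed"]]) * SPEED_TO_M_S[speed_unit]
    return Fix(row[mapping["timestamp"]], lat, lon, speed, manufacturer, model)


# Example vendor row; column names and values are invented for illustration.
vendor_row = {"Date": "2015-04-03T05:05:00+00:00", "Lat": "46.07",
              "Lon": "11.12", "Spd": "36"}
vendor_mapping = {"timestamp": "Date", "latitude": "Lat",
                  "longitude": "Lon", "ground_speed": "Spd"}
fix = standardize(vendor_row, vendor_mapping, "km/h", "ExampleCo", "X1")
print(fix)
```

Here the reported speed of 36 km/h becomes approximately 10 m/s. The mapping and the speed unit have to be supplied by hand for every device, model, and software version; with agreed names and units delivered by the manufacturer, that per-device step would no longer be needed.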
As a first step, we also request that individual device manufacturers, if they have not done so already, provide complete written documentation (metadata) for all variables and formats currently used by their devices, including, where needed, a history of past changes and a description of how data differ between devices, user-specific preferences, data access methods, or file formats. Such documentation does not by itself allow for interoperability or automated integration of data into repositories, but it would minimize archiving errors and enable future implementation of the standards above with historical data.
The definition and adoption of such standards by all manufacturers would dramatically simplify the data acquisition process, augmenting the willingness of biotelemetry researchers to archive their hard-won data collections; reduce data management costs, allowing e-infrastructures to focus on developing shared analysis tools; and minimize the risk of errors derived from data handling, enhancing data reusability. These improvements in how the community shares and reuses data would in turn benefit the device manufacturers by increasing the value of the data their devices collect to the wider scientific and natural resource management communities.
The definition and adoption of standards in the reporting of data collected by animal-borne electronic devices is a necessary first step towards a more general and comprehensive set of data standards that would enhance interoperability amongst the different e-infrastructures. The recently formed International Bio-Logging Society has made one of its goals to “standardize data protocols to make the various marine and terrestrial databases interoperable”. For the reasons outlined in this article, we urge that this be instigated at the level of device manufacture and propose a dedicated workshop at the next International Bio-Logging Science Symposium, inviting developers from all the animal biotelemetry e-infrastructures, as well as the leading device manufacturers and research scientists.