Application of a computer vision technique to animal-borne video data: extraction of head movement to understand sea turtles’ visual assessment of surroundings



Animal-borne video recording systems have recently been developed to study the behavior of free-ranging animals. In contrast to other types of sensor data (e.g., acceleration), video images offer the advantage of directly acquiring information without analysis. However, most previous findings have only been obtained through visual observation of image data. Here, we demonstrate a new method of data analysis for animal-borne videos using a computer vision technique referred to as template matching. As a case study, we tracked the horizontal head movements of green turtles (Chelonia mydas) to investigate how they move their heads to look around the underwater environment.


Template matching allowed tracking of head movements with high accuracy (root-mean-square errors of 0.34 ± 0.12 % and 0.52 ± 0.29 % on the x- and y-coordinates, respectively), a high true extraction rate (86.2 ± 8.1 %), and a low false extraction rate (6.6 ± 8.4 %). However, the program sometimes failed because the turtle’s head moved out of range of the video. During cruising swimming, green turtles did not significantly favor one side when moving their heads, with a left:right ratio of 50.5:49.5. Green turtles moved their heads from side to side more widely and more slowly before (12.0 ± 4.6 point and 0.25 ± 0.03 Hz, respectively) and after taking a breath (27.5 ± 2.9 point and 0.27 ± 0.03 Hz) compared to during cruising swimming (8.4 ± 3.8 point and 0.32 ± 0.01 Hz). Before feeding, turtles moved their heads slowly (0.23 ± 0.03 Hz) and narrowly (9.3 ± 3.6 point). Our combined approach using video and gyro loggers revealed that when making a turn, turtles always turned their heads to the side 1.38 ± 0.77 s before turning their body.


Our method enables researchers to quantitatively extract information regarding visual cognition and behavioral responses in green turtles in the wild that could not otherwise be obtained from other sensors used in previous studies. This new method using a combination of computer vision and bio-logging (e.g., gyroscope) can serve as a powerful tool in animal behavior and ecological studies.


In field studies, it has been difficult to measure how free-ranging animals recognize and respond to environmental stimuli, because such responses are hard to measure with devices in the wild. Therefore, most previous studies on the cognitive processes and behavior of animals have been conducted using well-trained animals within a restricted space, such as an arena or tank [1]. Biotelemetry and bio-logging techniques have since enabled researchers to understand behavioral responses to environmental stimuli such as temperature (e.g., [2, 3]) by investigating the relationship between movement and temperature data. However, it remains difficult to investigate directed behavioral responses to the surrounding environment, which cannot be measured by conventional sensors (e.g., accelerometers). Consequently, only a few studies, using novel sensors such as EEG [4], have attempted to determine how free-ranging animals respond to the environment in the wild.

In recent years, animal-borne video and still-image recording systems have been developed to study the behavior of free-ranging animals [5]. In contrast to other types of sensor data (e.g., acceleration), video and still images have the advantage of allowing for the collection of a large amount of directional information without analysis. This advantage allows researchers to obtain detailed information regarding wild animals, such as social behavior [6–8], prey items [9, 10], tool use [11], gas exchange [12], and habitat environment [13, 14]. However, most of these previous findings were obtained solely through visual observations of image data, although a few studies quantitatively analyzed image data through image processing [15, 16]. If the behavioral response of animals to the objects/environment on the video footage can be determined using animal-borne video data, we may be able to more deeply understand the cognition and decisions driving behaviors of animals in the wild. Such an analysis could rely on computer programs designed for motion detection or object recognition/tracking to quantitatively track objects in the footage that relate to the animals or their environment.

Various methods for motion detection or object tracking in video data (e.g., optical flow) have been developed in the field of computer vision [17, 18]. These techniques have been used to understand human cognition and behavior. One example is the use of these techniques in ‘life-log’ analyses; i.e., capturing personal behavior and experiences using a wearable video/camera system equipped with other sensors and computers on the human body [19, 20]. For example, camera and object motions in footage taken from a head-mounted video camera have been analyzed to investigate how humans pay attention to objects [19].

Here, we present a new method of data analysis for animal-borne video using template matching, one type of computer vision technique. As a case study, we tracked the horizontal head movements of green turtles (Chelonia mydas) to investigate how they move their heads to look around in the underwater environment. We assumed that turtles move their heads horizontally when looking to the right or left, particularly when they want to look at an object with binocular vision. This assumption is supported by the fact that green turtles have relatively clear horizontal vision among sea turtle species [21]. From the perspective of behavioral studies, vision is presumably of great importance for sea turtles when feeding, avoiding predators, and finding mates [8, 22, 23], although they also use hearing and chemoreception [24].


Animal-borne video data

The present study used animal-borne video data of six green sea turtles described as part of previously published studies [10, 12]. Video data were recorded in the waters of Iriomote Island, Okinawa, Japan (24°20′N, 123°50′E). The turtles were hand-captured by local fishermen with permission from the Fisheries Adjustment Commission of Okinawa Prefecture (Permission No. 23-2 and 24-4).

For the purposes of this study, animal-borne video data were recorded using video data loggers (GoPro HD®, image resolution: 1280 × 720 pixels, frame rate: 30 frames per second, Woodman Labs, CA, USA) with a custom-made waterproof case (Logical Product, Fukuoka, Japan). The video loggers were equipped with a Fast-loc GPS, depth tags (Mk10-F, Wildlife Computers, WA, USA), a VHF transmitter, and a time-scheduled release system. Two turtles, CM5 and CM6, were also equipped with a multi-sensor data logger incorporating depth, temperature, a three-axis accelerometer, a three-axis magnetometer, and three-axis gyroscope sensors (LP-KUBL1101, Logical Product, Fukuoka, Japan). Accelerometers measure the dynamic acceleration and tilt angle of the carapace. Magnetometer and gyroscope sensors measure the strength of geomagnetic field and angular velocity, respectively. These logger units were deployed on turtles’ second vertebral scute (Additional file 1). Turtles equipped with logger units were released at their capture point. The time-scheduled release mechanisms were programmed to activate 36–168 h after release, at which time an electric charge would incise the plastic cable. The logger units then detached from the turtles and floated to the sea surface. The video loggers were retrieved using radio telemetry. A detailed description of the experimental protocol is provided in Okuyama et al. [10]. In the present study, a total of 22.5 h (CM1-6: 5.0, 2.0, 5.0, 2.5, 4.0, and 4.0 h, respectively) of video data were used.

Template matching

The template matching technique was used to track and quantify the head movements of green turtles. Template matching is a digital image processing technique to identify small parts of an image that match a template image [17]. Here, the scale pattern on the top of a turtle’s head, which is the most characteristic part of the head, was chosen as the template (Fig. 1). Fourteen to nineteen templates, including images of the turtle facing various directions and at various depths, were manually prepared for each individual to track the head under any situation (Fig. 1). The number of templates was empirically determined to maximize accuracy of template matching for each turtle. The templates were defocused using Gaussian blur (GIMP software, The GIMP Development Team) to expand their versatility. The image stacks were then converted to grayscale. Similarity or dissimilarity between a part of an image and a template was evaluated using the zero-mean normalized cross-correlation (ZNCC). The ZNCC was calculated using the formula below:

$$ R_{ZNCC} = \frac{\sum_{j=0}^{N-1} \sum_{i=0}^{M-1} \left(I(i,j) - \bar{I}\right)\left(T(i,j) - \bar{T}\right)}{\sqrt{\sum_{j=0}^{N-1} \sum_{i=0}^{M-1} \left(I(i,j) - \bar{I}\right)^2 \times \sum_{j=0}^{N-1} \sum_{i=0}^{M-1} \left(T(i,j) - \bar{T}\right)^2}} $$
$$ \bar{T} = \frac{\sum_{j=0}^{N-1} \sum_{i=0}^{M-1} T(i,j)}{MN} $$
$$ \bar{I} = \frac{\sum_{j=0}^{N-1} \sum_{i=0}^{M-1} I(i,j)}{MN} $$

where i and j are the x- and y-coordinates of the image, respectively. T(i, j) and I(i, j) are the brightness values of the template and the image, respectively. M and N represent the width and height of the template in pixels, respectively. ZNCC values range from −1 to 1, with values close to 1 indicating that the image is very similar to the template. The ZNCC value of a given template was calculated for a given part of the image, and the calculation area was then shifted pixel by pixel. These calculations were applied for all templates, and the location of the template with the highest ZNCC value was taken as the head position in the image. Specifically, the intersection of the ‘Y-shaped’ scale pattern of the template was defined as the head position in the image.
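The ZNCC search described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' C++ program: the function names (`zncc`, `match_template`, `best_template_match`) are hypothetical, the returned coordinate is the top-left corner of the best window rather than the ‘Y-shaped’ intersection, and the exhaustive double loop favors clarity over speed.

```python
import numpy as np

def zncc(patch, template):
    """Zero-mean normalized cross-correlation of two equal-sized grayscale arrays."""
    p = patch.astype(np.float64)      # astype copies, so the inputs stay untouched
    t = template.astype(np.float64)
    p -= p.mean()                     # subtract the mean brightness (I-bar)
    t -= t.mean()                     # subtract the mean brightness (T-bar)
    denom = np.sqrt((p * p).sum() * (t * t).sum())
    if denom == 0.0:
        return 0.0                    # flat patch or template: treat as "no match"
    return float((p * t).sum() / denom)

def match_template(image, template):
    """Shift the template over the image pixel by pixel.

    Returns (best_score, (x, y)), where (x, y) is the top-left corner
    of the window with the highest ZNCC value.
    """
    n, m = template.shape             # N = height, M = width
    best_score, best_xy = -1.0, (0, 0)
    for y in range(image.shape[0] - n + 1):
        for x in range(image.shape[1] - m + 1):
            score = zncc(image[y:y + n, x:x + m], template)
            if score > best_score:
                best_score, best_xy = score, (x, y)
    return best_score, best_xy

def best_template_match(image, templates):
    """Try every template and keep the one with the highest ZNCC value.

    Returns (score, (x, y), template_index).
    """
    results = [match_template(image, t) + (k,) for k, t in enumerate(templates)]
    return max(results, key=lambda r: r[0])
```

Because the mean is subtracted and the result is normalized, the score is unchanged by a uniform brightness offset or gain, which is presumably useful when illumination varies with depth. In practice, OpenCV's `cv2.matchTemplate` with the `cv2.TM_CCOEFF_NORMED` method computes the same kind of score far more efficiently.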

Fig. 1

The concept of template matching. The intersection of the ‘Y-shaped’ scale pattern on the turtle head was used as a template and was determined as a head position. Several templates were manually prepared for each individual to track the head in any situation

Coordinates of the head positions were calculated for all consecutive images, and head movement was determined by tracking head positions across consecutive images. In the present study, if the head was detected in a given frame, the search area for the next frame was confined to a 250 × 160-pixel area around the head position detected in the previous frame. This helped to prevent false detections and shortened the computing time. However, if the ZNCC value was lower than 0.63, or if the head position shifted by >20 pixels along the x-coordinate relative to the previous frame for eight consecutive frames, the search area was extended to the entire frame. Then, if a matching area with a ZNCC value greater than 0.63 could not be detected, the head position was treated as a missing value in that frame. The threshold value of 0.63 was empirically determined to give the best accuracy for template matching. This computer vision analysis procedure was implemented in a program written by the authors in Microsoft Visual C++ 2010 Express using the OpenCV 2.3 library.
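The frame-to-frame search logic can be illustrated with the following sketch. It uses the frame size, window size, and thresholds given above, but several details are assumptions: the window is taken to be centered on the previous head position and clipped to the frame, and `locate` is a hypothetical callback standing in for the template matcher (it receives a search rectangle and returns a score and a position).

```python
FRAME_W, FRAME_H = 1280, 720      # video resolution in pixels
WIN_W, WIN_H = 250, 160           # confined search area
ZNCC_THRESHOLD = 0.63             # empirically chosen acceptance threshold
MAX_JUMP_PX = 20                  # x-shift that counts as a suspicious jump
MAX_JUMP_FRAMES = 8               # consecutive jumps that trigger a full search

def search_window(prev_xy):
    """Return (x0, y0, x1, y1); the whole frame if there is no previous detection."""
    if prev_xy is None:
        return (0, 0, FRAME_W, FRAME_H)
    x, y = prev_xy
    # Assumption: the window is centered on the previous position, then clipped.
    x0 = min(max(x - WIN_W // 2, 0), FRAME_W - WIN_W)
    y0 = min(max(y - WIN_H // 2, 0), FRAME_H - WIN_H)
    return (x0, y0, x0 + WIN_W, y0 + WIN_H)

def track_step(locate, prev_xy, jump_run):
    """Process one frame.

    locate(window) -> (score, (x, y)) is a hypothetical matcher callback.
    Returns (head_xy_or_None, updated_jump_run).
    """
    score, xy = locate(search_window(prev_xy))
    if prev_xy is not None:
        # Count consecutive frames with a large shift along the x-coordinate.
        jumped = abs(xy[0] - prev_xy[0]) > MAX_JUMP_PX
        jump_run = jump_run + 1 if jumped else 0
        if score < ZNCC_THRESHOLD or jump_run >= MAX_JUMP_FRAMES:
            # Fall back to searching the entire frame.
            score, xy = locate(search_window(None))
            jump_run = 0
    if score > ZNCC_THRESHOLD:
        return xy, jump_run
    return None, 0                # missing value for this frame
```

Confining the search to 250 × 160 pixels reduces the number of candidate windows per template by roughly a factor of 20 relative to the full 1280 × 720 frame, which is consistent with the shorter computing time noted above.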

Accuracy evaluation of template matching

The extraction accuracy of head position by template matching was evaluated using the root-mean-square error (RMSE). The true positions of the head along the x- and y-coordinates were determined via visual observations of image data. We evaluated the extraction accuracy using 10 min of video data for each turtle. We also calculated the true and false extraction rates of head positions in template matching. The true extraction rate was defined as the ratio of the number of head positions that were extracted by both template matching and visual observation to the number of head positions that were detected by visual observation. The false extraction rate was defined as the ratio of the number of false extractions of head positions detected by template matching to the total number of head positions extracted by that method.
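The three accuracy measures defined above can be computed as in the sketch below. The per-frame boolean inputs are an assumption about how the comparison might be encoded: `auto_found` marks frames where template matching returned a position, `manual_found` marks frames where the observer located the head, and `false_hit` marks automatic extractions that the observer judged to be wrong.

```python
import numpy as np

def rmse(estimated, truth):
    """Root-mean-square error between tracked and visually determined coordinates."""
    e = np.asarray(estimated, dtype=float)
    t = np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean((e - t) ** 2)))

def extraction_rates(auto_found, manual_found, false_hit):
    """Return (true_extraction_rate, false_extraction_rate) as fractions."""
    auto = np.asarray(auto_found, dtype=bool)
    manual = np.asarray(manual_found, dtype=bool)
    wrong = np.asarray(false_hit, dtype=bool) & auto   # only extractions can be false
    # Positions extracted by both methods / positions detected by the observer.
    true_rate = (auto & manual & ~wrong).sum() / manual.sum()
    # False extractions / all positions extracted by template matching.
    false_rate = wrong.sum() / auto.sum()
    return float(true_rate), float(false_rate)
```

Note that the two rates have different denominators, so they need not sum to 100 %.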

Horizontal head movement when exhibiting particular behaviors

Although we tracked both vertical and horizontal movements (on both x- and y-coordinates), in this study, we focused only on turtles’ horizontal head movement because downward head movements often went out of the field of view. In addition, the vertical field of view, i.e., the position of the head on the y-coordinate, fluctuated while the turtle was swimming because the longitudinal tilt angle of the carapace on which the video logger was deployed changed with the sea turtles’ power stroke [25].

The size and angle of the field of view differed slightly among individuals due to differences in the size of turtles and the deployment position on their back. Therefore, we calibrated the head position along the x-coordinate for each turtle. Based on the assumption that turtles generally face forward with respect to the body axis when they are in the typical state, the mode value of the x-coordinate was defined as the head position when the turtles faced forward, which was converted to 0 point for the head position. In addition, to calibrate the difference in amplitude of head movements between the left and right sides, the maximum value along the x-coordinate, which indicates that the turtle turned their head to the right side, was converted to 100 point, while the minimum value was converted to −100 point. Thus, horizontal head movement was represented as values ranging from −100 point (left side) to 100 point (right side).
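A minimal sketch of this calibration is given below, assuming the tracked x-coordinates are available as a NumPy array with missing frames stored as NaN (an assumption about the data layout). The left and right sides are scaled separately, so that the mode maps to 0 point, the maximum to 100 point, and the minimum to −100 point; the function name is hypothetical.

```python
import numpy as np

def calibrate_head_x(x_pixels):
    """Convert x-coordinates (pixels) to head position in points (-100 to 100)."""
    x = np.asarray(x_pixels, dtype=float)
    valid = x[~np.isnan(x)]
    # The most frequent x-coordinate is taken as the forward-facing position.
    values, counts = np.unique(valid, return_counts=True)
    mode = values[np.argmax(counts)]
    right_span = valid.max() - mode   # pixels from forward to the far right
    left_span = mode - valid.min()    # pixels from forward to the far left
    offset = x - mode
    points = np.zeros_like(x)
    if right_span > 0:
        points = np.where(offset > 0, 100.0 * offset / right_span, points)
    if left_span > 0:
        points = np.where(offset < 0, 100.0 * offset / left_span, points)
    points[np.isnan(x)] = np.nan      # keep missing frames missing
    return points
```

For example, with a mode of 640 pixels, a maximum of 700, and a minimum of 560, an x-coordinate of 670 maps to 50 point and 600 maps to −50 point.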

To determine horizontal head movement, the dominant frequencies and amplitude of horizontal head movement were used, which were calculated by continuous wavelet transformation with a minimum frequency of 0.1 Hz and a maximum frequency of 10 Hz using IGOR Pro ver. 6.2 (WaveMetrics, Inc., Lake Oswego, OR, USA) and Ethographer ver. 2.00 software [26]. In addition, the relationship between the carapace tilt angle on the sway (lateral) axis and the head position in the video field of view was investigated. For this analysis, we used only data at the local maxima and minima of time-series variation in the tilt angle, because the amount of data in the full profile was excessive. The tilt angle was calculated from static acceleration, which was extracted using a low-pass filter (see [10, 12]).
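As a rough illustration of what a dominant frequency and amplitude mean for a short segment of head-position data, the sketch below uses a plain FFT peak within the 0.1–10 Hz band. This is a deliberately simpler substitute and not the continuous wavelet transformation performed in Ethographer; the function name and the default 30-Hz sampling rate (matching the video frame rate) are assumptions.

```python
import numpy as np

def dominant_frequency(signal, fs=30.0, f_min=0.1, f_max=10.0):
    """Return (frequency_hz, amplitude) of the strongest component in the band.

    signal: head position in points for one window, with no missing values.
    fs:     sampling rate in Hz (assumed equal to the video frame rate).
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                          # remove the constant offset
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    amps = 2.0 * np.abs(spectrum) / x.size    # single-sided amplitude
    band = (freqs >= f_min) & (freqs <= f_max)
    k = int(np.argmax(np.where(band, amps, -1.0)))
    return float(freqs[k]), float(amps[k])
```

With a 10-s window the frequency resolution of this approach is only 0.1 Hz, which is one reason a wavelet-based method is better suited to the differences of a few hundredths of a hertz reported in this study.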

To investigate the head movement during a particular behavior, we divided the recordings into four categories: cruising swimming, breathing, feeding, and resting, by visual observation of image data. The breathing phase was further divided into two additional phases: pre- and post-breathing, which were defined as the 10-s duration before taking a first breath and after taking a last breath at the surface, respectively. The feeding phase was defined as the 10-s duration before feeding on, capturing, and handling prey. Although we also observed numerous feeding events on seagrass meadow (see [10]), these feeding events were excluded from the analysis because the turtles’ heads were out of the field of view. Thus, we investigated horizontal head movement during feeding only on floating prey. The resting phase was defined as the time during which the turtles rested on or under coral reefs. The resting phase was omitted from data analysis, as the turtles bent their heads down during most resting time, such that the head moved out of the field of view. The cruising swimming phase was defined as the remaining time that was not used for breathing or resting phases. During the cruising swimming phase, we investigated how turtles usually moved their heads. In addition, to determine whether turtles scan the environment toward which they are heading before making a turn, we extracted the time and turn velocity when turtles made a turn using the angular velocity of the turtles on the yaw (vertical) axis obtained from two turtles, CM5 and CM6, which had been deployed with a gyroscope sensor. We then examined the relationship between the maximum rotating velocity and the maximum amplitude of horizontal head movement before turning. To clearly extract the time at which the turtle made a turn, we only used data for turns with a maximum angular velocity of >25 °/s.
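One way to extract turn events from the gyroscope record is sketched below. How the start and end of a turn were delimited is not stated here, so grouping contiguous samples whose absolute yaw rate exceeds the threshold into a single event is an assumption, as are the function name and the sign convention (positive values to the right).

```python
import numpy as np

def detect_turns(yaw_rate, fs, threshold=25.0):
    """Find turns in a yaw angular-velocity record.

    yaw_rate:  angular velocity on the yaw axis in degrees per second.
    fs:        gyroscope sampling rate in Hz.
    threshold: minimum absolute angular velocity for a turn (degrees per second).
    Returns a list of (peak_time_s, signed_peak_velocity), one per turn.
    """
    w = np.asarray(yaw_rate, dtype=float)
    above = np.abs(w) > threshold
    # Pad with False so that runs touching either end are closed properly.
    padded = np.concatenate(([False], above, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    turns = []
    for start, stop in zip(edges[::2], edges[1::2]):
        # Within each run, take the sample with the largest absolute velocity.
        peak = start + int(np.argmax(np.abs(w[start:stop])))
        turns.append((peak / fs, float(w[peak])))
    return turns
```

The signed peak velocity of each event can then be paired with the largest head amplitude in the seconds preceding the peak time to examine the relationship between head movement and turning.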


Evaluation of template matching

The RMSE values of the x- and y-coordinates were 4.32 ± 1.55 and 3.72 ± 2.12 pixels for all 6 turtles, respectively. Video was recorded at 1280 × 720 pixels; thus, RMSEs were only 0.34 ± 0.12 % and 0.52 ± 0.29 % for the x- and y-coordinates, respectively. The true extraction rate of the head positions was 86.2 ± 8.1 %, while the false extraction rate was 6.6 ± 8.4 %. These values indicate that the template matching technique is adequate for tracking the head position of a sea turtle. However, the technique could not be implemented for 14.8 ± 5.2 % of the time in the cruising swimming and breathing phases, because the turtles’ heads moved out of the range of the video.
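For reference, the percentage errors follow directly from dividing each pixel RMSE by the corresponding frame dimension (1280 pixels in x, 720 pixels in y):

$$ \frac{4.32}{1280} \times 100 \approx 0.34\,\%, \qquad \frac{3.72}{720} \times 100 \approx 0.52\,\% $$

The standard deviations scale in the same way:

$$ \frac{1.55}{1280} \times 100 \approx 0.12\,\%, \qquad \frac{2.12}{720} \times 100 \approx 0.29\,\% $$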

There was a significant relationship between the carapace tilt angle on the sway (lateral) axis and horizontal head position, demonstrating that turtles tended to move their heads in the same direction toward which they were tilted (Additional file 2). This fact indicates that the turtles usually did not keep their head straight, because in that case the horizontal head position would be on the side opposite the direction of the tilt.

Head movement during cruising swimming

The turtles moved their head to the left for 50.5 ± 8.4 % of the time during cruising swimming and to the right for the remaining time. Thus, the time spent moving their head to the left and right did not differ significantly (Wilcoxon signed-rank test, N = 6, P = 0.92); however, the variance differed among individuals (Fig. 2, Additional file 3).

Fig. 2

Histograms of the head position of six green turtles (CM1–CM6). Positive and negative values indicate that the turtle faced to the right or left side, respectively

Change in head movement before and after breathing

The time-series profiles of head and body movements are shown in Fig. 3. The dominant frequencies of horizontal head movement before and after breathing and during cruising swimming were 0.25 ± 0.03 Hz (N = 50), 0.27 ± 0.03 Hz (N = 53) and 0.32 ± 0.01 Hz (N = 57), respectively (Fig. 4a, Additional file 4); all pairwise comparisons were significantly different (ANOVA and post hoc Tukey test, P < 0.05). The mean amplitudes of horizontal head movement before/after breathing and during cruising swimming were 12.0 ± 4.6 point (N = 50), 27.5 ± 2.9 point (N = 53) and 8.4 ± 3.8 point (N = 57), respectively (Fig. 4b, Additional file 4); all pairwise comparisons were significantly different (ANOVA and post hoc Tukey test, P < 0.05).

Fig. 3

Typical profiles of head and body movements: dive, carapace tilt angle on the surge (longitudinal) and sway (lateral) axes, angular velocity on the yaw (vertical) axis, and horizontal head movement in a green turtle. Dashed lines represent the 10-s period before taking a first breath/after taking a last breath, respectively

Fig. 4

The dominant frequency (a) and amplitude of horizontal head movements (b) in green turtles before and after breathing, during cruising swimming and before feeding. BB, AB, SW and BF represent before breathing, after breathing, cruising swimming, and before feeding, respectively. Vertical bars and asterisks represent standard deviations and significant differences, respectively. Lowercase letters represent significant differences by Tukey post hoc test (P < 0.05)

Head movement before feeding

Twenty feeding events on Salpa sp., jellyfish, and floating seagrass were observed from all turtles (Additional file 5). During the 10 s before feeding on floating prey, the dominant frequency of horizontal head movement was 0.23 ± 0.03 Hz (N = 20, Fig. 4a), while the mean amplitude was 9.3 ± 3.6 point (N = 20, Fig. 4b). These results indicate that the turtles moved their heads significantly more slowly before feeding than during cruising swimming and before/after breathing (ANOVA and post hoc Tukey test, P < 0.05). The amplitude was not significantly different from that during cruising swimming and before breathing (ANOVA and post hoc Tukey test, P > 0.05), but was significantly smaller than that after breathing (ANOVA and post hoc Tukey test, P < 0.05).

Head movement before turning

For the two turtles deployed with both video and gyro loggers, 38 turns were recorded over the recording period (Additional file 6). When making turns (to the right or the left), the turtles always turned their head toward the direction in which they were heading 1.38 ± 0.77 s before turning. Maximum angular velocity on the yaw axis was significantly related to the maximum amplitude of horizontal head movement before turning (ANOVA, F = 281.9, P < 0.001, Fig. 5).

Fig. 5

The maximum angular velocity on the yaw axis (degrees per second) in two turtles (CM5 and CM6) plotted with respect to the maximum amplitude of horizontal head movement before turning. Positive and negative values represent the right and left sides, respectively. The black line represents the results of linear regression


Computer vision techniques have been developing rapidly in the field of computer image processing and recognition, primarily because high-performance cameras have become less expensive and computers now have sufficient processing capabilities [17, 18]. These techniques have been used in various fields from image sensing to artificial intelligence and control robotics [18]. In addition to these advances in hardware, the miniaturization of recording devices for video and still images enables biologists to apply these computer vision techniques to bio-logging studies. To our knowledge, however, only one paper has been published that has applied a computer vision technique to bio-logging using animal-borne video [16]. Our approach of combining computer vision with bio-logging demonstrated the potential for providing a novel perspective of data analysis to visual cognition and behavior studies in animal science.

Although accurate object tracking in the video image is possible visually, this approach requires enormous time and effort. Our template matching demands much less effort and results in quantitative, rapid tracking of green turtles’ head movements with high accuracy (almost the same as visual tracking) and high true and low false extraction rates. Our findings indicate that template matching offers a valuable method for object recognition and tracking on video footage in the field of bio-logging as well as for life-logs. In addition, this automated method will be even more useful with forthcoming advances in animal-borne video loggers that can achieve longer recording times and enhanced miniaturization of recording systems. In most cases, template matching generated appropriate head positions, but the technique was sometimes (13.8 % in this study: 100 % minus the true extraction rate) unable to track head position because the matching score did not exceed the extraction threshold. These incidents occurred intermittently, and were mainly observed when turtles turned their necks sharply while looking around or extended their neck upward while breathing (Fig. 3, Additional file 4). Thus, these non-extractions would not seem to be a problem for tracking of head movements. False extractions only occurred when the turtle’s head moved out of the field of view or when coral reefs appeared; some image patterns of the coral reefs were incorrectly identified as a head position. Not surprisingly, template matching cannot be applied when the turtle’s head moves out of the field of view. Thus, the success rate of template matching varied in relation to the attachment angle of the video logger and the angle of the view (Additional file 1). Therefore, researchers should pay attention to these aspects for the successful tracking of an object.

Our results highlighted anew that vision serves as an important cue in green turtles in their assessment of the surrounding environment [8, 22, 23]. However, head movements might also function as a behavior related to other stimuli, such as odor. For example, leatherback turtles (Dermochelys coriacea) exhibit a rhythmic mouth opening behavior during specific phases of dives, suggesting that they might rely on gustatory cues to sense the immediate environment [27]. If so, head movement might appear prior to feeding. Here, we observed several feeding behaviors on jellyfish and salpae while turtles were swimming, but turtles did not move their heads frequently or widely (Fig. 4, Additional file 5), indicating that they may maintain their head toward a particular direction to locate and capture prey. These findings indicate that head movement in green turtles is performed to assess the environment via visual cues. However, sea turtles are not only distributed in environments with high underwater visibility such as in our study area, but also in habitats with poor visibility, such as inshore estuaries with mangrove forests and deep sea. In such environments, the role of vision would be much less important, and we expect that turtles turn their heads less frequently or in a different manner than observed in the present study.

We quantitatively described fundamental information related to how turtles move their heads while performing normal activities (Fig. 2). The carapace tilt angle on the sway axis does not seem to have much effect on horizontal head position (Additional file 2). Green turtles did not exhibit a significant tendency to turn their head to one side or the other during cruising swimming (i.e., the time spent turning to the right or left side was roughly equal), although turtles sometimes continuously turned their head to one side for a short period of time. The difference in the variances of head position among individuals indicates that turtles may change the frequency of head movements in relation to the surrounding environment.

Sea turtles are thought to assess subsurface risks, such as boat strikes or predator attacks, during ascent, and to scan for prospective prey, predators, and resting places before starting a dive or during descent [10, 28, 29]. However, little quantitative evidence exists to support these supposed behaviors. Our study demonstrated that green turtles moved their heads from side to side more slowly and more widely before and after taking a breath than during swimming (Figs. 3, 4), which may represent the first factual evidence that turtles vigilantly visually assess the underwater/subsurface environment before and after taking a breath at the surface.

The turtles always turned their head toward the direction in which they were heading before making a turn (Fig. 5). The eyes of sea turtles are located on the side of the head and are therefore well placed for taking in information from a wide field of view [30]. The binocular field of vision of green turtles is assumed to differ 48°–67° from those of other turtle species [31]. Thus, the observed significant relationship between maximum angular velocity on the yaw axis and the maximum amplitude of horizontal head movement before turning may indicate that turtles always scan the environment toward which they are heading before a turn to achieve a stereoimage with both eyes and to subsequently make a decision as to whether to proceed. An alternative interpretation is that turtles may move their head to the side toward which they were heading to reduce drag by decreasing the cross-sectional area of the whole body in the direction of turning.


In this study, we presented an object tracking method using a computer vision technique for animal-borne video data. Our method enables researchers to automatically extract quantitative behavioral information in the wild that could not otherwise be obtained from other sensors used in previous studies. Here we used a simple, basic template: a grayscale image. However, templates could be matched using other image conversion methods such as edge detection, which highlights the border between an object and the background [17]. Such techniques may enable researchers to extract objects that do not have strong contrasts and characteristic patterns like a green turtle’s head. Thus, movements of other body parts and surrounding objects such as the mouth, nose, wing or prey could be tracked, leading to broad potential for application of our method to studies in other situations and on other species.
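As an illustration of the edge-detection idea, the sketch below converts a grayscale image to an edge-strength map with the Sobel operator. The choice of Sobel is an assumption, since no particular edge detector is specified here; the same conversion would be applied to both the templates and the video frames before matching.

```python
import numpy as np

def sobel_magnitude(gray):
    """Return the Sobel gradient magnitude of a 2-D grayscale image."""
    g = np.asarray(gray, dtype=float)
    p = np.pad(g, 1, mode="edge")     # replicate borders so the output keeps its size
    # Horizontal gradient: right column minus left column, center row weighted by 2.
    gx = (p[:-2, 2:] + 2.0 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2.0 * p[1:-1, :-2] + p[2:, :-2])
    # Vertical gradient: bottom row minus top row, center column weighted by 2.
    gy = (p[2:, :-2] + 2.0 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2.0 * p[:-2, 1:-1] + p[:-2, 2:])
    return np.hypot(gx, gy)
```

Uniform regions map to zero and object borders map to large values, so matching on the edge map depends on outlines rather than on absolute brightness.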

Our combined approach using this new method coupled with other sensors (e.g., gyroscope) will be valuable for achieving a more profound understanding of the relationship between animal behavior and other individual(s)/objects in the surrounding environment. Visibility is critical for computer vision; thus, our method cannot be applied in environments with poor visibility, such as turbid or deep waters, at night, or in stormy weather. Nevertheless, our new method of combining computer vision and bio-logging will serve as a powerful tool for animal behavior and ecological studies.


  1. Dukas R. Cognitive ecology: the evolutionary ecology of information processing and decision making. Chicago: The University of Chicago Press; 1988.

  2. Kitagawa T, Nakata H, Kimura S, Itoh T, Tsuji S, Nitta A. Effect of ambient temperature on the vertical distribution and movement of Pacific Bluefin tuna Thunnus thynnus orientalis. Mar Ecol Prog Ser. 2000;206:251–60.

  3. Hawkes LA, Witt MJ, Broderick AC, Coker JW, Coyne MS, Dodd M, Frick MG, Godfrey MH, Griffin DB, Murphy SR, Murphy TM, Williams KL, Godley BJ. Home on the range: spatial ecology of loggerhead turtles in Atlantic waters of the USA. Diversity Distrib. 2011;17:624–40.

  4. Vyssotski AL, Dell’Omo G, Dell’Ariccia G, Abramchuk AN, Serkov AN, Latanov AV, Loizzo A, Wolfer DP, Lipp HP. EEG responses to visual landmarks in flying pigeons. Curr Biol. 2009;19:1159–66.

  5. Rutz C, Troscianko J. Programmable, miniature video-loggers for deployment on wild birds and other wildlife. Methods Ecol Evol. 2013;4:114–22.

  6. Sato K, Mitani Y, Kusagaya H, Naito Y. Synchronous shallow dives by Weddell seal mother-pup pairs during lactation. Mar Mam Sci. 2003;19:136–47.

  7. Yoda K, Murakoshi M, Tsutsui K, Kohno H. Social interactions of juvenile brown boobies at sea as observed with animal-borne video cameras. PLoS One. 2011;6(5):e19602.

  8. Okuyama J, Kagawa S, Arai N. Random mate searching: male sea turtle targets juvenile for mating behavior. Chel Cons Biol. 2014;13:278–81.

  9. Seminoff JA, Jones TT, Marshall GJ. Underwater behaviour of green turtles monitored with video-time-depth recorders: what’s missing from dive profiles? Mar Ecol Prog Ser. 2006;322:269–80.

  10. Okuyama J, Nakajima K, Noda T, Kimura S, Kamihata H, Kobayashi M, Arai N, Kagawa S, Kawabata Y, Yamada H. Ethogram of immature green turtles: behavioral strategies for somatic growth in large marine herbivores. PLoS One. 2013;8(6):e65783.

  11. Rutz C, Bluff LA, Weir AAS, Kacelnik A. Video camera on wild bird. Science. 2007;318:765.

  12. Okuyama J, Tabata R, Nakajima K, Arai N, Kobayashi M, Kagawa S. Surfacers change their dive tactics depending on the aim of the dive: evidence from simultaneous measurements of breaths and energy expenditure. Proc Roy Soc B. 2014;281:20140040.

  13. Watanabe Y, Bornemann H, Liebsch N, Plötz J, Sato K, Naito Y, Miyazaki N. Seal-mounted cameras detect invertebrate fauna on the underside of an Antarctic ice shelf. Mar Ecol Prog Ser. 2006;309:297–300.

  14. Watanuki Y, Daunt F, Takahashi A, Newell M, Wanless S, Sato K, Miyazaki N. Microhabitat use and prey capture of a bottom-feeding top predator, the European shag, shown by camera loggers. Mar Ecol Prog Ser. 2008;356:283–93.

  15. Watanabe Y, Mitani Y, Sato K, Cameron MF, Naito Y. Dive depths of Weddell seals in relation to vertical prey distribution as estimated by image data. Mar Ecol Prog Ser. 2003;252:283–8.

  16. Kane SA, Zamani M. Falcons pursue prey using visual motion cues: new perspectives from animal-borne cameras. J Exp Biol. 2014;217:225–34.

  17. Russ JC. The image processing handbook. Boca Raton: CRC Press; 2006.

  18. Bradski GR, Kaehler A. Learning OpenCV: computer vision with the OpenCV library. Farnham: O’Reilly; 2008.

  19. Nakamura Y, Ohde JY, Ohta Y. Structuring personal activity records based on attention-analyzing videos from head mounted camera. In: Sanfeliu A, Villanueva JJ, Vanrell M, Alquezar R, Crowley J, Shirai Y, editors. Proceedings of 15th International Conference on Pattern Recognition vol. 4: 3–7 September; Barcelona. IEEE Computer Society; 2000. p. 222–5.

  20. Gemmell J, Bell G, Lueder R. MyLifeBits: a personal database for everything. Commun ACM. 2006;49:88–95.

  21. Oliver LJ, Salmon M, Wyneken J, Hueter R, Cronin TW. Retinal anatomy of hatchling sea turtles: anatomical specializations and behavioral correlates. Mar Freshw Behav Physiol. 2000;33:233–48.

  22. Fritsches KA, Warrant EJ. Vision. In: Wyneken J, Lohmann KJ, Musick JA, editors. The biology of sea turtles, vol. III. Boca Raton: CRC Press; 2013. p. 31–58.

  23. Narazaki T, Sato K, Abernathy KJ, Marshall GJ, Miyazaki N. Loggerhead turtles (Caretta caretta) use vision to forage on gelatinous prey in mid-water. PLoS One. 2013;8(6):e66043.

  24. Bartol SM, Musick JA. Sensory biology of sea turtles. In: Lutz PL, Musick JA, Wyneken J, editors. The biology of sea turtles, vol. II. Boca Raton: CRC Press; 2003. p. 79–102.

  25. Noda T, Okuyama J, Koizumi T, Arai N, Kobayashi M. Monitoring attitude and dynamic acceleration of free-moving aquatic animals using a gyroscope. Aquat Biol. 2012;16:265–76.

  26. Sakamoto KQ, Sato K, Ishizuka M, Watanuki Y, Takahashi A, Daunt F, Wanless S. Can ethograms be automatically generated using body acceleration data from free-ranging birds? PLoS One. 2009;4:e5379.

  27. Myers AE, Hays GC. Do leatherback turtles Dermochelys coriacea forage during the breeding season? A combination of novel and traditional data-logging devices provide new insights. Mar Ecol Prog Ser. 2006;322:259–67.

  28. Hochscheid S, Godley BJ, Broderick AC, Wilson RP. Reptilian diving: highly variable dive patterns in the green turtle Chelonia mydas. Mar Ecol Prog Ser. 1999;185:101–12.

  29. Heithaus MR, Frid A. Optimal diving under the risk of predation. J Theor Biol. 2003;223:79–92.

  30. Mrosovsky N. The water-finding ability of sea turtles. Brain Behav Evol. 1972;5:202–25.

  31. Hergueta S, Ward R, Lemire M, Rio JP, Reperant J, Weidner C. Overlapping visual fields and ipsilateral retinal projections in turtles. Brain Res Bull. 1992;29:427–33.


Authors’ contributions

JO conceived and designed the field experiment, analyzed the data, and prepared the manuscript. KN assisted with field experiments and data analysis. KM, YN, KK and TK developed the analysis program using template matching. NA coordinated the research project. All authors participated in editing the manuscript. All authors read and approved the final manuscript.


Acknowledgements

We thank A. Wada, T. Noda, S. Kimura, M. Kobayashi and other staff of the Ishigaki Tropical Station, Seikai National Fisheries Research Institute, for their assistance. The experiment was conducted with the permission of Okinawa Prefecture (Permission Nos. 23-2, 24-4), and the experimental protocol was approved by the Animal Research Committee of Kyoto University (No. 24-4). This study was partly supported by a Grant-in-Aid for JSPS Research Activity Young Scientists B (J.O., No. 22710236) and by CREST, JST.

Compliance with ethical guidelines

Competing interests: The authors declare that they have no competing interests.

Author information


Corresponding author

Correspondence to Junichi Okuyama.

Additional files


Additional file 1. Attachment angle and deployment position of the logger unit on the green turtle’s second vertebral scute.


Additional file 2. The relationship between the carapace tilt angle on the sway (lateral) axis and horizontal head position. A significant relationship (y = 0.39x + 0.91, R² = 0.13, N = 16915, F = 292, P < 0.01) indicates that turtles tended to move their heads toward the side to which they were tilted.


Additional file 3. Video clip showing horizontal head movement during cruising swimming.


Additional file 4. Video clip showing horizontal head movement before/after taking a breath.


Additional file 5. Video clip showing horizontal head movement before feeding.


Additional file 6. Video clip showing horizontal head movement when making a turn.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Okuyama, J., Nakajima, K., Matsui, K. et al. Application of a computer vision technique to animal-borne video data: extraction of head movement to understand sea turtles’ visual assessment of surroundings. Anim Biotelemetry 3, 35 (2015).
