2.2.1 Sampling.
Given the small number of papers that used AR to enhance the zoo visitor experience, our second review covered a broader range of technologies. We restricted our search to papers published between 1980 and 2023, using the search string “[Full Text: technology] AND [Abstract: zoo]”.
The initial search returned 1,052 papers, which we imported into Covidence. We removed duplicates and then screened the remaining papers based on their title and abstract. A substantial number of papers (910) were deemed irrelevant at this stage. The large number of exclusions is explained by uses of the term ‘zoo’ that do not refer to an establishment that houses animals. For example, our search returned papers related to machine learning (“Radio Galaxy Zoo” [
2]), face recognition technology (“FaceX-Zoo” [
48]), and networking papers (“Networked Data Zoo” [
65]). Additionally, our search returned papers that clearly indicated in their title and abstract that the technology was not used to enhance the zoo visitor experience. Examples include: ‘Using technology to monitor and improve zoo animal welfare’ [
89] and ‘Assisted reproductive technologies for endangered species conservation’ [
28]. We considered these out of scope for this review. We followed the title and abstract screening with a full-text review of the remaining papers. We included only papers that employ, explore, discuss or conceptualise technology used by visitors or non-experts to enhance the zoo experience.
Our final sample consisted of 50 papers. Figure
1 summarises the scoping review process for our second review using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flowchart [
54].
2.2.3 Results.
Almost all (49) papers in our sample involved technology that presented visual content (text, images, video, 2D graphics and/or 3D visualizations). The one exception was a paper that involved a physical interface (a button or lever) to trigger a mechanical enrichment device within an animal’s enclosure. As this paper involved no digital technology (both the interaction interface and the response were physical), we excluded it from further analysis.
The open and axial coding revealed seven themes: perspective, visual focus, scope, social, interactivity, game elements and content type. We elaborate on each below.
Perspective. Refers to the design of technology that enables users to experience different points of view. Previous works have used technology to show users how animals see the world [
49,
50,
92], or to embody animals [
3,
11,
64]. For example, Kasuga et al. [
36] created virtual reality (VR) videos simulating various animals’ eyesight, colour vision and dynamic vision. Other works explored how different perspectives can support users in learning abstract concepts. For example, Allison et al. [
3] designed a VR system in which users embodied an adolescent gorilla and learned about the social hierarchies in gorilla tribes through implicit interactions with other gorillas. Other works leveraged technology to better communicate the work of zoo experts. For instance, Whitehouse et al. [
88] designed a game on an interactive public display through which users “became” primate researchers to better empathise with what the work entails. Visitors could navigate a stylized map on the display and engage with quiz games related to the researcher’s life in the wild.
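To make the vision-simulation idea concrete, the sketch below shows one common way to approximate an animal’s colour perception: applying a linear transform to each pixel of an RGB image. The matrix values are hypothetical placeholders loosely resembling a dichromat simulation; they are not taken from Kasuga et al. [36] or any other reviewed system.

```python
import numpy as np

# Hypothetical 3x3 matrix that collapses red/green contrast, loosely
# mimicking the dichromatic colour vision of many mammals. Illustrative
# only; real simulations derive such matrices from cone sensitivities.
DICHROMAT = np.array([
    [0.625, 0.375, 0.0],
    [0.700, 0.300, 0.0],
    [0.000, 0.300, 0.7],
])

def simulate_animal_vision(rgb_image: np.ndarray) -> np.ndarray:
    """Apply the colour transform to an (H, W, 3) float image in [0, 1]."""
    flat = rgb_image.reshape(-1, 3)       # one row per pixel
    transformed = flat @ DICHROMAT.T      # linear transform per pixel
    return np.clip(transformed, 0.0, 1.0).reshape(rgb_image.shape)
```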
Visual Focus. Refers to the design of technology based on the user’s focal point of attention. Technology in previous works has focused the user’s attention either on the animal and the exhibit [
22,
35,
83] or on the technology [
3,
4,
49,
50,
88]. A key motivation for using AR technologies at zoos is that they do not hinder the visual connection with animals [
83]. For example, the AR app by Fu et al. [22] overlaid text or 2D graphics on animals detected via a smartphone camera without obstructing the view of the animal. However, technology has also been used to present animal visualizations as an alternative when observing a real animal is impossible. For example, Tanaka et al. [
78] presented a web-based system that displayed penguins’ anatomy and behaviour as 2D graphics and animations.
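As a minimal sketch of the non-obstructing overlay idea described for Fu et al. [22], the code below anchors a text label just outside a detected animal’s bounding box. The names, margin value and fallback rule are assumptions for illustration, not details of the original app.

```python
from dataclasses import dataclass

@dataclass
class Box:
    x: float  # top-left corner, pixels
    y: float
    w: float
    h: float

def label_anchor(animal: Box, screen_w: float, screen_h: float,
                 label_w: float, label_h: float, margin: float = 8.0):
    """Return (x, y) for a label placed above the animal's box.

    Falls back to below the box when there is no room above, so the
    annotation never covers the detected animal.
    """
    x = min(max(animal.x, 0.0), screen_w - label_w)  # clamp horizontally
    y = animal.y - label_h - margin                  # try above the box
    if y < 0:                                        # no room above:
        y = min(animal.y + animal.h + margin,        # place below the box,
                screen_h - label_h)                  # clamped to the screen
    return x, y
```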
Scope. Refers to the area in which the technology is used. Prior works deployed technology in a small area of the zoo [
32], a section of the zoo encompassing multiple exhibits [
34], or in the entirety of the zoo [
75]. For example, the system by Jimenez Pazmino et al. [32] visualised the challenges polar bears face due to global warming. The system was used in a single location, so users did not need to move to interact with it. In contrast, other applications encouraged visitors to move around the zoo. The mobile app by Kapoun and Kapounová [34] helped users understand abstract zoological concepts, such as ecological links and food chains. The app guided users to locations in the zoo based on these relationships and visualised them using dynamic semantic networks. Additionally, the system by Kim et al. [39] used RFID to locate users within the zoo and present relevant educational content.
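The location-triggered delivery described for Kim et al. [39] can be pictured as a lookup from a fixed RFID reader to exhibit-specific content. The sketch below uses hypothetical reader IDs and content strings; the original system is not reported at this level of detail.

```python
# Hypothetical mapping from fixed RFID readers to exhibits and content.
READER_TO_EXHIBIT = {
    "reader-01": "penguins",
    "reader-02": "gorillas",
}

EXHIBIT_CONTENT = {
    "penguins": "Penguins' dense feathers trap air for insulation.",
    "gorillas": "Gorilla troops are led by a dominant silverback.",
}

def content_for_read(reader_id: str) -> str | None:
    """Map an RFID read event at a fixed reader to educational content."""
    exhibit = READER_TO_EXHIBIT.get(reader_id)
    return EXHIBIT_CONTENT.get(exhibit) if exhibit else None
```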
Social. Refers to the design of technology to facilitate or encourage social interactions. Previous works have facilitated social interactions for teaching visitors [
64], sharing content with other visitors [
58,
59], and playing zoo-related games [
19]. Many of these works aim to facilitate in-person social interactions. For instance, the AR app by Perry et al. [64] facilitated learning about zoo objectives through a location-based game. Different subgroups of the same learning group received information about the game and were encouraged to share it with the other subgroups.
Other works encouraged remote social interactions. For example, Ren et al. [
69] designed a mobile application through which users at the zoo could interact with remote users to share their experiences. We also found technology used to support interactions between visitors and docents [
32].
Interactivity. Refers to the degree of interaction designed into the experience. We found three kinds of interaction: passive, active stationary, and active mobile. We refer to experiences whose design does not involve any user interaction as passive. For example, Perdue et al. [
63] explored the effect of video presentations on visitor knowledge of the presented topic. Active stationary interactions involve gestures, touch or buttons but can be performed from a single location. For instance, in Chang et al. [
13], users actively interacted with a digital zoo using gestures, but they had to stand in front of a Kinect sensor. Finally, active mobile interactions require the user to interact from different locations. For example, the location-based mobile system by Pishtari et al. [66] required users to move to different areas of the zoo to interact with game content placed by a game creator.
Game Elements. Refers to the use of game elements in the system design. Examples of such game elements include quizzes [
78,
79,
80,
88], competitive team-based tasks [
19] and location-based game mechanics [
64]. These works include both collaborative and individual examples. The systems by Perry et al. [64] and Fahlquist et al. [19] are location-based games designed for collaborative play. Both systems used mobile devices to present information, but in Perry et al. [64] users worked towards a common goal, while in Fahlquist et al. [19] teams competed against each other. Single-player games included Long and Gooch’s [
50] educational simulation in which individual users switched between human and bee vision to find relevant game objects.
Content Type. Refers to the type of content delivered through the technology. The types of content explored in previous works can be broadly categorized as visual or auditory. For example, Long et al. [
49] used visual content to simulate the cat visual system and show four key differences (colour, luminance, blur and field of view) between the human and feline visual systems. In contrast, Pendse et al. [62] created a novel experience for an accessible aquarium. They mapped different characteristics of fish to distinct notes, enabling visitors to ‘hear’ the presence of different fish. For example, smaller fish were associated with higher pitches, while quick movements led to a faster tempo [
62].
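The mapping used by Pendse et al. [62] can be sketched as a function from fish characteristics to sound parameters, as below. The numeric ranges and linear scalings are assumptions for illustration; the paper reports the mapping only qualitatively (smaller fish, higher pitch; faster movement, faster tempo).

```python
def fish_to_sound(length_cm: float, speed_cm_s: float) -> tuple[float, float]:
    """Return (pitch_hz, tempo_bpm) for a fish of given size and speed."""
    # Clamp inputs to an assumed plausible range.
    length = min(max(length_cm, 5.0), 100.0)
    speed = min(max(speed_cm_s, 0.0), 200.0)
    # Smaller fish -> higher pitch: invert length across an assumed span.
    pitch_hz = 880.0 - (length - 5.0) / 95.0 * (880.0 - 220.0)
    # Faster movement -> faster tempo, over an assumed 60-180 bpm range.
    tempo_bpm = 60.0 + speed / 200.0 * 120.0
    return pitch_hz, tempo_bpm
```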