Joining Occurrence, Tourism and Weather Data: Are glowworm sightings driven by tourism?
In this tutorial you will investigate whether glowworm sightings in Tasmania are driven by tourist activity or local hobbyists.
By the end you should be able to:
ws_id and date
tourism_region
Install the package if you haven’t already:
install.packages("pak")
pak::pak("vahdatjavad/ecotourism")
We will use four datasets from the ecotourism package:
glowworms: glowworm occurrence records (2014–2024)
weather: daily weather by station
tourism_quarterly: quarterly domestic visitor counts by region
tourism_region: tourism region names linked to weather stations
We focus on glowworm sightings in Tasmania across all years, investigating whether sightings and tourism activity move together.
Explore the table above to get familiar with the data. Notice the hour, date, and ws_id columns, which we will use later in the joins and summaries.
The map below shows all Tasmanian glowworm sightings from 2014–2024. Each gold dot represents a recorded sighting. Click any marker for date and time details.
Filter the glowworms dataset to Tasmania in December. Join with the weather dataset using ws_id and date.
What is the average temperature (temp) on sighting days?
What does this tell you about glowworm spotting conditions?
library(dplyr)      # filter(), left_join(), summarise()
library(ecotourism) # glowworms, weather and tourism datasets
# filter Tasmania December sightings
tas_dec <- glowworms |>
  filter(obs_state == "Tasmania", month == 12)
# join with weather on both ws_id and date for an exact station and day match
tas_dec_weather <- tas_dec |>
  left_join(weather, by = c("ws_id", "date"))
# summarise key weather stats on sighting days
tas_dec_weather |>
  summarise(
    n_sightings = n(),                      # total sightings
    avg_temp = mean(temp, na.rm = TRUE),    # mean temperature
    prop_rainy = mean(rainy, na.rm = TRUE)  # proportion of rainy days
  )
# A tibble: 1 × 3
  n_sightings avg_temp prop_rainy
        <int>    <dbl>      <dbl>
1          30     9.19          1
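Because left_join() keeps every sighting even when no weather record exists for that station and day, it can be worth checking how many rows failed to match before trusting the averages. The following is only a sketch on made-up toy tables (the station ids, dates and temperatures are invented; only the column names ws_id, date and temp follow the tutorial):

```r
library(dplyr)

# Toy sightings and weather tables (made-up values, tutorial column names).
toy_sightings <- tibble(
  ws_id = c("A", "A", "B"),
  date  = as.Date(c("2020-12-01", "2020-12-02", "2020-12-01"))
)
toy_weather <- tibble(
  ws_id = c("A", "B"),
  date  = as.Date(c("2020-12-01", "2020-12-01")),
  temp  = c(9.5, 11.0)
)

# Sightings with no weather record for the same station and day.
unmatched <- toy_sightings |>
  anti_join(toy_weather, by = c("ws_id", "date"))

# After the left join, those same rows carry NA in the weather columns.
joined <- toy_sightings |>
  left_join(toy_weather, by = c("ws_id", "date"))

nrow(unmatched)          # sightings without a weather match
sum(is.na(joined$temp))  # the same count, seen from the joined table
```

The same two checks could be run on tas_dec and weather; a large unmatched count would mean avg_temp and prop_rainy describe only part of the sightings.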
Most sightings occur in the afternoon (3pm) rather than at night, which is surprising for a bioluminescent organism! This likely reflects observer activity patterns rather than glowworm behaviour: people visit caves and trails during daylight hours and record what they see.
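The time-of-day pattern can be checked by counting sightings per hour. A minimal sketch, assuming the hour column holds the hour of day as a number and using a made-up toy table in place of glowworms:

```r
library(dplyr)

# Toy stand-in for the glowworm records (values are made up).
toy_sightings <- tibble(
  obs_state = c("Tasmania", "Tasmania", "Tasmania", "Tasmania", "Victoria"),
  hour      = c(15, 15, 21, 11, 22)
)

# Count Tasmanian sightings per hour of day, most frequent hour first.
hour_counts <- toy_sightings |>
  filter(obs_state == "Tasmania") |>
  count(hour, sort = TRUE, name = "n_sightings")

hour_counts
```

Replacing toy_sightings with glowworms should give the hourly distribution behind the afternoon peak described above.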
Using all years of Tasmania glowworm data, join glowworms to tourism_region via ws_id, then to tourism_quarterly via region_id.
# first join: glowworms to region info
tas_region <- glowworms |>
  filter(obs_state == "Tasmania") |>
  left_join(tourism_region, by = "ws_id")
# second join: add the quarterly tourism data
tas_tourism <- tas_region |>
  left_join(tourism_quarterly,                # bring in trip counts
            by = c("region_id", "ws_id")) |>
  filter(!is.na(region))                      # drop unmatched rows
# summarise sightings and trips per year and quarter
yearly <- tas_tourism |>
  group_by(year.x, quarter) |>                # year.x = glowworm year
  summarise(
    n_sightings = n(),                        # count sightings
    avg_trips = mean(trips, na.rm = TRUE),    # average tourism trips
    .groups = "drop"
  )
yearly
# A tibble: 32 × 4
   year.x quarter n_sightings avg_trips
    <dbl>   <int>       <int>     <dbl>
 1   2015       1          20      8.20
 2   2015       2          21      5.85
 3   2015       3          13      4.26
 4   2015       4          18      5.67
 5   2017       1          20      8.20
 6   2017       2          21      5.85
 7   2017       3          13      4.26
 8   2017       4          18      5.67
 9   2018       1          80     13.9
10   2018       2          63     11.9
# ℹ 22 more rows
If the two lines move together, i.e., sightings spike when tourism spikes, this suggests that tourists rather than local hobbyists are the ones recording glowworm sightings. Look for years where both dip (hint: 2020!) as extra evidence.
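One thing to watch in this kind of join: when the keys identify only the region, each sighting is paired with every quarterly row for that region, so n() counts joined rows rather than sightings. The sketch below illustrates the effect on made-up toy tables and shows one possible alternative, deriving the sighting's quarter from its month and joining on year and quarter as well. This is an illustration, not the tutorial's own method; the relationship argument assumes dplyr 1.1.0 or later.

```r
library(dplyr)

# Toy sightings in one region, with year and month of each record (made up).
toy_sightings <- tibble(
  region_id = c(1, 1, 1),
  year      = c(2019, 2019, 2020),
  month     = c(1, 2, 11)
)
# Toy quarterly tourism table (made-up trip counts).
toy_trips <- tibble(
  region_id = 1,
  year      = rep(c(2019, 2020), each = 4),
  quarter   = rep(1:4, times = 2),
  trips     = c(10, 8, 6, 9, 4, 2, 3, 5)
)

# Joining on region only pairs every sighting with every quarter.
region_only <- toy_sightings |>
  left_join(toy_trips, by = "region_id", relationship = "many-to-many")

# Deriving the quarter and joining on year and quarter too
# keeps exactly one row per sighting.
time_matched <- toy_sightings |>
  mutate(quarter = as.integer((month - 1) %/% 3 + 1)) |>
  left_join(toy_trips, by = c("region_id", "year", "quarter"))

nrow(region_only)   # 24 rows: 3 sightings x 8 quarterly rows
nrow(time_matched)  # 3 rows: one per sighting
```

With time-matched rows, counting sightings per year and quarter reflects when each sighting happened rather than how many tourism rows it was paired with.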
From your joined dataset in Question 2, split the tourism data by purpose (Holiday vs Business).
# summarise by year, quarter AND purpose this time
purpose_yearly <- tas_tourism |>
  group_by(year.x, quarter, purpose) |>       # split by purpose
  summarise(
    n_sightings = n(),                        # sightings per group
    avg_trips = mean(trips, na.rm = TRUE),    # trips per group
    .groups = "drop"
  ) |>
  filter(!is.na(purpose))                     # drop missing purpose rows
# calculate correlation for each purpose separately
purpose_yearly |>
  group_by(purpose) |>
  summarise(
    correlation = cor(                        # cor() gives the correlation
      n_sightings,
      avg_trips,
      use = "complete.obs"                    # ignore NA pairs
    )
  )
# A tibble: 2 × 2
  purpose  correlation
  <chr>          <dbl>
1 Business       0.317
2 Holiday        0.205
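If Holiday trips correlate more strongly with sightings than Business trips, this is evidence that tourists on holiday, rather than locals or business travellers, are driving glowworm recordings. In the output above, however, the Business correlation (0.317) is slightly higher than the Holiday one (0.205) and both are weak, so these results alone do not single out holiday travellers. If tourists do drive recordings, this has implications for the ecotourism package itself: sighting data may be biased toward tourist-heavy seasons and regions.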
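A correlation coefficient on its own does not show how uncertain the estimate is. As a rough sketch, base R's cor.test() reports a confidence interval and a p-value alongside the coefficient. The two vectors below are made up for illustration and stand in for the n_sightings and avg_trips columns of one purpose group:

```r
# Made-up quarterly values standing in for n_sightings and avg_trips.
n_sightings <- c(12, 30, 18, 25, 40, 22, 35, 15)
avg_trips   <- c(5.1, 7.4, 6.0, 9.2, 8.8, 4.9, 10.3, 6.5)

# Pearson correlation test from base R's stats package.
result <- cor.test(n_sightings, avg_trips)

result$estimate  # correlation coefficient
result$conf.int  # 95% confidence interval for the coefficient
result$p.value   # p-value for the null hypothesis of zero correlation
```

A wide interval that includes zero would suggest the difference between the Holiday and Business correlations should be read with caution.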
In this tutorial you investigated whether glowworm sightings in Tasmania are driven by tourist activity. You practiced:
Chaining joins: glowworms -> tourism_region -> tourism_quarterly
Joining glowworms to weather using ws_id and date
Do gouldian_finch or manta_rays show the same tourism pattern?