Environmental epidemiologists always strive for accurate exposure assessment. When considering spatially modeled exposures, such as air pollution, green space, noise, or the built environment, researchers often use participants’ residential address to assign exposures. However, residence-based exposure estimates contain error because no one spends 100% of their time at home (COVID-19 lockdowns notwithstanding). Approaches using Global Positioning Systems (GPS) data enable us to estimate personalized exposure as participants move throughout the day, that is, their “activity space.” But studies measuring activity space with GPS devices often follow small cohorts over short time periods owing to participant burden and device cost. Smartphone applications with activated location services show tremendous potential in obtaining activity space data, but it is unclear whether it is feasible to gather these data over many years. In their article, Hystad et al. explored a unique approach to assessing exposure retrospectively based on minute-to-minute location using Google Location History (GLH) data derived from smartphone data. As a Google product, GLH data have been collected on location service-enabled smartphones since 2010. The data provide information on locations visited by the user over time. Hystad et al. invited 378 participants in the Washington State Twin Registry study who had participated in previous GPS studies to submit their GLH data for a new study. Ultimately, 61 participants (16% of those eligible) successfully provided their GLH data, while another 53 (14%) wanted to participate but did not have GLH data available. Smartphones provide a massive amount of location data: From 61 participants, there were 34 million data points covering more than 66,000 d between 2010 and 2021, with a median of 752 d per participant. For participants with overlapping GLH and GPS data, GLH captured 91% of GPS locations away from home or work for a 200-m buffer.
Finally, nitrogen dioxide exposure estimates derived from GLH data were different from those estimated from home addresses, suggesting that GLH-based estimates provide additional information on exposure compared with those based on the residence alone. Those hoping to use GLH for research should be aware of the limitations of these data. A low participation rate suggests those who provide GLH data may not be comparable to the general population. In addition, 23% of participants declined to participate citing privacy concerns. And those providing GLH data were on average 6 y younger than those who did not. Any study using GLH data must assess the generalizability of these data. We must also think through the ethics of using GLH data for research. Like any GPS data, GLH data are highly identifiable, given that they can be used to calculate where participants live and spend time. Moreover, GLH users may not be aware that these data are being gathered in the background on their phones and being sent to Google. Although commercial entities are not held to the same ethical standards as research institutions, maintaining a perspective on privacy and data confidentiality is fundamental when using these data for research. However, there is enormous potential for using GLH data in research. Rich time–activity data are relatively easy to obtain: It took participants only 5 min to download their data and upload them to a secure server. Participants can provide years of time–activity data all the way back to 2010, which can be used to estimate personalized exposures to any spatially defined feature. As we develop more complex spatiotemporal models, we may even be able to link GLH data to the exact level of an environmental factor at a given moment (e.g., hourly temperature data) to gain new perspectives on precise effects of the environment on health.
We can also use GLH data to explore potential natural experiments, such as exploring whether COVID-19 policies influenced shifts in contextual environment exposures. Cohort studies with large samples of residential address data can use GLH data on subsets of participants to refine exposure and conduct measurement error correction. GLH data also have the advantage of alleviating worries over compliance with study protocols, as well as concerns about participants changing behavior when monitored, because data are collected passively and retrospectively. As Google (as well as Apple and many other technology companies) increasingly gathers location data on users for commercial purposes, there are tremendous opportunities to advance environmental epidemiology. While maintaining our ethical obligation to our research participants, epidemiologists should develop partnerships with these tech corporations to fully capitalize on new insights from smartphone location data and other tools like GLH. These approaches may enable us to recreate personalized exposures over more than a decade of data to unearth novel viewpoints on spatial exposures and health.
               