Research Mentor(s)
Noguchi, Kimihiro
Description
We examine three common transformations (identity, fourth-root, and log) to determine which is most suitable for evaluating how certain common features surrounding city parks in the Twin Cities Metropolitan Area (TCMA) affect park visitation. Because the relative locations of these features and the city parks closely follow a spatial Poisson process, the distances between them are approximately exponentially distributed. A fourth-root transformation improves the normality of exponential random variables, so we verify via simulation that the fourth-root transformation performs best by comparing correlation coefficients of the fourth-rooted data with those of the untransformed and log-transformed data. Using the TCMA city parks data, we also confirm that the fourth-root transformation improves bivariate normality. Finally, we show that applying the fourth-root transformation to distance-type variables increases the probability that least absolute shrinkage and selection operator (LASSO) regression selects the most important features affecting park visitation.
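The claim that a fourth root brings exponential variables closer to normality can be illustrated with a small simulation. The sketch below is not the poster's procedure: it uses sample skewness as a simple stand-in for a normality diagnostic, and the function names, sample size, scale, and seed are all hypothetical choices made for illustration.

```python
import numpy as np


def sample_skewness(x):
    """Return the (biased) sample skewness; 0 indicates a symmetric sample."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5


def skewness_by_transformation(n=100_000, scale=1.0, seed=0):
    """Simulate exponential 'distances' and compare three transformations.

    n, scale, and seed are illustrative assumptions, not values from the poster.
    """
    rng = np.random.default_rng(seed)
    d = rng.exponential(scale=scale, size=n)  # simulated distance-type variable
    return {
        "identity": sample_skewness(d),
        "fourth_root": sample_skewness(d ** 0.25),
        "log": sample_skewness(np.log(d)),
    }


result = skewness_by_transformation()
for name, value in result.items():
    print(f"{name:12s} skewness = {value:+.3f}")
```

Under these assumptions, the fourth-rooted sample should show skewness closest to zero, the untransformed sample should be strongly right-skewed, and the log-transformed sample should be left-skewed. Skewness captures only one aspect of normality, so this is a rough check rather than a substitute for the correlation-based comparison described above.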
Document Type
Event
Start Date
18-5-2020 12:00 AM
End Date
22-5-2020 12:00 AM
Department
Statistics, Mathematics
Genre/Form
student projects, posters
Subjects – Topical (LCSH)
Protected areas--Management; Protected areas--Planning; Mathematical statistics; Probabilities
Type
Image
Keywords
Exponential Distribution, Modeling Distances, Spatial Poisson Process, Kolmogorov-Smirnov Test, Fourth-Root Transformation, LASSO Regression
Rights
Copying of this document in whole or in part is allowable only for scholarly purposes. It is understood, however, that any copying or publication of this document for commercial purposes, or for financial gain, shall not be allowed without the author’s written permission.
Language
English
Format
application/pdf
Included in
Modeling Park Visitation Using Transformations of Distance-Type Predictor Variables with LASSO