Research Mentor(s)

Noguchi, Kimihiro

Description

We compare three common transformations (identity, fourth-root, and log) to determine which is most suitable for evaluating how certain common features surrounding Twin Cities Metropolitan Area (TCMA) city parks affect park visitation. Because the relative locations of these features and the city parks closely follow a spatial Poisson process, the distances between them are approximately exponentially distributed. A fourth-root transformation improves the normality of exponential random variables, so we verify by simulation that the fourth-root transformation performs best, comparing correlation coefficients of the fourth-rooted data with those of the untransformed and log-transformed data. Using the TCMA city parks data, we also confirm that the fourth-root transformation improves bivariate normality. Finally, we show that applying the fourth-root transformation to distance-type variables increases the probability of selecting the most important features affecting park visitation when using least absolute shrinkage and selection operator (LASSO) regression.
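The claim that a fourth-root transformation brings exponential data closer to normality can be illustrated with a small simulation. The sketch below is not the authors' procedure: it uses sample skewness as a rough stand-in for a normality measure, and the function names, sample size, scale, and seed are all assumptions chosen for illustration. A normal distribution has skewness 0, so a value nearer 0 suggests a more symmetric, more normal-looking sample.

```python
import numpy as np


def sample_skewness(x):
    """Return the moment-based sample skewness of a 1-D array."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return float(np.mean(centered**3) / np.mean(centered**2) ** 1.5)


def skewness_by_transformation(n=100_000, scale=1.0, seed=0):
    """Simulate exponential 'distances' and report skewness after each transformation.

    n, scale, and seed are illustrative assumptions, not values from the study.
    """
    rng = np.random.default_rng(seed)
    d = rng.exponential(scale=scale, size=n)  # simulated distance-type variable
    return {
        "identity": sample_skewness(d),
        "fourth-root": sample_skewness(d ** 0.25),
        "log": sample_skewness(np.log(d)),
    }


if __name__ == "__main__":
    for name, value in skewness_by_transformation().items():
        print(f"{name:12s} skewness = {value:+.3f}")
```

Under these assumptions, the fourth-rooted sample should show skewness much closer to 0 than either the untransformed or the log-transformed sample, which is consistent with the description above. Skewness is only one aspect of normality; a formal check would use a goodness-of-fit test or the correlation-based comparison the description mentions.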

Document Type

Event

Start Date

18-5-2020 12:00 AM

End Date

22-5-2020 12:00 AM

Department

Statistics, Mathematics

Genre/Form

student projects, posters

Type

Image

Keywords

Exponential Distribution, Modeling Distances, Spatial Poisson Process, Kolmogorov-Smirnov Test, Fourth-Root Transformation, LASSO Regression

Rights

Copying of this document in whole or in part is allowable only for scholarly purposes. It is understood, however, that any copying or publication of this document for commercial purposes, or for financial gain, shall not be allowed without the author’s written permission.

Language

English

Format

application/pdf


Modeling Park Visitation Using Transformations of Distance-Type Predictor Variables with LASSO
