A machine learning approach to big data regression analysis of real estate prices for inferential and predictive purposes

Jorge Iván Pérez-Rave*, Juan Carlos Correa-Morales, Favián González-Echavarría

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

58 Citations (Scopus)

Abstract

Hedonic price regressions have mainly been used for inference, whereas machine learning applied to big data has great potential for prediction. To help integrate these two strategies, this article proposes a machine learning approach to the regression analysis of big data, specifically real estate prices, for both inferential and predictive purposes. The methodology incorporates a new variable selection procedure called ‘incremental sample with resampling’ (MINREM) and consists of two stages. The first stage selects the important variables using MINREM; the second follows the traditional training and validation procedure for machine learning, adding three activities. The methodology is tested on two cases. The first uses data from web advertisements for used homes on sale in Colombia (61,826 observations). The second uses data (58,888 observations) from a sample of the Metropolitan American Housing Survey 2011, obtained and prepared by a reference study. In both test cases, the methodology proves useful for obtaining highly parsimonious models that are stable across sample sizes, and for exploiting both the inferential and the predictive use of the resulting regression functions. This paper contributes an original methodology for big data regression analysis.
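The abstract does not specify how MINREM works internally, nor what the three added activities are. The Python sketch below is therefore not the authors' procedure. It is a minimal illustration, under stated assumptions, of what a two-stage workflow of this general kind could look like: stage one refits an ordinary least squares regression on repeated random subsamples of increasing size and keeps only the variables that stay significant throughout; stage two trains and validates a model on the retained variables. The function name `stable_variables`, the significance rule (`t_crit`, `keep_share`), the sample sizes, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def stable_variables(X, y, sample_sizes, n_resamples=30,
                     keep_share=0.9, t_crit=1.96, seed=0):
    """Hypothetical selection rule (an assumption, not the published MINREM):
    keep column j only if |t_j| > t_crit in at least `keep_share` of the
    resamples at every sample size."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    keep = np.ones(p, dtype=bool)
    for size in sample_sizes:                       # incremental sample sizes
        hits = np.zeros(p)
        for _ in range(n_resamples):                # resampling at each size
            idx = rng.choice(n, size=size, replace=False)
            A = np.column_stack([np.ones(size), X[idx]])
            beta, *_ = np.linalg.lstsq(A, y[idx], rcond=None)
            resid = y[idx] - A @ beta
            sigma2 = resid @ resid / (size - p - 1)  # residual variance
            cov = sigma2 * np.linalg.inv(A.T @ A)    # OLS coefficient covariance
            t = beta[1:] / np.sqrt(np.diag(cov)[1:]) # t-statistics, intercept dropped
            hits += np.abs(t) > t_crit
        keep &= hits / n_resamples >= keep_share
    return np.flatnonzero(keep)


# Synthetic data (illustrative only): columns 0, 1 and 4 drive the response.
rng = np.random.default_rng(42)
n, p = 5000, 6
X = rng.normal(size=(n, p))
true_beta = np.array([2.0, -1.5, 0.0, 0.0, 1.0, 0.0])
y = 3.0 + X @ true_beta + rng.normal(size=n)

# Stage 1: select variables that are stable across sample sizes.
selected = stable_variables(X, y, sample_sizes=[200, 500, 1000])

# Stage 2: conventional train/validation split on the selected variables.
perm = rng.permutation(n)
train, val = perm[:4000], perm[4000:]
A_train = np.column_stack([np.ones(len(train)), X[train][:, selected]])
coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
A_val = np.column_stack([np.ones(len(val)), X[val][:, selected]])
rmse = float(np.sqrt(np.mean((y[val] - A_val @ coef) ** 2)))

print("selected columns:", selected)
print("validation RMSE:", round(rmse, 3))
```

On this synthetic data, the selection step retains the three informative columns. The fitted coefficients in `coef` remain directly interpretable, while the validation RMSE measures predictive performance, which is the dual inferential and predictive use the abstract describes.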

Original language: English
Pages (from-to): 59-96
Number of pages: 38
Journal: Journal of Property Research
Volume: 36
Issue number: 1
Publication status: Published - 2 Jan 2019
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2019 Informa UK Limited, trading as Taylor & Francis Group.

All Science Journal Classification (ASJC) codes

  • Geography, Planning and Development
  • Urban Studies
