On Representativeness of Internet Data Sources for Real Estate Market in Poland


Maciej Eryk Beręsewicz, Poznan University of Economics




Shifting paradigms in official statistics have led to widespread use of administrative records to support, or provide an alternative to, censuses and surveys. At the same time, demand for diverse and detailed information is increasing. To meet this demand, official statistics needs to seek new data sources. Internet data sources or, more generally, Big Data could be one of them. The potential usefulness of these new sources of statistical information should not be neglected.

The aim of the paper is to assess the representativeness of Internet data sources (IDS) for the real estate market in Poland. These sources could be used to describe demand and supply on the secondary real estate market in more detail than is possible with existing methodology. To assess representativeness, information from official surveys and other data sources is used. Because the literature on this issue is scarce, the author's own research is conducted to supplement information from official statistics. For the purposes of the paper, Internet data sources are defined. The TERYT register, which contains information on street names, was used to correct information taken from Internet data sources. A dedicated program for automated data collection (a web spider) was developed. All calculations were done with the R statistical software and additional packages (XML, RCurl, httr).

