Data Editing and Imputation in Business Surveys Using “R”

Journal Title: Revista Romana de Statistica - Year 2014, Vol 62, Issue 2

Abstract

Purpose – Missing data are a recurring problem that can cause bias or lead to inefficient analyses. This paper directly compares the features of two statistical software packages, R and SPSS, in order to take full advantage of existing automated methods for data editing and imputation in business surveys (with properly designed consistency rules) as a partial alternative to manual data editing.

Approach – The paper compares methods for editing survey data in R using the ‘editrules’ and ‘survey’ packages, which contain transformations commonly used in official statistics. It also covers visualization of missing-value patterns with the ‘Amelia’ and ‘VIM’ packages, imputation approaches for longitudinal data with ‘VIMGUI’, and a comparison with the performance of another statistical software package, SPSS, on the same features.

Findings – Business statistics data received by the NIS (National Institute of Statistics) are not ready for direct analysis because of in-record inconsistencies, errors and missing values in the collected data sets. The appropriate automatic methods in R packages make it possible to identify the erroneous fields in edit-violating records and to verify the results after imputation of missing values. This gives users a flexible, less time-consuming approach that is easier to automate in R than with SPSS macro syntax, even in situations where macros are very handy.
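The editing step mentioned in the abstract can be pictured with a minimal R sketch using the ‘editrules’ package. The variable names, consistency rules and data below are hypothetical and invented for illustration; they are not taken from the paper, and the paper's actual rules and workflow may differ.

```r
# Hypothetical toy example: rules, variable names and values are assumptions.
library(editrules)

# Consistency rules (edits) for an invented business survey
E <- editmatrix(c(
  "turnover - costs == profit",  # balance edit
  "turnover >= 0",
  "costs >= 0"
))

dat <- data.frame(
  turnover = c(100, 250, 80, 120),
  costs    = c(60, 200, 50, 90),
  profit   = c(40, 50, 300, 30)  # record 3 breaks the balance edit
)

# Logical matrix (records x edits): TRUE where a record violates an edit
viol <- violatedEdits(E, dat)
print(viol)

# For each record, find a smallest set of fields that must change
# so that all edits can be satisfied
loc <- localizeErrors(E, dat)
print(loc$adapt)

# Blank the flagged fields so that they can be imputed afterwards
dat_na <- dat
dat_na[loc$adapt] <- NA
print(dat_na)
```

The blanked fields could then be filled by an imputation method of choice (for example `kNN()` or `hotdeck()` from ‘VIM’) and the completed data checked again with `violatedEdits()`. When several fields could equally well explain a violation, as in record 3 here, the field that gets flagged is not unique.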

Authors and Affiliations

Elena Romascanu

How To Cite

Elena Romascanu (2014). Data Editing and Imputation in Business Surveys Using “R”. Revista Romana de Statistica, 62(2), 129-146. https://europub.co.uk/articles/-A-152745