Data Editing and Imputation in Business Surveys Using “R”
Journal Title: Revista Romana de Statistica - Year 2014, Vol 62, Issue 2
Abstract
Purpose: Missing data are a recurring problem that can cause bias or lead to inefficient analyses. This paper directly compares the features of two statistical software packages, R and SPSS, in order to take full advantage of existing automated methods for data editing and imputation in business surveys (with properly designed consistency rules) as a partial alternative to manual data editing. Approach: Different methods for editing survey data are compared in R using the ‘editrules’ and ‘survey’ packages, which contain transformations commonly used in official statistics; missing-value patterns are visualized with the ‘Amelia’ and ‘VIM’ packages; imputation approaches for longitudinal data are applied using ‘VIMGUI’; and the performance of another statistical software package, SPSS, is compared on the same features. Findings: Business statistics data received by the National Institute of Statistics (NIS) are not ready for direct analysis because of in-record inconsistencies, errors and missing values in the collected data sets. The appropriate automatic methods in the R packages make it possible to identify the erroneous fields in edit-violating records and to verify the results after missing values have been imputed. This gives users a flexible, less time-consuming approach that is easier to automate in R than with SPSS macro syntax in situations where macros are very handy.
Authors and Affiliations
Elena Romascanu
The Relationship between the Parameters of Reciprocal Lines
The authors analyze the relationships existing between the parameters of the reciprocal lines, defining, determining and interpreting the notions of estimators, regression line slope quotient, free terms of the regressi...
Some Aspects regarding the Residual Variable
The study of the residual variable is a significant aspect of econometric practice, both in model use and in model construction, as it provides key information on the correlation between two variables, and also the impact of th...
Statistical-Econometric Models used in Economic Analysis
The regression and correlation method indicates how the resulting characteristic “Y” changes when the values of the characteristic “X” change. The goal of regression is to identify the mathematical relationship...
International Migration and Its Impact on the Labour Market
The article presents data and information on the results of research undertaken to identify and analyse the impact of labour force emigration on the labour market at the level of the European Union. A...
What is the value of official statistics and how do we communicate that value?
Einstein’s aphorism mentioned above wasn’t meant to colour the text, but we particularly believe it corresponds, in a way, to the topic we plan to introduce during the seminar. Consequently, paraphrasing it, the aphorism...