Data Editing and Imputation in Business Surveys Using “R”
Journal Title: Revista Romana de Statistica - Year 2014, Vol 62, Issue 2
Abstract
Purpose – Missing data are a recurring problem that can cause bias or lead to inefficient analyses. The objective of this paper is a direct comparison of two statistical software packages, R and SPSS, in order to take full advantage of existing automated methods for data editing and imputation in business surveys (with properly designed consistency rules) as a partial alternative to manual data editing.
Approach – The paper compares methods for editing survey data in R using the ‘editrules’ and ‘survey’ packages, which implement transformations commonly used in official statistics; visualizes missing-value patterns using the ‘Amelia’ and ‘VIM’ packages; applies imputation approaches for longitudinal data using ‘VIMGUI’; and compares the performance of another statistical software package, SPSS, on the same features.
Findings – Business statistics data received by National Institutes of Statistics (NIS) are not ready for direct analysis because of in-record inconsistencies, errors and missing values in the collected data sets. The appropriate automatic methods from R packages make it possible to identify the erroneous fields in edit-violating records and to verify the results after the imputation of missing values, giving users a flexible, less time-consuming approach that is easier to automate in R than with SPSS macro syntax, even in situations where macros are very handy.
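The workflow named in the abstract (consistency rules, a missing-value pattern, imputation, then re-verification) can be illustrated with a minimal sketch. It is written in plain Python, not with the R packages the paper uses, and it is not the paper's method. The records, the field names (`turnover`, `costs`, `profit`), the three edit rules and the choice of median imputation are all assumptions made for illustration.

```python
from statistics import median

# Hypothetical business-survey records; None marks a missing value.
records = [
    {"id": 1, "turnover": 100.0, "costs": 60.0, "profit": 40.0},
    {"id": 2, "turnover": 80.0, "costs": 50.0, "profit": 45.0},   # breaks the balance edit
    {"id": 3, "turnover": None, "costs": 30.0, "profit": 20.0},   # turnover is missing
    {"id": 4, "turnover": -5.0, "costs": 10.0, "profit": -15.0},  # negative turnover
]

# Illustrative consistency rules: (name, fields involved, predicate).
EDITS = [
    ("turnover_nonneg", ("turnover",), lambda r: r["turnover"] >= 0),
    ("costs_nonneg", ("costs",), lambda r: r["costs"] >= 0),
    ("balance", ("turnover", "costs", "profit"),
     lambda r: abs(r["turnover"] - r["costs"] - r["profit"]) < 1e-9),
]


def violated_edits(record):
    """Return the names of edits the record fails.

    An edit that involves a missing field cannot be evaluated and is skipped.
    """
    failed = []
    for name, fields, check in EDITS:
        if any(record[f] is None for f in fields):
            continue
        if not check(record):
            failed.append(name)
    return failed


def missing_pattern(rows, fields):
    """Count missing values per field."""
    return {f: sum(r[f] is None for r in rows) for f in fields}


def impute_median(rows, field):
    """Return copies of the rows with missing `field` filled by the observed median."""
    fill = median(r[field] for r in rows if r[field] is not None)
    return [dict(r, **{field: fill}) if r[field] is None else dict(r) for r in rows]


if __name__ == "__main__":
    print(missing_pattern(records, ("turnover", "costs", "profit")))
    imputed = impute_median(records, "turnover")
    for row in imputed:
        # Re-check every record after imputation, as the abstract recommends.
        print(row["id"], violated_edits(row))
```

In this toy data set the median-imputed turnover for record 3 does not satisfy the balance rule, which shows why results should be re-verified against the edit rules after imputation.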
Authors and Affiliations
Elena Romascanu
Career Management - Professional Training and Promotion
Specialists in the field of career guidance design and develop professional training and promotion programs, taking into account a series of factors that are expected to contribute to the success of the program. Succe...
The Gross Domestic Product Evolution by the End of June 2013
In this paper, the research focuses on the evolution of the GDP in recent years. The data are collected from official sources, mainly the publications of the National Institute of Statistics. The research also emphasizes...
Identification Of Financial Instruments – Important Step in Building Portfolios
Constructing any portfolio begins with identifying the financial instruments to be traded and the timing for entering the capital market (i.e. the optimal timing of trading). This is the stage in which the market analysi...
THEORETICAL AND PRACTICAL STATISTICS*: TERRITORIAL SERIES/OF SPACE – SYSTEM OF INDICATORS AND INDICES, SUGGESTIVE GRAPHICS REPRESENTATIONS
Territorial statistical series have been and still are used to analyze phenomena according to the space or place where they occur, and are of special importance at the macroeconomic level. In our opinion, to understand how the local s...
Estimation and Variance Decomposition in a Small-size DSGE Model
The purpose of this study is to make a grounded estimation-based analysis of the Romanian economy, considering several central economic variables, aggregated at macroeconomic level. By resorting to a basic dynamic stocha...