R-Tip: Missing weights, weighted.mean

There are instances where sampling weights are not only unknown but cannot be known (unless one makes certain unsavory assumptions). Under those circumstances, weights for certain respondents can be ‘missing’. Typically, the strategy there is to code those weights as 0. However, if you retain them as NA, weighted.mean etc. is wont to give you NA as an… Read more →
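A minimal sketch of the behavior described above, using made-up values (the vectors `x` and `w` are hypothetical). It assumes base R's `weighted.mean`, where `na.rm = TRUE` drops missing values of `x` but not missing weights:

```r
# Hypothetical data: the third respondent's weight is missing.
x <- c(10, 20, 30)
w <- c(1, 2, NA)

# na.rm = TRUE removes missing values of x only, so the NA weight still propagates.
res_na <- weighted.mean(x, w, na.rm = TRUE)

# Recode missing weights as 0 so that those respondents contribute nothing.
w0 <- ifelse(is.na(w), 0, w)
res_zero <- weighted.mean(x, w0)
```

Here `res_na` is NA, while `res_zero` is the weighted mean over the two respondents with known weights.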

R – Recoding variables reliably and systematically

Survey datasets typically require a fair bit of repetitive recoding of variables. Errors in recoding can be reduced by writing functions carefully (see some tips here) and by automating and systematizing the naming of variables and the application of the recode function (which can be custom) – fromlist Read more →
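One way such systematic recoding might look is sketched below. The name `fromlist` echoes the excerpt; everything else (`tolist`, `recode_01`, the data frame `survey`, and the coding scheme in which 9 means missing) is a hypothetical illustration, not the post's own code:

```r
# Hypothetical survey data; variable names and codes are made up for illustration.
survey <- data.frame(q1 = c(1, 2, 5, 9), q2 = c(3, 9, 4, 1))

# Custom recode function: treat 9 as missing, then rescale 1-5 to 0-1.
recode_01 <- function(x) {
  x[x == 9] <- NA
  (x - 1) / 4
}

fromlist <- c("q1", "q2")            # variables to recode
tolist   <- paste0(fromlist, "_r")   # systematic names for the recoded versions

# Apply the same recode to every listed variable and store under the new names.
survey[tolist] <- lapply(survey[fromlist], recode_01)
```

Keeping the originals and writing the recoded versions under a predictable suffix makes it easy to check each recode against its source column.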

Working with modestly large datasets in R

Even modestly large (< 1 GB) datasets can quickly overwhelm modern personal computers. Working with such datasets in R can be even more frustrating because of how R uses memory. Here are a few tips on how to work with modestly large datasets in R. Setting Memory Limits: On Windows, right-click R and in the Target field set maximum… Read more →
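The excerpt is cut off before the remaining tips, so the sketch below is only one general way to reduce memory use in base R, not necessarily a tip from the post: skip unneeded columns at read time and check how much memory an object occupies. The file and its columns are hypothetical stand-ins for a larger dataset:

```r
# Write a small hypothetical CSV to a temporary file to stand in for a large one.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:5, age = c(30, 41, 25, 58, 37), notes = letters[1:5]),
          tmp, row.names = FALSE)

# colClasses = "NULL" skips a column at read time, so it is never loaded.
dat <- read.csv(tmp, colClasses = c("integer", "numeric", "NULL"))

# object.size() reports the memory used by an object.
print(object.size(dat), units = "Kb")

# Clean up the temporary file.
unlink(tmp)
```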

Reducing Errors in Survey Analysis

Analysis of survey data is hard to automate because of the immense variability across survey instruments – different variables, differently coded, and named in ways that often defy even the most fecund imagination. What often replaces complete automation is ad-hoc automation – quickly coded functions to recode a variable to lie within a particular range, etc. applied by intelligent people… Read more →
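As an illustration of the kind of quickly coded function mentioned above, here is a hedged sketch of a helper that rescales a variable to lie between 0 and 1 and fails loudly when its assumptions are violated. The name `zero1` and its arguments are hypothetical:

```r
# Hypothetical helper: rescale a numeric variable to lie between 0 and 1.
zero1 <- function(x, minx = min(x, na.rm = TRUE), maxx = max(x, na.rm = TRUE)) {
  stopifnot(is.numeric(x), maxx > minx)
  res <- (x - minx) / (maxx - minx)
  # Fail loudly if any value falls outside the intended range.
  stopifnot(all(res >= 0 & res <= 1, na.rm = TRUE))
  res
}

scaled <- zero1(c(1, 3, 5, NA))
```

Passing the scale endpoints explicitly, for example `zero1(x, minx = 1, maxx = 5)`, turns an unexpected code such as 7 into an error rather than a silently wrong value.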

help(R): matrix indexing

In R, some want to treat a matrix like a data frame. It is a bad idea. And no – one cannot use the dollar sign to pick out a column of a matrix. The underlying data structure for a matrix in R is a vector, whereas a data.frame object in R is of type list (check using typeof(object)) and it carries… Read more →
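A small sketch of the difference, using a made-up matrix (the object names `m` and `df` are hypothetical):

```r
m  <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("a", "b", "c")))
df <- as.data.frame(m)

typeof(m)   # "integer": a vector with a dim attribute
typeof(df)  # "list"

# Columns of a matrix are picked out with bracket indexing.
col_a <- m[, "a"]

# The dollar sign raises an error on a matrix; capture it rather than stop.
dollar_fails <- inherits(try(m$a, silent = TRUE), "try-error")

# The dollar sign works on a data frame because it is a list of columns.
df_col_a <- df$a
```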