Sigh-tations

1 May

In 2010, Google estimated that approximately 130M books had been published.

As a species, we still know very little about the world. But what we know already far exceeds what any of us can learn in a lifetime.

Scientists are acutely aware of this point. They must specialize, as the chances of learning all the key facts about anything but the narrowest of domains are slim. They must also resort to shorthand to communicate what is known and what is new. The shorthand that they use is the citation. However, this vital building block of science is often rife with problems. The three key problems with how scientists cite are:

1. Cite in an imprecise manner. "This broad claim is supported by X." Or, "our results are consistent with XYZ." ("Our results are consistent with" reflects directional thinking rather than thinking in terms of effect size. That means all sorts of effects are "consistent," even those 10x as large.) For an example of how I think work should be cited, see Table 1 of this paper.

2. Do not carefully read what they cite. This includes misstating key claims and citing retracted articles approvingly (see here). The corollary is that scientists do not closely scrutinize the papers they cite, with the extent of scrutiny explained by how much they agree with the results (see the next point). For a provocative example, see here.

3. Cite in a motivated manner. Scientists 'up' the thesis of articles they agree with, for instance, misstating correlation as causation. And they blow up minor methodological points in articles whose results are 'inconsistent' with their own. (A brief note on motivated citations: here.)

How Do We Know?

17 Aug

How can fallible creatures like us know something? The scientific method is about answering that question well. To answer the question well, we have made at least three big innovations:

1. Empiricism. But no privileged observer. What you observe should be reproducible by all others.

2. Open to criticism: If you are not convinced about the method of observation, the claims being made, criticize. Offer reason or proof.

3. Mathematical Foundations: Reliance on math or formal logic to deduce what claims can be made if certain conditions are met.

These innovations, along with two more, have allowed us to 'scale.' Foremost among the innovations that allow us to scale is our ability to work together. And our ability to preserve information on stone, paper, and electrons allows us to collaborate with and build on the work done by people who are now dead. The same principle that allows us to build structures as gargantuan as the Hoover Dam and entire cities allows us to learn about complex phenomena. And that takes us to the final principle of science.

Peer to Peer

20 Mar

Peers are equals, except as reviewers, when they are more like capricious dictators. (Or when they are members of a peerage.)

We review our peers’ work because we know that we are all fallible. And because we know that the single best way we can overcome our own limitations is by relying on well-motivated, informed, others. We review to catch what our peers may have missed, to flag important methodological issues, to provide suggestions for clarifying and improving the presentation of results, among other such things. But given a disappointingly long history of capricious reviews, authors need assurance. So consider including in the next review a version of the following note:

Reviewers are fallible too. So this review doesn't come with an implied contract to follow every ill-advised suggestion or suffer the consequences. If you disagree with something, I would appreciate a small note. But rejecting a bad proposal is as important as accepting a good one.

Fear no capriciousness. And I wish you well.

Motivated Citations

13 Jan

The best kind of insight is the ‘duh’ insight—catching something that is exceedingly common, almost routine, but something that no one talks about. I believe this is one such insight.

The standards for citing congenial research (that supports the hypothesis of choice) are considerably lower than the standards for citing uncongenial research. It is an important kind of academic corruption. And it means that the prospects of teleological progress toward truth in science, as currently practiced, are bleak. An alternate ecosystem that provides objective ratings for each piece of research is likely to be more successful. (As opposed to the ‘echo-system’—here are the people who find stuff that ‘agrees’ with what I find—in place today.)

An empirical implication of the point is that the average ranking of the journals in which cited congenial research is published is likely to be lower than that of the journals in which cited uncongenial research is published. Though, for many of the 'conflicts' in science, all sides of the conflict will have top-tier publications, which is to say that the measure is somewhat crude.

The deeper point is that readers generally do not judge the quality of the work cited in support of specific arguments, taking many of the arguments at face value. This, in turn, means that the role of journal rankings is somewhat limited. Or, more provocatively, to improve science, we need to make sure that even research published in low-ranked journals is of sufficient quality.

The Case for Ending Closed Academic Publishing

21 Mar

A few commercial publishers publish a large chunk of top-flight academic research. And they earn a pretty penny doing so. The standard operating model of the publishers is as follows: pay the editorial board no more than $70-$100k, pay for typesetting and publishing, and in turn get the copyrights to academic papers. Then charge already locked-in institutional customers—university and government libraries—and ordinary scholars extortionary rates. The model is gratuitously dysfunctional.

Assuming there are no long-term contracts with the publishers, the system ought to be rapidly dismantled. But if dismantling is easy, creating something better may not be. As it happens, it is. A majority of the cost of publishing is in printing on paper. The twenty-first century has made printing large, organized bundles on paper largely obsolete; those who need paper copies can print them at home. Beyond that, open-source software for administering a journal already exists. And the model of a single editor with veto powers seems anachronistic; editing duties can be spread around much like peer review. Unpaid peer review can survive as it always has, though better mechanisms can be thought about. If some money is still needed for administration, it could be raised easily by charging a nominal submission tax, waived where the author self-identifies as being unable to pay.

Bad Remedies for Bad Science

3 Sep

Lack of reproducibility is a symptom of science in crisis. An eye-catching symptom to be sure, but hardly the only one vying for attention. Recent analyses suggest that nearly two-thirds of the (relevant set of) articles published in prominent political science journals condition on post-treatment variables (see here). Another analysis suggests that half of the relevant set of articles published in prominent neuroscience journals interpret the difference between a significant and a non-significant result as evidence that the difference between the two is significant (see here). What is behind this? My guess: poor understanding of statistics, poor editorial processes, and poor strategic incentives.

  1. Poor understanding of statistics among authors.
  2. Poor understanding of statistics among editors, reviewers, etc. This creates two problems:
    • Cannot catch inevitable mistakes: Whatever the failings of authors, they aren’t being caught during the review process. (It would be good to know how often reviewers are the source of bad recommendations.)
    • Creates Bad Incentives: If editors are misinformed, say, to look for significant results, authors will be motivated to deliver exactly that.
      • If you know what the right thing to do is but also know that there is a premium for doing the wrong thing (see the previous sub-point), you may use a lack of transparency as a way to cater to bad incentives.
  3.  Psychological biases:
    • Motivated Biases: Scientists are likely biased toward their own theories. They wish them to be true. This may lead to motivated skepticism and scrutiny. The same principle likely applies to reviewers, who catch on to the storytelling and give a wider pass to stories that jibe with their own.
  4. Production Pressures: Given production pressures, there is likely sloppiness in what is produced. For instance, it is troubling how often retracted articles are cited after the publication of the retraction notice.
  5. Weak Penalties for Being Sloppy: Without easy ways for others to find mistakes, it is easier to be sloppy.

Given these problems, the big solution I can think of is improving training. Another would be programs that highlight some of the psychological biases and drive clarity on the purpose of science. The troubling part is that the most commonly proposed solution is transparency. As Gelman points out, transparency is neither necessary nor sufficient to prevent the “statistical and scientific problems” that underlie “the scientific crisis” because:

  1. Emphasis on transparency would merely mean transparent production of noise (last column on page 38).
  2. Transparency makes it a tad easier to spot errors but doesn't provide incentives to learn from errors. And a rule of thumb is to fix upstream issues rather than downstream issues.

Gelman also points out the negative externalities of transparency as a be-all fix. When you focus on transparency, secrecy is conflated with dishonesty.

Stemming the Propagation of Error

2 Sep

About half of the (relevant set of) articles published in neuroscience mistake the difference between a significant result and an insignificant result for evidence that the two are significantly different (see here). It would be good to see if the articles that make this mistake, for instance, received fewer citations after publication of the article revealing the problem. If not, we probably have more work to do. We probably need to improve the ways by which scholars are alerted to problems in the articles they are reading (and interested in citing). And that may include building different interfaces for the various 'portals' (Google Scholar, JSTOR, etc., and journal publishers) that scholars heavily use. For instance, creating UIs that thread reproduction attempts, retractions, articles finding serious errors within the original article, etc.

Reviewing the Peer Review

24 Jul

Update: Current version is posted here.

Science is a process. And for a good deal of time, peer review has been an essential part of that process. Looked at independently, by people with no experience of it, it makes a fair bit of sense. For there is only one well-known way of increasing the quality of an academic paper: additional independent thinking. And who better to provide it than engaged, trained colleagues?

But this seemingly sound part of the process is creaking. Today, you can't bring two academics together without them venting their frustration about the broken review system. The complaint is that the current system is a lose-lose-lose. All the parties — the authors, the editors, and the reviewers — lose lots and lots of time. And the change in quality as a result of suggested changes is variable, generally small, and sometimes negative. Given how critical peer review is to scientific production, it deserves closer attention, preferably with good data.

But data on peer review aren’t available to be analyzed. Thus, some anecdotal data. Of the 80 or so reviews that I have filed and for which editors have been kind enough to share comments by other reviewers, two things have jumped at me: a) hefty variation in quality of reviews, b) and equally hefty variation in recommendations for the final disposition. It would be good to quantify the two. The latter is easy enough to quantify.

The reliability of the review process has implications for how many reviewers we need to reliably accept or reject the same article. Counter-intuitively, increasing the number of reviewers per manuscript may not increase the overall burden of reviewing. Partly because everyone knows that the review process is so noisy, there is an incentive to submit articles that people know aren't good enough. Some submitters likely reason that there is a reasonable chance of a 'low quality' article being accepted at top places. Thus, low-reliability peer review systems may actually increase the number of submissions. A greater number of submissions, in turn, increases editors' and reviewers' loads, and hence reduces the quality of reviews and lowers the reliability of recommendations still further. It is a vicious cycle. And the answer may be as simple as making the peer review process more reliable. At any rate, these data ought to be publicly released. Side by side, editors should consider experimenting with the number of reviewers to collect more data on the point.
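To make the reliability point concrete, here is a minimal sketch in R of how agreement between reviewers could be quantified, assuming a hypothetical data frame with one row per manuscript and one column per reviewer's recommendation (the column names, the sample data, and the use of the irr package are all illustrative assumptions, not an existing dataset):

# Quantify how often two reviewers agree, and how much of that agreement
# exceeds chance. `reviews` is hypothetical data.
library(irr)

reviews <- data.frame(
  r1 = c("reject", "major", "accept", "reject", "minor"),
  r2 = c("major",  "major", "minor",  "reject", "accept")
)

# Raw agreement: share of manuscripts where both reviewers give the same call
mean(reviews$r1 == reviews$r2)

# Chance-corrected agreement (Cohen's kappa) for two reviewers
kappa2(reviews[, c("r1", "r2")])

With more than two reviewers per manuscript, Fleiss' kappa (also in the irr package) would be the analogous summary.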

Quantifying the quality of reviews is a much harder problem. What do we mean by a good review? A review that points to important problems in the manuscript and, where possible, suggests solutions? Likely so. But this is much trickier to code. And perhaps there isn't as much of a point to quantifying it. What is needed, perhaps, is guidance. Much like child-rearing, there is no manual for reviewing. There really should be. What should reviewers attend to? What are they missing? And most critically, how do we incentivize this process?

When thinking about incentives, there are three parties whose incentives we need to restructure — the author, the editor, and the reviewer. Authors' incentives can be restructured by making the process less noisy, as discussed above. And by making submissions costly. All editors know this: electronic submissions have greatly increased the number of submissions. (It would be useful to study what the consequences of the move to electronic submission have been for the quality of articles.) As for the editors: if the editors are not blinded to the author (and the author knows this), they are likely to factor in the author's status in choosing the reviewers, in whether or not to defer to the reviewers' recommendations, and in making the final call. Thus we need triple-blinded pipelines.

Whether or not the reviewer's identity is known to the editor when s/he is reading the reviewer's comments also likely affects the reviewer's contributions — in both good and bad ways. For instance, there is every chance that junior scholars, in trying to impress editors, file more negative reviews than they would if they knew that the editor had no way of tying the identity of the reviewer to the review. Beyond altering anonymity, one way to incentivize reviewers would be to publish the reviews publicly, perhaps as part of the paper. Just like online appendices, we can have a set of reviews published online with each article.

With that, some concrete suggestions beyond the ones already discussed. Expectedly — given they come from a quantitative social scientist — they fall into two broad brackets: releasing and learning from the data already available, and collecting more data.

Existing Data

A fair bit of data can be potentially released without violating anonymity. For instance,

  • Whether manuscript was desk rejected or not
  • How many reviewers were invited
  • Time taken by each reviewer to accept (NA for those from whom you never heard)
  • Total time in review for each article (till R&R or reject) (and a separate set of columns for each revision)
  • Time taken by each reviewer
  • Recommendation by each reviewer
  • Length of each review
  • How many reviewers did the author(s) suggest?
  • How often were suggested reviewers followed up on?

In fact, much of the data submitted in multiple-choice question format can probably be released easily. If editors are hesitant, a group of scholars can come together and crowdsource the collection of review data. People can deposit their reviews and the associated manuscripts in a specific format on a server. And to maintain confidentiality, we can sandbox these data, allowing scholars to run a variety of pre-screened scripts on them. Or else journals can institute similar mechanisms.
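As a sketch of what such a 'specific format' might look like, here is one hypothetical record layout in R, mirroring the fields listed above (every column name is an assumption, not an existing standard):

# One row per review; identifiers are anonymized before deposit.
review_record <- data.frame(
  manuscript_id = "ms-0001",         # anonymized manuscript identifier
  desk_rejected = FALSE,
  reviewers_invited = 3,
  days_to_accept_invite = 4,         # NA if the reviewer never responded
  days_in_review = 62,               # till R&R or reject
  recommendation = "major revision",
  review_length_words = 850,
  reviewer_suggested_by_author = FALSE
)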

Collecting More Data

  • In economics, people have tried to institute shorter deadlines for reviewers, to the effect of reducing review times. We can try that out.
  • In terms of incentives, it may be a good idea to try out cash, but also perhaps to experiment with a system where reviewers are told that their comments will be public. I, for one, think it would lead to more responsible reviewing. It would also be good to experiment with triple-blind reviewing.

If you have additional thoughts on the issue, please propose them at: https://gist.github.com/soodoku/b20e6d31d21e83ed5e39

Here's to making advances in the production of science and our pursuit of truth.

Reducing Errors in Survey Analysis

22 May

Analysis of survey data is hard to automate because of the immense variability across survey instruments—different variables, coded differently, and named in ways that often defy even the most fecund imagination. What often replaces complete automation is ad-hoc automation—quickly coded functions, e.g., recoding a variable to lie within a particular range, applied by intelligent people frustrated by the lack of complete automation and bored by the repetitiveness of the task. Ad-hoc automation attracts mistakes, as functions are often coded without rigor, and useful alerts and warnings are usually missing.

One way to reduce mistakes is to prevent them from happening. Carefully coded functions with robust error checking and handling, alerts, and passive verbose outputs that are cognizant of our own biases and bounded attention can reduce mistakes. Functions that are used most frequently typically need the most attention.

Let’s use the example of recoding a variable to lie between 0 and 1 in R to illustrate how to code a function. Some things to consider:

  1. Data type: Is the variable numeric, ordinal, or categorical? Let’s say we want to constrain our function to handle only numeric variables. Some numeric variables may be coded as ‘character.’ We may want to seamlessly deal with these issues, and possibly issue warnings (or passive outputs) when improper data types are used.
  2. Range: The range that the variable takes in the data may not span the entire domain. We want to account for that, but perhaps seamlessly by printing out the range that the variable takes and by also allowing the user to input the true range.
  3. Missing Values: A variety of functions we may rely on when recoding our variable may fail (quietly) when fronted with missing values, for example, range(x). We may want to alert the user to the issue but still handle missing values seamlessly.
  4. A user may not see the actual data so we may want to show the user some of the data by default. Efficient summaries of the data (fivenum, mean, median, etc.) or displaying a few initial items may be useful.

A function that addresses some of the issues:


zero1 <- function(x, minx = NA, maxx = NA) {
  # Coerce character vectors (numbers stored as text) to numeric
  if (is.character(x)) x <- as.numeric(x)
  # Constrain the function to numeric input
  stopifnot(is.numeric(x))
  # Alert the user to missing values; they are handled seamlessly below
  if (anyNA(x)) message("Note: ", sum(is.na(x)), " missing value(s) in x.")
  # Passive outputs: show the first few values and the observed range
  print(head(x))
  print(paste("Range:", paste(range(x, na.rm = TRUE), collapse = " ")))
  # Rescale using the user-supplied range if both bounds are given,
  # otherwise using the observed range
  if (!is.na(minx) && !is.na(maxx)) {
    (x - minx) / (maxx - minx)
  } else {
    (x - min(x, na.rm = TRUE)) / (max(x, na.rm = TRUE) - min(x, na.rm = TRUE))
  }
}

These tips also apply to canned functions available in R (and to those writing them) and to functions in other statistical packages that do not normally display alerts or other secondary information that may reduce mistakes. One can always build on canned functions. For instance, the recode function (car package) can be wrapped so that it passively displays the correlation between the recoded variable and the original variable by default.
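For instance, a minimal sketch of such a wrapper, assuming numeric input and using the car package (the wrapper name and the example recode string are illustrative):

# Wrap car::recode so that it passively reports how closely the recoded
# variable tracks the original one (only meaningful for numeric variables).
library(car)

recode_verbose <- function(x, recodes, ...) {
  res <- car::recode(x, recodes, ...)
  if (is.numeric(x) && is.numeric(res)) {
    print(paste("Correlation with original:",
                round(cor(x, res, use = "complete.obs"), 3)))
  }
  res
}

# Example: collapse a 1-3 scale to 1-2 (hypothetical data)
# recode_verbose(c(1, 2, 3, 2, NA), "3=2")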

In addition to writing better functions, one may also want to check post hoc. But a caveat about post hoc checks: post hoc checks are only good at detecting aberrations among the variables you test, and they are costly.

  1. Using prior knowledge:

    1. Identify beforehand how some variables relate to each other. For example, education is typically correlated with political knowledge, race with partisan preferences, etc. Test these hypotheses. In some cases, these can also be diagnostic of sampling biases.
    2. Over the course of an experiment, you may have hypotheses about how variables change across time. For example, constraint typically increases across attitude indices over the course of a treatment designed to produce learning. Test these priors.
  2. Characteristics of the coded variable: If using multiple datasets, check to see if the number of levels of a categorical variable is the same across datasets. If not, investigate. Cross-tabulations across merged data are a quick way to diagnose problems, which can range from varying codes for missing data to missing levels. (See the sketch after this list.)
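A minimal sketch of these checks in R, assuming two hypothetical merged survey data frames, df1 and df2, that share a categorical variable (all names here are illustrative):

# Compare the levels of a categorical variable across two datasets and flag
# levels that appear in one but not the other (e.g., stray missing-data codes).
check_levels <- function(x, y) {
  lx <- sort(unique(as.character(x)))
  ly <- sort(unique(as.character(y)))
  list(only_in_first = setdiff(lx, ly), only_in_second = setdiff(ly, lx))
}

# check_levels(df1$party_id, df2$party_id)
# table(df1$party_id, useNA = "ifany")  # cross-tabulation including NA codes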

It is ‘possible’ that people may warm up to the science of warming

25 Jan

In science we believe

Belief in science is likely partly based on scientists’ ability to predict.
As M.S. notes, climate scientists accurately predicted in the late 1980s that temperatures were going to rise. Hence, for people who are aware of that (like himself), belief in climate science is greater.

Similarly, unpredictability in weather (as opposed to climate), e.g., snowstorms, which are typically widely covered in the media, may lower people's belief in climate science.

Possibility of showers in the afternoon
Over conversations with lay and not-so-lay people, I have observed that people sometimes conflate probability and possibility. In particular, they typically over-weight the probability of a possible event and then use that inflated weight to form a judgment. When I ask them to assign a probability to the event they identify as a possibility, they almost always assign very low probabilities, and their opinion comes to better reflect this realization.

Think of it like this: a possibility for people, once raised (by them or others), is very real. Only consciously thinking about the probability of that possibility allows them to get out of funky thinking.

Something else to note: politicians use 'possibility' as a persuasion tool, e.g., 'there is a possibility of a terror attack,' etc. This is something I have dealt with before, but I leave the task of tracking down where to readers motivated to pursue the topic.

Causality and Generalization in Qualitative and Quantitative Methods

19 Nov

Science deals with the fundamental epistemological question of how we can claim to know something. The quality of the system, forever open to challenge, determines any claims of epistemic superiority that the scientific method may make over other competing claims of gleaning knowledge from data.

The extent to which claims are solely arbitrated on scientific merit is limited by a variety of factors, as outlined by Lakatos, Kuhn, and Feyerabend, resulting in at best an inefficient process and at worst, something far more pernicious. I ignore such issues and focus narrowly on methodological questions around causality and generalizability in qualitative methods.

In science, the inquiry into generalizable causal processes is greatly privileged. There is a good reason for that. Causality and generalizability can provide the basis for intervention. However, not all kinds of data lend themselves readily to imputing causality or even to making generalizable descriptive statements. For example, causal inference in most historical research remains out of bounds. Keeping this in mind, I analyze how qualitative methods within the Social Sciences (can) interrogate causality and generalizability.

Causality

Hume thought that there was no place for causality within empiricism. He argued that the most we can find is that “the one [event] does actually, in fact, follow the other.” Causality is nothing but an illusion occasioned when events follow each other with regularity. That formulation, however, didn’t prevent Hume from believing in scientific theories. He felt that regularly occurring constant conjunctions were a sufficient basis for scientific laws. Theoretical advances in the 200 or so years since Hume have been able to provide a deeper understanding of causality, including a process-based understanding and an experimental understanding.

Donald Rubin defines causal effect as follows: "Intuitively, the causal effect of one treatment, E, over another, C, for a particular unit and an interval of time from t1 to t2 is the difference between what would have happened at time t2 if the unit had been exposed to E initiated at t1 and what would have happened at t2 if the unit had been exposed to C initiated at t1: 'If an hour ago I had taken two aspirins instead of just a glass of water, my headache would now be gone,' or 'because an hour ago I took two aspirins instead of just a glass of water, my headache is now gone.' Our definition of the causal effect of the E versus C treatment will reflect this intuitive meaning."

Note that the Rubin Causal Model (RCM), as presented above, depicts an elementary causal connection between two Boolean variables: one explanatory variable (two aspirins) with a single effect (eliminates headaches). Often, the variables take multiple values. And to estimate the effect of each change, we need a separate experiment. To estimate the effect of a treatment to a particular degree of precision in different subgroups, for example, the effect of aspirin on women and men, the sample size for each group needs to be increased.
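In potential-outcomes notation, Rubin's definition for a single unit i can be written compactly (a sketch; Y_i(E) and Y_i(C) stand for the outcomes at t2 had E or C been initiated at t1):

\tau_i = Y_i(E) - Y_i(C), \qquad \mathrm{ATE} = \mathbb{E}\left[\, Y(E) - Y(C) \,\right]

Since only one of the two potential outcomes is ever observed for a given unit, experiments estimate the average effect over many units rather than the unit-level difference itself.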

The RCM formulation can be expanded to include a probabilistic understanding of causation. A probabilistic understanding of causality means accepting that certain parts of the explanation are still missing. Hence, a necessary and sufficient condition is absent, though attempts have been made to include necessary and sufficient clauses in probabilistic statements. David Papineau (Probabilities and Causes, 1985, Journal of Philosophy) writes, "Factor A is a cause of some B just in case it is one of a set of conditions that are jointly and minimally sufficient for B. In such a case we can write A&X -> B. In general, there will also be other sets of conditions minimally sufficient for B. Suppose we write their disjunction as Y. If now we suppose further that B is always determined when it occurs, that it never occurs unless one of these sufficient sets (let's call them B's full causes) occurs first, then we have (A&X) v Y <-> B. Given this equivalence, it is not difficult to see why A's causing B should be related to A's being correlated with B. If A is indeed a cause of B, then there is a natural inference to Prob(B/A) > Prob(B/-A): for, given A, one will have B if either X or Y occurs, whereas without A one will get B only with Y. And conversely, it seems that if we do find that Prob(B/A) > Prob(B/-A), then we can conclude that A is a cause of B: for if A didn't appear in the disjunction of full causes which are necessary and sufficient for B, then it wouldn't affect the chance of B occurring."

Papineau’s definition is a bit archaic and doesn’t entirely cover the set of cases we define as probabilistically causal. John Gerring (Social Science Methodology: A Criterial Framework, 2001: 127,138; emphasis in original), provides a definition of probabilistic causality: “[c]auses are factors that raise the (prior) probabilities of an event occurring. [Hence] a sensible and minimal definition: X may be considered a cause of Y if (and only if) it raises the probability of Y occurring.”

A still more sensible yet minimal definition of causality can be found in Gary King et al. (Designing Social Inquiry: Scientific Inference in Qualitative Research, 1994: 81-82), “the causal effect is the difference between the systematic component of observations made when the explanatory variable takes one value and the systematic component of comparable observations when the explanatory variable takes on another value.”

Causal Inference in Qualitative and Quantitative Methods

While the above formulations of causality—the Rubin Causal Model, Gerring, and King—seem more quantitative, they can be applied to qualitative methods. We discuss how below.

A parallel understanding of causality, one that is used much more frequently in qualitative social science, is a process-based understanding of causality wherein you trace the causal process to construct a theory. Simplistically, in quantitative methods in the Social Sciences, one often deduces the causal process, while in qualitative methods the understanding of the causal process is learned from deep and close interaction with data.

Both deduction and induction, however, are rife with problems. Deduction privileges formal rules (statistics) that straitjacket the deductive process so that the deductions are systematic and conditional on the veracity of assumptions like the normal distribution of the data, the linearity of the effect, the lack of measurement error, etc. The formal deductive process bestows a host of appealing qualities, like generalizability when an adequate random sample of the population is taken, or even a systematic handle on causal inference. In quantitative methods, the methodological assumptions for deduction are cleanly separated from the data. The same separation between the formal deductive process, with a rather arbitrarily chosen statistical model, and the data, however, makes the discovery process less than optimal and sometimes deeply problematic. Recent research by Ho, Imai, King, and Stuart (Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference, Political Analysis, 2007), and methods like Bayesian Model Averaging (Volinsky), have gone some way toward mitigating problems with model selection.

No such sharp delineation between method and data exists in qualitative research, where data are collected iteratively—in studies using iterative abstraction (Sayer 1981, 1992; Lawson 1989, 1995) or grounded theory (Glaser 1978; Strauss 1987; Strauss and Corbin 1990)—till they explain the phenomenon singled out for explanation. Grounded, data-driven qualitative methods often run the risk of modeling particularistic aspects of the data, which reduces the reliability with which they can come up with a generalizable causal model. This is indeed only one kind of qualitative research. There are others who do qualitative analysis in the vein of experiments, for example, with a 2×2 model, and yet others who test a priori assumptions by analytically controlling for variables in a verbal regression equation to get at the systematic effect of the explanatory variable on the explanandum. Perhaps more than the grounded theory method, this pseudo-quantitative style of qualitative analysis runs the risk of coming to deeply problematic conclusions based on the cases used.

As King et al. (1994: 75, note 1) put it: "[a]t its core, real explanation is always based on causal inferences."

Limiting Discussion to Positivist Qualitative Methods

Qualitative methods can be roughly divided into positivist methods, e.g., case studies, and interpretive methods. I will limit my comments to positivist qualitative methods.

The differences between positivist qualitative and quantitative methods “are only stylistic and are methodologically and substantively unimportant” (King et al., 1994:4). Both methods share “an epistemological logic of inference: they all agree on the importance of testing theories empirically, generating an inclusive list of alternative explanations and their observable implications, and specifying what evidence might infirm or affirm a theory” (King et al. 1994: 3).

Empirical Causal Inference

To impute causality, we either need evidence about the process or an experiment that obviates the need to know the process, though researchers are often encouraged to have a story to explain the process and test variables implicated in the story.

Experimentation provides one of the best ways to reliably impute causality. However, for experiments to have value outside the lab, the treatment must be ecologically valid—it should reflect the typical values that the variables take in the world. For instance, the effect of televised news is best measured with real-life news clips shown in a realistic setting where the participant has control of the remote. Ideally, we also want to elicit our measures in a realistic way, in the form of votes, or campaign contributions, or expressions online. The problem is that most problems in social science cannot be studied experimentally. Brady et al. (2001: 8) write, "A central reason why both qualitative and quantitative research are hard to do well is that any study based on observational (i.e., non-experimental) data faces the fundamental inferential challenge of eliminating rival explanations." I would phrase this differently. It doesn't make social science hard to do. It just means that we have to be okay with the fact that we cannot know certain things. Science is an exercise in humility, not denial.

Learning from Quantitative Methods

  1. Making Assumptions Clear: Quantitative methods often make a variety of assumptions in order to make inferences. For instance, empiricists often use ceteris paribus—all other things being equal, which may mean assigning away everything 'else' to randomization—to make inferences. Others use the assumption that the error term is uncorrelated with the other independent variables to infer that the correlation between an explanatory variable x and a dependent variable y can only be explained as x's effect on y. There are a variety of assumptions in regression models, and penalties for violating each of them. For example, we can analytically think through how education will affect (or not affect) racist attitudes. Analytical claims are based on deductive logic and a priori assumptions or knowledge. Hence, the success of analytical claims is contingent upon the accuracy of the knowledge and the correctness of the logic.
  2. Controlling for things: Quantitative methods often ‘control’ for stuff. It is a way to eliminate an explanation. If gender is a ‘confounder,’ check for variation within men and women. In Qualitative Methods, one can either analytically (or where possible empirically) control for variables or trace the process.
  3. Sampling: Traditional probability sampling theories are built on the highly conservative assumption that we know nothing about the world. And the only systematic way to go about knowing it is through random sampling, a process that delivers 'representative' data on average. Newer sampling theories, however, acknowledge that we know some things about the world and use that knowledge by selectively over-sampling the things (or people) we are truly clueless about and under-sampling where we have a good idea. For example, polling organizations under-sample self-described partisans and over-sample non-partisans (see the sketch after this list). This provides a window for positivist qualitative methods to make generalizable claims. Qualitative methods can overcome their limitations and make legitimate generalizable claims if their sampling reflects the extent of prior knowledge about the world.
  4. Moderators: Getting a handle on the variables that ‘moderate’ the effect of a particular variable that we may be interested in studying.
  5. Sample: One of the problems in qualitative research that has been pointed out is the habit of selecting on the dependent variable. Selection on the dependent variable deviously leaves out cases where, for example, the dependent variable doesn't take extreme values. Selection bias can lead to misleading conclusions not only about causal effects but also about causal processes. It is hence essential not to use a truncated dependent variable in one's analysis. One of the ways one can systematically drill down to causal processes in qualitative research is by starting off with the broadest palette, either in prior research or elsewhere, to grasp the macro-processes and other variables that may affect the case. Then, cognizant of the particularistic aspects of a particular case, analyze the microfoundations or microprocesses present in the system.
  6. Reproducibility: One central point of the empirical method is that there is no privileged observer. What you observe, another should be able to reproduce. So whatever data you collect, and however you draw your inferences, all of it needs to be clearly stated and explained.
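A minimal sketch in R of the weighting logic behind point 3, with made-up shares (the numbers and names are illustrative, not real polling figures):

# Units over-sampled relative to their population share get weights below 1;
# under-sampled units get weights above 1.
population_share <- c(partisan = 0.60, nonpartisan = 0.40)
sample_share     <- c(partisan = 0.40, nonpartisan = 0.60)  # partisans under-sampled here
design_weights <- population_share / sample_share
design_weights  # partisan 1.5, nonpartisan ~0.67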

How Are Academic Disciplines Divided?

18 Jul

The social sciences are split into disciplines like Psychology, Political Science, Sociology, Anthropology, Economics, etc. There is a certain anarchy to the way they are split. For example, while Psychology is devoted to understanding how the individual mind works, and Sociology to the study of groups, Political Science is devoted merely to an aspect of groups—group decision making.

One of the primary reasons the social sciences are divided this way is the history of how they developed. As major figures postulated important variables that constrain the social world, fields took shape around them. The other pertinent variables that explain some of the newer disciplines in the social sciences are changes in technology and, more broadly, changing social problems. For example, the discipline of Communication took shape around the time mass media became popular.

The way the social sciences are currently divided has left them with a host of inefficiencies, which leave them largely inefficacious in a variety of scenarios where they could offer substantive help. Firstly, the containerized way of understanding the social world provides inadequate tools for understanding complex social systems that are shaped by variables ranging from the individual to the institutional. Secondly, the largely discipline-specific theoretical motivations lead academics to concoct elaborate theories that often misstate their applicability in complex ecosystems. We all know how economics never met common sense until recently. It isn't that disciplines haven't tried to bridge the inter-disciplinary divide; they certainly have, by creating sub-disciplines ranging from social psychology (in Psychology) to political psychology (in Political Science), and in fact that is exactly where some of the most exciting research is taking place right now. The problem is that we have been slow to question the larger restructuring of the social sciences. The question then arises: what should we put at the center of our disciplines? The answer is by no means clear to me, though I think it would be useful to develop competencies around primary organizing social structures/institutions.

Role of Social Science

Let me assume away the fact that most social science knowledge will end up in society either through capitalism or through selective uptake by policymakers. Next, we need to evaluate how social science can meaningfully contribute to society. One intuitive way would be to create social engineering departments that are focused on specific social problems. The advice is by no means radical—certainly Education as a discipline has been around for some time, and relatively recently departments (or schools) devoted to Public Health and Environmental Policy have opened up across college campuses. Secondly, these social engineering departments should help offer solutions for real-life problems, much the same way engineering departments affiliated with the natural sciences do, and try experimenting with how, for example, different institutional structures would affect decision making. Lastly, social scientists have a lot more to offer to third-world countries, which have yet to be overrun by brute capitalism. What social science departments need to do is lead more data collection efforts in third-world countries and offer solutions.

Qualitative Vs. Quantitative Methods

9 Jun

Epistemology of Causality

How do we know that something is the 'cause' of something, and how do we impute 'causality' through data?

To impute causality in quantitative models, we rely on the argument that it is unlikely that the change in Y could be explained by anything other than X, since we have 'statistically controlled for other variables.' We 'control' for variables via experiments, or we can do it via regression equations. This allows us to isolate the effect of, say, variable x on y. There are of course some caveats and assumptions that go along with using these methods, but robust experimental designs still allow us to impute causality fairly reliably. Generally, the causal claim is buttressed with a description of a plausible causal pathway. All of the analysis, and the resulting benefits of reliably imputing causality, are predicated on our ability to 'correctly' assign numbers to 'constructs' (the real variables of interest).

Let's analyze now how qualitative methods can impute causality. While it seems reasonable to assume that 'systematic' 'qualitative' analysis of a problem can provide us with a variety of causal explanations and, under most circumstances, a reasonably good idea of how much each of the explanatory variables affects the dependent variable, there are crucial problems and limitations that may induce bias in the analyses. Additionally, we must define what constitutes 'systematic' analysis.

Another thing to keep in mind is that ethics and rigor are not enough to impute causality. What one needs are the right epistemic tools.

A lot of qualitative research is marred by the fact that it 'selects on the dependent variable.' In other words, it sees a dependent variable and then goes sleuthing for possible causal mechanisms. It is hard in that case to impute wider causality between variables because the relationship hasn't been tested for varying levels of X and Y. It is useful to keep in mind that sometimes this is all that we can hope to achieve. Additional problems can emerge from things like selection bias and logical fallacies like post hoc ergo propter hoc. Partly, the way qualitative research is written can also impose its own demands and biases, including demands for narrative consistency.

It is unclear to me whether a system exists to impute causality reliably using qualitative methods. There are, however, some techniques that qualitative methods can borrow from quantitative methods to improve any causal claims that they may be inclined to make: one is to use a representative set of variables; another is to look for 'natural experiments'; and a third is to pay attention to larger sociological issues and iterate through why alternative explanations don't apply as well – a sort of verbal regression equation.

There are of course instances where a deeper, more in-depth analysis of a few cases allows one to get a richer understanding of the issue, but that shouldn't be mistaken for coming up with causes.

Epistemology of generalization in empirical methods

There is very little room to maneuver when we think about a systematic theory of generalization for empirical theories. To generalize, we must either 'know' the fundamental causal mechanisms and how they work under a variety of contextual factors, or use probability sampling. Probability sampling theories are built on the belief that we know nothing about the world. Hence we need to take care to collect data (which ideally map to the constructs) in a way that makes them generalizable to the entire population of interest.

Causal arguments in Qualitative research

For making 'well grounded' causal arguments in qualitative research – say with a small n – one must make the case for the generalizability of the selected cases, use deduction to articulate possible causal pathways, and then bring them together in a 'verbal regression equation' and analyze which of the causal pathways are important – as in likely, or having a large effect size – and which are not.

Epistemic standards in interpretation and methodology

Quantitative methods share a broad repertoire of skills across the disciplines, while comparatively no such common epistemic standards exist across the variety of qualitative sub-streams, which differ radically in terms of what data to look at and how to interpret the data. Common epistemic standards allow for research to be challenged in a variety of ways. From Gay and Lesbian Studies to Feminist Scholarship to others, there is little in common in terms of epistemic standards and how best to interpret things. What we then have is mere incommensurability. Partly, of course, different questions are being asked, but even when the same questions are being asked, there appears to be little consensus as to which explanation is preferred over another. While each new way to "interpret" facts does in some ways expand our understanding of social phenomena, given the incommensurability in epistemic standards, we cannot bring all of them into a qualitative 'verbal regression equation' (my term) through which we can reliably infer the size of the effect of each.

Caveat Lector
The above article deals with the debate between qualitative methods and quantitative methods on a small select sample of issues – generalizability and causality – that are explicitly more tractable through quantitative models. It would be unwise to construe larger points about the relevance of qualitative methods from the article.

Social Science and the Theory of All

22 Apr

Social phenomena, unlike natural phenomena, are bound and morphed not only by nature (evolution, etc.) but also by history, institutions (religious, governance, etc.), and technology, among others. Before I go any further, I would like to issue a caveat: the categories that I mention above are not orthogonal and, in fact, trespass into each other regularly. We can study particular social phenomena in aggregate through disciplines like Political Science, which studies everything from psychology to institutions to history, or study them by focusing on one particular aspect – psychology or genetics – and investigating how each affects multiple social phenomena like politics, communication, etc.

Given the disparate range of fields that try to understand social phenomena, the field is often straddled with multiple competing paradigms and multiple theories within or across those paradigms, with little or no objective criteria on which the theories can be judged. This is not to say that theories are always mutually irreconcilable, for often they are not (though they may be seen as such, which is an artifact of how they are sold), or that favoring one theory automatically implies rejecting others. The success of a theory, hence, often depends on how well it is sold and on the historical proclivities of the age.

Proclivities of an age; theories of an age

Popular paradigms emerge over time and then are discarded for entirely new ones. It is not that the old ones don't hold but that the new ones capture the imagination of the age. Take, for example, the variables that people have chosen to describe culture over the ages – Weber argued religion was culture, Marx argued that political economy was culture, Freud proposed a psychoanalytical take on culture (puritan, liberated, etc.), Carey proposed communication as culture, political theorists have argued institutions are culture, bio-evolutionists argue that cognition and bio-rootedness are primary determinants of culture, tech evangelists have argued technology is culture, while others have argued that infrastructure dictates culture.

It is useful to acknowledge that the popularity of the paradigms used to define culture had something to do with the most important forces shaping culture at that particular time. For example, it is quite reasonable to imagine that Marx's paradigm was a useful one for explaining industrial society (in fact, it continues to be useful), while Carey's paradigm was useful for explaining the results of the rapid multiplication (and accessibility) of communication (mass) media. I would like to reissue the caveat that adopting new paradigms doesn't automatically imply rejecting prior ones. In fact, the intersection of old and new paradigms provides fecund breeding ground for interesting arguments and theories – for example, the political economy of mass media and its impact. Let me illuminate the point with another example from Political Science, which a decade or so ago saw a resurgence of cultural theory on the back of Huntington's theory of the 'clash of civilizations.' Huntington's theory didn't mean an end to traditional paradigms like economic competition; it just postulated that there was another significant variable that needed to be factored into the discourse.

The structure of scientific revolutions

Drawing extensively on historical evidence from the natural sciences, Thomas Kuhn, a Harvard-trained physicist, argued in his seminal book, The Structure of Scientific Revolutions, that science progressed through "paradigm shifts." While the natural sciences paid scant attention to the book, it provoked an existential crisis within the social sciences. To arrive at that crisis point, social scientists made a number of significant leaps (not empirically based) from what Kuhn said – they argued that the growth of social science was anarchic, its judgments historically situated and never objective, and hence that the social sciences were pointless – or, more correctly, had a point but were misguided. This self-flagellation is typical of the social sciences, which have always been more introspective about their role and value in society than the natural sciences, which have always proceeded with the implicit assumption that 'progress' cannot be checked and that eventually what they produce are merely tools in service of humanity. Of course, that is quite bunk and has been exposed as such without making even the slightest dent in research in science and technology. Criticizing the natural sciences, especially the majority that is in service of 'value-free' economics, doesn't take away from the questions that Kuhn posed for the social sciences. Social scientists, in my estimation, put disproportionate emphasis on Kuhn's work. Social science is admittedly far behind in terms of coming up with generalizable theories, but it has been quite successful in identifying macro-variables and phenomena.

The most intractable problem that social scientists need to deal with is answering what the purpose of their discipline is. Is it to describe reality, to critique it, or to engineer alternative realities? If indeed it is all of the above, and I believe it is, then social science must think about melding its often disparate traditions – theory and practice.

Rorty and the structure of philosophical revolutions

Richard Rorty, in his book Philosophy and the Mirror of Nature, launches a devastating attack on philosophy – especially its claims to any foundational insights. Rorty traces the history of philosophy and finds that the discipline is embedded, much more deeply than social science, in the milieu of paradigm shifts – philosophers from different ages not only offer different "foundational" insights but often deal with different problems altogether.

Battling at the margins

Those who argue that the singular purpose of social science should be to normatively critique society and offer alternative paradigms are delusional. Understanding how a society works (or how institutions and people work) is important for crafting interventions – be it drug policy or engineering new governance systems. Normative debates oftentimes are nothing but frivolous debates at the margins. The broad, overarching problems of today don't need normative theorists devoted to their analysis – though I don't dispute their contribution – for they are evident and abundantly clear. When we take out the vast middle of what needs to be decided, normative theory becomes a battle at the margins.

Post-positivist theorizing; and the sociology of research

The most significant challenges for social science as a discipline lie within the realm of how the discipline aggregates research and moves forward, and how that process is muzzled by a variety of factors.
Imre Lakatos sees the "history of science in terms of a continuous competition between alternative research programs rather than of successive conjectures and refutations on the one hand, or of total paradigm-switches on the other." Lakatos argues that any research program possesses a kernel of theoretical principles that are taken as fixed and hence create a 'negative heuristic' that forbids the release of anomalous results; instead, scientists are directed to create a "protective belt" of auxiliary assumptions intended to secure the correctness of the theoretical principles at the core. Finally, a 'positive heuristic' is at work to "Defend and extend!" (Little, 1981)

Post-positivist scientific philosophy, like that put forward by Kuhn and Lakatos, raises larger questions about the nature (and viability) of the scientific enterprise. While we may have a firmer grasp of what we mean by a good scientific theory, we are still floundering when it comes to creating an ecosystem that fosters good social science and creates a rational and progressive research agenda (Little, 1981). We must analyze the sociology and political economy of journal publication as the whole venture becomes increasingly institutionalized and as careerism, etc., become more pronounced.