The Human and the Machine: Semi-automated approaches to ML

12 Apr

For a class of problems, combining algorithms with human input produces the best solution. For instance, the software for recreating shredded documents that won the DARPA award three years ago used “human[s] [to] verify what the computer was recommending.” The same insight is used in character recognition tasks. I have used it to build software for matching dirty data; the software was used to merge shape files with precinct-level electoral returns.
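A minimal sketch of the idea, not the software mentioned above: the machine proposes a best match for each dirty name, accepts the near-certain ones, and routes the uncertain ones to a human. The function names, the two thresholds, and the precinct names are all made up for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """String similarity in [0, 1] after trivial normalization."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def propose_matches(left_names, right_names, accept_at=0.95, review_at=0.6):
    """Split proposed matches into auto-accepted, human-review, and unmatched.

    accept_at and review_at are illustrative thresholds, not tuned values.
    """
    accepted, needs_review, unmatched = [], [], []
    for name in left_names:
        # The algorithm recommends the closest candidate ...
        best = max(right_names, key=lambda cand: similarity(name, cand))
        score = similarity(name, best)
        if score >= accept_at:
            accepted.append((name, best, score))
        elif score >= review_at:
            # ... and a human verifies the recommendation when it is uncertain.
            needs_review.append((name, best, score))
        else:
            unmatched.append(name)
    return accepted, needs_review, unmatched


# Hypothetical precinct names from a shape file and from electoral returns.
shapefile_precincts = ["Ward 1 Precinct 3", "WARD 2 PCT 1", "Riverside"]
returns_precincts = ["ward 1 precinct 3", "Ward 2 Precinct 1", "Hillcrest East"]

accepted, needs_review, unmatched = propose_matches(
    shapefile_precincts, returns_precincts
)
```

The human only looks at the middle bucket, which is where the machine is least sure and where a person's error-prone but unbiased judgment adds the most.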

The class of problems for which human input proves useful has one essential attribute: humans produce unbiased, if error-prone, estimates for these problems. So, for instance, it would be unwise to use humans for the ‘last mile’ of lending decisions, where human judgment may well be biased (see also this NYT article). (Whether human estimates are unbiased is something you may want to verify with training data.)

Big Data Algorithms: Too Complicated to Communicate?

11 Apr

“A decision is made about you, and you have no idea why it was done,” said Rajeev Date, an investor in data-science lenders and a former deputy director of the Consumer Financial Protection Bureau.

From NYT: If Algorithms Know All, How Much Should Humans Help?

The assertion that there is no intuition behind decisions made by algorithms strikes me as silly. So does the related assertion that such intuition cannot be communicated effectively. We can back out the logic for most algorithms. Heuristic accounts of the logic (e.g., which variables were important) can be given yet more easily. For instance, for seemingly hard-to-interpret methods such as ensembles, intuition about which variables are important can be obtained in the same way as it is for methods like bagging. And even when specific points are hard to convey, the meta-logic of the system can be explained to the end user.
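One way to get a heuristic account of which variables matter is permutation importance: shuffle one input at a time and see how much the predictions degrade. The sketch below assumes only that we can call the model; the `black_box` lending model, its coefficients, and the variable names are hypothetical.

```python
import random


def mean_squared_error(predict, rows, targets):
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(predict, rows, targets, n_repeats=20, seed=0):
    """Average increase in error when each column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mean_squared_error(predict, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Replace column j with its shuffled version, leave the rest alone.
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(
                mean_squared_error(predict, shuffled, targets) - baseline
            )
        importances.append(sum(increases) / n_repeats)
    return importances


def black_box(row):
    """Hypothetical opaque model: uses income and debt, ignores shoe size."""
    income, debt, shoe_size = row
    return 0.8 * income - 0.5 * debt


rng = random.Random(1)
rows = [[rng.random(), rng.random(), rng.random()] for _ in range(200)]
targets = [black_box(r) for r in rows]  # synthetic outcomes for illustration

importances = permutation_importance(black_box, rows, targets)
```

A ranking like this ("income mattered most, shoe size not at all") is the kind of account that can be shown to the person the decision is about.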

What is true, however, is that it isn’t being done. For instance, the WSJ, covering the Orion routing system at UPS, reports:

“For example, some drivers don’t understand why it makes sense to deliver a package in one neighborhood in the morning, and come back to the same area later in the day for another delivery. …One driver, who declined to speak for attribution, said he has been on Orion since mid-2014 and dislikes it, because it strikes him as illogical.”

WSJ: At UPS, the Algorithm Is the Driver

Communication architecture is an essential part of all human-focused systems. What to communicate, and when, are important questions that deserve careful thought. The default cannot be no communication.

The lack of systems that communicate intuition behind algorithms strikes me as a great opportunity. HCI people — make some money.

Estimating Hillary’s Missing Emails

11 Apr


55000/(365*4) ~ 37.7 per day. That seems a touch low for a Secretary of State. Possible reasons:

1. Clinton may have used more than one private server
2. Clinton may have sent emails from other servers to unofficial accounts of other state department employees

Lower bound for missing emails from Clinton:

  1. Take a small weighted random sample (weighting seniority more) of top state department employees.
  2. Go through their email accounts on the state dep. server and count # of emails from Clinton to their state dep. addresses.
  3. Compare it to # of emails to these employees from the Clinton cache.
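The three steps above can be sketched as follows. This is only an illustration under stated assumptions: the employee records, seniority weights, and email counts are invented, and the estimate is a lower bound because it only covers the sampled employees.

```python
import random


def sample_employees(employees, k, seed=0):
    """Weighted sample without replacement; more senior means more likely."""
    rng = random.Random(seed)
    pool = list(employees)
    chosen = []
    for _ in range(min(k, len(pool))):
        weights = [e["seniority"] for e in pool]
        pick = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        chosen.append(pool.pop(pick))
    return chosen


def missing_lower_bound(sampled, server_counts, cache_counts):
    """Emails found on the department server but absent from the cache."""
    total = 0
    for e in sampled:
        on_server = server_counts.get(e["name"], 0)
        in_cache = cache_counts.get(e["name"], 0)
        total += max(0, on_server - in_cache)
    return total


# Hypothetical data: emails from Clinton to each employee's official address.
employees = [
    {"name": "A", "seniority": 5},
    {"name": "B", "seniority": 3},
    {"name": "C", "seniority": 1},
]
server_counts = {"A": 40, "B": 10, "C": 25}  # counted on the department server
cache_counts = {"A": 35, "B": 10, "C": 5}    # counted in the released cache

sampled = sample_employees(employees, k=2)
bound_from_sample = missing_lower_bound(sampled, server_counts, cache_counts)
bound_from_everyone = missing_lower_bound(employees, server_counts, cache_counts)
```

Sampling more employees can only raise the bound, so the sample size is a trade-off between effort and how tight the bound is.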

To propose amendments, go to the GitHub gist

Some Hard Feelings: Feelings Towards Some Racial and Ethnic Groups in 4 Countries

8 Aug

According to YouGov surveys in Switzerland, the Netherlands, and Canada, and the 2008 ANES in the US, Whites in each of the four countries feel, on average, fairly cold toward Muslims and people from Muslim-majority regions, giving them an average thermometer rating of less than 50 on a 0 to 100 scale (Feelings towards different ethnic, racial, and religious groups). However, in Europe, Whites’ feelings toward Romanians, Poles, and Serbs and Kosovars are scarcely any warmer, and sometimes cooler. Meanwhile, Whites feel relatively warm toward East Asians.

Liberal politicians are referred to more often in news

8 Jul

The median Democrat referred to in television news is to the left of the House Democratic Median, and the median Republican politician referred to is to the left of the House Republican Median.

Click here for the aggregate distribution.

And here’s a plot of the top 50 politicians cited in news. The plot shows a strongly right-skewed distribution with a bias towards executives.

News data: UCLA Television News Archive, which includes closed-caption transcripts of all national, cable and local (Los Angeles) news from 2006 to early 2013. In all, there are 155,814 transcripts of news shows.

Politician data: Database on Ideology, Money in Politics, and Elections (see Bonica 2012).

Taking out data from local news channels or removing Obama does little to change the pattern in the aggregate distribution.

(No) Value Added Models

6 Jul

This note is in response to some of the points raised in the Angoff Lecture by Ed Haertel.

The lecture makes two big points:
1) Teacher effectiveness ratings based on current Value Added Models are ‘unreliable.’ They are actually much worse than just unreliable; see below.
2) Simulated counterfactuals of gains that can be got from ‘firing bad teachers’ are upwardly biased.

Three simple tricks (one discussed; two not) that may solve some of the issues:
1) Estimating teaching effectiveness: Where possible, random assignment of children to classes. I would only do within school comparisons. Inference will still not be clean (SUTVA violations, though they can be dealt with). Simply cleaner.

2) Experiment with teachers. Teach some teachers some skills. Estimate the impact. Rather than teacher level VAM, do a skill level VAM. Teachers = sum of skills + idiosyncratic variation.

3) For current VAMs: To create better student-level counterfactuals, use modern ML techniques (SVMs, neural networks, etc.) and lots of data (past student outcomes, past classmate outcomes, etc.), and cross-validate to tune. Have a good idea about how good the prediction is. The strategy may be applicable to other settings.
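A minimal sketch of the third trick, with a simple k-nearest-neighbors predictor standing in for the SVMs or neural networks mentioned above. The data are synthetic, the two features (a student's past score and past classmates' mean score) and their coefficients are made up, and the cross-validated error is what tells us how good the counterfactual prediction is.

```python
import random


def knn_predict(train_x, train_y, x, k):
    """Predict the outcome as the mean outcome of the k closest students."""
    nearest = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def cross_validated_mse(xs, ys, k, n_folds=5):
    """Out-of-sample squared error, averaged over n_folds held-out folds."""
    errors = []
    for fold in range(n_folds):
        test_idx = set(range(fold, len(xs), n_folds))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        for i in test_idx:
            prediction = knn_predict(train_x, train_y, xs[i], k)
            errors.append((prediction - ys[i]) ** 2)
    return sum(errors) / len(errors)


# Synthetic students: [past score, past classmates' mean score] -> outcome.
rng = random.Random(0)
xs = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(150)]
ys = [0.7 * x[0] + 0.2 * x[1] + rng.gauss(0, 0.5) for x in xs]

# Tune the one hyperparameter by cross-validation.
cv_errors = {k: cross_validated_mse(xs, ys, k) for k in (1, 5, 15, 40)}
best_k = min(cv_errors, key=cv_errors.get)
```

The gap between a student's actual outcome and this kind of out-of-sample prediction is what would then be attributed to the classroom, so the size of `cv_errors[best_k]` bounds how much weight that attribution can bear.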

Other points:
1) Haertel says, “Obviously, teachers matter enormously. A classroom full of students with no teacher would probably not learn much — at least not much of the prescribed curriculum.” A better comparison perhaps would be to self-guided technology. My sense is that as technology evolves, teachers will come up short in a comparison between teachers and advanced learning tools. In most of the third world, I think it is already true.

2) It appears no model for calculating teacher effectiveness scores yields identified estimates. And it appears we have no clear understanding of the nature of the bias. Pooling biased estimates over multiple years doesn’t recommend itself to me as a natural fix to this situation. And I don’t think calling this situation ‘unreliability’ of scores is right. These scores aren’t valid. The fact that pooling across years ‘works’ may suggest the issues are smaller. But then again, bad things may be happening to some kinds of teachers, especially if people are doing cross-school comparisons.

3) Fade-out concern is important given the earlier 5*5 =25 analysis. My suspicion would be that attenuation of effects varies depending on the timing of the shock. My hunch would be that shocks at an earlier age matter more; they decay more slowly.

(No) Missing daughters of Indian Politicians

29 Jun

Indian politicians get a bad rap. They are thought to be corrupt, inept, and sexist. Here we check whether there is prima facie evidence for sex-selective abortion.

According to data on the Indian Government ‘Archive’, 15th Lok Sabha members (csv) had, in all, 696 sons and 666 daughters for a sex ratio of 957 females to 1000 males. Progeny of members from states with the most skewed sex ratios (Punjab, Haryana, and Jammu and Kashmir) had a surprisingly healthy sex ratio of 1245 females to 1000 males. Sex ratios of children of BJP and INC members were 930/1000 and 965/1000 respectively. Rajya Sabha members (csv) had 271 sons and 272 daughters for a sex ratio of 1004 females to 1000 males. Not only was there little evidence of sex-selective abortion, the data also suggest that fertility rates were modest. Lok Sabha members had on average 2.5 kids while members of Rajya Sabha had on average 2.2 kids.
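For anyone checking the arithmetic, the two headline ratios follow directly from the counts above. The helper name is mine; the counts are the ones reported in the text.

```python
def females_per_1000_males(daughters, sons):
    """Sex ratio expressed as females per 1000 males (unrounded)."""
    return 1000 * daughters / sons


lok_sabha_ratio = females_per_1000_males(daughters=666, sons=696)    # about 956.9
rajya_sabha_ratio = females_per_1000_males(daughters=272, sons=271)  # about 1003.7
```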