Clarifai(ng) the Future of Clarifai; Some Thoughts

31 Dec

Clarifai is a promising AI start-up. In a short(ish) time, it has made major progress on an important problem. And it is rapidly rolling out products with lots of business potential. But there are still some things that it could do.

As I understand it, the base version of Clarifai API is trying to do two things at once: a) learn various recognizable patterns in images b) rank the patterns based on ‘appropriateness’ and probab_true. I think Clarifai would have to split these two things over time and allow people to input what abstract dimensions are ‘appropriate’ for them. As the idiom goes, an image is a thousand words. In an image, there can be information about social class, race, and country, but also shapes, patterns, colors, perspective, depth, time of the day etc. And Clarifai should allow people to pick dimensions appropriate for the task. Though, defining dimensions would be hard. But that shouldn’t stymie the efforts. And ad hoc advances may be useful. For instance, one dimension could be abstract shapes and colors. Another could be the more ‘human’ dimension etc.

Extending the logic, Clarifai should support the building of abstract data science applications that solve a particular problem. For instance, say a user is only interested in learning about whether the photo features a man or a woman. And the user wants to build a Clarifai based classifier. (That person is me. Task is inferring gender of first names. See here.) Clarifai could in principle allow the user to train a classifier that uses all other information in the images, including jewelry, color, perspective, etc. and provide an out of sample error for that particular task. The crucial point is allowing users fuller access to what Clarifai can do and then letting users manage it to their ends. To that end again, input about user objectives needs to be built into the API. Basic hooks could be developed for classification and clustering inputs.

More generally, Clarifai should eventually support more user inputs and a greater variety of outputs. Limiting the product to tagging is a mistake.

There are three other general directions for Clarifai to go into. A product that automatically sections an image into multiple images and tags each section would be useful. This would allow, for instance, to count the number of women in a photo. Another direction to go would be to provide the ‘best’ set of tags that collectively describe a set of images. (It may seem like violating the spirit of what I note above but it needn’t — a user could want just this.) By the same token, Clarifai could build general purpose discrimination engines — a list of tags that distinguishes image(s) the best.

Beyond this, the obvious. Clarifai can also provide synonyms of tags to make tags easier to use. And it could allow users to specify if they want, say tags in ‘UK English’ etc.

Congenial Invention and the Economy of Everyday (Political) Conversation

31 Dec

Communication comes from the Latin word communicare, which means `to make common.’ We communicate not only to transfer information, but also to establish and reaffirm identities, mores, and meanings. (From my earlier note on a somewhat different aspect of the economy of everyday conversation.) Hence, there is often a large incentive for loyalty. More generally, there are three salient aspects to most private interpersonal communication about politics — shared ideological (or partisan) loyalties, little knowledge of, and prior thinking about political issues, and a premium for cynicism. The second of these points — ignorance — cuts both ways. It allows for the possibility of getting away with saying something that doesn’t make sense (or isn’t true). And it also means that people need to invent stuff if they want to sound smart etc. (Looking stuff up is often still too hard. I am often puzzled by that.)

But don’t people know that they are making stuff up? And doesn’t that stop them? A defining feature of humans is overconfidence. And people often times aren’t aware of the depth of the wells of their own ignorance. And if it sounds right, well it is right, they reason. The act of speaking is many a time an act of invention (or discovery). And we aren’t sure and don’t actively control how we create. (Underlying mechanisms behind how we create — use of ‘gut’ are well-known.) Unless we are very deliberate in speech. Most people aren’t. (There generally aren’t incentives to be.) And find it hard to vet the veracity of the invention (or discovery) in the short time that passes between invention and vocalization.

I Recommend It: A Recommender System for Scholarship Discovery

1 Oct

The rate of production of scholarship has never been higher. And while our ability to discover relevant scholarship per unit of time has never kept pace with the production of knowledge, it has also risen sharply—most recently, due to Google scholar.

The efficiency of discovery of relevant scholarship, however, has plateaued over the last few years, even as the rate of production of scholarship has kept its steady upward climb. Part of the reason why the growth has plateaued is because current ways of doing thing cannot be made considerably more efficient very quickly. New growth in rate of discovery will need knowledge discovery systems to get more data on the user’s needs, and access to structured databases of academic production.

The easiest next step would perhaps be to build a collaborative recommendation system. In particular, think of a system that takes your .bib file, trawls a large database of citation lists, for example, JSTOR or PLoS, and produces recommendations for scholarship you may have missed. The logic of a collaborative recommender system is pretty simple: learn from articles which have cited similar scholarship as you. If we have meta data on scholarship, for instance, sub-field, actual text of an article or even the abstract, we could recommend based on the extent to which two articles cite the same kind of scholarly article. Direct elicitations (search terms are but one form) from the user can also be added to guide the recommendations. And meta characteristics, for instance, page rank of a piece of scholarship, could be used to order recommendations.

Companies with Benefits (or) Potential Welfare Losses of Benefits

27 Sep

Many companies offer employees ‘benefits.’ These include paying for healthcare, investment plans, company gym, luncheons etc. (Just ask a Silicon Valley tech. employee for the full list.)

But why ‘benefits’? And why not cash?

A company offering a young man zero down healthcare plan seems a bit like within-company insurance. Post Obamacare — it also seems a bit unnecessary. (My reading of Obamacare is that it just mandates companies pay for the healthcare but doesn’t mandate that how they pay for it. So cash payments ought to be ok?)

For investment, the reasoning strikes me as thinner still. Let people decide what they want to do with their money.

In many ways, benefits look a bit as ‘gifts’ — welfare reducing but widespread.

My recommendation: just give people cash. Or give them an option to have cash.

The technology giveth, the technology taketh

27 Sep

Riker and Ordershook formalized the voting calculus as:

pb + d > c

p = probability of vote ‘mattering’
b = size of the benefit
d = sense of duty
c = cost of voting

They argued that if pb + d exceeds c, people will vote. Otherwise not.

One can generalize this simple formalization for all political action.

A fair bit of technology has been invented to reduce c — it is easier than ever to follow the news, to contact your representative, etc. However, for a particular set of issues, if you reduce c for everyone, you are also reducing p. For as more people get involved, less does the voice of any single person matter. (There are still some conditionalities that I am eliding over — for instance, reduction in c may matter more for people who are poorer etc. and may have an asymmetric impact.)

Technologies invented to exploit synergy, however, do not suffer the same issues. Think Wikipedia, etc.

Beyond Anomaly Detection: Supervised Learning from `Bad’ Transactions

20 Sep

Nearly every time you connect to the Internet, multiple servers log a bunch of details about the request. For instance, details about the connecting IPs, the protocol being used, etc. (One popular software for collecting such data is Cisco’s Netflow.) Lots of companies analyze this data in an attempt to flag `anomalous’ transactions. Given the data are low quality—IPs do not uniquely map to people, information per transaction is vanishingly small—the chances of building a useful anomaly detection algorithm using conventional unsupervised methods are extremely low.

One way to solve the problem is to re-express it as a supervised problem. Google, various security firms, security researchers, etc. everyday flag a bunch of IPs for various nefarious activities, including hosting malware (passive DNS), scanning, or actively . Check to see if these IPs are in the database, and learn from the transactions that include the blacklisted IPs. Using the model, flag transactions that look most similar to the transactions with blacklisted IPs. And validate the worthiness of flagged transactions with the highest probability of being with a malicious IP by checking to see if the IPs are blacklisted at a future date or by using a subject matter expert.

Bad Science: A Partial Diagnosis And Some Remedies

3 Sep

Lack of reproducibility is a symptom of science in crisis. An eye-catching symptom to be sure, but hardly the only one vying for attention. Recent analyses suggest that nearly two-thirds of the (relevant set of) articles published in prominent political science journals condition on post-treatment variables (see here.) Another set of analysis suggests that half of the relevant set of articles published in prominent neuroscience journals treat difference in significant and non-significant result as the basis for the claim that difference between the two is significant (see here). What is behind this? My guess: poor understanding of statistics, poor editorial processes, and poor strategic incentives.

  1. Poor understanding of statistics: It is likely the primary reason. For it would be good harsh to impute bad faith on part of those who use post-treatment variables as control or treating difference between significant and non-significant result as significant. There is likely a fair bit of ignorance — be it on the part of authors or reviewers. If it is ignorance, then the challenge doesn’t seem as daunting. Let us devise good course materials, online lectures, and teach. And for more advanced scholars, some outreach. (And it may involve teaching scientists how to write-up their results.)

  2. Poor editorial processes: Whatever the failings of authors, they aren’t being caught during the review process. (It would be good to know how often reviewers are actually the source of bad recommendations.) More helpfully, it may be a good idea to create small questionnaires before submission that alert authors about common statistical issues.

  3. Poor strategic incentives: If authors think that journals are implicitly biased towards significant findings, we need to communicate effectively that it isn’t so.

Some Aspect of Online Learning Re-imagined

3 Sep

Trevor Hastie is a superb teacher. He excels at making complex things simple. Till recently, his lectures were only available to those fortunate enough to be at Stanford. But now he teaches a free introductory course on statistical learning via the Stanford online learning platform. (I browsed through the online lectures — they are superb, the quality of his teaching, undiminished.)

Online learning, pregnant with rich possibilities, however, has met its cynics and its first failures. The proportion of students who drop-off after signing up for these courses is astonishing (but then so are the sign-up rates). And some very preliminary data suggests that compared to some face-to-face teaching, students enrolled in online learning don’t perform as well. But these are early days and much can be done to refine the systems and to also promote the interests of thousands upon thousands of teachers who have much to contribute to the world. In that spirit, some suggestions:

Currently, Coursera, edX and similar such ventures mostly attempt to replicate an off-line course online. The model is reasonable but doesn’t do justice to the unique opportunities of teaching online:

The Online Learning Platform: For Coursera etc. to be true learning platforms, they need to provide rich APIs for allowing development of both learning and teaching tools (and easy ways for developers to monetize these applications). For instance, professors could access to visualization applications and students access to applications that puts them in touch with peers interested in peer-to-peer teaching. The possibilities are endless. Integration with other social networks and other communication tools at the discretion of students are obvious future steps.

Learning from Peers: The single teacher model is quaint. It locks out contributions from other teachers. And it locks out teachers from potential revenues. We could turn it into a win-win. Scout ideas and materials from teachers and pay them for their contributions, ideally as a share of the revenue — so that they have steady income.

The Online Experimentation Platform: So many researchers in education and psychology are curious and willing to contribute to this broad agenda of online learning. But no protocols and platforms have been established to exploit this willingness. The online learning platforms could release more data at one end. And at the other end, they could create a platform for researchers to option ideas that the community and company can vet. Winning ideas can form basis of research that is implemented on the platform.

Beyond these ideas, there are ideas to do with courses that enhance student’s ability to succeed in these courses. Providing students ‘life skills’ either face-to-face or online, teaching them ways to become more independent students, can allow them to better adapt to the opportunities and challenges of learning online.

Stemming the Propagation of Error

2 Sep

About half of the (relevant set of) articles published in neuroscience mistake difference between a significant result and an insignificant result as evidence for the two being significantly different (see here). It would be good to see if the articles that make this mistake, for instance, received fewer citations post publication of the article revealing the problem. If not, we probably have more work to do. We probably need to improve ways by which scholars are alerted about the problems in articles they are reading (and interested in citing). And that may include building different interfaces for the various ‘portals’ (Google scholar, JSTOR etc., and journal publishers) that scholars heavily use. For instance, creating UIs that thread reproduction attempts, retractions, articles finding serious errors within the original article, etc.

Unlisted False Negatives: Are 11% Americans Unlisted?

21 Aug

A recent study by Simon Jackman and Bradley Spahn claims that 11% of Americans are ‘unlisted.’ (The paper has since been picked up by liberal media outlets like the Think Progress.)

When I first came across the paper, I thought that the number was much too high for it to have any reasonable chance of being right. My suspicions were roused further by the fact that the paper provided no bounds on the number — no note about measurement error in matching people across imperfect lists. A galling omission when the finding hinges on the name matching procedure, details of which are left to another paper. What makes it to the paper is this incredibly vague line: “ANES collects …. bolstering our confidence in the matches of respondents to the lists.” I take that to mean that the matching procedure was done with the idea of reducing false positives. If so, the estimate is merely an upper bound on the percentage of Americans who could be unlisted. That isn’t a very useful number.

But reality is a bit worse. To my questions about false positive and negative rates, Bradley Spahn responded on Twitter, “I think all of the contentious cases were decided by me. What are my decision-theoretic properties? Hard to say.” That line covers one of the most essential details of the matching procedure, a detail they say the readers can find “in a companion paper.” The primary issue is subjectivity. But not taking adequate account of the relevance of ‘decision theoretic’ properties to the results in the paper grates.