The Misspelled Universe

18 Apr

How many times have you typed something into Google only to be asked “Did you mean: …”? Next time you reach this page, stay a little longer and take a look at the pages that Google did find. This is your gateway to the parallel universe of misspelled words. Well, let me correct myself: these “misspelled” words may belong to a different language altogether, or they may be rarely used but genuine English words that closely resemble heavily used ones.

An entire gamut of information is being denied to us due to mere errors in spelling. But to deride these as “mere errors in spelling” is to ignore the small minority of people who deliberately misspell words so as to make their pages less publicly accessible. This works as an effective low-tech solution, for every underground society demands obscurity.

Then there are people who exploit misspellings to make their living, e.g., people who search auction sites like eBay for misspelled (or mislabeled) items, which are hopefully underbid. (* eBay now offers a spell-check utility but, surprisingly, some people still do not use it.)

Excepting eBay entrepreneurs, one thing that is clear is that we are “losing” this increasingly vast pool of information that contains misspelled “keywords” (the words we type into a search engine). There is an argument to be made that the quality of a source with misspelled words may itself be poor, and hence we needn’t worry about the “lost” information. Arguably, the frequency of misspelled words in a peer-reviewed journal is much lower than in, say, my blog. ;) The normative question is: does that rightly consign my blog to obscurity?

Internet search is a classic case of finding a needle in a haystack, and search algorithms are built to dispense with as much “clutter” (hay) as fast as possible, leaving a very small minority of websites that are deemed to have genuine value. We are seeing two trends implicit in Google’s search algorithm: most of our search needs are about “popular” items (given a higher rank by Google), and it is progressively harder to find “unpopular” sources. On the face of it, the trend is innocuous and even sensible, but the wider ramifications include information hegemony.

Let us turn the discussion around to sites that use “syntactically correct but meaningless verbiage including common search terms” (sentences like “Indeed, a blind crenelation blasphemously a player inside the stictomys. For example, a whopper behind a ferrocyanide indicates that the saccharinity behind a casino tropez another euphausiacea from another modem.”). People also “Google bomb” (mass posting on blogs/lists to associate a search phrase with a web address). Some sites have in fact automated this by writing programs that automatically go to different blogs/lists and post entries/comments like “poker chips poker – [web address].” This problem is much worse, as it is making it progressively harder for us to find “genuine” (or most popular/reliable) information.

So will there be too much seemingly reliable unreliable information or will we miss a lot of seemingly unreliable reliable information? Chances are that both will happen.

Some Die Young

26 Dec

There are 34 countries in Africa where life expectancy at birth for both men and women is 51 years or less.
The data are from the 2003 UN estimates.
Note that life expectancy at birth is strongly affected by infant mortality.

Life expectancy at birth, in years:

Country                     Men     Women

Angola                      39      41
Benin                       48      51
Botswana                    39      40
Burkina Faso                45      46
Burundi                     40      41
Cameroon                    45      47
Central African Republic    38      40
Chad                        44      46
Rep of Congo                47      50
DR Congo                    41      43
Djibouti                    45      47
Equatorial Guinea           48      50
Ethiopia                    45      46
Guinea                      49      49
Guinea Bissau               44      47
Ivory Coast                 41      41
Kenya                       43      46
Lesotho                     32      38
Liberia                     41      42
Malawi                      37      38
Mali                        48      49
Mozambique                  37      40
Namibia                     43      46
Niger                       46      46
Rwanda                      39      40
Sierra Leone                33      35
Somalia                     45      48
South Africa                45      51
Swaziland                   33      35
Tanzania                    42      44
Togo                        48      51
Uganda                      45      47
Zambia                      33      32
Zimbabwe                    34      33
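
The note about infant mortality can be made concrete with a little arithmetic. What follows is only a rough sketch with made-up inputs: the 10% infant mortality rate, the average age of 0.5 years at infant death, and the average age of 55 at death for everyone else are hypothetical numbers chosen for illustration, not figures from the UN estimates. The function name is also mine. It treats life expectancy at birth as a weighted average of the age at death of two groups.

```python
def life_expectancy_at_birth(infant_mortality, mean_age_infant_death, mean_age_other_death):
    """Weighted average age at death for two groups:
    those who die as infants and everyone else."""
    return (infant_mortality * mean_age_infant_death
            + (1 - infant_mortality) * mean_age_other_death)


# Hypothetical inputs, for illustration only (not UN data).
with_infant_deaths = life_expectancy_at_birth(0.10, 0.5, 55.0)
survivors_only = life_expectancy_at_birth(0.0, 0.5, 55.0)

print(round(with_infant_deaths, 2))  # 49.55
print(round(survivors_only, 2))      # 55.0
```

Under these assumed numbers, a 10% infant mortality rate pulls life expectancy at birth down by more than five years, even though those who survive infancy live to 55 on average. So a low figure in the table does not mean that most adults die at that age.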