Friday, January 11, 2013

All about lottery

Germany is scanning emails by the millions.

From our Know-What-They-Know department.

Let's cut to the chase: we claim that such activities are, from a social and behavioral point of view, nothing more than the lottery mentality in action. The same thinking that makes millions of us go out and buy a ticket every week despite the odds.

They gamble with our money, though.

The good news is, of around 37 million emails and data connections [sic] scanned by German government agencies in 2010, only 213 provided information that led to further investigation.

These numbers bounced around in the German media a few months ago when a report was released describing activities in 2010.

[TheEditor says: We suspended disbelief about the total number reported because only the suspension makes us buy an average of around 1 item scanned per adult under 60 years of age per year.]

The good news is that German politicians have asked questions as to the sense of such an operation with results that require so many zeros after the decimal point that only math majors can understand the numbers.
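For readers who are not math majors, here is a minimal sketch of the arithmetic, using only the two figures quoted above (around 37 million items scanned, 213 leads). The variable names are ours, and the totals are the ones reported in the media, not independently verified.

```python
# Figures as reported in the German media for 2010.
scanned = 37_000_000   # emails and data connections scanned
flagged = 213          # items that led to further investigation

hit_rate = flagged / scanned

print(f"hit rate as a fraction:   {hit_rate:.8f}")
print(f"hit rate as a percentage: {hit_rate * 100:.6f} %")
print(f"one lead per {scanned // flagged:,} items scanned")
```

In other words, roughly one lead for every 174,000 items scanned, which is where all those zeros after the decimal point come from.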

More good news is that the major email providers (Google, Microsoft, or Yahoo) or, just for size, Facebook continue to know more about you than the German government does.

Even more good news of sorts is that mostly the careless of wording, the stressed of life, and the innocent get flagged by the scanners. This is a better description than that of the police chief who said something like "only the dumb" get caught like this.

Of course, it is really, really bad for those flagged incorrectly, but - for now - this still leaves a little bit of room for more level-headed people to make themselves heard.

How they do it:
The media say they use keywords. We would add that there likely are also "key phrases" and that there must be some context processing with basic stemming and conjugation, too, for this to work. Hey, maybe they also use "mood detection" to figure out if the writer is happy, angry, very angry, or homicidal angry. The German media say they use about 2000 keywords related to terrorism, about 300 around human trafficking and about 13 000 [sic] around proliferation. The actual size and the content of the lists are obviously secret.
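To make the keyword-plus-stemming idea concrete, here is a toy sketch. Everything in it is our assumption: the watch list is a harmless placeholder (the real lists are secret), the function names are made up, and the suffix stripping is far cruder than what any real stemmer, let alone a government scanner, would do.

```python
import re

# Hypothetical placeholder watch list; the real lists are not public.
KEYWORDS = {"lottery", "gamble", "ticket"}

# Suffixes are tried in this order; only the first match is stripped.
SUFFIXES = ("ing", "ed", "es", "s", "e")


def crude_stem(word):
    """Strip one common English suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Stem the watch list once so "gamble" and "gambling" meet at "gambl".
STEMMED_KEYWORDS = {crude_stem(k) for k in KEYWORDS}


def flag(text):
    """Return the words in `text` whose crude stem is on the watch list."""
    words = re.findall(r"[a-z]+", text.lower())
    return sorted({w for w in words if crude_stem(w) in STEMMED_KEYWORDS})


print(flag("They gambled on two tickets."))
print(flag("Nothing to see here"))
```

The first message gets flagged for "gambled" and "tickets" even though neither word is literally on the list; the second passes. Key phrases, context, and mood detection would need a good deal more than this.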

If you are a regular of this blog, you know how much we like experiments.

In this field, the best ones are the ones done without fanfare.

In the spirit of research, we at the K-landnews took the liberty of guessing a few keywords, using these in our blog/on Twitter, and seeing where our audience was coming from.

Hidden somewhere near posts like "Pay to pee" or a Dr. Who quickie, we put some Easter Eggs for the processing algorithms.

We believe that we have seen a likely keyword response from a few countries, after eliminating known audience members and making an unscientific, gut-feeling allowance for chance hits as well as for extremely bored males in their twenties googling at 2 AM their local time.

Which means, yes, someone seems to be watching sometimes, but it is not (yet) a huge deal if you do not live in one of the high-surveillance, high-censorship countries. Or if your name is John Smith or Mike Miller, or if you avoid some scary words.

We also wonder if some nerds or geeks with decent programming skills and an idea of natural languages might, even as we write this, be working on compiling one or more lists of "words to avoid". Not because they want to do wrong but because they cling to a sense of privacy.

While technology is getting better all the time, simple ways to make life harder for the keyword hounds, off the top of our heads, include:
use typos;
use creative punctuation to kill the "end of sentence" detection;
write in a local dialect or any language not currently supported by Google Translate and similar tools;
transliterate, for example Russian to Latin with a few typos sprinkled in.
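A quick sketch of why the typo trick works against naive matching, and why "technology is getting better all the time" matters. The watch list is again a made-up placeholder and the function names are ours; the fuzzy variant uses `difflib.get_close_matches` from the Python standard library, and the 0.8 cutoff is an arbitrary choice for illustration.

```python
import difflib
import re

# Hypothetical placeholder watch list; the real lists are not public.
KEYWORDS = ["lottery", "gamble", "ticket"]


def exact_hits(text):
    """Flag only words that appear on the list letter for letter."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w in KEYWORDS]


def fuzzy_hits(text, cutoff=0.8):
    """Flag words that are merely similar to a listed word."""
    words = re.findall(r"[a-z]+", text.lower())
    return [
        w for w in words
        if difflib.get_close_matches(w, KEYWORDS, n=1, cutoff=cutoff)
    ]


message = "I bought a lotery tikcet"
print(exact_hits(message))   # exact matching
print(fuzzy_hits(message))   # similarity matching
```

The misspelled message slips past the exact matcher, while the fuzzy matcher still picks up both "lotery" and "tikcet". So typos only help against the laziest of scanners, and a lower cutoff catches more typos at the price of more innocent words getting flagged.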

Hardcore privacy fans use encryption anyway.

One more thing:
You will know whether typos and grammar errors really make the scanners barf when some politician or expert somewhere calls for criminalisation of typos.
