Apple Forces Steve Jobs Action Figure Off eBay
Rape Victim Arrested by TSA for Refusing Groping
KVUE in Austin has the horrifying story:
Claire Hirschkind, 56, who says she is a rape victim and who has a pacemaker-type device implanted in her chest, says her constitutional rights were violated.  She says she never broke any laws. But the Transportation Security Administration disagrees.
Hirschkind was hoping to spend Christmas with friends in California, but she never made it past the security checkpoint.
"I can't go through because I have the equivalent of a pacemaker in me," she said.
Hirschkind said because of the device in her body, she was led to a female TSA employee and three Austin police officers. She says she was told she was going to be patted down.
"I turned to the police officer and said, 'I have given no due cause to give up my constitutional rights. You can wand me,' and they said, 'No, you have to do this,'" she said.
Hirschkind agreed to the pat down, but on one condition.
"I told them, 'No, I'm not going to have my breasts felt,' and she said, 'Yes, you are,'" said Hirschkind.
When Hirschkind refused, she says that "the police actually pushed me to the floor, (and) handcuffed me. I was crying by then. They drug me 25 yards across the floor in front of the whole security."
While the TSA's pornoscanners weren't involved in this incident, it shows how prevalent the TSA's aggressive new groping strategy has become for anyone who doesn't clear the first round of security checks.
Hirschkind isn't alone: while sitting in an airport restaurant at LAX on Monday, I spoke with one middle-aged passenger – a conservative woman – who told me how she also set off a metal detector and had to be groped. She described the experience as "humiliating," and compared it to how she was sexually assaulted as a foster child. She cried immediately after the groping, saying it was "violating."
Know your rights this travel season - download our "Know Your Rights" flier, and visit our TSA coverage page for more on pornoscanners and groping.
White House Warns of Supercomputer Arms Race
Building blocks of a scalable web crawler
I recently had the pleasure of serving as a thesis advisor on a work by Marc Seeger, who was completing a portion of his requirements for a Master of Science in Computer Science and Media at Stuttgart Media University. Marc's thesis was titled "Building blocks of a scalable web crawler".
Marc undertook a project for Acquia that I had originally started in 2006: a Drupal site crawler to catalog, as best as possible, the current distribution of Drupal sites across the web. That is a task for which there is no easy answer, as Drupal can be downloaded and used for free (in all senses of the word). The best way to find out how many Drupal sites exist is to develop a crawler that crawls the entire web and counts the Drupal sites one by one.
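To give a feel for what that per-site check involves, here is a minimal sketch of my own (in Python, and not Marc's implementation): it fetches a domain's front page and scans it for common Drupal fingerprints. The specific hints and the user-agent string are illustrative assumptions, and heavily customized sites will slip past heuristics like these.

```python
# Hypothetical sketch: heuristically check whether a domain appears to run Drupal.
# Not the crawler's actual code; the fingerprints below are common conventions,
# not a guarantee.
import urllib.request

DRUPAL_HINTS = (
    b'name="generator" content="Drupal',  # meta generator tag on many Drupal sites
    b"Drupal.settings",                   # JavaScript settings object
    b"/sites/default/files/",             # default public files path
)

def looks_like_drupal(domain, timeout=10):
    """Fetch the front page of `domain` and scan it for Drupal fingerprints."""
    try:
        req = urllib.request.Request(
            "http://" + domain + "/",
            headers={"User-Agent": "drupal-survey-crawler (illustrative)"},
        )
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            # Some Drupal versions announce themselves in an X-Generator header.
            if "Drupal" in (resp.headers.get("X-Generator") or ""):
                return True
            body = resp.read(262144)  # the first 256 KB covers the <head>
    except Exception:
        return False  # unreachable hosts simply count as "not Drupal"
    return any(hint in body for hint in DRUPAL_HINTS)

if __name__ == "__main__":
    print(looks_like_drupal("drupal.org"))
```

Multiply a check like this by the entire web and the real problems become scheduling, storage and fault tolerance rather than the detection itself, which is where Marc's work begins.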
With Marc's help, I was able to resurrect my crawler project. Marc spent six months working with me: three in Germany, where he lives, and three in Boston, where Acquia is based.
During that time, Marc explored suitable architectures for building out, collecting and managing website data on the order of many millions of domains. He examined different backend storage systems (Riak, Cassandra, MongoDB, Redis, CouchDB, Tokyo Cabinet, MySQL, Postgres, ...) and weighed methods of collecting the data while simultaneously allowing search and access. As part of his work, Marc explored a variety of database technologies, schemas and configurations, and experimented with various configurations of Amazon's Elastic Compute Cloud (EC2). Issues common to any large deployment were investigated and analyzed in detail, including HTTP persistent connections, data locking and concurrency control, caching, and performant solutions for large-scale searches. HTTP redirects, DNS issues -- his thesis covers it all, at least in terms of how each of these items impacted the search for an acceptable algorithm.
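To make a couple of those concerns concrete, here is a small sketch of my own (again not Marc's implementation, which handles all of this far more thoroughly): a handful of worker threads pull domains off a queue, fetch each front page, and record the HTTP status behind a lock. It only hints at the concurrency-control and connection-management questions the thesis digs into; the eight-thread pool and the in-memory result dict are placeholders for a real scheduler and storage backend.

```python
# Illustrative only: a tiny multi-threaded fetcher. A production crawler adds
# politeness rules, DNS caching, redirect handling, connection reuse across
# many requests per host, and a durable storage backend.
import http.client
import queue
import threading

domains = queue.Queue()
results = {}                      # domain -> HTTP status (stand-in for real storage)
results_lock = threading.Lock()   # protect the shared dict (the locking issue above)

def worker():
    while True:
        domain = domains.get()
        if domain is None:        # sentinel: shut the worker down
            break
        try:
            conn = http.client.HTTPConnection(domain, timeout=10)
            conn.request("GET", "/", headers={"User-Agent": "survey-crawler (sketch)"})
            status = conn.getresponse().status
            conn.close()
        except Exception:
            status = None
        with results_lock:
            results[domain] = status
        domains.task_done()

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for d in ("example.com", "drupal.org"):
    domains.put(d)
domains.join()                    # wait until every queued domain is processed
for _ in threads:
    domains.put(None)
for t in threads:
    t.join()
print(results)
```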
The crawler has been up and running for a number of months now and has investigated about 100 million domain names. Now that we have crawled that many domains, I plan to start publishing the results.
Marc's work is available as a PDF from his blog post, and it's a good read, even if I'm slightly biased. Thanks for the great work, Marc! Time to look for a couple of new thesis projects, and thesis students who want to work with me for a few months. Ideas welcome!