Today’s Wall Street Journal carries a fascinating and frightening cover story, “Scrapers Dig Deep for Data on Web.” It is fascinating and frightening for one reason alone: people share personal information online at the risk of their privacy and their identity.
Scraping is the “business of tracking people’s activities online and selling details about their behavior and personal interests,” says the story. Nielsen scraped the Web site PatientsLikeMe, which sent Nielsen a cease and desist letter on May 18, 2010. Nielsen agreed to stop, but what damage to the site’s consumer members had already been done? Revealing medication use and medical history, and commiserating about daily life online, is a personal choice.
Data brokers salivate at these new social networking sites and online forums where everyone attacks a topic with relish while including high-level personal information. Legally, “scrapers operate in a gray area,” according to the Wall Street Journal piece. That’s carte blanche to dive full speed ahead into the gazillion bytes of online data deemed fair game because CONSUMERS PUT IT THERE TO BEGIN WITH.
I don’t fault my friends, family and colleagues for enrolling on Facebook or Twitter or LinkedIn (although there are boundaries here, too); I do question the judgment of those willing to trust any online site with such information as personal use of pharmaceuticals, medical conditions and daily states of mind such as depression, suicide attempts or self-abuse. What this screams for is services people can experience and trust while seeking support from peers and counselors, and that’s not happening online.
People who are not attuned to the risks of online engagement fall prey to scrapers, scammers and hackers. Not everyone has the background or understanding to ask the right questions and make the right choice before opening the personal data chasm. In fact, the scams are so sophisticated now that even folks with solid technical knowledge of Web site back ends can become victims.
You can get the companies and sites yourself from the story; however, I’d like to flag them here, too:
- Monster and Facebook continually use technology to block scraping, but who knows how successful they truly are long-term?
- Sentor Anti Scraping System in Sweden is hired to block scrapers on behalf of its Website clients.
- InfoCheckUSA, LLC in Florida began as a background-check firm for screening applicants; it now offers more social information pulled from social networking sites and beyond.
- 80Legs.com in Texas scrapes 1 million Web pages for $101.
- Screenscraper.com in Provo, Utah and two other firms operate in “Happy Valley.”
According to a Sentor quote in the story, the Stockholm company used to block some 2,000 scrapes a month for a customer; that monthly figure has since risen tenfold.
What does that say to us? Caveat emptor: buyer beware. And if it’s free? Run in the other direction!