ENN - Electric News.net




IBM researches Web privacy techniques
Tuesday, June 04 2002
by Matthew Clark

To protect privacy, IBM researchers are looking into ways that let users lie on Internet questionnaires without affecting the accuracy of the overall survey.

On-line merchants and retailers routinely ask users personal questions, such as age, income or marital status, to better gauge who their customers are for marketing purposes. The trouble with these surveys, according to IBM, is that users often lie because of privacy concerns.

To overcome this, Dr. Rakesh Agrawal and Dr. Ramakrishnan Srikant, researchers at IBM's Almaden Research Center, have developed a system called Privacy-Preserving Data Mining, which relies on the notion that one's personal data can be protected by being scrambled or randomised prior to being communicated to Web sites. "Our research institutionalises the notion of fibbing on the Internet and does so to preserve the overall reality behind the data," said Dr. Agrawal.

IBM claims that by applying this technique, a retailer can generate highly accurate data models without ever seeing personal information.

For example, a survey could ask users to enter an annual income in the range of EUR50,000 to EUR150,000. But before that information is transmitted to the Web merchant, IBM's software would add a random offset of between -EUR30,000 and +EUR30,000 to it. The merchant sets the size of this randomisation range.

As a result, one user who earns EUR100,000 could transmit a figure of EUR85,000, while another could report EUR105,000, even though both earn the same amount in reality. No record is kept of either user's true salary. On a per-user basis, the survey results are useless because each reported figure is likely to be inaccurate. But when enough users are surveyed, IBM's software can apply algorithms to compensate for the data scrambling.
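The article does not describe IBM's algorithms, so the following Python sketch is only an illustration of why scrambled answers can still yield accurate aggregates. It assumes the offset is drawn uniformly from the -EUR30,000 to +EUR30,000 range and recovers just two summary figures, the mean and the spread of incomes. The function names, the simulated population and the uniform-noise assumption are all hypothetical.

```python
import random
import statistics

NOISE_RANGE = 30_000  # the merchant-set randomisation range from the example


def randomise(true_income, rng, noise_range=NOISE_RANGE):
    """Return the figure a user would transmit: true income plus a random offset.

    Assumption: the offset is uniform on [-noise_range, +noise_range].
    """
    return true_income + rng.uniform(-noise_range, noise_range)


def estimate_mean_and_stdev(reported, noise_range=NOISE_RANGE):
    """Estimate the mean and standard deviation of the true incomes.

    The offset averages to zero, so the mean of the reported figures
    estimates the true mean. The offset is independent of the income,
    so its variance (noise_range**2 / 3 for a uniform offset) can be
    subtracted from the variance of the reported figures.
    """
    mean = statistics.fmean(reported)
    noise_variance = noise_range ** 2 / 3
    true_variance = max(statistics.pvariance(reported) - noise_variance, 0.0)
    return mean, true_variance ** 0.5


def simulate(n_users=50_000, seed=0):
    """Build a made-up population and the scrambled figures it would send."""
    rng = random.Random(seed)
    true_incomes = [rng.uniform(50_000, 150_000) for _ in range(n_users)]
    reported = [randomise(income, rng) for income in true_incomes]
    return true_incomes, reported


if __name__ == "__main__":
    true_incomes, reported = simulate()
    est_mean, est_stdev = estimate_mean_and_stdev(reported)
    print("true mean      ", round(statistics.fmean(true_incomes)))
    print("estimated mean ", round(est_mean))
    print("true stdev     ", round(statistics.pstdev(true_incomes)))
    print("estimated stdev", round(est_stdev))
```

Only the reported figures are passed to `estimate_mean_and_stdev`; the true incomes are kept in the simulation purely to check the estimates. With few users the estimates are noisy, which matches the article's point that the approach depends on surveying enough people.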

"The beauty of this research is that retailers and other Web businesses are able to extract the valuable demographic information they need without necessarily knowing the underlying personal consumer data," said Harriet P. Pearson, IBM's chief privacy officer.

The new technology comes as retailers face increasing pressure to stop collecting information on users. According to a March 2002 survey from the Progress and Freedom Foundation think tank, commercial Internet sites are collecting less information on visitors.

That survey said that among the 100 most popular domains in the US, the proportion collecting personal information fell from 96 percent in May 2000 to 84 percent in December 2001. The proportion of domains using third-party cookies also declined, from 78 percent in May 2000 to 48 percent by the end of last year.

According to Dr. Agrawal, the Privacy-Preserving Data Mining research has a wide range of potential applications, from medical research and building disease prediction models using randomised individual medical histories to e-commerce and accurate promotions using randomised demographics of individual users.






© Copyright ElectricNews.Net Ltd 1999-2002.