ElectricNews.net:News:IBM researches Web privacy techniques


::E-COMMERCE
IBM researches Web privacy techniques
Tuesday, June 04 2002, by Matthew Clark
To protect privacy, IBM researchers are looking into ways that let users lie on Internet questionnaires without affecting the accuracy of the overall survey.

On-line merchants and retailers routinely ask users for personal details, such as age, income or marital status, to better gauge who their customers are for marketing purposes. The trouble with these surveys, according to IBM, is that users routinely lie because of privacy concerns.

To overcome this, Dr. Rakesh Agrawal and Dr. Ramakrishnan Srikant, researchers at IBM's Almaden Research Center, have developed a system called Privacy-Preserving Data Mining, which relies on the notion that one's personal data can be protected by being scrambled or randomised prior to being communicated to Web sites. "Our research institutionalises the notion of fibbing on the Internet and does so to preserve the overall reality behind the data," said Dr. Agrawal.

IBM claims that by applying this technique, a retailer can generate highly accurate data models without ever seeing personal information.

For example, a survey could ask users to enter their annual income, in a range of EUR50,000 to EUR150,000. But before that information is transmitted to the Web merchant, IBM's software would add a random offset of between -EUR30,000 and +EUR30,000 to it. The merchant sets the size of this randomisation range.
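The randomisation step described above can be sketched in a few lines of Python. This is only an illustration, not IBM's software: the function name `randomise_income` is hypothetical, and the uniform distribution is an assumption, since the article does not say how the offset is drawn.

```python
import random

def randomise_income(true_income, spread=30_000, rng=random):
    """Return the true income plus a random offset in [-spread, +spread].

    Assumption: the offset is drawn uniformly; the article gives only the range.
    """
    return true_income + rng.uniform(-spread, spread)

# A user earning EUR100,000 reports a scrambled figure instead of the real one.
rng = random.Random(2002)  # seeded only so the sketch is repeatable
reported = randomise_income(100_000, rng=rng)
```

Only `reported` would leave the user's machine in this sketch; the true figure is never sent.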

As a result, one user who earns EUR100,000 could transmit a figure of EUR85,000, while another could report EUR105,000, even though both earn the same amount in reality. No record is kept of either user's true salary. On a per-user basis, the survey results are useless because each reported figure is likely to be inaccurate. But when enough users are surveyed, IBM's software can apply algorithms to compensate for the data scrambling.
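To see why scrambling can be compensated for in aggregate, consider a simplified sketch. It is not IBM's algorithm, and it recovers only two summary figures rather than a full data model. It assumes the offset is uniform on -EUR30,000 to +EUR30,000, which averages to zero and has a variance of spread squared divided by three. The simulated incomes and the name `estimate_true_stats` are illustrative assumptions.

```python
import random
import statistics

SPREAD = 30_000  # half-width of the random offset (range taken from the article)

def estimate_true_stats(reported, spread=SPREAD):
    """Estimate the mean and standard deviation of the true values.

    The offset averages to zero, so the mean of the reported values estimates
    the true mean. Independent noise adds its variance to the data, so
    subtracting the known noise variance estimates the true variance.
    """
    mean = statistics.fmean(reported)
    noise_var = spread ** 2 / 3  # variance of a uniform offset on [-spread, +spread]
    true_var = max(statistics.pvariance(reported) - noise_var, 0.0)
    return mean, true_var ** 0.5

# Simulated survey: the merchant only ever sees the `reported` list.
rng = random.Random(42)  # seeded only so the sketch is repeatable
true_incomes = [rng.uniform(50_000, 150_000) for _ in range(100_000)]
reported = [x + rng.uniform(-SPREAD, SPREAD) for x in true_incomes]

est_mean, est_sd = estimate_true_stats(reported)
```

With this many simulated users, `est_mean` and `est_sd` land close to the statistics of `true_incomes`, even though no single reported value can be trusted.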

"The beauty of this research is that retailers and other Web businesses are able to extract the valuable demographic information they need without necessarily knowing the underlying personal consumer data," said Harriet P. Pearson, IBM's chief privacy officer.

The new technology comes as retailers face increasing pressure to stop collecting information on users. According to a March 2002 survey from the Progress and Freedom Foundation think tank, commercial Internet sites are collecting less information on visitors.

The survey found that among the 100 most popular domains in the US, the proportion collecting personal information fell from 96 percent in May 2000 to 84 percent in December 2001. The proportion of domains using third-party cookies also fell, from 78 percent in May 2000 to 48 percent by the end of 2001.

According to Dr. Agrawal, the Privacy-Preserving Data Mining research has a wide range of potential applications, from medical research and building disease prediction models using randomised individual medical histories to e-commerce and accurate promotions using randomised demographics of individual users.
