For enthusiastic technologists, Big Data promises new opportunities, interesting work, and the potential to make a lot of money. But we techies are consumers and citizens too. Like everyone else’s, our personal data is constantly being captured and exchanged through millions of electronic transactions in a data collection market now worth many billions of dollars worldwide.
Each day, the enormously powerful Internet technology companies (Google, Facebook, Twitter, etc.), credit reporting agencies, retailers of every size (including electronic retailers like Alibaba and Amazon), and a myriad of smaller online data tracking groups use ever more powerful and ingenious monitoring technologies to track and record our daily activities – our preferences, likes and dislikes, what we buy, what we search for, who we know, where we’ve been. In fact, an entire industry – now a vital part of the digital economy and therefore important to continued economic growth – has emerged that depends upon the collection and sale of personal data in ways that would not have been countenanced in the past. We should know; we’re the architects of the technology that makes that personal data market possible.
But at the same time, no one should know better than we do that data hackers and the universality of the Internet mean that personal data is never really secure – it can be openly bought and sold and made available around the globe, where it becomes irretrievable and unerasable. In our enthusiasm for all things “Big Data,” are we abetting the data collectors in something that might run against society’s (and our own) best interests?
Download an excerpt from Digital Exhaust: What Everyone Should Know About Big Data, Digitization and Digitally Driven Innovation by Dale Neef.
Dale Neef is a businessman, consultant, speaker, and author specializing in “Big Data” management issues and electronic monitoring and reporting technologies. He has been a technical consultant for the Asian Development Bank, has worked for IBM and Computer Sciences Corporation, and was a fellow at Ernst & Young’s Center for Business Innovation. A frequent contributor to journals, and a regular speaker at technology conferences, he earned his doctorate from Cambridge University, was a research fellow at Harvard, and has written or edited eight books on the economics of knowledge and data management and the use of information technology to mitigate risk. http://www.daleneef.com/
Home Truths About Data Privacy and Big Data
Click through for a closer look at Big Data collection, privacy, and our role in building and sustaining the industry, as identified by Dale Neef, a businessman, consultant, speaker, and author specializing in Big Data management issues.
Big Data Is Usually Personal Data
When people use the term Big Data, they can mean anything from analyzing large data sets for epidemiology or gene research to crunching financial services and market data on a real-time basis.
But let’s be honest. Despite advocacy for its unique value in more specialized fields of science, engineering and economics, when people talk about Big Data these days, we can be pretty certain that they’re talking about focusing money and technology on one thing: the collection and analysis of personal data from consumers and customers.
Money and Advertising
The motive is money; the means is advertising on our mobile phones and tablets.
Companies want to collect and analyze personal data mostly, though not exclusively, to support digital advertising. Advertising is not where the Big Data phenomenon started (science and engineering), and it may not be the area where Big Data can do most to help humanity (medicine and biotechnologies). But collecting personal data in order to provide customized electronic advertising is where the technological and economic dynamism of Big Data is currently focused.
So who is collecting our personal data and how is it being used? To understand the extent to which personal data is being captured, we need to look at four groups.
Group 1: The Internet Tech Companies
Large Internet technology platform providers such as Google, Amazon, Facebook, Yahoo! and Twitter use tracking technologies such as cookies and social plug-ins to monitor our search terms, to track our “traffic” patterns on websites, to filter that data through recommendation engines, and to monitor our reaction to ads that they post.
And as we have moved to the cloud, required registration with these groups means that they can now analyze both incoming and outgoing emails on our PCs, tablets, and phones, as well as any emails stored on their servers, for keywords and sentiment. This process gives these groups access not only to the content of emails (from which automated scanning can derive affiliations, interests, and so on) but also to the contact information of both email senders and anyone else copied on those emails. In this way, these tech giants build up profiles not only of their own registered users but also of millions of other non-affiliated email contacts.
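To make that mechanism concrete, here is a minimal Python sketch of how scanning a single message could create profile entries for people who never registered with the provider. It is an illustration under stated assumptions, not a description of any real company's pipeline: the sample message, the INTEREST_KEYWORDS table, and the profile_message function are all hypothetical.

```python
from collections import defaultdict
from email import message_from_string
from email.utils import getaddresses

# Hypothetical keyword-to-interest table, invented for illustration only.
INTEREST_KEYWORDS = {"mortgage": "home buying", "marathon": "running"}


def profile_message(raw, profiles):
    """Record every address on a message, plus interests inferred from its body."""
    msg = message_from_string(raw)
    headers = msg.get_all("From", []) + msg.get_all("To", []) + msg.get_all("Cc", [])
    addresses = {addr.lower() for _, addr in getaddresses(headers) if addr}
    body = msg.get_payload().lower()  # assumes a simple, non-multipart message
    interests = {label for word, label in INTEREST_KEYWORDS.items() if word in body}
    for addr in addresses:
        # Crude assumption: everyone on the message shares its inferred interests.
        profiles[addr]["contacts"] |= addresses - {addr}
        profiles[addr]["interests"] |= interests
    return profiles


# Hypothetical message: only Alice needs to be a registered user.
SAMPLE = """From: Alice <alice@example.com>
To: bob@example.org
Cc: Carol <carol@example.net>
Subject: weekend

Training for the marathon is going well.
"""

profiles = defaultdict(lambda: {"contacts": set(), "interests": set()})
profile_message(SAMPLE, profiles)
print(sorted(profiles))
```

Even in this toy version, Bob and Carol end up with entries listing their contacts and a guessed interest, although neither of them sent the message or signed up for anything.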
Group 2: Credit Reporting Agencies
Credit reporting agencies (CRAs) have been collecting data on almost everyone in America – our identity (age, sex, race), employment, income and spending, major purchases, creditworthiness, criminal activity, marriages, divorce settlements, and courthouse appearances. They now hold extensive dossiers on millions of individuals worldwide, covering our jobs, what we earn, what we spend, what we buy, where we live, what cars we drive, and our political party affiliation. And although the confidentiality of credit scores and medical records is still protected by law, CRAs have expanded their services to include selling individual profiles to almost anyone who will pay for them.
Group 3: Bricks-and-Mortar Retailers
Point-of-sale (POS) systems today routinely match up our personal information (often captured from loyalty or discount cards) with the SKUs of our purchases, allowing retailers to understand our likes and dislikes, who pays in cash or with a credit card, the types of food we eat, and in what quantities – and whether we buy whole milk or skim, fruit or candy bars. They even know our dress size and belt girth. And as grocery and general retailers moved into pharmacy sales in the 2000s, they soon learned which prescriptions individuals bought and which doctors wrote those prescriptions – an oncologist, an ophthalmologist, or an obstetrician. Large retailers are also adopting in-store tracking of customers, using technologies that include Wi-Fi and closed-circuit TV (CCTV) monitoring, customer purchasing histories, predictive analytics, and even facial recognition.
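The matching step described above is simple. The Python sketch below joins a loyalty-card ID to the SKUs scanned at checkout and tallies the result per customer. The catalog, the transaction records, and the build_purchase_profiles function are invented for illustration; real POS systems will differ in schema and scale.

```python
from collections import Counter, defaultdict

# Hypothetical SKU catalog and transaction log; all values are invented.
CATALOG = {"1001": "whole milk", "1002": "skim milk", "2001": "candy bar"}
TRANSACTIONS = [
    {"loyalty_id": "C-17", "sku": "1001", "payment": "cash"},
    {"loyalty_id": "C-17", "sku": "2001", "payment": "cash"},
    {"loyalty_id": "C-17", "sku": "1001", "payment": "credit"},
    {"loyalty_id": "C-42", "sku": "1002", "payment": "credit"},
]


def build_purchase_profiles(transactions, catalog):
    """Group purchases by loyalty card, counting products and payment methods."""
    profiles = defaultdict(lambda: {"products": Counter(), "payments": Counter()})
    for tx in transactions:
        profile = profiles[tx["loyalty_id"]]
        profile["products"][catalog.get(tx["sku"], "unknown")] += 1
        profile["payments"][tx["payment"]] += 1
    return profiles


purchase_profiles = build_purchase_profiles(TRANSACTIONS, CATALOG)
print(purchase_profiles["C-17"]["products"].most_common(1))  # [('whole milk', 2)]
```

A few lines of aggregation are enough to say that customer C-17 prefers whole milk and usually pays in cash, which is why a loyalty card is worth a discount to the retailer.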
Group 4: The Invisible Data Trackers
Augmenting all of this is a myriad of tracking technologies poised surreptitiously on each of the top websites, waiting to pounce on site visitors in a way that most people never see and seldom appreciate – with as many as several hundred groups launching a deluge of tracking code on our computers in a fraction of a second. If they can, they will latch onto us electronically and continue to follow our web activity on other sites as we move through the day. And using the same type of crawling technology developed for search engines, these programs search for keywords or sentiment data using software “robots” that simulate human search, looking for contact details, resumes, email addresses, or a user’s comments on discussion boards, blogs, or online chatrooms.
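The “latching on” works because the same tracker is embedded on many unrelated sites and recognizes the identifier it set on a first visit. The toy Python model below sketches that idea. It is a simulation only: the ThirdPartyTracker class, the plain dictionary standing in for the browser's cookie store, and the example URLs are all hypothetical.

```python
import uuid
from collections import defaultdict


class ThirdPartyTracker:
    """Toy model of a tracker embedded on many unrelated sites."""

    def __init__(self):
        self.history = defaultdict(list)  # tracker ID -> pages seen

    def handle_request(self, cookie_jar, page_url):
        # First contact stores a random ID; later requests reuse the stored one.
        tracker_id = cookie_jar.setdefault("tracker_id", uuid.uuid4().hex)
        self.history[tracker_id].append(page_url)
        return tracker_id


tracker = ThirdPartyTracker()
browser_cookies = {}  # stands in for the browser's cookies for the tracker's domain
for url in ["https://news.example/story", "https://shop.example/shoes"]:
    tracker.handle_request(browser_cookies, url)

print(tracker.history[browser_cookies["tracker_id"]])
```

Neither site shares anything with the other, yet the tracker ends up with a single browsing history covering both, keyed to one identifier.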
Aiding and Abetting
Many of us work for companies that fall into these four groups and are actively engaged in some aspect of the collection of our customers’ personal data. And whether as technologists, data scientists, company managers, or sales and marketing executives, we all play some role in setting our organization’s Big Data strategy.
No one knows better than we do that Big (Consumer) Data can be profitable and helpful to our organizations and our customers, but it can also be personally intrusive and enormously costly (if we lose or abuse private customer data). We also must appreciate that, largely through our own efforts, we may be careening off toward a Wild West of data management that marks the end of privacy as we know it.
Leading or Following?
As technologists and businesspeople, do we have any special responsibility to curb or self-regulate our collection and use of customer data, or should we leave that to government regulators? Equally important, are we allowing our enthusiasm for success and money to cloud our judgment about how we use personal data in our Big Data strategies – leading us all down a path that the world and the economy would be better off not following?
In short, shouldn’t those of us who understand the threats as well as the promise of Big Data be doing more to set a responsible course for the broader economy?