Big Data is the new buzzword for the enormous increase in stored information across the world. But how will this information stream influence working life and working life research?
There are many different definitions of Big Data. Some try to describe it in terms of the volume of data, measured in unfamiliar quantities. Most of us are familiar with kilobytes, megabytes and gigabytes. Perhaps the latest external storage you bought at Clas Ohlson was one terabyte, which means that for €75 you can store a volume of information that would require chopping down 50,000 trees if you were to print it on paper.
A petabyte is a thousand times more than that (a 1 followed by 15 noughts), and if we are talking exabytes (a 1 followed by 18 noughts), you have a measure of how much new information was produced in the year 1999. Since then the information stream has grown exponentially, and by 2011 we were producing 2.5 exabytes every day.
According to the International Data Corporation, IDC, which attempts to map how much information is being produced, the volume of data increases by 40 percent every year. In 2020 we will be producing and copying 44 zettabytes in one year. A zettabyte is one thousand exabytes.
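To make these quantities easier to handle, the arithmetic can be sketched in a few lines of Python. The 2.5 exabytes a day and the 40 percent annual growth are the figures quoted above; the function and variable names are illustrative only, and the prefixes are treated as plain powers of ten.

```python
# Decimal storage prefixes as powers of ten (bytes).
UNITS = {
    "kilobyte": 10**3,
    "megabyte": 10**6,
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
}


def convert(amount, from_unit, to_unit):
    """Convert an amount of data between two of the units above."""
    return amount * UNITS[from_unit] / UNITS[to_unit]


def project(volume, annual_growth, years):
    """Compound a yearly volume forward at a fixed annual growth rate."""
    return volume * (1 + annual_growth) ** years


# 2.5 exabytes a day, expressed as zettabytes a year.
per_year_zb = convert(2.5 * 365, "exabyte", "zettabyte")

# Whole years needed for a volume to at least double at 40 percent growth.
years_to_double = 0
while project(1.0, 0.40, years_to_double) < 2.0:
    years_to_double += 1
```

At 40 percent a year the volume has not quite doubled after two years (1.4 × 1.4 = 1.96), so the loop stops at three.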
Rather than talking about unfathomable volumes of data there is a simpler way of defining Big Data: information which is too big to be stored in ordinary computers. Instead you need to send in analysis programs like probes to where the information is stored, and special programs have been developed which can divide up tasks and delegate them to many different computers.
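The idea of dividing up a task and delegating it to many computers can be illustrated with a minimal sketch. Here a pool of threads on one machine stands in for the separate computers, and the task is a simple word count; the chunks of text and the function names are hypothetical, and real systems of this kind are far more elaborate.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """Map step: a worker counts the words in its own chunk of text."""
    return Counter(chunk.lower().split())


def distributed_word_count(chunks, workers=4):
    """Hand the chunks out to a pool of workers, then merge their partial counts."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = pool.map(count_words, chunks)
        total = Counter()
        for partial in partial_counts:  # Reduce step: combine the results.
            total.update(partial)
    return total


# Hypothetical data: three chunks standing in for data held on three machines.
chunks = [
    "big data is big",
    "data about data",
    "working life research",
]
totals = distributed_word_count(chunks)
```

Each worker only ever sees its own chunk, and only the small partial counts are brought together at the end, which is the point of sending the analysis to the data rather than the other way round.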
Google keeps the number of servers it uses a company secret, but estimates put it somewhere between one and ten million.
This has provided space for new companies and consultancy services. The information technology research company Gartner estimates that in 2013 companies spent $296bn on information systems which gather information on customers, suppliers and competitors.
The Swedish Institute of Computer Science, SICS, reckons 35,000 jobs will be created in Sweden alone within ten years in a number of Big Data clusters, and a further 9,000 in what are called mega data centres.
We already see many examples of how Big Data is being used in everyday life. If you show interest in a book on Amazon, you immediately receive suggestions for other books that readers of that book have also read. With 152 million customers and some 1.5 billion products in stock, Amazon is at the forefront of using Big Data, and its scale shows the need to be able to analyse enormous amounts of data rapidly.
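How such suggestions might be produced can be sketched by counting which books share readers. This is a simplified illustration, not a description of Amazon's actual system; the readers, titles and function names are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_also_read(reading_lists):
    """For every book, count how often each other book shares a reader with it."""
    also_read = defaultdict(Counter)
    for books in reading_lists.values():
        for book, other in permutations(set(books), 2):
            also_read[book][other] += 1
    return also_read


def suggest(also_read, book, limit=3):
    """Return the books most often read alongside the given one."""
    return [title for title, _ in also_read[book].most_common(limit)]


# Hypothetical readers and titles, for illustration only.
reading_lists = {
    "reader_1": ["A", "B", "C"],
    "reader_2": ["A", "B"],
    "reader_3": ["A", "B", "D"],
    "reader_4": ["A", "C"],
}
suggestions = suggest(build_also_read(reading_lists), "A")
```

In this toy data, book B shares three readers with book A, C shares two and D one, so they are suggested in that order. With millions of customers the same counting has to be spread over many machines.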
The data Amazon uses is mainly structured: the kind that you input and which can be presented in a spreadsheet and stored in databases. But there is also unstructured information, which is created in social media like Twitter and Facebook, in blogs and in emails. This is where you find new and unexpected insight into customers’ needs.
The reason why Big Data has become such hot business is the combination of the ease with which you can now store data and the new analysis systems which make it possible to handle the information in real time.
One striking example is Flightradar24, which shows the world’s entire air traffic in real time. Not only does the map tell you about the volume of traffic, it also provides a clear image of where in the world economic growth is concentrated:
At www.flightradar24.com you can zoom in and follow air traffic anywhere in the world. If you click on an aircraft symbol you get the aircraft registration, the route and a range of other information. News is sent to a Twitter account with 260,000 followers.
But what does this have to do with working life research? Well, pilots have one of the world’s most heavily monitored occupations. Everything they say and do in relation to the aircraft is registered in what is popularly known as the black box, so that the information can be analysed in the case of an accident.
So far black boxes have had a limited capacity, storing only the last few hours of communication. But after several high-profile accidents where air crash investigators have failed to locate the black box, questions have been raised as to why the information cannot be stored outside the aircraft too. Satellite and radio links already make it possible to have internet on board a plane. It should also be possible to send the black box information from the plane to data servers on the ground.
More and more occupations are monitored in the same way as pilots are, and as private individuals we leave digital traces all the time. How this impacts on working life is of course an important field for research, but Big Data will also have an impact on the way in which researchers work.
For most of the 20th century information gathering was both time-consuming and labour-intensive. Even a small group of people interact in complex ways, as the American anthropologist Wayne W Zachary showed in 1977 in his groundbreaking study of how the 50 to 100 members of a karate club were linked:
A diagram of 34 karate club members and their interactions outside the club. Each line is one contact outside the club. There are 595 different relations between these members alone.
Some years ago researchers including Johan Ugander at Stanford University in California and Lars Backstrom at Facebook published an analysis of how the then 720 million Facebook users were linked to each other.
They discovered that the established theory of six degrees of separation (person A has a friend B who knows C and so on) did not apply to Facebook users. They are linked to each other in only four steps.
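The notion of "steps" between two people can be made concrete with a small sketch: a breadth-first search that counts the fewest friendships needed to get from one person to another, averaged over a whole network. The five-person chain below is hypothetical, and this is only an illustration of the idea; it is not the method the Facebook researchers used, which had to cope with hundreds of millions of users.

```python
from collections import deque


def steps_from(graph, start):
    """Breadth-first search: fewest friendship steps from start to everyone reachable."""
    steps = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for friend in graph[person]:
            if friend not in steps:
                steps[friend] = steps[person] + 1
                queue.append(friend)
    return steps


def average_separation(graph):
    """Average number of steps over all ordered pairs of connected people."""
    total = pairs = 0
    for person in graph:
        for other, distance in steps_from(graph, person).items():
            if other != person:
                total += distance
                pairs += 1
    return total / pairs


# Hypothetical friendship chain: A - B - C - D - E.
graph = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
```

In this chain A and E are four steps apart, while the average over all pairs is two steps. Running one search per person is fine for a karate club but quickly becomes impractical for a network the size of Facebook, which is why such studies rely on approximation and many machines.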
“Today, it is possible to examine human activities at scales undreamt of a generation ago, and these digital footprints have the potential to help social scientists better understand the complexities of human behaviour — for example, how individuals form and maintain social ties and the dynamics of influence and power,” writes Jimmy Lin from the University of Maryland in the American science journal The Annals, which has a special edition on Big Data in research.
According to Lin, people now talk about Big Data as “the fourth paradigm” within research, which complements theory, experiments and simulations.
“Historically, social scientists would plan an experiment, decide what data to collect, and analyse the data. Now people collect everything and then search for significant patterns in the data,” points out Will Shih at the Harvard Business School in an interview with Harvard Magazine.
If you look for theses or papers on Big Data and working life at Nordic universities and university colleges, there is not very much to be found.
“One area where Big Data will soon start to be used is health care and so-called personalised health,” says senior researcher Rajendra Akerkar at the Western Norway Research Institute.
Personalised health aims to support the choice of treatment for a patient, for example a cancer patient, based on characteristics found in the individual and the tumour, for instance the gene profile.
“This is an area where there is a great need for finding solutions which can handle large amounts of data. There is a trend towards tailored solutions for individual needs and this requires in-depth knowledge of the biological origins of illnesses and there is a need for high data-processing capacity.”
Meanwhile there are important questions about how this information should be protected. It is often not possible to anonymise the information, since it deals with a human being’s genetic code — which is unique.