Why is data journalism an important tool for journalists? And how can graphics be used to communicate a story in ways that words alone cannot?
In recent years, online outlets like the Guardian’s Datablog and independent websites such as ‘Information is Beautiful’ have played a part in developing what we now call data journalism, a digitised form of investigative journalism that includes gathering data, analysing it and presenting it to the reader in the form of graphic storytelling, so-called visualisations.
But data journalism is not new; it has been around for as long as there has been data. Florence Nightingale was a data journalist, and her 1850s visualisations reporting on mortality in the army are still famous to this day. It is the term data journalism that is new to the field. Two years ago, as Datablog Editor Simon Rogers says, nobody knew what it was, and whether it was real journalism was questioned.
Today, however, after WikiLeaks and the MPs’ expenses scandal, it has become widely known in the journalistic field.
The data journalism we see today has developed as a product of a changing media landscape in which, as argued by Paul Bradshaw in the Data Journalism Handbook, “almost everything is and can be described in numbers”.
The rise in digitisation of data and increasingly freely available information on the web as well as the ongoing development in countries with restricted press freedom are other contributors. This new media environment, in which print publications are seeing a decline in circulation and online media platforms are on the rise, makes it increasingly important for journalists to keep up with the changes. The tools of data journalism are important skills for journalists in this new landscape to be able to inform the public and pursue their primary function in a democratic society: to act as watchdogs holding power to account.
Data journalism plays an increasingly important part in today’s media. Citizen journalists, bloggers and user-generated content (UGC) make it easy to fill online and print publications with reactive material that reports on events and follows up on stories from social media. With new tools and skills, journalists can find a balance in their role, so that they both respond to events and actively seek stories, as Bradshaw argues.
Whistle-blowing platforms such as Wikileaks encourage people to submit data and make it available to all. This is a step towards a more open society. However, as argued by Dr Benedetta Brevini, it means nothing unless you have somebody who can make sense of vast amounts of data. This is where data journalists come in. With online journalism on the rise we see more user-friendly tools available to gather and visualise data such as Google Fusion Tables, Tableau and Google Maps.
There has been a decline in the public’s trust in journalists in recent years. This is due to events such as the phone hacking scandal and the Jimmy Savile case at the BBC. Together with the decline in newspaper circulation and the rise in citizen journalism, this leaves room, and perhaps a need, for changes in the field. Data journalism can be one way of making them.
Data journalists scrape. Scraping is a tool which allows the journalist to collate and analyse information on the web. It’s not only faster than ordinary search tools but it reaches information that may not have been collected before, such as notices, mentions and smaller documents. It is, as Bradshaw argues, not only faster than Freedom of Information (FOI) requests but provides more granular results than most advanced searches and can grab data that “organisations would rather you didn’t have”.
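To make the idea of scraping concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not a description of any particular newsroom's method: the HTML sample, the `TableScraper` class and the `scrape_table` function are all hypothetical, and a real scraper would first download the page rather than read it from a string.

```python
from html.parser import HTMLParser

# Hypothetical HTML standing in for a page published by a public body.
# A real scraper would download the page first; this sample is made up.
SAMPLE_PAGE = """
<table>
  <tr><th>Department</th><th>Spend (GBP)</th></tr>
  <tr><td>Transport</td><td>1200</td></tr>
  <tr><td>Health</td><td>3400</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collects the text of every table cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []         # finished rows, each a list of cell strings
        self._row = None       # the row currently being read, if any
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._in_cell and self._row:
            self._row[-1] += data.strip()


def scrape_table(html):
    """Return the table in `html` as a list of rows (hypothetical helper)."""
    parser = TableScraper()
    parser.feed(html)
    return parser.rows


rows = scrape_table(SAMPLE_PAGE)
for row in rows:
    print(row)
```

The point of the sketch is that once a page's structure is known, the same few lines can be run over hundreds of pages, which is where scraping gains its speed over manual searching.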
Scraping data and finding the story is of course the foundation (there wouldn’t be any stories without it), but visualisation is another key part of data journalism, for the aim is, of course, to communicate the stories to the public in a way that is easily understandable. Then comes making sure the story is interesting and attention grabbing. It is perhaps not the journalist’s role to entertain, but much like when writing a story, the data journalist too needs to be able to grab, and keep, the reader’s attention, and convey the message clearly.
Although we live in an open society where data is, in theory, freely available, whether sourced online from bodies such as the Office for National Statistics, the Open Data Institute and YouGov, or via FOI requests, it is possible for these bodies to ‘hide’ data they don’t want to show. PDF files, for instance, have been one way of doing so, as they are more difficult to scrape. Now, however, data journalists have tools for scraping such documents too.
Journalist Heather Brooke talks about a new Information Enlightenment where free knowledge flows beyond national boundaries and where technology breaks down social boundaries too, such as status, class and power, even geography. She argues that only those with the right tools can make sense of the information, and that data illiteracy is the big problem. “A journalist who cannot use these tools is in the near future not going to be much use either for the publication or society.”
One of the Guardian’s biggest data journalism projects to date was a collaboration with the London School of Economics: Reading the Riots, a project investigating the causes of the August 2011 riots.
Prime Minister David Cameron said in a statement shortly after the riots that they were not about poverty. However, the investigation suggested otherwise. Another significant finding was that, again contrary to the government’s beliefs, the riots were not organised through social media. There was talk about shutting down social media traffic on platforms such as Twitter and Facebook during the riots, as it was believed to be a tool for looters to mobilise and communicate. Since these platforms have proven useful for spreading information about aid and clean-up activities in times of crisis elsewhere, for example during Hurricane Sandy last year, shutting them down could instead prove detrimental. This investigation showed how data journalism can be crucial in holding the government to account by questioning such statements and proving them wrong.
This is part of a series of visualisations by Reading the Riots analysing social media usage. They mapped the Twitter traffic during the riots using a database of over 2.5 million tweets related to the riots. The visualisation shows that the majority of surging social media traffic occurred after the first verified reports of the event it was related to rather than before, or even during. This would not have been as effective if told by the written word only, as looking at the graph really helps to visualise the timeline. It is a good example of a visualisation telling a story, which is that the government was making arguments with no facts to back them.
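The comparison behind such a chart can be sketched in a few lines of Python. This is only an illustration of the reasoning, not the method used by Reading the Riots: the function name `share_after_report` is hypothetical and the timestamps are made up for the example rather than taken from the project's database.

```python
from datetime import datetime


def share_after_report(tweet_times, first_report):
    """Return the fraction of tweets sent at or after the first verified report."""
    if not tweet_times:
        return 0.0
    after = sum(1 for sent in tweet_times if sent >= first_report)
    return after / len(tweet_times)


# Made-up timestamps for illustration only; not real riot data.
first_report = datetime(2011, 8, 8, 21, 0)
tweets = [
    datetime(2011, 8, 8, 20, 30),  # sent before the first verified report
    datetime(2011, 8, 8, 21, 5),
    datetime(2011, 8, 8, 21, 20),
    datetime(2011, 8, 8, 22, 10),
]

share = share_after_report(tweets, first_report)
print(f"{share:.0%} of tweets came after the first verified report")
```

A high share for most events would support the reading that the traffic was reacting to events rather than organising them.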
The story was published on 24 August, very soon after the actual events; this sort of work would have taken much longer without the tools available to data journalists.
The interactive visualisation itself is not as clear as it could be, however. The y-axis doesn’t specify that it shows the number of tweets sent. It is also slightly tricky to navigate. Instead of ‘event’, each label could have shown the name of the location it refers to. This information is available through an interactive function, but that extra step could have been avoided. There is also a cluster of events on the night of 10 August which would have been clearer if stretched out.
Like this one, many data journalism stories use visualisations only, without quotes or deeper analysis, as the figures presented are supposed to ‘speak for themselves’. In this case it is part of a series and, should the reader want to know more, there are links to more in-depth articles. Another example of a website where stories are told purely with visualisations is David McCandless’s Information is Beautiful. As Bradshaw argues, it has on “many occasions shown the importance of clear design”.
In this type of data journalism the focus is on the aesthetics of the presentation and the figures stand alone. All data is then subject to the viewer’s own interpretation, which could make for a more objective view on the matter.
Journalists have to fight for the reader’s attention, just like any other type of media. This is where so-called ‘visualisation porn’ sometimes comes into the picture, meaning that what could have easily been communicated with a simple pie chart, for example, is presented as an interactive multimedia ‘show’.
Here is an example of such a story. It was published in the Guardian’s series Show and Tell, which presents data stories from around the world. It was made by NYC-based Sam Slover and the Guardian’s Mona Chalabi.
The story is very interesting and important, but difficult to grasp when only viewing numbers and figures. If told in the written word only, possibly in the form of a longer broadsheet feature, it could be hard to show what overfishing really looks like, and a visualisation helps. However, this almost feels like it was meant more for children than for the average Guardian reader. One of the benefits of data visualisation is brief and concise storytelling. This story could have been told with a couple of pie charts, perhaps in the shape of a round blowfish or something similar, showing the percentage of large versus smaller fish. A timeline chart would be sufficient if focusing on the development over time. The animation is also time-consuming and could be shortened, as the changes are rather small during the 1950s and 60s. It could also clarify which fish are swimming in and out of the frame and which are being added and removed, perhaps by blinking rather than fading.
But the power of visualisation need not be overstated. An example of when it is at its best is this story about gay rights in the US. “Gay Rights By State” by the Guardian US Interactive Team shows different laws and rights for gay, lesbian and transgender people and looks at issues such as marriage, adoption, and even school bullying. The interactive visualisation shows how the handling of gay rights issues varies by state and region.
It is hard to find anything negative about this story. It is a big topic and therefore a big story, and it takes a little effort, albeit not much, to understand how to read it, so it is not entirely clear at first glance. The only thing one might have wanted was a map; however, it would have been difficult to fit all the different colours in each state, and it would perhaps have lost something rather than gained. It is a brilliant example of how the visualisation does the job for the reader, as you can easily navigate to the different states, read about the figures in more detail and skip, instead of having to skim through, information that is unwanted and time-consuming. This article also has a good amount of in-depth explanation further down about the different rights, told with text as well as even more visualisations, a very useful feature. With the information overload of the web today, this is an excellent example of efficient and effective storytelling. In print, a large-scale story like this would have taken up at least several pages and, although feasible, it would have lost much of the effect of putting the states next to each other like this, which highlights geographical differences and shows all the different parameters. Not surprisingly, it won the Data-Driven Storytelling Big Media Award at the Data Journalism Awards this year, held by the Global Editors Network.
In an ever-changing media landscape it is crucial for the journalist to keep up with new technologies. Data journalism is a tool for the more traditional investigative journalist to do so. With new interactive multimedia online it is getting increasingly difficult to grab and keep the attention of the reader, and data visualisations play an important part here for journalists. Not only do they often tell a story better than text alone, helping the reader visualise and analyse figures; they can also be more unbiased than many print-only stories, keeping to the facts and not interpretations of them. Data journalism is a way for today’s journalists to win back some lost public trust and keep their place in society, to inform and act as watchdogs, ensuring a democratic society in the future too.
Journalism as a profession is under threat from the recent rise in citizen journalism, bloggers, social media and UGC. To keep professional journalism alive, some adapting of journalistic practice is needed and new skills such as data journalism must be learnt, not only to gather but to investigate, question and finally present the data telling a good story.
This essay was part of my Data Journalism module at City University, London.
Heather Brooke, The Revolution Will Be Digitised, London, 2011, Random House
Paul Bradshaw, Scraping for Journalists, UK, 2013, Leanpub
Guardian Datablog, URL http://www.guardian.co.uk/news/datablog
Global Editors Network, website URL: http://www.globaleditorsnetwork.org/dja/ (consulted on 06.07.13)
Data Journalism Handbook, URL http://datajournalismhandbook.org/1.0/en/introduction_0.html
Information is Beautiful, website, URL http://www.informationisbeautiful.net/
Benedetta Brevini, Power Without Responsibility module at City University, Lecture, November 2012