Data journalism can be much more than an impressive, interactive visualization or an immersive longform piece. There’s also the option of letting the data and the visualizations lead the storytelling, allowing for a much deeper comprehension of the subject at hand, as is the case with this work from the Tampa Bay Times. There, visualizations lead the story to show us the case they are investigating and explain to us why it’s important, taking us through each step.
Galicia is holding its Autonomy elections on October 21st, and the National Statistics Institute has released a small set of data from the electoral census. Working with this data, we first saw how councils in Galicia have a varying percentage of their constituency living abroad, with some councils having more than 50% of their voters living outside Spain. We’ll continue to explore that data in the coming days, but today I want to take a look at how age also defines the voter profile in Galicia.
Migration is one of the key factors influencing the distribution of the Galician electoral census. The other one is age, which is also a consequence of the first one. In the first graph we see the total census distribution by province and age. We can see clearly that the western provinces (A Coruña and Pontevedra) have much more weight in the census than the eastern ones (Lugo and Ourense). We can also see that the differences between these two groups are much more evident in the younger half of the census, but we can’t see clearly how important age is in the census distribution in Galicia.
In this second graph we can see the percentage of voters in each province by age, so we can more clearly see the weight of age in each of the four provinces. Lugo and Ourense, in the east, are the older provinces, both because of demographic trends and because of young people leaving for the richer provinces of Pontevedra and A Coruña, other parts of Spain and abroad.
Relative data, as percentages, allow us to compare provinces and see the weight of each age group more clearly than when working with absolute numbers like the total number of voters.
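To make the point concrete, here is a minimal sketch of turning absolute counts into per-province percentages. The province names, age groups and figures are made up for illustration; they are not the census data used in the graphs.

```python
# Hypothetical voter counts per province and age group (NOT real census data).
census = {
    "Province A": {"18-39": 300, "40-64": 400, "65+": 300},
    "Province B": {"18-39": 50, "40-64": 70, "65+": 80},
}


def age_shares(counts):
    """Return each age group's share of a province's voters, as a percentage."""
    total = sum(counts.values())
    return {group: round(100 * n / total, 1) for group, n in counts.items()}


# One dictionary of percentages per province, so provinces of very
# different sizes can be compared on the same scale.
shares = {province: age_shares(counts) for province, counts in census.items()}

for province, groups in shares.items():
    print(province, groups)
```

In this toy example Province A has far more voters aged 65+ in absolute terms (300 against 80), but they weigh more in Province B (40% against 30%), which is the kind of difference the absolute graph hides.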
Galicia, an Autonomy inside Spain, is having elections on October 21st. The Spanish National Statistics Institute has released some data about the electoral census of the Autonomy, specifically regarding its age and the population living in foreign countries. Galicia has a strong migration history and almost 15% of the electoral census lives abroad, although only 3.68% will vote, due to electoral laws that restrict and hinder the voting rights of a group that registered participation levels above 30% in the past.
In the map above we can see the percentage of the electoral census living abroad by council. The data ranges from 0.88% in Burela to almost 55% in Avión, Bande and Gomesende.
I’m currently writing an article on digital currencies for a future, small-run magazine edited by Crazy Little Things. To complement the article, and to try to learn some new data-journalism skills, I also decided to build a timeline of the most relevant digital currencies of the past 20+ years. This is the result:
How it’s done
I used Inkscape to draw the SVG file (timeline, bars and text) and create the layout. Then I added basic interactivity by hand using a text editor. The content is based on my own research for the article. In further releases I plan to include a CSV with the source data used in the timeline, so it’s easier for others to replicate it using other tools.
Regarding interactivity, right now you can uncover some contextual information by hovering your mouse over certain years, and click on the names of the digital currencies to go to their website or get more information. I plan to add more contextual information on the currencies, explaining the type of currency and the reason it disappeared, if needed.
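For anyone who would rather generate this kind of markup than type it by hand, here is a rough sketch of one common way to get hover text and links in SVG: a title element for the tooltip and an a element for the link. It is not the markup used in the actual timeline file, and the currency name, URL, years and pixel scale are all placeholders.

```python
from xml.sax.saxutils import escape, quoteattr


def currency_bar(name, url, start_year, end_year, y, first_year=1990, px_per_year=20):
    """Build one linked timeline bar with a hover tooltip as an SVG fragment.

    first_year and px_per_year are assumed layout values, not taken from
    the real timeline.
    """
    x = (start_year - first_year) * px_per_year
    width = (end_year - start_year) * px_per_year
    tooltip = escape(f"{name} ({start_year}-{end_year})")
    # xlink:href needs the xlink namespace declared on the root <svg>
    # element, which Inkscape files normally include.
    return (
        f"<a xlink:href={quoteattr(url)}>"
        f'<rect x="{x}" y="{y}" width="{width}" height="12">'
        f"<title>{tooltip}</title></rect></a>"
    )


# Placeholder entry, not a real currency from the timeline.
fragment = currency_bar("ExampleCoin", "https://example.org", 1995, 2001, y=40)
print(fragment)
```

Browsers show the contents of the title element as a tooltip when the mouse rests over the bar, and the surrounding a element makes the bar clickable.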
The project (just an SVG file) is hosted on GitHub. You can download it, fork it, open a new issue, send ideas or suggestions. The project is under a BY-NC-SA Creative Commons licence. This is my first time using Git and GitHub for a project like this, and I’ll share my experience in a separate post. I can tell you now that I’ll definitely keep using it.
I’m also open to criticism on the timeline content: Did I miss a critical digital currency project? Should I remove something from the timeline? Is any of the data wrong? I’m all ears.
CartoDB is a powerful and open source geospatial data management and visualization tool. It does everything Google Fusion Tables does, and more. If you’re comfortable with SQL queries and CSS (CartoDB uses Carto, a stylesheet language from Mapbox similar to CSS), you can get amazing results, including hexagonal density grids, or editable and interactive maps. They have more case examples in the gallery, and you can find many more examples online.
In the workshop we learned how to do the basic stuff: upload different kinds of data, visualize them, merge them and tinker a bit with the SQL queries and Carto stylings. We did two maps during the workshop: Spanish unemployment by province and life-expectancy rates by country:
I had a CartoDB account way before the workshop but I never got around to trying it. I don’t know why, but I thought it was harder to use than Google Fusion Tables, so I was pleasantly surprised to discover how easy it was to work with. Now I’m looking forward to seeing how I can use CartoDB in my projects.
Four days ago, Spanish Congress and Senate members released statements about their property and assets, as well as any other employment or line of work besides public service. But there was a catch. They released them as individual PDF files, one for each Senator and Congressman. Quickly, developer David Cabo, a member of the ProBonoPublico collective and a key member of the data visualization community in Spain, created a hashtag (#adoptaunsenador) and opened a Google Docs spreadsheet to crowdsource the extraction of the data from the individual PDF files into a single structured file.
The spreadsheet was too slow to be used with more than 50 people editing simultaneously.
Anonymous editing was allowed from the beginning, but it was difficult to cope with erased or lost data. Because of simultaneous editing, recovering an earlier version of the spreadsheet also meant losing data from more recent edits.
At some point, spam finally came and anonymous editing was closed. The number of people working on the spreadsheet dropped to 12-15.
Anonymous edits were also the laziest: unfinished sentences and skipped sections.
Data columns for currencies or dates need to be formatted before the spreadsheet is made public, to avoid confusion and inconsistent formatting by the contributors.
An editor is needed to oversee the process, make sure there is a standard for the transcription and ensure that there’s no missing data.
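If the columns were not formatted in advance, an editor can still clean them up afterwards. The sketch below shows one possible approach; the list of date formats and the Spanish-style number convention (dot for thousands, comma for decimals) are assumptions about what contributors might type, not something taken from the actual spreadsheet.

```python
from datetime import datetime
from decimal import Decimal

# Assumed date formats contributors might have used (day first, or ISO).
DATE_FORMATS = ("%d/%m/%Y", "%d-%m-%Y", "%Y-%m-%d")


def normalize_date(text):
    """Return an ISO date string, or None if no assumed format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None  # leave it for the editor to review by hand


def normalize_euros(text):
    """Parse a Spanish-style amount such as '1.234,56 €' into a Decimal.

    Assumes '.' is a thousands separator and ',' the decimal mark, so an
    amount typed as '1234.56' would be misread.
    """
    cleaned = text.replace("€", "").replace("EUR", "").strip()
    cleaned = cleaned.replace(".", "").replace(",", ".")
    return Decimal(cleaned)


print(normalize_date("07/09/2011"))
print(normalize_euros("1.234,56 €"))
```

Returning None for unrecognized dates, instead of guessing, keeps doubtful cells visible for whoever reviews the transcription.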
There’s another crowdsourcing process going on with the aim of putting together all the property and assets data from members of the Congress: spreadsheet and hashtag.
Bet you didn’t know that half of the population of Barcelona wasn’t born there. Yeah, me neither. I’m one of the 30k-something Galicians who now call Barcelona home, almost 2% of the total population. This modest visualization was done with Impure, using data from the 2009 census published by the Barcelona city council’s Open Data initiative. Hover your mouse over the colored areas to see the place of birth that corresponds to each figure.
This is the beta version of the presentation I used yesterday in the data journalism workshop that the people at Media140 invited me to give. At the last minute I made some changes to the structure and order of the presentation. I’ll try to update it on Friday with the structure I used in the workshop, notes and the reference to Florence Nightingale that I forgot to include.