How to get rid of a newspaper editor

Pedro J. was not ousted because ministers stopped attending his award ceremonies. That was merely symbolic. The pressure was much simpler: it consisted of turning off the tap of institutional advertising. According to internal estimates at Unidad Editorial, the war unleashed by the Bárcenas scandal, and especially by the Prime Minister’s text messages to the former PP treasurer, has cost the group some 14 million euros in institutional advertising.

Every administration governed by the PP, from the Ministry of Employment to the Seville City Council, by way of Castilla-La Mancha and the Community of Madrid, has cut off subsidies to El Mundo. All that public money, which the PP hands out arbitrarily and uses to tame the media, has moved from El Mundo to ABC. And just as Esperanza Aguirre got rid of José Antonio Zarzalejos a few years ago, today Mariano Rajoy has unseated Pedro José.

via La extinción del PedroJotasaurius Rex.

Data visualization of electoral census in Galicia, Spain (II)

Galicia is holding its regional elections on October 21st, and the National Statistics Institute has released a small set of data from the electoral census. Working with this data, we first saw how councils in Galicia have a varying percentage of their constituency living abroad, with some councils having more than 50% of their voters living outside Spain. We’ll continue to explore that data in the coming days, but today I want to take a look at how age also defines the voter profile in Galicia.

Migration is one of the key factors influencing the distribution of the Galician electoral census. The other one is age, which is partly a consequence of the first. In the first graph we see the total census distribution by province and age. We can clearly see that the western provinces (A Coruña and Pontevedra) carry much more weight in the census than the eastern ones (Lugo and Ourense). We can also see that the differences between these two groups are much more evident in the younger half of the census, but we can’t see clearly how important age is in the census distribution in Galicia.

In this second graph we can see the percentage of voters in each province by age, which shows the weight of age in each of the four provinces more clearly. Lugo and Ourense in the east are the most aged provinces, both because of demographic trends and because young people leave for the richer provinces of Pontevedra and A Coruña, for other parts of Spain and for other countries.

Relative data, as percentages, allow us to compare provinces and see the weight of each age group more clearly than when working with absolute numbers like the total number of voters.
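As a minimal sketch of that point, the snippet below turns absolute voter counts into percentage shares per province. The counts are made-up illustrative numbers, not the Institute’s figures, and the age brackets and the `age_shares` helper are my own assumptions for the example.

```python
# Made-up voter counts per province and age group (NOT real census data).
census = {
    "A Coruña": {"18-39": 300, "40-64": 400, "65+": 300},
    "Lugo": {"18-39": 50, "40-64": 80, "65+": 70},
}


def age_shares(counts):
    """Return each age group's share of the province total, in percent."""
    total = sum(counts.values())
    return {age: round(100 * n / total, 1) for age, n in counts.items()}


# One dictionary of percentage shares per province.
shares = {province: age_shares(counts) for province, counts in census.items()}

for province, by_age in shares.items():
    print(province, by_age)
```

With these invented numbers, A Coruña has far more voters over 65 in absolute terms, yet the share of that age group is higher in Lugo, which is exactly the kind of difference that only shows up in relative data.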

Data visualization of electoral census in Galicia, Spain (I)

Galicia, an Autonomous Community of Spain, is holding elections on October 21st. The Spanish National Statistics Institute has released some data about the region’s electoral census, specifically regarding its age and the population living in foreign countries. Galicia has a strong history of migration and almost 15% of the electoral census lives abroad, although only 3.68% will vote, because electoral laws restrict and hinder the voting rights of a group that registered participation levels above 30% in the past.

In the map above we can see the percentage of electoral census living abroad by council. The data ranges from 0.88% in Burela, to almost 55% in Avión, Bande and Gomesende.

Digital Currency Timeline

I’m currently writing an article on digital currencies for a future, small-run magazine edited by Crazy Little Things. To complement the article, and to try to learn some new data-journalism skills, I also decided to make a timeline of the most relevant digital currencies of the past 20+ years. This is the result:

How it’s done

I used Inkscape to draw the SVG file (timeline, bars and text) and to create the layout. Then I added basic interactivity by hand using a text editor. The content is based on my own research for the article. In future releases I plan to include a CSV with the source data used in the timeline, so it’s easier for others to replicate it using other tools.

Regarding interactivity, right now you can uncover some contextual information by hovering your mouse over certain years, and click on the names of the digital currencies to go to their website or get more information. I plan to add more contextual information on the currencies, explaining the type of currency and, where relevant, the reason it disappeared.

Improve it

The project (just an SVG file) is hosted on GitHub. You can download it, fork it, open a new issue, or send ideas and suggestions. The project is released under a Creative Commons BY-NC-SA licence. This is my first time using Git and GitHub for a project like this, and I’ll share my experience in a separate post. I can tell you now that I’ll definitely keep using it.

I’m also open to criticism on the timeline content: Did I miss a critical digital currency project? Should I remove something from the timeline? Is any of the data wrong? I’m all ears.

Top 5 essential skills for a data journalist

Aron Pilhofer, of The New York Times, answered that question on the NICAR-L mailing list:

My top five (in order of importance):

  1. Know that the most important part of data journalism is… journalism. Reporting. In other words, you know how to report a story, you understand how to treat data as a source. You know how to pick up a phone, and not just assume that everything you get in data form (especially government data) is complete and accurate.
  2. You have at least basic data skills — meaning, you know your way around a spreadsheet. You can figure out for yourself how to import data, and do something with it. You also understand the basics of data analysis: rates, ratios, sums, averages, medians, and how to use them.
  3. You have command of more advanced data analysis skills, such as GIS, basic statistics, advanced SQL, etc. You also may know some basic programming techniques (using the language of your choice… Python, Perl, Ruby. ILENE.. shoot, even .NET) to scrape the web, get and clean data.
  4. You can apply your basic programming techniques to the creation of data-driven news applications using off-the-shelf tools like Google maps, MapBox, Fusion Tables, etc. At this point, you are not running servers, or serving database-driven apps. But you are creatively using what is available to you to add to your reporting online. This is probably where you need to get on the Javascript train.
  5. You have some skills with a web framework (Django, Rails, Grails) in order to enhance your reporting online through data-driven applications that you create from scratch and host.

CartoDB workshop in Barcelona

Last Monday I attended a workshop about CartoDB, organized by Media140 (with whom I’ve also collaborated: 1 and 2) and presented by Sergio Álvarez.

CartoDB is a powerful and open source geospatial data management and visualization tool. It does everything Google Fusion Tables does, and more. If you’re comfortable with SQL queries and CSS (CartoDB uses Carto, a stylesheet language from Mapbox similar to CSS), you can get amazing results, including hexagonal density grids, or editable and interactive maps. They have more case examples in the gallery, and you can find many more examples online.

In the workshop we learned how to do the basic stuff: upload different kinds of data, visualize them, merge them and tinker a bit with the SQL queries and Carto stylings. We did two maps during the workshop: Spanish unemployment by provinces and life-expectancy rates by country:

I had a CartoDB account well before the workshop, but I never got around to trying it. I don’t know why, but I thought it was harder to use than Google Fusion Tables, so I was pleasantly surprised to discover how easy it was to work with. Now I’m looking forward to seeing how I can use CartoDB in my projects.

Common sense: from apps to responsive design

When the iPad was unveiled a year and a half ago, it was received with enthusiasm by media companies, especially by their boards of directors, as it provided two things that were essential for them:

  1. A closed, comfortable and standardized environment to receive content. Like good old magazines, but with video and rich-media ads.
  2. An opportunity to charge for content again, by creating scarcity, taking advantage of the walled garden of the iTunes store and selling apps like they sold magazines in the past.

But this approach overlooked several important flaws:

  1. Apple, while the biggest player in the market (at least for now), is not the only one. Any effort would have to be repeated, and then maintained, for every other platform (Android, Blackberry…) in order to reach more potential audience. It doesn’t scale.
  2. There is a bottleneck at the distribution stage, and you’re at the mercy of Apple’s internal app approval policies.
  3. The company has to give Apple a 30% cut of its subscription sales through the app, and probably will not have access to its subscribers’ data.
  4. By September, almost 40 million iPads had been sold worldwide since the tablet was introduced a year and a half earlier. Why would you limit yourself to a potential audience of 40 million when there are hundreds of millions of other devices capable of internet access?
  5. What will happen when the iPad and iOS are surpassed by newer, better technologies? Change is unavoidable, and in a world of planned obsolescence, it doesn’t make much sense to tie yourself to a product that will be obsolete in a few years.

Lately, there has been a trend that seems to take a more thoughtful, long-term, sustainable approach, called responsive design. The first to jump on board was the Financial Times, with an HTML5 app that avoids the iTunes app store and lets users access the app directly through the browser. That was a good start, but it was still rooted in the idea of developing a specific product for just one platform, in this case, iOS.

Propublica, instead, made some changes to their site to allow it to adapt to the screen size of the visitor’s device, whether smartphone or tablet of any size, and independently of the device’s operating system.

The redesign of the Boston Globe was an even more ambitious project. It’s probably the first news website fully redesigned under the responsive web design paradigm, which means it’ll adapt its layout to the characteristics of the device used to access the site.

If journalism is not a product but a process, then a platform-agnostic approach that delivers the same news, reports, analysis and commentary with quality, consistency and coherence, no matter what you use to read them, makes much more sense. It also allows the company to retain control and independence over its most important assets: its audience and how that audience accesses its content.

Over 2012 we’ll see more and more media companies sailing away from the siren songs of the iPad and taking back control of their own future with responsive HTML5 websites. Those who don’t will see their efforts scattered ineffectively across a handful of platforms, draining precious resources away from meaningful innovation.

This article was previously published in the blog of the ESCACC Foundation (Espai Català de Cultura i Comunicació, in Catalan) and in elEconomista / CanalPDA (in Spanish).

Barcelona: income by neighborhood

Red hues: Below average
Green: Around average
Blue hues: Above average

I see a trend around the Diagonal. Did urban planning influence the distribution of wealth across the city, or was it the other way around? What has been the impact of gentrification (Olympic Games, Fórum, 22@) in Vila Olímpica, Diagonal Mar, and Parc i Llacuna del Poblenou?

This map is going to be one of the exercises I’ll teach at a data visualization workshop organized by Media140, on November 7th, at Vilaweb, in Barcelona.

Victims of ETA terrorism

As you all know by now, the terrorist group ETA has declared that it is laying down its arms for good. In the 53 years since it was founded, ETA has killed 829 people. No, 858. No, no, 952. Wait a moment, how many victims has ETA claimed?

Source                                                 Victims
Ministerio del Interior                                829
Fundación Víctimas del Terrorismo                      828
Asociación Víctimas del Terrorismo                     858
Colectivo Víctimas del Terrorismo en el País Vasco     952
El País                                                829
El Mundo                                               864
El Correo                                              858
Diario Vasco                                           829
ABC                                                    857
La Voz de Galicia                                      829
El Periódico de Catalunya                              829
La Vanguardia (1 / 2)                                  829 / 858
Wikipedia (ES)                                         839
Wikipedia (EN)                                         829

How can these differences be explained? In the case of the Colectivo de Víctimas del Terrorismo del País Vasco, counting the victims of the Comandos Autónomos Anticapitalistas as ETA victims inflates the figure to 952. La Vanguardia’s schizophrenia comes from the fact that one of its stories is from the EFE news agency, which accepts the figure of 858, while its own reporting sticks with 829. That is the number officially recognized by the Ministry of the Interior and the Basque Government, while the Asociación Víctimas del Terrorismo defends the figure of 858, probably including victims of incidents or terrorist actions that ETA has not acknowledged. The figures from the Fundación Víctimas del Terrorismo and ABC are probably just out of date with respect to their sources, while the origin of El Mundo’s number is a mystery, since it does not come close to any of the other sources.
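To see how the published figures cluster, the list above can be tallied in a few lines of Python. This is only a rough sketch: the numbers are copied from the list in this post, the two La Vanguardia stories are counted as separate entries, and the variable names are my own.

```python
from collections import Counter

# (source, published death toll), as listed in this post.
figures = [
    ("Ministerio del Interior", 829),
    ("Fundación Víctimas del Terrorismo", 828),
    ("Asociación Víctimas del Terrorismo", 858),
    ("Colectivo Víctimas del Terrorismo en el País Vasco", 952),
    ("El País", 829),
    ("El Mundo", 864),
    ("El Correo", 858),
    ("Diario Vasco", 829),
    ("ABC", 857),
    ("La Voz de Galicia", 829),
    ("El Periódico de Catalunya", 829),
    ("La Vanguardia (1)", 829),
    ("La Vanguardia (2)", 858),
    ("Wikipedia (ES)", 839),
    ("Wikipedia (EN)", 829),
]

# How many sources publish each figure.
tally = Counter(count for _, count in figures)

# The figure used by the most sources, and how many use it.
most_common_figure, n_sources = tally.most_common(1)[0]

# Gap between the highest and the lowest published figure.
spread = max(tally) - min(tally)

for count, n in sorted(tally.items()):
    print(count, n)
```

The tally shows seven distinct figures across fifteen entries, with the official 829 being the most repeated and a gap of more than a hundred deaths between the lowest and highest counts.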

La Información deserves a special mention for this set of charts, each of which shows a different total number of victims.

First fatality: 1960 or 1968?

The date of ETA’s first deadly attack is also disputed. Some media accept the attribution to ETA of the death of the 2-year-old child Begoña Urroz Ibarrola in an incendiary bomb attack at the Amara train station in 1960, even though ETA has never claimed responsibility for this attack and most sources point to the DRIL (Directorio Revolucionario Ibérico de Liberación). Officially, ETA’s first fatality is José Pardines Arcay, in 1968.

Innovation means collaboration for media companies

Media companies were never really innovative. They used to be quick to take advantage of technological developments in their content-distribution channels to enhance their content offering, like when printing presses made it possible to reproduce pictures and, later, color. But these were not developed by the media companies. They could have sparked this innovation, and their needs may have pushed for these developments to happen, but the innovations were not theirs.

They didn’t innovate much in content production or presentation either. Newspaper sections have remained virtually untouched for years, with the same classification and categorization of information today as in the newspapers that served society 100 years ago. When the internet emerged in the 90s, they used the same information architecture in the new channel. Ultimately, newsrooms are meant to produce, following a set of rules and processes, not to do research and development, and that kind of department is rare in all but the biggest media companies.

With revenue streams getting thinner and management struggling to keep media companies profitable (or to reduce losses), even by cutting newsroom resources, R&D is not a priority, if it ever was. Some of the most disruptive innovations in advertising, information architecture and content in recent years have not come from media companies:

  • Think of how Craigslist established a new standard for online classifieds, historically a business dominated by newspapers, and one of their main revenue streams.
  • How Google first, and Groupon later, developed ways to put local businesses in touch with consumers online. Both groups are natural newspaper customers, although at different ends of the product chain.
  • How newspapers didn’t get the need for CRM, analytics and a better understanding of their audiences until it was too late, and social networks appeared to provide advertisers with a profiled audience for targeted advertising programs.
  • How newspapers not only skipped the chance to bypass intermediaries, like distribution chains, when trying to sell digital subscriptions to consumers, but jumped on the bandwagon when Apple demanded a 30% cut of their subscription revenue from iPad apps.
  • And how that demonstrates that newspaper and media companies seem fixated on the idea of “channel” instead of focusing on creating platform-agnostic content.

Innovation happens at the fringes

But not everything is that bad. We have had our share of innovation in media in the last 10 years. It just hasn’t come from media companies, but from the fringes of the media ecosystem, or even outside of it.

  • Storify, a tool to organize and create a narrative around curated content from social networks, was co-founded by a journalist.
  • Google Living Stories was developed by Google in partnership with the New York Times and The Washington Post. It’s a way to organize news content around an ongoing issue in a way that makes it easy to understand the timeline of events, as well as the access to all the related content.
  • One of the first mashups, which had a clear impact on the current interest in data visualization, including its adoption in newspapers and media, was created single-handedly by one journalist, Adrian Holovaty. His later company, Everyblock, which geolocates content from several sources in several of the major US cities, was acquired by MSNBC.

Collaboration and partnerships

If we can learn something from these stories, it is that for media companies innovation can, and will, happen thanks to collaboration and partnership. In this process, institutions like the Knight Foundation have been critical: it has provided funding to projects that otherwise might not have received it, and pushed a spirit of sharing and collaboration, demanding, for example, that software developed using its grants is released under the GPL license, and all the other material under Creative Commons licenses. A perfect example is one of its funded projects, DocumentCloud, also a partnership between ProPublica and the New York Times. It is an open source tool to share, analyze and annotate source documents, and is currently used by more than 200 newsrooms in the US.

Hacks / Hackers, an informal and loose network of meetups of journalists and programmers, which has recently seen the birth of a new chapter in Madrid, is another example of collaboration, very focused on software development for newsrooms. One of the most interesting projects lately is PDFSpy, which makes it possible to monitor changes to a large set of hosted PDF files, like the ones released by Spanish congressmen and senators. That way, a journalist would receive an alert if something is added to, or removed from, any of the 614 PDF documents.
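I don’t know how PDFSpy is implemented, but the general idea of watching a set of hosted files for changes can be sketched with content hashes. The snippet below is only an illustration under that assumption: `fingerprint`, `fetch_fingerprints` and `diff_snapshots` are hypothetical names of my own, and the file names in the demo are placeholders.

```python
import hashlib
from urllib.request import urlopen


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest identifying the exact file contents."""
    return hashlib.sha256(data).hexdigest()


def fetch_fingerprints(urls):
    """Download each URL and map it to the fingerprint of its contents."""
    snapshot = {}
    for url in urls:
        with urlopen(url) as response:
            snapshot[url] = fingerprint(response.read())
    return snapshot


def diff_snapshots(old, new):
    """Compare two {url: fingerprint} snapshots taken at different times."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(url for url in set(old) & set(new) if old[url] != new[url])
    return {"added": added, "removed": removed, "changed": changed}


# In-memory demo with placeholder names, so no network access is needed.
yesterday = {"a.pdf": fingerprint(b"version 1"), "b.pdf": fingerprint(b"same")}
today = {"a.pdf": fingerprint(b"version 2"), "c.pdf": fingerprint(b"new file")}
report = diff_snapshots(yesterday, today)
print(report)
```

In practice you would save the snapshot returned by `fetch_fingerprints` after each run and send an alert whenever `diff_snapshots` reports anything. A hash only tells you that a file changed, not what changed inside it, so a journalist would still have to open the document and compare.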

In Spain we have meetups (Café y Periodismo, the upcoming Hacks / Hackers), research groups (1001 medios) and hybrids of both (BCNMediaLab, which I co-founded). These are outlets for journalists to meet, debate and share ideas and projects, but I feel we’re going to need to take these initiatives (or new ones) a step further in order to generate the collaboration and contribute to the innovation our industry needs.

This post was first published in Catalan in the blog of the ESCACC Foundation (Espai Català de Cultura i Comunicació).