Blog survey findings: introduction and methodology

Over the past few weeks, dozens of aid and development bloggers have collaborated to promote a joint survey of our readership. This project was essentially market research. We wanted to know who actually reads these blogs we write, what those people are interested in, how they interact with the blogs they read, and what they get out of it. The survey was conceptualized and designed by me and the other bloggers involved with the Smart Aid Initiative. We knew that surveying a single blog’s readership would be inadequate, so we got a dozen other bloggers to contribute comments and help spread the word.

And oh how it spread. I lost track of all the blogs that posted the survey. Let me just say thank you to everyone who blogged, tweeted, googleplused, facebooked and otherwise shared the link! The result was an incredible 1,751 responses!

This week I’ll share the topline results in a series of posts. (There might be some charts in the Excel 2003 default template. Don’t judge me.) This initial post will deal with some methodology and housekeeping issues. The later posts will get more interesting, I promise. Also, I’ve created a page to collect all the results in one place.

Crowdsourcing the analysis

There are many insights you can draw from 1,751 responses to 18 questions. I won’t have time to do all of the analysis (and if I did, my employer might start wondering what they’re paying me for), so I’m crowdsourcing it! Which is just a fancy way of asking for help. I think the results could help answer a lot of interesting questions, especially if you do some subgroup analysis (are the grad students who read blogs interested in the same issues as the professors? do the NGO workers read for the same reasons as staff at multilateral agencies? etc.). The full dataset will be available under a Creative Commons license: you can use it for anything noncommercial, as long as you attribute the data to the Smart Aid Initiative.
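
To make the subgroup analysis idea concrete, here’s a minimal sketch in Python with pandas. The file name and the column names (occupation, main_interest) are my own placeholders; the real export headers will differ.

```python
import pandas as pd

# Hypothetical file and column names -- adjust to the actual export.
df = pd.read_excel("blog_survey_2011.xlsx")

# Example subgroup comparison: are the grad students interested in the
# same issues as the professors?
subset = df[df["occupation"].isin(["Graduate student", "Professor"])]
print(pd.crosstab(subset["occupation"], subset["main_interest"],
                  normalize="index"))
```

The same pattern works for NGO staff versus multilateral agency staff, or any other split you care about.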

To receive the full dataset in Excel format, please contact me at findwhatworks (at) gmail (dot) com. If you’ve already requested it, I’ll email it later this week. And if you write anything based on the results, please send me a link; I’d like to include it on the findings page.

Methodology and its limitations

I’m not trying to turn this into an academic paper (this is a blog, after all), but I know that many people will be interested in what we did. Here’s an overview of our methodology. Criticisms are welcome.

Survey design

In the weeks prior to the survey launch, I exchanged emails with an informal virtual focus group of about 20 other bloggers. They provided some great feedback on the research itself, the wording of the questions, and the survey’s design and layout.

Technical side

We used Google Docs to administer the survey because it’s free and simple. The tool has a few constraints, but on the whole it served the purpose well. The only glitch was that it didn’t seem to work in the Safari browser on Macs. I blame Steve Jobs.

If we repeat the survey and have a bit of a budget, maybe we can get a subscription to Survey Monkey or another tool that offers broader browser compatibility and finer control over question types.

Population and sampling

This is where any internet survey gets tricky. Our population of interest was people who read international aid/development blogs. Our sampling frame was the readership of the blogs that posted the link. Given the sheer number of posts, and the support of some heavyweight development bloggers like Chris Blattman and Duncan Green, I feel fairly confident in saying that our sampling frame comes close to matching our population. From that sampling frame, our sample was only the people who read one of those blogs during the survey’s two-week window (September 5-18), plus those who saw a link through another site, like Twitter or Facebook. Our findings tell us that 95% of respondents read blogs at least once per week, so the sample should also closely match the sampling frame.

The big drop occurs from the sample to the respondents — i.e. the response rate. Survey nonresponse is okay, as long as it’s random. Since you never really know if nonresponse is random, as a proxy we usually look for a high response rate. Unfortunately, the response rate is the Achilles’ heel of an internet survey: we have no idea how large the population, sampling frame and sample are, so we have no idea what our response rate was. This makes it hard to claim that the response rate was high, even if the number of responses is high.
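
To put numbers on that: we know the numerator (1,751 responses) but not the denominator. Here’s a quick illustration in which the sample sizes are pure guesses, not estimates:

```python
# Response rate = respondents / sample size. The numerator is known;
# the denominator is not, so these rates are purely illustrative.
respondents = 1751
for assumed_sample_size in (5000, 20000, 50000):
    rate = respondents / assumed_sample_size
    print(f"If the sample were {assumed_sample_size:,} readers: {rate:.0%}")
```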

[Chart: the drop from the full population of blog readers to the sampling frame, the sample, and finally the survey respondents]

The punchline is that it’s hard to claim that nonresponses were random. Perhaps executives have less time, and so they were less likely to take the survey even if they still spend time reading blogs. Or perhaps junior employees have less time. Maybe those living in developing countries with unreliable or costly internet connections are less likely to waste precious online time taking a survey. We don’t know. And what we don’t know might bias our results.

I don’t mean to belittle the results before even sharing them. It’s good practice to be upfront about the limitations. Given how little we knew about the population before the survey, I think we’ve been able to learn a lot. I hope these findings provide a base for future research.

One outstanding question is the size of the aid/development blogosphere’s audience. Estimating it brings conceptual challenges, such as defining which blogs are part of that blogosphere and which readers count as members of the audience. Maybe the question of size has to be tackled in conjunction with audience segmentation based on reading habits, issue interests, and so on.
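
For anyone who wants a starting point on segmentation, here’s a rough sketch. The column names are placeholders again, and k-means on dummy-coded answers is only one of many possible approaches, so treat it as a prompt rather than a recipe.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical file and column names -- adjust to the actual export.
df = pd.read_excel("blog_survey_2011.xlsx")

# Dummy-code a few categorical answers about reading habits and
# interests, then look for a handful of audience segments.
features = pd.get_dummies(
    df[["reading_frequency", "occupation", "main_interest"]])
segments = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(features)

# See how reading frequency breaks down within each segment.
print(pd.crosstab(segments, df["reading_frequency"], normalize="index"))
```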

But I’m getting ahead of myself. Stay tuned for the findings, starting with the questions on demographics and professional status…