Posts tagged data
In the latter half of 2011, I offered to help Nicholas Felton spec out an iOS app for collecting survey and ambient data throughout 2012 for the next iteration of his Annual Report. Halfway through our requirements discussion – about the time I realized the app a) wouldn’t be public and b) would be a really fun challenge with a wholly unique use case – I offered to build the app myself. I’m sure we’ll share more details later, but for now Felton’s final designs speak for themselves.
4,739 reports later, the results are stunning.
We often use MapBox and the tools they create at PlaceIQ. No one else seems to understand the difficulty of presenting and analyzing spatial data better than MapBox.
Adding access to satellite imagery is great on its own, but the ability to customize the images themselves is almost essential. Toning down the colors and subtleties of sat images lets overlay data pop while still having the real world reference layer. Only people who’ve agonized over displaying spatial data would recognize the need for these adjustments.
The sat filter presets they launched today are great starting points. Like Instagram’s filters, they allow us to quickly tweak the imagery to help us more clearly communicate the story in the data.
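To make the "toning down" idea concrete, here is a minimal sketch of muting a single pixel: pull its color toward gray, then blend it toward white so overlay data stands out. This is not MapBox's API or how their presets work; the function name `mute_pixel` and its `saturation` and `lighten` parameters are made up for illustration.

```python
def mute_pixel(rgb, saturation=0.3, lighten=0.4):
    """Desaturate and lighten one (r, g, b) pixel with 0-255 channels.

    saturation: 1.0 keeps the original color, 0.0 gives pure gray.
    lighten:    0.0 leaves brightness alone, 1.0 gives pure white.
    Both parameters are hypothetical knobs, not MapBox settings.
    """
    r, g, b = rgb
    # Perceptual gray level using the common Rec. 601 luma weights.
    gray = 0.299 * r + 0.587 * g + 0.114 * b
    # Pull every channel toward that gray level.
    muted = [gray + saturation * (c - gray) for c in (r, g, b)]
    # Blend the result toward white.
    lightened = [m + lighten * (255 - m) for m in muted]
    return tuple(round(v) for v in lightened)


if __name__ == "__main__":
    # A strong red becomes a pale, dusty pink.
    print(mute_pixel((200, 50, 50)))
```

Applied to every pixel of a satellite tile, this keeps the real-world reference visible while leaving the saturated end of the palette free for the data layer.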
Give Data for Charity
Location-based analytics company Placed has launched a new app:
Placed said the app—which is free—automatically lets users earn points towards monetary donations to charities, in exchange to provide their location information to the company. Placed is backed by Madrona Venture Group, and develops analytics software used to let app developers know where their users are using their apps. Placed said users can earn points to donate to the Make-A-Wish Foundation, American Cancer Society, American Red Cross, Habitat for Humanity, Action Against Hunger, Sierra Club and the Humane Society, just for using the app; the company said it is using that data to help provide analytics to third parties in an aggregated manner.
The future, everybody.
Every human in the United States. All 308,450,225 of them.
Made by Brandon Martin-Anderson with US Census data. Click through for a zoomable map. The details are stunning.
According to GNIP, Oreo’s Gay Pride work spurred huge interactions on Tumblr, and yet when you look at Twitter’s numbers there wasn’t a blip.
The conclusion I’d draw from this is that Twitter, at a high, numerical level, renders as a static drone. It’s so big, and its messaging so limited in form, that spikes are limited, especially when the story is complicated or involves imagery. Has it always been this way or is this a product of its growth?
An Archive of Quotes
A few months ago, prior to joining my current gig, I was cobbling together a new way to look at news. Essentially, I built an RSS reader that didn’t display headlines or body copy. The server would crawl each new article and find the quotes cited by the journalist. It would polish these up and score them, based on their significance, and serve them to a client iOS app on request.
This server ran quietly for the last three months or so, with a half-baked ranking method and an imperfect quote puller. But the quotes it served to the demo client were usually interesting.
This weekend I spent some time wrapping up or winding down some side projects I’m unable to pursue, this being one of them. I’ll post the app’s source code later, but for now I’m posting the archived data in both TSV and JSON formats for any data wonks interested. There are over 15k quotes from over 19k articles from 19 publishers. Ping me with any questions or projects you build.
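For the curious, the pull-and-score step can be sketched in a few lines. This is not the code the server ran: the regular expression, the four-word minimum, and the length-based `score_quote` are all stand-in assumptions, and a real scorer would weigh significance rather than length.

```python
import re

# Text between a pair of straight or curly double quotes.
QUOTE_RE = re.compile(r'["“]([^"“”]+)["”]')


def extract_quotes(article_text, min_words=4):
    """Return quoted passages that contain at least min_words words."""
    quotes = []
    for match in QUOTE_RE.finditer(article_text):
        quote = " ".join(match.group(1).split())  # collapse stray whitespace
        if len(quote.split()) >= min_words:       # skip scare quotes like "bold"
            quotes.append(quote)
    return quotes


def score_quote(quote):
    """Hypothetical score: word count, capped at 30."""
    return min(len(quote.split()), 30)


def rank_quotes(article_text):
    """Extract quotes and return them highest score first."""
    return sorted(extract_quotes(article_text), key=score_quote, reverse=True)


if __name__ == "__main__":
    sample = (
        'The mayor said, "We will rebuild the bridge by next spring." '
        'Critics called it "bold". '
        "“It’s a step in the right direction,” she added."
    )
    for quote in rank_quotes(sample):
        print(score_quote(quote), quote)
```

The word-count floor matters more than it looks: without it, single-word scare quotes crowd out the passages a journalist actually attributed to someone.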
I’m not posting this chart to make a comment about the iPad’s dominance, but rather to applaud Chitika’s visualization tactic. Rather than include iPad figures as a giant bar, dwarfing all the others into an indistinguishable scale, Chitika chose to absorb the iPad’s performance into the Y axis.
I think this allows the chart to pull more weight: we can both understand how other tablets stack up against each other with some nuance and get that oh damn the iPad is dominating moment when we realize it’s factored into the scale itself. Including a dominating bar would only give you the oh damn, while omitting the bar entirely would only give you the also-ran nuance.
In summation: if you have a chart with an overwhelmingly strong signal, make the dominant datapoint the scale against which all others are measured. (Via GigaOm)
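A minimal sketch of that rescaling, assuming the raw numbers are simple counts per series: express every other series per 100 units of the dominant one, then leave the dominant one off the chart. The function name and the figures below are invented for illustration; they are not Chitika's data.

```python
def per_100_of_dominant(counts):
    """Rescale counts so each value reads 'per 100 of the largest series'.

    Returns the name of the dominant series and a dict of the remaining
    series on the new scale. The dominant series is dropped because it is
    now the unit of the axis itself.
    """
    dominant = max(counts, key=counts.get)
    top = counts[dominant]
    if top <= 0:
        raise ValueError("the dominant series must be positive")
    rescaled = {
        name: 100 * value / top
        for name, value in counts.items()
        if name != dominant
    }
    return dominant, rescaled


if __name__ == "__main__":
    # Made-up impression counts, purely illustrative.
    impressions = {"Tablet A": 2000, "Tablet B": 50, "Tablet C": 24}
    dominant, rescaled = per_100_of_dominant(impressions)
    print(f"Impressions per 100 {dominant} impressions")
    for name, value in rescaled.items():
        print(f"  {name}: {value:.1f}")
```

The axis label then carries the dominant series ("per 100 Tablet A impressions"), so the bars that remain can use the full height of the chart.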
Mining FDA Data Reveals a Multitude of Side Effects from Drug Interactions
With Big Data techniques, the world is your lab:
The work, published today in Science Translational Medicine, provides a way to sort through the hundreds of thousands of ‘adverse events’ reported to the US Food and Drug Administration (FDA) each year. “It’s a step in the direction of a complete catalogue of drug–drug interactions,” says the study’s lead author, Russ Altman, a bioengineer at Stanford University in California.
Although clinical trials are often designed to assess the safety of a drug in addition to how well it works, the size of the trials needed to detect the full range of drug interactions would surpass even the large, late-stage clinical trials sometimes required for drug approval. Furthermore, clinical trials are often done in controlled settings, using carefully defined criteria to determine which patients are eligible for enrolment — including other conditions they might have and which medicines they can take alongside the trial drug.
For practical studies, being able to take lab-calibre measurements in the real world trumps a lab any day. (Via Nature)