Data Mining with R: learning by case studies
http://www.liaad.up.pt/~ltorgo/DataMiningWithR/
R is a really excellent tool ... I use it to analyse performance data from tuning sessions ....
Worldwide Inauguration via Twitter
amazing viz
"At noon EST, Barack H. Obama became the 44th president of the United States of America. Watch as the (Twitter) world watched. Below are the tweets worldwide that included inauguration with a 'positive attitude.'" - FlowingData
How to Track Twitter Clicks and Get Conversion Data
Track Twitter Clicks and Get Conversion Data
good tags
ASCII by Jason Scott / FUCK THE CLOUD
"So please, take my advice, as I go into other concentrated endeavors. Fuck the Cloud. Fuck it right in the ear. Trust it like you would trust a guy pulling up in a van offering a sweet deal on electronics. Maybe you’ll make out, maybe you won’t. But he ain’t necessarily going to be there tomorrow."
Centralization is not the future.
yeahhhhh
And even
Trust it like you would trust a guy pulling up in a van offering a sweet deal on electronics.
"By the cloud, of course, I mean this idea that you have a local machine, a box running some OS, and a vital, distinct part of what you do and what you’re about or what you consider important to you is on other machines that you don’t run, don’t control, don’t buy, don’t administrate, and don’t really understand."
Gridplane
via guthrie
Listable, create and share lists with JSON, SQL, and plaintext output
via guthrie
listable, list making, serving site
having so much fun with this already (via sccottt)
The Economy According To Mint
Recession spending by sector (US)
The spending data is cool, but it's largely meaningless to use for any decision-making about the state of the economy.
"Mint.com is in a unique position to answer this question – quantitatively. Since the crisis first hit in September, our user registration rate has more than quadrupled, giving us 900,000 sample points on the economy. That’s close to 1% of US households."
The average customer: "are spending $400 less each month than they were a year ago, have burned through half of their savings, and on average have taken on an additional $5k in debt."
Aaron Patzer
HOW TO: Take Your Data Back From Google's Claws
Good list of ways to back up all Google services
Easy instructions on how to back up Google data
Some easy solutions for extracting and backing up your data on popular Google apps and services.
Map of Popular Super Bowl Words Used on Twitter - Interactive Graphic - NYTimes.com
Yay, Rochester is tracked.
well-formed.eigenfactor.org : Visualizing information flow in science
Interactive visualizations based on the Eigenfactor™ Metrics and hierarchical clustering to explore emerging patterns in citation networks. A cooperation between the Eigenfactor Project (data analysis) and Moritz Stefaner (visualization).
Interactive visualizations based on the Eigenfactor™ Metrics and hierarchical clustering to explore emerging patterns in citation networks.
dotbot | DotNetDotCom.org
I find it cool that 7% of the web is not there.
We are just a few Seattle based guys trying to figure out how to make internet data as open as possible. You should be able to find everything you are looking for below. If not feel free to contact us. Happy Surfing!
Whole Internet in one file :)
Times Developer Network - Welcome
amidst the flames, NYT opens up a phoenix API.
The NY Times Developer Network. Looks like they have some really interesting APIs to play with.
Announcing the Article Search API - Open Blog - NYTimes.com
This API to search over 2.8 million NY Times articles back to 1981 looks incredible. Plus I really commend the Times for releasing this API. A valuable and useful resource indeed.
Google Mobile - Sync
OMG! Is it for real?
Will try this later.
Energy Information
Google energy information - popular site in Delicious
Monitor your household appliances' energy consumption via Google!
Google PowerMeter, now in prototype, will receive information from utility smart meters and energy management devices and provide anyone who signs up access to her home electricity consumption right on her iGoogle homepage. The graph below shows how someone could use this information to figure out how much energy is used by different household activities.
Security: Properly Erase Your Physical Media
For class
erase files forever, nuke HD hard drive
How Google and Facebook are using R : Data Evolution
This looks like a fun language to play with.
google tools
Last night, I moderated our Bay Area R Users Group kick-off event with a panel discussion entitled “The R and Science of Predictive Analytics”, co-located with the Predictive Analytics World conference here in SF. The panel comprised of four recognized R users from industry: * Bo Cowgill, Google * Itamar Rosenn, Facebook * David Smith, Revolution Computing * Jim Porzak, The Generations Network (and Co-Chair of our R Users Group)
looks like a promising read
R Graph Gallery :: thumbnails gallery
wow. very smooth.
Real-time graphs
meteorological data, Germany
Experience weather in a new way
DiskDigger | DmitryBrant.com
On Friday night a technology blog called Techcrunch posted a vicious and completely false rumour about us: that Last.fm handed data to the RIAA so they could track who’s been listening to the “leaked” U2 album. I denied it vehemently on the Techcrunch article, as did several other Last.fm staffers. We denied it in the Last.fm forums, on twitter, via email – basically we denied it to anyone that would listen, and now we’re denying it on our blog. According to Ars Technica, even the RIAA don’t know where the rumour came from. The Ars Technica article is worth a read by the way, as it explains how the album was leaked in the first place by U2’s record label. All the data and technical side of Last.fm is hosted in London and run by the team here. We keep a close eye on what data mining jobs we run, not because we’re paranoid the RIAA is trying to infiltrate us, but because time on our Hadoop Cluster (where the data lives) is so precious and we have lots of important jobs that run every
Wow! Last.fm pwns TC
The hacks at Tech Crunch post a rumor whose source neither Last.fm nor the RIAA know about. Congrats, chumps, you're sending the digital journalism community back several years with every post.
Best rebuttal ever.
I quit reading TC at least 18 months ago. As Rogers Cadenhead said, they're obsessed with first rather than right.
Last.fm loses its temper after TechCrunch falsely accuses it of leaking data to the RIAA.
Google Data on Rails - Google Data APIs - Google Code
This article is intended for developers interested in accessing the Google Data APIs using Ruby, specifically Ruby on Rails. It assumes the reader has some familiarity with the Ruby programming language and the Rails web-development framework. I focus on the Documents List API for most of the samples, but the same concepts can be applied to any of the Data APIs.
Gridplane
website with good graphs/charts/grids
Cool stuff, more data based than xplane
Amazon Exposes 1 Terabyte of Public Data to Developers - ReadWriteWeb
That leaked U2 album is causing all sorts of trouble. The unreleased album, which is due out on March 3, found its way ...
Annotated link http://www.diigo.com/bookmark/http%3A%2F%2Fwww.techcrunch.com%2F2009%2F02%2F20%2Fdid-lastfm-just-hand-over-user-listening-data-to-the-riaa
The Twitalyzer for Tracking Influence and Measuring Success in Twitter
Nerdiness.
OGDL, Ordered graph data language
OGDL is a structured textual format that represents information in the form of graphs, where the nodes are strings and the arcs or edges are spaces or indentation.
bush-map.gif (GIF Image, 1500x818 pixels)
design img
It's Data Privacy Day: Do You Know Where Your Data Is?
We've covered oodles of privacy apps and topics over the years at Lifehacker, but here are some of our personal favorites:
Encryption, anonymous browsing, password management, secure file deletion, encrypted communications...
Today is Data Privacy Day, during which we're encouraged to reflect on the state of our data and bolster security where we can—so let's take a closer look at our favorite data privacy tips.
Twitter Technology Blog: We Got Data
Online Data
The data collection effort about investor attitudes that I have been conducting since 1989 has now resulted in a group of Stock Market Confidence Indexes produced by the Yale School of Management. These data are collected in collaboration with Fumiko Kon-Ya and Yoshiro Tsutsui of Japan. Some of our earlier results are also noteworthy.
Robert Shiller's database
JDMP » Java Data Mining Package » About
The Java Data Mining Package (JDMP) is an open source Java library for data analysis and machine learning. It facilitates the access to data sources and machine learning algorithms (e.g. clustering, regression, classification, graphical models, optimization) and provides visualization modules. It includes a matrix library for storing and processing any kind of data, with the ability to handle very large matrices even when they do not fit into memory. Import and export interfaces are provided for JDBC data bases, TXT, CSV, Excel, Matlab, Latex, MTX, HTML, WAV, BMP and other file formats. JDMP provides a number of algorithms and tools, but also interfaces to other machine learning and data mining packages (Weka, LibSVM, Mallet, Lucene, Octave).
Data mining and visualisation tool that connects to a number of data sources (including Matlab and Weka)
datamining
LGPL 3
Amazon Web Services Blog: New AWS Public Data Sets - Economics, DBpedia, Freebase, and Wikipedia
We have just released four additional AWS public data sets, and have updated another one. In the Economics category, we have added a set of transportation databases from the US Bureau of Transportation Statistics. Data and statistics are provided for aviation, maritime, highway, transit, rail, pipeline, bike & pedestrian, and other modes of transportation, all in CSV format. I was able to locate employment data for our hometown airline and found out that they employed 9,322 full-time and 1,122 part-time employees as of the end of 2007. In the Encyclopedic category, we have added access to the DBpedia Knowledge Base, the Freebase Data Dump, and the Wikipedia Extraction, or WEX.
amazon
Rails Forms microformat « Trek
Trek Rails Forms microformat This article has been updated to reflect the latest patterns in Rails 2.3 edge (based mostly on this commit) If you’ve been relying on Rails form helpers to generate forms, then you may have missed the interesting little microformat used to pass application data to and fro. In case you didn’t know, form data is passed as part of the request body as a set of key/values pairs in plain text (if you’re using get as a method for a form, it’s that url section like this: ?name=widget12&price=22). The name attribute of the form inputs are the keys (here name and price), and the value is whatever the user entered or selected (widget12 and 22). Most languages/frameworks for the web will reconstitute these pairs as objects accessible to the programmer. For example <input name='widget_name' /> is accessed with $_POST["widget_name"] in php, self.request.get("widget_name") on App Engine, and params[:widget_name] in Rails. This format can only pass a single value f
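As a rough illustration of the key/value format the excerpt describes, the sketch below parses its example query string (`?name=widget12&price=22`) with Python's standard library. The host and path in the URL and the variable names are assumptions made for illustration; the article itself discusses Rails, PHP, and App Engine, not this code.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical URL; only the query string comes from the excerpt.
url = "http://example.com/widgets?name=widget12&price=22"

# parse_qs maps each key to a list of values, because a key may repeat.
params = parse_qs(urlsplit(url).query)

# Flatten to single values when each key is expected to appear once.
flat = {key: values[0] for key, values in params.items()}
print(flat)
```

Note that every value arrives as a string, so `price` would still need converting to a number before use.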
One of the most comprehensive articles on Rails forms. Includes the newly added nested attributes stuff.
Media Cloud
Media Cloud is a system that lets you see the flow of the media. The Internet is fundamentally altering the way that news is produced and distributed, but there are few comprehensive approaches to understanding the nature of these changes. Media Cloud automatically builds an archive of news stories and blog posts from the web, applies language processing, and gives you ways to analyze and visualize the data. The system is still in early development, but we invite you to explore our current data and suggest research ideas. This is an open-source project, and we will be releasing all of the code soon. You can read more background on the project or just get started below.
Harvard's Berkman Center announces Media Cloud: "Media Cloud is a system that lets you see the flow of the media. The Internet is fundamentally altering the way that news is produced and distributed, but there are few comprehensive approaches to understanding the nature of these changes. Media Cloud automatically builds an archive of news stories and blog posts from the web, applies language processing, and gives you ways to analyze and visualize the data. The system is still in early development, but we invite you to explore our current data and suggest research ideas. This is an open-source project, and we will be releasing all of the code soon. You can read more background on the project or just get started below."
Media Cloud is a system that lets you see the flow of the media. The Internet is fundamentally altering the way that news is produced and distributed, but there are few comprehensive approaches to understanding the nature of these changes. Media Cloud automatically builds an archive of news stories and blog posts from the web, applies language processing, and gives you ways to analyze and visualize the data.
The Guardian Open Platform | guardian.co.uk
You can use the Open Platform to develop tools exploiting the depth and quality of the Guardian's content
app building
Open Platform
The Guardian's Open Platform
Data Store: Facts you can use | Data Store | guardian.co.uk
The data store for the Guardian newspaper
datasets
Guardian launches Open Platform service to make online content available free | Media | guardian.co.uk
media
Innovative method of 'regionalising' news stories
The future
Websites can now use The Guardian's huge store of articles and information to make new websites
Tim Berners-Lee on the next Web | Video on TED.com
10 Mar 09 / NYT interactive map of immigration & foreign born in US, 1880 to 2000, by time, county, group.
fine info-graphic
Interactive Map Showing Immigration Data Since 1880 - Interactive Graphic - NYTimes.com
Goodbye Google | stopdesign
Timetric's here to help you make sense of data. It focuses on time series analysis: graphing, tracking and comparing the movements of data over time. It creates tools to make it as easy to build models on top of time series (updated whenever the data they're based on is updated) as it is to use a spreadsheet. You can use data in the program or upload your own.
CS171
Information visualization course
Data Visualization Is Reinventing Online Storytelling - Advertising Age - DigitalNext
Data Visualization
Today's consumer seems to have an insatiable appetite for information, but until recently making sense of all of that raw data was too daunting for most. Enter the new "visual scientists" who are turning bits and bytes of data -- once purely the domain of mathematicians and coders -- into stories for our digital age.
A nice roundup of the ideas and some great examples
Advertising Age - DigitalNext
Micro Persuasion: Social Networking Demographics: Boomers Jump In, Gen Y Plateaus
According to the study, baby boomers...
* Increased reading blogs and listening to podcasts by 67 percent year over year; nearly 80 times faster than Gen Y (1 percent)
* Posted a 59 percent increase in using social networking sites, more than 30 times faster than Gen Y (2 percent)
* Increased watching/posting videos on the Internet by 35 percent, while Gen Y usage decreased slightly (-2 percent)
* Accelerated playing video games on the go via mobile devices by 52 percent, 20 times faster than Gen Y (2 percent)
* Increased listening to music on an iPod or other portable music player by 49 percent, more than four times faster than Gen Y (12 percent)
Twitter’s Tweet Smell Of Success | Nielsen Wire
about Twitter's growth
about twitter
Twitter’s Tweet Smell Of Success
Economic Recovery Dashboard - Helping Advisors
Economic Recovery Dashboard - Helping Advisors
To help you talk to your clients, we've identified a few key economic and market indicators to help assess the current economic health and trend.
Chart of leading and lagging economic indicators giving current numbers and historical information.
Follow the Mobile User
Focus on the mobile user, and all else will follow: simpler data, better browsers, and a smoother experience.
This guest post is written by Vic Gundotra, Vice President of Engineering for Google's mobile and developer products. (Prior to Google, he spent ...
In 2009, for the first time, 50% of all new internet connections will come from a PHONE!
Data Visualization Is Reinventing Online Storytelling - Advertising Age - DigitalNext
Today's consumer seems to have an insatiable appetite for information, but until recently making sense of all of that raw data was too daunting for most. Enter the new "visual scientists" who are turning bits and bytes of data -- once purely the domain of mathematicians and coders -- into stories for our digital age.
Short blog post offers several different examples, not all from news organizations.
SIMILE Widgets
SIMILE Widgets: Free, Open-Source Data Visualization Web Widgets, and More. This is an open-source “spin-off” from the Simile project at MIT. Here we offer free, open-source web widgets, mostly for data visualizations. They are maintained and improved over time by a community of open-source developers.
Free, Open-Source Data Visualization Web Widgets, and More
Flickr Photo Download: Web Trend Map 4 Final Beta
I have no idea how to read it, but it looks interesting
AWESOME visualization of web trends
Web Trend map maps influential web domains and people onto a Tokyo Metro map
The Internet in the form of public transit
10 Useful Flash Components for Graphing Data
In this article, you will find ten excellent Flash components that will help you in building stunningly attractive, complex, and interactive data visuals. These components will help you create an assortment of graphs and charts to aid in presenting otherwise boring and stale numerical data.
Lifehacker - Exhibit Transforms Your Spreadsheet into an Interactive Web Page - Exhibit
Interesting mashup. It builds an interactive, filterable, groupable table from Excel, into which we can embed images, map points, etc.
Turn a boring old spreadsheet into an interactive web-based map, timeline, or table with some simple HTML using the free, open source Exhibit project.
Very cool!
Turn a boring old spreadsheet into an interactive web-based map, timeline, or table with some simple HTML using the free, open source Exhibit project. Exhibit takes data sets up to about 500 rows, plots locations on a Google Map, dates on an interactive timeline, and displays images and links in a tabular or thumbnail view. The viewer can sort, search, and filter data in any Exhibit view without reloading the page. You can make Exhibit do all this with a single HTML file and a spreadsheet, no hardcore programming required.
#FollowFriday: The Anatomy of a Twitter Trend
@JitterbugBoy #followfriday = recommend other people to follow, on a Friday. For origins, see http://tinyurl.com/b3crrz [from http://twitter.com/stevegreer/statuses/1322087659]
March 6th, 2009 | by Micah Baldwin
What if you didn’t know who to follow on Twitter? Would you randomly start following people? Would you follow people you see mentioned by those you already follow? Most likely you would ask your friends for recommendations since you can trust that your friends will suggest people who are worth following. Which is exactly how FollowFriday began.
Station Data, Free Download: 駅データ.jp
takes time to load.... but versatile views
click and roll over for pop stats of any region in the world
International comparisons of economies and societies tend to be undertaken at the country level; statistics refer to gross national product, for example, while health and education levels tend similarly to be measured and debated in national terms. However, economic performance and social indicators can vary within countries every bit as much as they do between countries. Understanding the differences and similarities in regional economic structures is essential for designing effective strategies which improve regional competitiveness and in turn increase sustainable national growth. Regions in OECD countries are classified on two territorial levels to facilitate greater comparability of regions at the same territorial level. The higher level (TL2) consists of 335 large regions. All the regions are defined within national borders and in most of the cases correspond to administrative regions.
Data Visualizations: 5 Beautiful Social Media Videos
Data visualizations are a wonderful way to display the interactions between large groups of people within a network. Virtual places like Twitter, Facebook, or Flickr can be easier understood when you see a visual representation of their inner workings. We’ve chosen five fresh videos that visualize various social media ecosystems.
Visualizing social media in very interesting ways
Herman Miller
A crowdsourcing site, in beta, which aims to answer tricky and interesting questions by getting a feel for the view of the crowd. A useful thing to try to do; very odd site design, unfriendly and unclear but arty! Not tried it yet, but watch this space.
navigationing!
Herman Miller's ThoughtPile application, bundling user generated ideas on new topics every day.
26051202.jpg (JPEG Image, 2282x1397 pixels)
Info-graphic: consumption of key resources, estimated number of years before exhaustion. Many between 5 and 40 years.
Some Notes on Distributed Key Stores « random($foo)
Distributed Key Stores
(Anti RDBMS) Key-value stores
Google Analytics Blog: Web Analytics Tips & Tricks: Attention Developers: Google Analytics API Launched!
these forums so let us know what you think about the API there, and share your ideas and your applications with us. We look forward to seeing your creativity!
Attention Developers: Google Analytics API Launched!
Visual Representation of Tabular Information - How to Fix the Uncommunicative Table | FlowingData
table data visualisation
Demo: Stunning data visualization | Video on TED.com
JoAnn Kuchera-Morin demos the AlloSphere, a new way to see, hear and interpret scientific data. Dive into the brain, feel electron spin, hear the music of the elements ... and detect previously unseen patterns that could lead to new discoveries. Composer JoAnn Kuchera-Morin is the director of the Center for Research in Electronic Art Technology (CREATE) at UC Santa Barbara.
JoAnn Kuchera-Morin: Tour the AlloSphere, a stunning way to see scientific data
Wolfram|Alpha: Our First Impressions - ReadWriteWeb
Another query with a very sophisticated result was "uncle's uncle's brother's son." Now if you type that into Google, the result will be a useless list of sites that don't even answer this specific question, but Alpha actually returns an interactive genealogic tree with additional information, including data about the 'blood relationship fraction,' for example (3.125% in this case).
The hype around Wolfram|Alpha, the next "Google killer" from the makers of Mathematica, has been building over the last few weeks. Today, we were lucky enough to attend a one-hour web demo with Stephen Wolfram, and from what we've seen, it definitely looks like it can live up to the hype - though, because it is so different from traditional search engines, it will definitely not be a "Google killer."
More impressions on Wolfram Alpha question answering engine
Official Google Blog: Adding search power to public data
All the data we've used in this first launch are produced and published by the U.S. Bureau of Labor Statistics and the U.S. Census Bureau's Population Division. They did the hard work! We just made the data a bit easier to find and use. Since Google's acquisition of Trendalyzer two years ago, we have been working on creating a new service that makes lots of data instantly available for intuitive, visual exploration.
Google launched a new search feature that makes it easy to find and compare public data. So for example, when comparing Santa Clara county data to the national unemployment rate, it becomes clear not only that Santa Clara's peak during 2002-2003 was really dramatic, but also that the recent increase is a bit more drastic than the national rate. If you go to Google.com and type in [unemployment rate] or [population] followed by a U.S. state or county, you will see the most recent estimates. Once you click the link, you'll go to an interactive chart that lets you add and remove data for different geographical areas.
http://www.google.com/publicdata?ds=usunemployment&met=unemployment_rate&idim=county:CN060850#met=unemployment_rate&idim=county:PS060900
Adding search power to public data 4/28/2009 12:17:00 PM Earthquakes are not the only thing that can shake Silicon Valley. After the dot-com bubble burst back in 2000 the unemployment rate of Santa Clara county went up to 9.1%. During the last couple of months, it has gone up again:
Google has launched a cool, if somewhat limited, new feature that makes it easier to search for and visualize statistics gleaned from public data. You can search for "unemployment rate" or "population" for any area in the United States and Google will provide you with information from the US Bureau of Labor Statistics and the Census Bureau.
"We just launched a new search feature that makes it easy to find and compare public data... If you go to Google.com and type in [unemployment rate] or [population] followed by a U.S. state or county, you will see the most recent estimates... Once you click the link, you'll go to an interactive chart that lets you add and remove data for different geographical areas."
Descry - Lab - MIX Online
mix: ways to utilize visualizations
comScore: Mobile Internet Becoming A Daily Activity For Many
comScore, Inc. reports that the number of people using their mobile device to access news and information on the Internet more than doubled from January 2008 to January 2009. Among the audience of 63.2 million people who accessed news and information on their mobile devices in January 2009, 22.4 million (35 percent) did so daily; more than double the size of the audience last year.
Lifehacker - Five Best Free Data Recovery Tools - Data Recovery
Data about the world, interesting statistics.
About a year ago the United Nations announced UNdata, a way to disseminate data stretched out across 22 United Nations databases through one central application. While UNdata houses 66 million records, it's tough to get a sense of what's going on without a visual representation. Progress is an effort to make this world data visible. More than anything though, it was a chance for me to mess around with some data. TAKE A LOOK --- A Project by FlowingData
"About a year ago the United Nations announced UNdata, a way to disseminate data stretched out across 22 United Nations databases through one central application. While UNdata houses 66 million records, it's tough to get a sense of what's going on without a visual representation. Progress is an effort to make this world data visible. More than anything though, it was a chance for me to mess around with some data."
Many Eyes: Visualization Options
Need this for class.
FANTASTIC
Finding the right way to view your data is as much an art as a science. The visualizations provided on Many Eyes range from the ordinary to the experimental. We're deliberately providing a wide array of possibilities since this is an experimental site, and expect to see more soon!
cool collection of visualizations
Schneier on Security: Privacy in the Age of Persistence
"Cardinal Richelieu famously said: 'If one would give me six lines written by the hand of the most honest man, I would find something in them to have him hanged.' When all your words and actions can be saved for later examination, different rules have to apply."
Schneier says privacy is quickly disappearing and we're ignoring it. It's like pollution at the beginning of the century: we're ignoring it now because it's small but soon we'll realize it was a big problem that should have been nipped in the bud. Also, if every conversation is recorded we have to change our standards accordingly; eg: how information is considered in a court.
"Society works precisely because conversation is ephemeral; because people forget, and because people don't have to justify every word they utter. ... Privacy isn't just about having something to hide; it's a basic right that has enormous value to democracy, liberty, and our humanity. ... Just as we look back at the beginning of the previous century and shake our heads at how people could ignore the pollution they caused, future generations will look back at us – living in the early decades of the information age – and judge our solutions to the proliferation of data. We must, all of us together, start discussing this major societal change and what it means. And we must work out a way to create a future that our grandchildren will be proud of."
Beautiful essay by Bruce Schneier on the challenges of our time due to data collection, the "pollution" of the information age. Tweeted by Thomas Kriese.
"Cardinal Richelieu famously said: "If one would give me six lines written by the hand of the most honest man, I would find something in them to have him hanged." When all your words and actions can be saved for later examination, different rules have to apply." This is especially important for those who say that they have nothing to hide. That misses the point.
Welcome to the future, where everything about you is saved. A future where your actions are recorded, your movements are tracked, and your conversations are no longer ephemeral. A future brought to you not by some 1984-like dystopia, but by the natural tendencies of computers to produce data. Data is the pollution of the information age. It's a natural byproduct of every computer-mediated interaction. It stays around forever, unless it's disposed of. It is valuable when reused, but it must be done carefully. Otherwise, its after effects are toxic. And just as 100 years ago people ignored pollution in our rush to build the Industrial Age, today we're ignoring data in our rush to build the Information Age. Increasingly, you leave a trail of digital footprints throughout your day.
37 Data-ish Blogs You Should Know About | FlowingData
You might not know it, but there are actually a ton of data and visualization blogs out there. I'm a bit of a feed addict subscribing to just about anything with a chart or a mention of statistics on it (and naturally have to do some feed-cleaning every now and then). In a follow up to my short list last year, here are the data-ish blogs, some old and some new, that continue to post interesting stuff.
What is the Open Platform? | The Guardian Open Platform | guardian.co.uk
"The Open Platform is the suite of services that make it possible for our partners to build applications with the Guardian. We've opened up our platform so that everyone can benefit from our journalism, our brand, and the technologies that power guardian.co.uk. The Open Platform currently includes two products, the Content API and the Data Store."
Just Landed: Processing, Twitter, MetaCarta & Hidden Data | blprnt.blg
Guy parses the Twitter stream with MetaCarta data services to create a Processing visualization of people's flights around the world.
drool :) "The idea is simple: Find tweets that contain this phrase, parse out the location they’d just landed in, along with the home location they list on their Twitter profile, and use this to map out travel in the Twittersphere"
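The parsing step quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions: the sample tweets, the regular expression and the `extract_trips` helper are all invented here, and the geocoding step (which the original project hands to MetaCarta) is left out entirely.

```python
import re

# Hypothetical sample data: (tweet text, home location from the profile).
SAMPLE_TWEETS = [
    ("Just landed in Lisbon! Time for coffee.", "Toronto"),
    ("just landed in San Francisco, heading to the expo", "London"),
    ("Finally home after a long day", "Berlin"),
]

# Capture the words after "just landed in", stopping at the first
# punctuation mark or at the end of the tweet.
LANDED_RE = re.compile(
    r"just landed in\s+([A-Za-z][A-Za-z' -]*?)\s*(?:[.,!?;:]|$)",
    re.IGNORECASE,
)


def extract_trips(tweets):
    """Return (home, destination) pairs for tweets containing the phrase."""
    trips = []
    for text, home in tweets:
        match = LANDED_RE.search(text)
        if match:
            trips.append((home, match.group(1)))
    return trips


trips = extract_trips(SAMPLE_TWEETS)
for home, destination in trips:
    print(f"{home} -> {destination}")
```

Each pair would then need to be turned into coordinates before any arc could be drawn on a map.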
Looking for ‘Just landed in…’ in public Twitter streams, »BOOM!« Arcs on map!
DataTables example
SIIIICK Data Table
Live Piracy Map
This map shows all the piracy and armed robbery incidents reported to the IMB Piracy Reporting Centre during 2008. If exact coordinates are not provided, estimated positions are shown based on information provided. Zoom-in and click on the pointers to view more information of an individual attack. Pointers may be superimposed on each other.
A map of all armed robbery and piracy incidents (both successful and attempted) reported last year.
Map of the points where pirate ships attacked commercial vessels.
Yarrr
Axiis : Data Visualization Framework
Open source data visualization. Axiis is an open source data visualization framework designed for beginner and expert developers alike. Whether you are building elegant charts for executive briefings or exploring the boundaries of advanced data visualization research, Axiis has something for you. Axiis provides both pre-built visualization components as well as abstract layout patterns and rendering classes that allow you to create your own unique visualizations. Axiis is built upon the Degrafa graphics framework and Adobe Flex 3.
Data.gov
The purpose of Data.gov is to increase public access to high value, machine readable datasets generated by the Executive Branch of the Federal Government. Although the initial launch of Data.gov provides a limited portion of the rich variety of Federal datasets presently available, we invite you to actively participate in shaping the future of Data.gov by suggesting additional datasets and site enhancements to provide seamless access and use of your Federal data. Visit today with us, but come back often. With your help, Data.gov will continue to grow and change in the weeks, months, and years ahead.
WOW "The purpose of Data.gov is to increase public access to high value, machine readable datasets generated by the Executive Branch of the Federal Government."
The new U.S. federal open data site is live! "Data.gov will open up the workings of government by making economic, healthcare, environmental, and other government information available on a single website, allowing the public to access raw data and transform it in innovative ways."The New York Times > Business > Image > The Road to 200 Million
Facebook's rise to 200 million users, showing network diagrams and site expansion and use
infographic on growth of Facebook
Nice diagram from NY Times showing map of the world and changing age distribution of Facebook members over time.
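Great data visualization of how the age and worldwide distribution of Facebook members has changed since launch in 2004
B木 - naoyaのはてなダイアリー
The argument that B-trees suit slow storage devices such as hard disks, while suffix arrays suit fast storage devices such as SSDs.
So chapter 18 is next. Chapter 18's
Geek Chart - Show Where You Share
A web application that generates a graphic of your geekiness.
Twitter/blog/YouTube, visualization of posting activity, pie chart/glossy, ← labels can't be read without mousing over / can't be taken in all at once
A service that turns your online activity into a pie chart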
A Geek Chart is a badge to put on your website that shows where you share stuff online. Each slice of the Geek Chart is a link to your profile on sites like Flickr, Twitter, Youtube, Digg and more.Journalism.org- The State of the News Media 2009
The State of the News Media 2009, An Annual Report on American Journalism - Presented by Journalism.org
The State of the News Media 2009 is the sixth edition of our annual report on the health and status of American journalism.
The latest report on the American press is available. When will reports like this be available for the French press?
Axiis : Data Visualization Framework
Apparently a framework that produces elegant charts
Secret of Googlenomics: Data-Fueled Recipe Brews Profitability
How Google's AdWords really works. An excellent piece by Steven Levy in Wired.
The economics behind the ads you see, and what they cost.
Article by Steven Levy (not: Steven Levitt ;-) ) on Hal Varian, Google chief economist, and the application of auctions to all kinds of logistical, organisational or economic problems.
With YQL Execute, the Internet becomes your database (Yahoo! Developer Network Blog)
The Yahoo! Query Language lets you query, filter, and join data across any web data source or service on the web. Using our YQL web service, apps run faster with fewer lines of code and a smaller network footprint. YQL uses a SQL-like language because it is a familiar and intuitive method for developers to access data. YQL treats the entire web as a source of table data, enabling developers to select * from Internet.
YQL + Linked Data = possibilities
Execute elements run server-side JavaScript with E4X (na
Dev Explorer - Reading and Writing to Excel Spreadsheets in Python
An introduction to using the xlwt and xlrd modules for Python to interact with Microsoft Excel spreadsheets.
The Three Sexy Skills of Data Geeks : Dataspora Blog
istograms, where labels and colors are minimally set by default. Their goal is to help develop a hypothesis about the data, and their audience typically numbers one or a small team. A second kind of data visualization are those intended to communicate to a wider audience, whose goal is to visually advocate for a hypothesis. While most d
The sexy job in the next ten years will be statisticians… The ability to take data—to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it—that’s going to be a hugely important skill.
parsing, and proofing one’s data before it’s suitable for analysis. Real world data is messy. At best it’s inconsistently delimited or packed into an unnecessarily complex XML sUSGovXML.com: Home
Government Data in XML (web services etc)
More datasets courtesy of uncle Sam.
List of US government addresses with web services that provide access to public information
50 Great Examples of Data Visualization | Webdesigner Depot
Kevin: As the post says: "50 of the best data visualizations and tools for creating your own visualizations out there, covering everything from Digg activity to network connectivity to what’s currently happening on Twitter."
Haven't gone through these yet. Found link via DIGG InfoVis
Data visualization
GOOD Transparencies Archive - a set on Flickr
Really amazing and beautiful charts and data visualizations from GOOD magazine.
GOOD Magazine infographics inspiration
New Twitter Research: Men Follow Men and Nobody Tweets - Conversation Starter - HarvardBusiness.org
new Twitter research
Great stuff on Twitter stats. Raises a lot of questions and possible lines of research.
Very different from other social networks... The top 10% of prolific Twitter users accounted for over 90% of tweets. An average man is almost twice as likely to follow another man as to follow a woman.
Lifehacker - Separate Your Data from Windows on a Standalone Partition - Windows
With Windows 7's release just around the corner, now's a great time to get your PC ready for the new operating system. First step: separate your data onto a dedicated partition.
Lifehacker
Socrata | Making Data Social
"Opening government to new audiences and constituencies is the 21st century battle cry in societies everywhere. At the heart of this movement is open government data, readily accessible over the internet, in a form that maximizes comprehension, interactivity, participation, and sharing, delivered at a fraction of the cost of today's data download sites."
This used to be the site called blist.
AWESOME source of data sets, .csv
Google Squared
Google Squared takes a category and creates a starter 'square' of information, automatically fetching and organizing facts from across the web.
RoamBi - Your Data, iPhone-Style
Sweet iPhone/iPod touch app!
Upload your data and turn it into interactive visualizations for the iPhone
Why CouchDB?
man, I really really wish I understood this stuff.
shows
“Django may be built for the Web, but CouchDB is built of the Web. I’ve never seen software that so completely embraces the philosophies behind HTTP. CouchDB makes Django look old-school in the same way that Django makes ASP look outdated.”
Ebook on why you would choose CouchDB
JavaScript InfoVis Toolkit - Interactive Data Visualizations for the Web
JavaScript scripting
GG
kind of flare for js: impressive!
JavaScript Information Visualization Toolkit, Meaningful Visualizations
Secret of Googlenomics: Data-Fueled Recipe Brews Profitability
"I'm going to talk about online auctions," says Hal Varian, the session's first speaker. Varian is a lanky 62-year-old professor at UC Berkeley's Haas School of Business and School of Information, but these days he's best known as Google's chief economist. This morning's crowd hasn't come for predictions about the credit market; they want to hear about Google's secret sauce
Google lays the foundations for a new form of economic organization.
Why does Google even need a chief economist? The simplest reason is that the company is an economy unto itself. The ad auction, marinated in that special sauce, is a seething laboratory of fiduciary forensics, with customers ranging from giant multinationals to dorm-room entrepreneurs, all billed by the world's largest micropayment system. Google depends on economic principles to hone what has become the search engine of choice for more than 60 percent of all Internet surfers, and the company uses auction theory to grease the skids of its own operations. All these calculations require an army of math geeks, algorithms of Ramanujanian complexity, and a sales force more comfortable with whiteboard markers than fairway irons.kmlfactbook.org
kmlfactbook.org can use either Google Maps or the Google Earth browser plugin to preview the KML files that you create. To switch between the two modes press the 2D Map and 3D Map buttons to the right in the screen.
kmlfactbook.org allows you to create Google Earth KML visualizations from your own global data-sets.
Demographic 3D mapping
Great facts for many subjects
A site that combines statistical data with a Google Map.Taking a New Look at Health : GE
java visualizer, cool information graphics
really nice website that makes numbers and stats clear to seePatrick Collison » blog » Hacking for fun and profit with Mathematica and the Google Analytics API
http://collison.ie/blog/2009/04/hacking-for-fun-and-profit-with-mathematica-and-the-google-analytics-api
Guesses vs. Data as Basis for Design Recommendations (Jakob Nielsen's Alertbox)
unread
Wherein we are told—or reminded—that the smallest amount of empirical data from real users quadruples the probability of being right.
Jakob Nielsen's Alertbox
Can Computer Nerds Save Journalism? - TIME
A cadre of newly minted media whiz kids, who mix high-tech savvy with hard-nosed reporting skills, are taking a closer look at ways in which 21st century code-crunching and old-fashioned reporting can not only coexist but also thrive.
Journalism schools aren't just incorporating computer skills into their curriculums -- they're recruiting techies with full-ride scholarships
Journalists, change the way you work, or it's off to Pôle emploi (the French unemployment office)
"A cadre of newly minted media whiz kids, who mix high-tech savvy with hard-nosed reporting skills, are taking a closer look at ways in which 21st century code-crunching and old-fashioned reporting can not only coexist but also thrive." - To answer the question in the headline - "No." No one group of journalists/computer geeks are going to "save" journalism.
A cadre of newly minted media whiz kids, who mix high-tech savvy with hard-nosed reporting skills, are taking a closer look at ways in which 21st century code-crunching and old-fashioned reporting can not only coexist but also thrive. And the first batch of them has just emerged from Northwestern University's Medill School of Journalism.Using jQuery To Manipulate and Filter Data - Nettuts+
Data Evolution
ittle to do with their lascivious leanings (ahem, BedPost), and more with the scarcity of their skills. I believe that the folks to whom Hal Varian is referring are not statisticians in the narrow sense, but rather people who possess skills in three key, yet independent areas: statistics, data munging, and data visualization. (In parentheses next to each, I’ve put the salient character trait needed to acquire it).
The Web of Data: Creating Machine-Accessible Information - ReadWriteWeb
The Web of Data: Creating Machine-Accessible Information In the coming years, we will see a revolution in the ability of machines to access, process, and apply information. This revolution will emerge from three distinct areas of activity connected to the Semantic Web: the Web of Data, the Web of Services, and the Web of Identity providers. These webs aim to make semantic knowledge of data accessible, semantic services available and connectable, and semantic knowledge of individuals processable, respectively. In this post, we will look at the first of these Webs (of Data) and see how making information accessible to machines will transform how we find information. The amount of information and services available is growing exponentially. Every day, it is getting harder to find the information we are actually looking for. Still, we have to learn how to tell machines what we want. Why can't a machine understand which website, recent tweet, Flickr photo, Facebook message, or restaurantJQuery HowTo: 5 easy tips on how to improve code performance with huge data sets in jQuery
I am guilty of many of these
Google Fusion Tables (Pre-Alpha)
Fusion Tables is an online platform with new technology that unifies different types of data and promises savings for companies.
Mobile phones: Sensors and sensitivity | The Economist
Good
2009-06-04 IF YOUR mobile phone could talk, it could reveal a great deal. Obviously it would know many of your innermost secrets, being privy to your calls and text messages, and possibly your e-mail and diary, too. It also knows where you have been, how you get to work, where you like to go for lunch, what time you got home, and where you like to go at the weekend. Now imagine being able to aggregate this sort of information from large numbers of phones. It would be possible to determine and analyse how people move around cities, how social groups interact, how quickly traffic is moving and even how diseases might spread. The world’s 4 billion mobile phones could be turned into sensors on a global data-collection network.
"Data collection: Mobile phones provide new ways to gather information, both manually and automatically, over wide areas."
"As a first step, Sense plans to collect positional information from a control group of infected patients being treated at Helen Joseph Hospital in Johannesburg who would have to volunteer to participate in the scheme. Dr Pentland and his colleagues will then be able to determine which neighbourhoods these patients frequent, and their commuting patterns between them. They hope this will then enable them to work out the characteristics of typical TB patients, so that they can then spot potentially infected people in the wider population. How public-health officials will use this information has yet to be decided: people who are thought to be infected could be contacted by text message and asked to visit a doctor, for example."
"Sense plans to collect positional information from a control group of infected patients. [They] will then be able to determine which neighbourhoods these patients frequent, and their commuting patterns between them. They hope this will then enable them to work out the characteristics of typical TB patients, so that they can then spot potentially infected people in the wider population."
Official Google Research Blog: Google Fusion Tables
Database systems are notorious for being hard to use. It is even more difficult to integrate data from multiple sources and collaborate on large data sets with people outside your organization. Without an easy way to offer all the collaborators access to the same server, data sets get copied, emailed and ftp'd--resulting in multiple versions that get out of sync very quickly. Today we're introducing Google Fusion Tables on Labs, an experimental system for data management in the cloud. It draws on the expertise of folks within Google Research who have been studying collaboration, data integration, and user requirements from a variety of domains. Fusion Tables is not a traditional database system focusing on complicated SQL queries and transaction processing. Instead, the focus is on fusing data management and collaboration: merging multiple data sources, discussion of the data, querying, visualization, and Web publishing. We plan to iteratively add new features to the systems as weInfoGraphic Designs: Overview, Examples and Best Practices | Showcases | instantShift
Information graphics or infographics are visual representations of information, data or knowledge. These graphics are used where complex information needs to be , Daily Resource for Web Designers and Developers.
Best practices for designing infographics followed by some examples which might help you learn a thing or two.
Andres Ross from InstantShift.com has written an article about information graphics entitled: InfoGraphic Designs: Overview, Examples and Best Practices. He starts by answering the questions “What is InfoGraphics” and “Why using InfoGraphics”, then goes on to tell a bit about their history and the different types that can be distinguished.
The Beauty of Infographics and Data Visualization | Abduzeedo | Graphic Design Inspiration and Photoshop Tutorials
man.. yeah
Here in Brazil there's a magazine called "Super Interessante" (which had an Abduzeedo cover a little time ago), that always features some really cool infographics. This is a field that if you make things right, you got yourself inside a great
A collection of ideas for presentation graphics
Investigate your MP's expenses
Join us in digging through the documents of MPs' expenses to identify individual claims, or documents that you think merit further investigation. You can work through your own MP's expenses, or just hit the button below to start reviewing.
Investigate your MP's expenses Join us in digging through the documents of MPs' expenses to identify individual claims, or documents that you think merit further investigation. You can work through your own MP's expenses, or just hit the button below to start reviewing. (Update, Fri pm: we now have a virtually complete set of expenses documents so you should be able to find your MP's)
We hope that many hands can make light work of the thousands of documents released by Parliament in relation to MPs’ expenses. We, and others - perhaps you? - are using these tools to review each document, decide whether it contains interesting information, and extract the key facts.
Nice crowdsourcing
Guardian crowd sourcing investigative journalism
Join us in digging through the 700,000 documents of MPs' expenses to identify individual claims, or documents that you think merit further investigation. You can work through your own MP's expenses, or just hit the button below to start reviewing. (Update, Thurs evening: More added now and more coming all the time. Check back if you haven't found your MP yet) Already created an account? Log in here.
Brilliant! Help the Guardian find suspicious MP expenses claims that can be flagged to the authorities! Aaaah, the wonders of the internet!New York City Homicides Map - The New York Times
Infographic with information about murders in New York
SitePen Blog » JavaScriptDB: Persevere’s New High-Performance Storage Engine
JavaScriptDB: Persevere’s New High-Performance Storage Engine April 20th, 2009 at 8:47 pm by Kris Zyp The latest beta of Persevere features a new native object storage engine called JavaScriptDB that provides high-end scalability and performance. Persevere now outperforms the common PHP and MySQL combination for accessing data via HTTP by about 40% and outperforms CouchDB by 249%. The new storage engine is designed and optimized specifically for persisting JavaScript and JSON data with dynamic object structures. It is also built for extreme scalability, with support for up to 9,000 petabytes of JSON/JS data in addition to any binary data.IP address geolocation SQL database :: IPInfoDB
Complete (City)
The SQL database behind ipinfodb.com is offered for free. We offer the database in different formats (SQL, CSV), city or country precision, 3 or 4 IP digits precision and data in single or multiple tables. Available information in the database : ISO country code, country name, FIPS region code, region name, city, zipcode, latitude, longitude and GMT/DST timezone. The database is updated during the first week of each month.Tim Berners-Lee on the next Web | Video on TED.com
Tim Berners-Lee invented the World Wide Web. He leads the World Wide Web Consortium, overseeing the Web's standards and development
Web 3.0
20 Visualizations to Understand Crime | FlowingData
Lovely.
There's a lot of crime data. For almost every reported crime, there's a paper or digital record of it somewhere, which means hundreds of thousands of data points - number of thefts, break-ins, assaults, and homicides as well as where and when the incidents occurred. With all this data it's no surprise that the NYPD (and more recently, the LAPD) took a liking to COMPSTAT, an accountability management system driven by data. While a lot of this crime data is kept confidential to respect people's privacy, there's still plenty of publicly available records. Here we take a look at twenty visualization examples that explore this data.
Visualizations of crime incidents
Video: Designing for Big Data, by Jeffrey Veen
This is a 20-minute talk I gave at the Web2.0 Expo in San Francisco a couple weeks ago. In it, I describe two trends: how we're shifting as a culture from consumers to participants, and how technology has enabled massive amounts of data to be recorded, stored, and analyzed. Putting those things together has resulted in some fascinating innovations that echo data visualization work that's been happening for centuries.
highlighting the shifts in design techniques for streams rather than a drop of data.
Why 1974 was the seminal year for Web 2.0
Web Squared: Web 2.0 Five Years On: Web 2.0 Summit 2009 - Co-produced by TechWeb & O'Reilly Conferences, October 20 - 22, 2009, San Francisco, CA
Join us for a webcast about Web Squared on Thursday, June 25 at 10:00 a.m. Pacific time with John Battelle and Tim O'Reilly.
"we’ll get to the Internet of Things via a hodgepodge of sensor data contributing, bottom-up, to machine-learning applications that gradually make more and more sense of the data that is handed to them. ... As the information shadows become thicker, more substantial, the need for explicit metadata diminishes. Our cameras, our microphones, are becoming the eyes and ears of the Web, our motion sensors, proximity sensors its proprioception, GPS its sense of location. Indeed, the baby is growing up. We are meeting the Internet, and it is us"; "evidence shows that formal systems for adding a priori meaning to digital data are actually less powerful than informal systems that extract that meaning by feature recognition"; "There are many who worry about the dehumanizing effect of technology. We share that worry, but also see the counter-trend, that communication binds us together, gives us shared context, and ultimately shared identity"Lifehacker - Five Best Free System Restore Tools - Disk image
Free software to clone/image your hard drive
Foller.me - before you follow...
Gives an overview of any Twitter account
6 Gorgeous Twitter Visualizations
Worldwide Real-Time Firefox Downloads
Watch Firefox 3.5 takeup in real time
Federal IT Dashboard
157 Investments Evaluated by Agency CIOs. View detailed chart by agency. Note: All descriptions, dates, and costs are as reported by agencies. Major investments (Investments Evaluated) represent only a portion of the agency's entire IT portfolio reported in Exhibit 53.
Drupal site
20+ CSS Data Visualization Techniques | tripwire magazine
Get inspired.
Graph with CSS
OECD Factbook eXplorer for analysing country statistics
Interesting site which allows manipulation and animation of set data. Bears further investigation.
Putting Government Data online - Design Issues
Notes from Tim Berners-Lee
Tales from the encrypt: the secrets of data protection | Technology | guardian.co.uk
Tales from the encrypt: If you care about the integrity of your data, it's time to investigate solutions for accessing and securing it – and not just for the here and now
"But what if I were killed or incapacitated before I managed to hand the passphrase over to an executor or solicitor who could use them to unlock all this stuff that will be critical to winding down my affairs – or keeping them going, in the event that I'm incapacitated? I don't want to simply hand the passphrase over to my wife, or my lawyer. Partly that's because the secrecy of a passphrase known only to one person and never written down is vastly superior to the secrecy of a passphrase that has been written down and stored in more than one place. Further, many countries' laws make it difficult or impossible for a court to order you to turn over your keys; once the passphrase is known by a third party, its security from legal attack is greatly undermined, as the law generally protects your knowledge of someone else's keys to a lesser extent than it protects your own."
The Nike Experiment: How the Shoe Giant Unleashed the Power of Personal Metrics
"Call it Living by Numbers—the ability to gather and analyze data about yourself, setting up a feedback loop that we can use to upgrade our lives, from better health to better habits to better performance."
article writing up the success & simplicity of nike+. Part of a bigger piece on improving our lives through tracking and managing personal data. Features my friend Michael Tchao who leads Nike TechLab.
Veronica Noone attached a small sensor to her running shoes and headed out the door. She pressed start on her iPod and began keeping track of every step she took. It wasn't a long run—just 1.67 miles in 18 minutes and 36 seconds, but it was the start of something very big for her.Influential Marketing Blog: 10 Stunning (And Useful) Stats About Twitter
Interesting Stats about Twitter!
Rise of the Data Scientist | FlowingData
Interesting!
How The Average U.S. Consumer Spends Their Paycheck - Visual Economics
Source: Consumer expenditures, U.S. department of Labor, U.S. Bureau of Labor Statistics, April 2009
Average consumption of a US American
SQL Databases Are An Overapplied Solution (And What To Use Instead)
Google Maps Data API - Google Code
Background
ReadWriteWeb Interview With Tim Berners-Lee, Part 1: Linked Data
The inventor of the WWW talks about the invention of his life
First of a 2-part interview between McManus of RWW and Berners-Lee, director of the W3C and inventor of the Web
Official Google Research Blog: Large-scale graph computing at Google
I want one of these! "We have created scalable infrastructure, named Pregel, to mine a wide range of graphs. In Pregel, programs are expressed as a sequence of iterations. In each iteration, a vertex can, independently of other vertices, receive messages sent to it in the previous iteration, send messages to other vertices, modify its own and its outgoing edges' states, and mutate the graph's topology (experts in parallel processing will recognize that the Bulk Synchronous Parallel Model inspired Pregel). Currently, Pregel scales to billions of vertices and edges, but this limit will keep expanding. Pregel's applicability is harder to quantify, but so far we haven't come across a type of graph or a practical graph computing problem which is not solvable with Pregel. It computes over large graphs much faster than alternatives, and the application programming interface is easy to use. Implementing PageRank, for example, takes only about 15 lines of code. "
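The vertex-centric iteration model described in that quote can be sketched in a few lines of plain Python. This is a single-process toy, not Google's API: the `pregel_pagerank` function, the dict-of-lists graph format and the sample graph are assumptions made up for illustration, and it assumes every vertex has at least one outgoing edge.

```python
def pregel_pagerank(graph, supersteps=30, damping=0.85):
    """Toy vertex-centric PageRank.

    graph maps each vertex to a list of vertices it links to.
    In every superstep a vertex reads only the messages sent to it in the
    previous superstep, updates its own value, and sends messages along
    its outgoing edges.
    """
    n = len(graph)
    rank = {v: 1.0 / n for v in graph}
    inbox = {v: [] for v in graph}  # messages from the previous superstep

    for step in range(supersteps):
        outbox = {v: [] for v in graph}  # messages for the next superstep
        for vertex, out_edges in graph.items():
            if step > 0:
                # Update own state from incoming messages only.
                rank[vertex] = (1 - damping) / n + damping * sum(inbox[vertex])
            # Send an equal share of the current rank to each neighbour.
            for target in out_edges:
                outbox[target].append(rank[vertex] / len(out_edges))
        inbox = outbox  # barrier: messages become visible next superstep

    return rank


# Hypothetical three-page web: a links to b and c, b links to c, c links to a.
ranks = pregel_pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
for page, value in sorted(ranks.items()):
    print(page, round(value, 3))
```

Because a vertex never touches another vertex's state directly, the inner loop could in principle be spread across many machines, which is the point of the model.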
Kernel
So many things to learn and apply in business deals.
http://spinn3r.com/rank
You should follow me on Twitter | Dustin Curtis
Hey RIA and web vendors: how can your tools let people do this kind of experimenting?
Experiment on increasing CTR to your twitter page
Experimenting with different phrases.
your.flowingdata / Capture your life in data.
I don't quite understand this one yet and they don't have any samples or examples on the site, but eventually, I will get to try it out.
Use your.flowingdata to collect data about yourself and your surroundings with Twitter. Record what you eat, when you go to sleep, how much television you watch, or whatever else you want. What you track is completely up to you.
The Human's Development :: we ain't plastic
human's developmentVanish: Enhancing the Privacy of the Web with Self-Destructing Data
Program that makes email self destruct
...behind Vanish in detail. Briefly, as mentioned above, the user never knows the encryption key. This means that there is no risk of the user exposing that key at some point in the future, perhaps through coercion, court order, or compromise. So what do we do with the key? We could escrow it with a third party, but that raises serious trust issues (e.g., the case with Hushmail).
Storing the decryption key across many p2p nodes means you can "lose" the key at a specified time. As long as one of the p2p nodes you have used destroys the key, we can no longer decrypt the message. The theory is certainly sound; let's hope the implementation is too.
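A minimal sketch of the "lose one piece and the key is gone" idea from the comment above, using plain XOR splitting. This is an assumption chosen for simplicity: it is an all-or-nothing split, and the real Vanish system may well use a different, threshold-based scheme. The function names are hypothetical.

```python
import secrets


def split_key(key: bytes, n: int) -> list:
    """Split key into n shares; all n are needed to rebuild it (XOR split)."""
    shares = [secrets.token_bytes(len(key)) for _ in range(n - 1)]
    last = key
    for share in shares:
        last = bytes(a ^ b for a, b in zip(last, share))
    return shares + [last]


def combine(shares: list) -> bytes:
    """XOR all shares together; yields the key only if none is missing."""
    out = bytes(len(shares[0]))
    for share in shares:
        out = bytes(a ^ b for a, b in zip(out, share))
    return out


key = secrets.token_bytes(16)
shares = split_key(key, 5)           # imagine one share per p2p node
print(combine(shares) == key)        # all shares present
print(combine(shares[:-1]) == key)   # one node "forgot" its share
```

With this kind of split, any subset short of all the shares looks like random bytes, which is why a single node discarding its share is enough.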
Vanish is a research system designed to give users control over the lifetime of personal data stored on the web or in the cloud. Specifically, all copies of Vanish encrypted data — even archived or cached copies — will become permanently unreadable at a specific time, without any action on the part of the user or any third party or centralized service.The Social Data Revolution(s) - Now, New, Next - HarvardBusiness.org
Successful interactions have become genuine communication with near-instantaneous feedback. For example, PayScale allows users to retrieve real-time salary reports based on their job title, location, education, and experience-but only after they have contributed their own data. As the expectations of users change, firms must spend more time developing incentive systems that will entice more users to participate. Consequently, the online world is beginning to be ruled by the expectations of the users. No longer is it sufficient for a search engine to cough up some hotels across the world when a weary traveler is looking for a good deal in Bangkok! As these consumer expectations shift, companies that want to stay relevant have no choice but to accept the ideas of the consumer revolution as swiftly as possible. For users, switching costs are cheap
In 2009, more data will be generated by individuals than in the entire history of mankind through 2008. Information overload is more serious than ever. What are the implications for marketing?
People trust people, not advertising, in product choices
Supposedly humans will create more data in 2009 than in all time up to 2008. Not sure where that data comes fromStatPlot.com: Visualize Sports Stats - Powered by StatSheet.com
That last link should have been http://statplot.com/ ; the link I sent pointed to the article describing the site. [from http://twitter.com/dariuus/statuses/1939829918]
Create interactive stat charts based on sports
Build graphical charts using NFL, NBA, College Basketball, College Football, and NASCAR stats. Each chart comes in three formats: interactive Flash, image, and thumbnail.NYC Subway Ridership, 1905-2006
Really nicely done. Via Shaunrx.
Cool
Gorgeous map of NYC subway ridership over timeAnalysis from the Bottom Up | The Data Model That Nearly Killed Me
Medical personnel at urgent care and the hospital who interacted with me all used a version of the same electronic health information system (the “system”). It became clear that everyone was fighting that system. Indeed, they wasted between 40% and 60% of their time making the system do something useful for them. The system kept everyone from fulfilling their duties – the health information system did not help medical professionals perform their duties.
Scary story, though I'm not sure this can be blamed on the data model; there are probably a dozen layers between people and the data model (including training and management acceptance) that have a much greater impact. Points are well taken, though.
On February 17, 2009, President Barack Obama signed into law the economic stimulus package that appropriated about $20 billion for health information technology ("Technology Gets a Piece of Stimulus", New York Times, January 25, 2009. The American Recovery and Reinvestment Act of 2009, Subtitle A - Promotion of Health Information Technology, details the epically massive government program to digitize and network health information.) The law makes a job for yet another bureaucrat to oversee the vast program - is this change we can believe in? It defines rules for health information standards by designating a new standards board - everyone desires more data standards and standards groups. The law also explains how to test systems built with federal money but it does not explain how to measure semantic validity of information - garbage in garbage out! Good luck with all of that Mr. President.How Different Groups Spend Their Day - Interactive Graphic - NYTimes.com
A survey of how people spend their day
Interactive report that lets you filter by age group and employment status.How Different Groups Spend Their Day - Interactive Graphic - NYTimes.com
interactive
A really cool graph.
The American Time Use Survey asks thousands of American residents to recall every minute of a day. Here is how people over age 15 spent their time in 2008.GetDataBack - Data Recovery Software
This tutorial will show you how to use Amazon EC2 and Cloudera's Distribution for Hadoop to run batch jobs for a data intensive web application. During the tutorial, we will perform the following data processing steps:
* Configure and launch a Hadoop cluster on Amazon EC2 using the Cloudera tools
* Load Wikipedia log data into Hadoop from Amazon Elastic Block Store (EBS) snapshots and Amazon S3
* Run simple Pig and Hive commands on the log data
* Write a MapReduce job to clean the raw data and aggregate it to a daily level (page_title, date, count)
* Write a Hive query that finds trending Wikipedia articles by calling a custom mapper script
* Join the trend data in Hive with a table of Wikipedia page IDs
* Export the trend query results to S3 as a tab delimited text file for use in our web application's MySQL database
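The "clean the raw data and aggregate it to a daily level" step can be sketched in plain Python as a map function plus a reduce function. This is not the tutorial's code. The input layout (`YYYYMMDD-HH page_title count`, whitespace separated) is an assumption made for illustration, as are the function names and the sample lines.

```python
from collections import defaultdict


def map_line(line):
    """Emit ((page_title, date), count) for one raw line; skip malformed ones.

    Assumed (hypothetical) layout: 'YYYYMMDD-HH page_title count'.
    """
    parts = line.split()
    if len(parts) != 3:
        return
    timestamp, title, count = parts
    if not count.isdigit():
        return
    yield (title, timestamp[:8]), int(count)


def reduce_counts(pairs):
    """Sum the counts for every (page_title, date) key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def daily_counts(lines):
    pairs = (pair for line in lines for pair in map_line(line))
    return reduce_counts(pairs)


# Made-up sample lines, only to show the shape of the input and output.
sample = [
    "20090601-00 Hadoop 3",
    "20090601-01 Hadoop 4",
    "20090602-00 Hadoop 1",
    "garbage line that is skipped",
]
print(daily_counts(sample))
```

On a real cluster the map and reduce halves would run as separate distributed phases; here they are chained in one process to show the data flow.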
This tutorial will show how to use Amazon EC2 and Cloudera's Distribution for Hadoop to run batch jobs for a data intensive web application.SocialSafe: Get Your Facebook Data Out of Facebook
SocialSafe is an effective, fun new application enabling Facebook users to manage their Facebook data offline on their home computer.
How to get your Facebook data out of Facebook.Information Is Beautiful | Ideas, issues, concepts, subjects - visualized!
As data visualization often needs to reach a broad audience the browser is becoming the number one tool to publish and share visualizations. A lot of visualizations require user-interaction to unleash their full potential, thus interactive applets that run directly in the browser are a great way to analyze the data at hand. Besides the usual suspects like Flash, Silverlight and Processing, JavaScript is quickly gaining ground in the field of interactive visualization embedded in websites. We’ve collected 13 JavaScript visualization libraries that help you get started faster, keep it flexible and develop with higher reliability.Stats Confirm It: Teens Don’t Tweet
Teens may not be tweeting, but there is some evidence here of other generations taking to it
One of the hardest demographics to reachHow Different Groups Spend Their Day - Interactive Graphic - NYTimes.com
How Different Groups Spend Their Day. Very nicely made infographic.This Is How I Collect Marketing Data. - livedoor Director Blog
Very cool stuff...
Data Visualization is a method of presenting information in a graphical form. Good data visualization should appear as if it is a work of art. This intrigues the viewer and draws them in so that they can further investigate the data and info that the graphic represents. In this post there are 15 stunning examples of Data Visualization that are true works of art.http://avant.interactionconsortium.com/australian_internet/#
Australian web projects visualizedStatistics Show Social Media Is Bigger Than You Think « Socialnomics – Social Media Blog
see numbers below the videoPersonas | Metropath(ologies) | An installation by Aaron Zinman
Who are you when you aren't there?
Personas is an art installation by Aaron Zinman that is a component of Metropath(ologies), an interactive exhibit by the Sociable Media Group, MIT Media Lab. Metropath(ologies) is by Alex Dragulescu, Yannick Assogba, Aaron Zinman under the direction of Prof. Judith Donath.
Personas is a component of the Metropath(ologies) exhibit, currently on display at the MIT Museum by the Sociable Media Group from the MIT Media Lab. It uses sophisticated natural language processing and the Internet to create a data portrait of one's aggregated online identity. In short, Personas shows you how the Internet sees you.Personas | Metropath(ologies) | An installation by Aaron Zinman
Enter your name, and Personas scours the web for information and attempts to characterize the person - to fit them to a predetermined set of categories that an algorithmic process created from a massive corpus of data. The computational process is visualized with each stage of the analysis, finally resulting in the presentation of a seemingly authoritative personal profile.DataSF - DataSF - Liberating City Data
Why can't every city have this?
City of SF opens site containing datasets
"DataSF is a clearinghouse of datasets available from the City & County of San Francisco. While there is plenty of room for improvement, our goal in releasing this site is: (1) improve access to data (2) help our community create innovative apps (3) understand what datasets you'd like to see (4) get feedback on the quality of our datasets."Best Science Visualization Videos of 2009 | Wired Science | Wired.com
Best science visualization videos of 2009Personas | Metropath(ologies) | An installation by Aaron Zinman
What have you been doing on the web? MIT knows!6 Gorgeous Facebook Visualizations
create beautiful Facebook visualizations of your own with very little effort. Enjoy!
This Facebook add-on lets you easily create 3D graphical representations of the connections in your Facebook network. You can also view graphs for other users, fine tune the settings to create various graphs, zoom and pan your graph, and choose between a light and a dark theme. The results can be stunning, especially for users with a lot of Facebook friends.
Great Social Site with connections to all other social sitesZombies.pdf (application/pdf Object)
Potential models for zombie outbreaks.
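A rough sketch of what a basic susceptible/zombie/removed model of this kind can look like, stepped forward with Euler's method. The exact form of the equations and all parameter values below are assumptions for illustration and should be checked against the paper; the function name is hypothetical.

```python
def simulate_szr(s0, z0, r0, alpha, beta, zeta, delta,
                 birth=0.0, dt=0.01, steps=1000):
    """Euler integration of an assumed basic S/Z/R outbreak model.

    s: susceptible humans, z: zombies, r: removed (dead, may rise again).
    alpha: rate at which humans destroy zombies, beta: infection rate,
    zeta: rate at which the removed rise as zombies, delta: natural deaths.
    """
    s, z, r = float(s0), float(z0), float(r0)
    for _ in range(steps):
        infections = beta * s * z   # humans turned by zombie encounters
        kills = alpha * s * z       # zombies destroyed by humans
        deaths = delta * s          # non-zombie-related human deaths
        risen = zeta * r            # removed individuals coming back
        s += dt * (birth - infections - deaths)
        z += dt * (infections + risen - kills)
        r += dt * (deaths + kills - risen)
    return s, z, r


# Illustrative parameter values only, not fitted to anything.
s, z, r = simulate_szr(500, 1, 0, alpha=0.005, beta=0.0095,
                       zeta=0.0001, delta=0.0001)
print(round(s, 3), round(z, 1), round(r, 1))
```

With the infection rate larger than the kill rate, the susceptible population collapses in this sketch, which is the doomsday outcome the abstract describes.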
Zombies are a popular figure in pop culture/entertainment and they are usually portrayed as being brought about through an outbreak or epidemic. Consequently, we model a zombie attack, using biological assumptions based on popular zombie movies. We introduce a basic model for zombie infection, determine equilibria and their stability, and illustrate the outcome with numerical solutions. We then refine the model to introduce a latent period of zombification, whereby humans are infected, but not infectious, before becoming undead. We then modify the model to include the effects of possible quarantine or a cure. Finally, we examine the impact of regular, impulsive reductions in the number of zombies and derive conditions under which eradication can occur. We show that only quick, aggressive attacks can stave off the doomsday scenario: the collapse of society as zombies overtake us all.Obama | One People
getting me excited about data
Graphs of the impact of Obama on people/demographics in less than 1 year since inauguration.Developers: Never Mind the APIs, Here's YQL Execute - ReadWriteWeb
Read: Developers: Never Mind the APIs, Here's YQL Execute [feedly] http://tr.im/koyE [from http://twitter.com/krisnelson/statuses/1693267224]
RWW's @jolieodell dares to tackle the powerful beast that is the new YQL Execute http://bit.ly/J1gxO and so far has lived to tell the tale [from http://twitter.com/marshallk/statuses/1680054262]
...includes explanation of what YQL is, starting with: a sophisticated solution that is agnostic across all Internet platforms and that lowers both the burden of labor and the barriers to entry for social and other web application developersTrending Topics: Hot Wikipedia Topics - Powered by Hadoop & EC2
A search engine for trending topics. Built by Data Wrangling with Cloudera Hadoop; shows some massive data processing abilities.
awesome website that mines Wikipedia traffic levelsHow XML Threatens Big Data : Dataspora Blog
Back in 2000, I went to France to build a genomics platform. A biotech hired me to combine their in-house genome data with that of public repositories like Genbank. The problem was the repositories, all with millions of records, each had their own format. It sounded like a massive, nightmarish data interoperability project. And an ideal fit for a hot new technology : XML
Three Rules for XML Rebels 1. Stop Inventing New Formats 2. Obey the Fifteen Minute Rule 3. Embrace Lazy Data Modeling
An interesting point of view on XML, running counter to the prevailing ideas in information science (mine, at any rate)
Excellent thoughtful article on data bureaucracy and the limitations of XML.Chart.ly
from twitter, chart sharing
Service to share charts on TwitterHow Big Is the Apple iPhone App Economy? The Answer Might Surprise You
If I were to tell you that Apple’s app economy was worth more than $2.4 billion a year, you would laugh hysterically, shake your head and walk out of the room, yes? Surf on over to some other web site? But here I am telling you exactly that! According to mobile advertising startup AdMob, there are some $200 million worth of applications sold in Apple’s iPhone store every month, or about $2.4 billion a year. Just to put that in context, Apple says about 1.5 billion apps have been downloaded from the App Store. In comparison, the Android marketplace brings in about $5 million a month or on a run rate to do $60 million in a year, AdMob says. I bet that number rises up sharply once more handsets come to market. As you know, Motorola is announcing its new Android handsets at our Mobilize 09 conference on September 10.Tweeting By Numbers: 7 Ways to Become a Twitter Analyst
There are plenty of Twitter tools out there designed to help you understand Twitter metrics. These tools come in handy for measuring change in tweet fluctuations, charting follower count numbers, finding out hashtag frequency, and quantifying Twitter activity.
Could a journalist use this info?More Truth About Twitter | Information Is Beautiful
Twitter statistics in graphical format, incl. "If the Twitter community were 100 people", "The Average 100 Tweets", and "Peak Days/Hours".
peaks in Twitter activity
Sweet charts that highlight Twitter users and usage.Food Nutrition Comparisons | twofoods
Food Nutrition Comparisons | twofoods
two foods
Nutritional comparison of two food types. Strangely neat. EFL comparisonsDataMasher
Infographics of public data
1. Pick a data set (e.g. Poverty Rate). 2. Choose an operator (-, ×, ÷). 3. Pick another data set (e.g. Unemployment). Your mashup: Poverty Rate combined with Unemployment.
To empower people to discover and discuss government data through manipulation and mapping.
DataMasher is a tool that takes these vast quantities of information and allows you to whittle it down into simpler terms, offering an easy way to get hard data on certain topics without any intrusive media spin.VC blog » Blog Archive » Information Visualization Manifesto
facilitate understanding and aid cognition
The purpose of visualization is insight, not pictures, as Ben Shneiderman already said in 1999. Manuel Lima proposes several criteria in his manifesto that follow from this observation: form follows function, interactivity is key, the power of narrative...
Infoviz is becoming more and more popular and, just as anything growing popular, also controversial. Here's a list with some good points on good information visualization.What Visualization Tool/Software Should You Use? – Getting Started | FlowingData
We're gluttons for infographics, and a team at Kansas State just served up a feast: maps of sin created by plotting per-capita stats on things like theft (envy) and STDs (lust). Christian clergy, likely noting the Bible Belt's status as Wrath Central, question the "science." Valid point—or maybe it's just the pride talking.Gov 2.0: It’s All About The Platform
In this regard, there’s a CNN story from last April that I like to tell: a road into a state park in Kauai was washed out, and the state government said it didn’t have the money to fix it. The park would be closed. Understanding the impact on the local economy, a group of businesses chipped in, organized a group of volunteers, and fixed the road themselves. I called this DIY on a civic scale. Scott Heiferman corrected me: “It’s DIO: Not ‘Do it Yourself’ but ‘Do it Ourselves.’” Imagine if the state government were to reimagine itself not as a vending machine but an organizing engine for civic action. Might DIO help us tackle other problems that bedevil us? Can we imagine a new compact between government and the public, in which government puts in place mechanisms for services that are delivered not by government, but by private citizens? In other words, can government become a platform? We have an enormous opportunity right now to make a difference. There’s a receptivity to new ideas tThe Most Interesting New Tech Startup of 2009 - Anil Dash
The USA as the most interesting tech startup of '09, Anil Dash style.
was at 16 min in video... really interesting interview from US gov CIO/Wired
Ireland needs to reward govt APIs.
I think the most promising new startup of 2009 is one of the least likely: The executive branch of the federal government of the United States.
Each site has remarkably consistent branding elements, leading to a predictable and trustworthy sense of place when you visit the sites. There is clear attention to design, both from the cosmetic elements of these pages, and from the thoughtfulness of the information architecture on each site. (The clear, focused promotional areas on each homepage feel just like the "Sign up now!" links on the site of most Web 2.0 companies.) And increasingly, these services are being accompanied by new APIs and data sources that can be used by others to build interesting applications.Google Domestic Trends - Google Finance
Google Domestic Trends track Google search traffic across specific sectors of the economy. Changes in the search volume of a given sector on google.com may provide unique economic insight. You can access individual trend indexes by clicking on the left-hand navigation.
Google Domestic Trends track Google search traffic across specific sectors of the economy.Top 5 Web Trends of 2009: Structured Data
First time I've seen OpenCalais; I should try benchmarking it~
which"Anonymized" data really isn't—and here's why not - Ars Technica
birthdateIntroducing News Dots - By Chris Wilson - Slate Magazine
"An interactive map of how every story in the news is related, updated daily"
Like a human social network, the news tends to cluster around popular topics. One clump of dots might relate to a flavor-of-the-week tabloid story (the Jaycee Dugard kidnapping) while another might center on Afghanistan, Iraq, and the military. Most stories are more closely related than you think. The Dugard kidnapping, for example, connects to California Gov. Arnold Schwarzenegger, who connects to the White House, which connects to Afghanistan. To use this interactive tool, just click on a circle to see which stories mention that topic and which other topics it connects to in the network. You can use the magnifying glass icons to zoom in and out. You can also drag the dots around if they overlap. A more detailed description of how News Dots works is available below the graphic.
greGoogle - Internet Stats
This Google resource brings the latest industry facts and insights together in one place. These have been collected from a number of third party vendors covering a range of topics from macroscopic economic and media trends to how consumer behaviour and technology are changing over time.Data Visualization and Infographics Resources | Developer's Toolbox | Smashing Magazine
People don't neglect backing up their computers because it's hard—it isn't, at all. No, people file into the inevitable death march of data loss for one reason: Backing up usually costs money. But it doesn't have to.
How To: Back Up All Your Stuff, For Free - Backup - GizmodoIf You’re Not Seeing Data, You’re Not Seeing | Gadget Lab | Wired.com
“Augmented reality,” where data from the network overlays your view of the real world. Your phone’s screen shows the real world overlaid with additional information such as the location of subway entrances, the price of houses, or Twitter messages that have been posted nearby. And publishers, moviemakers and toymakers have embraced a version of the technology to enhance their products and advertising campaigns.
As you shove your way through the crowd in a baseball stadium, the lenses of your digital glasses display the names, hometowns and favorite hobbies of the strangers surrounding you. Then you claim a seat and fix your attention on the batter, and his player statistics pop up in a transparent box in the corner of your field of vision.WTF is a SuperColumn? An Intro to the Cassandra Data Model — Arin Sarkissian
Nice detailed examples on NoSQL data modeling in Cassandra.The Data Liberation Front (the Data Liberation Front)
We intend for this site to be a central location for information on how to move your data in and out of Google products. Welcome.
"We intend for this site to be a central location for information on how to move your data in and out of Google products. Welcome." :-D
We intend for this site to be a central location for information on how to move your data in and out of Google products. Welcome. The Data Liberation Front The Data Liberation Front is an engineering team at Google whose singular goal is to make it easier for users to move their data in and out of Google products. We do this because we believe that any data that you create in or import into a product is your own. We help and consult other engineering teams within Google on how to "liberate" their products. This is our mission statementThe Data Liberation Front (the Data Liberation Front)
saveReadWriteWeb's Top 5 Web Trends of 2009
Last week we ran a series of posts outlining the 5 biggest Internet trends of this year: Structured Data, Real-Time Web, Personalization, Mobile Web / Augmented Reality, Internet of Things. Effectively this was ReadWriteWeb's State of the Web 2009. We've now compiled the main points into a single presentation.
ReadWriteWeb's State of the Web in 2009. Structured Data (http://www.readwriteweb.com/archives/top_5_web_trends_of_2009_structured_data.php), Real-Time Web (http://www.readwriteweb.com/archives/top_5_web_trends_of_2009_the_real-time_web.php), Personalization (http://www.readwriteweb.com/archives/top_5_web_trends_of_2009_personalization.php), Mobile Web / Augmented Reality (http://www.readwriteweb.com/archives/top_5_web_trends_of_2009_mobile_web_augmented_reality.php), Internet of Things (http://www.readwriteweb.com/archives/top_5_web_trends_of_2009_internet_of_things.php)
Effectively this was ReadWriteWeb's State of the Web 2009.Evolution of a Revolution: Visualizing Millions of Iran Tweets
Visualizing Millions of Iran Tweets - computational history of news using twitter
At its peak, a search for "Iran" on Twitter generated over 100,000 tweets per day and over 8,000 tweets per hour. The plot just below shows the growth in volume of information in the number of tweets per hour. How does an Internet junkie, news organization, or political operative monitor rapidly evolving real-time events, from the crucial details to the bigger picture? More importantly, how can a data stream be turned into real-time action, reaching the people who need it, when they need it, and in a form they can easily digest?
Article describes an effort aimed at more sophisticated analysis of Twitter trends. Author is co-founder of Infoharmoni - startup building knowledge interfaces for real-time data sets.
How to algorithmically discover and deploy novel social structures is perhaps the billion, or trillion, dollar question. With Twitter, the data and API are in place. And if the history of computation is any guide, once programming a system becomes possible, progressing from a hack to an application to a platform is only a matter of time.
'...how can a data stream be turned into real-time action, reaching the people who need it, when they need it, and in a form they can easily digest? At the most abstract level, history and computation are the same thing: the evolution of systems over time. Twitter has several remarkable properties that allow us to finally leverage this correspondence in tangible ways. The simplicity of its data, the openness of its system, and its extreme time resolution make it possible for us to detect atoms of history, those moments when something is triggered and society is reconfigured ever so slightly. Simply tracking the volume of various phrases gives us a sense of what is happening on the street, literally and figuratively. But that signal is but a shadow of a far more complex and intricate reality, an interwoven web of individuals and actions. -- Disruptive events lead to information elites.'Nihilogic : Canvas Visualizations of Sorting Algorithms
Some great graphics to use when demonstrating the different topicsAggData | AggData
The goal of AggData is to play a small part in making this sought-out data more accessible, portable and reliable.
great source for aggregated data
AggData is short for aggregate data, which means a set of data that is collected together in one place. On this site, the AggData will come in the form of a list of records, where each record has details about a specific object in the group.
data aggregated by web scraping
another free data library.Dictionary of Algorithms and Data Structures
Definitions of algorithms, data structures, and classical Computer Science problems. Some entries have links to implementations and more information.Anscombe's quartet - Wikipedia, the free encyclopedia
e linear relations6 Incredible Twitter Powered Art Projects
Twitter has brought us many things. It lets us communicate in real-time about breaking news events, it lets us share content like photos, music, and videos, and it lets us do business in new ways. But Twitter is also being used to power some very intriguing and beautiful virtual art projects. Tweets are being visualized and mashed up with other content in ways that create stunning online art. In this post we’ll highlight six incredible experimental art projects that are using Twitter as a basis for their awesome creations. These visualizations go beyond just displaying data in more interesting ways; they are also truly fascinating pieces of online art.
this is kinda neat...kindaWhere The Buffalo Roamed « Weather Sealed
A map of McDonald's locations across the United States.
To gauge the creep of cookie-cutter commercialism, there’s no better barometer than McDonald’s – ubiquitous fast food chain and inaugural megacorporate colonizer of small towns nationwide. So, I set out to determine the farthest point from a Micky Dee’s – in the lower 48 states, at least. This endeavor required information, and the nice folks at AggData were kind enough to provide it to me: a complete list of all 13,000-or-so U.S. restaurants, in CSV format, geolocated for maximum convenience. From there, a bit of software engineering gymnastics, and… Behold, a visualization of the contiguous United States, colored by distance to the nearest domestic McDonald’s!
"Which begs the question: just how far away can you get from our world of generic convenience? And how would you figure that out? [...] To gauge the creep of cookie-cutter commercialism, there’s no better barometer than McDonald’s"
A visualization of the contiguous United States, colored by distance to the nearest domestic McDonald’s
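One way the "distance to the nearest McDonald's" idea could be computed is great-circle (haversine) distance with a brute-force search over restaurant locations. This is a guess at the approach, not the author's code, and the coordinates below are made-up placeholders rather than real restaurant data.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_distance_km(point, sites):
    """Distance from point to the closest site (brute force)."""
    return min(haversine_km(point[0], point[1], s[0], s[1]) for s in sites)


def farthest_grid_point(grid, sites):
    """Grid point whose nearest site is farthest away."""
    return max(grid, key=lambda p: nearest_distance_km(p, sites))


# Placeholder coordinates, NOT real restaurant locations.
sites = [(40.0, -100.0), (45.0, -110.0)]
grid = [(lat, lon) for lat in range(38, 48) for lon in range(-115, -95)]
print(farthest_grid_point(grid, sites))
```

For roughly 13,000 locations and a fine grid, a spatial index such as a k-d tree would be the obvious next step; brute force is only meant to show the idea.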
"As expected, McDonald’s cluster at the population centers and hug the highway grid. East of the Mississippi, there’s wall-to-wall coverage, except for a handful of meager gaps centered on the Adirondacks, inland Maine, the Everglades, and outlying West Virginia. For maximum McSparseness, we look westward, towards the deepest, darkest holes in our map: the barren deserts of central Nevada, the arid hills of southeastern Oregon, the rugged wilderness of Idaho’s Salmon River Mountains, and the conspicuous well of blackness on the high plains of northwestern South Dakota. There, in a patch of rolling grassland, loosely hemmed in by Bismarck, Dickinson, Pierre, and the greater Rapid City-Spearfish-Sturgis metropolitan area, we find our answer."
dy/dan » Blog Archive » What I Would Do With This: Groceries
"The express lane isn't faster. The manager backed me up on this one. You attract more people holding fewer total items, but as the data shows above, when you add one person to the line, you're adding 48 extra seconds to the line length (that's "tender time" added to "other time") without even considering the items in her cart. Meanwhile, an extra item only costs you an extra 2.8 seconds. Therefore, you'd rather add 17 more items to the line than one extra person! I can't believe I'm dropping exclamation points in an essay on grocery shopping but that's how this stuff makes me feel."
less helpful
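The arithmetic in the express-lane quote can be checked with a short sketch. It assumes the simple linear model the quote implies (a fixed 48 seconds per person plus 2.8 seconds per item). The function name and the two example lines are made up for illustration.

```python
SECONDS_PER_PERSON = 48.0  # "tender time" plus "other time", from the quote
SECONDS_PER_ITEM = 2.8     # per-item cost, from the quote


def estimated_wait_seconds(people, total_items):
    """Linear estimate of how long a checkout line takes to clear."""
    return people * SECONDS_PER_PERSON + total_items * SECONDS_PER_ITEM


# One extra person costs about as much time as this many extra items.
items_per_person_equivalent = SECONDS_PER_PERSON / SECONDS_PER_ITEM

# Hypothetical comparison: an express lane with 4 people holding 5 items each
# versus a regular lane with 2 people holding 20 items each.
express_wait = estimated_wait_seconds(people=4, total_items=20)
regular_wait = estimated_wait_seconds(people=2, total_items=40)
```

Under these numbers one person is worth a little over 17 items, and the hypothetical regular lane clears faster than the express lane despite holding twice as many items.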
All other things being equal, which lane is the fastest?
30 Resources to Find the Data You Need | FlowingData
Let's say you have this idea for a visualization or application, or you're just curious about some trend. But you have a problem. You can't find the data, and without the data, you can't even start. This is a guide and a list of sources for where you can find that data you're looking for. There's a lot out there.
Universities: Being a graduate student, I always look to the library for books and resources. Many libraries are amping up their technology and have some expansive data archives. Many statistics departments also tend to keep a list of data somewhere.
New York City Homicides Map - The New York Times
a macabre map... murders in new york
Each day, the New York Police Department announces major crimes, including most homicides, in the five boroughs. This data is compiled from those reports, in addition to news accounts, court records and additional reporting. The map will be updated as new information becomes available.
Forrester Predicts Huge Growth for Social Media Marketing
Forrester Research is holding its own conference down in Orlando and has just revealed its predictions for the growth of online advertising. The bottom line is that social media and mobile will be the hottest, but just about everything will see an upward trend.
Future forecast is pretty positive for social media
Graphs on marketing spending projections for social media
Twitter Data Analysis: An Investor’s Perspective
RT @rgbroitman: Some good Twitter stats here: http://ow.ly/sXKz [from http://twitter.com/cyberdad/statuses/4657457633]
From TechCrunch: Robert Moore, CEO and co-founder of RJMetrics, provides highlights on Twitter usage.
50 Great Examples of Data Visualization
tools
data sets, graphically displayed, overly complex, IMO
InfoQ: Persistent Data Structures and Managed References
Persistent Data Structures and Managed References
Rich Hickey
good overview of concurrent data structures, refs and STM in Clojure
Rich Hickey is my new favourite presenter. The bloke is awesome (well except for the mullet you can see in this video)! In this video he discusses time as it relates to variable state, briefly applying it to Clojure. What makes it great is his practical focus and his use of a really simple example to get across the point.
20 Essential Infographics & Data Visualization Blogs | Inspired Magazine
Nice collection of blogs specialising in good (and bad) data visualisation.
list of blogs about charts and infographics
Elements of Statistical Learning: data mining, inference, and prediction. 2nd Edition.
Hastie, Tibshirani and Friedman (2008). Springer-Verlag. Full-text PDF is free.
free online book
@dataspora: "The Elements of Statistical Learning, the authoritative text on the subject, now free at authors' site http://bit.ly/2J8WNK (ht @johndcook)" (from http://twitter.com/dataspora/status/4847621837)
Factual
Service for basically creating shared databases: sounds quite interesting!
Factual is a platform where anyone can share and mash open data on any subject. For example, you might find a directory of California restaurants, a database of endocrinologists, or a list of American Idol finalists. We provide smart tools to help the community build and maintain a trusted source of structured data. And this data can be used through widgets and APIs to help application developers and content publishers be more innovative and productive.
open data edit
Online data collections
Data!
Book of Odds - The Odds of Everyday Life
This is an interesting site focusing on odds and statistics.
Ernest Marples' Postcode Latitude/Longitude Lookup API
Free, open postcode to location API
Post codes are really useful, but the powers that be keep them closed unless you have loads of money to pay for them. Which makes it hard to build useful websites. So we are setting them free. We're doing the same as everyone's been doing for years, but just being open about it.
13 Interesting Infographics for Web Workers | Web Design Ledger
webdesignledger.com: infographics for web workers
Infographics are a great way to get people to actually look at data. The use of visual design elements can simplify complex information and make it easier to digest
Know Privacy
A comparison of users' expectations of privacy online and the data collection practices of website operators.
Approach: A comparison of users' expectations of privacy online and the data collection practices of website operators. Goal: To identify specific practices that may be harmful or deceptive and attract the attention of government regulators. Result: Recommendations for policymakers to protect consumers and for website operators to avoid stricter regulation.
research site for ghostery
The Current State of Web Privacy, Data Collection, and Information Sharing
evil!
Know Privacy: research by Joshua Gomez, Travis Pinnick, and Ashkan Soltani, UC Berkeley School of Information, class of 2009
Know Thyself: Tracking Every Facet of Life, from Sleep to Mood to Pain, 24/7/365
quantifying and optimizing all facets of life
quantifiedself.com.
27+ Beautiful Examples of Infographics | Dzine Blog
As designers, we’re constantly searching for ways to improve and style our designs. This is exactly what the following 30 infographics and sites display: the breaking of rules.
As you search the web you'll come across a wide range of interactive and graphical maps. Deciding when, where and how to integrate or display a map on your
Getting It Wrong: Surprising Tips on How to Learn: Scientific American
"People remember things better, longer, if they are given very challenging tests on the material, tests at which they are bound to fail. In a series of experiments, they showed that if students make an unsuccessful attempt to retrieve information before receiving an answer, they remember the information better than in a control condition in which they simply study the information. Trying and failing to retrieve the answer is actually helpful to learning. It’s an idea that has obvious applications for education, but could be useful for anyone who is trying to learn new material of any kind."
Reminded me that asking questions BEFORE reading the chapter is a better way to prepare students for learning.
Grafitter // Visualizing Your Life on Twitter, IM, Delicious, and Blogger.
Nice idea: 1. Know yourself, 2. Collect data, 3. Visualise
Grafitter is a personal informatics tool for collecting and exploring information about your habits and patterns. Use the Grafitter format on Twitter, IM, Delicious, and Blogger to collect data about yourself easily and quickly.
Grafitter is a way of collecting information about yourself over time while sending updates with Twitter, using IM, saving bookmarks on Delicious, and writing a post on Blogger. Grafitter visualizes the information you record in graphs.
Facebook | Engineering @ Facebook's Notes
Portal that aims to make public information about the secrets of Washington DC
RT @cshirky: RT THIS is a big deal. OpenSecrets.org releases 200 million [gov't] data records. Today. http://bit.ly/fdXS [from http://twitter.com/danielgillval/statuses/1512025784]
Measuring Link-Bait of Articles I have flagged in the past.
OpenSecrets.org opens up its data -- feel free to mash up information on campaign finance, lobbying, personal finances and much more
data.australia.gov.au – beta
data.australia.gov.au is the home of Australian government public information datasets. Like Data.gov, it has a wide variety of downloadable government data on topics such as crime, weather, and public lands--as well as some very Australian topics, such as the location and attributes of barbecues on public lands.
data.australia.gov.au is the home of Australian government public information datasets. We encourage you to make government information even more useful by mashing-up the data to create something new and exciting! Make sure you pay attention to the licence attached to the datasets you are interested in using. Each licence should make clear what you can and can’t do with the data. If you’re unsure, please contact the contributing agency.
Live Interactive Ships Traffic Worldwide Map
Way too interesting
Ship traffic map - real time
Google Gives You A Privacy Dashboard To Show Just How Much It Knows About you
the answer is??
www.google.com/dashboard
Google Dashboard: Now You Know What Google Knows About You
How much does Google know about you? http://ow.ly/zUKz [from http://twitter.com/LauraleeGuthrie/statuses/5484012833]
There’s no two ways about it: if you use a lot of Google services, then Google knows a lot about you. Google has received a solid amount of criticism because of this, and they’ve decided to alleviate the issue by launching Privacy Dashboard, a one-stop-shop with all the information that Google knows about you and your online habits collected in one place. Dashboard covers more than 20 products and services, including Gmail, Calendar, Docs, Web History, Orkut, YouTube, Picasa, Talk, Reader, Alerts, Latitude and others. It’s quite a scary list; personally, I’m using all of these, and I was quite interested to see what exactly I’ve told Google about myself without even knowing.
What Facebook Quizzes Know About You
A: Almost everything you've put on there.
How Races and Religions Match in Online Dating « OkTrends
Since he’s a Pisces and I’m a Virgo, Chris and I of course think the Zodiac is total bullshit, and it was very gratifying to have the data bear this out. Here are the grouped match percentages for a random pool of 500,000 users. Astrological sign has no effect whatsoever on how compatible two people are.
data mining of how people describe themselves on a dating site
The Quantified Self
"I track myself - 40 things about my body, mind, and activity - every day" -- Alexandra Carmichael
"I track myself - 40 things about my body, mind, and activity - every day. The fact that I do this tracking seems to interest people. Whether they are driven by curiosity about the phenomenon of personal data collection, or by the desire for a yardstick by which to measure and compare themselves, the fascination exists."
New York Times - Linked Open Data
For the last 150 years, The New York Times has maintained one of the most authoritative news vocabularies ever developed. In 2009, we began to publish this vocabulary as linked open data. The Data The New York Times has published 5,000 people subject headings as linked open data under a CC BY license. We provide both RDF documents and a human-friendly HTML versions.
People subject headings for New York Times
data.nytimes.com
SQL Databases Don't Scale
"Sharding kills most of the value of a relational database."
sql database db
7 Visualization Groups On Flickr to Find Inspiration | FlowingData
Nice tutorial.
This worked very well; lots of ideas here.
GeoAPI Home
All your location needs in one API.
pskomoroch's dataset Bookmarks on Delicious
Resource list of public datasets
Python Package Index : pdfminer 20090330
PDFMiner is a suite of programs that aims to help with extracting or analyzing text data from PDF documents. Unlike other PDF-related tools, it allows you to obtain the exact location of texts in a page, as well as other layout information such as font size or font name, which could be useful for analyzing the document. It can also be used as a basis for a full-fledged PDF interpreter.
Do music artists fare better in a world with illegal file-sharing? — Times Labs Blog
The most immediate revelation, of course, is that at some point next year revenues from gigs payable to artists will for the first time overtake revenues accrued by labels from sales of recorded music.
Do music artists fare better in a world with illegal file-sharing? This is the graph the record industry doesn’t want you to see. It shows the fate of the three main pillars of music industry revenue - recorded music, live music, and PRS revenues (royalties collected on behalf of artists when their music is played in public) over the last 5 years.
Do music artists fare better in a world with illegal file-sharing
The disturbing inaccuracy behind Google Analytics - iMediaConnection.com
iMediaConnection.com
The predictable problem with bounces vs. visits. NB! Check the comments.
It is critical to know how any web metrics package calculates its numbers, even Google. You cannot assume, no matter how big the company, that the numbers will be correct.
Article about a flaw in how Google Analytics is counting its visits
You're Backing Up Your Data the Wrong Way - Windows - Lifehacker
Time and time again, people tell me that they've bought an external hard drive to back up their pictures, music, and documents. Great, right? Sadly, that's not always the case. There's one simple rule about backups that everybody needs to fully understand: Your files should exist in at least Two places, or it's no longer a backup—and your data is at risk. Too often people delete the files from their primary PC, assuming they are backed up. It's time to educate people on proper backup strategy, so we'll run through your options and talk about the pros and cons. These days, you've got plenty of choices on the Windows side of things, Mac users have Time Machine, and there's online backup for anybody.
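The "at least two places" rule can be checked mechanically. The sketch below is an illustration, not something from the article: it treats a file as backed up only if the original still exists and at least one copy elsewhere has identical contents, compared by SHA-256 digest. The function names and the choice of hash are assumptions.

```python
import hashlib
from pathlib import Path
from typing import Iterable


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_backed_up(original: Path, copies: Iterable[Path]) -> bool:
    """True only if the original exists AND some copy has identical contents."""
    if not original.is_file():
        # The file lives in a single place at most, so it is not a backup.
        return False
    wanted = sha256_of(original)
    return any(copy.is_file() and sha256_of(copy) == wanted for copy in copies)
```

Deleting the primary copy after moving it to an external drive makes `is_backed_up` return False, which is exactly the failure the article warns about.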
Backing Up to a Local Source: When it comes to local backup applications, it's really a matter of preference, since most of them do the job adequately without a lot of fuss. The Backup and Restore application built into Windows 7 or Vista is a perfectly acceptable choice, and will handle mo
Web Squared: When Web 2.0 Meets Internet of Things
Recently Tim O'Reilly and John Battelle released a white paper entitled Web Squared: Web 2.0 Five Years On. It focuses squarely, pardon the pun, on the intersection of ...
cyoa
As a child of the 80s, the Choose Your Own Adventure books were a fixture of my rainy afternoons. My elementary school library kept a low, fairly unmaintained-looking shelf of them hidden in one of its back corners. Whether this non-marquee placement was an attempt by the librarians to deemphasize the books in favor of ‘serious’ (children’s) literature or was simply my good luck I still haven’t worked out. But it meant there was a place that I could retreat to and dive into unfamiliar worlds without distraction.
Amazing website dedicated to choose your own adventure books
In the story, your Concorde flight is interrupted when you are beamed aboard a nearby spacecraft trolling the universe for intelligent life. Once aboard you discover your new captors, the U-TY, are interested in keeping you around only to the extent that you can help them find Ultima, the ‘planet of paradise’. The planet’s location is cloaked in mystery and you are only told that it’s a place that cannot be reached ‘by making a choice or following directions’. However this is all foreshadowing for when the reader finally becomes frustrated in the apparently impossible quest and begins flipping through the book hunting for that ending. In fact not choosing is the only way to reach Ultima. The branch diagram for UFO 54-40 is unique in that it has one ending – the Ultima ending – which is completely disconnected from the rest of the story. It exists as an island, unreachable through choices but discoverable thanks to the random access nature of the book.
The Anatomy Of An Infographic: 5 Steps To Create A Powerful Visual | Spyre Studios
Advice on how to make really useful infographics.
Information is very powerful but for the most part it is bland and unimaginative. Infographics channel information in a visually pleasing, instantly understandable manner, making it not only powerful, but extremely beautiful. Once used predominantly to make maps more approachable, scientific charts less daunting and as key learning tools for children, infographics have now permeated all aspects of the modern world.
What is Pivot?
Here at Live Labs we’re all about experiments, and Pivot is our most ambitious to date. Pivot makes it easier to interact with massive amounts of data in ways that are powerful, informative, and fun. We tried to step back and design an interaction model that accommodates the complexity and scale of information rather than the traditional structure of the Web.
multimediafinal
Timeline of unemployment rate, county by county
Animated time-lapse map of county-by-county unemployment rates in the U.S. since January 2007. Jarring.
Creepy.
This depicts a graphic of the unemployment rate from 2007 to current date. Fascinating.
Wow.
Physical Storage vs. Digital Storage | The Mozy Blog
Physical and digital storage
9/11 Pager data
From 3AM on Wednesday November 25, 2009, until 3AM the following day (US east coast time), WikiLeaks is releasing over half a million US national text pager intercepts. The intercepts cover a 24 hour period surrounding the September 11, 2001 attacks in New York and Washington.
"WikiLeaks is releasing over half a million US national text pager intercepts. The intercepts cover a 24 hour period surrounding the September 11, 2001 attacks in New York and Washington. [...] Text pagers are usually carried by persons operating in an official capacity. Messages in the archive range from Pentagon, FBI, FEMA and New York Police Department exchanges, to computers reporting faults at investment banks inside the World Trade Center"
Visualizing empires decline on Vimeo
crazy
"This is mainly an experimentation with soft bodies using toxi's verlet springs. The data refers to the evolution of the top 4 maritime empires of the XIX and XX centuries by extent. The visual emphasis is on their decline." [Via: http://kottke.org/09/11/the-fall-of-empires "The fall of empires A visualization of the decline of the world's four maritime empires (British, Portuguese, French, Spanish) from 1800 to 2009."]
This is mainly an experimentation with soft bodies using toxi's verlet springs. The data refers to the evolution of the top 4 maritime empires of the XIX and XX centuries by extent. The visual emphasis is on their decline. More on that project: mondeguinho.com/master/visual-experimentations/visualizing-empires
9 Ways to Visualize Proportions – A Guide | FlowingData
Nice summary of some data visualization techniques
visualization data charts
"figure out what graph or chart suits your data best"
Nice range of options for viewing fractions... ratios...
Featured Download: Darik's Boot and Nuke is the Nuclear Option of Secure Data Shredding
When it comes down to actually choosing targets, men choose the modelesque. Someone like roomtodance above gets nearly 5 times as many messages as a typical woman and 28 times as many messages as a woman at the low end of our curve. Site-wide, two-thirds of male messages go to the best-looking third of women. So basically, guys are fighting each other 2-for-1 for the absolute best-rated females, while plenty of potentially charming, even cute, girls go unwritten. ....the most salient of which is that the average-looking woman has convinced herself that the vast majority of males aren’t good enough for her, but she then goes right out and messages them anyway.
This week we will be confronting a fact that, by definition, haunts the average online dater: no matter how much time you spend polishing your profile, honing your IM banter, and perfecting your message introductions, it’s your picture that matters most.
Posted by angela
The Boom of Social Sites | Other | Focus.com
The explosion of social networking sites over the past decade has facilitated a transformation in the way we communicate with each other. Here we look at some of these communities with over 1 million users, both active and defunct.
chart of the number of users of social networks since the beginning
Infographic of various social sites, with dates established and member counts
Interactive Map Showing Immigration Data Since 1890 - Interactive Graphic - NYTimes.com
See how foreign-born groups settled in your area and across the United States from 1880 to 2000.
This is really pretty cool, particularly how the trends so visibly change over time.
Select a foreign-born group to see how they settled across the United States.
Semantics Incorporated: Tying Web 3.0, the Semantic Web and Linked Data Together --- Part 1/3: Web 3.0 Will Not Solve Information Overload
PART 1: Web 3.0: I've been following a fascinating 3-part series of posts this week by Greg Boutin, founder of Growthroute Ventures. The series aimed to tie together 3 big trends, all based around structured data: 1) the still nascent "Web 3.0" concept, 2) the relatively new kid on the structured Web block, Linked Data, and 3) the long-running saga that is the Semantic Web.
YjWta.jpg (JPEG Image, 1024x767 pixels)
Pick the right chart based on what you are trying to convey.
Chart Suggestions - A Thought Starter
Chart Suggestions - A Thought-Starter. What would you like to show
BBC News - Information goes out to play
The power of visual information
chart
Serious information used to be relayed in words, graphs and charts - pictures were just pretty window dressing. That's all changing, says David McCandless. E-mails. News. Facebook. Wikipedia. Do you ever feel there's just too much information? Do you struggle to keep up with important issues, subject and ideas? Are you drowning in data? In this age of information overload, a new solution is emerging that could help us cope with the oceans of data surrounding and swamping us. It's called information visualisation.
Natural Earth
Natural Earth is a public domain map dataset available at 1:10m, 1:50m, and 1:110m scales. Featuring tightly integrated vector and raster data, with Natural Earth you can make a variety of visually pleasing, well-crafted maps with cartography or GIS software.
Eureqa | Cornell Computational Synthesis Laboratory
Eureqa is a software tool for detecting equations and hidden mathematical relationships in your data. Its primary goal is to identify the simplest mathematical formulas which could describe the underlying mechanisms that produced the data. Eureqa is free to download and use. Below you will find the program download, video tutorial, user forum, and other reference materials.
"Eureqa is a software tool for detecting equations and hidden mathematical relationships in your data."
It's just what I've always wanted! Thank you!
Uses GA to discover the most likely equation behind your pile of data. Very pretty.
OpenGeoscience | Free data | British Geological Survey (BGS)
source: http://news.bbc.co.uk/2/hi/science/nature/8398451.stm
Open GeoScience | British Geological Survey (BGS)
A free service where you can view maps, download photographs and other information. Use OpenGeoscience material free-of-charge for non-commercial private study, research and educational activities. Explore OpenGeoscience Explore the six OpenGeoscience sections: Data, Education, Maps, Pictures, Reports and Software. via @madgestar
Free data : British Geological Survey (BGS)
Open Geoscience is a free service from the British Geological Survey where you can view maps (up to 1:50,000), download photographs and other information. Use OpenGeoscience material free-of-charge for non-commercial private study, research and educational activities.
digg labs / 365
A fairly successful 360° interface for managing Digg links
the important events of 2009 according to Digg
Envisioning Development: What is Affordable Housing?
Found on Giulio's Copenhagen climate conference blog
an excellent, important infographic
A very nice looking graphical summary of the claims and counter-claims of global warming skeptics and the scientific consensus response to all the deniers' claims. Very nice bit of work.
28 Rich Data Visualization Tools - InsideRIA
nice web based viz tools
Amazing collection of web graphing tools. Nice to see data-driven graphics, for a change!
http://www.axiis.org/examples/BrowserMarketShare.html
a nice browser market share time-line visualization!
Where Does My Money Go?
Excellent visualisation of UK government spending volume
5 Best Data Visualization Projects of the Year – 2009 | FlowingData
Very interesting online data visualization example. Found the NYT unemployment chart by demographic very interesting and easy to use.
"Data has been declared sexy, and the rise of the data scientist is here." http://j.mp/6xbP4C
Facebook Plans to Make Money by Selling Your Data - ReadWriteWeb
facebook make money
Gmail and Google Apps Account Got Hacked
Gmail recovery
t of all your Gmail / Google Accounts and initiate the password recovery process
Average Twitter user has 126 followers, and only 20% of users go via website | Technology | guardian.co.uk
RT @mikecane: Average Twitter user has 126 followers, and only 20% of users go via website http://tinyurl.com/m9osu5 [from http://twitter.com/jhelmus/statuses/2392524076]
RT @suryasnair: RT @JesseNewhart: Average Twitter user has 126 followers, and only 20% of users go via website http://bit.ly/19dsHV [from http://twitter.com/mikkokiviniemi/statuses/2389322147]
I'm definitely above average (165)
10 Useful Flash Components for Graphing Data
Good review of flash components for graphing
The Fourth Paradigm: Data-Intensive Scientific Discovery - Microsoft Research
The Fourth Paradigm: Data-Intensive Scientific Discovery Presenting the first broad look at the rapidly emerging field of data-intensive science
Gray
In The Fourth Paradigm: Data-Intensive Scientific Discovery, the collection of essays expands on the vision of pioneering computer scientist Jim Gray for a new, fourth paradigm of discovery based on data-intensive science and offers insights into how it can be fully realized.
Free eBook of essays on "Data-Intensive Scientific Discovery" : "Increasingly, scientific breakthroughs will be powered by advanced computing capabilities that help researchers manipulate and explore massive datasets."
agile approach | World Bank Open API 2.0 Launched
The revolution is also reaching the largest multilateral banking institutions on the planet. The World Bank has launched a mechanism for sharing the economic information it has collected over 50 years with anyone who wants to use it to build applications in the new web environments.
Most religious groups in USA have lost ground, survey finds - USATODAY.com
When it comes to religion, the USA is now land of the freelancers. The percentage of people who call themselves in some way Christian has dropped more than 11% in a generation.
ht: @STRtweets -- interestingly, dig into the really small percentage of atheists.
Cathy Lynn Grossman 3/17/09
Most religious groups in USA have lost ground, survey findsGoogle’s Chiller-less Data Center « Data Center Knowledge
The climate in Belgium will support free cooling almost year-round, according to Google engineers, with temperatures rising above the acceptable range for free cooling about seven days per year on average. The average temperature in Brussels during summer reaches 66 to 71 degrees, while Google maintains its data centers at temperatures above 80 degrees.
Google’s Chiller-less Data Center
The latest data center has no cooling equipment
"Google (GOOG) has begun operating a data center in Belgium that has no chillers to support its cooling systems, a strategy that will improve its energy efficiency while making local weather forecasting a larger factor in its data center management. [..]So what happens if the weather gets hot? On those days, Google says it will turn off equipment as needed in Belgium and shift computing load to other data centers. This approach is made possible by the scope of the company’s global network of data centers, which provide the ability to shift an entire data center’s workload to other facilities. [..]The ability to seamlessly shift workloads between data centers also creates intriguing long-term energy management possibilities, including a “follow the moon” strategy which takes advantage of lower costs for power and cooling during overnight hours. In this scenario, virtualized workloads are shifted across data centers in different time zones to capture savings from off-peak utility rates."
It's like The Matrix!Data Tables In Modern Web Design - Noupe
will send it to Chopy
"If you conduct social science research but are desperately clinging onto your SAS, SPSS or Matlab licenses; waiting for someone to convince you of R’s value, please allow me to be the first to try".Charts: Flowchart Decides Which Chart Style is Best for Your Data
That's something that's a bit troublesome - if better search technology for indexing the Deep Web comes into existence outside of Google, the world may not end up using it until such point Google either duplicates or acquires the invention.
Enabling a Google-like search from structured sources (databases)
Google and Yahoo approaching structured Web
Halevy, who heads the "Deep Web" search initiative at Google, described the "Shallow Web" as containing about 5 million web pages while the "Deep Web" is estimated to be 500 times the size. This hidden web is currently being indexed in part by Google's automated systems that submit queries to various databases, retrieving the content found for indexing. In addition to that aspect of the Deep Web - dubbed "vertical searching" - Halevy also referenced two other types of Deep Web Search: semantic search and product search.Data Sets | GroupLens Research
How Google Works
'...while both [Google] search queries and processing power have gone up by a factor of 1000, latency has gone down from around 1000ms to 200ms. Crawler updates now take minutes compared to months in 1999.'Science News / Florence Nightingale: The Passionate Statistician
How Florence Nightingale used statistics and good visualization to persuade the queen of England to improve the military medical service.
passion, persistence, for the least, but combined with competency and intelligenceVineet Gupta: NoSql Databases – Part 1 - Landscape
At Directi, we are taking a hard look at the way our applications need to store and retrieve data, and whether we really need to use a traditional RDBMS for all scenarios. This does not mean that we will eschew relational systems altogether. What it means is that we will use the best tool for the job – we will use non-relational options wherever needed and not throw everything at a relational database with a mindless one-size-fits-all approach. ... ... This post covers the current landscape of the NoSQL space. In a subsequent post, I intend to cover in more detail the various problem areas addressed by NoSQL systems and the specific algorithms used.
Really detailed description of a number of NoSQL solutions. Interesting reading on Cassandra and Voldemort.
This post covers the current landscape of the NoSQL space. In a subsequent post, I intend to cover in more detail the various problem areas addressed by NoSQL systems and the specific algorithms used.UX trick: display form data as tabular data | Css Globe
visualisation of rentals of films from netflix. be nice to correlate this to income levels and general demographic data
Examine maps of Netflix rental patterns, neighborhood by neighborhood, in a dozen cities across the nation.A Peek Into Netflix Queues - NYTimes.com
Visual depiction of datamining Netflix queues by New York City districts.
#infografico The most-rented films on Netflix by ZIP code in 12 US cities http://bit.ly/6yQ7LD /by NYTimes.com
Netflix queues by location. Interesting, although only for certain areas (I find it cool cause I can look around Seattle).Nielsen: Twitter's growing really, really, really, really fast | The Social - CNET News
New numbers about 'community destination' sites in the U.S. reveal that Twitter grew well over a thousand percent between February 2008 and February 2009. Read this blog post by Caroline McCarthy on The Social.Digital Podge 2009 - Measurable Fun | 17th December 2009
Digital Podge 2009 - infographic-style website
They use simple data visualization to present the website's information
Great charts & infographicsHINT.FM / Fernanda Viegas & Martin Wattenberg
This is the collaboration site of Fernanda Viégas and Martin Wattenberg. We invent new ways for people to think and talk about data. As technologists we ask, Can visualization help people think collectively? Can visualization move beyond numbers into the realm of words and images? As artists we seek the joy of revelation. Can visualization tell never-before-told stories? Can it uncover truths about color, memory, and sensuality?
"This is the collaboration site of Fernanda Viégas and Martin Wattenberg."indiemapper
indiemapper - Elegant Thematic Digital Cartography (coming soon) http://retwt.me/1LmXy #maps #gis [from http://twitter.com/geoparadigm/statuses/7290862695]BrowserCouch Documentation
BrowserCouch is an attempt at an in-browser MapReduce implementation.
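For readers unfamiliar with the MapReduce pattern the description refers to, here is a minimal sketch in Python. It is not BrowserCouch's API (which is JavaScript); the `map_reduce` helper, the sample documents, and the tag-counting functions are all hypothetical and only illustrate the map, group, and reduce steps.

```python
from collections import defaultdict


def map_reduce(docs, map_fn, reduce_fn):
    """Run map_fn over every document, group emitted values by key, then reduce each group."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


# Hypothetical documents, loosely shaped like records in a document store.
docs = [
    {"id": 1, "tags": ["data", "javascript"]},
    {"id": 2, "tags": ["data", "couchdb"]},
    {"id": 3, "tags": ["javascript"]},
]


def emit_tags(doc):
    # Map step: emit one (tag, 1) pair per tag on the document.
    for tag in doc["tags"]:
        yield tag, 1


def count(key, values):
    # Reduce step: sum the counts emitted for one key.
    return sum(values)


tag_counts = map_reduce(docs, emit_tags, count)
print(tag_counts)
```

The grouping step in the middle is what lets the reduce function see all values for one key at once; a real store would typically build and cache that view incrementally instead of recomputing it.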
"BrowserCouch is an attempt at an in-browser MapReduce implementation. It's written entirely in JavaScript and intended to work on all browsers, gracefully upgrading when support for better efficiency or feature set is detected. Not coincidentally, this library is intended to mimic the functionality of CouchDB on the client-side, and may even support integration with CouchDB in the future."20 Fresh JavaScript Data Visualization Libraries
Exploration of Beatles music through infographics
Graphic designer Michael Deal's infographics of Beatles statistics both obvious (songwriting credits) and unusual (metareferences).
Exploration of Beatles music through infographics (ongoing project) These visualizations are part of an extensive study of the music of the Beatles. Many of the diagrams and charts are based on secondary sources, including but not limited to sales statistics, biographies, recording session notes, sheet music, and raw audio readings. Join this project here.The 4 Big Myths of Profile Pictures « OkTrends
almost-scientific quantitative analysis of dating website
Important for meeting and engaging people online, may be important for professional profiles depending on career
How to make your profile picture pop.JavaScript grid editor: I want to be Excel « Eltit Golb
A short list of my favorite JavaScript grid components. How many times did you hear users asking you: “something simple, a grid like excel”?
Comparison of jQuery grids.Unlocking innovation | data.gov.uk
UK government stats online
UK government opens up its data - using Drupal!How to Make a Heatmap – a Quick and Easy Solution | FlowingData
R language / statistical analysis language, heatmap, visualization
tutorial R
heatmapshttp://www.michaelvandaniker.com/labs/browserVisualization/
What happened with the Internet in 2009?
What happened with the Internet in 2009? How many websites were added? How many emails were sent? How many Internet users were there? This post will answer all of those questions and many more. Prepare for information overload, but in a good way. We have used a wide variety of sources from around the Web. A full list of source references is available at the bottom of the post for those interested. We here at Pingdom also did some additional calculations to get even more numbers to show you.
Fascinating statsWorld Government Data | guardian.co.uk
Tehgrauniad's search engine for government data sets.
more info : http://www.guardian.co.uk/news/datablog/2010/jan/07/government-data-world
The one-stop shop for World Government datasets from The Guardian.
The Guardian's search engine for world government data
Governments around the globe are opening up their data vaults – allowing you to check out the numbers for yourself. This is the Guardian’s gateway to that information. Search for government data here from the UK (including London), USA, Australia and New Zealand – and look out for new countries and places as we add them.どのグラフを使えばいいかを1枚の画像にまとめてみた…の図を日本語化してみた - 適宜覚書はてな異本
Choosing which chart to use.
http://i.imgur.com/YjWta.jpgUnlocking innovation | data.gov.uk
That's it! The UK open data site is public!
Advised by Sir Tim Berners-Lee and Professor Nigel Shadbolt and others, government are opening up data for reuse. This site seeks to give a way into the wealth of government data and is under constant development. We want to work with you to make it better. We’re very aware that there are more people like you outside of government who have the skills and abilities to make wonderful things out of public data. These are our first steps in building a collaborative relationship with you.Yahoo! GeoPlanet - YDN
Yahoo! GeoPlanet helps bridge the gap between the real and virtual worlds by providing an open, permanent, and intelligent infrastructure for geo-referencing data on the Internet. This page provides open access to the underlying data under a Creative Commons Attribution license so that you can incorporate WOEIDs and the GeoPlanet hierarchy into your own applications. The zip file below contains a license file, a readme file, and three data files in tab-delineated, Unicode (UTF-8) format: 1. geoplanet_places_[version].tsv: the WOEID, the placename, and the WOEID of its parent entity 2. geoplanet_aliases_[version].tsv: alternate names in multiple languages indexed against the WOEID 3. geoplanet_adjacencies_[version].tsv: the entities neighboring each WOEID 4. geoplanet_changes_[version].tsv: the list of removed WOEIDs and their replacement WOEID mappings How Do I Get Started? 1. Learn more about GeoPlanet and WOEIDs by reading the GeoPlanet documentation 2. Download
Yahoo! GeoPlanet™ Data
Yahoo! GeoPlanet provides a resource for managing all geo-permanent named places on Earth.
Open Sourced geo data base. Download and use it!Nicholas Felton | Feltron.com
A graphical poster representation of personal relations.
The 2009 edition of Nicholas Felton's personal report, whose stylish visualization craft is a pleasure every year.
[img]http://feltron.com/images/uploads/ar09_01.jpg[/img]Stunning Infographics and Data Visualization - Noupe
StatisticsThe Roles of Facebook and Twitter in Social Media Marketing | Brian Solis
Social Media marketing is rapidly earning a role in the integrated marketing mix of small and enterprise businesses and as such, it’s transforming every division from the inside out. What starts with one champion in any given division, be it customer service, marketing, public relations, advertising, interactive, et al, eventually inspires an entire organization to socialize. What starts with one, a domino effect usually ensues toppling each department, gaining momentum, and triggering a sense of urgency through its path. And, it also marks the beginning of our journey through the ten stages of social media integration. ...PeteSearch: How to split up the US
Data visualization of Facebook profiles: "Looking at the network of US cities, it's been remarkable to see how groups of them form clusters, with strong connections locally but few contacts outside the cluster. For example Columbus, OH and Charleston WV are nearby as the crow flies, but share few connections, with Columbus clearly part of the North, and Charleston tied to the South. "Some of these clusters are intuitive, like the old south, but there's some surprises too, like Missouri, Louisiana and Arkansas having closer ties to Texas than Georgia. To make sense of the patterns I'm seeing, I've marked and labeled the clusters, and added some notes about the properties they have in common..."
Fun stuff, lots of entertaining demographic data.
According to FacebookThe Social Life of Health Information | Pew Internet & American Life Project
Americans' pursuit of health takes place within a widening network of both online and offline sources. Whereas someone may have in the past called a health professional, their Mom, or a good friend, they now are also reading blogs, listening to podcasts, updating their social network profile, and posting comments. This Pew Internet/California HealthCare Foundation survey finds that technology is not an end, but a means to accelerate the pace of discovery, widen social networks, and sharpen the questions someone might ask when they do get to talk to a health professional. Technology can help to enable the human connection in health care and the internet is turning up the information network’s volume.
study conducted nov-dec 2008, published june 2009The Man Who Looked Into Facebook's Soul
After forms, data tables are likely the next most ubiquitous interface element designers create when constructing Web applications. Users often need to add, edit, delete, search for, and browse through lists of people, places, or things within Web applications. As a result, the design of tables plays a crucial role in such an application’s overall usefulness and usability.
Great filter examples from Luke Wroblewski.
luke wCongratulations, Google staff: $210k in profit per head in 2008 | Royal Pingdom
Google bills $210k per employee. Ranked list of revenue per employee
Doing this study could totally open any company's mind
Google had $209,624 in profit per employee in 2008, which beats all the other large tech companies we looked at, including big hitters like Microsoft, Apple, Intel and IBM.
In that sense Microsoft is doing a very good job considering that they are close to matching Google in spite of having 4.5 times as many employees. And of course, looking at overall profit for the company, Microsoft is way ahead of every other company on this list.Data in, Brilliance Out | Tableau Public
JavaScript charts which also work in IE6
visualize and share your data in minutes, and embed in your website. Free (but not opensource)Track Mouse Activity On Your Computer | FlowingData
Interesting!
Excellent data, frank discussion of men's bias towards younger women (with graphs and pictures of cute non-young women!)
ause it has been a successful way to introduce previous posts, I wanted to put real faces on this demographic before I delve into a bunch of numbers. Pictured below are some single users in their mid-thirties or early forties, taken from the first couple pages of my own local match search. Nothing I'll talk about today pertains necessarily to any one of them, but I wanted to put forward some people to go with th
Data from OKCupid on sex match preferences and changes based on age as well as attitudes of men and women basically proving that men should date women older than they are despite the fact that typically they date younger women.C O D E O R G A N
The dawn chorus... the music of love... whale song... a babbling brook... but none compare to the beauty of the Codeorgan. It reads the content of any webpage and translates the content into music.
Generates fluffy electronic music from HTML code. Funny.
code funk fun ♫♪ via Sarah
Turn any website into music...Google facts and figures (massive infographic) | Royal Pingdom
Google has perhaps more than any other company become The Internet Company. It's grown hand in hand with the internet and its entire business model has from the start been totally focused on the internet as a delivery platform. => A ton of facts and figures about Google.Snake Oil? The scientific evidence for health supplements | Information Is Beautiful
Good and bad supplementsA Comparison of Approaches to Large-Scale Data Analysis - MapReduce vs. DBMS Benchmarks
"The following information is meant to provide documentation on how others can recreate the benchmark trials used in our SIGMOD 2009 paper."
A Comparison of Approaches to Large-Scale Data Analysis: MapReduce vs. DBMS BenchmarksPersonal Data Mining | Creativity Online
Nice piece on the growing trend of data design or data visualization.Shakespeare in XML
Shakespeare plays in XML formatDealing with Duplicate Person Data - Proud to Use Perl
I've recently been working on a fairly large project that has contact information for almost 2 million people. These records contain details for both online and offline actions. Since the data can come from multiple sources there exist many duplicate records. Duplicate records mean more processing for our code, more storage space and more hassle for our clients who have to deal with these duplicates. All in all, bad things to leave lying around. In this article we'll look at some strategies that I used to identify and remove these duplicates. All code in this article is sample code, and we'll leave the task of assembling it into a final working program up to the reader. CPAN is your Friend Like all good Perl projects, we will make heavy use of the CPAN. It makes our lives so much easier and every day I'm more in awe at the quality and breadth of solutions I find there. For this project we'll be using Text::LevenshteinXS, Lingua::EN::Nickname and Parallel::ForkManager. What is a Du
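The article's own code is Perl and relies on Text::LevenshteinXS; as a rough illustration of the underlying idea only, here is a pure-Python sketch of edit-distance matching for near-duplicate names. The function names, the sample names, and the `max_distance=2` threshold are assumptions for this example, not taken from the article.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def likely_duplicates(names, max_distance=2):
    """Return pairs of names whose normalized forms are within max_distance edits."""
    normalized = [name.strip().lower() for name in names]
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if levenshtein(normalized[i], normalized[j]) <= max_distance:
                pairs.append((names[i], names[j]))
    return pairs


# Hypothetical sample records.
people = ["Jon Smith", "John Smith", "Jane Doe"]
print(likely_duplicates(people))
```

Comparing every pair is quadratic, so at the scale of millions of records one would presumably first bucket candidates (for example by postcode or surname) and only compare within a bucket.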
Funny to see people still using perl these days but great exampleAn Easy Way to Make a Treemap | FlowingData
library(portfolio)
I think this would be pretty easy to do with ggplot2, but Portfolio looks like it's worth checking out too.
Here's a really easy way to make your own treemap in just a couple lines of code. We're looking to make something like this:Google - public data
The Google Public Data Explorer makes large datasets easy to explore, visualize and communicate. As the charts and maps animate over time, the changes in the world become easier to understand. You don't have to be a data expert to navigate between different views, make your own comparisons, and share your findings. Explore the data Students, journalists, policy makers and everyone else can play with the tool to create visualizations of public data, link to them, or embed them in their own webpages. Embedded charts and links can update automatically so you’re always sharing the latest available data.Think like a statistician – without the math | FlowingData
#beinghuman #maths #math #toread How to think like a statistician without the math http://to.ly/1lt0Google - public data
Limited data but nice and clear interface
Datasets and visualization
Animated charting from various public information sourcesScience gleans 60TB of behavior data from Everquest 2 logs - Ars Technica
WEEK 8 -- 03/10/2010
In February 2009 Dmitri Williams
4 years, 400k players ~= 60TB -- about 475 kB/s overall, slightly more than 1 byte per player per second.
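A quick check of that back-of-the-envelope figure, as a sketch only: it assumes decimal terabytes (10^12 bytes), a 365.25-day year, and that the 60 TB is spread evenly over four years and 400,000 players. On those assumptions the overall rate is about 475 kB/s, which is only a little over one byte per player per second.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed average year length

total_bytes = 60e12   # 60 TB, decimal terabytes (assumption)
years = 4
players = 400_000

# Average logging rate over the whole period.
bytes_per_second = total_bytes / (years * SECONDS_PER_YEAR)

# The same rate shared evenly across all players.
bytes_per_player_per_second = bytes_per_second / players

print(f"{bytes_per_second / 1e3:.0f} kB/s overall")
print(f"{bytes_per_player_per_second:.2f} bytes per player per second")
```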
Thanks to a partnership with Sony, a team of academic researchers have obtained the largest set of data on social interactions they've ever gotten their hands on: the complete server logs of Everquest 2, which track every action performed in the game.
m psychologists to epidemiologists have wondered for some time whether online, multiplayer games might provide some ways to test concepts that are otherwise difficult to track in the real world. A Saturday morning session at the meeting of the American Association for the Advancement of
Food for data minersTim Berners-Lee on the next Web | Video on TED.com
TED Talks 20 years ago, Tim Berners-Lee invented the World Wide Web. For his next project, he's building a web for open, linked data that could do for numbers what the Web did for words, pictures, video: unlock our data and reframe the way we use it together.
About the new web, including linked and open data on the web.Data Compression Explained
Gary Flake demos Pivot, a new way to browse and arrange massive amounts of images and data online. Built on breakthrough Seadragon technology, it enables spectacular zooms in and out of web databases, and the discovery of patterns and links invisible in standard web browsing.
PivotFoursquare Introduces New Tools for Businesses - Bits Blog - NYTimes.com
Foursquare, a location-based social network, plans to release a free analytics tool and dashboard in the coming weeks that will give business owners access to a range of information and statistics about visitors to their establishments.BBC News - The top 100 sites on the internet
Which are the biggest sites on the internet? Explore this interactive graphic to find out.eBay’s two enormous data warehouses | DBMS2 -- DataBase Management System Services
Metrics on eBay’s main Teradata data warehouse include: * >2 petabytes of user data
Millions of queries per day
Statistics on eBay's database processingFacebook offers up users as marketing tool | Business | guardian.co.uk
He added the company has been experimenting with analysis of user sentiment, tracking the mood of its audience through what they are doing online. Such information is potentially very interesting to large brands, which are always seeking to measure what their customers think about their own or competitors' products. Facebook's advertising technology already allows advertisers to choose which sort of customer will see their display adverts when they log on to the site. Advertisers can choose from such categories as where the user is located and their age and gender, based upon what the user has uploaded on to Facebook – which is adding about 450,000 new users a day.
An article from February
RT @alexiskold: RT @zaibatsu Facebook aims to market its user database to businesses http://bit.ly/RSUU. [from http://twitter.com/brianking/statuses/1167955676]
aditya: @artagnon Here you go: http://tinyurl.com/aau3kr
An article about Facebook's intention to use its user database for commercial purposesWolfram|Alpha
Computational Knowledge EngineThe MessagePack Project
MessagePack is a binary-based efficient object serialization library. It enables to exchange structured objects between many languages like JSON. But unlike JSON, it is very fast and small.
"MessagePack is a binary-based efficient object serialization library. It enables to exchange structured objects between many languages like JSON. But unlike JSON, it is very fast and small." -- apache licenseThe No-Stats All-Star - NYTimes.com
[[* Player 2.0, uses stats to guide play; problem of selfishness in basketball v baseball *]]
Fantastic story featuring an underdog bball star, advanced data analysis and great questions about whether our sports statistics really measure greatness.Open Data (OData)
The simplest OData service can be implemented as simply as a static file that follows the OData ATOM or JSON payload conventions. For scenarios beyond static content, frameworks are available to help in creating OData services. See the OData developer page for additional information.
New open data protocol being pushed by MicrosoftSnake oil? Scientific evidence for health supplements | Information Is Beautiful
This image is a balloon race. The higher a bubble, the greater the
scientific evidence for popular health supplements [chart]
A graphical representation of the scientific evidence for or against particular health supplements. Vitamin C seems to be quite far down the list :).Official Google Data APIs Blog: Bringing OpenID and OAuth Together
Every OAuth provider should encapsulate OAuth authorization inside OpenID. Better UX, fewer redirects http://bit.ly/7qbfPB
OAuth-enabled APIs suSean Gourley on the mathematics of war | Video on TED.com
Argues that you should look at the insurgency according to its structure.. more fragmented groups, or fewer but stronger groups?
Long-running conflicts depend on a balance between the number of factions and their strength: many weaker factions aren't strong enough to carry out effective attacks, while with fewer, stronger factions you can start to leverage negotiation.17 Killer Mashups for Taking Control of Your Government
Public-sector mashups.Data Marketplace : Find, buy and sell data online
a place where one can buy and sell structured datasets online - e.g. the WAL MART Location in the US - weekly Oilprices since 1970. If a dataset is not available, you can request it and bid an amount with a set deadline for delivery
Find, buy and sell data onlineNumberQuotes - Get a quote, make a point
NumberQuotes Get a quote and make your point Ever need a good quote to add scale to a number? You know, you’re giving a presentation on sales and you want to give a number some scale.EveryBlock source code released / The EveryBlock Blog
oooh, I was waiting for this! would love to play around with this for oly.The Four Pillars of an Open Civic System - O'Reilly Radar
An insightful article on between whom and in which direction data should flow: (government-citizens), (citizens-government), (government-government) and (citizens-citizens). What we really want (or what I really want anyway) is not simply government transparency, but an open civic system - a civic system that operates, and flourishes, as a fully open system, for whatever level we happen to be talking about - federal, state, city, neighborhood, whatever. And transparency is a big part of that open civic system, but it is still only one part. In fact there are four parts to a functioning open civic system. These are:
Citizen to Citizen (C2C). Okay so now we have both open G2C and C2G data flows going, and that's great - huge amplification of civic activity, great realization of efficiency with regards to interaction between government and people. But there are all sorts of ways to improve civic life that don't really need to involve the government at all - what about those things? That's where Citizen to Citizen, or C2C, data flows come in. C2C is the citizens' brigade of data flow - it's the people doing it for themselves, whatever "it" happens to be. Clever Commute, in New Jersey, is one example of a great C2C data flow.
By John Geraci
Comments on Open Government (eGov in the UK)BBC NEWS | Magazine | How to understand risk in 13 clicks
parallel sets diagramLifehacker - Hive Five Winner for Best Free Data Recovery Tool: Recuva - hive
TestDisk seems more interestingAWS Import/Export
is this supposed to be a joke - http://aws.amazon.com/importexport/
How the Amazon Web Services deal with importing and exporting large volumes of data to/from the cloud.
Import large data sets via snail mail into Amazon S3 storage.Finally, A Practical Use for Second Life - ReadWriteWeb
The benefits to working with data in this way don't really need to be touted too much - many businesses already perform data visualization, often using expensive software and powerful computers to do so. What makes what Green Phosphor does so interesting is not that they've come up with a way to visualize data - it's that they've come up with a way to leverage the platforms of virtual worlds to do so.
Second Life and real world data coming togetherMicroformats for business owners | Clagnut § Web standards · New media industry
Article explains the usefulness of microformats.10 Outstanding Social Media Infographics | @NowSourcing.Com
RT @Inma_Eiroa: RT @MarAbad: 10 social media infographics http://bit.ly/dAve0G
Social media infographicsBy the Numbers: Facebook vs The United States [INFOGRAPHIC]
Interesting break down of Facebook in the US vs US Population. Lots of nuggets for the water cooler. My favorite? More people claim DC in Facebook than live in DC IRL!WTF is a SuperColumn? An Intro to the Cassandra Data Model — Arin Sarkissian
Introductory blog post about the Cassandra data model.10 signs you don't understand web analytics - iMediaConnection.com
Useful info for MKT571 and MKTTM571 students.Information is beautiful: war games | News | guardian.co.uk
#armamentos
'Information is beautiful'Infovore » Learning to Think Like A Programmer
Should journalists learn to code?
What’s really important is to not understand how to do magical things with code, but to learn what magical things are possible, what the necessary inputs for that magic are, and who to ask to do it. Identify the repetitive tasks that computers are good at. Yes, they’re good at find-and-replace, but tools like regular expressions are even handier, and I’m amazed how few people understand that find-and-replace is the beginning, not the end, of text processing. (And yes, I’m aware that regex are a quick way to give yourself two problems.)
Useful advice for anyone looking to work with online tools. I can't write a computer program, but my understanding of the fundamentals has helped me no end.
"It requires you to learn to translate intent into code, to know what’s possible, to know what’s easy and what’s hard, and to know what to do when third-party things you’re glueing together don’t work." "Computers are really good at processing regular data, and they are really, really good at repetitive tasks. Every time I watched someone in an office doing a repetitive, regular task I despaired, because that’s exactly the kind of thing we have computers for." "…nowadays, computers are a sort of primary source too. You’ve got to learn to interrogate them effectively - and quote them meaningfully - too." A sibling suggestion would be ‘Learn to explore inquisitively’. One of the reasons only 20% of an application’s functionality is used by the majority of users is that their major motivation when they start using an application is ‘How do I do [x] in [y]?’, as opposed ‘What [x]s can [y] do for me?’
The more I talk to academics, the more I echo the following sentiment: "I remain convinced there’s an interesting book on “doing smart stuff with computers that isn’t quite programming but isn’t far off”, because let’s face it, most people deal with data all the time now, and have the ideal tool for working with it on their desks."Your Random Numbers – Getting Started with Processing and Data Visualization | blprnt.blg
"This post, then, is a first sketch of what a lesson plan for teaching Processing and data visualization might look like. I’m going to start from scratch, work through some examples, and (hopefully) make some interesting stuff. One of the nice things, I think, about this process, is that we’re going to start with fresh, new data – I’m not sure what kind of things we’re going to find once we start to get our hands dirty. This is what is really exciting about data visualizationjStorage - simple JavaScript plugin to store data locally
One of the most interesting parts of running a large website is watching the effects of unrelated events affecting user traffic in aggregate. Web traffic is something that companies typically keep very secret, and often the only time engineers can talk about it is late at night, at a bar, and very much off the record. There are many good reasons for keeping this kind of information confidential, particularly for publicly traded companies with complicated disclosure requirements. There are also downsides, the biggest being that is difficult for peers to learn from each other and compare notes. John Allspaw recently created a WebOps Visualizations group on Flickr for sharing these kinds of graphs with the confidential information removed. Here’s an example of a traffic drop seen both by Flickr & by Last.FM that coincided with President Obama’s inauguration.
all stats and graphs are fantastic. Whatever they are for.
Visualization of internet activity during Obama's swearing-in.
Comparison of traffic at Flickr, Google, Twitter, last.fm during the Obama inauguration. "One of the most interesting parts of running a large website is watching the effects of unrelated events affecting user traffic in aggregate."
dataporn webtraffic
Number fodder and its visualization, all around Barack Obama's inauguration.
Hard Drive Data Recovery - Hard Drive Recovery Software - How to Fix a Hard Drive - Popular Mechanics
If the components in your drive are still functioning, you can recover the data yourself. If there's mechanical damage, send it to the pros. PM's complete guide to getting your files back.Many Eyes: Obama Inauguration Speech Word Tree
Visualizes text as a tree diagram
obama word tree
interactive map to explore the speechData | The World Bank
A site gathering a large batch of World Bank data.
This is What a Tweet Looks Like
140 chars you say?... This is What a Tweet Looks Like http://j.mp/8ZkkDJ
twitter schema representation.
a breakdown of all the information in a tweet
Get a look at all the metadata behind the humble tweet: http://is.gd/bEYcL This is what the semantic web looks like. via @paoloman – Negar Mottahedeh (negaratduke) http://twitter.com/negaratduke/statuses/12711760820
Think a tweet is just 140 characters of text? Think again. To developers building tools on top of the Twitter platform, they know tweets contain far more information than just whatever brief, passing thought you felt the urge to share with your friends via the microblogging network. A tweet is filled with metadata - information about when it was sent, by who, using what Twitter application and so on. Now, thanks to Raffi Krikorian, a developer on Twitter's API/Platform team, you can see what a tweet looks like, in all its data-rich detail.良いデザインを決定するのはデータなのか、それとも… - Keep Crazy;shi3zの日記
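"Do you know which companies spend the most in the world on user interface design research? Nokia and Microsoft. And of course, the results are what you see."
Now and then I see the claim that "technical amateurs come up with novel ideas that engineers cannot," but that is plainly wrong. If anything, amateurs tend to mistake superficial constraints for "technical constraints," while engineers readily see that superficial constraints "can easily be overcome with technology." A technical amateur helps technology advance when a skilled engineer digests an offhand remark by that amateur and makes a technical leap from it.
Even so, I think the reactions of ordinary users matter. Even Shigeru Miyamoto, revered as a god of games, reportedly changed specifications many times after turning pale at usability test results.
Google Launches Maps Data API - O'Reilly Radar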
The Google Maps Data API allows client applications to view, store and update map data in the form of Google Data API feeds using a data model of features (placemarks, lines and shapes) and maps (collections of features).
RT @cshirky: Four little words with big implications for reporting: Google Maps Data API. http://bit.ly/wjN4G [from http://twitter.com/axbom/statuses/1862185566]
maps API (read and investigate)Kyle Conroy's Personal Blog and Portfolio - Should I have bought that Apple Product?
"Currently, Apple's stock is at an all time high. A share today is worth over 40 times its value seven years ago. So, how much would you have today if you purchased stock instead of an Apple product?"
If I'd bought stock instead of an iBook 3G in 2003 I'd have $60k. http://www.kyleconroy.com/apple-stock.php Wowza.
RT @zeldman: Read it and shoot yourself. http://j.mp/8Xz1mc – David (dpan) http://twitter.com/dpan/statuses/12947252610
Apple product vs how much Apple stock would be worth instead. Silly but interesting.
List of Apple products, the cost of those products at the time of release, and what Apple stock would be worth today had you bought stock instead of the product.
What if I had bought Apple stock instead?
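The arithmetic behind that comparison can be sketched in a few lines of Python. This is only an illustration of the idea, not the site's actual code: the function name and all three numbers below are hypothetical, chosen so that the share price is 40 times its earlier value, as in the quote above.

```python
def stock_value_today(product_cost, share_price_then, share_price_now):
    """Value today of the shares the product's price would have bought."""
    shares = product_cost / share_price_then  # shares bought at release time
    return shares * share_price_now           # what those shares are worth now


# Hypothetical figures: a $1,500 product, shares at $10 then and $400 now.
print(stock_value_today(1500.0, 10.0, 400.0))  # 60000.0
```

Splits, dividends, and inflation are ignored here; a real comparison would need adjusted historical prices.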
Social Media Demographics: Who’s Using Which Sites? / Flowtown (@flowtown)
Social media demographics whos using which sites
[illustration] Social Media Demographics / a drastic segmentation of who uses what - http://bit.ly/cAAhpX
Recover Data Like a Forensics Expert Using an Ubuntu Live CD - Hard Drives - Lifehacker
To the extent possible under law, Flickr has waived all copyright and related or neighboring rights to the “Flickr Shapefiles Public Dataset, Version 1.0”. This work is published from the United States. While you are under no obligation to do so, wherever possible it would be extra-super-duper-awesome if you would attribute flickr.com when using the dataset. Thanks!Enemy Lurks in Briefings on Afghan War - PowerPoint - NYTimes.com
“It’s dangerous because it can create the illusion of understanding and the illusion of control... Some problems in the world are not bullet-izable.” --Brig. Gen. H. R. McMaster
"Gen. Stanley A. McChrystal, the leader of American and NATO forces in Afghanistan, was shown a PowerPoint slide in Kabul last summer that was meant to portray the complexity of American military strategy, but looked more like a bowl of spaghetti.... 'When we understand that slide, we’ll have won the war,' General McChrystal dryly remarked, one of his advisers recalled, as the room erupted in laughter...."Bulk Data Downloads: A Breakthrough in Government Transparency - O'Reilly Radar
Wow this is potentially huge! Thoughts? RT @timoreilly:Bulk Data Downloads:A Breakthrough in Government Transparency http://bit.ly/EizO3 [from http://twitter.com/jhelmus/statuses/1283585077]
On getting greater access to government documents and data, with an amendment now in the HouseNew Data on Top Twitter Applications and Usage
Data on Twitter application usage to update Twitter.
Let's look at a random sample of a half a million tweets and see what people actually use to post updates to Twitter.
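A tally like the one described could be sketched roughly as follows. This is a minimal illustration, not the article's method: it assumes each tweet is a dict whose posting application sits under a `source` key, and both that key and the function name `tally_clients` are assumptions made for the example.

```python
import random
from collections import Counter


def tally_clients(tweets, sample_size, seed=0):
    """Count posting clients in a random sample of tweets.

    Returns (client, count, share) tuples, most common first.
    """
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    sample = rng.sample(tweets, min(sample_size, len(tweets)))
    # "source" is an assumed field name for the posting application.
    counts = Counter(t.get("source", "unknown") for t in sample)
    total = sum(counts.values())
    return [(client, n, n / total) for client, n in counts.most_common()]


# Tiny hypothetical data set standing in for the half-million tweets.
tweets = [{"source": "web"}] * 3 + [{"source": "TweetDeck"}]
print(tally_clients(tweets, sample_size=10))
```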
Feb 09 statistics on where and how people use TwittermySociety » Blog Archive » Say hello to Mapumental
Built with support from Channel 4’s 4IP programme, Mapumental is the culmination of an ambition mySociety has had for some time – to take the nation’s bus, train, tram, tube and boat timetables and turn them into a service that does vastly more than imagined by traditional journey planners. In its first iteration it’s specially tuned to help you work out where else you might live if you want an easy commute to work.
Map of everything with lots of data being included to make an uber-useful househunting guideMagnificent Maps: Power, Propaganda and Art - Stephen Walter's The Island
Killer hand-drawn map of London, slightly bird's-eye viewish.
RT @cunabula: Amazingly detailed map of london, completely handcrafted and beautifully illustrated. http://bit.ly/boKNmX via @martin_isa ...
The Island satirises the London-centric view of the English capital and its commuter towns as independent from the rest of the country. The artist, a Londoner with a love of his native city, offers up a huge range of local and personal information in words and symbols. Walter speaks in the dialect of today, focusing on what he deems interesting or mundane.Teens and Their Mobile Phones / Flowtown (@flowtown)
How Teens Use Cellphones [INFOGRAPHIC] http://www.flowtown.com/blog/teens-and-their-mobile-phones?display=wide
How are teens using their Cell Phones?
RT @trendplanner: Teens and Their Mobile Phones: An infographic - http://ow.ly/1HCqL
The Evolution of Privacy on Facebook
(d)Evolution of privacy on Facebook. Very cool. http://mattmckeon.com/facebook-privacy/ – Matt Suddain (suddain) http://twitter.com/suddain/statuses/13624852352
This really brings the point home.
Really interesting infographic showing how Facebook's default privacy settings have increasingly exposed one's profile to a broader and broader audience.
Visualizing Facebook's Privacy Evolution http://bit.ly/deZJyu #fb #privacy #temporal
The Evolution of Privacy on FaceBook http://bit.ly/alyy8g (I'd call it devolution.) #eventprofs #assnchat #engage365
RT @draenews: Del The Evolution of Privacy on Facebook: http://bit.ly/cEf3sP
In the beginning, it restricted the visibility of a user's personal information to just their friends and their "network" (college or school). Over the past couple of years, the default privacy settings for a Facebook user's personal information have become more and more permissive. They've also changed how your personal information is classified several times, sometimes in a manner that has been confusing for their users. This has largely been part of Facebook's effort to correlate, publish, and monetize their social graph: aThe Twitter Engineering Blog: Introducing Gizzard, a framework for creating distributed datastores
how many data centers does Google use, and where are theyBrooklyn Museum: API
"The Brooklyn Museum Collection API is a set of services that you can use to display Brooklyn Museum collection images and data in your own applications."
Publicly documented API for collections data. See the Application Gallery (http://www.brooklynmuseum.org/opencollection/api/docs/application_gallery). Also note the API key application and terms of use.
Open Data is Civic Capital: Best Practices for "Open Government Data"
16 open data principles. Josh Tauberer
This document is a best practices guide for governments embracing the notion of "open data". It discusses why open government data is beneficial to society, i.e. how it is civic capital, and what kinds of technological considerations must be made when making government data open. The document is intended to be read both by web managers, who may wish to skip the final Recommendations section, and by government web developers. By Joshua TaubererThe 7 ½ Steps to Successful Infographics - Articles - MIX Online
RT @mixonline The 7 ½ Steps to Successful Infographics http://bit.ly/d9Mjms
The right mix of down-to-earth and reach-for-the-stars.
100 Seriously Creative Infographics
This is the 2nd of 5 gallery-style posts in our week-long inspiration series in which we are collecting 100 awesome examples in a given category. Today we are covering infographics, you know those nifty illustrations that make complicated subjects easy to understand. The infographic (short for information graphic) has become a mainstream fixture thanks, in part, to publications such as Wired Magazine whose commissioned infographics are notoriously complicated and cool and widely respected within the design community. This collection contains great color combinations, infographics of varying sizes, creative typography and of course, extremely clever illustrations.
Recover Data Like a Forensics Expert Using an Ubuntu Live CD - How-To Geek
Recover Data Like a Forensics Expert Using an Ubuntu Live CD
ConclusioniPhone App Sales, Exposed
iPhone App Sales, Exposed http://om.ly/jnSG #tmdeler – Thomas Moen (thomasmoen) http://twitter.com/thomasmoen/statuses/14214376056
I'd say it's worth going here but it's scary how Apple is in control.
Survey of 96 different apps and units sold: "While industry wisdom states that application updates always boost downloads and sales, Apple has changed how updated apps are given exposure and this now doesn’t quite hold true. Some developers reported that updating the app gave only a small—and brief—spike in downloads. What did seem to have a larger impact on sales was a drop in price, although this also tended to taper off quickly. Being featured by Apple is the greatest contributor to spiking sales. The level of Apple."You’re Leaving a Digital Trail. What About Privacy? - NYTimes.com
An emerging field called collective intelligence could create an Orwellian future on a level Big Brother could only dream of.
The success of Google, along with the rapid spread of the wireless Internet and sensors — like location trackers in cellphones and GPS units in cars — has touched off a race to cash in on collective intelligence technologies.
collective intelligence
“The new information tools symbolized by the Internet are radically changing the possibility of how we can organize large-scale human efforts,” said Thomas W. Malone, director of the M.I.T. Center for Collective Intelligence. “For most of human history, people have lived in small tribes where everything they did was known by everyone they knew,” Dr. Malone said. “In some sense we’re becoming a global village. Privacy may turn out to have become an anomaly.”BBC NEWS | UK | Magazine | How to understand risk in 13 clicks
What are we to make of all those stories that warn of lifestyle dangers and slap a giant percentage sign in the headline? Michael Blastland introduces the Risk-o-meter to his regular column.
nice presentation, and even nicer correlation visualization if you scroll down further.
An extremely valuable piece on how a public-service ad that presents complex facts as clear infographics was created and tested on readers.
What are we to make of all those stories that warn of lifestyle dangers and slap a giant "%" sign in the headline? Michael Blastland introduces the Risk-o-meter to his regular column.Google Prediction API - Google Code
The Prediction API makes it possible, for example, to make recommendations based on historical data: http://bit.ly/c7z06p
Google Prediction APIA Tour through the Visualization Zoo - ACM Queue
Rich survey of advanced data visualization techniques http://is.gd/chAgi – Maria Popova (brainpicker) http://twitter.com/brainpicker/statuses/14368336820
A survey of powerful visualization techniques, from the obvious to the obscureStarting a business isn't as crazy and risky as they say - Blog - Small Business Start Up - Ideas - Resources
Starting a business isn't as crazy and risky as they say - Blog -
Startups, small business, marketing, and useful geekery from someone who's been there: Jason Cohen, founder of Smart Bear SoftwareInfographics news: i from infographics
Infographics news: i from infographics
i from infographicsindieprojector
Indieprojector is a free web service that re-projects digital map files and converts them to SVG for use in vector graphics editing software. Map projections are an essential part of map making but we found the existing tools to be too expensive, inflexible or complicated. Indieprojector is the smarter, easier, more elegant way to reproject and convert geographic data. It's a preview of our indiemapper technology that will bring map-making into the 21st century using web-services and a realtime visual approach to cartographic design.
Free geographic projection and data conversion tool with SHP / KML import, geographic projections and SVG export.
free geographic projection and data conversion tool: indieprojector
Site for converting various map formats into svgWhat turns women on - Times Online
Meredith Chivers is a 36-year-old psychology professor at Queen’s University in the small city of Kingston, Ontario.How Do You Feel About the Economy? - Interactive Feature - NYTimes.com
Interactive visualization from the NY Times
enter a word - track reader responses over time
The word train on the financial crisisStack Overflow Creative Commons Data Dump - Blog - Stack Overflow
Awesome, Stack Overflow release all of their public web data under a CC license.Google Flu Trends | Mexico
internet searches lead researchers to predict flu outbreaks early
We've created experimental estimates of flu activity in Mexico using aggregated search data. Unlike Google Flu Trends for the U.S., this data has not been validated against confirmed cases of flu.
RT @rzeiger: Google launches Experimental Flu Trends for Mexico http://tinyurl.com/cvdfnt #swineflu [from http://twitter.com/healthblawg/statuses/1653930907]
Experimental Flu Trends for Mexico by Google: http://tr.im/k6oa [from http://twitter.com/AdNerds/statuses/1659090902]
Flu trends analysis based on searches in Google
Seems that information overflow has turned into something positive with Google Flu TrendsCompeting Tax Plans: Two Perspectives - Freakonomics - Opinion - New York Times Blog
Nice chart comparing tax plans.
Obama v McCain tax plans
Wash. Post. / NYTimes review of tax plans useful diagramsPopular favorite images tagged with "infographics" on vi.sualize.us
Popular favorite images tagged with "infographics"
Popular favorite images tagged with 'infographics' on vi.sualize.us - a showcase of visual content for inspiration or simply delight of the spectator based on people recommendationsGoogle Begins to Make Public Data Searchable - ReadWriteWeb
Google Begins to Make Public Data Searchable http://bit.ly/11uBIO <-- event of historic importance [from http://twitter.com/marshallk/statuses/1642481003]
Google just announced its first foray into making public data searchable and viewable in graph form. The company is starting with population and unemployment data from around the US but promises to make far more data sets searchable in the future. The potential significance of making aggregate data about our world easy to visualize, cross reference and compare can't be overstated.Inside the ‘James Bond Villain’ Data Center « Data Center Knowledge
incredible use of people's searches as ambient trending data...
RT @rosshill: Can Google spot health epidemics early? http://tinyurl.com/czxtfh [from http://twitter.com/willdonovan/statuses/1667576296]
Google’s search data may have been able to provide an early warning of the swine flu outbreak — if the company had been looking in the right place. Last week, at the request of the Centers for Disease Control, Google took a retroactive look at its search data from Mexico. And there the team found a pre-media bump in telltale flu-related search terms (you know, “influenza + phlegm + coughing”) that was inconsistent with standard, seasonal flu trends.
Googles search data may have been able to provide an early warning of the swine flu outbreak if the company had been looking in the right place.What is data science? - O'Reilly Radar
The future belongs to the companies who figure out how to collect and use data successfully. In this in-depth piece, O'Reilly editor Mike Loukides examines the unique skills and opportunities that flow from data science.
aspects Business Intelligence, Text Mining, and other statistical analysisLocals and Tourists - a set on Flickr
We've seen maps of photographic activity around the world, and maps of traffic activity in a city, which reveal how heavily roads are used. And now, photographer Eric Fischer has combined both ideas, creating maps of 50 different cities around the world, using only the geotags of photos uploaded to Flickr and Picasa. What emerges are basically maps of human interest--that is, all the places fascinating enough that someone decided to take a picture.
Some people interpreted the Geotaggers' World Atlas maps to be maps of tourism. This set is an attempt to figure out if that is really true. Some cities (for example Las Vegas and Venice) do seem to be photographed almost entirely by tourists. Others seem to have many pictures taken in places that tourists don't visit.
Overviews of places tourists visit versus places locals visit.
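The rule this set uses to separate the two groups (a local has photos in a city spanning a month or more; a tourist appears to be a local of some other city and has photos here spanning less than a month) can be sketched in Python. This is only an illustration under stated assumptions, not Eric Fischer's actual code: the `(user, city, date)` input format, the 30-day approximation of "a month", and the `classify` name are all hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta

MONTH = timedelta(days=30)  # assumption: "a month" approximated as 30 days


def classify(photos):
    """photos: iterable of (user, city, date). Returns {(user, city): label}."""
    # Earliest and latest photo date per (user, city).
    spans = {}
    for user, city, day in photos:
        lo, hi = spans.get((user, city), (day, day))
        spans[(user, city)] = (min(lo, day), max(hi, day))

    # Cities where each user's photos span a month or more.
    local_in = defaultdict(set)
    for (user, city), (lo, hi) in spans.items():
        if hi - lo >= MONTH:
            local_in[user].add(city)

    labels = {}
    for user, city in spans:
        if city in local_in[user]:
            labels[(user, city)] = "local"    # blue points on the maps
        elif local_in[user]:
            labels[(user, city)] = "tourist"  # red points: local somewhere else
        else:
            labels[(user, city)] = "unknown"  # not enough data either way
    return labels


photos = [
    ("ann", "London", date(2010, 1, 1)),
    ("ann", "London", date(2010, 3, 1)),
    ("ann", "Paris", date(2010, 5, 1)),
    ("ann", "Paris", date(2010, 5, 4)),
    ("bob", "Paris", date(2010, 5, 1)),
]
print(classify(photos))
```

In this sketch a photographer with no long-span city anywhere is left as "unknown", since the rule only calls someone a tourist when they look like a local of a different city.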
Blue points on the map are pictures taken by locals (people who have taken pictures in this city dated over a range of a month or more). Red points are pictures taken by tourists (people who seem to be a local of a different city and who took pictures in this city for less than a month).If San Francisco Crime were Elevation | Doug McCune
I’ve been playing with different ways of representing data (see my previous night lights example) and I decided to venture into 3D representations. I’ve used a full year of crime data for San Francisco from 2009 to create these maps. The full dataset can be downloaded from the city’s DataSF website.
RT @brainpicker: San Francisco crime rates, visualized as topographic elevation http://bit.ly/bOlQAi
World Cup 2010 Twitter replay | Football | guardian.co.uk
At the Guardian you can follow the course of World Cup matches through Twitter keywords. http://bit.ly/c43l7B via @jayzon277
A really nice visualization from the Guardian: the Twitter tags during all #WM (World Cup) matches, in replay. http://bit.ly/bPGMfb /via @spreeblick
i'm watching world cup
Watching: "World Cup 2010 #Twitter replay | Football | guardian.co.uk" ( http://bit.ly/d0tk8j )
The Twitter conversation around NED-DEN, visualized. Especially the "hup" at the end... Lovely!
Follow in high-speed replay of the World Cup and find out how Twitter reacted to every gameThe State of Mobile Apps | Nielsen Wire
Statistical data of app usageImagine A Pie Chart Stomping On An Infographic Forever - Smashing Magazine
A certain category of design gaffes can be boiled down to violations of audience expectations. Websites that don’t work in Internet Explorer are a heck of a nasty surprise for users who, bless their souls, want the same Internet experience as everyone else. Websites that prevent copying, whether through careless text-as-image conversions or those wretched copyright pop-ups from the turn of the century, cripple a feature that works nearly everywhere else on the Internet. Avoiding this category of blunders is crucial to good design, which is why I am upset that one particular pitfall has been overlooked with extreme frequency.
Some cool examples of info graphics.
A critique of some poor infographics
The importance of presenting data clearly, and a showcase of bad infographics
JSonduit
"A service that can turn practically anything on the web into a JSON feed that any website or mobile app can consume. A JSON conduit, if you will."
RT @LaFermeDuWeb: Jsonduit - Turn any website into a consumable JSON feed: http://fdw.lu/aB7
Any data, anywhere. JSonduit is a service that can turn practically anything on the web into a JSON feed that any website or mobile app can consume. A JSON conduit, if you will. Feeds are created from one or more source URLs and a custom transform, written in JavaScript, that can manipulate the data before the feed is served. JSonduit also provides a hosting service for web widgets so that any site can easily display JSonduit feeds. In fact, the recent/popular lists you see below are widgets served by the JSonduit service; all done in a couple of lines of JavaScript (go ahead, view the page source!).
jQuery Templates and Data Linking (and Microsoft contributing to jQuery) - ScottGu's Blog
jQuery Templates and Data LinkingHuge Infographics Design Resources: Overview, Principles, Tips and Examples | Onextrapixel - Showcasing Web Treats Without A Hitch
Some good resources, including software links and other references.Ten of the greatest maps that changed the world | Mail Online
1) Dimitri Moor propaganda map of the USSR during the civil war, 2) circa 1490 world map used by Columbus to garner support for his expedition, 3) earliest known Chinese globe from 1623, 4) Waldseemuller world map, first naming the American continent, 5) Google Earth ("Google Earth presents a world in which the area of most concern to you can be at the centre, and which - with mapped content overlaid - can contain whatever you think is important. Almost for the first time, the ability to create an accurate map has been placed in the hands of everyone, and it has transformed the way we view the world."), 6) 1889 map of London poverty, 7) post-Revolutionary War map establishing USA-Canada border, 8) Harry Beck's London tube map, 9) the Peters equal-area projection world map, and 10) a late medieval world map that "marks the birth of English patriotism".
CHINESE GLOBEGoogle Maps
RT @draenews: Del Find exif data - Online exif/metadata photo viewer: http://www.findexif.com/
Extract exif data from any jpg online image, just paste the URL of the image, no need to upload photos to our server!Magazine Preview - The Data-Driven Life - NYTimes.com
Humans make errors. We make errors of fact and errors of judgment. We have blind spots in our field of vision and gaps in our stream of attention. Sometimes we can’t even answer the simplest questions. Where was I last week at this time? How long have I had this pain in my knee? How much money do I typically spend in a day? These weaknesses put us at a disadvantage. We make decisions with partial information. We are forced to steer by guesswork. We go with our gut.
Does anybody really believe that long hours at a desk are a vocational ideal?
Gary Wolf, of Wired and The Quantified Self, describes personal data collection and analysis in NYT magazine.Falsehoods Programmers Believe About Names: MicroISV on a Shoestring
This blog is about the business aspects of running Bingo Card Creator, a small software company. A brief summary of the last few years is available here. If you like what you see, I encourage you to sign up for the RSS feed. Thanks for visiting!COS 493, Spring 2002: Schedule and Readings
Algorithms for Massive Data SetsGoogle Storage for Developers - Google Code
Got the Google Store invite: http://code.google.com/apis/storage - Need to find time to play with it...
Google Storage for Developers is a RESTful service for storing and accessing your data on Google's infrastructure. The service combines the performance and scalability of Google's cloud with advanced security and sharing capabilities. Highlights include:2010 World Cup – The Ultimate Graphic and Data Resources Guide | Inspired Magazine
A collection of World Cup infographics
RT @inspiredmag 2010 World Cup - The Ultimate Graphic and Data Resources Guide http://bit.ly/9eUQvr
2010 World Cup – The Ultimate Graphic and Data Resources Guide http://bit.ly/9eUQvr
"Privacy and Publicity in the Context of Big Data"
presentation script by danah boyd
Big data, the currency that users pay Facebook and other social media companies for the right to use 'free' servicesContent Is No Longer King: Curation Is King
As someone said to me a few weeks back: "Andy Warhol was wrong. We're not going to be famous for 15 minutes. We're each going to be famous for 15 People." Indeed. Advertising: We're standing at the end of an era. "Mass Media", the ability to reach large segments of the population with a single message is essentially over. For advertisers, the need to find content in context, and to have that context be appropriate for their message and their brand is critical. So, Curation replaces Creation as the coin of the realm for advertiser-safe environments. No longer can advertisers simply default to big destination sites. The audience is too diffuse and the need to filter and organize quality crowd-created content is too critical.MetaOptimize Q+A - machine learning, natural language processing, artificial intelligence, text analysis, information retrieval, search, data mining, statistical modeling, and data visualization
"an open data cleansing tool"
Freebase Gridworks is a power tool that allows you to load data, understand it, clean it up, reconcile it internally, augment it with data coming from Freebase, and optionally contribute your data to Freebase for others to use. All in the comfort and privacy of your own computer.Parsing file uploads at 500 mb/s with node.js » Debuggable Ltd
A few weeks ago I set out to create a new multipart/form-data parser for node.js. We need this parser for the new version of transloadit that we have been working on since our setback last month. The result is a new library called formidable, which, on a high level, makes receiving file uploads with node.js as easy as:
Data URIs make CSS sprites obsolete | NCZOnline
Top 10 MySQL GUI Tools — DatabaseJournal.com
Very good page on how to make infographics
If you’re not sure where to start to create your first infographic, remember that annotated maps, flow charts, graphs, many of the diagrams you may already be creating on the job can help your audience to see the meaning behind your data. Not sexy and artistic, you might think, but even a simple graphic is far more effective than asking your prospective supporters to study a spreadsheet – if done right! Here’s a little help for figuring out what information you want to communicate, which data points to select, and how to present the numbers in a way that will be both accurate and accessible to numerically challenged viewers:Wild Apricot Blog : Make Your Own Infographic
RT @wwwhatsnew: Tips for creating infographics http://wwhts.com/c7Rfsg
The Big Lies People Tell In Online Dating « OkTrends
lies and stats from OK Cupid http://bit.ly/aY1kX1 :)
OK Cupid crunches the numbers on the biggest lies in online dating: http://bit.ly/9zheTf
// Cool data
"People do everything they can in their OkCupid profiles to make themselves seem awesome, and surely many of our users genuinely are. But it's very hard for the casual browser to tell truth from fiction. With our behind-the-scenes perspective, we're able to shed some light on some typical claims and the likely realities behind them."
Another amazing data analysis post from OkCupid
I'm married, but I love love love when OKCupid goes all data on us.
Interesting analysis of information gathered from OK Cupid (dating site) vs. norm.adidas Football
real time meta tagging
An interesting interface for tracking and charting soccer matches.
Social Media is the 3rd Era of the Web
Social Media is the 3rd Era of the Web http://bit.ly/9NCUuw #socialmedia
great comparisons in stats on "social media"
What’s also interesting is that the decline of Web 2.0 and the rise of social media are connected. Since Facebook has hit the scene, the original social media tools have peaked in usage: blogs, wikis, forums and RSS.
RT @BBHLabs: Excellent charts. Good long-term context - RT @steverubel: Social Media is 3rd Era of Web - http://j.mp/ciowqR
An interesting series of graphs about the rise and rise of social media, especially Facebook
I’ve done this search dozens of times since December and have shared it in slides many times since. It’s a search that compares the world wide search volume on Google for new media, web 2.0, and social media. What the above graph shows is that we’re at an inflection point in the language we use to describe the macro trends of innovation on the web. I believe it’s the indicator that we’re in the 3rd Era of the Web and it’s The Era of Social Media.
BBC - BBC Internet Blog: BBC World Cup 2010 dynamic semantic publishing
Ace post on how the BBC were using data to build their World Cup site.
BBC World Cup 2010 dynamic semantic publishing Post categories: World Cup, linked data, metadata, semantic, semantic web, web publishingA Protovis Primer, Part 1 | eagereyes
Turn your spreadsheet into a map with OpenHeatMap: http://www.openheatmap.com/ via @BetweenMyths #datavisualization #geolocation
Nice open way of putting heat maps on maps
7 Basic Rules for Making Charts and Graphs
RT @flowingdata 7 Basic Rules for Making Charts and Graphs http://bit.ly/cGat5u
Primer or Reminder RT @JuiceAnalytics 7 basic rules for creating charts http://bit.ly/930RG7 #MR
How-To: Give Your Old iPhone New Life With Prepaid Data and Minutes
GoPhone for iPhone, 3G, 3GS
This is kool!!!Introduction
intro on data visualization
Introduction to, and history of, data visualization
Data visualization is a pretty literal term that means, quite simply, the visual representation of quantitative data. In this course we’ll learn common techniques for visualizing data, as well as some strategies for managing information digitally. But first, a brief history.
A brief history of visualization http://bit.ly/a2YB4q #datavisualization #dataviz
Data Visualization
Although visualization hasn’t been widely recognized as a discipline in and of itself until fairly recently, today’s most popular forms date back nearly two centuries. Geographical exploration, mathematics, and popularized history spurred the creation of early maps, graphs, and timelines as far back as the 1600s; but William Playfair is widely credited as the inventor of the modern chart, having created the first widely distributed line and bar charts in his Commercial and Political Atlas of 1786, and what is generally considered to be the first pie chart in his Statistical Breviary, published in 1801.