Tuesday, June 10, 2008

So What Are They Talking About?

The recent growth in popularity of 'user generated content' and the almost universal provision of 'user tagging' of content - associating video or photo uploads, or bookmarked URLs, with informal keyword metadata (that is, tags) - has led to the widespread appearance on the web of weighted lists, more commonly referred to as tag clouds.

[Image: flickr tag cloud]

The size of each word in the tag cloud is proportional to the frequency of the word in some particular corpus. This might be the number of items in a collection that have been tagged in a particular way, or it might be the result of a word count or content analysis over full text documents.
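As a rough illustration of that proportional sizing, here is a minimal Python sketch. The function name `font_sizes`, the pixel limits and the sample counts are all made up for the example; real sites may well use a different (for instance logarithmic) scale.

```python
def font_sizes(tag_counts, min_px=10, max_px=40):
    """Map each tag's count to a font size in pixels, linear in the count.

    min_px and max_px are illustrative defaults, not values taken from
    any particular site. Counts are assumed to be positive integers.
    """
    if not tag_counts:
        return {}
    top = max(tag_counts.values())
    sizes = {}
    for tag, count in tag_counts.items():
        # The most frequent tag gets max_px; other tags are scaled in
        # proportion, with a floor so that rare tags stay legible.
        sizes[tag] = max(min_px, round(max_px * count / top))
    return sizes


# Hypothetical counts: the number of items carrying each tag.
counts = {"london": 120, "wedding": 60, "beach": 30, "cat": 6}
sizes = font_sizes(counts)
```

With these made-up counts, 'london' is drawn twice as large as 'wedding', while 'cat' is held at the minimum size rather than shrinking to an unreadable 2 pixels.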

Where tag clouds are used to provide a navigational summary of user generated content, clicking on a tag in the cloud will typically lead to a 'tag results' page (much like a search engine results page) with links to all the pages that use that tag. In the case of the tag cloud shown above, a screenshot of the tags page on the flickr photo sharing site, clicking on a tag will take you to an 'image results' page showing thumbnails of the photographs tagged with that particular term.
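One way to picture what sits behind a 'tag results' page is a lookup table from each tag to the items that carry it. The sketch below is only an assumption about how such a table might be built, not a description of how flickr actually does it; the function name `build_tag_index` and the photo filenames are invented for the example.

```python
from collections import defaultdict


def build_tag_index(items):
    """Build a tag -> list of item ids lookup from an item -> tags mapping."""
    index = defaultdict(list)
    for item_id, tags in items.items():
        # Lower-case the tags and drop duplicates so that 'Beach' and
        # 'beach' on the same item only count once.
        for tag in {t.lower() for t in tags}:
            index[tag].append(item_id)
    return dict(index)


# Hypothetical photo collection with user-supplied tags.
photos = {
    "photo1.jpg": ["beach", "sunset"],
    "photo2.jpg": ["Beach", "family"],
    "photo3.jpg": ["sunset"],
}
index = build_tag_index(photos)

# The 'results page' for a tag is simply the list stored under it.
beach_results = index.get("beach", [])
```

The length of each list is also the number of items tagged in that way, which is the frequency a tag cloud would use to size the tag.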

The IBM Many Eyes visualisation application provides a suite of tools for analysing data sets and visualising them in various ways. One of the tools is the Many Eyes tag cloud/text analyser, which will analyse textual data sets to produce a tag cloud visualisation.

For example, some time ago I uploaded a data set containing the titles, tags and descriptions of a set of OpenLearn open courseware units, and generated a tag cloud that allowed me to see a quick summary of the topics that were covered on OpenLearn at the time. You can see the tag cloud here: OpenLearn tag cloud (via Many Eyes).

As well as Many Eyes, there are many online tag cloud generators that can generate tag clouds from blocks of raw text, uploaded files, HTML pages (given their URL) or RSS feeds. TagCrowd is one such application, Zomclouds another. A web search for something like 'tag cloud generator' is likely to turn up many more.
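If you are curious about what such a generator might do with a block of raw text, the following Python sketch counts word frequencies after discarding very common words. It is a guess at the general approach rather than the method used by any of the services named above; the stop word list, the function name `word_counts` and the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter

# A tiny, illustrative stop word list; real generators use far longer ones.
STOP_WORDS = {"the", "a", "of", "and", "to", "in", "is", "that", "it", "or"}


def word_counts(text, top_n=50):
    """Return the top_n most frequent words in text as (word, count) pairs."""
    # Lower-case the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    # Skip stop words and very short words, which rarely make useful tags.
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return counts.most_common(top_n)


sample = ("Tag clouds summarise tags. A tag cloud shows each tag "
          "sized by how often the tag is used.")
top = word_counts(sample, top_n=3)
```

The resulting counts are the raw material for a tag cloud: each word would then be displayed at a size that reflects its count.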

Using one of the services referred to above, or one that you have found yourself, experiment with creating different tag clouds from different blocks of text. Does the visualisation provide a good 'summary' of the text? If you use delicious, flickr, or any other site that supports user tagging and tag clouds, look at your own tag cloud. Does it give a fair summary of the content you have tagged?


Tony Hirst said...

The Wordle tag cloud generator allows customisation of how a tag cloud (word cloud) is displayed, for example by allowing the use of different colours within the cloud and the ability to modify the orientation of words within the cloud (so they can be 'rotated' and displayed at an angle).