Experimenting with visualizing TV news (and comedy)

Matthew Ericson at the New York Times did a really cool visualization last week, “The Words They Used”, comparing the most frequently used words at the Democratic and Republican Conventions. From the article: “Republicans were more likely to talk about businesses and taxes, while Democrats were more likely to mention jobs or the economy.”

This got me thinking about doing something similar for TV programs. So I ran an experiment, feeding transcripts into the excellent word cloud generator Wordle. The transcripts were generated with a single click from a SnapStream TV search appliance and cover last week’s episodes of The Daily Show with Jon Stewart and Fox’s The O’Reilly Factor with Bill O’Reilly (the week of the Republican Convention in St. Paul). The results:

Monday, September 1, 2008

»The O’Reilly Factor with Bill O’Reilly

The Daily Show with Jon Stewart

(there wasn’t a new episode on Monday!)

Tuesday, September 2, 2008

»The O’Reilly Factor with Bill O’Reilly

»The Daily Show with Jon Stewart

Wednesday, September 3, 2008

»The O’Reilly Factor with Bill O’Reilly

»The Daily Show with Jon Stewart

Thursday, September 4, 2008

»The O’Reilly Factor with Bill O’Reilly

»The Daily Show with Jon Stewart

Friday, September 5, 2008

»The O’Reilly Factor with Bill O’Reilly

»The Daily Show with Jon Stewart

A few notes:

  • I didn’t remove commercials from the transcripts, so any commercials that had captioning are reflected in the results.
  • I removed captioning cues from the transcripts so they wouldn’t skew the results. I’m talking about things like “[Applause and cheering]” (mostly on The Daily Show :-)) and speaker labels like “Jon:” and “Bill:”.

So what do you think? Are these visualizations interesting? What are your observations? I don’t have a background in content analysis, so hopefully some experts can weigh in with their conclusions.
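
For anyone who wants to try the cleanup step from the notes on their own transcripts, here is a rough Python sketch. It is not the process used for the clouds above. It assumes the cues look like the examples in the notes (bracketed sound cues and a capitalized name followed by a colon at the start of a line), and the stop-word list and the function names `strip_caption_cues` and `word_counts` are made up for illustration.

```python
import re
from collections import Counter

# Assumed cue formats, based only on the examples in the notes:
# bracketed cues such as "[Applause and cheering]" and speaker
# labels such as "Jon:" or "Bill:" at the start of a line.
BRACKET_CUE = re.compile(r"\[[^\]]*\]")
SPEAKER_LABEL = re.compile(r"^[ \t]*[A-Z][A-Za-z.']*:[ \t]*", re.MULTILINE)

# A tiny, hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "in", "is", "it", "that"}


def strip_caption_cues(text):
    """Remove bracketed cues and leading speaker labels from a transcript."""
    text = BRACKET_CUE.sub(" ", text)
    return SPEAKER_LABEL.sub("", text)


def word_counts(text):
    """Count the remaining words, ignoring case and stop words."""
    cleaned = strip_caption_cues(text).lower()
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", cleaned)
    return Counter(w for w in words if w not in STOP_WORDS)


if __name__ == "__main__":
    sample = (
        "Jon: Welcome to the show!\n"
        "[Applause and cheering]\n"
        "Bill: Taxes, taxes and jobs.\n"
    )
    for word, count in word_counts(sample).most_common(5):
        print(word, count)
```

Note that a label pattern this loose will also eat any line that happens to begin with a capitalized word and a colon, so it is worth spot-checking the output against the original transcript.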