SDForum Technorati Talk


Tonight I attended the SDForum Web Services SIG meeting whose topic was “Semantic XHTML — Can your website be your API?”. The presenters were Kevin Marks and Tantek Çelik from Technorati. Following are my rough notes from this interesting presentation.

Update 2004-10-05: Slides from this talk are now posted on Tantek’s site.

Semantic XHTML

Can your website be your API?

SDForum Web Services SIG, 2004-09-28

Some SDForum general topics:

  • Monthly Web Services Working Group will probably be formed in a couple of months
  • Forming a new Web Client SIG, topics to include RSS, Atom, SOAP, REST, etc.; looking for a host
  • New PayPal Hacks book coming out

Background on Technorati

Technorati now tracks 4 million blogs (up from 3 million in June) and sees about 4 million posts per week. The new Politics site tracks and summarizes about 10,000 political blogs. Link analysis is the key attribute of their processing. For international content, they use UTF-8 internally and convert from most other encodings as needed. Content search for non-English blogs is less developed, but that is not critical yet because they rely on links rather than content.
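
As a rough illustration of the "UTF-8 internally" approach, here is a minimal Python sketch. It is not Technorati's code: the `to_utf8` helper and the sample bytes are my own assumptions, and a real crawler would also have to detect the source encoding rather than be told it.

```python
def to_utf8(raw, declared_encoding):
    """Decode bytes from their declared encoding and re-encode as UTF-8.

    Hypothetical helper: undecodable bytes are replaced rather than raising.
    """
    text = raw.decode(declared_encoding, errors="replace")
    return text.encode("utf-8")


# Sample input: a name stored as ISO-8859-1 bytes (assumed for illustration).
latin1_bytes = "Çelik".encode("iso-8859-1")
utf8_bytes = to_utf8(latin1_bytes, "iso-8859-1")
print(utf8_bytes.decode("utf-8"))
```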

Presentation

HTML started out structured but became presentational during the browser wars. Its error tolerance drove explosive growth. Table abuse, font tag-itis, and spacer GIF layouts caused two backlashes:

  • Backlash for structure — XML; draconian error checking, freedom to make own schemas, appeals to programmers
  • Backlash for layout — CSS; move presentation away from structure, content independence, appeals to designers, http://www.csszengarden.com

Where does XML fail?

  • schema explosion (everyone makes their own)
  • tag/attribute battling
  • abstraction ratholes - BTO ontology
  • not human readable (partly by design)
  • doesn’t work on “the Web” today

Where does CSS fail?

  • folk coding (design rather than engineering community)
  • variable implementations
  • visual designers thinking about presentation as structure
  • structure hacks to fix presentation

Can we re-integrate these strands?

  • XHTML is XML (XHTML = HTML made into XML)
  • parseable, modular
  • XHTML supports CSS
  • everyone already has a viewer
  • everyone can make queries

Example - Politics Site. Sample problem:

  • wanted a chart of the top 3 links on a page
  • dynamically generated using some complex app logic to choose the link title based on transient data
  • solution: use the site output page as input, easily parsable to extract desired information
  • this web page wasn’t originally designed with that in mind, but due to its structure was reusable
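
To make the "site output as input" idea concrete, here is a small Python sketch. The page markup, the `top-links` id, and the `top_links` function are hypothetical; the real Politics page was not shown in my notes. The point is only that well-formed XHTML can be read with an ordinary XML parser.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Hypothetical page output, standing in for the real site markup.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Top links</title></head>
  <body>
    <ol id="top-links">
      <li><a href="http://example.com/a">Story A</a></li>
      <li><a href="http://example.com/b">Story B</a></li>
      <li><a href="http://example.com/c">Story C</a></li>
      <li><a href="http://example.com/d">Story D</a></li>
    </ol>
  </body>
</html>"""


def top_links(xhtml_text, count=3):
    """Return the first `count` (title, href) pairs from the top-links list."""
    root = ET.fromstring(xhtml_text)
    links = []
    for ol in root.iter(XHTML_NS + "ol"):
        if ol.get("id") == "top-links":
            for a in ol.iter(XHTML_NS + "a"):
                links.append((a.text, a.get("href")))
    return links[:count]


for title, href in top_links(PAGE):
    print(title, href)
```

No separate API endpoint is needed here: the same page that people read is the data source for the chart.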

XHTML building blocks

  • most applications reuse a lot of common concepts
  • strings
  • lists, correspond to program arrays (<ol> and <ul>)
  • tables, can be used for 2D array
  • links; the ‘rel’ attribute explicitly defines the relationship, and is extensible and multivalued
  • definition lists, key/value pairs or hashtables
  • citations and quotes; cite a person or source by name, popular use in weblogs
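
The mapping from building blocks to program data structures can be sketched in Python as follows. The fragment and the helper names (`as_list`, `as_dict`, `as_grid`) are made up for illustration, and the fragment omits the XHTML namespace to keep the lookups short.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment containing a list, a definition list, and a table.
FRAGMENT = """<div>
  <ol><li>first</li><li>second</li></ol>
  <dl><dt>blogs</dt><dd>4 million</dd><dt>political blogs</dt><dd>10,000</dd></dl>
  <table>
    <tr><td>1</td><td>2</td></tr>
    <tr><td>3</td><td>4</td></tr>
  </table>
</div>"""


def as_list(ol):
    """An <ol> or <ul> becomes a Python list."""
    return [li.text for li in ol.findall("li")]


def as_dict(dl):
    """A <dl> becomes key/value pairs: each <dt> is a key, the next <dd> its value."""
    result = {}
    key = None
    for child in dl:
        if child.tag == "dt":
            key = child.text
        elif child.tag == "dd" and key is not None:
            result[key] = child.text
    return result


def as_grid(table):
    """A <table> becomes a 2D array (list of rows)."""
    return [[td.text for td in tr.findall("td")] for tr in table.findall("tr")]


root = ET.fromstring(FRAGMENT)
items = as_list(root.find("ol"))
pairs = as_dict(root.find("dl"))
grid = as_grid(root.find("table"))
print(items, pairs, grid)
```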

Existing examples

  • XFN - XHTML Friends Network; just add ‘rel’ to your blogroll links; define profile using a dictionary: http://gmpg.org/xfn/1
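
Here is a short Python sketch of reading XFN-style relationships out of a blogroll. The blogroll markup and the `relationships` function are hypothetical; the only thing taken from the talk is that ‘rel’ carries the relationship and may hold several space-separated values.

```python
import xml.etree.ElementTree as ET

# Hypothetical blogroll; the rel values are examples of relationship terms.
BLOGROLL = """<ul>
  <li><a href="http://example.com/alice" rel="friend met">Alice</a></li>
  <li><a href="http://example.com/bob" rel="colleague">Bob</a></li>
  <li><a href="http://example.com/news">News site</a></li>
</ul>"""


def relationships(xhtml_text):
    """Map each link's href to its list of rel values, skipping links without rel."""
    root = ET.fromstring(xhtml_text)
    found = {}
    for a in root.iter("a"):
        rel = a.get("rel")
        if rel:
            # rel is multivalued: values are separated by whitespace.
            found[a.get("href")] = rel.split()
    return found


print(relationships(BLOGROLL))
```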

Future examples

  • attention.xml; what you are reading, how often you read it, etc., with the goal of an application that can help synchronize what you’re reading and highlight things you are interested in
  • XSPF - play lists (XML shared playlist format)
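
For a sense of what consuming a shared playlist might look like, here is a Python sketch that reads track titles and locations from a small XSPF document. The playlist contents and the `tracks` function are hypothetical, and the element names follow my understanding of the XSPF format rather than anything shown in the talk.

```python
import xml.etree.ElementTree as ET

# Namespace prefix mapping used for lookups below.
NS = {"x": "http://xspf.org/ns/0/"}

# Hypothetical playlist with two tracks.
PLAYLIST = """<playlist version="1" xmlns="http://xspf.org/ns/0/">
  <trackList>
    <track><location>http://example.com/1.mp3</location><title>First song</title></track>
    <track><location>http://example.com/2.mp3</location><title>Second song</title></track>
  </trackList>
</playlist>"""


def tracks(xspf_text):
    """Return (title, location) pairs for every track in the playlist."""
    root = ET.fromstring(xspf_text)
    return [
        (
            t.findtext("x:title", namespaces=NS),
            t.findtext("x:location", namespaces=NS),
        )
        for t in root.findall("x:trackList/x:track", NS)
    ]


print(tracks(PLAYLIST))
```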

New types - Methodology

  • map existing data structures into XHTML equivalents
  • enable new stylable building blocks
  • readily exchange data as mapping is 1:1

New type - People

  • RFC 2426 vCard <-> hCard
  • create an XHTML representation of this
  • embed within a webpage, share to and from the web
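
A minimal Python sketch of the vCard <-> hCard round trip follows. It assumes that class names simply mirror lowercased vCard property names (FN, ORG, TEL); the exact markup proposed in the talk is not in my notes, and the `to_hcard` and `from_hcard` helpers and the sample person are made up.

```python
import xml.etree.ElementTree as ET


def to_hcard(fields):
    """Build an XHTML fragment from vCard-style properties (assumed 1:1 mapping)."""
    card = ET.Element("div", {"class": "vcard"})
    for name, value in fields.items():
        span = ET.SubElement(card, "span", {"class": name.lower()})
        span.text = value
    return card


def from_hcard(card):
    """Recover the vCard-style properties from the fragment."""
    return {span.get("class").upper(): span.text for span in card.findall("span")}


# Hypothetical contact data.
person = {"FN": "Jane Example", "ORG": "Example Corp", "TEL": "+1-555-0100"}
card = to_hcard(person)
markup = ET.tostring(card, encoding="unicode")
print(markup)
```

Because the mapping is 1:1, parsing the markup back gives the original properties, and the same fragment can be styled with CSS when embedded in a page.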

New type - Events

  • RFC 2445 iCalendar <-> hCalendar
  • describe events
  • display them and enable parsing
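
The same idea for events, as a Python sketch: an event is displayed as ordinary XHTML and parsed back into iCalendar-style properties. The class names (`vevent`, `summary`, `dtstart`, `location`), the use of `title` for a machine-readable date, the event data, and the `parse_event` function are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical event markup; readable in a browser and parseable as XML.
EVENT = """<div class="vevent">
  <span class="summary">Web Services SIG meeting</span>
  <abbr class="dtstart" title="2004-09-28">September 28, 2004</abbr>
  <span class="location">Example Hall</span>
</div>"""

WANTED = ("summary", "dtstart", "location")


def parse_event(xhtml_text):
    """Collect event properties keyed by uppercase iCalendar-style names."""
    root = ET.fromstring(xhtml_text)
    event = {}
    for el in root.iter():
        cls = el.get("class")
        if cls in WANTED:
            # Prefer the machine-readable title attribute when present.
            event[cls.upper()] = el.get("title") or el.text
    return event


print(parse_event(EVENT))
```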


About

This is the personal website of Brian Cantoni. All opinions on this site are my own.

This weblog is licensed under a Creative Commons License.
