I usually keep quiet during the myriad technology debates that flood certain web circles, preferring to just do my coding and building of things. So when I do dig into some technology or other — often way after all the geeks have argued and hashed to death some obscure techie implementation tidbit — I'm shocked to discover just how messed up it is. This week's struggle lies with OPML. I like to think OPML stands for Other People's Markup Language, and I try to be down with that, but in reality it stands for Outline Processor Markup Language and it's a format many weblog-related tools (such as blogrolling.com and various RSS news readers) have implemented to make it easy for you to import and export a list of your favorite weblogs. Sounds like a pretty good idea, if only it were actually standard.
Unfortunately, OPML has a DTD that says you can extend OPML any way you want (which is crazy talk to me: a DTD you can change? What's the point?), meaning you can add more elements, or more attributes to your elements. So when someone (me) tries to implement something with various OPML outputs, you (again, me) realize that one tool outputs an attribute "url" while another outputs "htmlUrl" and a third "htmlurl", all to signify the same thing! Sure, some RegEx can clean this up, but weren't we trying to avoid all that with XML in the first place? Argh! I just want to be able to develop something and have a strong contract defined. Is that too much to ask? No "extends XYZ," no "I changed this," just "this is how you express X" and that's it. Maybe if the format you're using requires you to change it to represent your data, you're not using the right format in the first place.
Which makes me think that some of the problems we've had in the weblog community around formats like RSS and OPML might stem from the fact that we use them in ways they weren't designed for. But that seems like a topic for another day's rant.
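For the curious, here is roughly the kind of cleanup the "url" / "htmlUrl" / "htmlurl" mess forces on you. This is only a minimal sketch in Python using the standard library's XML parser rather than RegEx. The function name extract_links, the sample document, and the example.com-style addresses are all made up for illustration, and it assumes those three spellings are the only ones you will meet, which real OPML files may not honor.

```python
import xml.etree.ElementTree as ET

# Attribute spellings for "the site's address", compared in lowercase so that
# "htmlUrl" and "htmlurl" collapse into one. Assumption: no other variants exist.
URL_ALIASES = ("htmlurl", "url")


def extract_links(opml_text):
    """Return (title, url) pairs for every <outline> element that carries a URL."""
    root = ET.fromstring(opml_text)
    links = []
    for outline in root.iter("outline"):
        # Lowercase the attribute names so casing differences disappear.
        attrs = {name.lower(): value for name, value in outline.attrib.items()}
        url = next((attrs[alias] for alias in URL_ALIASES if alias in attrs), None)
        if url is None:
            continue  # an outline with no link, e.g. a folder
        title = attrs.get("title") or attrs.get("text") or ""
        links.append((title, url))
    return links


# Hypothetical OPML, one outline per spelling, plus one with no link at all.
SAMPLE = """<opml version="1.0">
  <head><title>blogroll</title></head>
  <body>
    <outline text="Site A" url="http://a.example/"/>
    <outline text="Site B" htmlUrl="http://b.example/"/>
    <outline text="Site C" htmlurl="http://c.example/"/>
    <outline text="Folder with no link"/>
  </body>
</opml>"""

if __name__ == "__main__":
    for title, url in extract_links(SAMPLE):
        print(title, url)
```

It works, but notice that the list of aliases lives in my code and not in the format, which is exactly the complaint: every consumer ends up keeping its own private table of what other tools might have meant.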
Matt Hamer writes in with more coherent thoughts on this issue:
The DTD (at least this one:
http://static.userland.com/gems/radiodiscuss/opmlDtd.txt) *with no
modifications* does not allow you to add extra attributes. A document
with undefined attributes would not validate against this DTD. I don't
want to spend time reading the full spec right now, but based on the
comment in the DTD, I assume the spec prose says that you can add any
attribute that you want to. The DTD makes it easy to add your own
attributes to the outline element, but you must define them by adding
them to the OtherAttributes ENTITY. If you do this, you are really
working with a different DTD.
The real problem is not with the DTD, or really even with the spec that
says, "add your own attributes." The problem seems to be that people
are adding information that you (and other people) find useful in
*non-standard* ways. If 'url' or 'htmlURL' or whatever is valuable
information, a standard attribute should be added to the DTD.