XML: The scourge of the Internet

Not that XML is a new thing, nor that it has only just started to piss me off; it's just that XML has finally pissed me off enough for me to get off my lazy ass and write down just how much. A lot.

Here I am, minding my own business and trying to build out a new development server... you know, for development. And as all good developers know, you sometimes need to develop with software which has not quite matured yet. Saying nothing of the state of maturity of PHP as a whole (that's a different rant), I decided to download the 5.x series to "give it a go". Aside from being the mistake that I thought it was going to be, it took the opportunity, and possibly some pleasure, to tell me that my system is not sufficiently advanced to run the default PHP because it doesn't have libxml2 2.5.10 or newer. This coming from a product whose build finishes as follows: Build complete.

But, again, this is about XML. Some number of years ago, not content with creating the horror that was/is/forever will be HTML, the World Wide Web Consortium, a.k.a. the W3C, set out to make us forget their previous blunder by providing us with an even bigger fuck-up: XML.

Here's the first paragraph from the XML page at the W3C, annotated for your pleasure:

Extensible Markup Language (XML) is a simple[1], very flexible[2] text format[3] derived from SGML (ISO 8879)[4]. Originally designed to meet the challenges of large-scale electronic publishing[5], XML is also playing an increasingly important role[6] in the exchange of a wide variety of data[7] on the Web and elsewhere[8].
XML configuration files. For what? Portability? No. It's because programmers (mostly Java programmers, yet another rant) are either too lazy or completely lack the skills to write a text file parser. I guess that's what happens when you develop a code base around a language which demands more tolerance for bullshit from its programmers than actual programming skill.

"But it's a well known, well formatted configuration file"

Number 1: It is not well known. You have to add your own 'extensions', which take twice as long for somebody to figure out because all of your variables and data are hidden in superfluous tags which are there more for the convenience of the parser than to be useful.

Number 2: Yes! It does have to be well formatted, 'cause guess what, you would have no chance at parsing it otherwise! Just like a regular text configuration file! Properly format a configuration file and you won't have any of these problems. And then you have to bitch and moan to the user if the XML file isn't well-formed. GRRR! Can't I just use the data if I can figure out what it is? No, of course not, because just because the computer thinks it knows what's going on doesn't mean that the person who wrote the configuration file does. It's just way too easy to leave off a quote and completely fuck the parser. Here's an example:
Now think about this for a second. All the data in the left cell is in the right cell. Aside from precisely what is in the left cell, all the rest of the characters in the right cell are unnecessary. ALL of them.

So it's a couple of extra bytes. Who cares? Well, me for starters. This is the thin end of the wedge. A couple of bytes in the case of 2 body parts expands to kilobytes and megabytes once you start getting a lot of entries. And I don't know about you, but I already download far more crap than useful information on the web as it is; I don't need somebody to pad the number.

Now, think of the poor computer which has to parse this file over, and over, and over again. It's not enough that we torture the poor beast into doing stuff that it is not good at, i.e. anything but math; we also have to make it even harder with a text format which is as hard to parse as possible. Here's some C code for parsing each (untested):
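For the plain-text half, a parser can be about this small. Treat it as a rough sketch under assumptions: it expects a hypothetical one-setting-per-line `key=value` format, the helper name `parse_line` is made up for illustration, and it makes no attempt at the XML half.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (name and format are assumptions for illustration):
 * split one "key=value" line in place.
 *
 * On success, *key and *value point into the caller's buffer and 0 is
 * returned. Returns -1 if the line has no '=' or the key is empty.
 */
static int parse_line(char *line, char **key, char **value)
{
    assert(line != NULL && key != NULL && value != NULL);

    char *eq = strchr(line, '=');
    if (eq == NULL || eq == line)
        return -1;

    *eq = '\0';          /* terminate the key where the '=' was */
    *key = line;
    *value = eq + 1;

    /* Cut the value at the first CR or LF, e.g. the newline from fgets(). */
    (*value)[strcspn(*value, "\r\n")] = '\0';
    return 0;
}
```

Feed it each line that `fgets()` hands back and you get the key and the value as two pointers into the same buffer, with no allocation and no second pass over the data.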
I've put almost 15 minutes of thought into this, which apparently is more time than the people who use XML have put in. These are the same people who complain that computers just don't work right. If you abuse the computer, the computer is going to abuse you back.

"I use XML for RSS"

Well, that's more like what the description of XML from the W3C says. Which makes you wonder why that paragraph isn't reserved for RSS. Okay, let's see what they say about it:

RSS 1.0 ("RDF Site Summary") is an RDF Vocabulary[1] that provides a lightweight[2] multipurpose[3] extensible[4] metadata description[5] and syndication format[6]. In short[7], it's a means for describing news and events[8] so that they can be shared across the web[9].
Once again, they're taking the opportunity to fill your data stream with superfluous tags instead of something that the computer has an easy time understanding. Here's a sample of an RSS feed, next to the way that I would do it, because I actually like using my computer instead of waiting for it to parse XML.
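To give a feel for the fixed-layout idea, here is a rough sketch. It is an illustration under assumptions, not a real feed format: the struct `feed_item`, its field sizes, and the helpers `feed_write` and `feed_read` are all invented here. Each item is one fixed-size record carrying a UNIX timestamp, so item N lives at a known offset and the file can be mapped with POSIX `mmap()` instead of being parsed.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical fixed-size record: every field sits at a known offset.
 * Simplification: records are written in host byte order. */
struct feed_item {
    int64_t published;   /* seconds since the UNIX epoch (a widened time_t) */
    char    title[120];  /* NUL-padded */
    char    link[128];   /* NUL-padded */
};

/* Write n records to path. Returns 0 on success, -1 on error. */
static int feed_write(const char *path, const struct feed_item *items, size_t n)
{
    assert(path != NULL && items != NULL);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    size_t bytes = n * sizeof *items;
    ssize_t written = write(fd, items, bytes);
    close(fd);
    return (written == (ssize_t)bytes) ? 0 : -1;
}

/* Map the file read-only and copy out record idx.
 * Returns 0 on success, -1 on error or if idx is past the end. */
static int feed_read(const char *path, size_t idx, struct feed_item *out)
{
    assert(path != NULL && out != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || (size_t)st.st_size < (idx + 1) * sizeof *out) {
        close(fd);
        return -1;
    }

    void *map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                       /* the mapping outlives the descriptor */
    if (map == MAP_FAILED)
        return -1;

    const struct feed_item *items = map;
    *out = items[idx];               /* no scanning: just index the record */
    munmap(map, (size_t)st.st_size);
    return 0;
}
```

The reader turns `published` into local time with `localtime()`, and finding item N is an array index rather than a crawl through tags.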
Yeah, so you can't put the data in whichever order you want. Waah! Suck on it and just put it down. Put a god damned universal date (in this case UNIX time_t) so that people can actually see what time it was published in their own time zone. Plus, you can actually mmap() this file so you don't have to crawl through it like a toddler looking for his mommy.

Yeah, so it's not extensible. Like RSS is extensible. Lemme see, let's extend it to tell people where they can find their butt. Then let's push out the extended RSS definition file so that their XML parser knows what to do with it. THEN WHAT? Their browser or their parser knows what to do with it? No! Good use of that extensibility there.

I shudder to think about the Google ad that is going to be at the top of the page when you're reading this. Needless to say, it's gonna be some company trying to sell you a tool kit or training courses or materials on XML. First off, as you can tell, I don't know why anybody would want XML, or why anybody would want to pay money to support the W3C's XML habit. But if you still insist, go ahead. At least I'll make some money off of it.

Make what you want of this. If you love XML and use it religiously, I'm not here to change your mind: I'm just here to call you lazy and stupid. So, like the RSS feed says: Stop being a pussy. Be a man and parse your own god-damned text files.