
Contributor to XML Standard: "XML is Too Hard for Programmers"

Dave Aiello wrote, "In a recently published essay on his Ongoing weblog, Tim Bray says XML is too hard for programmers to use. The topic alone is certain to get a lot of developers with web infrastructure interests to tune in. What does he mean? Well, he says:"

During the process of setting up ongoing {his weblog}, for the first time in a year or more I wrote a bunch of code to process arbitrary incoming XML, and I found it irritating, time-consuming, and error-prone.

"Nice to know that one of the authors of the XML specification has the same sort of problems that the rank and file do. But many of us have studied XML manipulation extensively and found one or more solutions that work in our problem spaces."

"It turns out that Bray writes a lot of code in Perl, but he defaults to the lowest-common-denominator method for parsing XML:"

As regards XML, I've been living in the land of scripting generally and Perl specifically in recent times.... That leaves input data munging, which I do a lot of, and a lot of input data these days is XML. Now here's the dirty secret; most of it is machine-generated XML, and in most cases, I use the perl regexp {regular expression} engine to read and process it. I've even gone to the length of writing a prefilter to glue together tags that got split across multiple lines, just so I could do the regexp trick.

"Bray's chief complaint, when you get down to it, is that he wants a reliable stream-oriented XML parser in Perl that does not rely on callbacks. I was never able to find one, but I found a way to do what I wanted by using XML::Twig. XML::Twig is fast and memory-efficient, and it can be used in either an object-oriented or a callback-oriented style."
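For readers who have not seen a stream-oriented parser without callbacks, here is a minimal sketch of the idea. It is written in Python with the standard library's `xml.etree.ElementTree.iterparse`, not in Perl and not with XML::Twig, so treat it as an analogy rather than a description of either Bray's code or XML::Twig's API. The `iter_items` helper, the `item`/`title`/`link` element names, and the sample document are all invented for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this sketch.
SAMPLE = b"""<rss><channel>
<item><title>First</title><link>http://example.com/1</link></item>
<item><title>Second</title><link>http://example.com/2</link></item>
</channel></rss>"""


def iter_items(source, tag="item"):
    """Yield (title, link) pairs as each <item> finishes parsing.

    The caller drives the loop; no handler functions are registered.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield elem.findtext("title"), elem.findtext("link")
            # Drop the element's children so memory use stays flat
            # on large inputs.
            elem.clear()


# The calling code reads like an ordinary loop over records.
items = list(iter_items(io.BytesIO(SAMPLE)))
for title, link in items:
    print(title, link)
```

The point of the pull style is that control stays with the calling code: it asks for the next record when it is ready, instead of having the parser call back into it.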

"Initially, I fought against using a Perl module like XML::Twig. I said to myself, 'I ought to be able to extract the small amount of data I need using regular expressions.' I tried it. It's not easy. And, it never worked 100 percent of the time for me."
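To make the "never worked 100 percent of the time" point concrete, here is a small sketch comparing a naive regular expression with a real parser on the same input. It uses Python's standard library rather than Perl, and the sample feed, the pattern, and the function names are assumptions made up for this illustration; they are not taken from Bray's essay or from my own code.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical feed, invented to show common ways real XML defeats a regex.
SAMPLE = """<feed>
  <entry id="1"><title>Plain title</title></entry>
  <entry id="2"><title
      lang="en">Tag split across lines</title></entry>
  <entry id="3"><title><![CDATA[Ampersands & <angle> brackets]]></title></entry>
  <entry id="4"><title>Escaped &amp; entity</title></entry>
</feed>"""


def titles_by_regex(text):
    # Naive pattern: assumes <title> never carries attributes, never
    # spans lines, and that its content needs no decoding.
    return re.findall(r"<title>(.*?)</title>", text)


def titles_by_parser(text):
    # The parser handles attributes, line breaks, CDATA, and entities.
    root = ET.fromstring(text)
    return [t.text for t in root.iter("title")]


regex_titles = titles_by_regex(SAMPLE)
parser_titles = titles_by_parser(SAMPLE)
print(len(regex_titles), "titles via regex")
print(len(parser_titles), "titles via parser")
```

In this sample the regex misses the title whose start tag is split across lines and returns the CDATA wrapper and the `&amp;` entity undecoded, while the parser returns all four titles as plain text. Each of those cases can be patched with a more elaborate pattern, but that is exactly the prefilter-and-patch cycle Bray describes.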

"Maybe the problem I solved was different from Bray's, but I get the impression from reading his article that he hasn't tried all the different ways the Perl community has come up with to process XML."


About CTDATA

CTDATA Ventures (CTDATA) develops Internet and Intranet applications for corporations and nonprofit organizations. Our services include:

  • Consulting services for Movable Type and TypePad-based publishing systems (visit our Weblog Improvement website for more information),
  • Financial services business process consulting,
  • Content management system and knowledge management system consulting,
  • Apache web server engineering and hosting,
  • MySQL, Sybase, and Microsoft SQL Server architecture and development,
  • SOAP, REST, and XML-RPC system architecture and programming, including Amazon Web Services, and
  • Weblog publishing.
For more information, contact Dave Aiello by email at dave [at] daveaiello.com or call him at +1-267-352-4420.
Copyright © 1995-2010, CTDATA Ventures. All Rights Reserved.