We have spent a great deal of effort working with XML. Historically, our feelings towards XML were best described as love-hate.
We loved it because it was good for business and gave us lots of technical challenges. We hated it because it was hard: while we made progress on the reading side, the writing side remained elusive.
The power of XML is, of course, its ability to encode and model arbitrarily complex entities and the relationships between them. Its domain neutrality means that XML is used to encode information across virtually all industries. The difficulty in dealing with XML documents is that in many cases the XML data models are designed with no regard for the software engineer who may have to work with or use the data. Note that I am saying “work with” or “use” the data. Reading XML data is easy; using it can be difficult.
What has made XML hard for us is that we tried to treat XML like relational data, and it is of course not relational. Once we approach XML in a much more free-flowing way, it becomes far less hard.
Take metadata, for example. James Fee recently wrote about metadata and the challenge of making it accessible. When asked about metadata in the past, we often joked that “We have never met a data we haven’t liked!” while at the same time admitting that “We have never met a metadata that we knew what to do with!” The problem was that we thought of metadata as just another type of data. So far so good; it is data about data. But we then tried to make it fit into a relational workflow, which does not work. In fact, metadata varies significantly from site to site even within the same standard, whether FGDC or ISO. The conclusion was the same: as XML (in this case containing metadata) doesn’t fit into the relational paradigm at all, we shouldn’t be attacking it in that manner. Yet this is precisely what we had been doing!
Things we have learned:
- XML is not relational and trying to approach it in a relational manner is doomed to failure.
- The amount of effort required to get data into an XML schema is directly related to the complexity of the schema. The more complex the schema, the more effort is required to structure the data to fit it.
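To make the first lesson concrete, here is a minimal Python sketch (not FME, and not how we process XML internally). The metadata record, its element names, and both helper functions are invented for illustration. It contrasts squeezing a record into one flat "row" with keeping it as a tree:

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata record: repeating keywords and a nested contact.
RECORD = """
<metadata>
  <title>Forest Cover 2010</title>
  <keywords>
    <keyword>forestry</keyword>
    <keyword>land cover</keyword>
  </keywords>
  <contact role="owner">
    <name>A. Example</name>
    <phone>555-0100</phone>
    <phone>555-0101</phone>
  </contact>
</metadata>
"""


def flatten_to_row(root):
    """Relational-style view: one column per tag name, one value per column."""
    row = {}
    for elem in root.iter():
        text = (elem.text or "").strip()
        if text:
            # A repeated tag overwrites the earlier value, so data is lost.
            row[elem.tag] = text
    return row


def to_nested(elem):
    """Tree-style view: children stay under their parent, repeats become lists.

    Attributes are ignored here to keep the sketch short.
    """
    children = list(elem)
    if not children:
        return (elem.text or "").strip()
    node = {}
    for child in children:
        node.setdefault(child.tag, []).append(to_nested(child))
    return node


root = ET.fromstring(RECORD)
row = flatten_to_row(root)
nested = to_nested(root)

print(row)
print(nested)
```

The flat row keeps only one keyword and one phone number, and it no longer records that the phone numbers belong to the contact. The nested form keeps both, which is the point of approaching XML as a tree rather than as a table.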
What is really needed to process XML is an environment that makes it easy for people to build up complex objects in a gradual and understandable fashion. As luck would have it, Workbench fits this paradigm perfectly, but in the past it lacked the supporting transformers needed to build XML documents. Enter FME 2010 (the FME 2011 beta is even better) and the XML Templater, and now you can Get Smart with XML.
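The idea of building up a complex object gradually can be sketched outside of Workbench too. The following Python example is only an analogy and does not reproduce the XML Templater or any FME API. The element names and the three `build_*` helpers are hypothetical. Each helper produces one small fragment, and the final document is assembled from those fragments:

```python
import xml.etree.ElementTree as ET


def build_keywords(words):
    """Build the <keywords> fragment from a list of strings."""
    keywords = ET.Element("keywords")
    for word in words:
        ET.SubElement(keywords, "keyword").text = word
    return keywords


def build_contact(name, phones, role):
    """Build one <contact> fragment with any number of phone numbers."""
    contact = ET.Element("contact", {"role": role})
    ET.SubElement(contact, "name").text = name
    for phone in phones:
        ET.SubElement(contact, "phone").text = phone
    return contact


def build_record(title, keywords, contacts):
    """Assemble the full record from fragments that were built separately."""
    record = ET.Element("metadata")
    ET.SubElement(record, "title").text = title
    record.append(keywords)
    record.extend(contacts)
    return record


record = build_record(
    "Forest Cover 2010",
    build_keywords(["forestry", "land cover"]),
    [build_contact("A. Example", ["555-0100", "555-0101"], role="owner")],
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Because each fragment is built and checked on its own, a more complex schema means more small builders rather than one large, hard-to-follow routine.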
So how do YOU feel about working with XML now? Me? I’m using XML… (take me out Don Adams)
Don Murray
Don is the co-founder and President of Safe Software. Safe Software was originally founded doing work for the BC Government on a project sharing spatial data with the forestry industry. During that project Don and the other co-founder, Dale Lutz, realized the need for a data integration platform like FME. When Don’s not raving about how much he loves XML, you can find him working with the team at Safe to take the FME product to the next level. You will also find him on the road talking with customers and partners to learn more about what new FME features they’d like to see.
Thanks for the insight, and especially for the YouTube clip 😉
Referring to the statement above “1) XML is not relational and trying to approach it in a relational manner is doomed to failure.” … is very true, and also part of my struggle with XML.
Let me add this question:
Where are the database-vendors who provide databases which can deal with all the wonderful XML-schemas we get nowadays, especially those derived from wonderfully complex UML models …
Seems the paradigm shift is not behind us yet …
Thanks for your comment.
As I am sure you know, there are XML databases out there that do work directly with XML schemas. Check out http://en.wikipedia.org/wiki/XML_database for a description of the concepts. Even if we did find an XML database that could deal with all the wonderful XML schemas, the problem of how we transform data from the wonderful XML schema would still exist. Right now, most of our users use FME to transform data from one relational schema to another relational schema. Moving to XML schemas will definitely not render schema translation obsolete. I expect the opposite may be true.
Playing with your words a bit here, I would say that any wonderful XML schema will require a wonderful transformation script to work with it. At the end of the day, someone working with data has to understand it. There is no getting around that.
I think your points are interesting, but I would like to see your thoughts in action. Can you show an example related to metadata that shows the difference between the relational thought process and the XML thought process?
I would be happy to show you how we attack XML and metadata with the latest FME technology. I am returning from the UK, where I showed it as part of our FME UK User Meeting. To be honest, I have totally abandoned trying to make XML or metadata fit into a relational model, so I won’t be demoing that, but I would be happy to show you via a GoToMeeting what we are up to. Please send an email to email@example.com. I am always excited to show the progress that we have made and to discuss possible next steps and weaknesses. If anyone else out there would be interested in a GoToMeeting, please let me know via email too.