More and more data is becoming available all the time, and the geospatial community is no exception, with many recent open data announcements (e.g. Vancouver and Ordnance Survey). While at one time it seemed difficult to get access to “open” data, the challenge is increasingly becoming: how do I use it?
What Defines Open Data?
Some will argue (often from a conflict of interest) that for any data to be truly “open” it must be in an open standard such as GML or XML. However, the real success of “open” data is its usability and the number of folks who actually use it. The fact is that there are lots of de facto standards that already exist, such as ESRI Shape, MapInfo MIF/MID, and AutoCAD DXF, to name a few. To associate “format” with data “openness” misses the point.
Getting value out of data is not just about reading the data, but more about using the data. The key to using “open” data is transformation.
Transformation is all about taking “open” data sources and massaging or restructuring them so that they can be used directly or easily integrated with your corporate data to deliver maximum value to your GIS or other systems. With transformation you can get value from data more easily and more quickly.
Transformation is an important part of the INSPIRE initiative and I would argue that no SDI is complete without transformation. Next time I will talk about the importance of transformation in a web services / SDI context.
How important is transformation? I believe that it is impossible to fully leverage the power of “open” data without it. What do you think?
Don Murray
Don is the co-founder and President of Safe Software. Safe Software was originally founded doing work for the BC Government on a project sharing spatial data with the forestry industry. During that project Don and fellow co-founder Dale Lutz realized the need for a data integration platform like FME. When Don’s not raving about how much he loves XML, you can find him working with the team at Safe to take the FME product to the next level. You will also find him on the road talking with customers and partners to learn more about what new FME features they’d like to see.
For truly open data (data in an open standard format like XML), transformation is almost always required. This will continue to be the case for the foreseeable future because the non-open standard formats continue to be supported in software packages users want to exploit the data in. In addition, some of the non-open standard formats have benefits (e.g. compression capabilities) that may make it difficult to do away with them entirely. Until the software packages that are used most support open standard formats or become obsolete, transformation will continue to be crucial to exploiting open data.
INSPIRE is certainly a good subject in this regard.
Are there plans for an INSPIRE writer in FME?
Using the GML writer with INSPIRE’s schema files (XSD) is quite complicated today.
Given the complexity of GML, I think users have to be helped as much as possible to leverage the transformation possibilities of FME in a context such as INSPIRE.
Nice post, and especially relevant at the moment.
I think it’s very important that, to reveal the true value of the data, it is both (a) transformable and (b) extendable. It’s great to get data visualised and used in your GIS, but it also needs to be in a format that can be extended, or be semantically enrichable. An example of this might be census or geodemographic data.
It needs to be in a suitable format to be whacked into a GIS (e.g. shapefile), but also needs to be at a level of granularity to suit the data that you want to extend it with – such as your own statistics – at the same (or directly aggregatable) level of granularity.
I guess this is where transformation via ETL comes into its own – that you can not only change formats, but also set the data at the correct geographic level that you want to use it.
In most cases, I think transformation to your native GIS system will continue to be necessary for performance.
Perhaps some INSPIRE transformers?
Thanks for the comments. Amber’s comment reminds me of a study we did at Safe a few years ago to discern the most popular “transformation” done with our technology in the GIS industry. Before the study was undertaken we assumed that it must be CAD/GIS or loading data into spatial databases. Once the study was concluded we found that the most common transformation by far was Shape to Shape. The second was MapInfo TAB to TAB! The pattern was repeated over and over with other formats. This really drove home the point that even if data is available in the native format of your application, transformation is still required in order for it to be used!
With respect to writing to INSPIRE or other XML formats we have taken an entirely different approach with FME 2010. Check out the previous blog post that I wrote. The new approach is based on our XMLTemplater transformer. All you need is a sample XML document and you are good to go. Workbench is a great environment for assembling and building up these complex documents (contact me at email@example.com if you would like a demo or webinar). I would be happy to discuss this with anyone interested.
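To give a feel for the "start from a sample XML document" idea, here is a minimal sketch in plain Python. It is not FME's XMLTemplater and does not use any FME API; the element names, the placeholder names and the sample record are all made up for illustration. It simply fills a sample document with attribute values, escaping them so the result stays well-formed.

```python
from string import Template
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Hypothetical sample document with $placeholders where feature values go.
SAMPLE = Template(
    '<Feature id="$fid">'
    "<name>$name</name>"
    "<population>$population</population>"
    "</Feature>"
)

def render(record):
    """Fill the sample document with one record's values, XML-escaped."""
    safe = {
        key: escape(str(value), {'"': "&quot;"})
        for key, value in record.items()
    }
    return SAMPLE.substitute(safe)

# Hypothetical feature; the ampersand shows why escaping matters.
xml_text = render({"fid": "c1", "name": "Smith & Sons Creek", "population": 1200})

# Parsing the result confirms that the output is well-formed XML.
parsed = ET.fromstring(xml_text)
```

A real INSPIRE document is far larger and namespace-heavy, but the principle of building the output from a sample rather than from a schema is the same.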
If you are going to the INSPIRE conference in Poland this year be sure to stop by our booth or come to a session where the solution will be presented and discussed. I would of course be very interested in any transformers that folks can suggest that can help our users deliver INSPIRE solutions.
Thanks for your insights as well Stu. Transformation, as you allude to, is also needed to restructure data so that it can be combined, integrated, or enriched with other datasets. Before this can be done both datasets must be aligned in order for the spatial join to make sense. For example, if one dataset has county data and another dataset has state data, then the counties would have to be combined or dissolved to the state level before any meaningful integration could occur. At the end of the day it is all about making the data usable in the user’s application of choice. By the way, Stu – I have to say that I really enjoyed reading your blog post about Spatial Data Quality.
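As a rough illustration of the county-to-state example, here is a minimal Python sketch. The records, field names and numbers are invented for the example, and it only covers the attribute side of the problem: a real dissolve would also merge the county polygons into state polygons, which is left out here.

```python
from collections import defaultdict

# Hypothetical county-level records (made-up values).
counties = [
    {"state": "WA", "county": "King", "population": 100},
    {"state": "WA", "county": "Pierce", "population": 50},
    {"state": "OR", "county": "Lane", "population": 30},
]

# Hypothetical state-level dataset that we want to enrich.
states = [
    {"state": "WA", "hospitals": 4},
    {"state": "OR", "hospitals": 2},
]

def dissolve_to_state(county_rows):
    """Sum county populations per state so both datasets share one level."""
    totals = defaultdict(int)
    for row in county_rows:
        totals[row["state"]] += row["population"]
    return dict(totals)

def join_on_state(state_rows, totals):
    """Attach the aggregated population to each state record."""
    return [{**row, "population": totals.get(row["state"])} for row in state_rows]

state_population = dissolve_to_state(counties)
enriched = join_on_state(states, state_population)
```

Only after the counties have been rolled up does the join on `state` make sense; joining the raw county rows to the state rows would repeat each state's values once per county.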
Great post. I read these every week. Keep up the good work.
I think what is important is how people can consume data. The internal formats and transformation are interesting and, in the end, making data useful will be the key to having it “open”. My take… Thanks, Oleg
I agree. In the end it is giving people the ability to consume the data (i.e. use it). “Open” data is important and the fewer formats there are, the better. Transformation is the catalyst that moves data from the “readable” state to the “consumable” or “usable” state.
This is a great example of how FME Server and OpenData make a great team:
Find out the impacts on the environment when something nasty gets spilled. Select a season. Then, click on the map where the spill occurred (within the city boundary). The map will show you where your spill went, and, depending on the season, how long it took to travel to the ocean. Click the “Drains On” button to show the underlying drainage network.
Thanks for posting this site here. It is a great example showing the importance of transformation to deliver useful information to folks, enabling them to leverage the visualization/spatial application of their choice. Also great to see this use in the home city of Safe Software. I wasn’t aware of this application and thanks for the “Powered by FME Server” logo on the page too!
[…] In Part 1, I wrote about how transformation is the key to unlocking open data and that format alone is not sufficient for data sharing. When it comes to SDIs (Spatial Data Infrastructure), the story is no different. Transformation is critical to fully comply with the requirements of an SDI (including INSPIRE). […]
[…] today than it was even 5 years ago. Governments of all kinds are moving to make their data available freely to citizens. Remote sensing imagery is becoming ever more current and inexpensive. Highly […]