We’ve covered content modeling, authoring, and content management a bit in this series of posts. The next step is doing something with all that shiny XML. Keeping it cooped up in a database of some sort isn’t going to pay the bills.
Naturally, and as usual, I can’t tell you how to do anything. I don’t know what you’re making. Maybe you’re making something that’s going to compete with something I’m making. Awk-ward! But there are some general pathways that we can go over.
You can transform your XML into another flavor of XML. Maybe you have an existing XML-based system that would very much like to ingest your data. You can transform XML to another kind of markup, like HTML or whatever the kids are using these days. You can wrestle it into all kinds of proprietary input formats for various applications. If your equipment is fancy enough, you can even create PDF files.
So about that fancy equipment. You need something that can manipulate the XML, because doing it by hand is simply crazy and nobody would ever consider doing it that way. Forget I even said that.
One step above retyping everything with new tags (which is laughable and totally not worth doing) is using a text editor of some sort to execute find-and-replace functions to transform the XML. Maybe your XML has lowercase tag names, and the target content model has capitalized tag names. If there are no major structural differences, maybe a simple find-and-replace would be adequate. A slight upgrade to this tactic would be to use something that allows for grep searches. This dramatically expands the level of complexity you can have in your searches. Instead of “find all the lowercase n’s” you can do “find all the lowercase n’s that appear after < at the beginning of a line and are not followed by o, l, d, q, or z”. Because you’re going to need to do that at some point.
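For the curious, here is a minimal sketch of both tricks in Python, using only the standard `re` module. The sample markup and the function names are made up for illustration, and it assumes tidy XML with one element per line and no comments or CDATA sections waiting to trip up the pattern.

```python
import re

# Hypothetical sample: lowercase tag names, one element per line.
SAMPLE = "<note>\n<name>Ada</name>\n<number>7</number>\n</note>\n"

# "< at the beginning of a line, then a lowercase n,
# not followed by o, l, d, q, or z"
N_PATTERN = re.compile(r"^<n(?![oldqz])", re.MULTILINE)


def matching_lines(text):
    """Return the lines that start with the pattern above."""
    return [line for line in text.splitlines() if N_PATTERN.search(line)]


def capitalize_tags(text):
    """Uppercase the first letter of every opening and closing tag name.

    This is plain text substitution. It knows nothing about XML, so a
    stray "<x" inside a comment or CDATA section would be changed too.
    """
    return re.sub(
        r"<(/?)([a-z])",
        lambda m: "<" + m.group(1) + m.group(2).upper(),
        text,
    )


if __name__ == "__main__":
    print(matching_lines(SAMPLE))
    print(capitalize_tags(SAMPLE))
```

The negative lookahead `(?![oldqz])` is the part a plain find-and-replace can't do: it checks what comes next without including it in the match.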
This would be a very cheap implementation, but it’s kind of like the MacGyver duct tape method of problem solving. You’ll look really cool, and it will probably work fairly well, but certainly with a little more time and planning you can come up with a more elegant solution.
There are many free or cheap XSLT engines available online. Just search for “XSLT engine” and behold the bounty. The advantage with these is, um, they’re specifically designed to manipulate XML data. So you get nice things like a real parser that refuses malformed input and, in some engines, validation against a schema. These can also be plugged into a workflow of some sort to allow for automation.
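To give a feel for what “designed to manipulate XML” buys you, here is a rough sketch. It is not XSLT; it swaps in Python’s standard `xml.etree.ElementTree` module so it runs without installing anything, but the idea is the same: parse the XML into a tree, then build the output from the tree instead of poking at text. The `recipe`, `title`, and `step` elements and the `recipe_to_html` function are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document; the element names are made up.
SOURCE = (
    "<recipe><title>Toast</title>"
    "<step>Toast the bread.</step><step>Eat it.</step></recipe>"
)


def recipe_to_html(xml_text):
    """Turn the made-up recipe XML into a small HTML fragment."""
    # fromstring raises ET.ParseError if the input is not well-formed,
    # which is a safety net find-and-replace never gives you.
    root = ET.fromstring(xml_text)

    div = ET.Element("div", {"class": "recipe"})
    heading = ET.SubElement(div, "h1")
    heading.text = root.findtext("title", default="")

    steps = ET.SubElement(div, "ol")
    for step in root.findall("step"):
        item = ET.SubElement(steps, "li")
        item.text = step.text

    return ET.tostring(div, encoding="unicode")


if __name__ == "__main__":
    print(recipe_to_html(SOURCE))
```

Feed it something broken, like a missing closing tag, and it complains immediately instead of quietly producing garbage. An actual XSLT engine gives you the same kind of safety net, plus a stylesheet language built for exactly this job.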
There are also fancy commercial content engines available. These are designed to be used by companies that have lots of data to transform, and are often plugged into content management systems. There will be consultants involved. They might have one of the free engines as part of their chewy core (the engine, not the consultant), but with clusters of nougat and whatnot bolted on for your transformation pleasure. I guess read the brochure and ask the consultants why their product is any better than the free ones. See how many times they say “seamless” and “robust”. Remind them you are not buying gutters or coffee beans.
I’ll leave you with this anthem to transformation. Enjoy. Sorry.