XML Authoring Tools, part 2

Authentic Altova

Available in both desktop and browser versions, Authentic is a free authoring tool. XML documents are created and edited in e-forms via a word processor-style interface, based on structured stylesheet designs.

Authoring Experience

Authentic users edit XML in a WYSIWYG word processor-style interface. Content is rendered in e-forms based on stylesheet designs created in Altova StyleVision. E-forms are designed to enable users to view content sequence and structure as it will appear in the target delivery format(s). The same stylesheets can be used to instantly render the content as HTML, RTF, PDF, and Microsoft Word 2007 (OOXML). E-form presentation is dynamic, based on user input. Authors can input graphics, images, and hyperlinks as well as text. Windows for project management, messages, entry helpers, and additional information provide users with guidance during the authoring process.

Integration Bits

Desktop version supports project management, batch operations, and version control systems. Scripting (JavaScript, VBScript) is available for forms, event handlers, and macros. Can be embedded in other applications as an ActiveX control or widget.

System Requirements

Desktop version requires Microsoft Windows (2000, XP, Vista). Browser version requires a plug-in for Internet Explorer.


I’m not sure how math content could be authored without an equation editor component. Perhaps in MathType, exported to MathML and pasted into Authentic?

If it were a Ben and Jerry’s ice cream flavor it would be…

Cherry Garcia

Open Office OASIS

OpenOffice is free and open, and works on Mac and PC. OpenDocument Text format (ODT) files would need to be converted to/from your DTD through a customization of the XML Filter command.

Authoring Experience

Authoring would happen in the Writer word processing application. Templates would be created with styles corresponding to DTD tags. XSL stylesheets would be created for your desired outputs. Styles are then transformed to XML tags on export via the XML Filter command. This workflow is supported out of the box for DocBook.

Integration Bits

Would require 2 XSLTs to translate between your DTD and ODT (1 for import, 1 for export). Can transform to/from Word 2003 and DocBook out of the box. Some web applications e.g. Google Docs support ODT.
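To make the style-to-tag idea a little more concrete, here is a minimal Python sketch. It is not the XML Filter itself (that is configured with XSLT), just an illustration of the export direction. It assumes a fragment shaped like an ODT content.xml, where each paragraph carries a text:style-name attribute. The style names, the STYLE_TO_TAG table, and the target tags (chapter, title, para) are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace used by OpenDocument text elements and attributes.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Hypothetical mapping from Writer paragraph styles to tags in your DTD.
STYLE_TO_TAG = {"ChapterTitle": "title", "BodyText": "para"}

# A hand-written fragment standing in for part of an ODT content.xml.
SAMPLE = f"""
<body xmlns:text="{TEXT_NS}">
  <text:p text:style-name="ChapterTitle">Hybrid Workflows</text:p>
  <text:p text:style-name="BodyText">Author once, publish everywhere.</text:p>
</body>
""".strip()

def styles_to_tags(odt_fragment: str) -> ET.Element:
    """Build a <chapter> whose children are named after paragraph styles."""
    source = ET.fromstring(odt_fragment)
    chapter = ET.Element("chapter")
    for p in source.iter(f"{{{TEXT_NS}}}p"):
        style = p.get(f"{{{TEXT_NS}}}style-name")
        # Unmapped styles fall back to a generic paragraph tag.
        child = ET.SubElement(chapter, STYLE_TO_TAG.get(style, "para"))
        child.text = "".join(p.itertext())
    return chapter

result = styles_to_tags(SAMPLE)
print(ET.tostring(result, encoding="unicode"))
```

The import direction would be the mirror image: read your tags and write paragraphs with the matching style names.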

Technical Specifications

XML import is based on the SAX API.
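For anyone who hasn't met SAX: it is an event-driven way of reading XML, where the parser streams through the file and calls your code for each tag instead of building the whole tree in memory. The sketch below uses Python's standard xml.sax module purely to show the style of the API. It says nothing about how OpenOffice implements its import, and the TagCounter class and sample document are invented for the example.

```python
import xml.sax

class TagCounter(xml.sax.ContentHandler):
    """Counts element names as the parser streams through the document."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called once for every opening tag, in document order.
        self.counts[name] = self.counts.get(name, 0) + 1

SAMPLE = b"<book><chapter><para>One</para><para>Two</para></chapter></book>"

handler = TagCounter()
xml.sax.parseString(SAMPLE, handler)
print(handler.counts)
```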

System Requirements

Microsoft Windows: Windows 2000 (Service Pack 2 or higher), Windows XP, Windows 2003, Windows Vista, 256 Mbytes RAM (512 MB RAM recommended), At least 650 Mbytes available disk space for a default install
Mac OS X: Mac OS X 10.4 (Tiger) or higher, Intel Processor, 512 Mbytes RAM, 400 Mbytes available disk space
All platforms: Java runtime environment 1.4.0_02 / 1.4.1_01 or newer, Java Access Bridge.


The provided formula editor is pretty rudimentary. It offers no way to adjust formatting of equations.

If it were a Ben and Jerry’s ice cream flavor it would be…

French Vanilla

<oXygen/> XML Author SyncroSoft

XML Author is a cross-platform, simplified version of the Oxygen XML Editor, which includes all the authoring, database access, and publishing features of the larger application.

Authoring Experience

The WYSIWYG word processing interface uses CSS stylesheets for presenting XML to authors. The application window is highly customizable, with detachable, floating, rearrangeable toolbars and menus. An XML Preview Window can show the results of processing documents with XSLT. Other efficiency features include intelligent (XML Schema/DTD aware) content completion and configurable XML syntax coloring for elements and attributes. Users can edit, validate, and transform Office Open XML (OOXML) and Open Document Format (ODF) data.

Integration Bits

Access to document repositories can be made through WebDAV, FTP, SFTP. <oXygen/> supports access to native XML databases (MarkLogic, etc.) and relational databases (Oracle, IBM DB2). For accessing a content management system, the editor can be extended by writing a Java URL protocol handler.

System Requirements

Minimum hardware configuration is PowerPC G4 class system with 256 MB of RAM (512 MB recommended) and 300 MB free disk space.
An official and stable Java VM version 1.5 or later from Sun Microsystems
Macintosh: Mac OS X 10.4 or later
PC: Microsoft Windows application (NT 4.0, 2000, XP, 2003 Server).


Default templates are available for authoring MathML.
A 30-day demo version of Oxygen 10 is available.

If it were a Ben and Jerry’s ice cream flavor it would be…

Phish Food


How to Cheat at Tic-Tac-Toe With XML

I’ve spent a lot of time lately looking for and at XML authoring tools. Usually, it’s a pretty dry exercise, but I just came across one that made me laugh.

Xopus is an XML authoring tool that operates in the browser (IE 6 or 7 and Firefox 2 or 3). To illustrate the tool’s capabilities (and their sense of humor) the developers have a demo on their website in which you can author an XML document in such a way that it feels like you are playing tic-tac-toe against a computer. It’s awesome.


Here’s how it works. The developers wrote a schema that essentially describes the valid states of a tic-tac-toe game. When you click in a square, Xopus validates the document, and finding it “invalid”, responds (via JavaScript) by putting an X in another square. And so it goes until the game is over. There’s an XSL to render the game in the browser. It’s pretty neat. And the best part: if you find yourself about to lose a game of tic-tac-toe to an XSD schema (how embarrassing!) you can cheat and Undo. I don’t know yet how useful Xopus will be to me for real work, but this is easily the most fun application of an XML authoring tool I’ve seen.
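If you want to picture the validate-then-respond loop, here is a toy version in Python. To be clear, I have not seen the Xopus schema or script. The "validity" rule (every O must be matched by an X), the reply strategy (take the first empty cell), and the game/cell element names are all invented here just to illustrate the idea of a document that is only valid once the computer has moved.

```python
import xml.etree.ElementTree as ET

def is_valid(game: ET.Element) -> bool:
    """Stand-in for schema validation: every O must have been answered by an X."""
    marks = [cell.text or "" for cell in game.findall("cell")]
    return len(marks) == 9 and marks.count("X") == marks.count("O")

def respond(game: ET.Element) -> None:
    """If the document is 'invalid', repair it by placing an X in the first empty cell."""
    if is_valid(game):
        return
    for cell in game.findall("cell"):
        if not cell.text:
            cell.text = "X"
            return

# An empty board: nine <cell/> elements.
game = ET.fromstring("<game>" + "<cell/>" * 9 + "</game>")

game[4].text = "O"   # the author "edits the document" by taking the center square
respond(game)        # the reply makes the document valid again
board = [cell.text or "-" for cell in game]
print(board)
```

A real opponent would obviously pick its square with more care; the point is only that "playing" is just editing and re-validating a document.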

Test Drive A Hybrid Workflow

You never know where inspiration will come from. Yesterday it came to me in the form of a traffic jam slowing my ride to work. Sitting on the commuter bus, mired in “bumpa-ta-bumpa”, I stared out the window. Life on pause. I imagined the lake of gasoline that was fueling all these cars, puffing out their tailpipes, melting Greenland. Then a shiny red Toyota Prius rolled by. It had a good vibe, like a forward-thinking, best alternative in an otherwise unworkable situation. A hybrid.

This was definitely a sign from the cosmos that it was time to talk about the hybrid XML-InCopy workflow I’d been playing with since last year. I had planned to finish off the cookbook project tonight, but this idea overtook it in my brain. I offer it up now in case any of you out there is stuck in a traffic jam of a publishing workflow, with a line of products and file formats in each other’s way, slowly crawling ahead while the clock ticks ticks ticks.

The point of a hybrid workflow is to combine the virtues of XML and InCopy to give you speed and efficiency that is otherwise impossible. Single-source authoring means multiple print and/or Web products derive and arrive simultaneously from the same set of keystrokes. You author and edit in XML, transform when necessary, and use InCopy to preview your print layouts, where space is finite and styling matters, as you go.

I know there are people out there who have written amazing scripts, or developed plug-ins or workflow systems that can accomplish what I’m going to show you better, faster, with more goodies. But as always, I write about what you can do with off-the-shelf tools and garden variety skills. Or if you don’t currently possess those skills, you can come by them without completely re-wiring your brain. Perhaps someday Adobe will release the equivalent of an electric car, XML authoring as part of the Creative Suite, and we’ll all be merrily speeding down the Cross-Media Expressway. What follows is my idea of how to do it today. And it works. So hop in the hybrid and take it for a spin.

The Pitch

With a W3C XML Schema as the foundation of your workflow, you can develop multiple print and online products simultaneously, and achieve efficiency and savings through content re-use with off-the-shelf tools.

The Tools

To do this you’re going to need InDesign CS3, InCopy CS3, and the Altova MissionKit For XML Developers. The MissionKit is an XML Developer’s equivalent of the Adobe Creative Suite. It’s three applications that work in concert for the creation and transformation of XML files: XMLSpy, StyleVision, and MapForce. This is a Windows-only package. There is no Mac version, so if you kneel at the altar of Jobs like I do, you need virtualization software like Parallels. Syncrosoft’s oXygen is an alternative that runs on the Mac, but only if you don’t need a lot of help writing XSLT. I do. MapForce gives you graphical creation of XSLT, or as I call it, XSLTW (XSL with Training Wheels). I’m not trying to do a commercial for Altova, but their stuff is the only stuff that I know works for everything we’re trying to do.

Step 1: Planning

This is the big one. Map out all your content. Depending on the complexity of your content, this can be a tough job, so you only want to do it once. Spend enough time to get it right, since everything flows downhill from here. Every screw-up or oversight at this stage will echo throughout the workflow in some combination of time, aggravation, or cost. And everyone needs to know that once this is done, there’s no changing the structure, at least not for anyone who wishes to remain with the company.

Your map should show every piece and where it fits into the overall scheme of your project. Map every recombination, and every dependency. Leave no stone unturned. This part can be a real eye opener. If you survive with your sanity intact, you will understand your content better than ever before and maybe discover new ways of using it.

Reverse engineer your own content. Cut up books and move the pieces around as they would move in the digital realm. Follow the life of a lowly paragraph as it appears throughout your product line. Once you grasp the details, you can answer the first key question. What kind of schema best suits your needs: a custom built-from-scratch schema or a generic format? Do you have the time and money to make the former? Do you have the flexibility for the latter? A bad fit might cost you more in the long run. Investigate DocBook and DITA. If you go generic, skip to step 3.

Step 2: Build the Schema File

With the understanding that you gained by mapping your content, you can now build an XML Schema that will guide your authoring, transformation, and output. Why a Schema? Why not a DTD? I have nothing against DTDs. In fact, they are more appropriate for describing book-like things. Schema excel at describing data more than documents. In fact, I love DTDs so much, the other day on the highway I was passed by someone with the license plate 736 DTD, and I thought “hey, that’s cool.” Then I felt the urge to slap myself for being such a geek.

I say use Schema purely because the Altova tools support Schema in ways that they don’t support DTDs. Namely, you can graphically create a Schema in XMLSpy. I feel a little hypocritical because this is the same tool-based thinking I dissed in a previous post. But facts is facts, and until I find another tool that can do this workflow end-to-end with a DTD, I’m sticking to my story. Actually, if you must have a DTD, there is a workaround: build a Schema, then use XMLSpy to convert the Schema to a DTD.

Step 3: Create Authoring Templates

Using StyleVision you take your Schema and apply styling to it to make a user-friendly authoring template. This is something oXygen can do too. You choose from CSS properties to apply fonts, spacing, and position to your elements. You can make pop-up menus for standardizing choices, and clickable links to insert required elements.

Step 4: Develop Layout Templates

Import sample XML files into InDesign, then structure and style them. Set up the mapping between styles and tags. Make use of the Story Editor to be sure your tagging remains intact and whitespace characters are where they belong.

Step 5: Distribute Authoring Template

Let the writers have at it.

Step 6: Import XML Files into InDesign

And when you do, be sure to maintain the live link, so the XML file appears in the Links panel.

Step 7: Export to InCopy

BUT tell everyone that the InCopy files are untouchable! Hide them. Instead, InCopy users drop the InDesign file onto InCopy to open it directly.

Step 8: Editors Do the InCopy Two Step

Check out the appropriate stories from the InDesign layout. Show the Links panel to see the XML file. Edit in the XML file, save it. Go back to the Links panel and update the link to the XML file. Magic! You have your cake (XML) and eat (publish) it too.

The fact that this works at all is a complete accident–the unintended consequence of 3 InCopy capabilities: access to the Links panel (intended for the management of placed images), the ability to use an InDesign layout for preview (so one story can be simultaneously linked to both an XML file and a .incx file), and the ability to maintain a live link to text files (meant for Word and spreadsheets). Sometimes things just fall into place.

Editors can do some work, like styling, and working with boilerplate (untagged) content in the layout file. But they must understand the fundamental truth that anything they do between the tags in the layout will be wiped out the next time the XML file is saved. Stuff outside the tags, in whitespace elements, remains.

At the end of the day, when all is said and done, you still have intact XML files, with the most up-to-date content, ready to be flowed into whatever template or media you need.

Bonus Points

At any point in this workflow you can use MapForce to create XSLT to transform your content, making it fit another purpose. You don’t need automated workflow systems or scripts to make that transformation happen now that InDesign supports XSLT. Examples of what you can do with XSLT: Making HTML for Web presentation, making PDF, making alternative print products by gathering or sorting content according to attributes, making NIMAS files.
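Here is what "gathering or sorting content according to attributes" amounts to, sketched in Python rather than XSLT. In a stylesheet you would typically do this with a select predicate and xsl:sort; MapForce would generate that for you. The recipes/recipe elements and the course and level attributes are hypothetical, chosen only to keep the example short.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<recipes>
  <recipe course="dessert" level="2"><title>Flan</title></recipe>
  <recipe course="main" level="1"><title>Chili</title></recipe>
  <recipe course="dessert" level="1"><title>Brownies</title></recipe>
</recipes>
""".strip()

def gather(source: ET.Element, course: str) -> ET.Element:
    """Return a new document holding only one course, sorted by level."""
    picked = [r for r in source.findall("recipe") if r.get("course") == course]
    picked.sort(key=lambda r: int(r.get("level", "0")))
    product = ET.Element("recipes", {"course": course})
    product.extend(picked)
    return product

desserts = gather(ET.fromstring(SAMPLE), "dessert")
titles = [r.findtext("title") for r in desserts]
print(titles)
```

The same source file could feed a "mains only" product or a difficulty-ordered index just by changing the arguments, which is the whole appeal of tagging content with attributes in the first place.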

Math Doesn’t Add Up

All this is great, but it will not work for you if you need MathML. The only ways to get MathML in and out of InDesign involve scripted solutions, or customized versions of 3rd party plug-ins like MathMagic. I hope that some day InMath, which has always been my favorite equation editor for InDesign, will add MathML support. Design Science’s MathType speaks fluent MathML, and you can place those equations (in EPS format) into an InDesign layout. PowerMath also exists for InDesign but I haven’t tried it out. Note to self: I should do a future post comparing all the different ways to do math in InDesign.

Next Steps

My next project (if I ever stop spending all my free time blogging) is to experiment with Office Open XML. Since the new version of Office has XML underlying every file format, why not exploit that, and author in Word, transform OOXML to your Schema, then import into InDesign, Web, etc.? It should work like a charm, and it’ll probably have authors and editors breathing a sigh of joyful relief that the XML authoring tool they have to use is Word. It may not be the electric car, but it’s pretty close to one of those that runs on old french fry oil. Mmmm, I could go for some fries right now.
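I haven't built this yet, so treat the following as a back-of-the-napkin sketch of the idea rather than a tested workflow. A .docx file is a ZIP package whose main text lives in word/document.xml, with each paragraph's style recorded in a w:pStyle element. The Python below reads that part and retags paragraphs by style. The STYLE_TO_TAG table and the section/title/para target tags are hypothetical, and the tiny in-memory package is a stand-in for a real Word file, which has many more parts.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical mapping from Word paragraph style IDs to tags in your Schema.
STYLE_TO_TAG = {"Heading1": "title", "Normal": "para"}

def docx_to_xml(docx_bytes: bytes) -> ET.Element:
    """Read word/document.xml from a .docx package and retag its paragraphs."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        body = ET.fromstring(package.read("word/document.xml"))
    section = ET.Element("section")
    for p in body.iter(f"{{{W_NS}}}p"):
        style = p.find(f"{{{W_NS}}}pPr/{{{W_NS}}}pStyle")
        # Paragraphs without an explicit style are treated as "Normal".
        style_id = style.get(f"{{{W_NS}}}val") if style is not None else "Normal"
        child = ET.SubElement(section, STYLE_TO_TAG.get(style_id, "para"))
        # Paragraph text lives in w:t elements inside runs.
        child.text = "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
    return section

# Build a tiny stand-in package in memory (a real .docx has more parts).
DOCUMENT = f"""<w:document xmlns:w="{W_NS}"><w:body>
<w:p><w:pPr><w:pStyle w:val="Heading1"/></w:pPr><w:r><w:t>Fries</w:t></w:r></w:p>
<w:p><w:r><w:t>Old oil, new fuel.</w:t></w:r></w:p>
</w:body></w:document>"""
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("word/document.xml", DOCUMENT)

section = docx_to_xml(buffer.getvalue())
print(ET.tostring(section, encoding="unicode"))
```

In the real workflow this step would more likely be an XSLT, but the shape of the job is the same: styles in, tags out.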

XML is like…Cousin Oliver

Today’s topic is everyone’s favorite publishing tech du jour, XML. I am not a true XML geek by nature. I don’t know much about its uses in programming, data interchange, etc. I look somewhat dumbfounded at the XML vs. JSON flamewars. Coding is not fun for me. It’s a chore, less odious than cleaning out a cat’s litterbox, but more tedious than emptying the dishwasher. I think of it like changing the oil in the car–you can save a few bucks doing it yourself, but if you do it wrong, things will get messy in a hurry. I try not to fall in love with any software, process, or technology. They’re all just means to an end. Tool-based thinking gives me the creeps. It’s the medically-proven first step to becoming a zombie. I am just interested in XML for the solutions it offers publishers. So even though I went to the XML Conference last December and enjoyed it, I was probably the least XMLish person there.

So along with that disclaimer, if you want truly informed XML talk and tools, check out these:

If you’re barely sure what XML stands for, start with the W3 Schools’ XML tutorial

Then if you’re still interested, check out some of these:


Cafe con leche

IBM’s developerworks



XML is a lot of things to a lot of people. For fun, Google the phrase, “XML is like” Here’s a sampling of the results you’ll get:

“XML is like a drug: when you think it’s solved all your problems, you’re using it too much.”

“XML is like HTML with the training wheels off.”

“XML is like sex, even when it’s bad it’s still pretty good.”

“XML is like cardboard. It is a very useful packing material…”

“XML is like lye. It is very useful, but humans shouldn’t touch it.”

“XML is like a fat tick after a good meal…bloated.”

“XML is like a set of Russian dolls where text can be nested at each level.”

“XML is like teenage sex. Everybody talks about it, and thinks everybody else is doing it, when they’re really not.”

“XML is like a carcinogen. We don’t notice it’s there, but we’re still getting exposed to it.”

And probably the most famous and oft-repeated:

“XML is like violence: if it doesn’t solve your problem, you aren’t using enough of it.”

Now, how could you NOT be fascinated with something that is simultaneously compared to sex, drugs, violence…and cardboard?

To me, it seems like XML is to the ’00s as PDF was to the ’90s: a technology that eventually all publishers will be using. Both are charmingly flawed, but will end up being adopted anyway. So XML is like Cousin Oliver in the Brady Bunch.

XML and PDF. One’s about structure, semantics, and meaning while the other’s about portability, predictability, and drop shadows. Adobe’s working on an interesting mash-up of the two, called Mars. Think of it as a digital Reese’s, with PDF as chocolate, and XML as peanut butter. I’ll do a proper post on Mars in the near future.

As was the case with PDF, I think XML’s invasion of publishing tech is inevitable. There are just too many benefits for us not to end up tagging everything. Someday we’ll shake our heads in disbelief that we ever worked with un-tagged content. So primitive. Is XML the perfect means for tagging content? No, it can be clumsy and verbose (but so can I). It might be fair to paraphrase Churchill, and say that XML is the worst tagging system, “except for all the others that have been tried from time to time.” We just need the other pieces of our publishing systems to evolve so we can reap the benefits of all those tags, while keeping the XML mostly invisible. Or, to risk infamy and quote Churchill twice in one paragraph, “Give us the tools and we will finish the job.” Of course, he was talking about things that explode.

Since I’m in a quoting mood, I am reminded of a conversation I had last year at the InDesign Conference. I took an XML class and the instructor made a good point when I told him I was struggling to learn XSLT. He said, “You make PDFs all the time, but not by hand coding, right? So why would you want to hand code XSL?” He wasn’t trying to discourage me from learning, but rather to realize that there are and will be more powerful tools to make a publishing XML workflow really fly. Specifically, he showed me MapForce by Altova to graphically create XSLT. Hand coding is a great and essential skill for some, but the easier it is to use any technology the faster it can really take off.

With the release of Creative Suite 3, InDesign took some significant steps forward with support for XSLT, XML rules, Text Variables, and other new features. Those of us who have been exploring the use of XML in InDesign got a nice Christmas gift last December, with the publication of A Designer’s Guide to Adobe InDesign and XML by James J. Maivald and Cathy Palmer. If you want to understand the sometimes strange relationship between XML and InDesign, get your hands on this book. Or at least head over to the authors’ site, cookingwithxml.com. Adobe’s official documentation makes a nice side dish, especially the technical reference found here. But Maivald’s book is the main course. It is by far the most comprehensive and well-written instruction on this topic I’ve seen. I love the tone of the writing, which is friendly and informal, but not too silly (something I strive for myself). The authors know that when it comes to XML in InDesign, there’s a thin line between elegant beauty and unfixable junk. Sometimes, the difference is a single keystroke.

I’ll probably also buy Simon St. Laurent’s XML, Meet InDesign from O’Reilly soon, because a) I like the title, and b) I like the author’s hat.

If you are new to InDesign and/or XML, and do decide to use the Maivald book, here’s a tip: read every word. I’m not kidding. Every word of every sentence, in order. And pay attention. Have bright lights on in the room. And that Handel concerto playing softly in the background. And it helps to be slightly caffeinated. Seriously, if you are new to this, do not skip around. There can’t be any gaps in your understanding or execution of XML in InDesign, or it just won’t work. Believe me, I’ve been tinkering with this since InDesign 2.0. I’ve made every mistake there is. Several times.

Given all that, is the fragility of an InDesign XML-based layout a deal breaker? If it’s that hard to get right, and that easy to break, what good is it? How can it work without automated workflows and/or an army of scripters? Is there really room for human hands in an XML workflow? Good question. I think it points to the fact that right now highly-designed, “hand crafted” layouts and XML are an uneasy marriage. Think Arthur Miller-Marilyn Monroe. Or maybe Stephen Hawking-Paris Hilton.

But in 10 years, whether we’re using InDesign or something else, I think all the coding will be hidden under a GUI coating and we’ll have self-healing layouts that are impervious to the whims of designers and editors. Kids will be making XSLT by dragging their creations around onscreen and molding them like digital Play-Doh. And they won’t have a clue what XSLT stands for, nor should they. For the grown-ups maybe it will be something more like TurboTax, where the software interviews you, asking what you want to do with your content, and all the calculations are handled in the background. Let’s just hope the IRS doesn’t introduce a tax on tags any time soon. Actually, maybe a tag tax would help pay for things in Iraq.

Next time, I’ll share a little project of mine, taking old content and making it new again via XML. And as promised, here are a few tidbits so you don’t go hungry:

1. For anyone interested in a behind-the-scenes look into the world of educational publishing, check out The Muddle Machine: Confessions of a Textbook Editor by Tamim Ansary. It’s more than 3 years old, but still a fantastic read. And the accompanying graphic alone is worth the trip. It also reinforces my Six Degrees of Star Wars theory, as edutopia.org is a project of the George Lucas Educational Foundation.

2. Again with the educational angle, here’s Bill Gates proclaiming textbooks are dead. He’s right, or at least he would be if we could all afford the hardware. At my kids’ schools, we get hit with fundraisers and teacher requests for classroom supplies from the first day. My son is using a spelling book from 1994. I don’t think a tablet PC will be coming his way any time soon. I’m sure Bill’s used to a slightly different set of circumstances.

3. Here’s one for the true prepress geeks. A challenge by Sandee Cohen, for someone to demonstrate why in this day and age you would ever want to use a Photoshop EPS file in lieu of PSD, TIFF, etc. I love questions like this that make us prove ourselves and debunk the myths.

4. Lastly, here is a guy who really, really likes bookmaps.