Lunchtime Links

Macworld has an article on Twitter for Mac Creatives. I found the comments interesting too. There’s definitely a line in the sand between those who “get” Twitter and those who don’t. As with anything, YMMV, but I think those who complain about the lousy signal-to-noise ratio just aren’t following the right people. Sure, there’s a lot of “I like bacon” tweets. But there’s a ton of good stuff too. Don’t throw the baby out with the TweetWater.

Following up on Eric’s Why XML? post, here’s a link to the slides from O’Reilly’s Start With XML one-day conference that was held in NYC on January 13th.

Speaking of O’Reilly, the annual Tools of Change conference is coming up: Feb 9-11, also in NYC. MarkLogic, Adobe, <oXygen/> and Cory Doctorow of BoingBoing. I’ll take it, with an everything bagel, to go. Of course, if you can’t make it, you can follow the conference on Twitter. SeewhatImean?

While you’re at it, you can follow me. I promise not to tweet about bacon. I don’t even eat the stuff.

MarkLogic announced the MarkLogic Toolkit for Word, which is a free open source means for developers to build applications directly bridging Microsoft Office Word 2007 and MarkLogic. Makes you go <hmmm/>.

Last but not least, GeoEye has some photos of yesterday’s inauguration, taken by Google’s new satellite. See how SkyNet, er, Google views 1.something million people: as a bunch of ants. That reminds me, I should put Judgment Day in my iCal. Thanks for the tip goes to John Nack at Adobe, whose Photoshop blog is required reading for everyone in my household, even the cats (how do you think they put those lightsabers in their paws?). Actually, they use FriskiesPix.

Test Drive A Hybrid Workflow

You never know where inspiration will come from. Yesterday it came to me in the form of a traffic jam slowing my ride to work. Sitting on the commuter bus, mired in “bumpa-ta-bumpa”, I stared out the window. Life on pause. I imagined the lake of gasoline that was fueling all these cars, puffing out their tailpipes, melting Greenland. Then a shiny red Toyta Prius rolled by. It had a good vibe, like a forward-thinking, best alternative in an otherwise unworkable situation. A hybrid.

This was definitely a sign from the cosmos that it was time to talk about the hybrid XML-InCopy workflow I’d been playing with since last year. I had planned to finish off the cookbook project tonight, but this idea overtook it in my brain. I offer it up now in case any of you out there is stuck in a traffic jam of a publishing workflow, with a line of products and file formats in each other’s way, slowly crawling ahead while the clock ticks ticks ticks.

The point of a hybrid workflow is to combine the virtues of XML and InCopy to give you speed and efficiency that is otherwise impossible. Single-source authoring means multiple print and/or Web products derive and arrive simultaneously from the same set of keystrokes. You author and edit in XML, transform when necessary, and use InCopy to preview your print layouts, where space is finite and styling matters, as you go.

I know there are people out there who have written amazing scripts, or developed plug-ins or workflow systems that can accomplish what I’m going to show you better, faster, with more goodies. But as always, I write about what can you do with the off-the-shelf tools and garden variety skills. Or if you don’t currently possess those skills, you can come by them without completely re-wiring your brain. Perhaps someday Adobe will release the equivalent of an electric car, XML authoring as part of the Creative Suite, and we’ll all be merrily speeding down the Cross-Media Expressway. What follows is my idea of how to do today. And it works. So hop in the hybrid and take it for a spin.

The Pitch

With a W3C XML Schema as the foundation of your workflow, you can develop multiple print and online products simultaneously, and achieve efficiency and savings through content re-use with off-the-shelf tools.

The Tools

To do this you’re going to need InDesign CS3, InCopy CS3, and the Altova MissionKit For XML Developers. The MissionKit is an XML Developer’s equivalent of the Adobe Creative Suite. It’s three applications that work in concert to for the creation and transformation of XML files: XMLSpy, StyleVision, and MapForce. This is a Windows-only package. There is no Mac version, so if you kneel at the altar of Jobs like I do, you need emulation software like Parallels. Syncrosoft’s oXygen is an alternative that runs on the Mac, but only if you don’t need a lot of help writing XSLT. I do. MapForce gives you graphical creation of XSLT, or as I call it, XSLTW (XSL with Training Wheels). I’m not trying to do a commercial for Altova, but their stuff is the only stuff that I know works for everything we’re trying to do.

Step 1: Planning

This is the big one. Map out all your content. Depending on the complexity of your content, this can be a tough job, so you only want to do it once. Spend enough time to get it right, since everything flows downhill from here. Every screw-up or oversight at this stage will echo throughout the workflow in some combination of time, aggravation, or cost. And everyone needs to know that once this is done, there’s no changing the structure, at least not for anyone who wishes to remain with the company.

Your map should show every piece and where it fits into the overall scheme of your project. Map every recombination, and every dependency. Leave no stone unturned. This part can be a real eye opener. If you survive with your sanity intact, you will understand your content better than ever before and maybe discover new ways of using it.

Reverse engineer your own content. Cut up books and move the pieces around as they would move in the digital realm. Follow the life of a lowly paragraph as it appears throughout your product line. Once you grasp the details, you can answer the first key question. What kind of schema best suits your needs: a custom built-from-scratch schema or a generic format? Do you have the time and money to make the former? Do you have the flexibility for the latter? A bad fit might cost you more in the long run. Investigate DocBook and DITA. If you go generic, skip to step 3.

Step 2: Build the Schema File

With the understanding that you gained by mapping your content, you can now build an XML Schema that will guide your authoring, transformation, and output. Why a Schema? Why not a DTD? I have nothing against DTDs. In fact, they are more appropriate for describing book-like things. Schema excel at describing data more than documents. In fact, I love DTDs so much, the other day on the highway I was passed by someone with the license plate 736 DTD, and I thought “hey, that’s cool.” Then I felt the urge to slap myself for being such a geek.

I say use Schema purely because the Altova tools support Schema in ways that they don’t support DTDs. Namely, you can graphically create a Schema in XMLSpy. I feel a little hypocritical because this is the same tool-based thinking I dissed in a previous post. But facts is facts, and until I find another tool that can do this workflow end-to-end with a DTD, I’m sticking to my story. Actually, if you must have a DTD, there is a workaround: build a Schema, then use XMLSpy to convert the Schema to a DTD.

Step 3: Create Authoring Templates

Using StyleVision you take your Schema and apply styling to it to make a user-friendly authoring template. This is something the oXygen can do too. You choose from CSS properties to apply fonts, spacing, and position to your elements. You can make pop-up menus for standardizing choices, and clickable links to insert required elements.

Step 4: Develop Layout Templates

Import sample XML files into InDesign, structure and style it. Set up styles to tags mapping. Make use of the Story Editor to be sure your tagging remains intact and whitespace characters are where they belong.

Step 5: Distribute Authoring Template

Let the writers have at it.

Step 6: Import XML Files into InDesign

And when you do, be sure to maintain the live link, so the XML file appears in the Links panel.

Step 7: Export to InCopy

BUT tell everyone that the InCopy files are untouchable! Hide them. Instead, InCopy users drop the InDesign file onto InCopy to open it directly.

Step 8: Editors Do the InCopy Two Step

Check out the appropriate stories from the InDesign layout. Show the Links panel to see the XML file. Edit in the XML file, save it. Go back to the Links panel and update the link to the XML file. Magic! You have your cake (XML) and eat (publish) it too.

The fact that this works at all is a complete accident–the unintended consequence of 3 InCopy capabilities: access to the links palette (intended for the management of placed images), the ability to use an InDesign layout for preview (so one story can be simultaneously linked to both an XML file and a .incx file), and the ability to maintain a live link to to text files (meant for Word and spreadsheets). Sometimes things just fall into place.

Editors can do some work, like styling, and working with boilerplate (untagged) content in the layout file. But they must understand the fundamental truth that anything they do between the tags in the layout will be wiped out the next time the XML file is saved. Stuff outside the tags, in whitespace elements, remains.

At the end of the day, when all is said and done, you still have intact XML files, with the most up-to-date content, ready to be flowed into whatever template or media you need.

Bonus Points

At any point in this workflow you can use MapForce to create XSLT to transform your content, making it fit another purpose. You don’t need automated workflow systems or scripts to make that transformation happen now that InDesign supports XSLT. Examples of what you can do with XSLT: Making HTML for Web presentation, making PDF, making alternative print products by gathering or sorting content according to attributes, making NIMAS files.

Math Doesn’t Add Up

All this is great, but it will not work for you if you need MathML. The only ways to get MathML in and out of InDesign involve scripted solutions, or customized versions of 3rd party plug-ins like MathMagic. I hope that some day InMath, which has always been my favorite equation editor for InDesign, will add MathML support. Design Science’s MathType speaks fluent MathML, and you can place those equations (in EPS format) into an InDesign layout. PowerMath also exists for InDesign but I haven’t tried it out. Note to self: I should do a future post comparing all the different ways to do math in InDesign.

Next Steps

My next project (if ever stop spending all my free time blogging) is to experiment with Office Open XML. Since the new version of Office has XML underlying every file format, why not exploit that, and author in Word, transform OOML to your Schema, then import in InDesign, Web, etc. It should work like a charm, and it’ll probably have authors and editors breathing a sigh of joyful relief that the XML authoring tool they have to use is Word. It may not be the electric car, but it’s pretty close to one of those that runs on old french fry oil. Mmmm, I could go for some fries right now.

Family Cookbook 2.0 continued

When we left off with our project, we had transmogrified the cookbook Quark file into InDesign, and a made few observations about the potential work needed to make it the apple of our cross-media eyes.

plish004-fritter.gif

Remember, what we (I) have to work with is Quark Xpress 4.11, InDesign CS3, Acrobat, and no scripting knowledge, Xtensions, Xcetera. At the finish we want to serve up our cookbook content in an nice HTML/CSS website, plus a new InDesign doc for print.

Before we start working in InDesign, just so you’ll never think I’m lazy (crazy, sure; lazy, no) here’s a list of alternate methods I explored for getting the content out of that old XPress file, and why I didn’t choose them.

The Roads Not Taken

1. Quark to ASCII: yields nice clean text, but no hooks to attach the XML tags to, and the text from the recipe cards gets left out since it’s not part of the main story.

2. Quark to XPress Tags: a little more interesting. Any time we hear the word “tags” our ears should prick up. It also gave me a reason to dust off David Blatner’s venerable Quark XPress 4 Book, and read the section on XPress Tags. His enthusiasm about them makes me feel like I missed out on something cool, well geek cool, since I never really used them before. Guess I’ll never know.

Quark’s tagging syntax is quite different from XML. It’s based on presentation and uses only opening tags. So we’d need to be clever about crafting a Find-and-Replace scheme. Before I read A Designer’s Guide to InDesign and XML, I would have just given up and moved on, but that book gave me the confidence to try just about anything with Find and Replace. I’ll spare you the gory details, but in 3 steps, I went from the XPress Tags and whitespace surrounding the ingredients to real live opening and closing XML tags.

plish004-xtagstoxml.jpg

So this method does work. But it kind of hurts my brain. And anyway, there’s that same deal killer of the recipe card content getting left out. Ahh, if only I could go back in time and warn myself not to put that stuff in inline text boxes…

3. Quark to Word: I can save to either Word 6 or Word 8. Both versions crash InDesign when I try to place them. No thanks.

4. Quark to PDF to Word: practically every paragraph is in it’s own text frame and somehow my 6 paragraph styles have ballooned into 44 styles named CM1-CM44. Pass.

5. Quark to PDF to RTF: I had high hopes for this one, since I thought it would reuinte those inline boxes with the rest of the text and have a style attached to everything. But I’ve tried several times to import it into ID and it crashes it every time. Sucks to be me.

6. Quark to PDF to HTML: no cards, plus everything’s chopped to bits in tiny <p> and <span> elements that don’t really correspond to meaningful elements. I think I need a beer.

7. Quark to PDF to XML: This is also kind of interesting, but not in a good way. More like Marshmallow-Peeps-in-a-microwave interesting. Almost everything is wrapped in <P> tags, which alone would be a deal breaker. But I really screwed things up by carelessly making the PDF from Quark with missing fonts, and as a result some of the recipe cards have overset text. Of course, the PDF doesn’t include any of that overset text, so it’s just gone. I also think the missing fonts resulted in some of the ingredients ending up in table tags. Very loose lines of justified text also got put into tables. “Clean-up in aisle 7!”

plish004-justifed.jpg

plish004-tablexml.jpg

8. Quark to PDF to Mars to SVG to XML: OK, I need to stop. You get the point. Besides if I do the work in InDesign, I can make use of what’s already there for the print side of things.

If you just can’t get enough of this text-out-of-Quark topic, by all means check out the InDesign Secrets thread on it. It just makes me jealous that I don’t have access to things like TeXTractor (or a comp vendor in India).

Here’s how you know you’ve drunk the Adobe kool aid: I can’t type the word “India” without capitalizing the “d,” so it’s always InDia the first time. I am no longer capable of InDependent thought. InDeed, it’s InDefensible.

InDesign clean-up

Ahhh, it feels so good to be back in InDesign after all that Quark-Word silliness (told ya I drank the Kool-Aid). Let’s do this quick. I want that text creamed and buffed with a fine chamois, and I want it now. Chop chop. The first thing is to clean up the paragraph styles so they match the element names I want to use in my XML. I have the luxury of not having to conform to a DTD or Schema, so I can call ’em anything I want. For now I’ll stay with simple semantic names. Then I’ll create tags with matching names that match the styles, and I change the default Root to cookbook. I’ll tag the story frame as recipies.

plish004-tagsnstyles.jpg
Now to clean up those two-column ingredients. A quick trip to the Find/Change dialog will suffice. First we replace all tabs in the ingredient style with paragraph returns. I also had some soft returns in there to put comments under ingredients, so let’s replace those with regular spaces. And last let’s use 3 of InDesign’s built-in GREP searches to tidy up any extra returns, spaces, or tabs.

plish004-grep.jpg

Now comes the moment of truth. Mapping Styles to Tags. If I’ve done things right to this point, everything will fall into place. And…I think it worked. The XML is pretty flat but I like what we’ve got here. In a project with more than one destination for this code, I’d want some nesting so that each recipe was a enclosed in a set of tags, and maybe even the ingredients and groups as well. But this is a one-off kind of thing.

plish004-cleantagstruct.jpg

The next step is straight out of A Designer’s Guide… Chapter 9 to be precise. Since there’s no other re-use of this content besides a page on a Web server, then there is no need to get all huffy about keeping the semantic nature of our tags. I already have things grouped consistently, so I can change XML tags to HTML tags, add a few things and I’m good to go. First I’ll change cookbook to HTML, then add a tag called head, and drag it waaaaaaaaaay up in the structure pane, just under HTML. I’ll change the recipes tag to div, and map the card tag to it. The code is starting to look Webbish.

plish004-finalxml.jpg

Let’s export it.

InDesign gives me a little agita on the way out:

plish004-cannotbeencoded.jpg

I tried in vain to find these shady characters. How dare they refused to be encoded! I want to speak to their parent elements!

I thought the warning might have something to do with the Structure Pane Gremlin I’ve seen from time to time. Here he is.

plish004-gremlin.jpg

Has anyone else seen this thing and know what it is? Looks like an Asian character of some sort. He is darn difficult to get rid of. I’ve had to untag and delete all the surrounding content to get rid of him. I thought maybe the Department of Homeland Security had bugged my InDesign file to see if I was a terrorist. Or maybe it’s just a glitch in the Matrix. Nothing to worry about. The Gremlin appeared 3 times in the cookbook code. Unfortuantely, even after removing all 3, I still get the warning message, so I have two unsolved mysteries. But the good news is the code looks fine when I eyeball it in Firefox.

That’s all for now. When we pick this up again, we’ll try to achieve Find and Replace nirvana in oXygen and test my ability to use CSS without making a MESS.