When we left off with our project, we had transmogrified the cookbook Quark file into InDesign, and made a few observations about the potential work needed to make it the apple of our cross-media eyes.
Remember, what we (I) have to work with is Quark XPress 4.11, InDesign CS3, Acrobat, and no scripting knowledge, Xtensions, Xcetera. At the finish we want to serve up our cookbook content in a nice HTML/CSS website, plus a new InDesign doc for print.
Before we start working in InDesign, just so you’ll never think I’m lazy (crazy, sure; lazy, no) here’s a list of alternate methods I explored for getting the content out of that old XPress file, and why I didn’t choose them.
The Roads Not Taken
1. Quark to ASCII: yields nice clean text, but no hooks to attach the XML tags to, and the text from the recipe cards gets left out since it’s not part of the main story.
2. Quark to XPress Tags: a little more interesting. Any time we hear the word “tags” our ears should prick up. It also gave me a reason to dust off David Blatner’s venerable Quark XPress 4 Book, and read the section on XPress Tags. His enthusiasm about them makes me feel like I missed out on something cool, well geek cool, since I never really used them before. Guess I’ll never know.
Quark’s tagging syntax is quite different from XML. It’s based on presentation and uses only opening tags. So we’d need to be clever about crafting a Find-and-Replace scheme. Before I read A Designer’s Guide to InDesign and XML, I would have just given up and moved on, but that book gave me the confidence to try just about anything with Find and Replace. I’ll spare you the gory details, but in 3 steps, I went from the XPress Tags and whitespace surrounding the ingredients to real live opening and closing XML tags.
So this method does work. But it kind of hurts my brain. And anyway, there’s that same deal killer of the recipe card content getting left out. Ahh, if only I could go back in time and warn myself not to put that stuff in inline text boxes…
3. Quark to Word: I can save to either Word 6 or Word 8. Both versions crash InDesign when I try to place them. No thanks.
4. Quark to PDF to Word: practically every paragraph is in its own text frame and somehow my 6 paragraph styles have ballooned into 44 styles named CM1-CM44. Pass.
5. Quark to PDF to RTF: I had high hopes for this one, since I thought it would reunite those inline boxes with the rest of the text and have a style attached to everything. But I’ve tried several times to import it into ID and it crashes every time. Sucks to be me.
6. Quark to PDF to HTML: no cards, plus everything’s chopped to bits in tiny <p> and <span> elements that don’t really correspond to meaningful elements. I think I need a beer.
7. Quark to PDF to XML: This is also kind of interesting, but not in a good way. More like Marshmallow-Peeps-in-a-microwave interesting. Almost everything is wrapped in <P> tags, which alone would be a deal breaker. But I really screwed things up by carelessly making the PDF from Quark with missing fonts, and as a result some of the recipe cards have overset text. Of course, the PDF doesn’t include any of that overset text, so it’s just gone. I also think the missing fonts resulted in some of the ingredients ending up in table tags. Very loose lines of justified text also got put into tables. “Clean-up in aisle 7!”
8. Quark to PDF to Mars to SVG to XML: OK, I need to stop. You get the point. Besides, if I do the work in InDesign, I can make use of what’s already there for the print side of things.
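For anyone curious about road number 2, here is a minimal sketch of the general idea of turning opening-only paragraph tags into paired XML tags. It is not the three-step Find-and-Replace described above. It assumes each paragraph in the XPress Tags export begins with an "@StyleName:" prefix, it ignores the file header and any character-level tags, and the style names in the sample are made up for illustration.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical sample: one paragraph per line, each opening with "@StyleName:".
SAMPLE = (
    "@recipe_title:Pancakes\n"
    "@ingredient:2 cups flour\n"
    "@ingredient:1 egg\n"
)

# Capture the style name (everything between "@" and the first ":")
# and the rest of the line as the paragraph text.
PARAGRAPH = re.compile(r"^@(?P<style>[^:\n]+):(?P<text>.*)$", re.MULTILINE)


def xpress_tags_to_xml(text):
    """Wrap each tagged paragraph in an opening and closing XML element."""
    def wrap(match):
        # XML element names cannot contain spaces, so swap them for underscores.
        tag = match.group("style").strip().replace(" ", "_")
        body = escape(match.group("text").strip())
        return f"<{tag}>{body}</{tag}>"

    return PARAGRAPH.sub(wrap, text)


if __name__ == "__main__":
    print(xpress_tags_to_xml(SAMPLE))
```

A real export would need more care than this (character formatting, special characters, style definitions), which is roughly where the brain-hurting starts.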
If you just can’t get enough of this text-out-of-Quark topic, by all means check out the InDesign Secrets thread on it. It just makes me jealous that I don’t have access to things like TeXTractor (or a comp vendor in India).
Here’s how you know you’ve drunk the Adobe Kool-Aid: I can’t type the word “India” without capitalizing the “d,” so it’s always InDia the first time. I am no longer capable of InDependent thought. InDeed, it’s InDefensible.
Ahhh, it feels so good to be back in InDesign after all that Quark-Word silliness (told ya I drank the Kool-Aid). Let’s do this quick. I want that text creamed and buffed with a fine chamois, and I want it now. Chop chop. The first thing is to clean up the paragraph styles so they match the element names I want to use in my XML. I have the luxury of not having to conform to a DTD or Schema, so I can call ’em anything I want. For now I’ll stay with simple semantic names. Then I’ll create tags with names that match the styles, and change the default Root to cookbook. I’ll tag the story frame as recipes.
Now to clean up those two-column ingredients. A quick trip to the Find/Change dialog will suffice. First we replace all tabs in the ingredient style with paragraph returns. I also had some soft returns in there to put comments under ingredients, so let’s replace those with regular spaces. And last let’s use 3 of InDesign’s built-in GREP searches to tidy up any extra returns, spaces, or tabs.
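The same clean-up can be sketched outside InDesign with regular expressions. This is only an approximation of the Find/Change steps, not InDesign's built-in GREP queries. It assumes the ingredient text has been exported as plain text with "\n" for paragraph returns and the Unicode line separator (U+2028) standing in for soft returns, and the sample string is invented.

```python
import re

SOFT_RETURN = "\u2028"  # assumption: how a soft return shows up in the exported text


def clean_ingredients(text):
    """Mimic the Find/Change passes on a block of ingredient text."""
    # 1. Tabs between the two columns become paragraph returns.
    text = text.replace("\t", "\n")
    # 2. Soft returns (used for comments under an ingredient) become spaces.
    text = text.replace(SOFT_RETURN, " ")
    # 3. Tidy up: trailing whitespace, runs of spaces, runs of returns.
    text = re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)
    text = re.sub(r" {2,}", " ", text)
    text = re.sub(r"\n{2,}", "\n", text)
    return text


if __name__ == "__main__":
    messy = "2 cups flour\t1 egg\u2028(beaten)\n\n1 tsp  salt \n"
    print(clean_ingredients(messy))
```

The order matters a little: swapping tabs for returns first means the tidy-up pass can catch any empty paragraphs that the swap creates.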
Now comes the moment of truth. Mapping Styles to Tags. If I’ve done things right to this point, everything will fall into place. And…I think it worked. The XML is pretty flat but I like what we’ve got here. In a project with more than one destination for this code, I’d want some nesting so that each recipe was enclosed in a set of tags, and maybe even the ingredients and groups as well. But this is a one-off kind of thing.
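If nesting ever did become necessary, one way to add it after the fact is to walk the flat XML and start a new wrapper element each time a recipe title turns up. The sketch below uses Python's standard ElementTree module. The cookbook and recipes names come from this project, but title, ingredient, step, and the recipe wrapper are hypothetical stand-ins for whatever the real tags are called.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat export: every paragraph is a direct child of <recipes>.
FLAT = """<cookbook><recipes>
<title>Pancakes</title><ingredient>2 cups flour</ingredient><step>Mix.</step>
<title>Toast</title><ingredient>1 slice bread</ingredient><step>Toast it.</step>
</recipes></cookbook>"""


def nest_recipes(root, start_tag="title", wrapper="recipe"):
    """Group the flat children of <recipes> into one wrapper per recipe."""
    story = root.find("recipes")
    children = list(story)
    for child in children:
        story.remove(child)

    current = None
    for child in children:
        # A title (or the very first element) opens a new wrapper.
        if child.tag == start_tag or current is None:
            current = ET.SubElement(story, wrapper)
        child.tail = None  # drop stray whitespace left between flat siblings
        current.append(child)
    return root


if __name__ == "__main__":
    nested = nest_recipes(ET.fromstring(FLAT))
    print(ET.tostring(nested, encoding="unicode"))
```

This only works because the flat structure is consistent: every recipe starts with the same element, so there is always a reliable place to cut.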
The next step is straight out of A Designer’s Guide… Chapter 9 to be precise. Since there’s no other re-use of this content besides a page on a Web server, there is no need to get all huffy about keeping the semantic nature of our tags. I already have things grouped consistently, so I can change XML tags to HTML tags, add a few things and I’m good to go. First I’ll change cookbook to HTML, then add a tag called head, and drag it waaaaaaaaaay up in the structure pane, just under HTML. I’ll change the recipes tag to div, and map the card tag to it. The code is starting to look Webbish.
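The renaming above happens in InDesign's Tags panel and structure pane, but the same swap can be sketched on an exported XML file. This is an illustration under assumptions, not the book's method: it uses Python's ElementTree, writes the root as lowercase html, and the sample document is invented. Only the cookbook, recipes, card, head, and div names come from the steps described here.

```python
import xml.etree.ElementTree as ET

# Semantic tag -> HTML tag, following the renames described in the text.
RENAME = {"cookbook": "html", "recipes": "div", "card": "div"}

# Invented sample with the same overall shape as the cookbook structure.
SAMPLE = "<cookbook><recipes><card>Chef's tip</card></recipes></cookbook>"


def to_html_tags(root):
    """Rename mapped elements in place and add an empty <head> under the root."""
    for element in root.iter():
        element.tag = RENAME.get(element.tag, element.tag)
    # Equivalent of dragging the new head tag to the top of the structure pane.
    root.insert(0, ET.Element("head"))
    return root


if __name__ == "__main__":
    page = to_html_tags(ET.fromstring(SAMPLE))
    print(ET.tostring(page, encoding="unicode"))
```

Mapping two different tags to div throws away the distinction between them, which is fine here only because the Web page is the single destination for this content.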
Let’s export it.
InDesign gives me a little agita on the way out:
I tried in vain to find these shady characters. How dare they refuse to be encoded! I want to speak to their parent elements!
I thought the warning might have something to do with the Structure Pane Gremlin I’ve seen from time to time. Here he is.
Has anyone else seen this thing and figured out what it is? Looks like an Asian character of some sort. He is darn difficult to get rid of. I’ve had to untag and delete all the surrounding content to get rid of him. I thought maybe the Department of Homeland Security had bugged my InDesign file to see if I was a terrorist. Or maybe it’s just a glitch in the Matrix. Nothing to worry about. The Gremlin appeared 3 times in the cookbook code. Unfortunately, even after removing all 3, I still get the warning message, so I have two unsolved mysteries. But the good news is the code looks fine when I eyeball it in Firefox.
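One way to hunt for characters like these, assuming the warning really is about encoding, is to scan the exported text for anything that will not fit a given encoding and print its position and Unicode name. The sketch below is a guess at a diagnostic, not an explanation of the InDesign warning. The target encoding is an assumption to adjust, and the sample string is invented.

```python
import unicodedata


def find_unencodable(text, encoding="ascii"):
    """Return (index, code point, Unicode name) for each character that
    cannot be represented in the given encoding."""
    hits = []
    for index, char in enumerate(text):
        try:
            char.encode(encoding)
        except UnicodeEncodeError:
            name = unicodedata.name(char, "UNKNOWN")
            hits.append((index, f"U+{ord(char):04X}", name))
    return hits


if __name__ == "__main__":
    # Invented sample with one stray CJK character hiding in the ingredients.
    sample = "2 cups flour \u4e2d 1 egg"
    for hit in find_unencodable(sample, encoding="cp1252"):
        print(hit)
```

With "ascii" as the target, curly quotes and fraction characters will show up too, so a wider encoding such as "cp1252" may make the real strays easier to spot.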
That’s all for now. When we pick this up again, we’ll try to achieve Find and Replace nirvana in oXygen and test my ability to use CSS without making a MESS.