Re: MS Word to XHTML

__/ [Alan J. Flavell] on Sunday 11 September 2005 11:19 \__

> On Sun, 11 Sep 2005, SpaceGirl wrote:
>> Alan J. Flavell wrote:
> [comprehensive quote of my posting, without apparently having anything
> relevant to say about it.]
>> Word XP and upwards stores its documents in XML format doesn't it?
> So what?  XML is only a format for defining markup.  If the markup
> doesn't do anything meaningful (specifically - if it only creates a
> visual result on a printed page, without having any significant
> structure) then it's not going to turn into effective HTML: it'd just
> be the usual garbage in / garbage out that we're accustomed to with
> Word conversions to soi-disant "web" format.
>> You could probably write your own XSLT to turn it into HTML fairly
>> easily.
> There seems to be some kind of conceptual disconnect here. Most Word
> documents (in my experience) simply don't contain the necessary
> structure for useful conversion to HTML: they've been created as a
> purely visual construction for printing onto paper.  It's irrelevant
> what underlying technology you use (RTF, XML, SGML, whatever) - the
> problem is that the source material simply does not represent the
> needed structures, *because the document authors do not put it there*.
> You might as well try to convert cheese into fresh cream: both are
> fine milk products, it's true, but instead of trying to convert the
> one into the other, you'd do better to produce them both starting from
> fresh milk.  And the kind of "fresh milk" that's needed here is
> logically structured text markup.  Not visual formatting.  Until the
> authors of Word documents can grasp that, the prospects for conversion
> of Word to web formats are poor, IMHO.

I fully agree with you on that point, and any attempt at rephrasing the
same ideas would only dilute them. As a way forward, I suggest that the
OP, who clearly wants to publish material on the Web, learn LaTeX.
Should the idea of editing raw text seem daunting, I suggest LyX
< lyx.org > [LyX: Front-end to LaTeX]. Five minutes with LyX would help
anyone see the difference and grasp the idea, e.g. multiple output
formats, styles, enforced structure, etc.
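
As a rough illustration of the structure-versus-appearance point, here is
a minimal sketch in Python. It is not a real Word converter: the
(style, text, visual) tuple format, the style names in STYLE_TO_TAG and
the to_html function are all made up for the example. It only shows that
a converter can emit a heading when the author applied a heading style,
and can do no better than a styled paragraph when the author merely made
the text big and bold.

```python
from html import escape

# Hypothetical mapping from named paragraph styles to HTML elements.
STYLE_TO_TAG = {"Heading 1": "h1", "Heading 2": "h2", "Normal": "p"}


def to_html(paragraphs):
    """Convert (style, text, visual) tuples to an HTML fragment.

    style  -- named paragraph style chosen by the author (structure)
    visual -- dict of direct formatting, e.g. {"bold": True, "size": 18}
    """
    out = []
    for style, text, visual in paragraphs:
        # Structure comes only from the named style; unknown styles
        # fall back to a plain paragraph.
        tag = STYLE_TO_TAG.get(style, "p")
        body = escape(text)
        if visual:
            # Direct formatting carries no structure, so the best we
            # can do is reproduce the appearance inline.
            css = []
            if visual.get("bold"):
                css.append("font-weight:bold")
            if "size" in visual:
                css.append("font-size:%dpt" % visual["size"])
            body = '<span style="%s">%s</span>' % (";".join(css), body)
        out.append("<%s>%s</%s>" % (tag, body, tag))
    return "\n".join(out)


# The same two-paragraph document, authored in two different ways.
structured = [("Heading 1", "Results", {}), ("Normal", "It works.", {})]
visual_only = [
    ("Normal", "Results", {"bold": True, "size": 18}),
    ("Normal", "It works.", {}),
]

print(to_html(structured))
print(to_html(visual_only))
```

The first document comes out with an h1 element; the second looks similar
on paper but yields only a paragraph wrapping a styled span, which is the
garbage in / garbage out problem described above.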

Only a few days ago, somebody on the LyX mailing lists mentioned his
upcoming presentation on "Word: What you See Is What a Mess". The
presentation I am delivering on Wednesday is well-formed XHTML <
http://schestowitz.com/Weblog/archives/2005/09/11/public-speaking/ > and is
powered by Eric Meyer's S5.


Roy S. Schestowitz      | "Software sucks. Open Source sucks less."
http://Schestowitz.com  |    SuSE Linux    |     PGP-Key: 74572E8E
  1:45pm  up 17 days 12:13,  3 users,  load average: 0.51, 0.58, 0.70
