Re: MS Word to XHTML

On Sun, 11 Sep 2005, Roy Schestowitz wrote (seen on alt.html):

>  * Fragment the output as required, probably by hand (WYSIWYG programs 
> like Word have no notion of structure or semantics)

This isn't by any means aimed at you personally, but your posting 
triggered a response from me, and it looks as if knowledge is proceeding 

Proper use of MS Word uses Styles, oriented towards the structure of the 
document.  (If I had my way, I'd rip the direct styling buttons out of the 
main menu of Word, and hide them away in an Advanced Users menu).  Such 
properly-made Word documents are reasonably capable of being converted 
well to structural HTML, and a stylesheet suitable for web use can then be 
applied (it usually won't be the same "style sheet" (= style template) as 
would be suitable for a printed Word document, of course!).
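
To make the idea concrete, here is a minimal Python sketch of mapping 
paragraph styles to structural HTML elements.  The style names, the 
STYLE_TO_TAG table and the paragraphs_to_html function are all 
hypothetical, chosen for illustration; this is not how Word or any 
particular converter exposes its data, just the principle.

```python
from html import escape

# Hypothetical mapping from Word paragraph style names to HTML elements.
STYLE_TO_TAG = {
    "Heading 1": "h1",
    "Heading 2": "h2",
    "Normal": "p",
    "Quote": "blockquote",
}

def paragraphs_to_html(paragraphs):
    """Turn (style_name, text) pairs into structural HTML, one element per line."""
    lines = []
    for style, text in paragraphs:
        tag = STYLE_TO_TAG.get(style, "p")  # unknown styles fall back to <p>
        lines.append(f"<{tag}>{escape(text)}</{tag}>")
    return "\n".join(lines)

if __name__ == "__main__":
    print(paragraphs_to_html([("Heading 1", "Intro"), ("Normal", "Some text.")]))
```

The output carries no presentation at all; the look for the web would come 
from a separate stylesheet applied to h1, h2, p and so on.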

I had some experience, around 1997-8, with the (payware) rtftohtml program 
- subsequently renamed and marketed under the company name Logictran - it 
had this pretty-much sorted out.  I must admit I haven't got experience of 
it since the change of name, but I can say that the principles of the 
original program seemed to be what I was looking for, unlike most of the 
other pseudo-WYSIWYG garbage from other places (that offended all sense of 
what is suitable for the WWW).

With that rtftohtml program, decently structured Word could be turned into 
decently structured HTML and split on chapter or section headings quite 
automatically, with HTML indexes and tables of contents generated as 
well.  OK, there were some rough edges, but at least the 
principles showed up just fine.  I find it sad that some 7 years later we 
seem to have fallen back to the stone age of direct styling and 
pseudo-WYSIWYG in most of the Word conversions that I have seen.
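
As a rough illustration of why structure makes this easy, the Python sketch 
below splits already-structural HTML at each top-level heading and builds a 
table of contents from the heading text.  It is not a reconstruction of how 
rtftohtml worked: the function names and the "sectionN.html" file naming 
are my own assumptions, and a regular expression stands in for a real HTML 
parser purely to keep the example short.

```python
import re

def split_on_headings(html, level=1):
    """Split an HTML string into (title, fragment) pairs at each <hN> heading.

    Anything before the first heading is dropped in this simple sketch.
    """
    pattern = re.compile(
        rf"<h{level}[^>]*>(.*?)</h{level}>", re.IGNORECASE | re.DOTALL
    )
    matches = list(pattern.finditer(html))
    sections = []
    for i, m in enumerate(matches):
        # A section runs from its heading up to the next heading (or the end).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(html)
        title = re.sub(r"<[^>]+>", "", m.group(1)).strip()  # drop inline tags
        sections.append((title, html[m.start():end].strip()))
    return sections

def build_toc(sections):
    """Build an HTML list linking to hypothetical section1.html, section2.html, ..."""
    items = [
        f'<li><a href="section{i}.html">{title}</a></li>'
        for i, (title, _) in enumerate(sections, start=1)
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"

if __name__ == "__main__":
    doc = "<h1>One</h1><p>a</p><h1>Two</h1><p>b</p>"
    print(build_toc(split_on_headings(doc)))
```

None of this is possible when headings are just paragraphs with a big bold 
font applied by hand: there is nothing reliable to split on.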

[Note - there are other programs called rtftohtml or rtf2html - it may be 
that some of them do a similar job, I can't speak for or against them, 
I'm just commenting as a reasonably satisfied user of version 4 of this 
particular program from around 1998 onwards.]
