With the advent of the Web, many organizations have published large amounts of information as HTML pages. These pages are tied to a single presentation. Extensible Markup Language (XML) allows us to separate content from presentation. Migrating these HTML pages to XML by hand, whether by trying to make well-formed documents out of the existing HTML or by cutting and pasting content from HTML into newly created XML files, would be a pretty daunting task. This article shows how the HTML Tidy tool and a COM wrapper can make our job simpler.
In my article "Server side use of MSXMLDOM with HTML", I showed how we can exploit the functionality of the Document Object Model (DOM) parser to work with HTML documents, provided they are well-formed. This article extends that idea. Here we shall discuss a sample conversion of a bookmark file from HTML to XML and then into a browser-neutral tree view.
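
To make the "provided they are well-formed" condition concrete, here is a minimal sketch in Python rather than the COM and MSXML tools this article uses. It assumes a tiny, hypothetical bookmark fragment, and the `<bookmarks>`/`<bookmark>` element names and the helper functions are invented for illustration only. It shows that an XML parser accepts the cleaned-up markup but rejects typical tag soup, which is the gap a tool like HTML Tidy is meant to close.

```python
import xml.etree.ElementTree as ET

# Hypothetical bookmark fragment that is already well-formed:
# every tag is closed and every attribute value is quoted.
WELL_FORMED = """<dl>
  <dt><a href="http://www.w3.org/">W3C</a></dt>
  <dt><a href="http://www.w3.org/XML/">XML at W3C</a></dt>
</dl>"""

# The same kind of content as typical tag soup:
# an unquoted attribute value and an unclosed <DT>.
TAG_SOUP = """<DL>
  <DT><A HREF=http://www.w3.org/>W3C</A>
</DL>"""


def is_well_formed(text):
    """Return True if an XML parser can read the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def bookmarks_to_xml(html_text):
    """Read well-formed HTML as XML and rebuild it without presentation markup.

    The <bookmarks>/<bookmark> vocabulary is an assumption made for this sketch.
    """
    source = ET.fromstring(html_text)
    root = ET.Element("bookmarks")
    for anchor in source.iter("a"):
        item = ET.SubElement(root, "bookmark", url=anchor.get("href", ""))
        item.text = anchor.text
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(is_well_formed(WELL_FORMED))
    print(is_well_formed(TAG_SOUP))
    print(bookmarks_to_xml(WELL_FORMED))
```

Only the links and their titles survive the conversion; the list markup that dictated how the bookmarks were displayed is left behind, which is the separation of content and presentation described above.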