HTML to FlowDocument Converter

I have an existing application that I wrote that stores "notes" in HTML format. I actually used the embedded HTML editor in IE to allow the user to create text - similar to what a RichText control would be used for but outputting to HTML rather than (the even more horrid) RTF.

I'm now in the process of upgrading that application to WPF, so it's only natural that I want to display these notes using one of the WPF FlowDocument viewer controls. The problem I encountered was how to convert my HTML into something that could be nicely displayed in a FlowDocument.

Step 1 - Converting HTML to XAML

The solution was to download Microsoft's sample HtmlToXaml Converter (which actually allows conversion in both directions). It's apparently not foolproof, but it's certainly more than enough to convert my very simple HTML to the corresponding FlowDocument.

Using the HtmlToXamlConverter class's ConvertHtmlToXaml method we can take an HTML string and convert it to a XAML string, e.g. from:

    <p>The <b>Markup</b> that is to be converted.</p>

to:

    <FlowDocument>
        <Paragraph>The <Run FontWeight="bold">Markup</Run> that is to be converted.</Paragraph>
    </FlowDocument>
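
In code, the conversion is a single static call. The following is a minimal sketch, assuming the sample's HtmlToXamlConverter source files have been added to the project and are in scope. As I understand the sample, the second argument (asFlowDocument) controls whether the output is wrapped in a FlowDocument root element or a Section.

    // Assumes the Microsoft sample's HtmlToXamlConverter class is compiled into the project.
    string html = "<p>The <b>Markup</b> that is to be converted.</p>";

    // false: do not wrap the result in a <FlowDocument> root element.
    string xaml = HtmlToXamlConverter.ConvertHtmlToXaml(html, false);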

Step 2 - Converting XAML markup into a FlowDocument instance

This was certainly a great start, but it converts HTML text to XAML text, not actual objects. So the next hurdle was how to parse the XAML markup at runtime and insert the result into a FlowDocument.

The solution was fairly easy to find thanks to Google and Ronald Clifford - although I can't say I think it's obvious.

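    FlowDocument flowDocument = new FlowDocument();

    // The string loaded below must be XAML, so convert the HTML first.
    string xaml = HtmlToXamlConverter.ConvertHtmlToXaml(
        "<p>The <b>Markup</b> that is to be converted.</p>", false);

    // Encode as UTF-8 so that non-ASCII characters are not lost.
    using (MemoryStream msDocument = new MemoryStream(Encoding.UTF8.GetBytes(xaml)))
    {
        // Load the parsed XAML into the (initially empty) document.
        TextRange textRange = new TextRange(flowDocument.ContentStart, flowDocument.ContentEnd);
        textRange.Load(msDocument, DataFormats.Xaml);
    }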
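
As an aside, here is an alternative I have not used in this application, offered only as an untested sketch. If the converter is asked for a complete FlowDocument (second argument true), the markup can in principle be parsed directly with the standard WPF System.Windows.Markup.XamlReader.Parse method. This assumes the emitted root element declares the WPF presentation XML namespace, which the sample appears to do. The helper class name below is hypothetical.

    using System.Windows.Documents;
    using System.Windows.Markup;

    public static class HtmlFlowDocumentLoader   // hypothetical helper, not part of the sample
    {
        public static FlowDocument Load(string html)
        {
            // true: ask the sample converter for a <FlowDocument> root element.
            string xaml = HtmlToXamlConverter.ConvertHtmlToXaml(html, true);

            // XamlReader.Parse builds the object graph described by the XAML string.
            return (FlowDocument)XamlReader.Parse(xaml);
        }
    }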

Step 3 - Using DataBinding

So things were almost coming together. The last step was to perform this conversion as easily as possible, which meant using data binding. I had a list of items displayed, each with a string field containing the HTML markup (as populated from a database using LINQ). I wanted to bind the FlowDocumentScrollViewer to that field and have things work themselves out automatically. This required yet another IValueConverter class to use on the data binding.

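    using System;
    using System.IO;
    using System.Text;
    using System.Windows;
    using System.Windows.Data;
    using System.Windows.Documents;

    public class HtmlToFlowDocumentConverter : IValueConverter
    {
        public object Convert(object value, Type targetType, object parameter,
            System.Globalization.CultureInfo culture)
        {
            if (value != null)
            {
                FlowDocument flowDocument = new FlowDocument();

                // Convert the bound HTML string to XAML markup.
                string xaml = HtmlToXamlConverter.ConvertHtmlToXaml(value.ToString(), false);

                // Encode as UTF-8 so that non-ASCII characters are not lost.
                using (MemoryStream stream = new MemoryStream(Encoding.UTF8.GetBytes(xaml)))
                {
                    // Load the parsed XAML into the (initially empty) document.
                    TextRange text = new TextRange(flowDocument.ContentStart, flowDocument.ContentEnd);
                    text.Load(stream, DataFormats.Xaml);
                }
                return flowDocument;
            }
            return value;
        }

        public object ConvertBack(object value, Type targetType, object parameter,
            System.Globalization.CultureInfo culture)
        {
            // One-way conversion only: FlowDocument back to HTML is not supported here.
            throw new NotImplementedException();
        }
    }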

Note: The HtmlToXamlConverter class above is simply the Microsoft sample.

The Result

This means I can now use this converter anywhere in my XAML that I wish to display HTML content within a FlowDocument control.

    <Page.Resources>
        <conv:HtmlToFlowDocumentConverter x:Key="htmlToXamlConverter"/>
    </Page.Resources>
    ...
    <FlowDocumentScrollViewer Document="{Binding Path=Events/Diaries/Comment,
                              Converter={StaticResource htmlToXamlConverter}}"/>
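
The binding approach relies on the target property being bindable. As far as I know, RichTextBox.Document is not a dependency property, so for a RichTextBox the same converter would have to be called from code instead. A minimal sketch, assuming the HtmlToFlowDocumentConverter class above is in scope (the RichTextBoxHtml helper and its SetHtml method are hypothetical names used only for illustration):

    using System.Globalization;
    using System.Windows.Controls;
    using System.Windows.Documents;

    public static class RichTextBoxHtml   // hypothetical helper
    {
        public static void SetHtml(RichTextBox box, string html)
        {
            // Reuse the value converter directly rather than through a binding.
            HtmlToFlowDocumentConverter converter = new HtmlToFlowDocumentConverter();
            object result = converter.Convert(
                html, typeof(FlowDocument), null, CultureInfo.InvariantCulture);

            // Convert returns null for null input; RichTextBox.Document rejects null.
            FlowDocument document = result as FlowDocument;
            if (document != null)
            {
                box.Document = document;
            }
        }
    }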

 

Comments

  • Monday, April 21, 2008 8:45 PM Marcelo wrote:
    I want to consume an RSS feed but I cannot get the images to display.
    How can I do this? (I couldn't figure out what I need to modify in the HtmlToXamlConverter class.)

    Thanks !!
    1. Tuesday, April 22, 2008 11:07 AM Nigel Spencer wrote:
      Hi Marcelo,

      I've posted a new entry http://blog.spencen.com/2008/04/23/handling-images-with-the-htmltoxamlconverter.aspx which I hope will help you.

      Cheers,
      Nigel
  • Thursday, September 04, 2008 11:31 AM phil wrote:
    Hi, I'm trying step 2 but get the error "The tag 'p' does not exist in XML namespace ''." I guess the namespace is null? Thanks.
  • Thursday, September 04, 2008 12:55 PM phil wrote:
    Well, now I understand that the xaml (TheMarkup...) was just an example and is not the actual xaml to be converted; boy do I feel stupid for my previous post. Anyway, I'm trying to set the text of a RichTextBox this way and it just remains blank. I'm using the RichTextBox's Document (which is a FlowDocument) instead of declaring a FlowDocument variable as you have above. But, again, the RichTextBox just stays blank. Any help appreciated. Thanks.

  • Thursday, September 04, 2008 1:00 PM phil wrote:
    Sorry, I tried to paste the Xaml string and it hosed up my post. Please disregard my first two posts/questions. I realize that the xaml in step 2 is just an example and is not the actual xaml.
    What I really need help on is I'm trying to set the text of a RichTextBox in the manner of your code above and it just remains blank. I'm using the RichTextBox's Document (which is a FlowDocument) instead of declaring a FlowDocument variable as you have above. But, again, the RichTextBox just stays blank. Thanks much.
  • Thursday, September 04, 2008 1:29 PM phil wrote:
    Got it to work! Had to modify the Xaml that the HtmlToXaml Converter outputs. Had to start the Xaml with a Span instead of FlowDocument. Also ditch the Paragraph tag. Had to add the xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml" attribute to the Span tag.
    Anyways, thanks for your post - truly helpful. -Phil
    1. Saturday, September 06, 2008 8:03 AM Nigel Spencer wrote:
      Wow Phil - I feel like I missed a whole conversation here with your four comments. I sometimes call them "cardboard cut-out" conversations, where a developer comes to you seeking advice on some technical problem in an area you dealt with a while back. By the time they've finished explaining the problem and you're just grasping where they are at, they suddenly have a Eureka moment, often caused simply by voicing the question and being forced to explain their approach. They offer thanks and then leave without you having had any input to the conversation whatsoever (as if you could just as well have been a cardboard cut-out).

      Anyhow - I'm certainly pleased if anything I wrote in the post was of any use to you whatsoever, and I certainly appreciate you taking the time to leave some comments - especially this one which hopefully will be of use to others.
  • Wednesday, December 17, 2008 7:11 PM Eric wrote:
    There's a lot of custom code here that will have bugs and inevitably fall out of date. RichTextBox supports copy / paste from HTML- could you not leverage that functionality to do conversions?
  • Friday, February 13, 2009 3:33 AM PPC Services wrote:
    Thanks for showing us how to do the converting. God bless to your blog:)
  • Wednesday, July 08, 2009 2:45 AM r4 games wrote:
    Hi,

    You might want to have a look at Chris Lovett's SGMLReader class, as this is a slot-in replacement for an XMLReader AND will read an HTML file. This then can be piped into an XSL stylesheet and transformed directly into XAML.