Tuesday, June 28, 2005

A useful bit of code I wrote...

The other day I had something to do that left me thinking 'Hey, y'know I can really use that again!'

So I thought I'd share it.

What I had was a web page with a bunch of complex(ish) formatting and (rightly or wrongly) some maths tied up in the repeater that produced the table. My boss then said to me 'Hey, you know what would be really cool is if we could export that data into Excel.

Hmm. I thought. There are 3 things I can do here.

I could re-write all my data-access and manipulation bits so that the underlying dataset had the values I wanted, and then spin that out into a text-file or something.

I could just export the whole page into Excel and have that monkey about translating this nice table (and all the other form elements) into an ugly, ill thought out spreadsheet.

Or I could simply take my page output, take out just the table (with no drop-downs, viewstate, or anything I didn't need) and sling that into Excel, taking advantage of Excel's fairly good support for HTML tables.

This last option seemed best to me, since apart from not having to re-write code, I could maintain all the color-coding from the HTML table and keep this report looking quite nice.

So, how could I do that? I tried playing about with some code making every element on the page invisible, except for the repeater (well, actually it was a set of repeaters) that produced the table, but I found out something weird:

Excel tries to render the hidden viewstate element as an image, and flashes an ugly redex when it can't decipher it, before turning it into just a blank cell at the top of the page. Not nice. Livable, but not nice. So I figured I'd have to parse the page output right down manually to create an export-type mode (as opposed to a display mode) for the page. Once I'd solved that problem in the short-term, that was when I figured I could make this a bit more useful by creating a page that naturally supported moded output. So this is how I did it:

I started by creating a supporting class to represent my mode. I wanted to be able to specify the markers that were going to live in the HTML that would delimit the sections that would be rendered. To keep things neat, I decided I'd do that by putting them in comment tags.

So this was my 'Mode' class:

Public Class PageMode
    Dim FModeName As String
    Dim FMarker As String
    Dim FContentType As PageModeContentType
    Dim StartFormatString As String = "<!--{0}-->"
    Dim EndFormatString As String = "<!--/{0}-->"
 
#Region " Properties "
    Public Property ModeName() As String
        Get
            Return FModeName
        End Get
        Set(ByVal Value As String)
            FModeName = Value
        End Set
    End Property
 
    Public Property Marker() As String
        Get
            If FMarker = "" Then
                Return String.Format(StartFormatString, FModeName)
            Else
                Return String.Format(StartFormatString, FMarker)
            End If
        End Get
        Set(ByVal Value As String)
            FMarker = Value
        End Set
    End Property
 
    Public ReadOnly Property EndMarker() As String
        Get
            Return String.Format(EndFormatString, FMarker)
        End Get
    End Property
 
    Public Property ContentType() As PageModeContentType
        Get
            Return FContentType
        End Get
        Set(ByVal Value As PageModeContentType)
            FContentType = Value
        End Set
    End Property
#End Region
 
#Region " Constructors "
    Private Sub New()
        MyBase.New()
    End Sub
 
    Public Sub New(ByVal ModeName As String, ByVal MarkerText As String, ByVal ContentType As PageModeContentType)
        FModeName = ModeName
        FMarker = MarkerText
        FContentType = ContentType
    End Sub
 
    Public Sub New(ByVal ModeName As String)
        MyClass.New(ModeName, ModeName, PageModeContentType.web)
    End Sub
 
    Public Sub New(ByVal ModeName As String, ByVal MarkerText As String)
        MyClass.New(ModeName, MarkerText, PageModeContentType.web)
    End Sub
 
    Public Sub New(ByVal ModeName As String, ByVal ContentType As PageModeContentType)
        MyClass.New(ModeName, ModeName, ContentType)
    End Sub
#End Region
 
End Class


As you can see, it's got an unusual type in there for ContentType. This was just an enumeration I added so I didn't have to worry about playing with strings and stuff - And I love the intellisense support for Enums!

The rest of it's pretty standard. I've got some properties in there that let you define your own marker text. If, for instance, you had 3 or 4 different ways in which you wanted to show your page, you can just give them different marker values. Easy.

So once I'd put together the Mode class, it was time to actually do something with it.

To make some use of this, all I did was create a new class that inherited from Page and overrode its Render method to parse the page output and only include the sections marked out. Here's the full code:

Public Class SectionablePage
    Inherits Page
    Dim FPageMode As PageMode
 
    Protected Overrides Sub render(ByVal writer As HtmlTextWriter)
        If FPageMode Is Nothing Then
            'If there aren't any modes created, then render as normal
            MyBase.Render(writer)
        Else
            Select Case FPageMode.ContentType
                Case PageModeContentType.web
                    Response.ContentType = "text/html"
                Case PageModeContentType.excel
                    Response.ContentType = "application/vnd.microsoft-excel"
                    Response.AddHeader("content-disposition", "attachment; filename=""" & FPageMode.ModeName & ".xls""")
                Case PageModeContentType.word
                    Response.ContentType = "application/vnd.microsoft-word"
                    Response.AddHeader("content-disposition", "attachment; filename=Page.doc")
            End Select
 
            Dim sw As New StringWriter
            Dim hw As New HtmlTextWriter(sw)
            MyBase.Render(hw)
 
            Dim EntirePage As String = sw.ToString
            Dim Marker As String = FPageMode.Marker
            Dim EndMarker As String = FPageMode.EndMarker
            Dim CurrentPoint As Integer = 0
            Dim SectionStart As Integer
            Dim SectionEnd As Integer
            Dim Section(1) As Integer
            Dim AllSections As New ArrayList
 
            Do While EntirePage.IndexOf(Marker, CurrentPoint) > -1
                Section = Section.CreateInstance(GetType(Integer), 2)
                Section(0) = EntirePage.IndexOf(Marker, CurrentPoint)
                Section(1) = EntirePage.IndexOf(EndMarker, Section(0)) + EndMarker.Length
                If Section(1) = -1 Then
                    'Check that there is a matching end tag to go with the start tag
                    Throw New SectionablePageException("Section is missing an end tag")
                    Exit Sub
                End If
                AllSections.Add(Section)
                CurrentPoint = Section(1)
            Loop
 
            writer.WriteFullBeginTag("html")
            For Each arSection As Integer() In AllSections
                writer.Write(EntirePage.Substring(arSection(0), arSection(1) - arSection(0)))
            Next
            writer.WriteEndTag("html")
        End If
    End Sub
 
    Public Property Mode() As PageMode
        Get
            Return FPageMode
        End Get
        Set(ByVal Value As PageMode)
            FPageMode = Value
        End Set
    End Property
 
End Class


As you can see, I have a PageMode property, which just Gets and Sets the FPageMode variable (I could have just declared it as a public field, but I didn't want to. No good reason, I just didn't want to.)

In the render method, I start by checking that the page has had a mode set, and if it doesn't, then the page just renders as usual. If the page does have a mode set, then it decides what to do with it.

Firstly, it decides what flavour of content it's going to say it is. This is where MS Office's HTML support really comes in handy. Again, there's that ContentType enumeration. If I had more time, patience, or any real need, I would have made a slightly smarter way of doing that. As it is, though, I just use Select Case to determine the ContentType header for the page response. If it's a file, I also want to change the add the Content-Disposition header, mainly so I can change the filename.

After it's decided what it's doing, the real magic begins.

It calls the render method so that it spins the page content out into a great big string.

It then steps through the great big string looking for pairs of start and end tags for the PageMode, putting these pairs of values into an ArrayList.

Once it's got to the end of the string (i.e. the end of the page), it takes the pairs of start and end tags and outputs just those sections, all wrapped in HTML tags.

It's far from perfect. I think one of these days I'll make the ContentType a bit smarter, and maybe handle the errors a bit nicer, but at the end of the day, it's all there.

In use, it looks something like this:

ASPX:
<%@ Page Language="vb" AutoEventWireup="false" Codebehind="WebForm1.aspx.vb" Inherits="WebSectionsTester.WebForm1"%>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
    <HEAD>
        <title>WebForm1</title>
        <meta name="GENERATOR" content="Microsoft Visual Studio .NET 7.1">
        <meta name="CODE_LANGUAGE" content="Visual Basic .NET 7.1">
        <meta name="vs_defaultClientScript" content="JavaScript">
        <meta name="vs_targetSchema" content="http://schemas.microsoft.com/intellisense/ie5">
    </HEAD>
    <body>
        <form id="Form1" method="post" runat="server">
            <!--Para12-->
            Pick section:<br>
            <!--/Para12-->
            <asp:LinkButton ID="AllContent" Runat="server" text="All Content" /><br>
            <asp:LinkButton ID="Para1" Runat="server" text="Para 1" /><br>
            <asp:LinkButton ID="Para12" Runat="server" text="Para 12" /><br>
            <!--AllContent-->
            <h1>This is the Title</h1>
            <!--Para1-->
            <!--Para12-->
            <p>This is paragraph 1</p>
            <!--/Para1-->
            <p>This is paragraph 2</p>
            <!--/Para12-->
            <!--/AllContent-->
        </form>
    </body>
</HTML>


Code-Behind:
    ...
 
    Private Sub Page_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        'Put user code to initialize the page here
    End Sub
 
    Public Sub ProduceContent(ByVal sender As Object, ByVal e As EventArgs) Handles AllContent.Click, Para1.Click, Para12.Click
        Dim ModeString As String = CType(sender, LinkButton).ID
        Dim theMode As New PageMode("Download", ModeString, PageModeContentType.web)
        Me.Mode = theMode
    End Sub
 
    ...


As you can see, you can create overlapping parts, you can have sections that are split into several parts. I think it's quite good (even if i do say so myself).

Sure, it's not perfect. I think it might be lighter on the server to parse it using Regular Expressions, rather than String.IndexOf(), and one of these days I'm going to get rid of that ContentType Enum and the Select Case in Render. But that's another thing for another day.

Just wanted to share. I promised something vaguely .NET related, didn't I.

No comments: