The other day I had something to do that left me thinking 'Hey, y'know I can really use that again!'
So I thought I'd share it.
What I had was a web page with a bunch of complex(ish) formatting and (rightly or wrongly) some maths tied up in the repeater that produced the table. My boss then said to me 'Hey, you know what would be really cool is if we could export that data into Excel.
Hmm. I thought. There are 3 things I can do here.
I could re-write all my data-access and manipulation bits so that the underlying dataset had the values I wanted, and then spin that out into a text-file or something.
I could just export the whole page into Excel and have that monkey about translating this nice table (and all the other form elements) into an ugly, ill thought out spreadsheet.
Or I could simply take my page output, take out just the table (with no drop-downs, viewstate, or anything I didn't need) and sling that into Excel, taking advantage of Excel's fairly good support for HTML tables.
This last option seemed best to me, since apart from not having to re-write code, I could maintain all the color-coding from the HTML table and keep this report looking quite nice.
So, how could I do that? I tried playing about with some code making every element on the page invisible, except for the repeater (well, actually it was a set of repeaters) that produced the table, but I found out something weird:
Excel tries to render the hidden viewstate element as an image, and flashes an ugly redex when it can't decipher it, before turning it into just a blank cell at the top of the page. Not nice. Livable, but not nice. So I figured I'd have to parse the page output right down manually to create an export-type mode (as opposed to a display mode) for the page. Once I'd solved that problem in the short-term, that was when I figured I could make this a bit more useful by creating a page that naturally supported moded output. So this is how I did it:
I started by creating a supporting class to represent my mode. I wanted to be able to specify the markers that were going to live in the HTML that would delimit the sections that would be rendered. To keep things neat, I decided I'd do that by putting them in comment tags.
So this was my 'Mode' class:
Public Class PageMode
Dim FModeName As String
Dim FMarker As String
Dim FContentType As PageModeContentType
Dim StartFormatString As String = "<!--{0}-->"
Dim EndFormatString As String = "<!--/{0}-->"
#Region " Properties "
Public Property ModeName() As String
Get
Return FModeName
End Get
Set(ByVal Value As String)
FModeName = Value
End Set
End Property
Public Property Marker() As String
Get
If FMarker = "" Then
Return String.Format(StartFormatString, FModeName)
Else
Return String.Format(StartFormatString, FMarker)
End If
End Get
Set(ByVal Value As String)
FMarker = Value
End Set
End Property
Public ReadOnly Property EndMarker() As String
Get
Return String.Format(EndFormatString, FMarker)
End Get
End Property
Public Property ContentType() As PageModeContentType
Get
Return FContentType
End Get
Set(ByVal Value As PageModeContentType)
FContentType = Value
End Set
End Property
#End Region
#Region " Constructors "
Private Sub New()
MyBase.New()
End Sub
Public Sub New(ByVal ModeName As String, ByVal MarkerText As String, ByVal ContentType As PageModeContentType)
FModeName = ModeName
FMarker = MarkerText
FContentType = ContentType
End Sub
Public Sub New(ByVal ModeName As String)
MyClass.New(ModeName, ModeName, PageModeContentType.web)
End Sub
Public Sub New(ByVal ModeName As String, ByVal MarkerText As String)
MyClass.New(ModeName, MarkerText, PageModeContentType.web)
End Sub
Public Sub New(ByVal ModeName As String, ByVal ContentType As PageModeContentType)
MyClass.New(ModeName, ModeName, ContentType)
End Sub
#End Region
End Class
As you can see, it's got an unusual type in there for ContentType. This was just an enumeration I added so I didn't have to worry about playing with strings and stuff - And I love the intellisense support for Enums!
The rest of it's pretty standard. I've got some properties in there that let you define your own marker text. If, for instance, you had 3 or 4 different ways in which you wanted to show your page, you can just give them different marker values. Easy.
So once I'd put together the Mode class, it was time to actually do something with it.
To make some use of this, all I did was create a new class that inherited from Page and overrode its Render method to parse the page output and only include the sections marked out. Here's the full code:
Public Class SectionablePage
Inherits Page
Dim FPageMode As PageMode
Protected Overrides Sub render(ByVal writer As HtmlTextWriter)
If FPageMode Is Nothing Then
'If there aren't any modes created, then render as normal
MyBase.Render(writer)
Else
Select Case FPageMode.ContentType
Case PageModeContentType.web
Response.ContentType = "text/html"
Case PageModeContentType.excel
Response.ContentType = "application/vnd.microsoft-excel"
Response.AddHeader("content-disposition", "attachment; filename=""" & FPageMode.ModeName & ".xls""")
Case PageModeContentType.word
Response.ContentType = "application/vnd.microsoft-word"
Response.AddHeader("content-disposition", "attachment; filename=Page.doc")
End Select
Dim sw As New StringWriter
Dim hw As New HtmlTextWriter(sw)
MyBase.Render(hw)
Dim EntirePage As String = sw.ToString
Dim Marker As String = FPageMode.Marker
Dim EndMarker As String = FPageMode.EndMarker
Dim CurrentPoint As Integer = 0
Dim SectionStart As Integer
Dim SectionEnd As Integer
Dim Section(1) As Integer
Dim AllSections As New ArrayList
Do While EntirePage.IndexOf(Marker, CurrentPoint) > -1
Section = Section.CreateInstance(GetType(Integer), 2)
Section(0) = EntirePage.IndexOf(Marker, CurrentPoint)
Section(1) = EntirePage.IndexOf(EndMarker, Section(0)) + EndMarker.Length
If Section(1) = -1 Then
'Check that there is a matching end tag to go with the start tag
Throw New SectionablePageException("Section is missing an end tag")
Exit Sub
End If
AllSections.Add(Section)
CurrentPoint = Section(1)
Loop
writer.WriteFullBeginTag("html")
For Each arSection As Integer() In AllSections
writer.Write(EntirePage.Substring(arSection(0), arSection(1) - arSection(0)))
Next
writer.WriteEndTag("html")
End If
End Sub
Public Property Mode() As PageMode
Get
Return FPageMode
End Get
Set(ByVal Value As PageMode)
FPageMode = Value
End Set
End Property
End Class
As you can see, I have a PageMode property, which just Gets and Sets the FPageMode variable (I could have just declared it as a public field, but I didn't want to. No good reason, I just didn't want to.)
In the render method, I start by checking that the page has had a mode set, and if it doesn't, then the page just renders as usual. If the page
does have a mode set, then it decides what to do with it.
Firstly, it decides what flavour of content it's going to say it is. This is where MS Office's HTML support really comes in handy. Again, there's that ContentType enumeration. If I had more time, patience, or any real need, I would have made a slightly smarter way of doing that. As it is, though, I just use Select Case to determine the ContentType header for the page response. If it's a file, I also want to change the add the Content-Disposition header, mainly so I can change the filename.
After it's decided what it's doing, the real magic begins.
It calls the render method so that it spins the page content out into a great big string.
It then steps through the great big string looking for pairs of start and end tags for the PageMode, putting these pairs of values into an ArrayList.
Once it's got to the end of the string (i.e. the end of the page), it takes the pairs of start and end tags and outputs just those sections, all wrapped in HTML tags.
It's far from perfect. I think one of these days I'll make the ContentType a bit smarter, and maybe handle the errors a bit nicer, but at the end of the day, it's all there.
In use, it looks something like this:
ASPX:
<%@ Page Language="vb" AutoEventWireup="false" Codebehind="WebForm1.aspx.vb" Inherits="WebSectionsTester.WebForm1"%>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
<title>WebForm1</title>
<meta name="GENERATOR" content="Microsoft Visual Studio .NET 7.1">
<meta name="CODE_LANGUAGE" content="Visual Basic .NET 7.1">
<meta name="vs_defaultClientScript" content="JavaScript">
<meta name="vs_targetSchema" content="http://schemas.microsoft.com/intellisense/ie5">
</HEAD>
<body>
<form id="Form1" method="post" runat="server">
<!--Para12-->
Pick section:<br>
<!--/Para12-->
<asp:LinkButton ID="AllContent" Runat="server" text="All Content" /><br>
<asp:LinkButton ID="Para1" Runat="server" text="Para 1" /><br>
<asp:LinkButton ID="Para12" Runat="server" text="Para 12" /><br>
<!--AllContent-->
<h1>This is the Title</h1>
<!--Para1-->
<!--Para12-->
<p>This is paragraph 1</p>
<!--/Para1-->
<p>This is paragraph 2</p>
<!--/Para12-->
<!--/AllContent-->
</form>
</body>
</HTML>
Code-Behind:
...
Private Sub Page_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
'Put user code to initialize the page here
End Sub
Public Sub ProduceContent(ByVal sender As Object, ByVal e As EventArgs) Handles AllContent.Click, Para1.Click, Para12.Click
Dim ModeString As String = CType(sender, LinkButton).ID
Dim theMode As New PageMode("Download", ModeString, PageModeContentType.web)
Me.Mode = theMode
End Sub
...
As you can see, you can create overlapping parts, you can have sections that are split into several parts. I think it's quite good (even if i do say so myself).
Sure, it's not perfect. I think it might be lighter on the server to parse it using Regular Expressions, rather than String.IndexOf(), and one of these days I'm going to get rid of that ContentType Enum and the Select Case in Render. But that's another thing for another day.
Just wanted to share. I promised something vaguely .NET related, didn't I.