Strip HTML from a String
While not exactly earth-shattering, I thought I would post some handy code. Recently, I had to save some data from a FreeTextBox in both HTML and plain-text form (so that it could be full-text indexed). There are a number of components out there for this, but I knew the IE HTML DOM was very liberal in its interpretation of HTML and therefore a good choice, not to mention that it would be hard to find a system without this component installed. (You do need to add an interop reference to mshtml.tlb, though.)
Without further ado, the code:
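Protected Function StripHtml(ByVal Html As String) As String
    ' Create an in-memory IE document; its parser tolerates sloppy markup.
    Dim Doc As New mshtml.HTMLDocument()
    ' write() and close() are defined on IHTMLDocument2, so cast to that interface.
    Dim d2 As mshtml.IHTMLDocument2 = DirectCast(Doc, mshtml.IHTMLDocument2)
    d2.write(Html)
    d2.close()
    ' innerText returns the body's content with the tags removed.
    Return d2.body.innerText
End Function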
Happy coding,
- Brent