public class HTMLToString
extends Object
Provides a static convert() method that converts an HTML file into an XML
string that can be pre-filtered and added to a Lucene database by the
XMLTextProcessor class.

| Modifier and Type | Field and Description |
|---|---|
| private static HashMap | htmlCodeMap - HashMap built from the htmlCodes table. |
| (package private) static String[] | htmlCodes - Table of conversions from HTML ampersand codes to Unicode. |
| (package private) static Tidy | tidy - The HTMLTidy object that does the conversion work. |

| Constructor and Description |
|---|
| HTMLToString() |

| Modifier and Type | Method and Description |
|---|---|
| static String | convert(InputStream htmlInputStream) - Convert an HTML file into an HTMLTidy style XML string. |
| static String | replaceHtmlCodes(String in) - Convert any non-XML ampersand codes within a string to their Unicode equivalents. |

static Tidy tidy
static final String[] htmlCodes
private static HashMap htmlCodeMap
public static String convert(InputStream htmlInputStream)
htmlInputStream - Stream of HTML text to convert to an XML string.
null.
public static String replaceHtmlCodes(String in)
in - The string within which to convert codes.
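
The relationship between the htmlCodes table, the htmlCodeMap lookup, and replaceHtmlCodes() can be illustrated with a small stand-alone sketch. This is not the HTMLToString source: the class name `EntityCodeSketch`, the method name `replaceCodes`, the alternating code/replacement layout of the table, and the handful of entries shown are all assumptions made for illustration. The sketch leaves the five predefined XML entities (such as `&amp;` and `&lt;`) and any unrecognized codes untouched, which is one plausible reading of "non-XML ampersand codes".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for illustration; not the HTMLToString implementation. */
final class EntityCodeSketch {

    // Assumed layout: a flat table of alternating HTML code / Unicode replacement pairs.
    static final String[] CODES = {
        "&nbsp;",   "\u00A0",
        "&copy;",   "\u00A9",
        "&eacute;", "\u00E9",
        "&euro;",   "\u20AC",
    };

    // Lookup map built once from the flat table above.
    private static final Map<String, String> CODE_MAP = new HashMap<>();
    static {
        for (int i = 0; i + 1 < CODES.length; i += 2) {
            CODE_MAP.put(CODES[i], CODES[i + 1]);
        }
    }

    // Matches named ampersand codes such as "&copy;".
    private static final Pattern ENTITY = Pattern.compile("&[A-Za-z][A-Za-z0-9]*;");

    /**
     * Replaces every code found in the map with its Unicode character.
     * Codes that are not in the map (including the XML entities
     * &amp;amp; &amp;lt; &amp;gt; &amp;quot; &amp;apos;) are copied through unchanged.
     */
    static String replaceCodes(String in) {
        Matcher m = ENTITY.matcher(in);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = CODE_MAP.getOrDefault(m.group(), m.group());
            // quoteReplacement guards against '$' or '\' in the replacement text.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceCodes("Caf&eacute; &amp; bar &copy; 2004"));
    }
}
```

Building the map once in a static initializer keeps each lookup constant-time, and scanning with a single regular expression avoids one pass over the string per table entry.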