pulse - the web application framework

org.torweg.pulse.util.xml
Class XMLConverter

java.lang.Object
  extended by org.torweg.pulse.util.xml.XMLConverter

public final class XMLConverter
extends java.lang.Object

a utility for different XML conversion scenarios.

Version:
$Revision: 2063 $
Author:
Thomas Weber

Method Summary
static org.jdom.Element cleanHTML(java.io.Reader input)
          cleans up faulty XHTML and returns a complete Document allowing unknown elements.
static org.jdom.Element cleanHTML(java.io.Reader input, boolean ignoreUnknownElements)
          cleans up faulty XHTML and returns a complete Document.
static org.jdom.Element cleanHTML(java.lang.String input)
          cleans up faulty XHTML and returns a complete Document allowing unknown elements.
static org.jdom.Element cleanHTML(java.lang.String input, boolean ignoreUnknownElements)
          cleans up faulty XHTML and returns a complete Document.
static java.util.Date fromDateTime(java.lang.String dt)
          converts an RFC 3339 compliant date-time string to a Date.
static org.jdom.Document getCleanedXHTML(java.io.InputStream xhtmlInput)
          returns a cleaned Document for the given XHTML input.
static org.jdom.Document getCleanedXHTML(java.io.Reader xhtmlInput)
          returns a cleaned Document for the given XHTML input.
static org.jdom.Document getCleanedXHTML(java.lang.String xhtmlInput)
          returns a cleaned Document for the given XHTML input.
static java.lang.String getCompactString(org.jdom.Document d, boolean omitDecl)
          returns a compact XML string for the given Document.
static java.lang.String getCompactString(org.jdom.Element e, boolean omitDecl)
          returns a compact XML string for the given Element.
static org.jdom.Document getFilteredXML(org.jdom.Document xml, XSLHandle filter)
          uses the given XSLHandle to filter XML input.
static org.jdom.Document getFilteredXML(org.jdom.Element xml, XSLHandle filter)
          uses the given XSLHandle to filter XML input.
static org.jdom.Document getFilteredXML(java.lang.String xml, XSLHandle filter)
          uses the given XSLHandle to filter XML input.
static java.lang.String getFormattedXHTML(java.lang.String xhtmlInput)
          returns a formatted version of the given XHTML input.
static java.lang.String getHTMLText(org.jdom.Document html)
          extracts the textual content of a Document representing HTML code.
static java.lang.String getHTMLText(org.jdom.Element html)
          extracts the textual content of an Element representing HTML code.
static java.lang.String getPrettyString(org.jdom.Document d, boolean omitDecl)
          returns a pretty indented, human readable XML string for the given Document.
static java.lang.String getPrettyString(org.jdom.Element e, boolean omitDecl)
          returns a pretty indented, human readable XML string for the given Element.
static java.lang.String getRawString(org.jdom.Document d, boolean omitDecl)
          returns a raw XML string for the given Document.
static java.lang.String getRawString(org.jdom.Element e, boolean omitDecl)
          returns a raw XML string for the given Element.
static org.w3c.dom.Document getW3CDocument(org.jdom.Document d)
          converts the given JDOM document to a W3C document.
static org.w3c.dom.Document getW3CDocument(org.jdom.Element e)
          converts the given JDOM element to a W3C document.
static java.lang.String marshal(java.lang.Object o)
          utility method to do marshaling of a JAXB object.
static org.jdom.Document resolveRelativeLinks(org.jdom.Document html, java.net.URI baseURI)
          resolves all href and src attributes to the given base URI.
static java.lang.String toDateTime(java.util.Date d)
          converts the given Date to an RFC 3339 compliant date-time string.
static java.lang.String toDateTime(long ts)
          converts the given time-stamp to an RFC 3339 compliant date-time string.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

cleanHTML

public static org.jdom.Element cleanHTML(java.io.Reader input)
                                  throws org.jdom.JDOMException,
                                         java.io.IOException
cleans up faulty XHTML and returns a complete Document allowing unknown elements.

Parameters:
input - a reader for reading the XHTML
Returns:
the cleaned XHTML string as a Document within a root node <html/>.
Throws:
java.io.IOException - on i/o errors
org.jdom.JDOMException - if the parsing of the HTML fails

cleanHTML

public static org.jdom.Element cleanHTML(java.io.Reader input,
                                         boolean ignoreUnknownElements)
                                  throws org.jdom.JDOMException,
                                         java.io.IOException
cleans up faulty XHTML and returns a complete Document.

Parameters:
input - a reader for reading the XHTML
ignoreUnknownElements - flag, indicating whether to ignore unknown elements
Returns:
the cleaned XHTML string as a Document within a root node <html/>.
Throws:
java.io.IOException - on i/o errors
org.jdom.JDOMException - if the parsing of the HTML fails

cleanHTML

public static org.jdom.Element cleanHTML(java.lang.String input)
                                  throws org.jdom.JDOMException,
                                         java.io.IOException
cleans up faulty XHTML and returns a complete Document allowing unknown elements.

Parameters:
input - a String containing the XHTML
Returns:
the cleaned XHTML string as a Document within a root node <html/>.
Throws:
java.io.IOException - on i/o errors
org.jdom.JDOMException - if the parsing of the HTML fails

cleanHTML

public static org.jdom.Element cleanHTML(java.lang.String input,
                                         boolean ignoreUnknownElements)
                                  throws org.jdom.JDOMException,
                                         java.io.IOException
cleans up faulty XHTML and returns a complete Document.

Parameters:
input - a String containing the XHTML
ignoreUnknownElements - flag, indicating whether to ignore unknown elements
Returns:
the cleaned XHTML string as a Document within a root node <html/>.
Throws:
java.io.IOException - on i/o errors
org.jdom.JDOMException - if the parsing of the HTML fails

getCompactString

public static java.lang.String getCompactString(org.jdom.Element e,
                                                boolean omitDecl)
returns a compact XML string for the given Element.

Parameters:
e - the element
omitDecl - true, if the XML declaration shall be omitted
Returns:
the compact string

getCompactString

public static java.lang.String getCompactString(org.jdom.Document d,
                                                boolean omitDecl)
returns a compact XML string for the given Document.

Parameters:
d - the document
omitDecl - true, if the XML declaration shall be omitted
Returns:
the compact string

getPrettyString

public static java.lang.String getPrettyString(org.jdom.Element e,
                                               boolean omitDecl)
returns a pretty indented, human readable XML string for the given Element.

Parameters:
e - the element
omitDecl - true, if the XML declaration shall be omitted
Returns:
the pretty string

getPrettyString

public static java.lang.String getPrettyString(org.jdom.Document d,
                                               boolean omitDecl)
returns a pretty indented, human readable XML string for the given Document.

Parameters:
d - the document
omitDecl - true, if the XML declaration shall be omitted
Returns:
the pretty string

getRawString

public static java.lang.String getRawString(org.jdom.Element e,
                                            boolean omitDecl)
returns a raw XML string for the given Element.

Parameters:
e - the element
omitDecl - true, if the XML declaration shall be omitted
Returns:
the raw string

getRawString

public static java.lang.String getRawString(org.jdom.Document d,
                                            boolean omitDecl)
returns a raw XML string for the given Document.

Parameters:
d - the document
omitDecl - true, if the XML declaration shall be omitted
Returns:
the raw string
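
The effect of the omitDecl flag shared by the getCompactString, getPrettyString and getRawString methods can be illustrated with the JDK alone. The following is only a sketch under assumptions: it serialises a W3C DOM document through javax.xml.transform rather than a JDOM document, and the class SerializeSketch and its methods are hypothetical names, not part of pulse. How XMLConverter produces its strings internally is not stated in this documentation.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper; illustrates omitDecl only, not the pulse implementation.
final class SerializeSketch {

    private SerializeSketch() {
    }

    /** Builds a tiny sample document: <root><child>text</child></root>. */
    static Document sample() {
        try {
            Document d = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = d.createElement("root");
            Element child = d.createElement("child");
            child.setTextContent("text");
            root.appendChild(child);
            d.appendChild(root);
            return d;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Serialises the document, optionally omitting the XML declaration. */
    static String toXml(Document d, boolean omitDecl) {
        try {
            // An identity transformer copies the source to the result unchanged.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION,
                    omitDecl ? "yes" : "no");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(d), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml(sample(), false)); // starts with <?xml ...?>
        System.out.println(toXml(sample(), true));  // starts with <root>
    }
}
```

With omitDecl set to true the result begins directly with the root element, which is usually what is wanted when the string is embedded in a larger document.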

getCleanedXHTML

public static org.jdom.Document getCleanedXHTML(java.lang.String xhtmlInput)
returns a cleaned Document for the given XHTML input.

Underneath tagsoup is used to auto-correct the input.

Parameters:
xhtmlInput - the XHTML to be converted to a Document
Returns:
the cleaned XHTML as a Document within a root node <body/>.
Throws:
IllegalXHTMLException - on errors cleaning the XHTML

getCleanedXHTML

public static org.jdom.Document getCleanedXHTML(java.io.Reader xhtmlInput)
returns a cleaned Document for the given XHTML input.

Underneath tagsoup is used to auto-correct the input.

Parameters:
xhtmlInput - the XHTML to be converted to a Document
Returns:
the cleaned XHTML as a Document within a root node <body/>.
Throws:
IllegalXHTMLException - on errors cleaning the XHTML

getCleanedXHTML

public static org.jdom.Document getCleanedXHTML(java.io.InputStream xhtmlInput)
returns a cleaned Document for the given XHTML input.

Underneath tagsoup is used to auto-correct the input.

Parameters:
xhtmlInput - the XHTML to be converted to a Document
Returns:
the cleaned XHTML as a Document within a root node <body/>.
Throws:
IllegalXHTMLException - on errors cleaning the XHTML

resolveRelativeLinks

public static org.jdom.Document resolveRelativeLinks(org.jdom.Document html,
                                                     java.net.URI baseURI)
resolves all href and src attributes to the given base URI.

Parameters:
html - the HTML to be processed
baseURI - the base URI for resolving
Returns:
the modified document
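
As a rough illustration of what resolving href and src attributes against a base URI involves, the sketch below walks a W3C DOM document and rewrites both attributes with java.net.URI.resolve. It is an assumption-laden stand-in: resolveRelativeLinks itself works on org.jdom.Document, and the class LinkResolveSketch with its methods is hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.net.URI;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper; not the pulse implementation.
final class LinkResolveSketch {

    private LinkResolveSketch() {
    }

    /** Parses well-formed XHTML from a string into a W3C DOM document. */
    static Document parse(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Rewrites every href and src attribute relative to baseURI, in place. */
    static Document resolveLinks(Document html, URI baseURI) {
        NodeList all = html.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Element el = (Element) all.item(i);
            for (String name : new String[] {"href", "src"}) {
                if (el.hasAttribute(name)) {
                    // URI.resolve leaves absolute URIs untouched; it throws
                    // IllegalArgumentException for syntactically invalid values.
                    URI resolved = baseURI.resolve(el.getAttribute(name));
                    el.setAttribute(name, resolved.toString());
                }
            }
        }
        return html;
    }

    public static void main(String[] args) {
        Document d = parse("<html><body><a href=\"../about.html\">about</a>"
                + "<img src=\"img/logo.png\"/></body></html>");
        resolveLinks(d, URI.create("http://example.org/docs/index.html"));
        Element a = (Element) d.getElementsByTagName("a").item(0);
        System.out.println(a.getAttribute("href")); // http://example.org/about.html
    }
}
```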

getFormattedXHTML

public static java.lang.String getFormattedXHTML(java.lang.String xhtmlInput)
returns a formatted version of the given XHTML input.

Parameters:
xhtmlInput - the XHTML to be formatted
Returns:
the formatted XHTML
Throws:
IllegalXHTMLException - on errors cleaning the XHTML

getFilteredXML

public static org.jdom.Document getFilteredXML(org.jdom.Document xml, XSLHandle filter)
uses the given XSLHandle to filter XML input.

The given XSL in the XSLHandle can be used to check user input for illegal or unwanted code and remove it automatically by transforming the input with the given XSL.

Parameters:
xml - the xml input
filter - the xsl filter
Returns:
the filtered xml

getFilteredXML

public static org.jdom.Document getFilteredXML(org.jdom.Element xml, XSLHandle filter)
uses the given XSLHandle to filter XML input.

The given XSL in the XSLHandle can be used to check user input for illegal or unwanted code and remove it automatically by transforming the input with the given XSL.

Parameters:
xml - the xml input
filter - the xsl filter
Returns:
the filtered xml

getFilteredXML

public static org.jdom.Document getFilteredXML(java.lang.String xml, XSLHandle filter)
uses the given XSLHandle to filter XML input.

The given XSL in the XSLHandle can be used to check user input for illegal or unwanted code and remove it automatically by transforming the input with the given XSL.

Parameters:
xml - the xml input
filter - the xsl filter
Returns:
the filtered xml
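
The filtering idea described for getFilteredXML, transforming user input with an XSL stylesheet that drops unwanted code, might look as follows when reduced to the JDK's javax.xml.transform API. This is a sketch under assumptions: the stylesheet is passed as a plain string instead of an XSLHandle, input and output are strings rather than JDOM objects, and XslFilterSketch, FILTER_XSL and filter are hypothetical names.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper; not the pulse implementation.
final class XslFilterSketch {

    /** Identity transform that drops script elements and onclick attributes. */
    static final String FILTER_XSL =
            "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:template match=\"@*|node()\">"
            + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
            + "</xsl:template>"
            // Empty templates: matched nodes produce no output.
            + "<xsl:template match=\"script|@onclick\"/>"
            + "</xsl:stylesheet>";

    private XslFilterSketch() {
    }

    /** Applies the given XSL to the given XML and returns the result. */
    static String filter(String xml, String xsl) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String input = "<div onclick=\"evil()\"><p>hi</p>"
                + "<script>alert(1)</script></div>";
        System.out.println(filter(input, FILTER_XSL));
    }
}
```

Because the filter is an ordinary stylesheet, the list of rejected elements and attributes can be changed without touching Java code.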

getHTMLText

public static java.lang.String getHTMLText(org.jdom.Element html)
extracts the textual content of an Element representing HTML code.

Parameters:
html - the HTML
Returns:
the textual content of the HTML

getHTMLText

public static java.lang.String getHTMLText(org.jdom.Document html)
extracts the textual content of a Document representing HTML code.

Parameters:
html - the HTML
Returns:
the textual content of the HTML
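
Extracting the textual content of markup can be sketched with the W3C DOM method getTextContent, which concatenates all text nodes below an element. The class HtmlTextSketch is a hypothetical stand-in; getHTMLText operates on JDOM objects, and how it treats whitespace or elements such as script is not specified here.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper; not the pulse implementation.
final class HtmlTextSketch {

    private HtmlTextSketch() {
    }

    /** Returns the concatenated text nodes of a well-formed XHTML string. */
    static String textOf(String xhtml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // prints "Hello world!"
        System.out.println(
                textOf("<html><body><p>Hello <b>world</b>!</p></body></html>"));
    }
}
```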

getW3CDocument

public static org.w3c.dom.Document getW3CDocument(org.jdom.Document d)
converts the given JDOM document to a W3C document.

Parameters:
d - the JDOM document
Returns:
the W3C document

getW3CDocument

public static org.w3c.dom.Document getW3CDocument(org.jdom.Element e)
converts the given JDOM element to a W3C document.

Parameters:
e - the JDOM element
Returns:
the W3C document

marshal

public static java.lang.String marshal(java.lang.Object o)
                                throws javax.xml.bind.JAXBException
utility method to do marshaling of a JAXB object.

Parameters:
o - the object to be marshaled
Returns:
the XML as a string
Throws:
javax.xml.bind.JAXBException - on errors

toDateTime

public static java.lang.String toDateTime(java.util.Date d)
converts the given Date to an RFC 3339 compliant date-time string.

Parameters:
d - the date to be converted
Returns:
the RFC 3339 compliant date-time

toDateTime

public static java.lang.String toDateTime(long ts)
converts the given time-stamp to an RFC 3339 compliant date-time string.

Parameters:
ts - the time-stamp
Returns:
the RFC 3339 compliant date-time
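
For readers unfamiliar with the format, here is a minimal sketch of producing an RFC 3339 date-time with java.time. It assumes the time-stamp is in milliseconds since the epoch and always renders in UTC; which time zone and precision XMLConverter.toDateTime actually uses is not documented here. Rfc3339FormatSketch is a hypothetical class.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Date;

// Hypothetical helper; not the pulse implementation.
final class Rfc3339FormatSketch {

    private Rfc3339FormatSketch() {
    }

    /** Formats epoch milliseconds as an RFC 3339 date-time in UTC. */
    static String toDateTime(long ts) {
        return DateTimeFormatter.ISO_OFFSET_DATE_TIME
                .format(Instant.ofEpochMilli(ts).atOffset(ZoneOffset.UTC));
    }

    /** Formats a java.util.Date by delegating to the time-stamp variant. */
    static String toDateTime(Date d) {
        return toDateTime(d.getTime());
    }

    public static void main(String[] args) {
        System.out.println(toDateTime(0L));          // 1970-01-01T00:00:00Z
        System.out.println(toDateTime(new Date()));  // current time in UTC
    }
}
```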

fromDateTime

public static java.util.Date fromDateTime(java.lang.String dt)
converts an RFC 3339 compliant date-time string to a Date.

Parameters:
dt - the RFC 3339 compliant date-time string
Returns:
the parsed Date or null, if the date-time could not be parsed.
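
The null-on-failure contract means callers have to check the result before using it. The sketch below mimics that contract with java.time; it is an assumption that OffsetDateTime.parse, which accepts the ISO-8601 offset format, covers the same inputs as the real method, and Rfc3339ParseSketch is a hypothetical class.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeParseException;
import java.util.Date;

// Hypothetical helper; not the pulse implementation.
final class Rfc3339ParseSketch {

    private Rfc3339ParseSketch() {
    }

    /** Parses a date-time such as 2001-02-03T04:05:06+01:00, or returns null. */
    static Date fromDateTime(String dt) {
        if (dt == null) {
            return null;
        }
        try {
            return Date.from(OffsetDateTime.parse(dt).toInstant());
        } catch (DateTimeParseException e) {
            // Mirror the documented contract: unparseable input yields null.
            return null;
        }
    }

    public static void main(String[] args) {
        Date parsed = fromDateTime("1970-01-01T01:00:00+01:00");
        if (parsed != null) {
            System.out.println(parsed.getTime()); // 0
        }
        System.out.println(fromDateTime("not a date")); // null
    }
}
```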