java.lang.Object
  com.openexchange.mail.text.HTMLProcessing
public final class HTMLProcessing
HTMLProcessing - Various methods for HTML processing.
| Field Summary | |
|---|---|
static java.util.regex.Pattern |
PATTERN_LINK
The regular expression to match URLs and anchors inside text. |
static java.util.regex.Pattern |
PATTERN_LINK_WITH_GROUP
The regular expression to match URLs and anchors inside text. |
static java.util.regex.Pattern |
PATTERN_URL
The regular expression to match URLs inside text. |
| Method Summary | |
|---|---|
static java.lang.String |
convertAndKeepQuotes(java.lang.String htmlContent,
Html2TextConverter converter)
Converts given HTML content into plain text, but keeps <blockquote> tags if any present. |
static org.w3c.dom.Document |
createDOMDocument(java.lang.String string)
Creates a DOM document from specified XML/HTML string. |
static java.lang.String |
filterExternalImages(java.lang.String htmlContent,
boolean[] modified)
Filters externally loaded images out of specified HTML content. |
static java.lang.String |
filterInlineImages(java.lang.String content,
com.openexchange.session.Session session,
MailPath msgUID)
Filters inline images occurring in the HTML content of a message. The source of an inline image is in the message itself. |
static java.lang.String |
filterWhitelist(java.lang.String htmlContent)
Filters specified HTML content according to white-list filter. |
static java.lang.String |
formatContentForDisplay(java.lang.String content,
java.lang.String charset,
boolean isHtml,
com.openexchange.session.Session session,
MailPath mailPath,
UserSettingMail usm,
boolean[] modified,
DisplayMode mode)
Performs all the formatting for both text and HTML content for a proper display according to specified user's mail settings. |
static java.lang.String |
formatHrefLinks(java.lang.String content)
Searches for non-HTML links and converts them to valid HTML links. |
static java.lang.String |
formatHTMLForDisplay(java.lang.String content,
java.lang.String charset,
com.openexchange.session.Session session,
MailPath mailPath,
UserSettingMail usm,
boolean[] modified,
DisplayMode mode)
Performs all the formatting for HTML content for a proper display according to specified user's mail settings. |
static java.lang.String |
formatTextForDisplay(java.lang.String content,
UserSettingMail usm,
DisplayMode mode)
Performs all the formatting for text content for a proper display according to specified user's mail settings. |
static java.lang.String |
getConformHTML(java.lang.String htmlContent,
ContentType contentType)
Creates valid HTML that conforms to W3C standards from the specified HTML content. |
static java.lang.String |
getConformHTML(java.lang.String htmlContent,
java.lang.String charset)
Creates valid HTML that conforms to W3C standards from the specified HTML content. |
static java.lang.Character |
getHTMLEntity(java.lang.String entity)
Maps specified HTML entity, e.g. &uuml;, to the corresponding character. |
static java.io.InputStream |
getTidyMessages()
Gets the messages used by JTidy as an input stream. |
static java.lang.String |
htmlFormat(java.lang.String plainText)
Formats plain text to HTML by escaping HTML special characters, e.g. "<" is converted to "&lt;". |
static java.lang.String |
htmlFormat(java.lang.String plainText,
boolean withQuote)
Formats plain text to HTML by escaping HTML special characters, e.g. "<" is converted to "&lt;". |
static java.lang.String |
prettyPrint(java.lang.String htmlContent)
Pretty prints specified HTML content. |
static java.lang.String |
prettyPrintXML(org.w3c.dom.Node node)
Pretty-prints specified XML/HTML node. |
static java.lang.String |
prettyPrintXML(java.lang.String string)
Pretty-prints specified XML/HTML string. |
static java.lang.String |
replaceHTMLEntities(java.lang.String content)
Replaces all HTML entities occurring in specified HTML content. |
static java.lang.String |
replaceHTMLSimpleQuotesForDisplay(java.lang.String htmlText)
Turns all simple quotes "> " occurring in specified HTML text to colored "<blockquote>" tags according to configured quote colors. |
static java.lang.String |
urlEncodeSafe(java.lang.String text,
java.lang.String charset)
Translates specified string into application/x-www-form-urlencoded format using a specific encoding scheme. |
static java.lang.String |
validate(java.lang.String htmlContent)
Validates specified HTML content with tidy html library. |
| Methods inherited from class java.lang.Object |
|---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final java.util.regex.Pattern PATTERN_URL
\(?\b(?:https?://|ftp://|mailto:|news\\.|www\.)[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|]
Parentheses, if present, are allowed in the URL, and a leading one is absorbed into the match as well. A surrounding pair can be stripped from a match as follows:
    String s = matcher.group();
    int mlen = s.length() - 1;
    if (mlen > 0 && '(' == s.charAt(0) && ')' == s.charAt(mlen)) {
        s = s.substring(1, mlen);
    }
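To show the parenthesis handling end to end, here is a minimal, self-contained sketch. It compiles a local copy of the expression documented above rather than using the `PATTERN_URL` field itself. The class name `UrlMatchSketch` and the method `extractUrls` are hypothetical, and the `news` alternative is written with a single escaped dot, which is an assumption about the intended pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UrlMatchSketch {

    // Local copy of the documented expression (assumption: "news\." is the intended alternative).
    static final Pattern URL = Pattern.compile(
        "\\(?\\b(?:https?://|ftp://|mailto:|news\\.|www\\.)"
            + "[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|]");

    /** Returns every URL found in the text, without an enclosing pair of parentheses. */
    static List<String> extractUrls(String text) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = URL.matcher(text);
        while (matcher.find()) {
            String s = matcher.group();
            int mlen = s.length() - 1;
            // The pattern absorbs a leading "(", so drop a surrounding pair if present.
            if (mlen > 0 && '(' == s.charAt(0) && ')' == s.charAt(mlen)) {
                s = s.substring(1, mlen);
            }
            urls.add(s);
        }
        return urls;
    }

    public static void main(String[] args) {
        System.out.println(extractUrls("See (http://example.com/path) or www.example.org."));
    }
}
```

Because the last character class of the expression excludes `.`, `,` and `;`, trailing sentence punctuation is left out of the match.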
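public static final java.util.regex.Pattern PATTERN_LINK
The regular expression to match URLs and anchors inside text. A surrounding pair of parentheses can be stripped from a match as follows:
    String s = matcher.group();
    int mlen = s.length() - 1;
    if (mlen > 0 && '(' == s.charAt(0) && ')' == s.charAt(mlen)) {
        s = s.substring(1, mlen);
    }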
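public static final java.util.regex.Pattern PATTERN_LINK_WITH_GROUP
The regular expression to match URLs and anchors inside text. The match is read from capturing group 1, and a surrounding pair of parentheses can be stripped as follows:
    String s = matcher.group(1);
    int mlen = s.length() - 1;
    if (mlen > 0 && '(' == s.charAt(0) && ')' == s.charAt(mlen)) {
        s = s.substring(1, mlen);
    }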
| Method Detail |
|---|
public static java.lang.String formatTextForDisplay(java.lang.String content,
UserSettingMail usm,
DisplayMode mode)
content - The plain text content
usm - The settings used for formatting content
mode - The display mode
formatContentForDisplay(String, String, boolean, Session, MailPath, UserSettingMail, boolean[], DisplayMode)
public static java.lang.String formatHTMLForDisplay(java.lang.String content,
java.lang.String charset,
com.openexchange.session.Session session,
MailPath mailPath,
UserSettingMail usm,
boolean[] modified,
DisplayMode mode)
content - The HTML content
charset - The character encoding
session - The session
mailPath - The message's unique path in mailbox
usm - The settings used for formatting content
modified - A boolean array with length 1 to store modified status of external images filter
mode - The display mode
formatContentForDisplay(String, String, boolean, Session, MailPath, UserSettingMail, boolean[], DisplayMode)
public static java.lang.String formatContentForDisplay(java.lang.String content,
java.lang.String charset,
boolean isHtml,
com.openexchange.session.Session session,
MailPath mailPath,
UserSettingMail usm,
boolean[] modified,
DisplayMode mode)
If content is plain text:
[...] DisplayMode.MODIFYABLE is given
[...] DisplayMode.DISPLAY is given
[...] DisplayMode.DISPLAY is given
[...] DisplayMode.DISPLAY is given
content - The content
charset - The character encoding (only needed by HTML content; may be null on plain text)
isHtml - true if content is of type text/html; otherwise false
session - The session
mailPath - The message's unique path in mailbox
usm - The settings used for formatting content
modified - A boolean array with length 1 to store modified status of external images filter (only needed by HTML content; may be null on plain text)
mode - The display mode
public static java.lang.String formatHrefLinks(java.lang.String content)
Example: http://www.somewhere.com is converted to
<a href="http://www.somewhere.com">http://www.somewhere.com</a>.
content - The content to search in
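The conversion in the example above can be sketched with plain `java.util.regex`. This is an illustration under stated assumptions, not the class's implementation: the class `HrefLinkSketch`, the method `linkify` and the simplified pattern `BARE_URL` are all hypothetical, and the sketch does not skip URLs that already sit inside an anchor or escape characters such as `&` in the `href` value.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HrefLinkSketch {

    // Simplified URL pattern for illustration only; it is not the class's PATTERN_URL.
    private static final Pattern BARE_URL = Pattern.compile(
        "\\b(?:https?://|ftp://)[-A-Za-z0-9+&@#/%?=~_|!:,.;]*[-A-Za-z0-9+&@#/%=~_|]");

    /** Wraps every bare URL in an anchor element; existing anchors are not detected. */
    static String linkify(String content) {
        Matcher matcher = BARE_URL.matcher(content);
        StringBuffer out = new StringBuffer(content.length() + 64);
        while (matcher.find()) {
            String url = matcher.group();
            String anchor = "<a href=\"" + url + "\">" + url + "</a>";
            // quoteReplacement keeps "$" and "\" in the URL from being read as group references.
            matcher.appendReplacement(out, Matcher.quoteReplacement(anchor));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(linkify("Go to http://www.somewhere.com."));
    }
}
```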
public static java.lang.String getConformHTML(java.lang.String htmlContent,
ContentType contentType)
htmlContent - The HTML content
contentType - The corresponding content type (including charset parameter)
public static java.lang.String getConformHTML(java.lang.String htmlContent,
java.lang.String charset)
htmlContent - The HTML content
charset - The charset parameter
public static org.w3c.dom.Document createDOMDocument(java.lang.String string)
Creates a DOM document from specified XML/HTML string.
string - The XML/HTML string
Returns null if given string cannot be transformed to a DOM document
public static java.lang.String prettyPrintXML(java.lang.String string)
string - The XML/HTML string to pretty-print
public static java.lang.String prettyPrintXML(org.w3c.dom.Node node)
node - The XML/HTML node to pretty-print
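The contract of createDOMDocument (a document, or null when the string cannot be parsed) and of the two prettyPrintXML variants can be approximated with the JDK's own JAXP classes. The sketch below is an assumption about one possible approach; the class `DomSketch` and its methods are hypothetical, and it only handles well-formed markup, whereas the documented methods may rely on a different parser such as JTidy.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class DomSketch {

    /** Parses well-formed markup into a DOM document; returns null if parsing fails. */
    static Document parse(String markup) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(markup)));
        } catch (Exception e) {
            // Covers parser configuration, SAX and I/O failures alike.
            return null;
        }
    }

    /** Serializes a node with indentation; returns null if serialization fails. */
    static String prettyPrint(Node node) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(node), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            return null;
        }
    }

    /** String variant: parse first, then delegate to the node variant. */
    static String prettyPrint(String markup) {
        Document document = parse(markup);
        return document == null ? null : prettyPrint(document);
    }
}
```

The exact indentation produced by the transformer differs between JDK versions, so callers should not depend on a particular whitespace layout.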
public static java.io.InputStream getTidyMessages()
throws java.io.IOException
java.io.IOException - If input stream cannot be generated
public static java.lang.String validate(java.lang.String htmlContent)
htmlContent - The HTML content
public static java.lang.String prettyPrint(java.lang.String htmlContent)
htmlContent - The HTML content
public static java.lang.String convertAndKeepQuotes(java.lang.String htmlContent,
Html2TextConverter converter)
throws java.io.IOException
Converts given HTML content into plain text, but keeps <blockquote> tags if any are present.
htmlContent - The HTML content
converter - The instance of Html2TextConverter
java.io.IOException - If an I/O error occurs
public static java.lang.String replaceHTMLEntities(java.lang.String content)
content - The content
public static java.lang.Character getHTMLEntity(java.lang.String entity)
Maps specified HTML entity, e.g. &uuml;, to the corresponding character.
entity - The HTML entity
null
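A lookup of this kind can be sketched as a small table from entity names to characters. Everything in the block is hypothetical: the class `EntitySketch`, the method `lookup`, the handful of entries, and the choice to accept both `&name;` and bare `name` forms are assumptions, since the page does not say which forms or entities the real method supports.

```java
import java.util.HashMap;
import java.util.Map;

final class EntitySketch {

    // A tiny illustrative subset; a real table would cover all named HTML entities.
    private static final Map<String, Character> ENTITIES = new HashMap<>();
    static {
        ENTITIES.put("uuml", '\u00FC');
        ENTITIES.put("auml", '\u00E4');
        ENTITIES.put("amp", '&');
        ENTITIES.put("lt", '<');
        ENTITIES.put("gt", '>');
    }

    /** Looks up an entity given as "&name;" or "name"; returns null if it is not in the table. */
    static Character lookup(String entity) {
        if (entity == null) {
            return null;
        }
        String key = entity;
        if (key.startsWith("&")) {
            key = key.substring(1);
        }
        if (key.endsWith(";")) {
            key = key.substring(0, key.length() - 1);
        }
        return ENTITIES.get(key);
    }
}
```

Returning a boxed `Character` rather than a primitive `char` is what makes a null result possible for unknown entities.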
public static java.lang.String htmlFormat(java.lang.String plainText,
boolean withQuote)
Formats plain text to HTML by escaping HTML special characters, e.g. "<" is converted to "&lt;".
plainText - The plain text
withQuote - Whether to escape quotes (") or not
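The escaping rule and the effect of the withQuote flag can be illustrated with a short sketch. The class `HtmlEscapeSketch` and its `escape` methods are hypothetical stand-ins, and the sketch only escapes `<`, `>`, `&` and optionally `"`; whether the documented method performs further conversions (line breaks, for example) is not stated on this page.

```java
final class HtmlEscapeSketch {

    /** Escapes HTML special characters; double quotes are escaped only if withQuote is true. */
    static String escape(String plainText, boolean withQuote) {
        StringBuilder sb = new StringBuilder(plainText.length() + 16);
        for (int i = 0; i < plainText.length(); i++) {
            char c = plainText.charAt(i);
            switch (c) {
                case '<':
                    sb.append("&lt;");
                    break;
                case '>':
                    sb.append("&gt;");
                    break;
                case '&':
                    sb.append("&amp;");
                    break;
                case '"':
                    sb.append(withQuote ? "&quot;" : "\"");
                    break;
                default:
                    sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Convenience overload mirroring the one-argument variant: quotes are escaped. */
    static String escape(String plainText) {
        return escape(plainText, true);
    }

    public static void main(String[] args) {
        System.out.println(escape("1 < 2 & \"quoted\""));
    }
}
```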
public static java.lang.String htmlFormat(java.lang.String plainText)
Formats plain text to HTML by escaping HTML special characters, e.g. "<" is converted to "&lt;".
This is just a convenience method which invokes htmlFormat(String, boolean) with the latter parameter set to true.
plainText - The plain text
htmlFormat(String, boolean)
public static java.lang.String replaceHTMLSimpleQuotesForDisplay(java.lang.String htmlText)
htmlText - The HTML text
public static java.lang.String filterWhitelist(java.lang.String htmlContent)
htmlContent - The HTML content
public static java.lang.String filterExternalImages(java.lang.String htmlContent,
boolean[] modified)
htmlContent - The HTML content
modified - A boolean array with length 1 to store modified status
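The one-element boolean array acts as an out-parameter: the method returns the filtered HTML and reports through `modified[0]` whether anything was changed. The sketch below illustrates that calling convention only. The class `ExternalImageSketch`, the method `blankExternalSources`, the regular expression, and the decision to blank the `src` value are all assumptions; the page does not say how the real filter rewrites the markup.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ExternalImageSketch {

    // Matches the src attribute of an <img> tag when it holds an http(s) or
    // protocol-relative URL. Group 1 is the text up to the quote, group 2 the quote itself.
    private static final Pattern EXTERNAL_SRC = Pattern.compile(
        "(<img\\b[^>]*?\\bsrc\\s*=\\s*)([\"'])(?:https?:)?//[^\"']*\\2",
        Pattern.CASE_INSENSITIVE);

    /** Blanks external image sources and records in modified[0] whether any were found. */
    static String blankExternalSources(String htmlContent, boolean[] modified) {
        Matcher matcher = EXTERNAL_SRC.matcher(htmlContent);
        if (!matcher.find()) {
            return htmlContent;
        }
        if (modified != null && modified.length > 0) {
            modified[0] = true;
        }
        // replaceAll resets the matcher, so every occurrence is rewritten.
        return matcher.replaceAll("$1$2$2");
    }
}
```

A caller would typically allocate `new boolean[1]`, pass it in, and use the flag afterwards to decide whether to tell the user that images were suppressed.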
public static java.lang.String filterInlineImages(java.lang.String content,
com.openexchange.session.Session session,
MailPath msgUID)
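Filters inline images occurring in the HTML content of a message. The source of an inline image is in the message itself, and the image is referenced by its Content-Id; e.g.: <img src="cid:[cid-value]" ... />.
content - The HTML content possibly containing images
session - The session
msgUID - The message's unique path in mailbox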
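To make the `cid:` reference form concrete, the following sketch collects the Content-Id values referenced by image tags. It only shows how such references can be recognized; the class `InlineImageSketch`, the method `contentIds` and the pattern are hypothetical, and what the documented method does with the references once found is not described on this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class InlineImageSketch {

    // Group 1 is the quote character, group 2 the Content-Id value after "cid:".
    private static final Pattern CID_SRC = Pattern.compile(
        "<img\\b[^>]*?\\bsrc\\s*=\\s*([\"'])cid:([^\"']+)\\1",
        Pattern.CASE_INSENSITIVE);

    /** Returns the Content-Id values referenced by <img src="cid:..."> tags, in document order. */
    static List<String> contentIds(String htmlContent) {
        List<String> ids = new ArrayList<>();
        Matcher matcher = CID_SRC.matcher(htmlContent);
        while (matcher.find()) {
            ids.add(matcher.group(2));
        }
        return ids;
    }
}
```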
public static java.lang.String urlEncodeSafe(java.lang.String text,
java.lang.String charset)
text - The string to be translated.
charset - The character encoding to use; should be UTF-8 according to W3C
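The JDK's `java.net.URLEncoder.encode(String, String)` produces the same application/x-www-form-urlencoded format but throws a checked `UnsupportedEncodingException` for an unknown charset name. A "safe" wrapper can be sketched as below. The class `UrlEncodeSketch` and the method `encodeOrKeep` are hypothetical, and returning the input unchanged on failure is an assumption about what "safe" means here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class UrlEncodeSketch {

    /** URL-encodes the text; falls back to the unencoded text if the charset is unsupported. */
    static String encodeOrKeep(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (UnsupportedEncodingException e) {
            // Assumed fallback: hand the text back untouched instead of propagating the error.
            return text;
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeOrKeep("a b&c", "UTF-8"));
    }
}
```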