java.lang.Object
  com.quiotix.html.parser.HtmlVisitor
    com.quiotix.html.parser.HtmlScrubber
public class HtmlScrubber extends HtmlVisitor
HtmlScrubber is a Visitor which walks an HtmlDocument and cleans it up. It can change tags and tag attributes to uppercase or lowercase, strip out unnecessary quotes from attribute values, and strip trailing spaces before a newline.
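The clean-up operations described above can be illustrated with a small standalone sketch. This is not the HtmlScrubber source: the class `AttributeScrubSketch` and its methods are hypothetical, and the rule for when quotes are "unnecessary" is an assumption taken from the HTML 4 rule that an unquoted attribute value may contain only letters, digits, hyphens, periods, underscores, and colons.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of com.quiotix.html.parser.
final class AttributeScrubSketch {
    // Assumption: a value may go unquoted if it contains only these characters
    // (letters, digits, '.', '_', ':', '-'), per the HTML 4 unquoted-value rule.
    private static final Pattern UNQUOTED_SAFE = Pattern.compile("[A-Za-z0-9._:-]+");

    // Remove matching surrounding quotes when the inner value is safe unquoted.
    static String stripQuotes(String value) {
        if (value.length() >= 2) {
            char first = value.charAt(0);
            char last = value.charAt(value.length() - 1);
            if ((first == '"' || first == '\'') && first == last) {
                String inner = value.substring(1, value.length() - 1);
                if (UNQUOTED_SAFE.matcher(inner).matches()) {
                    return inner;
                }
            }
        }
        return value; // leave anything else untouched
    }

    // Lowercase a tag or attribute name in a locale-independent way.
    static String downcaseName(String name) {
        return name.toLowerCase(Locale.ROOT);
    }

    // Drop spaces and tabs at the end of a text run (as before a newline).
    static String trimTrailingSpaces(String text) {
        int end = text.length();
        while (end > 0 && (text.charAt(end - 1) == ' ' || text.charAt(end - 1) == '\t')) {
            end--;
        }
        return text.substring(0, end);
    }

    public static void main(String[] args) {
        System.out.println(stripQuotes("\"center\""));
        System.out.println(stripQuotes("\"hello world\""));
        System.out.println(downcaseName("HREF"));
        System.out.println("[" + trimTrailingSpaces("some text   ") + "]");
    }
}
```

In this sketch a value such as `"hello world"` keeps its quotes because it contains a space; the real class may apply a different rule.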
Field Summary | | |
---|---|---|
static int | ATTR_DOWNCASE | Set attribute case to lower. |
static int | ATTR_UPCASE | Set attribute case to upper. |
static int | DEFAULT_OPTIONS | Defaults: downcase tags and attributes, quote attributes. |
protected int | flags | |
protected boolean | inPreBlock | |
protected HtmlDocument.HtmlElement | previousElement | |
static int | QUOTE_ATTRS | Quote attributes. |
static int | STRIP_QUOTES | Remove quotes. |
static int | TAGS_DOWNCASE | Set tag case to lower. |
static int | TAGS_UPCASE | Set tag case to upper. |
static int | TRIM_SPACES | Trim spaces. |
Constructor Summary | |
---|---|
HtmlScrubber() | Create an HtmlScrubber with the default options (downcase tags and tag attributes, strip out unnecessary quotes). |
HtmlScrubber(int flags) | Create an HtmlScrubber with the desired set of options. |
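Because the `flags` argument is a bitmask, options are combined with bitwise OR and tested with bitwise AND. The sketch below shows that mechanism in isolation. The class `ScrubFlagsSketch`, the helper `isSet`, and above all the numeric values are assumptions for illustration: this page does not list the actual values of the HtmlScrubber constants.

```java
// Hypothetical illustration of bitmask options; not the HtmlScrubber source.
final class ScrubFlagsSketch {
    // Assumed values: each option occupies its own bit so options can be OR-ed.
    static final int TAGS_UPCASE   = 1;
    static final int TAGS_DOWNCASE = 2;
    static final int ATTR_UPCASE   = 4;
    static final int ATTR_DOWNCASE = 8;
    static final int STRIP_QUOTES  = 16;
    static final int TRIM_SPACES   = 32;
    static final int QUOTE_ATTRS   = 64;

    // True when the given option bit is present in the mask.
    static boolean isSet(int flags, int option) {
        return (flags & option) != 0;
    }

    public static void main(String[] args) {
        // Combine several options into one mask.
        int flags = TAGS_DOWNCASE | ATTR_DOWNCASE | TRIM_SPACES;

        System.out.println("trim spaces?  " + isSet(flags, TRIM_SPACES));
        System.out.println("upcase tags?  " + isSet(flags, TAGS_UPCASE));
    }
}
```

With the library itself, the equivalent call would presumably look like `new HtmlScrubber(HtmlScrubber.TAGS_DOWNCASE | HtmlScrubber.TRIM_SPACES)`, using the constants from the Field Summary. How the class resolves conflicting options (for example both TAGS_UPCASE and TAGS_DOWNCASE) is not stated on this page.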
Method Summary | | |
---|---|---|
void | start() | Start. |
void | visit(HtmlDocument.Annotation a) | Visit an Annotation. |
void | visit(HtmlDocument.Comment c) | Visit a Comment. |
void | visit(HtmlDocument.EndTag t) | Visit an EndTag. |
void | visit(HtmlDocument.Newline n) | Visit a Newline. |
void | visit(HtmlDocument.Tag t) | Visit a Tag. |
void | visit(HtmlDocument.TagBlock bl) | Visit a TagBlock. |
void | visit(HtmlDocument.Text t) | Visit Text. |
Methods inherited from class com.quiotix.html.parser.HtmlVisitor |
---|
finish, visit, visit |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final int TAGS_UPCASE
public static final int TAGS_DOWNCASE
public static final int ATTR_UPCASE
public static final int ATTR_DOWNCASE
public static final int STRIP_QUOTES
public static final int TRIM_SPACES
public static final int QUOTE_ATTRS
public static final int DEFAULT_OPTIONS
protected int flags
protected HtmlDocument.HtmlElement previousElement
protected boolean inPreBlock
Constructor Detail |
---|
public HtmlScrubber()
public HtmlScrubber(int flags)
Parameters: flags - A bitmask representing the desired scrubbing options
Method Detail |
---|
public void start()
Overrides: start in class HtmlVisitor
public void visit(HtmlDocument.Tag t)
Overrides: visit in class HtmlVisitor
public void visit(HtmlDocument.EndTag t)
Overrides: visit in class HtmlVisitor
public void visit(HtmlDocument.Text t)
Overrides: visit in class HtmlVisitor
public void visit(HtmlDocument.Comment c)
Overrides: visit in class HtmlVisitor
public void visit(HtmlDocument.Newline n)
Overrides: visit in class HtmlVisitor
public void visit(HtmlDocument.Annotation a)
Overrides: visit in class HtmlVisitor
public void visit(HtmlDocument.TagBlock bl)
Overrides: visit in class HtmlVisitor