Package weka.core.xml
Class XMLDocument
java.lang.Object
weka.core.xml.XMLDocument
- All Implemented Interfaces:
RevisionHandler
- Direct Known Subclasses:
XMLInstances
This class offers some methods for generating, reading and writing
XML documents.
It can only handle UTF-8.
- Version:
- $Revision: 8034 $
- Author:
- FracPete (fracpete at waikato dot ac dot nz)
-
Field Summary
Modifier and Type | Field | Description
static final String | ATT_NAME | the "name" attribute.
static final String | ATT_VERSION | the "version" attribute.
static final String | DTD_ANY | the ANY placeholder.
static final String | DTD_AT_LEAST_ONE | the at least one marker.
static final String | DTD_ATTLIST | the AttList definition.
static final String | DTD_CDATA | the CDATA placeholder.
static final String | DTD_DOCTYPE | the DocType definition.
static final String | DTD_ELEMENT | the Element definition.
static final String | DTD_IMPLIED | the #IMPLIED placeholder.
static final String | DTD_OPTIONAL | the optional marker.
static final String | DTD_PCDATA | the #PCDATA placeholder.
static final String | DTD_REQUIRED | the #REQUIRED placeholder.
static final String | DTD_SEPARATOR | the option separator.
static final String | DTD_ZERO_OR_MORE | the zero or more marker.
static final String | PI | the processing instruction "<?xml version="1.0" encoding="utf-8"?>" (may not show up in Javadoc due to tags!).
static final String | VAL_NO | the value "no".
static final String | VAL_YES | the value "yes".
-
Constructor Summary
Constructor | Description
XMLDocument() | initializes the factory with non-validating parser.
XMLDocument(File file) | Creates a new instance of XMLDocument.
XMLDocument(InputStream stream) | Creates a new instance of XMLDocument.
XMLDocument(Reader reader) | Creates a new instance of XMLDocument.
XMLDocument(String xml) | Creates a new instance of XMLDocument.
-
Method Summary
Modifier and Type | Method | Description
void | clear() | sets up an empty DOM document, with the current DOCTYPE and root node.
Boolean | evalBoolean(String xpath) | Evaluates and returns the boolean result of the XPath expression.
Double | evalDouble(String xpath) | Evaluates and returns the double result of the XPath expression.
String | evalString(String xpath) | Evaluates and returns the string result of the XPath expression.
NodeList | findNodes(String xpath) | Returns the nodes that the given XPath expression will find in the document.
DocumentBuilder | getBuilder() | returns the DocumentBuilder.
static Vector<Element> | getChildTags(Node parent) | returns all non-text children (tags) of the given node.
static Vector<Element> | getChildTags(Node parent, String name) | returns all non-text children (tags) of the given node that have the given name.
static String | getContent(Element node) | returns the text between the opening and closing tag of a node (performs a trim() on the result).
String | getDocType() | returns the current DOCTYPE, can be null.
Document | getDocument() | returns the parsed DOM document.
DocumentBuilderFactory | getFactory() | returns the DocumentBuilderFactory.
Node | getNode(String xpath) | Returns the node represented by the XPath expression.
String | getRevision() | Returns the revision string.
String | getRootNode() | returns the current root node.
boolean | getValidating() | returns whether a validating parser is used.
static void | main(String[] args) | for testing only.
Document | newDocument(String docType, String rootNode) | creates a new Document with the given information.
void | print() | prints the current DOM document to standard out.
Document | read(File file) | parses the given file and returns a DOM document.
Document | read(InputStream stream) | parses the given stream and returns a DOM document.
Document | read(Reader reader) | parses the given reader and returns a DOM document.
Document | read(String xml) | parses the given XML string (can be XML or a filename) and returns a DOM document.
void | setDocType(String docType) | sets the DOCTYPE string to use in the XML output.
void | setDocument(Document newDocument) | sets the DOM document to use.
void | setRootNode(String rootNode) | sets the root node to use in the XML output.
void | setValidating(boolean validating) | sets whether to use a validating parser or not. Note: this clears the current DOM document!
String | toString() | returns the current DOM document as XML string.
void | write(File file) | writes the current DOM document into the given file.
void | write(OutputStream stream) | writes the current DOM document into the given stream.
void | write(Writer writer) | writes the current DOM document into the given writer.
void | write(String file) | writes the current DOM document into the given file.
-
Field Details
-
PI
the processing instruction "<?xml version="1.0" encoding="utf-8"?>" (may not show up in Javadoc due to tags!).
-
DTD_DOCTYPE
the DocType definition.
-
DTD_ELEMENT
the Element definition.
-
DTD_ATTLIST
the AttList definition.
-
DTD_OPTIONAL
the optional marker.
-
DTD_AT_LEAST_ONE
the at least one marker.
-
DTD_ZERO_OR_MORE
the zero or more marker.
-
DTD_SEPARATOR
the option separator.
-
DTD_CDATA
the CDATA placeholder.
-
DTD_ANY
the ANY placeholder.
-
DTD_PCDATA
the #PCDATA placeholder.
-
DTD_IMPLIED
the #IMPLIED placeholder.
-
DTD_REQUIRED
the #REQUIRED placeholder.
-
ATT_VERSION
the "version" attribute.
-
ATT_NAME
the "name" attribute.
-
VAL_YES
the value "yes".
-
VAL_NO
the value "no".
-
-
Constructor Details
-
XMLDocument
initializes the factory with non-validating parser.
- Throws:
Exception - if the construction fails
-
XMLDocument
Creates a new instance of XMLDocument.
- Parameters:
xml - the xml to parse (if "<?xml" is not found then it is considered a file)
- Throws:
Exception - if the construction of the DocumentBuilder fails
-
XMLDocument
Creates a new instance of XMLDocument.
- Parameters:
file - the XML file to parse
- Throws:
Exception - if the construction of the DocumentBuilder fails
-
XMLDocument
Creates a new instance of XMLDocument.
- Parameters:
stream - the XML stream to parse
- Throws:
Exception - if the construction of the DocumentBuilder fails
-
XMLDocument
Creates a new instance of XMLDocument.
- Parameters:
reader - the XML reader to parse
- Throws:
Exception - if the construction of the DocumentBuilder fails
-
-
Method Details
-
getFactory
returns the DocumentBuilderFactory.
- Returns:
- the DocumentBuilderFactory
-
getBuilder
returns the DocumentBuilder.
- Returns:
- the DocumentBuilder
-
getValidating
public boolean getValidating()
returns whether a validating parser is used.
- Returns:
- whether a validating parser is used
-
setValidating
sets whether to use a validating parser or not.
Note: this clears the current DOM document!
- Parameters:
validating - whether to use a validating parser
- Throws:
Exception - if the instantiation of the DocumentBuilder fails
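
The interplay between the validating flag and the current document can be sketched with the plain JDK parser API. This is only an illustration under assumptions: the class name `ValidatingSwitchSketch` is hypothetical, it is not Weka's implementation, and "clearing" is modeled here as swapping in an empty document.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;

// Hypothetical stand-in for the validating switch; not Weka's actual code.
class ValidatingSwitchSketch {
  private final DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
  private DocumentBuilder builder;
  private Document document;

  ValidatingSwitchSketch() {
    setValidating(false); // start with a non-validating parser
  }

  boolean getValidating() {
    return factory.isValidating();
  }

  Document getDocument() {
    return document;
  }

  void setValidating(boolean validating) {
    factory.setValidating(validating);
    try {
      // The builder has to be recreated for the new setting to take effect.
      builder = factory.newDocumentBuilder();
    } catch (ParserConfigurationException e) {
      throw new IllegalStateException("cannot create DocumentBuilder", e);
    }
    // Assumption: clearing is modeled as replacing the document with an empty one.
    document = builder.newDocument();
  }

  public static void main(String[] args) {
    ValidatingSwitchSketch sketch = new ValidatingSwitchSketch();
    sketch.getDocument().appendChild(sketch.getDocument().createElement("root"));
    sketch.setValidating(true); // the "root" element added above is gone afterwards
    System.out.println(sketch.getValidating());
    System.out.println(sketch.getDocument().getDocumentElement());
  }
}
```

Because the document is discarded, any call that changes the flag is best made before parsing or building content.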
-
getDocument
returns the parsed DOM document.
- Returns:
- the parsed DOM document
-
setDocument
sets the DOM document to use.
- Parameters:
newDocument - the DOM document to use
-
setDocType
sets the DOCTYPE string to use in the XML output. Performs NO checking! If it is null, the DOCTYPE is omitted.
- Parameters:
docType - the DOCTYPE definition to use in XML output
-
getDocType
returns the current DOCTYPE, can be null.
- Returns:
- the current DOCTYPE definition, can be null
-
setRootNode
sets the root node to use in the XML output. Performs NO checking with DOCTYPE!
- Parameters:
rootNode - the root node to use in the XML output
-
getRootNode
returns the current root node.
- Returns:
- the current root node
-
clear
public void clear()
sets up an empty DOM document, with the current DOCTYPE and root node.
-
newDocument
creates a new Document with the given information.
- Parameters:
docType - the DOCTYPE definition (no checking happens!), can be null
rootNode - the name of the root node (must correspond to the one given in docType)
- Returns:
- the newly created DOM document, returned for convenience
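
One plausible way to build such an empty document is to assemble the XML declaration, the optional DOCTYPE and an empty root element as text and parse the result. The sketch below assumes that approach and uses only JDK classes; `NewDocumentSketch` is a hypothetical name and the code is not taken from Weka.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical illustration of creating an empty document from DOCTYPE + root name.
class NewDocumentSketch {
  // Mirrors the value documented for the PI field.
  static final String PI = "<?xml version=\"1.0\" encoding=\"utf-8\"?>";

  static Document newDocument(String docType, String rootNode) {
    StringBuilder xml = new StringBuilder(PI).append('\n');
    if (docType != null) {
      xml.append(docType).append('\n'); // inserted verbatim, no checking
    }
    xml.append('<').append(rootNode).append("/>");
    try {
      return DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml.toString())));
    } catch (Exception e) {
      throw new IllegalArgumentException("could not build document", e);
    }
  }

  public static void main(String[] args) {
    String docType = "<!DOCTYPE options [ <!ELEMENT options ANY> ]>";
    Document doc = newDocument(docType, "options");
    System.out.println(doc.getDocumentElement().getNodeName());
  }
}
```

Since the DOCTYPE is pasted in unchecked, a root name that does not match it only surfaces as a problem once a validating parser reads the output.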
-
read
parses the given XML string (can be XML or a filename) and returns a DOM document.
- Parameters:
xml - the XML to parse (if "<?xml" is not found, it is treated as a filename)
- Returns:
- the parsed DOM document
- Throws:
Exception - if something goes wrong with the parsing
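
The string-or-filename rule described for this overload can be sketched as follows. The check is a simple substring test, as the parameter description suggests; the class and method names (`ReadDispatchSketch`, `looksLikeXml`) are hypothetical and the code uses only the JDK parser, not Weka.

```java
import java.io.File;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical illustration of the "XML text or filename" dispatch.
class ReadDispatchSketch {
  // Assumption: the presence of "<?xml" anywhere in the string marks it as XML content.
  static boolean looksLikeXml(String xml) {
    return xml.contains("<?xml");
  }

  static Document read(String xml) {
    try {
      DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
      if (looksLikeXml(xml)) {
        return builder.parse(new InputSource(new StringReader(xml)));
      }
      return builder.parse(new File(xml)); // otherwise treat the string as a filename
    } catch (Exception e) {
      throw new IllegalArgumentException("could not parse: " + xml, e);
    }
  }

  public static void main(String[] args) {
    Document doc = read("<?xml version=\"1.0\" encoding=\"utf-8\"?><a/>");
    System.out.println(doc.getDocumentElement().getNodeName());
  }
}
```

A consequence of this rule is that an XML fragment without a declaration is interpreted as a path, so callers passing raw XML need to include the declaration.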
-
read
parses the given file and returns a DOM document.
- Parameters:
file - the XML file to parse
- Returns:
- the parsed DOM document
- Throws:
Exception - if something goes wrong with the parsing
-
read
parses the given stream and returns a DOM document.
- Parameters:
stream - the XML stream to parse
- Returns:
- the parsed DOM document
- Throws:
Exception - if something goes wrong with the parsing
-
read
parses the given reader and returns a DOM document.
- Parameters:
reader - the XML reader to parse
- Returns:
- the parsed DOM document
- Throws:
Exception - if something goes wrong with the parsing
-
write
writes the current DOM document into the given file.
- Parameters:
file - the filename to write to
- Throws:
Exception - if something goes wrong with the writing
-
write
writes the current DOM document into the given file.
- Parameters:
file - the filename to write to
- Throws:
Exception - if something goes wrong with the writing
-
write
writes the current DOM document into the given stream.
- Parameters:
stream - the stream to write to
- Throws:
Exception - if something goes wrong with the writing
-
write
writes the current DOM document into the given writer.
- Parameters:
writer - the writer to write to
- Throws:
Exception - if something goes wrong with the writing
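
This page does not describe how XMLDocument serializes the DOM. For readers who want to see the general idea of writing a DOM document to a writer as UTF-8, the following sketch uses the JDK's identity `Transformer`. That is a swapped-in technique, not Weka's serializer, and `WriteSketch` is a hypothetical name.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical illustration: serialize a DOM document with the JDK's identity Transformer.
class WriteSketch {
  static void write(Document doc, Writer writer) {
    try {
      Transformer transformer = TransformerFactory.newInstance().newTransformer();
      transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
      transformer.transform(new DOMSource(doc), new StreamResult(writer));
    } catch (Exception e) {
      throw new IllegalStateException("could not write document", e);
    }
  }

  // Convenience wrapper that returns the serialized document as a string.
  static String toXml(Document doc) {
    StringWriter writer = new StringWriter();
    write(doc, writer);
    return writer.toString();
  }

  static Document parse(String xml) {
    try {
      return DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalArgumentException("invalid XML", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(toXml(parse("<a><b>x</b></a>")));
  }
}
```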
-
getChildTags
returns all non-text children (tags) of the given node.
- Parameters:
parent - the node to get the children from
- Returns:
- a vector containing all the non-text children
-
getChildTags
returns all non-text children (tags) of the given node that have the given name.
- Parameters:
parent - the node to get the children from
name - the name of the tags to return, "" for all
- Returns:
- a vector containing all the matching non-text children
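
The filtering described for both overloads can be sketched with the standard DOM API: walk the direct children, keep only elements, and optionally match the tag name. This is a minimal sketch under that reading of the description; `ChildTagsSketch` and `parseRoot` are hypothetical names, not part of Weka.

```java
import java.io.StringReader;
import java.util.Vector;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration of collecting element children, optionally by name.
class ChildTagsSketch {
  static Vector<Element> getChildTags(Node parent) {
    return getChildTags(parent, ""); // "" means: all tags
  }

  static Vector<Element> getChildTags(Node parent, String name) {
    Vector<Element> result = new Vector<>();
    NodeList children = parent.getChildNodes();
    for (int i = 0; i < children.getLength(); i++) {
      Node child = children.item(i);
      if (!(child instanceof Element)) {
        continue; // skip text, comments and other non-element nodes
      }
      if (name.isEmpty() || child.getNodeName().equals(name)) {
        result.add((Element) child);
      }
    }
    return result;
  }

  // Helper for the demo: parses a string and returns its root element.
  static Element parseRoot(String xml) {
    try {
      return DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)))
          .getDocumentElement();
    } catch (Exception e) {
      throw new IllegalArgumentException("invalid XML", e);
    }
  }

  public static void main(String[] args) {
    Element root = parseRoot("<r> text <a/><b/><a>inner</a></r>");
    System.out.println(getChildTags(root).size());
    System.out.println(getChildTags(root, "a").size());
  }
}
```

Only direct children are considered in this sketch; nested descendants would need an XPath query or a recursive walk.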
-
findNodes
Returns the nodes that the given XPath expression will find in the document. Can return null if an error occurred.
- Parameters:
xpath - the XPath expression to run on the document
- Returns:
- the nodelist
-
getNode
Returns the node represented by the XPath expression. Can return null if an error occurred.
- Parameters:
xpath - the XPath expression to run on the document
- Returns:
- the node
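
The "null on error" contract of these two lookups can be illustrated with the JDK's `javax.xml.xpath` API. The sketch assumes that any evaluation failure is swallowed and reported as null; `XPathNodesSketch` is a hypothetical name and the document is passed in explicitly rather than held as state.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration of node lookups that return null instead of throwing.
class XPathNodesSketch {
  private static Object eval(Document doc, String xpath, QName type) {
    try {
      return XPathFactory.newInstance().newXPath().evaluate(xpath, doc, type);
    } catch (Exception e) {
      return null; // assumption: errors are reported as null
    }
  }

  static NodeList findNodes(Document doc, String xpath) {
    return (NodeList) eval(doc, xpath, XPathConstants.NODESET);
  }

  static Node getNode(Document doc, String xpath) {
    return (Node) eval(doc, xpath, XPathConstants.NODE);
  }

  static Document parse(String xml) {
    try {
      return DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalArgumentException("invalid XML", e);
    }
  }

  public static void main(String[] args) {
    Document doc = parse("<r><a/><a/><b/></r>");
    System.out.println(findNodes(doc, "/r/a").getLength());
    System.out.println(getNode(doc, "/r/b").getNodeName());
  }
}
```

Callers should check the result for null before use, since a malformed expression and a missing single node both end up as null in this sketch.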
-
evalBoolean
Evaluates and returns the boolean result of the XPath expression.
- Parameters:
xpath - the expression to evaluate
- Returns:
- the result of the evaluation, null in case of an error
-
evalDouble
Evaluates and returns the double result of the XPath expression.
- Parameters:
xpath - the expression to evaluate
- Returns:
- the result of the evaluation, null in case of an error
-
evalString
Evaluates and returns the string result of the XPath expression.
- Parameters:
xpath - the expression to evaluate
- Returns:
- the result of the evaluation
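
Typed XPath evaluation maps naturally onto the return-type constants of the JDK's `javax.xml.xpath` API. The following sketch shows that mapping with boxed results so that null can signal an error. `XPathValuesSketch` is a hypothetical name, and returning null from the string variant on error is an assumption, since the description above does not say what happens in that case.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical illustration of typed XPath evaluation with boxed, nullable results.
class XPathValuesSketch {
  private static Object eval(Document doc, String xpath, QName type) {
    try {
      return XPathFactory.newInstance().newXPath().evaluate(xpath, doc, type);
    } catch (Exception e) {
      return null; // assumption: errors are reported as null
    }
  }

  static Boolean evalBoolean(Document doc, String xpath) {
    return (Boolean) eval(doc, xpath, XPathConstants.BOOLEAN);
  }

  static Double evalDouble(Document doc, String xpath) {
    return (Double) eval(doc, xpath, XPathConstants.NUMBER);
  }

  static String evalString(Document doc, String xpath) {
    return (String) eval(doc, xpath, XPathConstants.STRING);
  }

  static Document parse(String xml) {
    try {
      return DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalArgumentException("invalid XML", e);
    }
  }

  public static void main(String[] args) {
    Document doc = parse("<r><n>2.5</n><s>hi</s></r>");
    System.out.println(evalBoolean(doc, "count(/r/n) = 1"));
    System.out.println(evalDouble(doc, "number(/r/n)"));
    System.out.println(evalString(doc, "/r/s"));
  }
}
```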
-
getContent
returns the text between the opening and closing tag of a node (performs a trim() on the result).
- Parameters:
node - the node to get the text from
- Returns:
- the content of the given node
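
A minimal sketch of extracting trimmed text content with the standard DOM API is shown below. It assumes that only the direct text children of the element are concatenated and that nested elements are skipped; the page does not confirm this detail, and `ContentSketch` is a hypothetical name.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration of reading the trimmed text of an element.
class ContentSketch {
  static String getContent(Element node) {
    StringBuilder text = new StringBuilder();
    NodeList children = node.getChildNodes();
    for (int i = 0; i < children.getLength(); i++) {
      Node child = children.item(i);
      // Assumption: only direct text nodes contribute; nested tags are ignored.
      if (child.getNodeType() == Node.TEXT_NODE) {
        text.append(child.getNodeValue());
      }
    }
    return text.toString().trim();
  }

  // Helper for the demo: parses a string and returns its root element.
  static Element parseRoot(String xml) {
    try {
      return DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)))
          .getDocumentElement();
    } catch (Exception e) {
      throw new IllegalArgumentException("invalid XML", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(getContent(parseRoot("<a>\n  hello\n</a>")));
  }
}
```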
-
print
public void print()
prints the current DOM document to standard out.
-
toString
returns the current DOM document as XML string.
-
getRevision
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Returns:
- the revision
-
main
for testing only. Takes the name of an XML file as first argument, reads that file, prints it to stdout and, if a second filename is given, writes the parsed document to that file.
- Parameters:
args - the commandline arguments
- Throws:
Exception - if something goes wrong
-