Java & XML Study Notes
1. Required software
Java, a parser (for example Xerces), and an API (for example SAX or DOM)
2. The SAX mechanism
1) Parsing
String xmlURI = "c:/test.xml";
String vendorParserClass = "org.apache.xerces.parsers.SAXParser";
XMLReader reader = XMLReaderFactory.createXMLReader(vendorParserClass);
InputSource inputSource = new InputSource(xmlURI);
reader.parse(inputSource);
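That completes the parsing of an XML document. SAX parses XML with an
event-driven mechanism: during parsing the parser fires events, that is, it
calls specific methods. By implementing those methods you can do useful work
while the XML is being parsed.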
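The event mechanism described above can be sketched as follows. This is a minimal illustration, not the book's code: the class name SaxEventDemo and the method events() are made up for this example, and it assumes the JDK's built-in parser (obtained through SAXParserFactory) rather than a named vendor class such as Xerces.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records the SAX events fired for a small document.
class SaxEventDemo {
    /** Parses the XML text and returns the events the parser fired, in order. */
    static List<String> events(String xml) {
        final List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                log.add("startDocument");
            }
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                log.add("startElement:" + qName);
            }
            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("endElement:" + qName);
            }
            @Override
            public void endDocument() {
                log.add("endDocument");
            }
        };
        try {
            // Assumption: use the JDK's default parser instead of a vendor class name.
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Checked exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(events("<a><b/></a>"));
    }
}
```

The handler never asks the parser for data; it only reacts to callbacks, which is why SAX can process a document without holding it all in memory.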
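2) Handling
SAX 2.0 defines four core handler interfaces:
org.xml.sax.ContentHandler
org.xml.sax.ErrorHandler
org.xml.sax.DTDHandler
org.xml.sax.EntityResolver
These interfaces are registered with the parser through the
setContentHandler(), setErrorHandler(), setDTDHandler(), and
setEntityResolver() methods of org.xml.sax.XMLReader. The most important of
them is org.xml.sax.ContentHandler, which is defined as follows: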
public interface ContentHandler {
    public void setDocumentLocator(Locator locator);
    public void startDocument() throws SAXException;
    public void endDocument() throws SAXException;
    public void startPrefixMapping(String prefix, String uri)
        throws SAXException;
    public void endPrefixMapping(String prefix)
        throws SAXException;
    public void startElement(String namespaceURI, String localName,
        String qName, Attributes atts) throws SAXException;
    public void endElement(String namespaceURI, String localName,
        String qName) throws SAXException;
    public void characters(char ch[], int start, int length)
        throws SAXException;
    public void ignorableWhitespace(char ch[], int start, int length)
        throws SAXException;
    public void processingInstruction(String target, String data)
        throws SAXException;
    public void skippedEntity(String name)
        throws SAXException;
}
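After you register your ContentHandler implementation with the XMLReader
through setContentHandler(), the parser calls the methods of the interface as
the corresponding conditions arise during parsing. Each is described briefly
below.
1) Document locator
private Locator locator;
public void setDocumentLocator(Locator locator) {
    this.locator = locator;
}
In most cases this implementation is all you need. Its purpose is to expose
the current parse position: through the Locator you can call getLineNumber(),
getColumnNumber(), and similar methods to find out where in the document the
parser currently is. Note that the Locator must not be kept for later use; it
is valid only for the current parse.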
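As a rough sketch of how the stored Locator is used, the example below records the line number reported at each startElement() call. The class name LocatorDemo and the method elementLines() are hypothetical, and the JDK's built-in parser is assumed. A parser is not required to supply a Locator, so production code should check for null.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: reports the line on which each element starts.
class LocatorDemo {
    static List<Integer> elementLines(String xml) {
        final List<Integer> lines = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private Locator locator;

            @Override
            public void setDocumentLocator(Locator locator) {
                // Keep the reference only for the duration of this parse.
                this.locator = locator;
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // Assumption: the parser supplied a Locator (the JDK parser does).
                lines.add(locator.getLineNumber());
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(elementLines("<a>\n<b/>\n</a>"));
    }
}
```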
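2) Start and end of the document
public void startDocument() throws SAXException {
    // Called once during parsing, right after setDocumentLocator()
}
public void endDocument() throws SAXException {
    // Called last during parsing
}
In most cases you can ignore these two; an empty method body is enough.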
3) Start and end of a namespace prefix mapping
public void startPrefixMapping(String prefix, String uri)
    throws SAXException {
}
public void endPrefixMapping(String prefix)
    throws SAXException {
}
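To make the prefix-mapping callbacks concrete, here is a small sketch that collects every prefix and the namespace URI it is bound to. The names PrefixMappingDemo and mappings() are invented for illustration; the example assumes the JDK parser with namespace awareness switched on, since without it these callbacks are not delivered.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: collects prefix -> namespace URI declarations.
class PrefixMappingDemo {
    static Map<String, String> mappings(String xml) {
        final Map<String, String> seen = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startPrefixMapping(String prefix, String uri) {
                // The default namespace is reported with an empty prefix.
                seen.put(prefix, uri);
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // required for prefix-mapping events
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(mappings(
            "<book xmlns=\"http://www.oreilly.com/javaxml2\" xmlns:ora=\"http://www.oreilly.com\"/>"));
    }
}
```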
4) Start and end of an element
public void startElement(String namespaceURI, String localName,
    String qName, Attributes atts) throws SAXException {
}
public void endElement(String namespaceURI, String localName,
    String qName) throws SAXException {
}
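Because startElement() and endElement() always arrive as properly nested pairs, a handler can track its position in the document with a stack. The sketch below builds the path of every element this way. ElementPathDemo and paths() are hypothetical names, and the JDK's default (non-namespace-aware) parser is assumed, so qName carries the element name.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: derives each element's path from start/end events.
class ElementPathDemo {
    static List<String> paths(String xml) {
        final List<String> paths = new ArrayList<>();
        final Deque<String> open = new ArrayDeque<>(); // currently open elements
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                open.addLast(qName);
                paths.add("/" + String.join("/", open));
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                open.removeLast(); // matches the most recent startElement
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return paths;
    }

    public static void main(String[] args) {
        System.out.println(paths("<a><b><c/></b><d/></a>"));
    }
}
```

The tree viewer example later in these notes uses the same idea, with a "current" tree node playing the role of the stack.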
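5) Element data
public void characters(char ch[], int start, int length)
    throws SAXException {
    String s = new String(ch, start, length);
}
This retrieves the text data of the current element. Read only the range from
start to start + length; the parser may also deliver one run of text in
several separate characters() calls.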
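Since a parser may split one run of text across several characters() calls (for example around an entity reference), a common approach is to append each chunk to a buffer. The following is a minimal sketch of that idea; TextCollectorDemo and text() are made-up names and the JDK parser is assumed.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: accumulates all character data of a document.
class TextCollectorDemo {
    static String text(String xml) {
        final StringBuilder buffer = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // Only the slice [start, start + length) belongs to this event.
                buffer.append(ch, start, length);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(text("<title>java &amp; xml</title>"));
    }
}
```

In a real handler the buffer would typically be reset in startElement() and read in endElement(), so that each element's text is handled separately.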
6) Ignorable whitespace
public void ignorableWhitespace(char ch[], int start, int length)
    throws SAXException {
}
7) Skipped entities
public void skippedEntity(String name)
    throws SAXException {
}
8) Processing instructions
public void processingInstruction(String target, String data)
    throws SAXException {
}
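As an illustration of the last callback, this sketch captures the target and data of each processing instruction. The names ProcessingInstructionDemo and instructions() are hypothetical, and the JDK's built-in parser is assumed. The XML declaration itself is not reported as a processing instruction.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: lists processing instructions as {target, data} pairs.
class ProcessingInstructionDemo {
    static List<String[]> instructions(String xml) {
        final List<String[]> found = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void processingInstruction(String target, String data) {
                // target is the PI name; data is everything after it, unparsed.
                found.add(new String[] { target, data });
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        for (String[] pi : instructions("<?xml-stylesheet href=\"a.xsl\" type=\"text/xsl\"?><root/>")) {
            System.out.println("target=" + pi[0] + " data=" + pi[1]);
        }
    }
}
```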
3) Example: the following is copied from Java & XML.
/*
 * Created on 2004-11-30
 */
package javaxml2;

import java.io.*;
import java.util.*;
import org.xml.sax.*;
import org.xml.sax.ext.LexicalHandler;
import org.xml.sax.helpers.XMLReaderFactory;
import java.awt.*;
import javax.swing.*;
import javax.swing.tree.*;

/**
 * @author yuangfang
 */
public class SAXTreeViewer extends JFrame {
    private String vendorParserClass = "org.apache.xerces.parsers.SAXParser";
    private JTree jTree;
    DefaultTreeModel defaultTreeModel;

    public SAXTreeViewer() {
        super("SAX Tree Viewer");
        setSize(600, 450);
    }
    public void init(String xmlURI) throws IOException, SAXException {
        DefaultMutableTreeNode base = new DefaultMutableTreeNode("XML Document: " + xmlURI);
        defaultTreeModel = new DefaultTreeModel(base);
        jTree = new JTree(defaultTreeModel);
        buildTree(defaultTreeModel, base, xmlURI);
        getContentPane().add(new JScrollPane(jTree), BorderLayout.CENTER);
    }

    public void buildTree(DefaultTreeModel treeModel, DefaultMutableTreeNode base, String xmlURI)
            throws IOException, SAXException {
        // Tracks the feature being set so the catch blocks can report it
        String featureURI = "";
        try {
            XMLReader reader = XMLReaderFactory.createXMLReader(vendorParserClass);
            ContentHandler jTreeContentHandler = new JTreeContentHandler(treeModel, base);
            ErrorHandler jTreeErrorHandler = new JTreeErrorHandler();
            reader.setContentHandler(jTreeContentHandler);
            reader.setErrorHandler(jTreeErrorHandler);
            // SimpleEntityResolver is not listed in these notes and must be supplied separately
            reader.setEntityResolver(new SimpleEntityResolver());
            featureURI = "http://xml.org/sax/features/validation";
            reader.setFeature(featureURI, true);
            featureURI = "http://xml.org/sax/features/namespaces";
            setNamespaceProcessing(reader, true);
            featureURI = "http://xml.org/sax/features/string-interning";
            reader.setFeature(featureURI, true);
            featureURI = "http://apache.org/xml/features/validation/schema";
            reader.setFeature(featureURI, false);
            InputSource inputSource = new InputSource(xmlURI);
            reader.parse(inputSource);
        } catch (SAXNotRecognizedException e) {
            System.out.println("The parser class " + vendorParserClass
                + " does not recognize the feature URI " + featureURI);
            System.exit(0);
        } catch (SAXNotSupportedException e) {
            System.out.println("The parser class " + vendorParserClass
                + " does not support the feature URI " + featureURI);
        }
    }
    // Namespace processing and namespace-prefix reporting are set to opposite states
    private void setNamespaceProcessing(XMLReader reader, boolean state)
            throws SAXNotSupportedException, SAXNotRecognizedException {
        reader.setFeature("http://xml.org/sax/features/namespaces", state);
        reader.setFeature("http://xml.org/sax/features/namespace-prefixes", !state);
    }

    public static void main(String[] args) {
        try {
            if (args.length != 1) {
                System.out.println("Usage: java javaxml2.SAXTreeViewer [XML Document URI]");
                System.exit(0);
            }
            SAXTreeViewer viewer = new SAXTreeViewer();
            viewer.init(args[0]);
            viewer.setVisible(true);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
class JTreeContentHandler implements ContentHandler, LexicalHandler {
    private DefaultTreeModel treeModel;
    private DefaultMutableTreeNode current;
    private Locator locator;
    private Map namespaceMappings;

    // Note: the LexicalHandler callbacks below fire only if this handler is also
    // registered through the http://xml.org/sax/properties/lexical-handler property.

    /* (non-Javadoc)
     * @see org.xml.sax.ext.LexicalHandler#comment(char[], int, int)
     */
    public void comment(char[] ch, int start, int length) throws SAXException {
        // Comments are ignored by this viewer
    }

    /* (non-Javadoc)
     * @see org.xml.sax.ext.LexicalHandler#endCDATA()
     */
    public void endCDATA() throws SAXException {
        // Nothing to do
    }

    /* (non-Javadoc)
     * @see org.xml.sax.ext.LexicalHandler#endDTD()
     */
    public void endDTD() throws SAXException {
        // Nothing to do
    }

    /* (non-Javadoc)
     * @see org.xml.sax.ext.LexicalHandler#endEntity(java.lang.String)
     */
    public void endEntity(String name) throws SAXException {
        // Walk back up to the node that was current before startEntity()
        current = (DefaultMutableTreeNode) current.getParent();
    }

    /* (non-Javadoc)
     * @see org.xml.sax.ext.LexicalHandler#startCDATA()
     */
    public void startCDATA() throws SAXException {
        // Nothing to do
    }

    /* (non-Javadoc)
     * @see org.xml.sax.ext.LexicalHandler#startDTD(java.lang.String, java.lang.String, java.lang.String)
     */
    public void startDTD(String name, String publicID, String systemID)
            throws SAXException {
        System.out.println("start DTD");
        DefaultMutableTreeNode dtdReference = new DefaultMutableTreeNode("DTD for " + name);
        if (publicID != null) {
            DefaultMutableTreeNode publicIDNode = new DefaultMutableTreeNode("Public ID: " + publicID);
            dtdReference.add(publicIDNode);
        }
        if (systemID != null) {
            DefaultMutableTreeNode systemIDNode = new DefaultMutableTreeNode("System ID: " + systemID);
            dtdReference.add(systemIDNode);
        }
        current.add(dtdReference);
    }

    /* (non-Javadoc)
     * @see org.xml.sax.ext.LexicalHandler#startEntity(java.lang.String)
     */
    public void startEntity(String name) throws SAXException {
        // Content reported until endEntity() is added under this node
        DefaultMutableTreeNode entity = new DefaultMutableTreeNode("Entity: " + name);
        current.add(entity);
        current = entity;
    }

    public JTreeContentHandler(DefaultTreeModel treeModel, DefaultMutableTreeNode base) {
        this.treeModel = treeModel;
        this.current = base;
        this.namespaceMappings = new HashMap();
    }
    /* (non-Javadoc)
     * @see org.xml.sax.ContentHandler#setDocumentLocator(org.xml.sax.Locator)
     */
    public void setDocumentLocator(Locator locator) {
        this.locator = locator;
    }

    /* (non-Javadoc)
     * @see org.xml.sax.ContentHandler#startDocument()
     */
    public void startDocument() throws SAXException {
        System.out.println("start document");
    }

    /* (non-Javadoc)
     * @see org.xml.sax.ContentHandler#endDocument()
     */
    public void endDocument() throws SAXException {
        System.out.println("end document");
    }

    /* (non-Javadoc)
     * @see org.xml.sax.ContentHandler#startPrefixMapping(java.lang.String, java.lang.String)
     */
    public void startPrefixMapping(String prefix, String uri) throws SAXException {
        // Keyed by URI so startElement() can look up the prefix for a namespace
        namespaceMappings.put(uri, prefix);
        System.out.println("start prefixMapping " + prefix);
    }

    /* (non-Javadoc)
     * @see org.xml.sax.ContentHandler#endPrefixMapping(java.lang.String)
     */
    public void endPrefixMapping(String prefix) throws SAXException {
        // Find the URI bound to this prefix and drop the mapping
        for (Iterator i = namespaceMappings.keySet().iterator(); i.hasNext();) {
            String uri = (String) i.next();
            String thisPrefix = (String) namespaceMappings.get(uri);
            if (prefix.equals(thisPrefix)) {
                namespaceMappings.remove(uri);
                break;
            }
        }
        System.out.println("end prefixMapping " + prefix);
    }
    /* (non-Javadoc)
     * @see org.xml.sax.ContentHandler#startElement(java.lang.String, java.lang.String, java.lang.String, org.xml.sax.Attributes)
     */
    public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException {
        DefaultMutableTreeNode element = new DefaultMutableTreeNode("Element: " + localName + " at line " + locator.getLineNumber());
        current.add(element);
        current = element;
        // Show the element's namespace, if it has one
        if (uri.length() > 0) {
            String prefix = (String) namespaceMappings.get(uri);
            if (prefix.equals("")) {
                prefix = "[None]";
            }
            DefaultMutableTreeNode namespace = new DefaultMutableTreeNode("Namespace: prefix = " +
                prefix + ", URI = " + uri);
            current.add(namespace);
        }
        // Add one child node per attribute
        for (int i = 0; i < atts.getLength(); i++) {
            DefaultMutableTreeNode attribute = new DefaultMutableTreeNode("Attribute (name = " +
                atts.getLocalName(i) + ", value = " + atts.getValue(i) + ")");
            String attURI = atts.getURI(i);
            if (attURI.length() > 0) {
                String attPrefix = (String) namespaceMappings.get(attURI);
                if (attPrefix.equals("")) {
                    attPrefix = "[None]";
                }
                DefaultMutableTreeNode attNamespace = new DefaultMutableTreeNode("Namespace: prefix = " +
                    attPrefix + ", URI = " + attURI);
                attribute.add(attNamespace);
            }
            current.add(attribute);
        }
    }

    /* (non-Javadoc)
     * @see org.xml.sax.ContentHandler#endElement(java.lang.String, java.lang.String, java.lang.String)
     */
    public void endElement(String uri, String localName, String qName) throws SAXException {
        // Walk back up the tree to the parent element's node
        current = (DefaultMutableTreeNode) current.getParent();
    }
    /* (non-Javadoc)
     * @see org.xml.sax.ContentHandler#characters(char[], int, int)
     */
    public void characters(char[] ch, int start, int length) throws SAXException {
        String s = new String(ch, start, length);
        DefaultMutableTreeNode data = new DefaultMutableTreeNode("Character Data: " + s);
        current.add(data);
    }

    /* (non-Javadoc)
     * @see org.xml.sax.ContentHandler#ignorableWhitespace(char[], int, int)
     */
    public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
        // Ignorable whitespace is not shown in the tree
    }

    /* (non-Javadoc)
     * @see org.xml.sax.ContentHandler#processingInstruction(java.lang.String, java.lang.String)
     */
    public void processingInstruction(String target, String data) throws SAXException {
        DefaultMutableTreeNode pi = new DefaultMutableTreeNode("PI (target = "
            + target + ", data = " + data + ")");
        current.add(pi);
    }

    /* (non-Javadoc)
     * @see org.xml.sax.ContentHandler#skippedEntity(java.lang.String)
     */
    public void skippedEntity(String name) throws SAXException {
        DefaultMutableTreeNode skipped = new DefaultMutableTreeNode("Skipped Entity: " + name);
        current.add(skipped);
    }
}
class JTreeErrorHandler implements ErrorHandler {
    /* (non-Javadoc)
     * @see org.xml.sax.ErrorHandler#warning(org.xml.sax.SAXParseException)
     */
    public void warning(SAXParseException exception) throws SAXException {
        System.out.println("**Parsing Warning**\n" +
            " Line: " +
            exception.getLineNumber() + "\n" +
            " URI: " +
            exception.getSystemId() + "\n" +
            " Message: " +
            exception.getMessage());
        // Rethrowing stops the parse, even for a warning
        throw new SAXException("Warning encountered");
    }

    /* (non-Javadoc)
     * @see org.xml.sax.ErrorHandler#error(org.xml.sax.SAXParseException)
     */
    public void error(SAXParseException exception) throws SAXException {
        System.out.println("**Parsing Error**\n" +
            " Line: " +
            exception.getLineNumber() + "\n" +
            " URI: " +
            exception.getSystemId() + "\n" +
            " Message: " +
            exception.getMessage());
        throw new SAXException("Error encountered");
    }

    /* (non-Javadoc)
     * @see org.xml.sax.ErrorHandler#fatalError(org.xml.sax.SAXParseException)
     */
    public void fatalError(SAXParseException exception) throws SAXException {
        System.out.println("**Parsing Fatal Error**\n" +
            " Line: " +
            exception.getLineNumber() + "\n" +
            " URI: " +
            exception.getSystemId() + "\n" +
            " Message: " +
            exception.getMessage());
        throw new SAXException("Fatal error encountered");
    }
}
The XML file is shown below. You do not have to use this one; any other XML file will do.
<?xml version="1.0"?>
<!DOCTYPE book SYSTEM "dtd/javaxml.dtd">
<!-- java and xml contents -->
<book xmlns="http://www.oreilly.com/javaxml2" xmlns:ora="http://www.oreilly.com" >
<title ora:series="java">java and xml</title>
<!-- chapter list -->
<contents>
<chapter title="introduction" number="1">
<topic name="xml matters"/>
<topic name="whats important"/>
<topic name="the essentials"/>
<topic name="what's next?"/>
</chapter>
<chapter title="nuts and bolts" number="2">
<topic name="the basics"/>
<topic name="constraints"/>
<topic name="transformations"/>
<topic name="and more…"/>
<topic name="what's next?"/>
</chapter>
<chapter title="sax" number="3">
<topic name="getting prepared"/>
<topic name="sax readers"/>
<topic name="content handlers"/>
<topic name="gotcha!"/>
<topic name="what's next?"/>
</chapter>
<chapter title="advanced sax" number="4">
<topic name="properties and features"/>
<topic name="more handlers"/>
<topic name="filters and writers"/>
<topic name="even more handlers"/>
<topic name="gotcha!"/>
<topic name="what's next?"/>
</chapter>
<chapter title="dom" number="5">
<topic name="the document object model"/>
<topic name="serialization"/>
<topic name="mutability"/>
<topic name="gotcha!"/>
<topic name="what's next?"/>
</chapter>
<chapter title="advanced dom" number="6">
<topic name="dom and mutation"/>
<topic name="namespaces and dom level 2"/>
<topic name="dom and html"/>
<topic name="dom level 3"/>
<topic name="gotcha!"/>
<topic name="what's next?"/>
</chapter>
<chapter title="jdom" number="7">
<topic name="the basics"/>
<topic name="propstoxml"/>
<topic name="xmlproperties"/>
<topic name="is jdom a standard?"/>
<topic name="gotcha!"/>
<topic name="what's next?"/>
</chapter>
<chapter title="advanced jdom" number="8">
<topic name="the whole ball of wax"/>
<topic name="jdom and factories"/>
<topic name="wrappers and decorators"/>
<topic name="gotcha!"/>
<topic name="what's next?"/>
</chapter>
<chapter title="jaxp" number="9">
<topic name="api or abstraction?"/>
<topic name="jaxp 1.0"/>
<topic name="jaxp 1.1"/>
<topic name="gotcha!"/>
<topic name="what's next?"/>
</chapter>
<chapter title="web publishing frameworks" number="10">
<topic name="selecting a framework"/>
<topic name="installation"/>
<topic name="using a publishing framework"/>
<topic name="xsp"/>
<topic name="cocoon 2.0 and beyond"/>
<topic name="what's next?"/>
</chapter>
<chapter title="xml-rpc" number="11">
<topic name="rpc versus rmi"/>
<topic name="saying hello"/>
<topic name="the real world"/>
<topic name="what's next?"/>
</chapter>
<chapter title="soap" number="12">
<topic name="starting out"/>
<topic name="setting up"/>
<topic name="getting dirty"/>
<topic name="going further"/>
<topic name="what's next?"/>
</chapter>
<chapter title="web services" number="13">
<topic name="web services"/>
<topic name="uddi"/>
<topic name="wsdl"/>
<topic name="putting it all together"/>
<topic name="what's next?"/>
</chapter>
<chapter title="content syndication" number="14">
<topic name="the foobar public library"/>
<topic name="mytechbooks.com"/>
<topic name="push versus pull"/>
<topic name="what's next?"/>
</chapter>
<chapter title="xml data binding" number="15">
<topic name="first principles"/>
<topic name="castor"/>
<topic name="zeus"/>
<topic name="jaxb"/>
<topic name="what's next?"/>
</chapter>
<chapter title="looking forward" number="16">
<topic name="xlink"/>
<topic name="xpointer"/>
<topic name="xml schema bindings"/>
<topic name="and the rest…"/>
<topic name="what's next?"/>
</chapter>
</contents>
<ora:copyright>&oreillycopyright;</ora:copyright>
</book>
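The sample document contains comments, and a ContentHandler alone never sees them: comment events go to a LexicalHandler, which has to be registered through a parser property. The sketch below shows one way to do that. CommentDemo and comments() are hypothetical names, and the JDK's built-in parser is assumed to support the standard lexical-handler property.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical demo class: collects the comments found in a document.
class CommentDemo {
    static List<String> comments(String xml) {
        final List<String> found = new ArrayList<>();
        // DefaultHandler2 implements both ContentHandler and LexicalHandler.
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void comment(char[] ch, int start, int length) {
                found.add(new String(ch, start, length).trim());
            }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            // Without this property the comment() callback is never invoked.
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(comments("<!-- chapter list --><book/>"));
    }
}
```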
