Several Ways to Resolve Namespaces with the Java API

Prerequisites and example

All examples in this article use the following XML file:

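Listing 1. Sample XML

<books:booklist
    xmlns:books="http://univNaSpResolver/booklist"
    xmlns="http://univNaSpResolver/book"
    xmlns:fiction="http://univNaSpResolver/fictionbook">
  <science:book xmlns:science="http://univNaSpResolver/sciencebook">
    <!-- ... -->
    <author>Michael Schmidt</author>
  </science:book>
  <fiction:book>
    <!-- ... -->
    <author>Johann Wolfgang von Goethe</author>
  </fiction:book>
  <fiction:book>
    <!-- ... -->
    <author>Johann Wolfgang von Goethe</author>
  </fiction:book>
</books:booklist>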

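This XML example has three namespaces declared on the root element and one declared on an element deeper in the structure. You will see the difference this setup makes.

The second interesting thing about this XML example is that the element booklist has three children, all named book. The first child, however, is in the namespace science, while the other two are in the namespace fiction. To XPath, these elements are therefore completely different. The following examples show the consequences.

A note on the sample source code: it is optimized for readability, not for maintenance, so it contains some redundancy. Output is produced in the simplest possible way, through System.out.println(). In this article, the lines of code that produce output are abbreviated as "...".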

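Theoretical background

What is the point of namespaces, and why care about them? A namespace is part of the identifier of an element or attribute. Elements or attributes can share the same local name yet belong to different namespaces, in which case they are completely different; see the example above (science:book and fiction:book). If you combine XML files from different sources, you need namespaces to resolve naming conflicts. Take an XSLT file: it contains elements of the XSLT namespace, elements from your own namespace, and (often) elements of the XHTML namespace. Namespaces remove the ambiguity between elements that have the same local name.

A namespace is defined by a URI (in this example, http://univNaSpResolver/booklist). To avoid using this long string, you define a prefix that is associated with the URI (in this example, books). Keep in mind that the prefix is like a variable: its name does not matter. If two prefixes refer to the same URI, the prefixed elements are in the same namespace (see example 1 in Listing 5).

XPath expressions use prefixes (for example, books:booklist/science:book), and you must provide the URI associated with each prefix. This is exactly what NamespaceContext is for.

This article shows different ways to provide the mapping between prefixes and URIs.

In the XML file, the mapping is provided by xmlns attributes such as xmlns:books="http://univNaSpResolver/booklist", or by xmlns="http://univNaSpResolver/book" for the default namespace.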
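To make this mapping concrete, here is a minimal sketch (not part of the article's sample code) of how a namespace-aware DOM parser exposes it. The class name NamespaceAwareParsing and the shortened inline XML are assumptions made for illustration; only the two URIs are taken from the example file. The parser has to be switched to namespace-aware mode explicitly, because DocumentBuilderFactory is not namespace-aware by default.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class NamespaceAwareParsing {

    // Shortened stand-in for the example file: a prefixed root element
    // and one child that is in the default namespace.
    static final String XML =
            "<books:booklist xmlns:books=\"http://univNaSpResolver/booklist\""
            + " xmlns=\"http://univNaSpResolver/book\">"
            + "<author>Michael Schmidt</author>"
            + "</books:booklist>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Describes an element as: prefix | local name | namespace URI.
    static String describe(Element element) {
        return element.getPrefix() + " | " + element.getLocalName()
                + " | " + element.getNamespaceURI();
    }

    public static void main(String[] args) {
        Element root = parse(XML).getDocumentElement();
        Element author = (Element) root.getFirstChild();
        System.out.println(describe(root));
        System.out.println(describe(author));
    }
}
```

The root element reports the prefix books together with its URI, while the author element reports no prefix (null) but still carries the URI of the default namespace. The identity of an element is the pair of namespace URI and local name; the prefix is only a shorthand.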

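The need to provide namespace resolution

If the XML uses namespaces, an XPath expression does not work unless you provide a NamespaceContext. Example 0 in Listing 2 demonstrates this. The XPath object is built and evaluated against the loaded XML document. The expression is first written without any namespace prefix (result1), then with namespace prefixes (result2).

Listing 2. Example 0 without namespace resolution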

private static void example0(Document example)
        throws XPathExpressionException, TransformerException {
    sysout("\n*** Zero example - no namespaces provided ***");

    XPath xPath = XPathFactory.newInstance().newXPath();

    ...
    // Unprefixed names match only elements that are in no namespace.
    NodeList result1 = (NodeList) xPath.evaluate("booklist/book", example,
            XPathConstants.NODESET);
    ...
    // Prefixed names, but no NamespaceContext maps the prefixes to URIs.
    NodeList result2 = (NodeList) xPath.evaluate(
            "books:booklist/science:book", example, XPathConstants.NODESET);
    ...
}

The output is as follows.

Listing 3. Output of example 0

*** Zero example - no namespaces provided ***
First try asking without namespace prefix:
--> booklist/book
Result is of length 0
Then try asking with namespace prefix:
--> books:booklist/science:book
Result is of length 0
The expression does not work in both cases.

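In both cases, the XPath evaluation returns no nodes and throws no exception. XPath cannot find the nodes: the unprefixed names match only elements that are in no namespace, and for the prefixed names the mapping from prefixes to URIs is missing.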
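The unprefixed case can be reproduced in isolation. The following is a minimal sketch with a hypothetical class name and two shortened inline documents (assumptions, not the article's example file): the same expression booklist/book selects a node in a document without namespaces and nothing in a document whose elements are in namespaces.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, for illustration only.
class UnprefixedXPathDemo {

    // A document that uses no namespaces at all.
    static final String PLAIN = "<booklist><book/></booklist>";

    // The same structure, but every element is in a namespace.
    static final String NAMESPACED =
            "<books:booklist xmlns:books=\"http://univNaSpResolver/booklist\""
            + " xmlns:science=\"http://univNaSpResolver/sciencebook\">"
            + "<science:book/>"
            + "</books:booklist>";

    // Counts the nodes that the expression selects; no NamespaceContext is set.
    static int count(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document document = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xPath.evaluate(expression, document,
                    XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count(PLAIN, "booklist/book"));      // 1
        System.out.println(count(NAMESPACED, "booklist/book")); // 0
    }
}
```

An unprefixed name in XPath 1.0 always means "no namespace", so the second call cannot match books:booklist no matter which prefixes the document itself uses.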

Hardcoded namespace resolution

The namespaces can be supplied as hardcoded values, as in the class in Listing 4:

Listing 4. Hardcoded namespace resolution

import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

public class HardcodedNamespaceResolver implements NamespaceContext {

    /**
     * This method returns the uri for all prefixes needed. Wherever possible
     * it uses XMLConstants.
     *
     * @param prefix
     * @return uri
     */
    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("No prefix provided!");
        } else if (prefix.equals(XMLConstants.DEFAULT_NS_PREFIX)) {
            return "http://univNaSpResolver/book";
        } else if (prefix.equals("books")) {
            return "http://univNaSpResolver/booklist";
        } else if (prefix.equals("fiction")) {
            return "http://univNaSpResolver/fictionbook";
        } else if (prefix.equals("technical")) {
            return "http://univNaSpResolver/sciencebook";
        } else {
            return XMLConstants.NULL_NS_URI;
        }
    }

    @Override
    public String getPrefix(String namespaceURI) {
        // Not needed in this context.
        return null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        // Not needed in this context.
        return null;
    }

}

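Note that the namespace http://univNaSpResolver/sciencebook is bound to the prefix technical (not science, as before). You can see the result in the output of the example (Listing 6). In Listing 5, the code that uses this resolver also uses the new prefix.

Listing 5. Example 1 with hardcoded namespace resolution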

private static void example1(Document example)
        throws XPathExpressionException, TransformerException {
    sysout("\n*** First example - namespacelookup hardcoded ***");

    XPath xPath = XPathFactory.newInstance().newXPath();
    xPath.setNamespaceContext(new HardcodedNamespaceResolver());

    ...
    NodeList result1 = (NodeList) xPath.evaluate(
            "books:booklist/technical:book", example,
            XPathConstants.NODESET);
    ...
    NodeList result2 = (NodeList) xPath.evaluate(
            "books:booklist/fiction:book", example, XPathConstants.NODESET);
    ...
    String result = xPath.evaluate("books:booklist/technical:book/:author",
            example);
    ...
}

Here is the output of this example.

Listing 6. Output of example 1

*** First example - namespacelookup hardcoded ***
Using any namespaces results in a NodeList:
--> books:booklist/technical:book
Number of Nodes: 1
    Michael Schmidt
--> books:booklist/fiction:book
Number of Nodes: 2
    Johann Wolfgang von Goethe
    Johann Wolfgang von Goethe
The default namespace works also:
--> books:booklist/technical:book/:author
Michael Schmidt

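As you can see, XPath now finds the nodes. The advantage is that you can rename prefixes as you wish, which is what I did with the prefix science: the XML file contains the prefix science, while the XPath uses a different prefix, technical. Because the URIs are the same, XPath finds the nodes. The disadvantage is that you have to maintain the namespaces in several places: the XML, the XSD, the XPath expressions, and the namespace context.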
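One way to keep a hand-maintained mapping in a single place is to store it in a map instead of an if/else chain. The following is only a sketch under that assumption: the class MapNamespaceContext, its bind method, and the shortened inline XML are hypothetical and not part of the article's sample code. Unlike the hardcoded resolver, it also answers getPrefix and getPrefixes from the same map.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class, not part of the article's sample code.
class MapNamespaceContext implements NamespaceContext {
    // Shortened stand-in for the example file; the document uses "science".
    static final String XML =
            "<books:booklist xmlns:books=\"http://univNaSpResolver/booklist\">"
            + "<science:book xmlns:science=\"http://univNaSpResolver/sciencebook\"/>"
            + "</books:booklist>";

    private final Map<String, String> prefixToUri = new LinkedHashMap<>();

    // Registers one prefix-to-URI binding; returns this to allow chaining.
    MapNamespaceContext bind(String prefix, String uri) {
        prefixToUri.put(prefix, uri);
        return this;
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("No prefix provided!");
        }
        String uri = prefixToUri.get(prefix);
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String namespaceURI) {
        Iterator<String> prefixes = getPrefixes(namespaceURI);
        return prefixes.hasNext() ? prefixes.next() : null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        if (namespaceURI == null) {
            throw new IllegalArgumentException("No namespace URI provided!");
        }
        List<String> prefixes = new ArrayList<>();
        for (Map.Entry<String, String> entry : prefixToUri.entrySet()) {
            if (entry.getValue().equals(namespaceURI)) {
                prefixes.add(entry.getKey());
            }
        }
        return prefixes.iterator();
    }

    // Counts the nodes that the expression selects in the given XML text.
    static int count(String xml, String expression, NamespaceContext context) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document document = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(context);
            NodeList nodes = (NodeList) xPath.evaluate(expression, document,
                    XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        NamespaceContext context = new MapNamespaceContext()
                .bind("books", "http://univNaSpResolver/booklist")
                .bind("technical", "http://univNaSpResolver/sciencebook");
        // The XPath prefix "technical" differs from the document's "science".
        System.out.println(count(XML, "books:booklist/technical:book", context));
    }
}
```

The expression still selects the science:book element, because only the URIs are compared. The maintenance problem described above remains, though: the bindings in the code must be kept in step with the XML by hand.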

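Reading the namespaces from the document

The namespaces and their prefixes are documented in the XML file itself, so you can use them from there. The simplest way to do this is to delegate the lookup to the document.

Listing 7. Namespace resolution delegated directly to the document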

import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import org.w3c.dom.Document;

public class UniversalNamespaceResolver implements NamespaceContext {
    // the delegate
    private final Document sourceDocument;

    /**
     * This constructor stores the source document to search the namespaces in
     * it.
     *
     * @param document
     *            source document
     */
    public UniversalNamespaceResolver(Document document) {
        sourceDocument = document;
    }

    /**
     * The lookup for the namespace uris is delegated to the stored document.
     *
     * @param prefix
     *            to search for
     * @return uri
     */
    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            // Required by the NamespaceContext contract.
            throw new IllegalArgumentException("No prefix provided!");
        } else if (prefix.equals(XMLConstants.DEFAULT_NS_PREFIX)) {
            return sourceDocument.lookupNamespaceURI(null);
        } else {
            return sourceDocument.lookupNamespaceURI(prefix);
        }
    }

    /**
     * This method is not needed in this context, but can be implemented in a
     * similar way.
     */
    @Override
    public String getPrefix(String namespaceURI) {
        return sourceDocument.lookupPrefix(namespaceURI);
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        // not implemented yet
        return null;
    }

}

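Note the following:

• If the document is changed before the XPath is used, the change is also reflected in the namespace lookup, because the delegation happens when it is needed, against the current version of the document.

• The lookup for namespaces or prefixes is done on the ancestors of the node used, in our case the node sourceDocument. This means that, with the code provided, you only get the namespaces declared on the root node. In our example, the namespace science is not found.

• The lookup is called when the XPath is evaluated, so it takes some extra time.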
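The first point can be checked with a small sketch. The class name LiveLookupDemo and the one-element inline document are assumptions made for illustration. The code asks the document for a prefix, then adds a namespace declaration to the root element and asks again.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, for illustration only.
class LiveLookupDemo {

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the URI found for "fiction" before and after the declaration
    // is added to the root element.
    static String[] lookupBeforeAndAfter() {
        Document document = parse(
                "<books:booklist xmlns:books=\"http://univNaSpResolver/booklist\"/>");

        // Not declared yet, so the lookup yields null.
        String before = document.lookupNamespaceURI("fiction");

        // Namespace declarations are attributes in the xmlns namespace.
        document.getDocumentElement().setAttributeNS(
                XMLConstants.XMLNS_ATTRIBUTE_NS_URI,
                "xmlns:fiction", "http://univNaSpResolver/fictionbook");

        // The same lookup now sees the new declaration.
        String after = document.lookupNamespaceURI("fiction");
        return new String[] { before, after };
    }

    public static void main(String[] args) {
        String[] result = lookupBeforeAndAfter();
        System.out.println("before: " + result[0]);
        System.out.println("after:  " + result[1]);
    }
}
```

A resolver that delegates to the document therefore never goes stale, at the price of one DOM lookup for every prefix on every evaluation.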

Here is the sample code:

Listing 8. Example 2 with namespace resolution delegated directly to the document

private static void example2(Document example)
        throws XPathExpressionException, TransformerException {
    sysout("\n*** Second example - namespacelookup delegated to document ***");

    XPath xPath = XPathFactory.newInstance().newXPath();
    xPath.setNamespaceContext(new UniversalNamespaceResolver(example));

    try {
        ...
        NodeList result1 = (NodeList) xPath.evaluate(
                "books:booklist/science:book", example,
                XPathConstants.NODESET);
        ...
    } catch (XPathExpressionException e) {
        ...
    }
    ...
    NodeList result2 = (NodeList) xPath.evaluate(
            "books:booklist/fiction:book", example, XPathConstants.NODESET);
    ...
    String result = xPath.evaluate(
            "books:booklist/fiction:book[1]/:author", example);
    ...
}

The output of this example is:

Listing 9. Output of example 2

*** Second example - namespacelookup delegated to document ***
Try to use the science prefix: no result
--> books:booklist/science:book
The resolver only knows namespaces of the first level!
To be precise: Only namespaces above the node, passed in the constructor.
The fiction namespace is such a namespace:
--> books:booklist/fiction:book
Number of Nodes: 2
    Johann Wolfgang von Goethe
    Johann Wolfgang von Goethe
The default namespace works also:
--> books:booklist/fiction:book[1]/:author
Johann Wolfgang von Goethe

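As the output shows, the namespace with the prefix science, which is declared on the book element, is not resolved. The evaluate method throws an XPathExpressionException. To work around this, you would have to extract the node science:book from the document and use that node as the delegate. That, however, means an extra traversal of the document and is not elegant.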
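The workaround just described could look roughly like the following sketch. The classes NodeNamespaceResolver and NodeDelegateDemo, the helper findBook, and the shortened inline XML are hypothetical; they only illustrate the idea of delegating to a deeper node instead of the document.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical resolver that delegates to any node, not only the document.
class NodeNamespaceResolver implements NamespaceContext {
    private final Node delegate;

    NodeNamespaceResolver(Node delegate) {
        this.delegate = delegate;
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("No prefix provided!");
        }
        // The DOM lookup expects null for the default namespace.
        String uri = delegate.lookupNamespaceURI(prefix.isEmpty() ? null : prefix);
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String namespaceURI) {
        return delegate.lookupPrefix(namespaceURI);
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        String prefix = getPrefix(namespaceURI);
        return prefix == null
                ? Collections.<String>emptyIterator()
                : Collections.singleton(prefix).iterator();
    }
}

// Hypothetical demo class, for illustration only.
class NodeDelegateDemo {
    static final String XML =
            "<books:booklist xmlns:books=\"http://univNaSpResolver/booklist\">"
            + "<science:book xmlns:science=\"http://univNaSpResolver/sciencebook\"/>"
            + "</books:booklist>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // The extra traversal: find the first book element with the given prefix.
    static Node findBook(Document document, String prefix) {
        NodeList books = document.getElementsByTagNameNS("*", "book");
        for (int i = 0; i < books.getLength(); i++) {
            if (prefix.equals(books.item(i).getPrefix())) {
                return books.item(i);
            }
        }
        return null;
    }

    // Counts the matches for the article's expression with the given context.
    static int count(Document document, NamespaceContext context) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(context);
            NodeList nodes = (NodeList) xPath.evaluate(
                    "books:booklist/science:book", document, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document document = parse(XML);
        Node scienceBook = findBook(document, "science");
        System.out.println(document.lookupNamespaceURI("science"));    // not visible from the root
        System.out.println(scienceBook.lookupNamespaceURI("science")); // visible from the book node
        System.out.println(count(document, new NodeNamespaceResolver(scienceBook)));
    }
}
```

Seen from the science:book element, both its own declaration and the declarations of its ancestors are in scope, so the expression resolves. The cost is the search in findBook, which has to run before any XPath can be evaluated, and the choice of delegate node is tied to one particular document layout.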

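Reading the namespaces from the document and caching them

The next version of the NamespaceContext is somewhat better. It reads the namespaces only once, up front, in the constructor. Every request for a namespace is answered from the cache. As a result, later changes to the document no longer matter, because the list of namespaces is cached when the Java object is created.

Listing 10. Cached namespace resolution from the document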

import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class UniversalNamespaceCache implements NamespaceContext {
    private static final String DEFAULT_NS = "DEFAULT";
    private final Map<String, String> prefix2Uri = new HashMap<String, String>();
    private final Map<String, String> uri2Prefix = new HashMap<String, String>();

    /**
     * This constructor parses the document and stores all namespaces it can
     * find. If toplevelOnly is true, only namespaces in the root are used.
     *
     * @param document
     *            source document
     * @param toplevelOnly
     *            restriction of the search to enhance performance
     */
    public UniversalNamespaceCache(Document document, boolean toplevelOnly) {
        // Start at the root element. Unlike getFirstChild(), this also works
        // when a comment or processing instruction precedes the root.
        examineNode(document.getDocumentElement(), toplevelOnly);
        System.out.println("The list of the cached namespaces:");
        for (String key : prefix2Uri.keySet()) {
            System.out
                    .println("prefix " + key + ": uri " + prefix2Uri.get(key));
        }
    }

    /**
     * A single node is read, the namespace attributes are extracted and stored.
     *
     * @param node
     *            to examine
     * @param attributesOnly
     *            if true no recursion happens
     */
    private void examineNode(Node node, boolean attributesOnly) {
        NamedNodeMap attributes = node.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            storeAttribute((Attr) attribute);
        }

        if (!attributesOnly) {
            NodeList children = node.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    examineNode(child, false);
                }
            }
        }
    }

    /**
     * This method looks at an attribute and stores it, if it is a namespace
     * attribute.
     *
     * @param attribute
     *            to examine
     */
    private void storeAttribute(Attr attribute) {
        // examine the attributes in namespace xmlns
        if (attribute.getNamespaceURI() != null
                && attribute.getNamespaceURI().equals(
                        XMLConstants.XMLNS_ATTRIBUTE_NS_URI)) {
            // Default namespace xmlns="uri goes here"
            if (attribute.getNodeName().equals(XMLConstants.XMLNS_ATTRIBUTE)) {
                putInCache(DEFAULT_NS, attribute.getNodeValue());
            } else {
                // The defined prefixes are stored here
                putInCache(attribute.getLocalName(), attribute.getNodeValue());
            }
        }

    }

    private void putInCache(String prefix, String uri) {
        prefix2Uri.put(prefix, uri);
        uri2Prefix.put(uri, prefix);
    }

    /**
     * This method is called by XPath. It returns the default namespace, if the
     * prefix is null or "".
     *
     * @param prefix
     *            to search for
     * @return uri
     */
    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null || prefix.equals(XMLConstants.DEFAULT_NS_PREFIX)) {
            return prefix2Uri.get(DEFAULT_NS);
        } else {
            return prefix2Uri.get(prefix);
        }
    }

    /**
     * This method is not needed in this context, but can be implemented in a
     * similar way.
     */
    @Override
    public String getPrefix(String namespaceURI) {
        return uri2Prefix.get(namespaceURI);
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        // Not implemented
        return null;
    }

}

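Note that the code contains debug output. The attributes of each node are examined and stored. The children are not examined, because the boolean toplevelOnly in the constructor is set to true. If the boolean is set to false, the examination of the children starts once the attributes have been stored. One thing to keep in mind about this code: in DOM, the first node represents the whole document, so the constructor has to step down exactly one level, to the root element, before it reads the namespaces.

In this case, using the NamespaceContext is very simple:

Listing 11. Example 3 with cached namespace resolution (top level only)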

private static void example3(Document example)
        throws XPathExpressionException, TransformerException {
    sysout("\n*** Third example - namespaces of toplevel node cached ***");

    XPath xPath = XPathFactory.newInstance().newXPath();
    xPath.setNamespaceContext(new UniversalNamespaceCache(example, true));

    try {
        ...
        NodeList result1 = (NodeList) xPath.evaluate(
                "books:booklist/science:book", example,
                XPathConstants.NODESET);
        ...
    } catch (XPathExpressionException e) {
        ...
    }
    ...
    NodeList result2 = (NodeList) xPath.evaluate(
            "books:booklist/fiction:book", example, XPathConstants.NODESET);
    ...
    String result = xPath.evaluate(
            "books:booklist/fiction:book[1]/:author", example);
    ...
}

This produces the following output:

Listing 12. Output of example 3

*** Third example - namespaces of toplevel node cached ***
The list of the cached namespaces:
prefix DEFAULT: uri http://univNaSpResolver/book
prefix fiction: uri http://univNaSpResolver/fictionbook
prefix books: uri http://univNaSpResolver/booklist
Try to use the science prefix:
--> books:booklist/science:book
The cache only knows namespaces of the first level!
The fiction namespace is such a namespace:
--> books:booklist/fiction:book
Number of Nodes: 2
    Johann Wolfgang von Goethe
    Johann Wolfgang von Goethe
The default namespace works also:
--> books:booklist/fiction:book[1]/:author
Johann Wolfgang von Goethe

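The code above finds only the namespaces of the root element. More precisely, it finds the namespaces of the node that the constructor passes to the method examineNode. This speeds up the constructor, because it does not have to iterate through the whole document. However, as you can see from the output, the prefix science cannot be resolved. The XPath expression results in an exception (XPathExpressionException).

Reading the namespaces from the document and all its elements and caching them

This version reads all namespace declarations from the XML file. Now even the XPath with the prefix science works. One situation makes this version a little complicated: if a prefix is overloaded (declared in nested elements with different URIs), the last declaration found wins. In practice, this is usually not a problem.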
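If overloaded prefixes are a concern, they can be detected before a cache is trusted. The following sketch is an illustration only: the class PrefixConflictFinder, its methods, and the second fiction URI (http://example.org/other-fiction) are hypothetical. It collects every declaration in a document and reports the prefixes that are bound to more than one URI.

```java
import java.io.StringReader;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class PrefixConflictFinder {

    // The nested element rebinds "fiction" to a different, made-up URI.
    static final String XML =
            "<books:booklist xmlns:books=\"http://univNaSpResolver/booklist\""
            + " xmlns:fiction=\"http://univNaSpResolver/fictionbook\">"
            + "<fiction:book xmlns:fiction=\"http://example.org/other-fiction\"/>"
            + "</books:booklist>";

    // Returns the prefixes that are declared with more than one URI.
    static Set<String> conflictsIn(String xml) {
        Map<String, Set<String>> declared = new TreeMap<>();
        collect(parse(xml).getDocumentElement(), declared);
        Set<String> conflicts = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : declared.entrySet()) {
            if (entry.getValue().size() > 1) {
                conflicts.add(entry.getKey());
            }
        }
        return conflicts;
    }

    // Records the declarations on this element, then visits child elements.
    private static void collect(Element element, Map<String, Set<String>> declared) {
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            if (XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(attribute.getNamespaceURI())) {
                // xmlns="..." declares the default namespace (empty prefix).
                String prefix = XMLConstants.XMLNS_ATTRIBUTE.equals(attribute.getNodeName())
                        ? XMLConstants.DEFAULT_NS_PREFIX
                        : attribute.getLocalName();
                declared.computeIfAbsent(prefix, key -> new LinkedHashSet<>())
                        .add(attribute.getNodeValue());
            }
        }
        for (Node child = element.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            if (child instanceof Element) {
                collect((Element) child, declared);
            }
        }
    }

    private static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(conflictsIn(XML)); // [fiction]
    }
}
```

An empty result means that a flat prefix-to-URI cache represents the document faithfully. A non-empty result means that some XPath prefixes would resolve to whichever declaration happened to be stored last.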

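In this example, the NamespaceContext is used in the same way as in the previous example. The boolean toplevelOnly in the constructor must be set to false.

Listing 13. Example 4 with cached namespace resolution (all levels)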

private static void example4(Document example)
        throws XPathExpressionException, TransformerException {
    sysout("\n*** Fourth example - namespaces all levels cached ***");

    XPath xPath = XPathFactory.newInstance().newXPath();
    xPath.setNamespaceContext(new UniversalNamespaceCache(example, false));
    ...
    NodeList result1 = (NodeList) xPath.evaluate(
            "books:booklist/science:book", example, XPathConstants.NODESET);
    ...
    NodeList result2 = (NodeList) xPath.evaluate(
            "books:booklist/fiction:book", example, XPathConstants.NODESET);
    ...
    String result = xPath.evaluate(
            "books:booklist/fiction:book[1]/:author", example);
    ...
}

The output is as follows:

Listing 14. Output of example 4

*** Fourth example - namespaces all levels cached ***
The list of the cached namespaces:
prefix science: uri http://univNaSpResolver/sciencebook
prefix DEFAULT: uri http://univNaSpResolver/book
prefix fiction: uri http://univNaSpResolver/fictionbook
prefix books: uri http://univNaSpResolver/booklist
Now the use of the science prefix works as well:
--> books:booklist/science:book
Number of Nodes: 1
    Michael Schmidt
The fiction namespace is resolved:
--> books:booklist/fiction:book
Number of Nodes: 2
    Johann Wolfgang von Goethe
    Johann Wolfgang von Goethe
The default namespace works also:
--> books:booklist/fiction:book[1]/:author
Johann Wolfgang von Goethe

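Conclusion

There are several ways to implement namespace resolution, and most of them are better than a hardcoded implementation:

• If the example is small and all namespaces are located on the top element, delegating to the document works well.

• If the XML files are larger, with deep nesting and multiple XPath evaluations, it is better to cache the list of namespaces.

• If you do not control the XML file, however, and others can send you any prefixes they like, it is better to be independent of their choices. You can code your own namespace resolution, as in example 1 (HardcodedNamespaceResolver), and use your own prefixes in your XPath expressions.

In all other cases, a NamespaceContext resolved from the XML file makes your code smaller and more generic.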
