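For the past few months, apparently thanks to SOA work, I have been dealing with XML constantly, and I have nearly forgotten SQL again. By now I know of at least four commonly used ways to work with XML data in .NET (Java seems to have much the same set), but I had never actually compared their characteristics or trade-offs. Since I could not find such an experiment online either, here is my own summary.
The test starts by reading the XML source: a fairly large RSS file, copied into the project's bin/debug directory.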
- Stream xmlStream = new MemoryStream(File.ReadAllBytes(path));
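All four tests below read from the same xmlStream, and a stream that has already been read to the end yields nothing more, so each test has to start at position 0. One way to guarantee that is to cache the file bytes once and hand out a fresh stream per test. The following is only a sketch: the XmlSource class and its members are hypothetical names, not part of the original test.

```csharp
using System.IO;

// Hypothetical helper (not in the original post): caches the XML bytes once
// and returns an independent stream, positioned at 0, for every caller.
static class XmlSource
{
    private static byte[] _bytes = new byte[0];

    // Read the file from disk once.
    public static void Load(string path)
    {
        _bytes = File.ReadAllBytes(path);
    }

    // Alternative entry point for data that is already in memory.
    public static void LoadBytes(byte[] bytes)
    {
        _bytes = bytes;
    }

    // Each call wraps the cached bytes in a new read-only MemoryStream.
    public static Stream Open()
    {
        return new MemoryStream(_bytes, writable: false);
    }
}
```

With this in place, a test would call doc.Load(XmlSource.Open()) instead of sharing a single stream.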
1. XmlDocument approach
Code:
- static IList testXmlDocument()
- {
- var doc = new XmlDocument();
- xmlStream.Position = 0; // rewind: the shared stream may already have been read by another test
- doc.Load(xmlStream);
- var nodeList = doc.DocumentElement.ChildNodes;
- var lstChannel = new List<Object>(nodeList.Count );
- foreach (XmlNode node in nodeList)
- {
- var channel = new
- {
- Title = node.SelectSingleNode("title").InnerText,
- Link = node.SelectSingleNode("link").InnerText,
- Description = node.SelectSingleNode("description").InnerText,
- Content = node.SelectSingleNode("content").InnerText,
- PubDate = node.SelectSingleNode("pubDate").InnerText,
- Author = node.SelectSingleNode("author").InnerText,
- Category = node.SelectSingleNode("category").InnerText
- };
- lstChannel.Add(channel);
- }
- return lstChannel;
- }
2. XPathNavigator approach
Code:
- static IList testXmlNavigator()
- {
- var doc = new XmlDocument();
- xmlStream.Position = 0; // rewind: the shared stream may already have been read by another test
- doc.Load(xmlStream);
- var nav = doc.CreateNavigator();
- nav.MoveToRoot();
- var nodeList = nav.Select("/channel/item");
- var lstChannel = new List<Object>(nodeList.Count);
- foreach (XPathNavigator node in nodeList)
- {
- var channel = new
- {
- Title = node.SelectSingleNode("title").Value,
- Link = node.SelectSingleNode("link").Value,
- Description = node.SelectSingleNode("description").Value,
- Content = node.SelectSingleNode("content").Value,
- PubDate = node.SelectSingleNode("pubDate").Value,
- Author = node.SelectSingleNode("author").Value,
- Category = node.SelectSingleNode("category").Value
- };
- lstChannel.Add(channel);
- }
- return lstChannel;
- }
3. XmlTextReader (XmlReader) approach
Code:
- static List<Channel> testXmlReader()
- {
- var lstChannel = new List<Channel>();
- xmlStream.Position = 0; // rewind: the shared stream may already have been read by another test
- var reader = XmlReader.Create(xmlStream);
- while (reader.Read())
- {
- if (reader.Name == "item" && reader.NodeType == XmlNodeType.Element)
- {
- var channel = new Channel();
- lstChannel.Add(channel);
- while (reader.Read())
- {
- if (reader.Name == "item") break;
- if (reader.NodeType != XmlNodeType.Element) continue;
- switch (reader.Name)
- {
- case "title":
- channel.Title = reader.ReadString();
- break;
- case "link":
- channel.Link = reader.ReadString();
- break;
- case "description":
- channel.Description = reader.ReadString();
- break;
- case "content":
- channel.Content = reader.ReadString();
- break;
- case "pubDate":
- channel.PubDate = reader.ReadString();
- break;
- case "author":
- channel.Author = reader.ReadString();
- break;
- case "category":
- channel.Category = reader.ReadString();
- break;
- default:
- break;
- }
- }
- }
- }
- return lstChannel;
- }
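testXmlReader stores its results in a Channel type that the post does not show. A minimal sketch, assuming it is a plain class with one string property per field read above (the actual definition may differ):

```csharp
// Assumed shape of the Channel type used by testXmlReader.
// The original post does not include it; this is one plausible definition.
public class Channel
{
    public string Title { get; set; }
    public string Link { get; set; }
    public string Description { get; set; }
    public string Content { get; set; }
    public string PubDate { get; set; }
    public string Author { get; set; }
    public string Category { get; set; }
}
```

Settable properties matter here because the reader fills the fields one at a time as the corresponding elements arrive.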
4. LINQ to XML approach
Code:
- static IList testXmlLinq()
- {
- xmlStream.Position = 0; // rewind: the shared stream may already have been read by another test
- var xd = XDocument.Load(xmlStream);
- var list = from node in xd.Elements("channel").Descendants("item")
- select new
- {
- Title = node.Element("title").Value,
- Link = node.Element("link").Value,
- Description = node.Element("description").Value,
- Content = node.Element("content").Value,
- PubDate = node.Element("pubDate").Value,
- Author = node.Element("author").Value,
- Category = node.Element("category").Value
- };
- return list.ToList();
- }
Test results:
XmlDocument 47ms
XPathNavigator 42ms
XmlTextReader 23ms
Xml Linq 28ms
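The post does not say how these timings were taken. One common way to collect comparable numbers is a small Stopwatch harness with a warm-up run, sketched below. The Bench class, the AverageMs name, and the default of 10 runs are all assumptions for illustration and not the method used to produce the figures above.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper (not from the original post).
static class Bench
{
    // Runs the action once untimed so JIT compilation is excluded,
    // then returns the average elapsed milliseconds over `runs` timed runs.
    public static double AverageMs(Action action, int runs = 10)
    {
        if (runs <= 0) throw new ArgumentOutOfRangeException(nameof(runs));

        action(); // warm-up, not measured

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < runs; i++)
        {
            action();
        }
        sw.Stop();

        return sw.Elapsed.TotalMilliseconds / runs;
    }
}
```

A call such as Bench.AverageMs(() => testXmlLinq()) would then time one approach. Each run must start from the beginning of the XML stream, otherwise later runs parse nothing and look misleadingly fast.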
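To sum up my understanding: XmlDocument largely follows the W3C DOM model, but it has to parse every node into an object and hold the whole tree in memory, which is often very wasteful. As I understand it, that is why Microsoft's own coding guidelines advise against it. In this test the XPathNavigator version also loads an XmlDocument first and every node gets read, which is probably why the two perform about the same. Of the three random-access approaches, LINQ to XML is the fastest here, although its method names feel a bit awkward. XmlTextReader is often loosely described as SAX; strictly speaking, XmlReader is a pull parser rather than SAX's push model, but it shares the forward-only, read-only design and is unsurprisingly the fastest of the four. The price is a noticeably more laborious implementation: the access logic has to be controlled precisely, and because the fields are filled in one at a time while anonymous types are immutable, the data cannot be stored in an anonymous type.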
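LINQ to XML, released with .NET 3.5, is a good replacement for the first two approaches and is usually the best choice. Only in special cases, when performance requirements are extreme or the XML is too large to download or load into memory in one go, do you have to grit your teeth and fall back on XmlTextReader.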
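For the large-file case there is also a middle ground: drive an XmlReader forward and materialize only one item at a time as an XElement via XNode.ReadFrom, so the query side can still use LINQ to XML. The sketch below assumes the same channel/item layout as the tests above; the ItemStreamer class and StreamItems method are hypothetical names.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper (not from the original post): streams <item> elements
// one at a time instead of loading the whole document into memory.
static class ItemStreamer
{
    public static IEnumerable<XElement> StreamItems(Stream xml)
    {
        using (var reader = XmlReader.Create(xml))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "item")
                {
                    // ReadFrom consumes the whole <item> subtree and leaves the
                    // reader on the node that follows it, so no Read() call here.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }
}
```

Each yielded element can be read with the familiar API, for example (string)item.Element("title"), where the explicit cast returns null instead of throwing when the child element is missing. Only one item is held in memory at a time, as long as the caller does not collect the elements into a list.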