Windows Phone 7: Parsing HTML Data

October 13, 2018 | 移动技术网 IT编程

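In my previous article I covered GB2312 decoding on Windows Phone 7:

/kf/201111/112551.html

That solved the problem of downloaded HTML showing up as garbled text. This article covers parsing HTML on Windows Phone 7 so that we can extract the data we want.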

First, let me introduce a library: HtmlAgilityPack (the same tool I used for decoding in the previous article). The library's DLL is included with the demo.

I'll use Sina News as the example.

Start with the web version of a Sina News article:

http://news.sina.com.cn/w/sd/2011-11-27/070023531646.shtml

Then look at the page source. The news content is structured like this:

<h1 id="artibodytitle" pid="1" tid="1" did="23531646" fid="1666">title</h1>

<a href="http://www.sina.com.cn">http://www.sina.com.cn</a> pub_date <a href="">media_name</a> <a href=""></a>

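Most of these elements also have an id attribute, which makes them much easier to parse.

Now let's start parsing.

Step 1: Add a reference to HtmlAgilityPack.dll.

Step 2: Use the WebClient or WebRequest class to download the HTML page and turn it into a string.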

// Requires: using System; using System.IO; using System.Net;
//           using System.Threading; using System.Windows;

public delegate void CallbackEvent(object sender, DownloadEventArgs e);

public event CallbackEvent DownloadCallbackEvent;

// Downloads the page at url with an HTTP GET and raises
// DownloadCallbackEvent on the UI thread once the body has been read.
public void HttpWebRequestDownloadGet(string url)
{
    Thread _thread = new Thread(delegate()
    {
        Uri _uri = new Uri(url, UriKind.RelativeOrAbsolute);
        HttpWebRequest _httpWebRequest = (HttpWebRequest)WebRequest.Create(_uri);
        _httpWebRequest.Method = "GET";
        _httpWebRequest.BeginGetResponse(new AsyncCallback(delegate(IAsyncResult result)
        {
            HttpWebRequest _httpWebRequestCallback = (HttpWebRequest)result.AsyncState;
            HttpWebResponse _httpWebResponseCallback = (HttpWebResponse)_httpWebRequestCallback.EndGetResponse(result);
            Stream _streamCallback = _httpWebResponseCallback.GetResponseStream();
            // Decode the response as GB2312 (see the previous article).
            StreamReader _streamReader = new StreamReader(_streamCallback, new HtmlAgilityPack.Gb2312Encoding());
            string _stringCallback = _streamReader.ReadToEnd();
            // Switch to the UI thread before notifying subscribers.
            Deployment.Current.Dispatcher.BeginInvoke(new Action(() =>
            {
                if (DownloadCallbackEvent != null)
                {
                    DownloadEventArgs _downloadEventArgs = new DownloadEventArgs();
                    _downloadEventArgs._downloadStream = _streamCallback; // already read to the end
                    _downloadEventArgs._downloadString = _stringCallback;
                    DownloadCallbackEvent(this, _downloadEventArgs);
                }
            }));
        }), _httpWebRequest);
    });
    _thread.Start();
}

o(∩_∩)o Mine is fairly involved. The point is simply to get the HTML downloaded.
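The download method relies on a DownloadEventArgs type that this article does not show. A minimal sketch of what it might look like, assuming it only needs the two fields the listing assigns (the field names mirror the listing; the rest is a guess, not the author's actual class):

```csharp
using System;
using System.IO;

// Hypothetical reconstruction: the original class is not shown in the article.
public class DownloadEventArgs : EventArgs
{
    // The raw response stream. By the time subscribers see it,
    // the string below has already been read from it.
    public Stream _downloadStream;

    // The response body decoded to text.
    public string _downloadString;
}
```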

Here is a simpler way to download:

WebClient webClient = new WebClient();
webClient.Encoding = new HtmlAgilityPack.Gb2312Encoding(); // set the encoding so the text decodes correctly
// Subscribe to the completion event before starting the request.
webClient.DownloadStringCompleted += new DownloadStringCompletedEventHandler(webClient_DownloadStringCompleted);
webClient.DownloadStringAsync(new Uri("http://news.sina.com.cn/s/2011-11-25/120923524756.shtml", UriKind.RelativeOrAbsolute));
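The WebClient snippet registers a DownloadStringCompleted handler but does not show it. Below is a minimal sketch of one way to write it, wrapped in a small helper class. SimpleDownloader, HtmlReady and Download are hypothetical names that do not come from the article; WebClient and DownloadStringCompletedEventArgs are the standard framework types. The encoding is passed in, so the sketch does not depend on any particular GB2312 implementation.

```csharp
using System;
using System.Net;
using System.Text;

// Hypothetical helper, not part of the original demo.
public class SimpleDownloader
{
    // Raised with the downloaded HTML when a request succeeds.
    public event Action<string> HtmlReady;

    public void Download(string url, Encoding encoding)
    {
        WebClient client = new WebClient();
        if (encoding != null)
        {
            client.Encoding = encoding; // e.g. a GB2312 encoding for Sina pages
        }
        // Subscribe before starting the asynchronous request.
        client.DownloadStringCompleted += OnDownloadStringCompleted;
        client.DownloadStringAsync(new Uri(url, UriKind.Absolute));
    }

    private void OnDownloadStringCompleted(object sender, DownloadStringCompletedEventArgs e)
    {
        // Reading e.Result throws if the request failed, so check first.
        if (e.Cancelled || e.Error != null)
        {
            return;
        }
        Action<string> handler = HtmlReady;
        if (handler != null)
        {
            handler(e.Result);
        }
    }
}
```

Checking e.Error before touching e.Result matters because a failed request would otherwise surface as an exception inside the handler.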

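Now process the downloaded string in the callback (e.Result with WebClient; the listing below uses e._downloadString from the DownloadCallbackEvent approach):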

string _result = e._downloadString;
HtmlDocument _doc = new HtmlDocument(); // create an HtmlAgilityPack.HtmlDocument
_doc.LoadHtml(_result); // load the HTML

HtmlNode _htmlNode01 = _doc.GetElementbyId("artibodytitle"); // news title (the h1)
string _title = _htmlNode01.InnerText;

HtmlNode _htmlNode02 = _doc.GetElementbyId("artibody"); // article body
string _content = _htmlNode02.InnerText;
// int _count = _htmlNode02.ChildNodes.Where(new Func<HtmlNode, bool>("p"));

// Drop the trailing non-article text that starts at " .blkcomment", if present.
// IndexOf returns -1 when the marker is missing, and Substring would then throw.
int _pIndex = _content.IndexOf(" .blkcomment");
if (_pIndex >= 0)
{
    _content = _content.Substring(0, _pIndex);
}

#region Sina link
HtmlNode _htmlNode03 = _doc.GetElementbyId("art_source");
string _www = _htmlNode03.FirstChild.InnerText;
string _wwwint = _htmlNode03.FirstChild.Attributes[0].Value;
#endregion
// string _source = _htmlNode03;
// _htmlNode03.ChildNodes

#region Publication time
HtmlNode _htmlNode04 = _doc.GetElementbyId("pub_date");
string _pub_date = _htmlNode04.InnerText;
#endregion

#region Source site
HtmlNode _htmlNode05 = _doc.GetElementbyId("media_name");
string _media_name = _htmlNode05.FirstChild.InnerText;
string _media_source = _htmlNode05.FirstChild.Attributes[0].Value;
#endregion

// Bind the parsed values to the page controls.
media_nameHyperlinkButton.Content = _pub_date + " " + _media_name;
media_nameHyperlinkButton.NavigateUri = new Uri(_media_source, UriKind.RelativeOrAbsolute);
titleTextBlock.Text = _title;
contentTextBlock.Text = _content;

The result looks like this: [screenshot]

Most tags on a web page have no id attribute, but fortunately HtmlAgilityPack supports XPath.

In that case, use an XPath expression to find the nodes you need.

XPath tutorial: http://www.w3school.com.cn/xpath/index.
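To make the XPath route concrete, here is a minimal sketch that collects the text of every node matching an XPath expression. XPathSketch and SelectTexts are hypothetical names, not from the demo. It assumes the HtmlAgilityPack build in use exposes SelectNodes on HtmlNode, as the desktop build does; a phone build may need an extra XPath assembly or a different lookup.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper, not part of the original demo.
public static class XPathSketch
{
    // Returns the trimmed InnerText of every node matching xpath.
    // The list is empty when nothing matches.
    public static List<string> SelectTexts(string html, string xpath)
    {
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(html);

        List<string> texts = new List<string>();
        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null)
        {
            return texts;
        }
        foreach (HtmlNode node in nodes)
        {
            texts.Add(node.InnerText.Trim());
        }
        return texts;
    }
}
```

For example, XPathSketch.SelectTexts(html, "//div[@class='news']//a") would gather the text of every link inside a div whose class is news. The class name here is made up for illustration.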

Sample download:

http://115.com/file/dn87dl2d#

myframework_test.zip

Author: 青瓷
