当前位置: 首页 > 图文教程 > 网络编程 > ASP.NET > asp.net HttpWebRequest自动识别网页编码

ASP.NET
asp.net GridView控件中模板列CheckBox全选、反选、取消
asp.net GridView 删除时弹出确认对话框(包括内容提示)
asp.net DropDownList 三级联动下拉菜单实现代码
asp DataTable添加列和行的三种方法
Asp.net 页面调用javascript变量的值
asp.net 长文章通过设定的行数分页
asp.net 定时间点执行任务的简易解决办法
asp.net 页面延时五秒,跳转到另外的页面
asp.net 动态输出透明gif图片
asp.net DataList与Repeater用法区别
asp.net Javascript获取CheckBoxList的value
asp.net程序在调式和发布之间图片路径问题的解决方法
asp.net下生成英文字符数字验证码的代码
asp.net 页面版文本框智能提示JSCode (升级版)
ASP.NET URL伪静态重写实现方法
ASP.NET 2.0 中Forms安全认证
asp.net 动态添加多个用户控件
asp.net Repeater显示父子表数据,无闪烁
asp.net 无法获取的内部内容,因为该内容不是文本 的解决方法
asp.net GridView排序简单实现

ASP.NET 中的 asp.net HttpWebRequest自动识别网页编码


出处:互联网   整理: 软晨网(RuanChen.com)   发布: 2009-09-13   浏览: 87 ::
收藏到网摘: n/a

HttpWebRequest获取网页源代码时自动识别网页编码,通过读取页面中的charset和读取http头中的编码信息获取页面的编码,基本可以正确获取网页编码

复制代码 代码如下:

static string GetEncoding(string url)
{
HttpWebRequest request = null;
HttpWebResponse response = null;
StreamReader reader = null;
try
{
request = (HttpWebRequest)WebRequest.Create(url);
request.Timeout = 20000;
request.AllowAutoRedirect = false;
response = (HttpWebResponse)request.GetResponse();
if (response.StatusCode == HttpStatusCode.OK && response.ContentLength < 1024 * 1024)
{
if (response.ContentEncoding != null && response.ContentEncoding.Equals("gzip", StringComparison.InvariantCultureIgnoreCase))
reader = new StreamReader(new GZipStream(response.GetResponseStream(), CompressionMode.Decompress));
else
reader = new StreamReader(response.GetResponseStream(), Encoding.ASCII);
string html = reader.ReadToEnd();
Regex reg_charset = new Regex(@"charset\b\s*=\s*(?<charset>[^""]*)");
if (reg_charset.IsMatch(html))
{
return reg_charset.Match(html).Groups["charset"].Value;
}
else if (response.CharacterSet != string.Empty)
{
return response.CharacterSet;
}
else
return Encoding.Default.BodyName;
}
}
catch
{
}
finally
{
if (response != null)
{
response.Close();
response = null;
}
if (reader != null)
reader.Close();
if (request != null)
request = null;
}
return Encoding.Default.BodyName;
}
/// <summary>
/// 获取源代码
/// </summary>
/// <param name="url"></param>
/// <returns></returns>
static string GetHtml(string url, Encoding encoding)
{
HttpWebRequest request = null;
HttpWebResponse response = null;
StreamReader reader = null;
try
{
request = (HttpWebRequest)WebRequest.Create(url);
request.Timeout = 20000;
request.AllowAutoRedirect = false;
response = (HttpWebResponse)request.GetResponse();
if (response.StatusCode == HttpStatusCode.OK && response.ContentLength < 1024 * 1024)
{
if (response.ContentEncoding != null && response.ContentEncoding.Equals("gzip", StringComparison.InvariantCultureIgnoreCase))
reader = new StreamReader(new GZipStream(response.GetResponseStream(), CompressionMode.Decompress), encoding);
else
reader = new StreamReader(response.GetResponseStream(), encoding);
string html = reader.ReadToEnd();
return html;
}
}
catch
{
}
finally
{
if (response != null)
{
response.Close();
response = null;
}
if (reader != null)
reader.Close();
if (request != null)
request = null;
}
return string.Empty;
}