HttpClient + Jsoup: Simulated Login and HTML Parsing

Uploaded by: b****0 | Document ID: 12456171 | Uploaded: 2023-04-19 | Format: DOCX | Pages: 42 | Size: 25.34 KB

HttpClient + Jsoup: simulated login, HTML parsing, and information filtering (Guangdong University of Technology library)

Weibo:

QQ: 375061590

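I have recently been building an all-in-one campus client for Android. The main goal is to bring together the information from the school's various websites on a single platform for students to browse.

The approach is as follows, using the Guangdong University of Technology library website as an example.

Goal: log in to the library with a personal account and retrieve that account's borrowing records.

Login URL: http://222.200.98.171:81/login.aspx

This uses Chrome's developer tools (press F12 in the browser to open them).

Open the source of the login page. Below is the form tag from that source.

HTML code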

<form name="aspnetForm" method="post" action="login.aspx?ReturnUrl=%2fuser%2fuserinfo.aspx" onsubmit="javascript:return WebForm_OnSubmit();" id="aspnetForm">
<div>
<input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" />
<input type="hidden" name="__EVENTARGUMENT" id="__EVENTARGUMENT" value="" />

<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwULLTE0MjY3MDAxNzcPZBYCZg9kFgoCAQ8PFgIeCEltYWdlVXJsBRt+XGltYWdlc1xoZWFkZXJvcGFjNGdpZi5naWZkZAICDw8WAh4EVGV4dAUt5bm/5Lic5bel5Lia5aSn5a2m5Zu+5Lmm6aaG5Lmm55uu5qOA57Si57O757ufZGQCAw8PFgIfAQUcMjAxM+W5tDAz5pyIMDXml6UgIOaYn+acn+S6jGRkAgQPZBYEZg9kFgQCAQ8WAh4LXyFJdGVtQ291bnQCCBYSAgEPZBYCZg8VAwtzZWFyY2guYXNweAAM55uu5b2V5qOA57SiZAICD2QWAmYPFQMTcGVyaV9uYXZfY2xhc3MuYXNweAAM5YiG57G75a+86IiqZAIDD2QWAmYPFQMOYm9va19yYW5rLmFzcHgADOivu+S5puaMh+W8lWQCBA9kFgJmDxUDCXhzdGIuYXNweAAM5paw5Lmm6YCa5oqlZAIFD2QWAmYPFQMUcmVhZGVycmVjb21tZW5kLmFzcHgADOivu+iAheiNkOi0rWQCBg9kFgJmDxUDE292ZXJkdWVib29rc19mLmFzcHgADOaPkOmGkuacjeWKoWQCBw9kFgJmDxUDEnVzZXIvdXNlcmluZm8uYXNweAAP5oiR55qE5Zu+5Lmm6aaGZAIID2QWAmYPFQMbaHR0cDovL2xpYnJhcnkuZ2R1dC5lZHUuY24vAA/lm77kuabppobpppbpobVkAgkPZBYCAgEPFgIeB1Zpc2libGVoZAIDDxYCHwJmZAIBD2QWBAIDD2QWBAIBDw9kFgIeDGF1dG9jb21wbGV0ZQUDb2ZmZAIHDw8WAh8BZWRkAgUPZBYGAgEPEGRkFgFmZAIDDxBkZBYBZmQCBQ8PZBYCHwQFA29mZmQCBQ8PFgIfAQWlAUNvcHlyaWdodCAmY29weTsyMDA4LTIwMDkuIFNVTENNSVMgT1BBQyA0LjAxIG9mIFNoZW56aGVuIFVuaXZlcnNpdHkgTGlicmFyeS4gIEFsbCByaWdodHMgcmVzZXJ2ZWQuPGJyIC8+54mI5p2D5omA5pyJ77ya5rex5Zyz5aSn5a2m5Zu+5Lmm6aaGIEUtbWFpbDpzenVsaWJAc3p1LmVkdS5jbmRkZL5QuJMrEZz+0UxuTVpXZ/EaY5A4" />

</div>

<script type="text/javascript">
//<![CDATA[
var theForm = document.forms['aspnetForm'];
if (!theForm) {
    theForm = document.aspnetForm;
}
function __doPostBack(eventTarget, eventArgument) {
    if (!theForm.onsubmit || (theForm.onsubmit() != false)) {
        theForm.__EVENTTARGET.value = eventTarget;
        theForm.__EVENTARGUMENT.value = eventArgument;
        theForm.submit();
    }
}
//]]>
</script>

<script src="/WebResource.axd?d=kbLQnwjf5uNQN4GcWRC5kD1rIySOzkR3uLyKE5xUO0j4Fa2lQPZwQlk_qYaspRXtlojncSBfRJNkA00qXOMQqsKd8WY1&amp;t=634751988274393221" type="text/javascript"></script>
<script src="/WebResource.axd?d=nsbO6ZJty6_6fuRufFNYnRiJ-xEoD0xQr70NX6g0v64gngATPLSnyyt7jyZkELLW6THXmh92_m0Y5TyvhES_-JroQeU1&amp;t=634751988274393221" type="text/javascript"></script>

<script type="text/javascript">
//<![CDATA[
function WebForm_OnSubmit() {
    if (typeof(ValidatorOnSubmit) == "function" && ValidatorOnSubmit() == false) return false;
    return true;
}
//]]>
</script>

<div>
<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="/wEWBQKa7ezdCwKOmK5RApX9wcYGAsP9wL8JAqW86pcIaBhXmFYzd5pGDTk/afln2TfArPw=" />
</div>
<input name="ctl00$ContentPlaceHolder1$txtlogintype" type="hidden" id="ctl00_ContentPlaceHolder1_txtlogintype" value="0" />

<div id="Login" class="clearFix">
    <div class="LoginTitle">
        登录我的图书馆
    </div>
    <div class="LeftLogin">
        <div class="LoginDiv">
            <div class="loginContent">
                <div class="loginInfo">
                    <span class="leftInfo">图书证号:</span>
                    <span class="rightInfo">
                        <input name="ctl00$ContentPlaceHolder1$txtUsername_Lib" type="text" id="ctl00_ContentPlaceHolder1_txtUsername_Lib" class="txtInput" autocomplete="off" /><span id="ctl00_ContentPlaceHolder1_rfv_UserName_Lib" style="color:Red;display:none;">请输入证号</span>
                    </span>
                </div>
                <div class="loginInfo">
                    <span class="leftInfo">密&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;码:</span>
                    <span class="rightInfo">
                        <input name="ctl00$ContentPlaceHolder1$txtPas_Lib" type="password" id="ctl00_ContentPlaceHolder1_txtPas_Lib" class="txtInput" /><span id="ctl00_ContentPlaceHolder1_rfv_Password_Lib" style="color:Red;display:none;">请输入密码</span>
                    </span>
                </div>
                <div>
                    <span id="ctl00_ContentPlaceHolder1_lblErr_Lib"></span>
                </div>
                <div class="loginInfo">
                    <input type="submit" name="ctl00$ContentPlaceHolder1$btnLogin_Lib" value="登录" onclick="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions(&quot;ctl00$ContentPlaceHolder1$btnLogin_Lib&quot;, &quot;&quot;, true, &quot;&quot;, &quot;&quot;, false, false))" id="ctl00_ContentPlaceHolder1_btnLogin_Lib" class="btn" />
                    <input type="button" value="清空" onclick="rset()" class="btn" />
                </div>
            </div>
        </div>
    </div>
    <div class="RightDescription">
        <img src="images/pin.gif" /><br />
        1.如果您使用的是公共电脑,请在使用完毕后,务必退出登录,以保安全。
        <br />
        2.首次登录,请先<a href="changepas.aspx">修改初始密码</a>。
    </div>
</div>

<script type="text/javascript">
//<![CDATA[
var Page_Validators = new Array(document.getElementById("ctl00_ContentPlaceHolder1_rfv_UserName_Lib"), document.getElementById("ctl00_ContentPlaceHolder1_rfv_Password_Lib"));
//]]>
</script>

<script type="text/javascript">
//<![CDATA[
var ctl00_ContentPlaceHolder1_rfv_UserName_Lib = document.all ? document.all["ctl00_ContentPlaceHolder1_rfv_UserName_Lib"] : document.getElementById("ctl00_ContentPlaceHolder1_rfv_UserName_Lib");
ctl00_ContentPlaceHolder1_rfv_UserName_Lib.controltovalidate = "ctl00_ContentPlaceHolder1_txtUsername_Lib";
ctl00_ContentPlaceHolder1_rfv_UserName_Lib.focusOnError = "t";
ctl00_ContentPlaceHolder1_rfv_UserName_Lib.errormessage = "请输入证号";
ctl00_ContentPlaceHolder1_rfv_UserName_Lib.display = "Dynamic";
ctl00_ContentPlaceHolder1_rfv_UserName_Lib.evaluationfunction = "RequiredFieldValidatorEvaluateIsValid";
ctl00_ContentPlaceHolder1_rfv_UserName_Lib.initialvalue = "";
var ctl00_ContentPlaceHolder1_rfv_Password_Lib = document.all ? document.all["ctl00_ContentPlaceHolder1_rfv_Password_Lib"] : document.getElementById("ctl00_ContentPlaceHolder1_rfv_Password_Lib");
ctl00_ContentPlaceHolder1_rfv_Password_Lib.controltovalidate = "ctl00_ContentPlaceHolder1_txtPas_Lib";
ctl00_ContentPlaceHolder1_rfv_Password_Lib.focusOnError = "t";
ctl00_ContentPlaceHolder1_rfv_Password_Lib.errormessage = "请输入密码";
ctl00_ContentPlaceHolder1_rfv_Password_Lib.display = "Dynamic";
ctl00_ContentPlaceHolder1_rfv_Password_Lib.evaluationfunction = "RequiredFieldValidatorEvaluateIsValid";
ctl00_ContentPlaceHolder1_rfv_Password_Lib.initialvalue = "";
//]]>
</script>

<script type="text/javascript">
//<![CDATA[
var Page_ValidationActive = false;
if (typeof(ValidatorOnLoad) == "function") {
    ValidatorOnLoad();
}
function ValidatorOnSubmit() {
    if (Page_ValidationActive) {
        return ValidatorCommonOnSubmit();
    }
    else {
        return true;
    }
}
//]]>
</script>
</form>

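There is a lot of markup in there. What we need is the form data required for the login. Tags such as input and select all contribute to the submitted login form; this page only has input tags, so extracting those is enough. The code is as follows:

Two methods: initLoginParmas(String userName, String passWord) and getLoginFormData(String url)

Java code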

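import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Set;

import org.apache.http.NameValuePair;
import org.apache.http.ParseException; // assumed to be the Apache HttpClient exception type
import org.apache.http.message.BasicNameValuePair;

/**
 * Initializes the login form parameters.
 *
 * @param userName library card number
 * @param passWord password
 * @return the form fields as name/value pairs, with the credentials filled in
 * @throws ParseException
 * @throws IOException
 */
public static List<NameValuePair> initLoginParmas(String userName,
        String passWord) throws ParseException, IOException {
    List<NameValuePair> parmasList = new ArrayList<NameValuePair>();
    HashMap<String, String> parmasMap = getLoginFormData(LoginUrl);
    Set<String> keySet = parmasMap.keySet();
    // Fill in the credentials: the matching input names contain "Username" and "txtPas".
    for (String temp : keySet) {
        if (temp.contains("Username")) {
            parmasMap.put(temp, userName);
        } else if (temp.contains("txtPas")) {
            parmasMap.put(temp, passWord);
        }
    }
    // Print the form data for debugging.
    System.out.println("Form data:");
    for (String temp : keySet) {
        System.out.println(temp + "=" + parmasMap.get(temp));
    }
    // Convert the map into the name/value pair list.
    for (String temp : keySet) {
        parmasList.add(new BasicNameValuePair(temp, parmasMap.get(temp)));
    }
    // System.out.println("initParams\n" + parmasMap);
    return parmasList;
}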
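The list returned by initLoginParmas is presumably what ends up in the body of the login POST. The sketch below shows that encoding step using only the JDK (java.net.URLEncoder and the Java 11+ java.net.http request builder) instead of the Apache HttpClient classes the article works with, where UrlEncodedFormEntity usually plays this role. The class name FormBodySketch, its method names, and the sample field values are hypothetical; the request is built but never sent.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of the original article.
final class FormBodySketch {

    /** Encodes name/value pairs as an application/x-www-form-urlencoded body. */
    static String encodeForm(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            // Both the name and the value are percent-encoded; ASP.NET field
            // names contain '$' and __VIEWSTATE values contain '+', '/' and '='.
            body.add(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    /** Builds (but does not send) a POST request carrying the encoded form. */
    static HttpRequest buildLoginRequest(String loginUrl, Map<String, String> fields) {
        return HttpRequest.newBuilder(URI.create(loginUrl))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(encodeForm(fields)))
                .build();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("__VIEWSTATE", "/wEP+demo=");  // placeholder, not the real value
        fields.put("ctl00$ContentPlaceHolder1$txtUsername_Lib", "demoUser");
        fields.put("ctl00$ContentPlaceHolder1$txtPas_Lib", "demoPass");
        System.out.println(encodeForm(fields));
        System.out.println(
                buildLoginRequest("http://222.200.98.171:81/login.aspx", fields).method());
    }
}
```

The hidden fields are encoded along with the credentials because the form above carries them as ordinary inputs, so the whole set has to travel together in one body.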

Java code

/**
 * Gets the contents of the login form's input elements.
 *
 * @param url
 * @return
 * @throws IOException
 * @throws ParseException
 */
public static HashMap<String, String> getLoginFormData(String url)
        throws ParseException, IOException {
    Document document = Jsoup.parse(getHtml(url));
    Elements element1 = document.getElemen
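The listing breaks off partway through getLoginFormData, so its actual body is not recoverable here. Purely to illustrate the idea described earlier (collect every named input together with its value), the following dependency-free sketch uses regular expressions in place of Jsoup. That is a swapped-in technique, not the author's code: it only copes with flat, well-formed markup with double-quoted attributes like the form shown above, it does not decode HTML entities, and it ignores select elements. InputFieldSketch and extractInputs are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original article.
final class InputFieldSketch {

    // Matches one <input ...> tag; assumes no '>' inside attribute values.
    private static final Pattern INPUT_TAG =
            Pattern.compile("<input\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    // Matches one attribute of the form name="value" (double quotes only).
    private static final Pattern ATTRIBUTE =
            Pattern.compile("([\\w-]+)\\s*=\\s*\"([^\"]*)\"");

    /** Returns input name -> value for every input that has a name attribute. */
    static Map<String, String> extractInputs(String html) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher tag = INPUT_TAG.matcher(html);
        while (tag.find()) {
            // Collect all attributes of this tag first.
            Map<String, String> attrs = new LinkedHashMap<>();
            Matcher attr = ATTRIBUTE.matcher(tag.group());
            while (attr.find()) {
                attrs.put(attr.group(1).toLowerCase(Locale.ROOT), attr.group(2));
            }
            // Inputs without a name are never submitted, so skip them.
            String name = attrs.get("name");
            if (name != null) {
                fields.put(name, attrs.getOrDefault("value", ""));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String html = "<form>"
                + "<input type=\"hidden\" name=\"__EVENTTARGET\" id=\"__EVENTTARGET\" value=\"\" />"
                + "<input name=\"ctl00$ContentPlaceHolder1$txtUsername_Lib\" type=\"text\" />"
                + "<input type=\"button\" value=\"Clear\" onclick=\"rset()\" />"
                + "</form>";
        for (Map.Entry<String, String> field : extractInputs(html).entrySet()) {
            System.out.println(field.getKey() + "=" + field.getValue());
        }
    }
}
```

A map produced this way has the same shape as the HashMap that initLoginParmas expects, with empty values for the username and password inputs until the credentials are filled in. For real pages a proper HTML parser such as Jsoup, which the article uses, is the more robust choice.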
