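【Background】
I previously wrote a tutorial analyzing the logic behind emulating a Baidu login:
【Tutorial】A step-by-step guide to using a tool (IE9's F12) to analyze the internal logic of emulating a website login (Baidu homepage)
I then implemented it in several languages:
Python:
【Tutorial】Emulating website login, Python version (includes two complete, runnable versions of the code)
C#:
【Tutorial】Emulating website login, C# version (includes two complete, runnable versions of the code)
Java:
And now:
I knew next to nothing about the Go language, but I had learned that it also has a corresponding HTTP library.
So I decided to start from scratch and, step by step, learn Go itself while implementing the same emulated Baidu login.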
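【Process】
1. First, learn the Go language itself:
2. Then get the basics of development working:
【Record】Go language development basics: implementing Hello World and finding a suitable development environment and tools
3. I switched to another machine, also running x64 Win7, downloaded and installed Go again, and tried the plain Hello World once more.
A few points worth mentioning:
(1) After the automatic installation finished, the corresponding path:
D:\tmp\dev_install_root\Go\bin
had already been added to the current PATH;
(2) The go/bin directory contains three tools: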
- go.exe
- godoc.exe
- gofmt.exe
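4. Continue learning how to write Go code:
5. Having figured out how to write Go code, the next step was to consult the official documentation and learn how to write HTTP code.
6. Go's naming conventions are introduced here:
7. Next, work on implementing basic web page fetching:
8. However, the content obtained above was only printed to the cmd window, which is inconvenient for recording and reviewing during later development.
So I wanted to log the content to a file: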
9. Then a file encoding problem came up:
【Problem】Go code fails to run: # command-line-arguments .\EmulateLoginBaidu.go:86: illegal UTF-8 sequence
10. Next came the error "cannot use body (type []byte) as type string in assignment":
【Solved】Assigning the body returned by http directly to a string in Go fails: cannot use body (type []byte) as type string in assignment
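11. At this point, the following works:
the HTML of the Baidu homepage is fetched and written to the log file.
12. Next, figure out how to get the cookies returned over HTTP:
13. Then I needed to extract the content I wanted from the returned HTML, so I had to work that out: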
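The final program does this extraction with a regular expression. Below is a minimal sketch of the same pattern, this time reading the capture by its group name. The helper `extractLoginToken` is made up for the example, the sample input is the line quoted in the program's comments, and `SubexpIndex` assumes Go 1.15 or later.

```go
package main

import (
    "fmt"
    "regexp"
)

// loginTokenRe matches a JavaScript assignment such as
//   bdPass.api.params.login_token='278623fc5463aa25b0189ddd34165592';
var loginTokenRe = regexp.MustCompile(`bdPass\.api\.params\.login_token='(?P<loginToken>\w+)';`)

// extractLoginToken (hypothetical helper) returns the captured token and
// whether the pattern matched at all.
func extractLoginToken(html string) (string, bool) {
    m := loginTokenRe.FindStringSubmatch(html)
    if m == nil {
        return "", false
    }
    // SubexpIndex maps the group name to its position in m (Go 1.15+).
    return m[loginTokenRe.SubexpIndex("loginToken")], true
}

func main() {
    sample := "bdPass.api.params.login_token='278623fc5463aa25b0189ddd34165592';"
    if token, ok := extractLoginToken(sample); ok {
        fmt.Println("login_token =", token)
    } else {
        fmt.Println("login_token not found")
    }
}
```

On older Go versions the same lookup can be done by scanning `SubexpNames()`, or simply by using index 1 as the program below does.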
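14. Next, understand dictionary-type (map) variables in Go:
15. Then figure out how to read input from the console:
16. Then figure out how to send an HTTP POST:
【Record】Implementing an HTTP POST in Go and passing the corresponding post data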
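To make step 16 concrete, here is a minimal sketch of a form-encoded POST. It runs against a local `httptest` server instead of Baidu, so the echo handler and the helper names `postForm` and `newEchoServer` are assumptions made purely for illustration.

```go
package main

import (
    "fmt"
    "io"
    "net/http"
    "net/http/httptest"
    "net/url"
    "strings"
)

// postForm (hypothetical helper) sends postDict as an
// application/x-www-form-urlencoded body and returns the response body.
func postForm(client *http.Client, strUrl string, postDict map[string]string) (string, error) {
    postValues := url.Values{}
    for k, v := range postDict {
        postValues.Set(k, v)
    }
    req, err := http.NewRequest("POST", strUrl, strings.NewReader(postValues.Encode()))
    if err != nil {
        return "", err
    }
    req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
    resp, err := client.Do(req)
    if err != nil {
        return "", err
    }
    defer resp.Body.Close()
    body, err := io.ReadAll(resp.Body)
    if err != nil {
        return "", err
    }
    return string(body), nil
}

// newEchoServer starts a local test server that echoes the request method
// and the "username" form field, standing in for a real login endpoint.
func newEchoServer() *httptest.Server {
    return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        fmt.Fprintf(w, "method=%s username=%s", r.Method, r.PostFormValue("username"))
    }))
}

func main() {
    srv := newEchoServer()
    defer srv.Close()
    out, err := postForm(srv.Client(), srv.URL, map[string]string{"username": "demo", "charset": "utf-8"})
    if err != nil {
        fmt.Println("post failed:", err)
        return
    }
    fmt.Println(out)
}
```

Without the Content-Type header, many servers will not parse the body as form fields, which is why the program below sets it explicitly as well.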
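17. With POST working and the post data being passed, the emulated login succeeded and the corresponding cookies were returned.
So the next step was to check each of those cookies:
Note that the Set-Cookie in the httpResp.Header returned at this moment was:
(formatted)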
BDUSS=G1LNG5uLTNYWkU2bzA2SGxCZHZ2Rm5ocnN-MEhFem5uQkZrdkJFVmplUmpBV1ZTQVFBQUFBJCQAAAAAAAAAAAEAAAB-OUgCYWdhaW5pbnB1dAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGN0PVJjdD1SM; expires=Wed, 08-Dec-2021 10:26:43 GMT; path=/; domain=baidu.com; httponly
PTOKEN=deleted; expires=Fri, 21-Sep-2012 10:26:42 GMT; path=/; domain=baidu.com; httponly
PTOKEN=0f1e0187b042630a47c4eea8e0e96a2f; expires=Wed, 08-Dec-2021 10:26:43 GMT; path=/; domain=passport.baidu.com; httponly
STOKEN=8d6ce0cbc7f689a8cd647b8beb5872e3; expires=Wed, 08-Dec-2021 10:26:43 GMT; path=/; domain=passport.baidu.com; httponly
SAVEUSERID=deleted; expires=Fri, 21-Sep-2012 10:26:42 GMT; path=/; domain=passport.baidu.com; httponly
USERNAMETYPE=1; expires=Wed, 08-Dec-2021 10:26:43 GMT; path=/; domain=passport.baidu.com; httponly
From this we can see, for these cookies:
(1) PTOKEN, for:
domain=baidu.com
has been deleted;
while for passport.baidu.com, PTOKEN still exists;
(2) The other cookies:
STOKEN, SAVEUSERID and USERNAMETYPE all have the domain:
passport.baidu.com
rather than, as I had assumed:
baidu.com
(3) BDUSS's domain is indeed baidu.com
That being so, I expected the earlier code:
gCurCookies = gCurCookieJar.Cookies(httpReq.URL);
to return only
BDUSS
or only:
STOKEN, SAVEUSERID, USERNAMETYPE
Fortunately, though, the cookies printed here by:
dbgPrintCurCookies
were all present:
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:199) cookieNum=7
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:203) ------ Cookie [0]------
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:204) Name =H_PS_PSSID
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:205) Value =3359_1455_2976_2981_3090
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:206) Path =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:207) Domain =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:208) Expires =0001-01-01 00:00:00 +0000 UTC
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:209) RawExpires =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:210) MaxAge =0
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:211) Secure =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:212) HttpOnly =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:213) Raw =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:214) Unparsed =[]
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:203) ------ Cookie [1]------
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:204) Name =BAIDUID
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:205) Value =74F5614706B58BFCCCB3923C8ABD3E61:FG=1
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:206) Path =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:207) Domain =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:208) Expires =0001-01-01 00:00:00 +0000 UTC
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:209) RawExpires =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:210) MaxAge =0
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:211) Secure =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:212) HttpOnly =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:213) Raw =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:214) Unparsed =[]
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:203) ------ Cookie [2]------
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:204) Name =HOSUPPORT
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:205) Value =1
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:206) Path =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:207) Domain =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:208) Expires =0001-01-01 00:00:00 +0000 UTC
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:209) RawExpires =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:210) MaxAge =0
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:211) Secure =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:212) HttpOnly =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:213) Raw =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:214) Unparsed =[]
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:203) ------ Cookie [3]------
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:204) Name =BDUSS
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:205) Value =W95bX41ZTlhNkFKQkpQcGd5Y1ZUOENiYzJ2TkpvakJaZVBXSS10WXh1THVCbVZTQVFBQUFBJCQAAAAAAAAAAAEAAAB-OUgCYWdhaW5pbnB1dAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAO55PVLueT1SO
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:206) Path =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:207) Domain =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:208) Expires =0001-01-01 00:00:00 +0000 UTC
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:209) RawExpires =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:210) MaxAge =0
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:211) Secure =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:212) HttpOnly =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:213) Raw =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:214) Unparsed =[]
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:203) ------ Cookie [4]------
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:204) Name =PTOKEN
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:205) Value =2e67f3d7d5c52118bf4d222ab87ac9a4
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:206) Path =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:207) Domain =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:208) Expires =0001-01-01 00:00:00 +0000 UTC
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:209) RawExpires =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:210) MaxAge =0
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:211) Secure =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:212) HttpOnly =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:213) Raw =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:214) Unparsed =[]
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:203) ------ Cookie [5]------
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:204) Name =STOKEN
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:205) Value =63a3b62efbd83a00c095c624ca4dfdfc
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:206) Path =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:207) Domain =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:208) Expires =0001-01-01 00:00:00 +0000 UTC
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:209) RawExpires =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:210) MaxAge =0
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:211) Secure =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:212) HttpOnly =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:213) Raw =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:214) Unparsed =[]
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:203) ------ Cookie [6]------
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:204) Name =USERNAMETYPE
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:205) Value =1
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:206) Path =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:207) Domain =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:208) Expires =0001-01-01 00:00:00 +0000 UTC
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:209) RawExpires =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:210) MaxAge =0
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:211) Secure =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:212) HttpOnly =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:213) Raw =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:214) Unparsed =[]
So from here on, I can simply check whether each cookie exists by its name.
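Two behaviours of net/http/cookiejar appear to explain the log above. First, Jar.Cookies(u) returns every cookie that would be sent to u, which for a passport.baidu.com URL includes cookies set for the parent domain baidu.com. Second, as far as I can tell the returned cookies carry only Name and Value, which would be why Path, Domain and Expires print as empty. A minimal sketch, with made-up cookie values and the hypothetical helpers `demoCookies` and `cookiesFor`:

```go
package main

import (
    "fmt"
    "net/http"
    "net/http/cookiejar"
    "net/url"
)

// demoCookies returns made-up cookies that mimic the domains seen in the
// Set-Cookie header above (the values are placeholders, not real tokens).
func demoCookies() []*http.Cookie {
    return []*http.Cookie{
        {Name: "BDUSS", Value: "demo-bduss", Path: "/", Domain: "baidu.com"},
        {Name: "PTOKEN", Value: "demo-ptoken", Path: "/", Domain: "passport.baidu.com"},
        {Name: "STOKEN", Value: "demo-stoken", Path: "/", Domain: "passport.baidu.com"},
    }
}

// cookiesFor stores setCookies in a fresh jar as if they had been received
// from setURL, then returns the cookies the jar would send to queryURL.
func cookiesFor(setURL, queryURL string, setCookies []*http.Cookie) ([]*http.Cookie, error) {
    jar, err := cookiejar.New(nil)
    if err != nil {
        return nil, err
    }
    su, err := url.Parse(setURL)
    if err != nil {
        return nil, err
    }
    qu, err := url.Parse(queryURL)
    if err != nil {
        return nil, err
    }
    jar.SetCookies(su, setCookies)
    return jar.Cookies(qu), nil
}

func main() {
    loginURL := "https://passport.baidu.com/v2/api/?login"
    for _, target := range []string{loginURL, "http://www.baidu.com/"} {
        cookies, err := cookiesFor(loginURL, target, demoCookies())
        if err != nil {
            fmt.Println("error:", err)
            return
        }
        fmt.Println("cookies sent to", target)
        for _, c := range cookies {
            fmt.Printf("  Name=%s Value=%s Domain=%q\n", c.Name, c.Value, c.Domain)
        }
    }
}
```

In this sketch the passport.baidu.com URL gets all three cookies back, while www.baidu.com only gets the one scoped to baidu.com, so checking by name after the login request is enough.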
18. In the end, the emulated Baidu login succeeded.
The code used is:
/*
* [File]
* EmulateLoginBaidu.go
*
* [Function]
* 【Record】Emulating Baidu login with the Go language
* https://www.crifan.com/emulate_login_baidu_using_go_language/
*
* [Version]
* 2013-09-21
*
* [Contact]
* https://www.crifan.com/about/me/
*/
package main

import (
    "fmt"
    //"builtin"
    //"log"
    "os"
    "runtime"
    "path"
    "strings"
    "time"
    //"io"
    "io/ioutil"
    "net/http"
    "net/http/cookiejar"
    "net/url"
    //"sync"
    "regexp"
    //"bufio"
    "bytes"
)

//import l4g "log4go.googlecode.com/hg"
//import l4g "code.google.com/p/log4go"
import "code.google.com/p/log4go"
/***************************************************************************************************
Global Variables
***************************************************************************************************/
var gCurCookies []*http.Cookie
var gCurCookieJar *cookiejar.Jar
var gLogger log4go.Logger
/***************************************************************************************************
Functions
***************************************************************************************************/
//do init before all others
func initAll() {
    gCurCookies = nil
    //var err error;
    gCurCookieJar, _ = cookiejar.New(nil)
    gLogger = nil

    initLogger()
    initCrifanLib()
}
//de-init for all
func deinitAll() {
    gCurCookies = nil
    //close the logger only if it was actually created
    if nil != gLogger {
        gLogger.Close()
        //os.Stdout.Sync() //try manually flush, but can not fix log4go's flush bug
        gLogger = nil
    }
}
//do some init for crifanLib
func initCrifanLib() {
    gLogger.Debug("init for crifanLib")
    gCurCookies = nil
    return
}
//init for logger
func initLogger() {
    var filenameOnly string = GetCurFilename()
    var logFilename string = filenameOnly + ".log"

    //gLogger = log4go.NewLogger()
    //gLogger = make(log4go.Logger)
    //for console
    //gLogger.AddFilter("stdout", log4go.INFO, log4go.NewConsoleLogWriter())
    gLogger = log4go.NewDefaultLogger(log4go.INFO)

    //for log file
    if _, err := os.Stat(logFilename); err == nil {
        //fmt.Printf("found old log file %s, now remove it\n", logFilename)
        os.Remove(logFilename)
    }
    //gLogger.AddFilter("logfile", log4go.FINEST, log4go.NewFileLogWriter(logFilename, true))
    //gLogger.AddFilter("logfile", log4go.FINEST, log4go.NewFileLogWriter(logFilename, false))
    gLogger.AddFilter("log", log4go.FINEST, log4go.NewFileLogWriter(logFilename, false))
    gLogger.Debug("Current time is : %s", time.Now().Format("15:04:05 MST 2006/01/02"))
    return
}
// GetCurFilename
// Get current file name, without suffix
func GetCurFilename() string {
    //runtime.Caller(0) reports the source file of this function
    _, fullFilename, _, _ := runtime.Caller(0)
    //fmt.Println(fullFilename)
    var filenameWithSuffix string
    filenameWithSuffix = path.Base(fullFilename)
    //fmt.Println("filenameWithSuffix=", filenameWithSuffix)
    var fileSuffix string
    fileSuffix = path.Ext(filenameWithSuffix)
    //fmt.Println("fileSuffix=", fileSuffix)
    var filenameOnly string
    filenameOnly = strings.TrimSuffix(filenameWithSuffix, fileSuffix)
    //fmt.Println("filenameOnly=", filenameOnly)
    return filenameOnly
}
//get url response html
//returns an empty string if the request cannot be created or sent
func getUrlRespHtml(strUrl string, postDict map[string]string) string {
    gLogger.Debug("in getUrlRespHtml, strUrl=%s", strUrl)
    gLogger.Debug("postDict=%s", postDict)

    var respHtml string = ""

    httpClient := &http.Client{
        //Transport:nil,
        //CheckRedirect: nil,
        Jar: gCurCookieJar,
    }

    var httpReq *http.Request
    var newReqErr error
    if nil == postDict {
        gLogger.Debug("is GET")
        httpReq, newReqErr = http.NewRequest("GET", strUrl, nil)
        // ...
        //httpReq.Header.Add("If-None-Match", `W/"wyzzy"`)
    } else {
        //【Record】Implementing an HTTP POST in Go and passing the corresponding post data
        //https://www.crifan.com/go_language_http_do_post_pass_post_data
        gLogger.Debug("is POST")
        postValues := url.Values{}
        for postKey, postValue := range postDict {
            postValues.Set(postKey, postValue)
        }
        gLogger.Debug("postValues=%s", postValues)
        postDataStr := postValues.Encode()
        gLogger.Debug("postDataStr=%s", postDataStr)
        postDataBytes := []byte(postDataStr)
        gLogger.Debug("postDataBytes=%s", postDataBytes)
        postBytesReader := bytes.NewReader(postDataBytes)
        httpReq, newReqErr = http.NewRequest("POST", strUrl, postBytesReader)
        if nil == newReqErr {
            //httpReq.Header.Set("Content-Type", "application/x-www-form-urlencoded; param=value")
            httpReq.Header.Add("Content-Type", "application/x-www-form-urlencoded")
        }
    }
    if newReqErr != nil {
        gLogger.Warn("create request for strUrl=%s got error=%s\n", strUrl, newReqErr.Error())
        return respHtml
    }

    httpResp, err := httpClient.Do(httpReq)
    // ...
    //httpResp, err := http.Get(strUrl)
    //gLogger.Info("http.Get done")
    if err != nil {
        //httpResp is nil on error, so return instead of dereferencing it below
        gLogger.Warn("http get strUrl=%s response error=%s\n", strUrl, err.Error())
        return respHtml
    }
    defer httpResp.Body.Close()
    gLogger.Debug("httpResp.Header=%s", httpResp.Header)
    gLogger.Debug("httpResp.Status=%s", httpResp.Status)

    body, errReadAll := ioutil.ReadAll(httpResp.Body)
    //gLogger.Info("ioutil.ReadAll done")
    if errReadAll != nil {
        gLogger.Warn("get response for strUrl=%s got error=%s\n", strUrl, errReadAll.Error())
    }
    //gLogger.Debug("body=%s\n", body)

    //gCurCookies = httpResp.Cookies()
    //gCurCookieJar = httpClient.Jar;
    gCurCookies = gCurCookieJar.Cookies(httpReq.URL)
    //gLogger.Info("httpResp.Cookies done")

    //respHtml = "just for test log ok or not"
    respHtml = string(body)
    //gLogger.Info("httpResp body []byte to string done")
    return respHtml
}
func dbgPrintCurCookies() {
    var cookieNum int = len(gCurCookies)
    gLogger.Debug("cookieNum=%d", cookieNum)
    for i := 0; i < cookieNum; i++ {
        var curCk *http.Cookie = gCurCookies[i]
        //gLogger.Debug("curCk.Raw=%s", curCk.Raw)
        gLogger.Debug("------ Cookie [%d]------", i)
        gLogger.Debug("Name\t\t=%s", curCk.Name)
        gLogger.Debug("Value\t=%s", curCk.Value)
        gLogger.Debug("Path\t\t=%s", curCk.Path)
        gLogger.Debug("Domain\t=%s", curCk.Domain)
        gLogger.Debug("Expires\t=%s", curCk.Expires)
        gLogger.Debug("RawExpires\t=%s", curCk.RawExpires)
        gLogger.Debug("MaxAge\t=%d", curCk.MaxAge)
        gLogger.Debug("Secure\t=%t", curCk.Secure)
        gLogger.Debug("HttpOnly\t=%t", curCk.HttpOnly)
        gLogger.Debug("Raw\t\t=%s", curCk.Raw)
        gLogger.Debug("Unparsed\t=%s", curCk.Unparsed)
    }
}
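func main() {
    initAll()

    gLogger.Info("============ Program description ============")
    gLogger.Info("Function: this program demonstrates emulating a Baidu login with Go code")
    gLogger.Info("Notes:")
    gLogger.Info("1. Some Baidu accounts show the following message when logging in:")
    gLogger.Info("The system has detected that your account may have been stolen and is at risk. Please change your password as soon as possible.")
    gLogger.Info("In that case this program cannot log in successfully; change the password as prompted and it will then work.")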
    //step1: access baidu url to get cookie BAIDUID
    gLogger.Info("====== Step 1: get the BAIDUID cookie ======")
    var baiduMainUrl string = "http://www.baidu.com/"
    gLogger.Debug("baiduMainUrl=%s", baiduMainUrl)
    respHtml := getUrlRespHtml(baiduMainUrl, nil)
    gLogger.Debug("respHtml=%s", respHtml)
    dbgPrintCurCookies()

    //check cookie
    var bGotCookieBaiduid = false
    //var cookieNameListToCheck []string = ["BAIDUID"]
    //toCheckCookieNameList := [1]string{"BAIDUID"}
    toCheckCookieNameList := []string{"BAIDUID"}
    toCheckCookieNum := len(toCheckCookieNameList)
    gLogger.Debug("toCheckCookieNum=%d", toCheckCookieNum)
    curCookieNum := len(gCurCookies)
    gLogger.Debug("curCookieNum=%d", curCookieNum)
    for i := 0; i < toCheckCookieNum; i++ {
        toCheckCkName := toCheckCookieNameList[i]
        gLogger.Debug("[%d]toCheckCkName=%s", i, toCheckCkName)
        for j := 0; j < curCookieNum; j++ {
            curCookie := gCurCookies[j]
            if strings.EqualFold(toCheckCkName, curCookie.Name) {
                bGotCookieBaiduid = true
                break
            }
        }
    }

    if bGotCookieBaiduid {
        gLogger.Info("Found cookie BAIDUID")
    } else {
        gLogger.Info("Not found cookie BAIDUID")
    }
    //step2: call getapi and extract login_token from the response
    gLogger.Info("====== Step 2: extract login_token ======")
    bExtractTokenValueOK := false
    strLoginToken := ""
    var getApiRespHtml string
    if bGotCookieBaiduid {
        //https://passport.baidu.com/v2/api/?getapi&class=login&tpl=mn&tangram=true
        var getapiUrl string = "https://passport.baidu.com/v2/api/?getapi&class=login&tpl=mn&tangram=true"
        getApiRespHtml = getUrlRespHtml(getapiUrl, nil)
        gLogger.Debug("getApiRespHtml=%s", getApiRespHtml)
        dbgPrintCurCookies()

        //bdPass.api.params.login_token='278623fc5463aa25b0189ddd34165592';
        //use regex to extract login_token
        //【Record】Using a regular expression to find a value in Go
        //https://www.crifan.com/go_language_regular_expression_find_value/
        loginTokenP, _ := regexp.Compile(`bdPass\.api\.params\.login_token='(?P<loginToken>\w+)';`)
        //loginToken := loginTokenP.FindString(getApiRespHtml);
        //loginToken := loginTokenP.FindSubmatch(getApiRespHtml);
        foundLoginToken := loginTokenP.FindStringSubmatch(getApiRespHtml)
        gLogger.Debug("foundLoginToken=%s", foundLoginToken)
        if nil != foundLoginToken {
            //index 1 is the loginToken group; the name could also be resolved via loginTokenP.SubexpNames()
            strLoginToken = foundLoginToken[1]
            gLogger.Info("found bdPass.api.params.login_token=%s", strLoginToken)
            bExtractTokenValueOK = true
        } else {
            gLogger.Warn(" not found login_token from html=%s", getApiRespHtml)
        }
    }
    //step3: login, then verify returned cookies
    bLoginBaiduOk := false
    if bGotCookieBaiduid && bExtractTokenValueOK {
        gLogger.Info("====== Step 3: log in to Baidu and check the returned cookies ======")
        staticPageUrl := "http://www.baidu.com/cache/user/html/jump.html"

        postDict := map[string]string{}
        //postDict["ppui_logintime"] = ""
        postDict["charset"] = "utf-8"
        //postDict["codestring"] = ""
        postDict["token"] = strLoginToken
        postDict["isPhone"] = "false"
        postDict["index"] = "0"
        //postDict["u"] = ""
        //postDict["safeflg"] = "0"
        postDict["staticpage"] = staticPageUrl
        postDict["loginType"] = "1"
        postDict["tpl"] = "mn"
        postDict["callback"] = "parent.bdPass.api.login._postCallback"

        //【Solved】Getting a string entered on the console in Go
        //https://www.crifan.com/go_language_get_console_input_string/
        strBaiduUsername := ""
        strBaiduPassword := ""
        gLogger.Info("Please input:")
        gLogger.Info("Baidu Username:")
        _, err1 := fmt.Scanln(&strBaiduUsername)
        if nil == err1 {
            gLogger.Debug("strBaiduUsername=%s", strBaiduUsername)
        }
        gLogger.Info("Baidu Password:")
        _, err2 := fmt.Scanln(&strBaiduPassword)
        if nil == err2 {
            gLogger.Debug("strBaiduPassword=%s", strBaiduPassword)
        }

        postDict["username"] = strBaiduUsername
        postDict["password"] = strBaiduPassword
        postDict["verifycode"] = ""
        postDict["mem_pass"] = "on"
        gLogger.Debug("postDict=%s", postDict)

        baiduMainLoginUrl := "https://passport.baidu.com/v2/api/?login"
        loginBaiduRespHtml := getUrlRespHtml(baiduMainLoginUrl, postDict)
        gLogger.Debug("loginBaiduRespHtml=%s", loginBaiduRespHtml)
        dbgPrintCurCookies()

        //check resp cookies exist or not
        cookieNameDict := map[string]bool{
            "BDUSS":  false,
            "PTOKEN": false,
            "STOKEN": false,
            //"SAVEUSERID": false, //be deleted
        }
        for cookieName := range cookieNameDict {
            for _, singleCookie := range gCurCookies {
                //if(strings.EqualFold(cookieName, singleCookie.Name)){
                if cookieName == singleCookie.Name {
                    cookieNameDict[cookieName] = true
                    gLogger.Debug("Found cookie %s", cookieName)
                }
            }
        }
        //%v, because the map values are bool rather than string
        gLogger.Debug("After check resp cookie, cookieNameDict=%v", cookieNameDict)

        bAllCookiesFound := true
        for _, bIsExist := range cookieNameDict {
            bAllCookiesFound = bAllCookiesFound && bIsExist
        }
        bLoginBaiduOk = bAllCookiesFound
        if bLoginBaiduOk {
            gLogger.Info("Successfully emulated login to the Baidu homepage!")
        } else {
            gLogger.Info("Emulated login to the Baidu homepage failed!")
            //pass the HTML as an argument so any % in it is not treated as a format verb
            gLogger.Info("The returned HTML source is: %s", loginBaiduRespHtml)
        }
    }
    deinitAll()

    //【workaround】log4go has a bug when outputting messages in Go: only part of the output appears, or even none at all
    //https://www.crifan.com/go_language_log4go_only_output_part_info/
    time.Sleep(100 * time.Millisecond)
}
The result is:
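【Summary】
Starting from nothing, and after a great deal of effort, I finally managed to emulate a Baidu login with the Go language.
I will keep optimizing this when I have time, including at least:
【Record】After successfully emulating Baidu login with Go, organizing the related functions into my own Go library: crifanLib.go
When reposting, please credit: 在路上 » 【Record】Emulating Baidu login with the Go language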