【Record】Emulating Baidu login in Go

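【Background】

I previously wrote a tutorial that analyzes the logic of emulating a Baidu login:

【Tutorial】Step by step: using a tool (the F12 developer tools in IE9) to analyze the internal logic of emulating login to a website (the Baidu home page)

I then implemented it in several languages.

Python:

【Tutorial】Emulating website login, Python version (with two complete, runnable versions of the code)

C#:

【Tutorial】Emulating website login, C# version (with two complete, runnable versions of the code)

Java:

【Tutorial】Emulating Baidu login, Java version

Now it is Go's turn.

I know next to nothing about Go, but I have learned that it has an HTTP library as well. So the plan is to start from scratch and,

step by step, implement the Baidu login emulation while learning the language itself.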

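【Process】

1. First, learn a bit about Go itself:

【Record】Downloading and installing Go

2. Then get the basics of development sorted out:

【Record】Basic Go development: writing Hello World and finding a suitable development environment and tools

3. I switched to another machine, also 64-bit Windows 7, downloaded and installed Go again, and retried the plain Hello World.

A few points worth mentioning:

(1) After the automatic installation, the corresponding path:

D:\tmp\dev_install_root\Go\bin

had already been added to PATH;

(2) The go/bin directory contains three tools: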

  • go.exe
  • godoc.exe
  • gofmt.exe

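4. Continue learning how to write Go code:

【Record】Learning how to write Go code

5. With the basics of writing Go code clear, the next step was to follow the official documentation and learn how to write HTTP code.

6. Go's naming conventions are described in:

Effective Go

7. Next, work on basic web page fetching:

【Record】Fetching a web page's HTML over HTTP with Go

8. Everything fetched above was printed to the cmd window, which is inconvenient for keeping records and reviewing them during later development.

So I wanted to log the content to a file:

【Solved】Writing output to a log file in Go

9. Then a file encoding problem showed up:

【Problem】Go code fails to run: # command-line-arguments .\EmulateLoginBaidu.go:86: illegal UTF-8 sequence

10. Then came the error "cannot use body (type []byte) as type string in assignment" (the final code below uses the explicit conversion string(body)):

【Solved】Assigning the body returned by http directly to a string in Go fails: cannot use body (type []byte) as type string in assignment

11. At this point, the program can

fetch the HTML of the Baidu home page and write it to the log file.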

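12. Next, figure out how to get the cookies returned over HTTP:

【Record】Handling HTTP cookies in Go

13. Next, the values I need have to be extracted from the returned HTML, so:

【Record】Finding a value with a regular expression in Go

14. Next, understand Go's dictionary type:

【Solved】The dictionary type in Go: map

15. Then figure out how to read input from the console:

【Solved】Reading a string from console input in Go

16. Then figure out how to send an HTTP POST:

【Record】Sending an HTTP POST with post data in Go

17. With POST working and the post data passed along, the emulated login succeeds and the corresponding cookies come back.

So the next step is to check each of those cookies.

The Set-Cookie values in httpResp.Header returned at this point are: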
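The extraction in step 13 can be sketched roughly as follows. The pattern is the one used in the final code; the helper name extractLoginToken is hypothetical, and looking the group up by name via SubexpIndex assumes Go 1.15 or later (the final code simply uses index 1). The sample input is the line quoted in the code comments further down.

```go
package main

import (
	"fmt"
	"regexp"
)

// loginTokenRe matches the JavaScript assignment that carries the token.
var loginTokenRe = regexp.MustCompile(`bdPass\.api\.params\.login_token='(?P<loginToken>\w+)';`)

// extractLoginToken returns the token and true when the pattern is found.
// The function name is made up for this sketch.
func extractLoginToken(html string) (string, bool) {
	m := loginTokenRe.FindStringSubmatch(html)
	if m == nil {
		return "", false
	}
	// SubexpIndex maps the group name to its position in the match slice.
	return m[loginTokenRe.SubexpIndex("loginToken")], true
}

func main() {
	sample := "bdPass.api.params.login_token='278623fc5463aa25b0189ddd34165592';"
	token, ok := extractLoginToken(sample)
	fmt.Println(token, ok)
}
```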
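A minimal sketch of step 16, assuming the form fields arrive as a map[string]string as in the final code. The helper names encodeForm and postForm are hypothetical, and a local httptest server stands in for the real login URL so nothing is sent to Baidu. io.ReadAll assumes Go 1.16 or later.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strings"
)

// encodeForm turns a plain map into a URL-encoded form body.
// url.Values.Encode sorts the keys, so the output is deterministic.
func encodeForm(postDict map[string]string) string {
	values := url.Values{}
	for k, v := range postDict {
		values.Set(k, v)
	}
	return values.Encode()
}

// postForm sends postDict as an application/x-www-form-urlencoded POST
// and returns the response body as a string.
func postForm(client *http.Client, target string, postDict map[string]string) (string, error) {
	req, err := http.NewRequest("POST", target, strings.NewReader(encodeForm(postDict)))
	if err != nil {
		return "", err
	}
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

func main() {
	// Local stand-in server that echoes one of the posted fields.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "tpl=%s", r.PostFormValue("tpl"))
	}))
	defer srv.Close()

	reply, err := postForm(srv.Client(), srv.URL, map[string]string{"tpl": "mn", "charset": "utf-8"})
	if err != nil {
		fmt.Println("post failed:", err)
		return
	}
	fmt.Println(reply)
}
```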

(after formatting)

BDUSS=G1LNG5uLTNYWkU2bzA2SGxCZHZ2Rm5ocnN-MEhFem5uQkZrdkJFVmplUmpBV1ZTQVFBQUFBJCQAAAAAAAAAAAEAAAB-OUgCYWdhaW5pbnB1dAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGN0PVJjdD1SM; expires=Wed, 08-Dec-2021 10:26:43 GMT; path=/; domain=baidu.com; httponly
PTOKEN=deleted; expires=Fri, 21-Sep-2012 10:26:42 GMT; path=/; domain=baidu.com; httponly
PTOKEN=0f1e0187b042630a47c4eea8e0e96a2f; expires=Wed, 08-Dec-2021 10:26:43 GMT; path=/; domain=passport.baidu.com; httponly
STOKEN=8d6ce0cbc7f689a8cd647b8beb5872e3; expires=Wed, 08-Dec-2021 10:26:43 GMT; path=/; domain=passport.baidu.com; httponly
SAVEUSERID=deleted; expires=Fri, 21-Sep-2012 10:26:42 GMT; path=/; domain=passport.baidu.com; httponly
USERNAMETYPE=1; expires=Wed, 08-Dec-2021 10:26:43 GMT; path=/; domain=passport.baidu.com; httponly

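As you can see:

(1) PTOKEN, for

domain=baidu.com

is deleted,

but it still exists for passport.baidu.com;

(2) the other cookies,

STOKEN, SAVEUSERID and USERNAMETYPE, all have the domain

passport.baidu.com

rather than

baidu.com

as I had assumed (SAVEUSERID itself is being deleted here: its value is "deleted" and its expiry date is in the past);

(3) BDUSS really does have the domain baidu.com.

Given that, I expected the earlier code:

gCurCookies = gCurCookieJar.Cookies(httpReq.URL);

to return only

BDUSS

or only

STOKEN, SAVEUSERID, USERNAMETYPE

Fortunately, the cookies printed by

dbgPrintCurCookies

are all present: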

[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:199) cookieNum=7
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:203) ------ Cookie [0]------
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:204) Name		=H_PS_PSSID
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:205) Value	=3359_1455_2976_2981_3090
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:206) Path		=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:207) Domain	=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:208) Expires	=0001-01-01 00:00:00 +0000 UTC
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:209) RawExpires	=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:210) MaxAge	=0
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:211) Secure	=false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:212) HttpOnly	=false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:213) Raw		=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:214) Unparsed	=[]
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:203) ------ Cookie [1]------
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:204) Name		=BAIDUID
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:205) Value	=74F5614706B58BFCCCB3923C8ABD3E61:FG=1
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:206) Path		=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:207) Domain	=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:208) Expires	=0001-01-01 00:00:00 +0000 UTC
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:209) RawExpires	=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:210) MaxAge	=0
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:211) Secure	=false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:212) HttpOnly	=false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:213) Raw		=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:214) Unparsed	=[]
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:203) ------ Cookie [2]------
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:204) Name		=HOSUPPORT
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:205) Value	=1
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:206) Path		=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:207) Domain	=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:208) Expires	=0001-01-01 00:00:00 +0000 UTC
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:209) RawExpires	=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:210) MaxAge	=0
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:211) Secure	=false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:212) HttpOnly	=false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:213) Raw		=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:214) Unparsed	=[]
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:203) ------ Cookie [3]------
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:204) Name		=BDUSS
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:205) Value	=W95bX41ZTlhNkFKQkpQcGd5Y1ZUOENiYzJ2TkpvakJaZVBXSS10WXh1THVCbVZTQVFBQUFBJCQAAAAAAAAAAAEAAAB-OUgCYWdhaW5pbnB1dAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAO55PVLueT1SO
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:206) Path		=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:207) Domain	=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:208) Expires	=0001-01-01 00:00:00 +0000 UTC
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:209) RawExpires	=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:210) MaxAge	=0
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:211) Secure	=false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:212) HttpOnly	=false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:213) Raw		=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:214) Unparsed	=[]
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:203) ------ Cookie [4]------
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:204) Name		=PTOKEN
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:205) Value	=2e67f3d7d5c52118bf4d222ab87ac9a4
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:206) Path		=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:207) Domain	=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:208) Expires	=0001-01-01 00:00:00 +0000 UTC
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:209) RawExpires	=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:210) MaxAge	=0
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:211) Secure	=false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:212) HttpOnly	=false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:213) Raw		=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:214) Unparsed	=[]
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:203) ------ Cookie [5]------
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:204) Name		=STOKEN
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:205) Value	=63a3b62efbd83a00c095c624ca4dfdfc
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:206) Path		=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:207) Domain	=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:208) Expires	=0001-01-01 00:00:00 +0000 UTC
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:209) RawExpires	=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:210) MaxAge	=0
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:211) Secure	=false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:212) HttpOnly	=false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:213) Raw		=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:214) Unparsed	=[]
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:203) ------ Cookie [6]------
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:204) Name		=USERNAMETYPE
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:205) Value	=1
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:206) Path		=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:207) Domain	=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:208) Expires	=0001-01-01 00:00:00 +0000 UTC
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:209) RawExpires	=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:210) MaxAge	=0
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:211) Secure	=false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:212) HttpOnly	=false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:213) Raw		=
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:214) Unparsed	=[]

So from here on, the code can simply check whether each cookie exists by its name.

18. In the end, the emulated Baidu login succeeded.

The code used is:
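The behavior observed above can be reproduced offline with a small sketch. It assumes a jar created with cookiejar.New(nil), as in this post's code, and uses placeholder values ("demo") rather than real cookies. The helper names newDemoJar and namesFor are hypothetical. As far as I can tell, a cookie set with domain=baidu.com also matches the host passport.baidu.com, which is why a request to the login URL sees both groups, and the jar returns cookies carrying only Name and Value, which would explain the empty Path and Domain fields in the log above.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/cookiejar"
	"net/url"
	"sort"
	"strings"
	"time"
)

// newDemoJar stores cookies shaped like the Set-Cookie headers above,
// as if they had been returned by the login URL. Values are placeholders.
func newDemoJar() (http.CookieJar, error) {
	jar, err := cookiejar.New(nil)
	if err != nil {
		return nil, err
	}
	loginURL, err := url.Parse("https://passport.baidu.com/v2/api/?login")
	if err != nil {
		return nil, err
	}
	future := time.Now().Add(24 * time.Hour)
	past := time.Now().Add(-24 * time.Hour)
	jar.SetCookies(loginURL, []*http.Cookie{
		{Name: "BDUSS", Value: "demo", Path: "/", Domain: "baidu.com", Expires: future},
		// An expiry in the past asks the jar to delete this cookie.
		{Name: "PTOKEN", Value: "deleted", Path: "/", Domain: "baidu.com", Expires: past},
		{Name: "PTOKEN", Value: "demo", Path: "/", Domain: "passport.baidu.com", Expires: future},
		{Name: "STOKEN", Value: "demo", Path: "/", Domain: "passport.baidu.com", Expires: future},
	})
	return jar, nil
}

// namesFor lists, sorted and comma-joined, the cookie names the jar
// would send with a request to rawURL.
func namesFor(jar http.CookieJar, rawURL string) string {
	u, err := url.Parse(rawURL)
	if err != nil {
		return ""
	}
	var names []string
	for _, c := range jar.Cookies(u) {
		names = append(names, c.Name)
	}
	sort.Strings(names)
	return strings.Join(names, ",")
}

func main() {
	jar, err := newDemoJar()
	if err != nil {
		fmt.Println("jar setup failed:", err)
		return
	}
	fmt.Println("passport.baidu.com:", namesFor(jar, "https://passport.baidu.com/v2/api/?login"))
	fmt.Println("www.baidu.com:     ", namesFor(jar, "http://www.baidu.com/"))
}
```

The second line of output is the case to keep in mind: a request to www.baidu.com would not see the cookies scoped to passport.baidu.com, so checking by name only works because the check runs against the login URL.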

/*
 * [File]
 * EmulateLoginBaidu.go
 * 
 * [Function]
 * 【Record】Emulating Baidu login in Go
 * http://www.crifan.com/emulate_login_baidu_using_go_language/
 * 
 * [Version]
 * 2013-09-21
 *
 * [Contact]
 * http://www.crifan.com/about/me/
 */
package main

import (
    "fmt"
    //"builtin"
    //"log"
    "os"
    "runtime"
    "path"
    "strings"
    "time"
    //"io"
    "io/ioutil"
    "net/http"
    "net/http/cookiejar"
    "net/url"
    //"sync"
    //"net/url"
    "regexp"
    //"bufio"
    "bytes"
)

//import l4g "log4go.googlecode.com/hg"
//import l4g "code.google.com/p/log4go"
import "code.google.com/p/log4go"

/***************************************************************************************************
    Global Variables
***************************************************************************************************/
var gCurCookies []*http.Cookie;
var gCurCookieJar *cookiejar.Jar;
var gLogger log4go.Logger;

/***************************************************************************************************
    Functions
***************************************************************************************************/
//do init before all others
func initAll(){
    gCurCookies = nil
    //var err error;
    gCurCookieJar,_ = cookiejar.New(nil)
    gLogger = nil
    
    initLogger()
    initCrifanLib()
}

//de-init for all
func deinitAll(){
    gCurCookies = nil
    if gLogger != nil {
        gLogger.Close();
        //os.Stdout.Sync() //try manually flush, but can not fix log4go's flush bug
        
        gLogger = nil
    }
}

//do some init for crifanLib
func initCrifanLib(){
    gLogger.Debug("init for crifanLib")
    gCurCookies = nil
    return
}

//init for logger
func initLogger(){
    var filenameOnly string = GetCurFilename()
    var logFilename string =  filenameOnly + ".log";
    
    //gLogger = log4go.NewLogger()
    //gLogger = make(log4go.Logger)
    
    //for console
    //gLogger.AddFilter("stdout", log4go.INFO, log4go.NewConsoleLogWriter())
    gLogger = log4go.NewDefaultLogger(log4go.INFO)
    
    //for log file
    if _, err := os.Stat(logFilename); err == nil {
        //fmt.Printf("found old log file %s, now remove it\n", logFilename)
        os.Remove(logFilename)
    }
    //gLogger.AddFilter("logfile", log4go.FINEST, log4go.NewFileLogWriter(logFilename, true))
    //gLogger.AddFilter("logfile", log4go.FINEST, log4go.NewFileLogWriter(logFilename, false))
    gLogger.AddFilter("log", log4go.FINEST, log4go.NewFileLogWriter(logFilename, false))
    gLogger.Debug("Current time is : %s", time.Now().Format("15:04:05 MST 2006/01/02"))
    
    return
}

// GetCurFilename
// Get current file name, without suffix
func GetCurFilename() string {
    _, fulleFilename, _, _ := runtime.Caller(0)
    //fmt.Println(fulleFilename)
    var filenameWithSuffix string
    filenameWithSuffix = path.Base(fulleFilename)
    //fmt.Println("filenameWithSuffix=", filenameWithSuffix)
    var fileSuffix string
    fileSuffix = path.Ext(filenameWithSuffix)
    //fmt.Println("fileSuffix=", fileSuffix)
    
    var filenameOnly string
    filenameOnly = strings.TrimSuffix(filenameWithSuffix, fileSuffix)
    //fmt.Println("filenameOnly=", filenameOnly)
    
    return filenameOnly
}

//get url response html
func getUrlRespHtml(strUrl string, postDict map[string]string) string{
    gLogger.Debug("in getUrlRespHtml, strUrl=%s", strUrl)
    gLogger.Debug("postDict=%s", postDict)
    
    var respHtml string = "";
    
    httpClient := &http.Client{
        //Transport:nil,
        //CheckRedirect: nil,
        Jar:gCurCookieJar,
    }

    var httpReq *http.Request
    //var newReqErr error
    if nil == postDict {
        gLogger.Debug("is GET")
        //httpReq, newReqErr = http.NewRequest("GET", strUrl, nil)
        httpReq, _ = http.NewRequest("GET", strUrl, nil)
        // ...
        //httpReq.Header.Add("If-None-Match", `W/"wyzzy"`)
    } else {
        //【Record】Sending an HTTP POST with post data in Go
        //http://www.crifan.com/go_language_http_do_post_pass_post_data
        gLogger.Debug("is POST")
        postValues := url.Values{}
        for postKey, PostValue := range postDict{
            postValues.Set(postKey, PostValue)
        }
        gLogger.Debug("postValues=%s", postValues)
        postDataStr := postValues.Encode()
        gLogger.Debug("postDataStr=%s", postDataStr)
        postDataBytes := []byte(postDataStr)
        gLogger.Debug("postDataBytes=%s", postDataBytes)
        postBytesReader := bytes.NewReader(postDataBytes)
        //httpReq, newReqErr = http.NewRequest("POST", strUrl, postBytesReader)
        httpReq, _ = http.NewRequest("POST", strUrl, postBytesReader)
        //httpReq.Header.Set("Content-Type", "application/x-www-form-urlencoded; param=value")
        httpReq.Header.Add("Content-Type", "application/x-www-form-urlencoded")
    }
    
    httpResp, err := httpClient.Do(httpReq)
    // ...
    
    //httpResp, err := http.Get(strUrl)
    //gLogger.Info("http.Get done")
    if err != nil {
        gLogger.Warn("http get strUrl=%s response error=%s\n", strUrl, err.Error())
        //httpResp can be nil when the request fails, so return before using it
        return respHtml
    }
    gLogger.Debug("httpResp.Header=%s", httpResp.Header)
    gLogger.Debug("httpResp.Status=%s", httpResp.Status)

    defer httpResp.Body.Close()
    // gLogger.Info("defer httpResp.Body.Close done")
    
    body, errReadAll := ioutil.ReadAll(httpResp.Body)
    //gLogger.Info("ioutil.ReadAll done")
    if errReadAll != nil {
        gLogger.Warn("get response for strUrl=%s got error=%s\n", strUrl, errReadAll.Error())
    }
    //gLogger.Debug("body=%s\n", body)

    //gCurCookies = httpResp.Cookies()
    //gCurCookieJar = httpClient.Jar;
    gCurCookies = gCurCookieJar.Cookies(httpReq.URL);
    //gLogger.Info("httpResp.Cookies done")
    
    //respHtml = "just for test log ok or not"
    respHtml = string(body)
    //gLogger.Info("httpResp body []byte to string done")

    return respHtml
}

func dbgPrintCurCookies() {
    var cookieNum int = len(gCurCookies);
    gLogger.Debug("cookieNum=%d", cookieNum)
    for i := 0; i < cookieNum; i++ {
        var curCk *http.Cookie = gCurCookies[i];
        //gLogger.Debug("curCk.Raw=%s", curCk.Raw)
        gLogger.Debug("------ Cookie [%d]------", i)
        gLogger.Debug("Name\t\t=%s", curCk.Name)
        gLogger.Debug("Value\t=%s", curCk.Value)
        gLogger.Debug("Path\t\t=%s", curCk.Path)
        gLogger.Debug("Domain\t=%s", curCk.Domain)
        gLogger.Debug("Expires\t=%s", curCk.Expires)
        gLogger.Debug("RawExpires\t=%s", curCk.RawExpires)
        gLogger.Debug("MaxAge\t=%d", curCk.MaxAge)
        gLogger.Debug("Secure\t=%t", curCk.Secure)
        gLogger.Debug("HttpOnly\t=%t", curCk.HttpOnly)
        gLogger.Debug("Raw\t\t=%s", curCk.Raw)
        gLogger.Debug("Unparsed\t=%s", curCk.Unparsed)
    }
}

func main() {
    initAll()

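    gLogger.Info("============ About this program ============");
    gLogger.Info("Function: this program demonstrates emulating a Baidu login in Go");
    gLogger.Info("Notes:");
    gLogger.Info("1. Some Baidu accounts show the following message on login:");
    gLogger.Info("The system has detected that your account may have been stolen and is at risk. Please change your password as soon as possible.");
    gLogger.Info("In that case this program cannot log in; change the password as prompted and try again.");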

    //step1: access baidu url to get cookie BAIDUID
    gLogger.Info("====== Step 1: get the BAIDUID cookie ======")
    var baiduMainUrl string = "http://www.baidu.com/";
    gLogger.Debug("baiduMainUrl=%s", baiduMainUrl)
    respHtml := getUrlRespHtml(baiduMainUrl, nil)
    gLogger.Debug("respHtml=%s", respHtml)
    dbgPrintCurCookies()
    
    //check cookie
    var bGotCookieBaiduid = false;
    //var cookieNameListToCheck []string = ["BAIDUID"]
    //toCheckCookieNameList := [1]string{"BAIDUID"}
    toCheckCookieNameList := []string{"BAIDUID"}
    toCheckCookieNum := len(toCheckCookieNameList)
    gLogger.Debug("toCheckCookieNum=%d", toCheckCookieNum)
    curCookieNum := len(gCurCookies)
    gLogger.Debug("curCookieNum=%d", curCookieNum)
    for i := 0; i < toCheckCookieNum; i++ {
        toCheckCkName := toCheckCookieNameList[i];
        gLogger.Debug("[%d]toCheckCkName=%s", i, toCheckCkName)
        for j := 0; j < curCookieNum; j++{
            curCookie := gCurCookies[j]
            if(strings.EqualFold(toCheckCkName, curCookie.Name)){
                bGotCookieBaiduid = true;
                break;
            }
        }
    }

    if bGotCookieBaiduid {
        gLogger.Info("Found cookie BAIDUID");
    }else{
        gLogger.Info("Not found cookie BAIDUID");
    }
    
    //step2: get login_token from the getapi response
    gLogger.Info("====== Step 2: extract login_token ======");
    bExtractTokenValueOK := false
    strLoginToken := ""
    var getApiRespHtml string;
    if bGotCookieBaiduid{
        //https://passport.baidu.com/v2/api/?getapi&class=login&tpl=mn&tangram=true
        var getapiUrl string = "https://passport.baidu.com/v2/api/?getapi&class=login&tpl=mn&tangram=true";
        getApiRespHtml = getUrlRespHtml(getapiUrl, nil);
        gLogger.Debug("getApiRespHtml=%s", getApiRespHtml);
        dbgPrintCurCookies()
        
        //bdPass.api.params.login_token='278623fc5463aa25b0189ddd34165592';
        //use regex to extract login_token
        //【Record】Finding a value with a regular expression in Go
        //http://www.crifan.com/go_language_regular_expression_find_value/
        loginTokenP, _ := regexp.Compile(`bdPass\.api\.params\.login_token='(?P<loginToken>\w+)';`)
        //loginToken := loginTokenP.FindString(getApiRespHtml);
        //loginToken := loginTokenP.FindSubmatch(getApiRespHtml);
        foundLoginToken := loginTokenP.FindStringSubmatch(getApiRespHtml);
        gLogger.Debug("foundLoginToken=%s", foundLoginToken);
        if nil != foundLoginToken {
            strLoginToken = foundLoginToken[1] //FindStringSubmatch returns groups by position; index 1 is the loginToken group
            gLogger.Info("found bdPass.api.params.login_token=%s", strLoginToken);
            bExtractTokenValueOK = true;
        } else {
            gLogger.Warn(" not found login_token from html=%s", getApiRespHtml);
        }
    }

    //step3: login baidu, then verify returned cookies
    bLoginBaiduOk := false;
    if bGotCookieBaiduid && bExtractTokenValueOK {
        gLogger.Info("====== Step 3: log in to Baidu and verify the returned cookies ======");
        staticPageUrl := "http://www.baidu.com/cache/user/html/jump.html";
        
        postDict := map[string]string{}
        //postDict["ppui_logintime"] = ""
        postDict["charset"] = "utf-8"
        //postDict["codestring"] = ""
        postDict["token"] = strLoginToken
        postDict["isPhone"] = "false"
        postDict["index"] = "0"
        //postDict["u"] = ""
        //postDict["safeflg"] = "0"
        postDict["staticpage"] = staticPageUrl
        postDict["loginType"] = "1"
        postDict["tpl"] = "mn"
        postDict["callback"] = "parent.bdPass.api.login._postCallback"

        //【Solved】Reading a string from console input in Go
        //http://www.crifan.com/go_language_get_console_input_string/
        strBaiduUsername := ""
        strBaiduPassword := ""
        gLogger.Info("Please input:")
        gLogger.Info("Baidu Username:")
        _, err1 := fmt.Scanln(&strBaiduUsername)
        if nil == err1 {
            gLogger.Debug("strBaiduUsername=%s", strBaiduUsername)
        }
        gLogger.Info("Baidu Password:")
        _, err2 := fmt.Scanln(&strBaiduPassword)
        if nil == err2 {
            gLogger.Debug("strBaiduPassword=%s", strBaiduPassword)
        }
        
        postDict["username"] = strBaiduUsername
        postDict["password"] = strBaiduPassword
        postDict["verifycode"] = ""
        postDict["mem_pass"] = "on"
        
        gLogger.Debug("postDict=%s", postDict)
        
        baiduMainLoginUrl := "https://passport.baidu.com/v2/api/?login";
        loginBaiduRespHtml := getUrlRespHtml(baiduMainLoginUrl, postDict);
        gLogger.Debug("loginBaiduRespHtml=%s", loginBaiduRespHtml)
        dbgPrintCurCookies();
        
        //check resp cookies exist or not
        cookieNameDict := map[string]bool{
            "BDUSS"     : false,
            "PTOKEN"    : false,
            "STOKEN"    : false,
            //"SAVEUSERID": false, //be deleted
        }
        
        for cookieName, _ := range cookieNameDict {
            for _, singleCookie := range gCurCookies {
                //if(strings.EqualFold(cookieName, singleCookie.Name)){
                if cookieName == singleCookie.Name {
                    cookieNameDict[cookieName] = true;
                    gLogger.Debug("Found cookie %s", cookieName)
                }
            }
        }
        gLogger.Debug("After check resp cookie, cookieNameDict=%s", cookieNameDict)
        
        bAllCookiesFound := true
        for _, bIsExist := range cookieNameDict {
            bAllCookiesFound = bAllCookiesFound && bIsExist
        }
        bLoginBaiduOk = bAllCookiesFound
        if (bLoginBaiduOk) {
            gLogger.Info("Successfully emulated login to the Baidu home page!" );
        } else{
            gLogger.Info("Failed to emulate login to the Baidu home page!");
            gLogger.Info("Returned HTML source: " + loginBaiduRespHtml);
        }
    }
    
    deinitAll()

    //【workaround】log4go bug in Go: only part of the output, or none at all, gets written
    //http://www.crifan.com/go_language_log4go_only_output_part_info/
    time.Sleep(100 * time.Millisecond)
}

The result:

[Screenshot: emulate baidu login ok in the end]

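【Summary】

Starting from nothing and after a lot of trial and error, I finally emulated a Baidu login in Go.

Later, when time permits, I will keep improving it, including at least:

【Record】After successfully emulating Baidu login in Go, moving the related functions into my own Go library: crifanLib.go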



