
【Solved】Getting the cookies returned by a URL in PHP, and sending cookies

PHP crifan 7187 views 0 comments

【Problem】

I can already use PHP code to fetch the HTML returned by a given URL:

//http://cn2.php.net/curl_setopt
function getUrlRespHtml($url)
{
    printAutoNewline("now to get response from url=".$url);
    
    //get the file (e.g. image) and output it to the browser
    $ch = curl_init(); //open curl handle
    curl_setopt($ch, CURLOPT_URL, $url); //set an url
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); //do not output directly, use variable
    curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1); //do a binary transfer
    curl_setopt($ch, CURLOPT_FAILONERROR, 1); //stop if an error occurred
    $response = curl_exec($ch); //store the content in variable
    
    if(!curl_errno($ch))
    {
        //send out headers and output
        header("Content-type: ".curl_getinfo($ch, CURLINFO_CONTENT_TYPE)."");
        header("Content-Length: ".curl_getinfo($ch, CURLINFO_CONTENT_LENGTH_DOWNLOAD)."");

        //printAutoNewline($response);
    }
    else{
        printAutoNewline('Curl error: ' . curl_error($ch));
    }
    curl_close($ch); //close curl handle

    return $response;
}

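Now I want to get the cookies returned by that HTTP request.

【Solution process】

1. First, read the official documentation:

http://cn2.php.net/curl_setopt

It lists several cookie-related options:

Options for value: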

CURLOPT_COOKIESESSION

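When enabled, cURL treats this as a new cookie "session": it ignores any session cookies loaded from a previous session. By default, cURL loads and stores all cookies, whether or not they are session cookies. Session cookies are cookies without an expiry date; they are meant to exist for the current session only.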

 

For the following options, value should be a string:

CURLOPT_COOKIE

The contents of the "Cookie: " header to send in the HTTP request. Multiple cookies are separated by a semicolon followed by a space (for example, "fruit=apple; colour=red").

CURLOPT_COOKIEFILE

The name of the file containing cookie data. The cookie file can be in Netscape format, or just plain HTTP headers dumped into a file.

CURLOPT_COOKIEJAR

The name of the file in which cookies are saved when the connection is closed.

I also came across some other notes:
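As a small illustration of the CURLOPT_COOKIE format described above, here is a minimal sketch that builds the "name=value; name=value" string from an array. The helper name buildCookieHeader is hypothetical (it is not part of cURL), and the sketch assumes the names and values are already safe to put in a cookie header.

```php
<?php
// Hypothetical helper (not part of cURL): join name => value pairs into the
// string format that CURLOPT_COOKIE expects, e.g. "fruit=apple; colour=red".
// Assumption: names and values contain no ';' or other unsafe characters.
function buildCookieHeader(array $cookies)
{
    $pairs = array();
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    // cookies are separated by a semicolon followed by one space
    return implode('; ', $pairs);
}

$cookieHeader = buildCookieHeader(array('fruit' => 'apple', 'colour' => 'red'));
echo $cookieHeader, PHP_EOL;
```

The resulting string could then be passed as curl_setopt($ch, CURLOPT_COOKIE, $cookieHeader).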

Remember:

- 'Server-side' cookies exist as information even before they are set on the browser agent (HTTP COOKIE HEADER),

- JavaScript cookies do NOT exist as information before they are set on the browser agent,

so, if you're trying to save cookies to a local file using CURLOPT_COOKIEJAR, the cookie must be a server-side cookie; otherwise you are wasting your time. JavaScript-produced cookies only exist once the client browser's JS interpreter sets them.

About CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE, and which / how to use.

- CURLOPT_COOKIEJAR is used when cURL is writing the cookie data to disk.

- CURLOPT_COOKIEFILE is used when cURL is reading cookie data from disk.

So you need to specify both (and set the same file location on both) when working with sessions for example.
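To make the "specify both" advice concrete, here is a minimal sketch of a handle that reads cookies from a file (CURLOPT_COOKIEFILE) and writes them back to the same file (CURLOPT_COOKIEJAR). The function name newSessionHandle and the $cookiePath parameter are assumptions made for illustration only.

```php
<?php
// Hypothetical helper: create a cURL handle that both loads cookies from and
// saves cookies to the same file. $cookiePath is assumed to be a writable path.
function newSessionHandle($url, $cookiePath)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);    // return the body as a string
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookiePath); // cookies are read from here
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookiePath);  // cookies are written here when the handle is released
    return $ch;
}
```

The caller would then run curl_exec() on the returned handle and release it afterwards, since the jar file is only written at that point.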

Some people have also posted reference code:

This function helps to parse netscape cookie file, generated by cURL into cookie array:

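<?php
// Parse a Netscape-format cookie file written by cURL into a name => value array.
function _curl_parse_cookiefile($file) {
    $aCookies = array();
    // drop line endings and empty lines so values carry no trailing newline
    $aLines = file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    foreach($aLines as $line){
        // skip comment lines (note: this also skips "#HttpOnly_" entries)
        if('#'==$line[0])
            continue;
        // fields are tab-separated; index 5 is the name, index 6 the value
        $arr = explode("\t", $line);
        if(isset($arr[5]) && isset($arr[6]))
            $aCookies[$arr[5]] = $arr[6];
    }
    return $aCookies;
}
?>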

Some sites may protect themselves from remote logins by checking which site you came from.

In that case you might want to use CURLOPT_REFERER.

<?php
// $url = page to POST data
// $ref_url = tell the server which page you came from (spoofing)
// $data = POST body, e.g. "username=foo&password=bar"
// $login = 'true' will make a clean cookie-file.
// $proxy = proxy data
// $proxystatus = do you use a proxy ? 'true'/'false'
function curl_grab_page($url,$ref_url,$data,$login,$proxy,$proxystatus){
    if($login == 'true') {
        // truncate the cookie file so the login starts clean
        $fp = fopen("cookie.txt", "w");
        fclose($fp);
    }
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
    curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
    curl_setopt($ch, CURLOPT_TIMEOUT, 40);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
    if ($proxystatus == 'true') {
        curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, TRUE);
        curl_setopt($ch, CURLOPT_PROXY, $proxy);
    }
    // note: these two lines disable SSL certificate checks
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_REFERER, $ref_url);
    curl_setopt($ch, CURLOPT_HEADER, TRUE);
    // use the caller's user agent; fall back to a fixed one when there is none (e.g. CLI)
    curl_setopt($ch, CURLOPT_USERAGENT, isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)");
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
    curl_setopt($ch, CURLOPT_POST, TRUE);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
    $result = curl_exec($ch); // execute the curl command
    // release the handle before returning (the original returned first, so this was never reached)
    curl_close($ch);
    return $result;
}

echo curl_grab_page("https://www.example.net/login.php", "https://www.example.net/", "username=foo&password=bar", "true",  "null", "false");
?>

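From here on it was a matter of experimenting and writing code myself.

2. Since I want the cookies that come back in the response, CURLOPT_COOKIEJAR looks like the right option.

But after adding this code: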

    printAutoNewline("now add CURLOPT_COOKIEJAR support");
    curl_setopt($ch, CURLOPT_COOKIEJAR, "local_cookie.txt"); 

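no cookie file was generated when the code ran.

3. I manually created an empty local_cookie.txt first, then ran the code again.

It still made no difference.

4. Then I changed the code to: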

    curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie_jar.txt");
    curl_setopt($ch, CURLOPT_COOKIEFILE,"cookie_file.txt");

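Still no luck: neither cookie file was generated.

5. My own understanding of cookies agrees with the following explanation:

To collect cookies received with a request, set CURLOPT_COOKIEJAR "cookieFileName".  Then use CURLOPT_COOKIEFILE "cookieFileName" to recall them in subsequent transactions.

That is: use CURLOPT_COOKIEJAR to capture the returned cookies, then pass CURLOPT_COOKIEFILE when sending subsequent requests.

Yet here CURLOPT_COOKIEJAR simply was not capturing the returned cookies, which was very strange.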
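One way to narrow down a problem like this is to check where the jar path actually points and whether that directory is writable. The following is only a rough debugging sketch; the function name cookieJarDiagnostics is hypothetical, and it does not claim to identify the cause in this particular case.

```php
<?php
// Hypothetical debugging helper: report where a cookie jar path points and
// whether its directory exists and can be written to by the PHP process.
function cookieJarDiagnostics($path)
{
    $dir = dirname($path); // '.' for a bare file name such as 'cookie_jar.txt'
    return array(
        'cwd'          => getcwd(),        // relative paths are resolved against this
        'dir'          => $dir,
        'dir_exists'   => is_dir($dir),
        'dir_writable' => is_writable($dir),
    );
}

print_r(cookieJarDiagnostics('cookie_jar.txt'));
```

Under a web server, the 'cwd' entry is worth a look, because it is not necessarily the directory that contains the script.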

6. Reference:

PHP cURL cookies not saving on Windows

Following it, I prefixed the file name with the script's directory, changing the code to:

    printAutoNewline("now add CURLOPT_COOKIEJAR support");
    curl_setopt($ch, CURLOPT_COOKIEJAR, dirname(__FILE__)."cookie_jar.txt");
    //curl_setopt($ch, CURLOPT_COOKIEFILE,dirname(__FILE__)."cookie_file.txt");

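The result was still the same: it did not work.

7. I created an empty cookie_jar.txt and ran the code again; still no luck.

8. I added a print statement to inspect the path: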

    printAutoNewline("now add CURLOPT_COOKIEJAR support");
    $cookieJarFullname = dirname(__FILE__)."cookie_jar.txt";
    printAutoNewline("cookieJarFullname=".$cookieJarFullname);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieJarFullname);
    //curl_setopt($ch, CURLOPT_COOKIEFILE,dirname(__FILE__)."cookie_file.txt");

The output was:

cookieJarFullname=D:\tmp\WordPress\DevRoot\httpd-2.2.19-win64\httpd-2.2-x64\htdocs\php_test\35934503_datacookie_jar.txt

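Clearly, just as I had guessed, the final backslash (the path separator) between the directory and the file name was missing.

Hard-coding the backslash would be poor practice, though; it is better to find a proper way to join paths.

For the detailed process, see:

【Solved】How to concatenate paths in PHP (merge two paths), and combine a directory path with a filename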
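The missing separator comes from the fact that dirname(__FILE__) returns the directory without a trailing separator. As a minimal sketch of a separator-safe join (the helper name joinPath is hypothetical, and it assumes neither '/' nor '\' is a meaningful leading or trailing character in the parts being joined):

```php
<?php
// Hypothetical helper: join a directory and a file name with exactly one
// separator between them, whether or not $dir already ends with one.
// Assumption: stripping both '/' and '\' at the join point is acceptable.
function joinPath($dir, $filename)
{
    return rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . ltrim($filename, '/\\');
}

$cookieJarFullname = joinPath(dirname(__FILE__), 'cookie_jar.txt');
echo $cookieJarFullname, PHP_EOL;
```

Using DIRECTORY_SEPARATOR rather than a literal backslash keeps the same code usable on both Windows and Unix-like systems.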

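9. Finally, once the cookie file's full filename (including the path) was set correctly, the cookies were saved to the file as expected.

The cookie file's content is in Netscape format: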

# Netscape HTTP Cookie File

# http://curl.haxx.se/rfc/cookie_spec.html

# This file was generated by libcurl! Edit at your own risk.

www.yell.com    FALSE    /ucs    FALSE    0    JSESSIONID    3861C648DD4F626447319030008D17B1

#HttpOnly_www.yell.com    FALSE    /    FALSE    1355387093    NSC_mcw_xxx-c.zfmmhspvq.dpn_80    ffffffffc3a0420b45525d5f4f58455e445a4a421517
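In a real cookie file these fields are separated by tabs: domain, subdomain flag, path, secure flag, expiry time, name, value. As a minimal sketch of reading one such line, including the "#HttpOnly_" prefix that libcurl puts on HttpOnly cookies, something like the following could be used. The function name parseCookieLine and the array keys are hypothetical.

```php
<?php
// Hypothetical helper: parse one line of a Netscape-format cookie file into an
// array, or return null for comments, blank lines and malformed lines.
function parseCookieLine($line)
{
    $line = rtrim($line, "\r\n");
    $httpOnly = false;
    // libcurl marks HttpOnly cookies with a "#HttpOnly_" prefix on the domain
    if (strpos($line, '#HttpOnly_') === 0) {
        $httpOnly = true;
        $line = (string) substr($line, strlen('#HttpOnly_'));
    }
    if ($line === '' || $line[0] === '#') {
        return null; // comment or empty line
    }
    $fields = explode("\t", $line);
    if (count($fields) < 7) {
        return null; // not a complete cookie line
    }
    return array(
        'domain'   => $fields[0],
        'path'     => $fields[2],
        'secure'   => $fields[3] === 'TRUE',
        'expires'  => (int) $fields[4], // 0 means a session cookie
        'name'     => $fields[5],
        'value'    => $fields[6],
        'httponly' => $httpOnly,
    );
}

$cookie = parseCookieLine("www.yell.com\tFALSE\t/ucs\tFALSE\t0\tJSESSIONID\t3861C648DD4F626447319030008D17B1\n");
echo $cookie['name'], '=', $cookie['value'], PHP_EOL;
```

Treating "#HttpOnly_" lines as cookies rather than comments matters here, because a parser that skips every line starting with '#' would silently drop the second cookie in the file above.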

The corresponding code is:

/*
 * 【Solved】How to concatenate paths in PHP (merge two paths), and combine a directory path with a filename
 * https://www.crifan.com/php_path_concatenation_combine_directory_and_filename
 * eg:
 * from:
 * D:\tmp\WordPress\DevRoot\httpd-2.2.19-win64\httpd-2.2-x64\htdocs\php_test\35934503_data
 * cookie_jar.txt
 * to:
 * D:\tmp\WordPress\DevRoot\httpd-2.2.19-win64\httpd-2.2-x64\htdocs\php_test\35934503_data\cookie_jar.txt
 */
function concatenatePath($headPath, $tailPath)
{
    $realHeadPath = realpath($headPath);
    printAutoNewline("realHeadPath=".$realHeadPath);
    //$realTailPath = realpath($tailPath);
    //printAutoNewline("realTailPath=".$realTailPath);
    //$concatnatedPath = $realHeadPath.DIRECTORY_SEPARATOR.$realTailPath;
    printAutoNewline("tailPath=".$tailPath);
    
    $concatnatedPath = $realHeadPath.DIRECTORY_SEPARATOR.$tailPath;
    printAutoNewline("concatnatedPath=".$concatnatedPath);
    return $concatnatedPath;
}

//http://cn2.php.net/curl_setopt
function getUrlRespHtml($url)
{
    printAutoNewline("now to get response from url=".$url);
    
    //get the file (e.g. image) and output it to the browser
    $ch = curl_init(); //open curl handle
    curl_setopt($ch, CURLOPT_URL, $url); //set an url
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); //do not output directly, use variable
    curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1); //do a binary transfer
    curl_setopt($ch, CURLOPT_FAILONERROR, 1); //stop if an error occurred
    
    printAutoNewline("now add CURLOPT_COOKIEJAR support");
    $cookieJarFilename = "cookie_jar.txt";
    //$cookieJarFullname = dirname(__FILE__).$cookieJarFilename;
    $cookieJarFullname = concatenatePath(dirname(__FILE__), $cookieJarFilename);
    printAutoNewline("cookieJarFullname=".$cookieJarFullname);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieJarFullname);
    //curl_setopt($ch, CURLOPT_COOKIEFILE,dirname(__FILE__)."cookie_file.txt");

    $response = curl_exec($ch); //store the content in variable
    
    if(!curl_errno($ch))
    {
        //send out headers and output
        header("Content-type: ".curl_getinfo($ch, CURLINFO_CONTENT_TYPE)."");
        header("Content-Length: ".curl_getinfo($ch, CURLINFO_CONTENT_LENGTH_DOWNLOAD)."");

        //printAutoNewline($response);
    }
    else{
        printAutoNewline('Curl error: ' . curl_error($ch));
    }
    curl_close($ch); //close curl handle

    return $response;
}

printAutoNewline("DIRECTORY_SEPARATOR=".DIRECTORY_SEPARATOR);

$yellEntryUrl = "http://www.yell.com/";
$yesllRespHtml = getUrlRespHtml($yellEntryUrl);
//printAutoNewline("yesllRespHtml=".$yesllRespHtml);
$outputFilename = "respHtml.html";
saveToFile($yesllRespHtml, $outputFilename);

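10. As a further test, I deleted cookie_jar.txt first, to see whether it would be created automatically or whether the code would fail with a file-not-found error.

The result confirmed that cookie_jar.txt is created automatically.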

 

【Summary】

To get the cookies returned by a request, use:

curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieJarFullname);

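which saves the returned cookies into the file $cookieJarFullname.

Note that $cookieJarFullname should be a complete filename with an absolute path. A relative name is resolved against the process's current working directory, which under a web server is not necessarily the script's directory.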

 

In addition, to send the cookies, that is, to reuse them the next time you access another URL, use:

curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieJarFullname);

which reads the cookies from that file and sends them with the request.
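Putting the two options together, a save-then-send flow might look like the sketch below. The function name fetchWithCookies and its parameters are hypothetical, and the usage lines are left commented out because they need network access.

```php
<?php
// Hypothetical helper: fetch $url, optionally sending cookies read from
// $sendFrom and saving any cookies received to $saveTo (both are file paths).
function fetchWithCookies($url, $saveTo = null, $sendFrom = null)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    if ($sendFrom !== null) {
        curl_setopt($ch, CURLOPT_COOKIEFILE, $sendFrom); // cookies to send
    }
    if ($saveTo !== null) {
        curl_setopt($ch, CURLOPT_COOKIEJAR, $saveTo);    // where to save cookies
    }
    $response = curl_exec($ch);
    // The jar is written when the handle is released. unset() releases it here;
    // on PHP versions before 8.0, curl_close($ch) is the usual way to do that.
    unset($ch);
    return $response;
}

// Usage sketch (requires network access):
// $jar = dirname(__FILE__) . DIRECTORY_SEPARATOR . 'cookie_jar.txt';
// fetchWithCookies('http://www.yell.com/', $jar);               // 1st request: save cookies
// $html = fetchWithCookies('http://www.yell.com/', null, $jar); // 2nd request: send them
```

Releasing the handle between the two requests is the important step, since the second request can only read what the first one has already flushed to disk.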

When reposting, please credit: 在路上 » 【Solved】Getting the cookies returned by a URL in PHP, and sending cookies
