[Notes] What to know about calling read() after open() in Python: one call returns the entire file contents

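After opening a file with open() in Python, there is one thing to keep in mind when calling read():

a single call to read(), with no size argument, reads out the entire contents of the file.

Example code: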

#!/usr/bin/python
# -*- coding: utf-8 -*-
"""
Function:
[Notes] What to know about calling read() after open() in Python: one call returns the entire file contents
http://www.crifan.com/python_file_open_then_read_once_will_read_all_file_content

Author:     Crifan Li
Version:    2013-09-23
Contact:    http://www.crifan.com/contact_me/
"""

import os
import fnmatch
#import re

def findTailContainSomeStr():
    #for fName in os.listdir('E:/Dev_Root/python/answer_question/from_mail/qiu-yang.zhao/')
    #for fName in os.listdir('./')
    #for fName in os.listdir("E:\\Dev_Root\\python\\answer_question\\from_mail\\qiu-yang.zhao\\")
    #for fName in os.listdir("E:\\Dev_Root\\python\\answer_question\\from_mail\\qiu-yang.zhao")
    for fName in os.listdir("E:\\Dev_Root\\python\\answer_question\\from_mail\\qiu_yang_zhao\\"):
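        print "fName=",fName
        if fnmatch.fnmatch(fName, '*.pdf'):
            #open(name[, mode[, buffering]])
            #fhandle = open (fName)
            # NOTE: os.listdir() returns bare file names, so open(fName) only works
            # when this script is run from inside the directory listed above
            fhandle = open(fName, 'rb')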
            print "fhandle=",fhandle
            readOutFirst = fhandle.read()
            #print "readOutFirst=",readOutFirst # this would print the ENTIRE file content, far too much here and enough to crash cmd, so it stays commented out
            print "len(readOutFirst)=",len(readOutFirst)
            readOutSecond = fhandle.read()
            print "readOutSecond=",readOutSecond # readOutSecond=
            # above, readOutSecond is empty, because:
            # fhandle.read() reads ALL of the file content in one call,
            # so after one read() there is no content left, and the next read() returns an empty string;
            # calling find() on that empty string returns -1.
            # so search the result of the first read() instead:
            print readOutFirst.find('xxx') #470644
            allFileContent = readOutFirst
            #print readOutSecond.find('trailer')
            print allFileContent.find('yyy') #470525
            # here is the explanation from the Python manual:
            # file.read([size]) 
            # Read at most size bytes from the file (less if the read hits EOF before obtaining size bytes).
            # If the size argument is negative or omitted, read all data until EOF is reached. 
            # The bytes are returned as a string object. 
            # An empty string is returned when EOF is encountered immediately. 
            # (For certain files, like ttys, it makes sense to continue reading after an EOF is hit.)
            # Note that this method may call the underlying C function fread() more than once in an effort
            # to acquire as close to size bytes as possible. 
            # Also note that when in non-blocking mode, 
            #less data than was requested may be returned, even if no size parameter was given.
            # Note
            # This function is simply a wrapper for the underlying fread() C function, 
            # and will behave the same in corner cases, such as whether the EOF value is cached.
            fhandle.close()
            #whole output is:
            #
            # fName= xxx.pdf
            # fhandle= <open file 'xxx.pdf', mode 'rb' at 0x00000000021F6150>
            # len(readOutFirst)= 475293
            # readOutSecond=
            # 470644
            # 470525
            # fName= findTailContainSomeStr.py

if __name__ == "__main__":
    findTailContainSomeStr()
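
The script above targets Python 2 and a specific folder of PDF files. For readers who want to reproduce the behaviour without those files, here is a minimal Python 3 sketch. It is not the original script: the helper name `read_twice` and the sample bytes are made up for illustration, and a temporary file stands in for the PDF. It shows that one read() returns everything, that a second read() on the same handle returns an empty bytes object, and that find() on that empty result gives -1.

```python
import os
import tempfile


def read_twice(path):
    """Call read() twice on one handle and return both results."""
    with open(path, "rb") as fhandle:
        first = fhandle.read()   # no size argument: reads up to EOF
        second = fhandle.read()  # already at EOF: nothing left to read
    return first, second


# Hypothetical sample data standing in for the PDF in the original script.
sample = b"%PDF-1.4 ... trailer ... startxref"

# Write the sample to a temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(sample)
    tmp_path = tmp.name

try:
    first, second = read_twice(tmp_path)
finally:
    os.remove(tmp_path)

print("len(first) =", len(first))
print("second =", second)
print(first.find(b"trailer"))   # search the data from the first read
print(second.find(b"trailer"))  # -1: the second read returned nothing
```

The point of keeping `first` around is the same as `allFileContent` in the script: read once, then run every search against that one result.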

 

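[Summary]

Whether you run into this kind of problem has little to do with whether you happen to know the details of Python's read() API,

and a lot to do with whether you have a sound method for learning.

In other words, suppose you had learned Python by following a post like mine, listed under

How to learn Python

namely:

[Notes] How to learn Python + how to make effective use of Python resources on the web + how to use the manual bundled with Python (Python Manual)

Having picked up the basic learning method from it,

you would have looked read() up in Python's bundled manual before using it, and found

(I have already pasted it into the code above)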

file.read([size])

Read at most size bytes from the file (less if the read hits EOF before obtaining size bytes). If the size argument is negative or omitted, read all data until EOF is reached. The bytes are returned as a string object. An empty string is returned when EOF is encountered immediately. (For certain files, like ttys, it makes sense to continue reading after an EOF is hit.) Note that this method may call the underlying C function fread() more than once in an effort to acquire as close to size bytes as possible. Also note that when in non-blocking mode, less data than was requested may be returned, even if no size parameter was given.

Note

This function is simply a wrapper for the underlying fread() C function, and will behave the same in corner cases, such as whether the EOF value is cached.

and in that text you would see this sentence:

If the size argument is negative or omitted, read all data until EOF is reached

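which tells you:

when read() is called without a size argument, it keeps reading until the end of the file (EOF),

that is, it reads the whole contents of the file in one go.

In other words, read() does not read a single byte; it reads everything in the file.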
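
To make the difference between read(size) and a bare read() concrete, here is a small Python 3 sketch. It uses an in-memory `io.BytesIO` object as a stand-in for a file opened in 'rb' mode, and the sample bytes are arbitrary; both are assumptions for illustration rather than anything from the original script.

```python
import io

# io.BytesIO behaves like a binary file handle; it is used here only so the
# example needs no file on disk.
stream = io.BytesIO(b"0123456789")

head = stream.read(4)      # size given: at most 4 bytes
rest = stream.read()       # size omitted: everything up to EOF
after_eof = stream.read()  # already at EOF: nothing left

print(head)       # b'0123'
print(rest)       # b'456789'
print(after_eof)  # b''
```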

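So here, once read() has been called a single time,

you already have the entire contents of the file.

Calling it again will therefore not read a single byte,

because the file pointer is already at the end of the file and there is no more content (no bytes) left to read.

So the second read() returns an empty string, and searching that empty string with find() returns -1.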
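
The file-pointer explanation can be checked directly with tell(), which reports the current position. The Python 3 sketch below uses a temporary file with made-up contents. It also shows seek(0), which the original post does not cover, as one way to move the pointer back to the start when a second read really is needed; in most cases simply reusing the result of the first read() is enough.

```python
import tempfile

# TemporaryFile() opens in binary read/write mode ('w+b') by default.
with tempfile.TemporaryFile() as fhandle:
    fhandle.write(b"hello, read()")
    fhandle.seek(0)          # rewind after writing so reading starts at byte 0

    start = fhandle.tell()   # position before reading
    data = fhandle.read()    # reads up to EOF
    end = fhandle.tell()     # position is now the file size
    again = fhandle.read()   # pointer is at EOF, so this is empty

    fhandle.seek(0)          # move the pointer back to the beginning
    reread = fhandle.read()  # the full contents are readable again

print("start =", start)
print("end =", end, "len(data) =", len(data))
print("again =", again)
print("reread == data:", reread == data)
```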

 

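So, to sum up:

the sound way to approach writing code, and learning, is this:

before you write the code, do whatever it takes to understand what the API you are about to use actually does,

and only then start writing. Whenever you are unsure about something, do not interpret it through your existing habits and preconceptions.

In other words, do not take things for granted: different languages and different APIs are not all alike, and each has its own characteristics and usage.

Only when you understand the real meaning of the API you are using can you write correct, high-quality code and implement what you need both quickly and well.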


