Python Read Error Character Maps to Undefined

On this page: open(), file.read(), file.readlines(), file.write(), file.writelines().

Before proceeding, make sure you understand the concepts of file path and CWD. If you run into problems, visit the Common Pitfalls section at the bottom of this page.

Opening and Closing a "File Object"

As seen in Tutorials #12 and #13, file IO (input/output) operations are done through a file data object. It typically proceeds as follows:
  1. Create a file object using the open() function. Along with the file name, specify:
    • 'r' for reading in an existing file (default; can be dropped),
    • 'w' for creating a new file for writing,
    • 'a' for appending new content to an existing file.
  2. Do something with the file object (reading, writing).
  3. Close the file object by calling the .close() method on the file object.
Below, myfile is the file data object we're creating for reading. 'alice.txt' is a pre-existing text file in the same directory as the foo.py script. After the file content is read in, .close() is called on myfile, closing the file object.
    # foo.py
    myfile = open('alice.txt', 'r')
    myfile.close()
Below, myfile is opened for writing. In the second instance, the 'a' switch makes sure that the new content is tacked on at the end of the existing text file. Had you used 'w' instead, the original file would have been overwritten.
    # foo.py
    myfile = open('results.txt', 'w')
    myfile.close()

    myfile = open('results.txt', 'a')
    myfile.close()
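The three modes can be tried side by side in a small sketch. It is not part of the original tutorial: it writes to a temporary directory (an assumption, so no real file is touched) and uses the with statement, which calls .close() automatically when the indented block ends.

```python
import os
import tempfile

# Hypothetical scratch location so the sketch does not touch real files.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'results.txt')

# 'w' creates the file, or overwrites it if it already exists.
with open(path, 'w') as myfile:
    myfile.write('first run\n')

# 'a' keeps the existing content and adds new text at the end.
with open(path, 'a') as myfile:
    myfile.write('second run\n')

# 'r' is the default mode, so it can be dropped.
with open(path) as myfile:
    content = myfile.read()

print(content)
print(myfile.closed)  # the with block has already closed the file object
```

Replacing the 'a' in the second open() with 'w' would leave only 'second run' in the file.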
There is one more piece of crucial information: encoding. Some files may have to be read as a particular encoding type, and sometimes you need to write out a file in a specific encoding system. For such cases, the open() statement should include an encoding specification, with the encoding='xxx' switch:
    # foo.py
    myfile = open('alice.txt', encoding='utf-8')

    myfile = open('results.txt', 'w', encoding='utf-8')
Generally, you will need 'utf-8' (8-bit Unicode), 'utf-16' (16-bit Unicode), or 'utf-32' (32-bit), but it may be something different, especially if you are dealing with a foreign language text. Here is a full list of encodings.
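As a rough illustration of why the encoding matters, the sketch below writes a short string with non-ASCII characters as UTF-8 and reads it back. The file name and the sample text are made up for this example.

```python
import os
import tempfile

# Hypothetical file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'greeting.txt')
text = 'café, naïve, 日本語'

# Write and read back with the same explicit encoding.
with open(path, 'w', encoding='utf-8') as fout:
    fout.write(text)

with open(path, encoding='utf-8') as fin:
    same_text = fin.read()

# Binary mode ('rb') returns the raw bytes stored on disk.
with open(path, 'rb') as fin:
    raw = fin.read()

print(same_text == text)     # the round trip preserves the text
print(len(text), len(raw))   # characters vs. bytes on disk
```

The byte count is larger than the character count because UTF-8 stores each non-ASCII character in more than one byte. Reading the same file with a different encoding would either fail or produce the wrong characters.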

Reading from a File

OK, we know how to open and close a file object. But what are the actual commands for reading? There are multiple methods.

First off, .read() reads in the entire text content of the file as a single string. Below, the file is read into a variable named marytxt, which ends up being a string-type object. Download mary-short.txt and try it out yourself.
    >>> f = open('mary-short.txt')
    >>> marytxt = f.read()
    >>> f.close()
    >>> marytxt
    'Mary had a little lamb,\nHis fleece was white as snow,\nAnd everywhere that Mary went,\nThe lamb was sure to go.\n'
    >>> type(marytxt)
    <class 'str'>
    >>> len(marytxt)
    110
    >>> print(marytxt[0])
    M
Next, .readlines() reads in the entire text content of the file as a list of lines, each terminating with a line break. Below, you can see marylines is a list of strings, where each string is a line from mary-short.txt.
    >>> f = open('mary-short.txt')
    >>> marylines = f.readlines()
    >>> f.close()
    >>> marylines
    ['Mary had a little lamb,\n', 'His fleece was white as snow,\n', 'And everywhere that Mary went,\n', 'The lamb was sure to go.\n']
    >>> type(marylines)
    <class 'list'>
    >>> len(marylines)
    4
    >>> print(marylines[0])
    Mary had a little lamb,

Lastly, rather than loading the entire file content into memory, you can iterate through the file object line by line using the for ... in loop. This method is more memory-efficient and therefore recommended when dealing with a very large file. Below, bible-kjv.txt is opened, and any line containing smite is printed out. Download bible-kjv.txt and try it out yourself.
    # foo.py
    f = open('bible-kjv.txt')
    for line in f:
        if 'smite' in line:
            print(line, end='')   # line already ends with '\n'
    f.close()
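The same line-by-line pattern can be tried without downloading anything. In the sketch below, a three-line stand-in file with made-up sentences is created first (an assumption of this example, not the real bible-kjv.txt), and enumerate() is used to keep track of line numbers.

```python
import os
import tempfile

# Hypothetical stand-in for a large text file; the lines are made up.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(path, 'w') as fout:
    fout.writelines(['I will smite the rock.\n',
                     'The water came out.\n',
                     'They did smite it again.\n'])

matches = []
with open(path) as f:
    # Only one line is held in memory at a time.
    for lineno, line in enumerate(f, start=1):
        if 'smite' in line:
            matches.append((lineno, line.rstrip('\n')))

for lineno, text in matches:
    print(lineno, text)
```

Stripping the trailing '\n' before storing each line avoids the doubled line breaks that print() would otherwise produce.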

Writing to a File

Writing methods also come in a pair: .write() and .writelines(). Like the corresponding reading methods, .write() handles a single string, while .writelines() handles a list of strings.

Below, .write() writes a single string each time to the designated output file:
    >>> fout = open('hello.txt', 'w')
    >>> fout.write('Hello, world!\n')
    >>> fout.write('My name is Homer.\n')
    >>> fout.write("What a beautiful day we're having.\n")
    >>> fout.close()
This time, we have tobuy, a list of strings, which .writelines() writes out at once:
    >>> tobuy = ['milk\n', 'butter\n', 'coffee beans\n', 'arugula\n']
    >>> fout = open('grocerylist.txt', 'w')
    >>> fout.writelines(tobuy)
    >>> fout.close()
Note that all strings in the examples have the line break '\n' at the end. Without it, all strings will be printed out on the same line, which is what was happening in Tutorial 13. Unlike the print() function, which prints out a string on its own new line, writing methods will not tack on a newline character -- you must remember to supply '\n' if you wish a string to occupy its own line.
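A quick way to see the difference is to write the same list twice, once without and once with line breaks. The sketch below is only an illustration; it reuses the tobuy list but stores the items without '\n' and writes to a temporary file.

```python
import os
import tempfile

tobuy = ['milk', 'butter', 'coffee beans', 'arugula']   # no '\n' at the end
path = os.path.join(tempfile.mkdtemp(), 'grocerylist.txt')

# Without line breaks, everything runs together on one line.
with open(path, 'w') as fout:
    fout.writelines(tobuy)
with open(path) as fin:
    run_together = fin.read()

# Adding '\n' to each item puts every string on its own line.
with open(path, 'w') as fout:
    fout.writelines(item + '\n' for item in tobuy)
with open(path) as fin:
    one_per_line = fin.readlines()

print(repr(run_together))
print(one_per_line)
```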

Common Pitfalls

File I/O is notoriously fraught with stumbling blocks for beginning programmers. Below are the most common ones.

"No such file or directory" error

    >>> f = open('mary-short.txt')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    FileNotFoundError: [Errno 2] No such file or directory: 'mary-short.txt'
You are getting this error because Python failed to locate the file for reading. Make sure you are supplying the correct file path and name. First read File Path and CWD. Also, refer to this, this and this FAQ.
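When this error shows up, it can help to check where Python is actually looking. The sketch below is one possible diagnostic, not a required step: it prints the current working directory, tests whether the path exists, and catches the error instead of crashing. The path points into a fresh, empty temporary directory, so the file is guaranteed to be missing.

```python
import os
import tempfile

# Hypothetical path inside an empty temporary directory: the file cannot exist.
filename = os.path.join(tempfile.mkdtemp(), 'mary-short.txt')

print('CWD:', os.getcwd())                    # where relative paths are resolved
print('Exists?', os.path.exists(filename))

try:
    f = open(filename)
except FileNotFoundError as err:
    found = False
    print('Could not open:', err.filename)
else:
    found = True
    f.close()
```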

Issues with encoding

    >>> f = open('mary-short.txt')
    >>> marytxt = f.read()
    Traceback (most recent call last):
      File "<pyshell#14>", line 1, in <module>
        marytxt = f.read()
      File "C:\Program Files (x86)\Python35-32\lib\encodings\cp1252.py", line 23, in decode
        return codecs.charmap_decode(input,self.errors,decoding_table)[0]
    UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 36593: character maps to <undefined>
"UnicodeDecodeError" means you have a file encoding issue. Each computer has its own system-wide default encoding, and the file you are trying to open is encoded in something different, most likely some version of Unicode. If this happens, you should specify the encoding using the encoding='xxx' switch while opening the file. If you are not sure which encoding to use, try 'utf-8', 'utf-16', and 'utf-32'.

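The trial-and-error advice above can be sketched as a small helper. The function name read_with_first_working_encoding is hypothetical, and the sample file is written as UTF-16 in a temporary directory purely for demonstration.

```python
import os
import tempfile

# Hypothetical sample file, deliberately saved as UTF-16.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(path, 'w', encoding='utf-16') as fout:
    fout.write('Mary had a little lamb\n')

def read_with_first_working_encoding(path, candidates=('utf-8', 'utf-16', 'utf-32')):
    """Try each candidate encoding in turn; return (encoding, text) for the first that decodes."""
    for enc in candidates:
        try:
            with open(path, encoding=enc) as f:
                return enc, f.read()
        except UnicodeError:   # UnicodeDecodeError is a subclass of UnicodeError
            continue
    raise ValueError('none of the candidate encodings worked')

enc, text = read_with_first_working_encoding(path)
print(enc)
print(text, end='')
```

A successful decode does not prove the guess is right, since some encodings accept almost any byte sequence. It is worth looking at the resulting text before trusting it.
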
Entire file content can be read in only ONCE per opening

    >>> f = open('mary-short.txt')
    >>> marytxt = f.read()
    >>> marylines = f.readlines()
    >>> f.close()
    >>> len(marytxt)
    110
    >>> len(marylines)
    0
Both .read() and .readlines() come with the concept of a cursor. After either command is executed, the cursor moves to the end of the file, leaving nothing more to read in. Therefore, once a file's content has been read in, another attempt to read from the file object will produce an empty data object. If for some reason you must read the file content again, you must close and re-open the file, or move the cursor back to the beginning with .seek(0).
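The cursor behavior can be observed directly. The sketch below uses a made-up two-line file in a temporary directory; it also shows .seek(0), a standard file object method that moves the cursor back to the start, as an alternative to closing and re-opening.

```python
import os
import tempfile

# Hypothetical two-line sample file.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(path, 'w') as fout:
    fout.write('line one\nline two\n')

with open(path) as f:
    first = f.read()       # reads everything; the cursor is now at the end
    second = f.read()      # nothing left to read: an empty string
    f.seek(0)              # move the cursor back to the beginning
    lines = f.readlines()  # the whole content is available again

print(repr(first))
print(repr(second))
print(lines)
```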

Only the string type can be written

    >>> pi = 3.141592
    >>> fout = open('math.txt', 'w')
    >>> fout.write("Pi's value is ")
    >>> fout.write(pi)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: write() argument must be str, not float
    >>> fout.write(str(pi))
    >>>
Writing methods only work with strings: .write() takes a single string, and .writelines() takes a list which contains strings only. Non-string type data must first be coerced into the string type by using the str() function.
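Here is a short sketch of a few ways to turn numbers into strings before writing. It writes to a temporary file; the 'Rounded' line and the list of integers are additions made up for this example.

```python
import os
import tempfile

pi = 3.141592
path = os.path.join(tempfile.mkdtemp(), 'math.txt')

with open(path, 'w') as fout:
    fout.write("Pi's value is ")
    fout.write(str(pi))                           # str() turns the float into a string
    fout.write('\n')
    fout.write('Rounded: {:.2f}\n'.format(pi))    # .format() also produces a string
    # .writelines() needs strings too, so each number is converted first.
    fout.writelines(str(n) + '\n' for n in [1, 2, 3])

with open(path) as fin:
    written = fin.read()

print(written, end='')
```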

Your output file is empty

This happens to everyone: you write something out, open up the file to view, only to find it empty. At other times, the file content may be incomplete. Curious, isn't it? Well, the cause is simple: YOU FORGOT .close(). Writing out happens in buffers; flushing out the last writing buffer does not happen until you close your file object. ALWAYS REMEMBER TO CLOSE YOUR FILE OBJECT.
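The effect of buffering can be checked by looking at the file size before and after closing. The sketch below is only an illustration: the size seen before .close() depends on the Python implementation and buffer settings, so no particular value is promised. It also shows the with statement, which closes the file automatically and so sidesteps the problem.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'hello.txt')   # hypothetical output file

fout = open(path, 'w')
fout.write('Hello, world!\n')
size_before_close = os.path.getsize(path)   # may still be 0: text can sit in the buffer
fout.close()                                # closing flushes the buffer to disk
size_after_close = os.path.getsize(path)

print(size_before_close, size_after_close)

# A with block closes the file as soon as the indented block ends.
with open(path, 'a') as fout:
    fout.write('My name is Homer.\n')

with open(path) as fin:
    final = fin.read()

print(fout.closed)
print(final, end='')
```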

(Windows) Line breaks do not show up
If you open up your text file in the Notepad app in Windows and see everything in one line, don't be alarmed. Open up the same text file in WordPad or, even better, Notepad++, and you will see that the line breaks are there after all. See this FAQ for details.

nashtheine.blogspot.com

Source: https://sites.pitt.edu/~naraehan/python3/reading_writing_methods.html
