
File Handling

Text files can hold a massive quantity of information.

Programmers need to be comfortable working with this kind of data. It may be weather data, transportation data, socioeconomic data, literary masterpieces, or any other information that can be found in text files.

Reading from a file is especially useful in data analysis applications, but it applies to any situation where you need to analyze or edit data saved in a file.

For example, you can write a program that reads the contents of a text file and reformats it so that it can be displayed in a browser.

The first step in working with the information in a text file is to read the file into memory. You can read a file's whole contents at once or work through it one line at a time.

Click on the button below to download a text file that you can use to follow along on your laptop/workstation.

 

Download Txt                        Preview text

Buttons 11.1. Click to download/preview text file.

 

  • Save the downloaded file as pi_digits.txt.
  • Save it in the same directory as your Python program.

The file contains the digits of pi: 3.141592653589793238462643383279

 

Download Python file                        Preview Python

Buttons 11.2. Click to download/preview python file.

 

open() is the built-in function for opening files.

To do anything with a file, even printing its contents, you must first open it to gain access to it.

The name of the file you want to open is the only required argument to the open() function.

Python resolves a plain file name against the current working directory, which is normally the directory of the running program when you start the program from there. In this case, Python searches for pi_digits.txt in the same directory as Read1.py.

The open() function, in this case open('pi_digits.txt'), returns a file object that represents pi_digits.txt.
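The steps above can be sketched as follows. This is a minimal illustration, not the contents of Read1.py: it assumes a stand-in copy of pi_digits.txt created in a temporary folder so that the example runs anywhere, and the names work_dir, pi_path, and file_object are hypothetical.

```python
from pathlib import Path
import tempfile

# Assumption: create a stand-in for pi_digits.txt in a temporary folder
# so the sketch does not depend on a downloaded file.
work_dir = Path(tempfile.mkdtemp())
pi_path = work_dir / "pi_digits.txt"
pi_path.write_text("3.141592653589793238462643383279\n")

# open() returns a file object; read() returns the whole file as one string.
file_object = open(pi_path)
contents = file_object.read()
file_object.close()

# rstrip() removes the trailing newline before printing.
print(contents.rstrip())
```

If the file sits next to your program and you run the program from that directory, open('pi_digits.txt') with just the file name should behave the same way.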

Once you run the program you should get an output like this :

 

Read1.py

Figure 11.1. Output of PI.textfile.

 

Reading Files

File reading is a three-stage procedure:

  1. Open the file.

  2. Read the file.

  3. Close the file.

 

The file is opened with the open() function, which creates a 'file handle' that stays valid until the file is closed.

This function takes two parameters: the name of the file and the opening mode. The name of the file usually includes the file path, unless the file is in the same folder as the .py file you are working with, in which case the bare file name (a relative path) is enough.

The second parameter is the mode in which the file is opened:

 

   
Mode  Meaning
r     Read mode
w     Write mode
a     Append mode
r+    Read and write mode
   

Table 11.1. Different modes of file manipulation.
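As a rough illustration of the less familiar r+ mode from the table, the sketch below first creates a small file, then reopens it for reading and writing. The file name notes.txt and the temporary folder are assumptions made only for this example.

```python
from pathlib import Path
import tempfile

# Assumption: a throwaway file in a temporary folder.
notes_path = Path(tempfile.mkdtemp()) / "notes.txt"

# "w" creates the file (or overwrites an existing one).
with open(notes_path, "w") as fh:
    fh.write("hello\n")

# "r+" opens an existing file for both reading and writing.
with open(notes_path, "r+") as fh:
    existing = fh.read()      # read everything; the position is now at the end
    fh.write("world\n")       # so this write lands after the existing text

with open(notes_path, "r") as fh:
    final_text = fh.read()

print(final_text)
```

Unlike "w", the "r+" mode does not create the file, so it expects the file to exist already.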

 

Use open() to create a file handle for reading a file:
>>> file_handle = open('sampletext.txt', 'r')

file_handle is not the file, but a reference to it :

File handle

Figure 11.2. File handler object.

 

Once the file has been opened, we can read its contents. The methods that allow this are:

read(i) : reads up to 'i' characters from the file (bytes if the file was opened in binary mode). If 'i' is omitted, it reads the whole file.

readline() : returns a single line from the file, including its newline character (\n). It returns an empty string once the end of the file is reached.

readlines() : returns a list of strings, one for each remaining line in the file.
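The three methods can be compared side by side in a short sketch. It assumes a three-line stand-in for sampletext.txt written to a temporary folder; the variable names are hypothetical.

```python
from pathlib import Path
import tempfile

# Assumption: a small stand-in for sampletext.txt in a temporary folder.
sample_path = Path(tempfile.mkdtemp()) / "sampletext.txt"
sample_path.write_text("first line\nsecond line\nthird line\n")

file_handle = open(sample_path, "r")
first_five = file_handle.read(5)        # the first five characters
rest_of_line = file_handle.readline()   # the rest of the current line, with "\n"
remaining = file_handle.readlines()     # every remaining line as a list
at_end = file_handle.readline()         # nothing is left, so an empty string
file_handle.close()

print(repr(first_five), repr(rest_of_line), remaining, repr(at_end))
```

Each call continues from where the previous one stopped, which is why readlines() here returns only the second and third lines.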

 

Download P53 gene.txt                        Preview p53 gene.txt

Buttons 11.3. Click to download/preview text file.

 

Download P53 solution.py                        Preview p53 py file.

Buttons 11.4. Click to download/preview python file.

 

Once we are done with the file, we close it with file_handle.close(). If this step is skipped, Python will normally close the file when the program exits, but it is better not to rely on that. A way to ensure the file is always closed is to use the with statement.

So we can also write it as:

with open('readme.txt', 'r') as file_handle:
    # do something with the file
    contents = file_handle.read()

# from here on, the file is closed

Next, we'll try to extract the 'name of the sequence' and the 'sequence' from the FASTA file that we downloaded previously.

 

Download Firstread.py                        Preview Firstread.py

Buttons 11.5. Click to download/preview python file.

 

Extract name and sequence.

Figure 11.3. Output produced by python file.

 

The figure above shows the output produced by running our Python file.

The name must now be separated from the sequence.

Because the name comes after the ">" symbol and before the first newline character ("\n"), we can use these two markers to get the data we need (line 3).

The sequence is formed by splitting the file's text into lines and joining all of the elements except the first one.
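The extraction described above might look roughly like the sketch below. It is not the downloadable Firstread.py: the record is a short made-up sequence rather than the real P53 file, and the names fasta_path, fasta_text, name, and sequence are assumptions.

```python
from pathlib import Path
import tempfile

# Assumption: a tiny made-up FASTA record stands in for the downloaded file.
fasta_path = Path(tempfile.mkdtemp()) / "example.fasta"
fasta_path.write_text(">example_seq hypothetical record\nATGGAGGAGC\nCGCAGTCAGA\n")

with open(fasta_path, "r") as file_handle:
    fasta_text = file_handle.read()

lines = fasta_text.split("\n")

# The name sits between ">" and the first newline.
name = lines[0][1:]

# The sequence is every later element joined together.
sequence = "".join(lines[1:])

print(name)
print(sequence)
```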

read() loads the entire file into memory at once, which can be a problem for large files when memory is limited. In that case we use readline() to process the file one line at a time.
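A line-by-line version could be sketched as follows, again using a made-up record in a temporary folder instead of the real download. All names here are hypothetical.

```python
from pathlib import Path
import tempfile

# Assumption: the same tiny made-up FASTA record as a stand-in.
fasta_path = Path(tempfile.mkdtemp()) / "example.fasta"
fasta_path.write_text(">example_seq hypothetical record\nATGGAGGAGC\nCGCAGTCAGA\n")

sequence_parts = []
with open(fasta_path, "r") as file_handle:
    # The first line holds the name: drop ">" and the trailing newline.
    name = file_handle.readline()[1:].rstrip("\n")

    # readline() returns "" at the end of the file, which ends the loop.
    line = file_handle.readline()
    while line:
        sequence_parts.append(line.rstrip("\n"))
        line = file_handle.readline()

sequence = "".join(sequence_parts)
print(name)
print(sequence)
```

Only one line is held in memory at a time while reading. A for loop over file_handle would read the lines in the same lazy way.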

 

Download modified readline()                        Preview readline()

Buttons 11.6. Click to download/preview python file.

 

Writing Files

Writing into a file follows the same pattern as reading: steps 1 and 3 are the same, and only step 2 changes.

  1. Open the file.

  2. Write into file.

  3. Close the file.

Opening the file works the same way as for reading; the only difference is the mode. The "w" (write) mode creates a new file or overwrites an existing one. The "a" (append) mode adds to the end of an existing file.

For example:

Creating a file handle for a new file:
>>> fh = open('newfile.txt', 'w')
Creating a new file handle to append information to a file:
>>> fh = open('error.log', 'a')

 

Writing into the file is done with the write() method, which takes the string to write as its parameter. The syntax is:

 

file_handle.write(string)

 

Close the file with file_handle.close(). As mentioned previously, you can avoid the explicit call entirely by using the with keyword when opening the file.
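The three writing steps can be sketched as follows. The file lives in a temporary folder and the log entries are made up for illustration; this is not the downloadable example file.

```python
from pathlib import Path
import tempfile

# Assumption: a throwaway error.log in a temporary folder.
log_path = Path(tempfile.mkdtemp()) / "error.log"

# "w" creates the file, or overwrites it if it already exists.
with open(log_path, "w") as fh:
    fh.write("first entry\n")

# "a" keeps the existing content and adds to the end.
with open(log_path, "a") as fh:
    fh.write("second entry\n")

# Read the file back to check the result.
with open(log_path, "r") as fh:
    lines = fh.readlines()

print(lines)
```

Note that write() does not add a newline on its own, so each string ends with "\n" explicitly.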

Some example Python files that demonstrate reading and writing can be found below.

 

Download write&read.py                        Preview write&read.py

Buttons 11.7. Click to download/preview python file.

 

Download numbers.txt                        Preview numbers.txt

Buttons 11.8. Click to download/preview text file.

 

CSV (Comma Separated Values) files can be handled in a similar way. One option is the pandas library:


import pandas as pd                    # import the pandas library as pd

data = pd.read_csv("example.csv")      # read the file and store the values in data

print(data)                            # display the values

 

However, if you want to do it like a pro, you can achieve the same with the built-in csv module. Let's try it with a text file that has several columns and rows.

 

Using CSV module

CSV is one of the most common file formats for tabular data, used by users and programmers in almost every field.

The csv module reads and writes tabular CSV data. It provides reader and writer objects that let you read values from and write values into CSV files.

We can also use the DictReader and DictWriter classes to read and write rows as dictionaries.

Below is the link to the file.

 

Download numbers.txt                        Preview numbers.txt

Buttons 11.9. Click to download/preview text file.

 

You can do so like this:



import csv  # import the csv module

with open('qwerty.txt', newline='') as file:   # open qwerty.txt with the file handle "file"

    reader = csv.reader(file, delimiter=',')   # parses each line into a list of values
    counter = 0                                # number of lines read so far

    for row in reader:                         # row is the list of values on one line

        if counter == 0:
            print(' | '.join(row))             # header line: join all column names with " | "
        else:
            print(f'{row[0]} | {row[1]} | {row[2]}')  # data line: assumes at least three columns

        counter += 1

print(f'Lines we parsed through : {counter}')  # {} curly braces enclose the variable of interest
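The DictReader and DictWriter classes mentioned earlier can be sketched in the same spirit. The file name example.csv, the column names, and the rows below are all made up for this illustration, and the file is written to a temporary folder.

```python
import csv
from pathlib import Path
import tempfile

# Assumption: a throwaway CSV file with two hypothetical columns.
csv_path = Path(tempfile.mkdtemp()) / "example.csv"
rows = [
    {"name": "alpha", "value": "1"},
    {"name": "beta", "value": "2"},
]

# DictWriter writes each dictionary as one row, ordered by fieldnames.
with open(csv_path, "w", newline="") as file:
    writer = csv.DictWriter(file, fieldnames=["name", "value"])
    writer.writeheader()
    writer.writerows(rows)

# DictReader uses the header line as keys for every following row.
with open(csv_path, newline="") as file:
    reader = csv.DictReader(file)
    loaded = [dict(row) for row in reader]

print(loaded)
```

Values come back as strings, so numeric columns need an explicit conversion such as int() or float() after reading.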

 

Download write&read.py                        Preview write&read.py

Buttons 11.10. Click to download/preview python file.

 

 

 

 

 

 

 

---- Summary ----

By now you know how file handling in Python works. You can:

  • Read from a file.

  • Write into a file.

  • Manipulate CSVs.

  • Use the csv module.



Copyright © 2022-2023. Anoop Johny. All Rights Reserved.