Youtube Python Read Data From Text File

Reading and Writing Text Files

Overview

Teaching: 60 min
Exercises: xxx min

Questions

  • How can I read in information that is stored in a file or write data out to a file?

Objectives

  • Be able to open a file and read in the data stored in that file

  • Understand the difference between the file name, the opened file object, and the information read in from the file

  • Be able to write output to a text file with simple formatting

Why do we want to read and write files?

Being able to open and read in files allows us to work with larger data sets, where it wouldn't be possible to type in each and every value and store them one at a time as variables. Writing files allows us to process our data and then save the output to a file so we can look at it later.

Right now, we will practice working with a comma-delimited text file (.csv) that contains several columns of data. However, what you learn in this lesson can be applied to any general text file. In the next lesson, you will learn another way to read and process .csv data.

Paths to files

In order to open a file, we need to tell Python exactly where the file is located, relative to where Python is currently working (the working directory). In Spyder, we can do this by setting our current working directory to the folder where the file is located. Or, when we provide the file name, we can give a complete path to the file.
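
If you are not sure where Python is currently working, a short check like the one below can help. This is only a sketch: it assumes the standard os module and uses the practice file name from this lesson. It builds the complete path and reports whether a file is really there, without opening anything.

```python
import os

# Ask Python which folder it is currently working in
working_dir = os.getcwd()
print(working_dir)

# Build a complete path to a file inside that folder.
# The file does not need to exist for the path to be built.
filename = os.path.join(working_dir, 'Plates_output_simple.csv')
print(filename)

# Check whether the file is really there before trying to open it
print(os.path.exists(filename))
```

If the last line prints False, either the working directory is not the folder you expected or the file has not been copied there yet.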

Lesson Setup

We will work with the practice file Plates_output_simple.csv.

  1. Locate the file Plates_output_simple.csv in the directory home/Desktop/workshops/bash-git-python.
  2. Copy the file to your working directory, home/Desktop/workshops/YourName.
  3. Make sure that your working directory is also set to the folder home/Desktop/workshops/YourName.
  4. As you are working, make sure that you save your file opening script(s) to this directory.

The File Setup

Let's open and examine the structure of the file Plates_output_simple.csv. If you open the file in a text editor, you will see that the file contains several lines of text.

[Image: DataFileRaw, the file as it appears in a text editor]

However, this is fairly difficult to read. If you open the file in a spreadsheet program such as LibreOffice Calc or Excel, you can see that the file is organized into columns, with each column separated by the commas in the image above (hence the file extension .csv, which stands for comma-separated values).

[Image: DataFileColumns, the file as it appears in a spreadsheet program]

The file contains one header row, followed by eight rows of data. Each row represents a single plate image. If we look at the column headings, we can see that we have collected the following data for each plate:

  • The name of the image from which the data was collected
  • The plate number (there were 4 plates, with each plate imaged at two different time points)
  • The growth condition (either control or experimental)
  • The observation timepoint (either 24 or 48 hours)
  • Colony count for the plate
  • The average colony size for the plate
  • The percentage of the plate covered by bacterial colonies

We will read in this data file and then work to analyze the data.

Opening and reading files is a three-step process

We will open and read the file in three steps.

  1. We will create a variable to hold the name of the file that we want to open.
  2. We will call a function to open the file.
  3. We will call a function to actually read the data in the file and store it in a variable so that we can process it.

And then, there's one more step to do!

  • When we are done, we should remember to close the file!

You can think of these three steps as being similar to checking out a book from the library. First, you have to go to the catalog or database to find out which book you need (the filename). Then, you have to go and get it off the shelf and open the book up (the open function). Finally, to gain any information from the book, you have to read the words (the read function)!

Here is an example of opening, reading, and closing a file.

    #Create a variable for the file name
    filename = 'Plates_output_simple.csv' #This is just a string of text

    #Open the file
    infile = open(filename, 'r') # 'r' says we are opening the file to read, infile is the opened file object that we will read from

    #Store the data from the file in a variable
    data = infile.read()

    #Print the data in the file
    print(data)

    #close the file
    infile.close()

Once we have read the data in the file into our variable data, we can treat it like any other variable in our code.

Use consistent names to make your code clearer

It is a good idea to develop some consistent habits about the way you open and read files. Using the same (or similar!) variable names each time will make it easier for you to keep track of which variable is the name of the file, which variable is the opened file object, and which variable contains the read-in data.

In these examples, we will use filename for the text string containing the file name, infile for the open file object from which we can read in data, and data for the variable holding the contents of the file.
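
To see that these three names really refer to three different things, we can ask Python for the type of each one. The sketch below is an illustration only: the file name names_demo.txt and its contents are made up, and the file is written to the system's temporary folder so that nothing in your working directory is touched.

```python
import os
import tempfile

# Make a tiny practice file so this sketch runs on its own.
# The file name and its contents are made up for illustration.
filename = os.path.join(tempfile.gettempdir(), 'names_demo.txt')
outfile = open(filename, 'w')
outfile.write('first line\nsecond line\n')
outfile.close()

infile = open(filename, 'r')   # the opened file object
data = infile.read()           # the contents of the file
infile.close()

print(type(filename))   # a string holding the file's name
print(type(infile))     # a file object, not text
print(type(data))       # a string holding the file's contents
print(infile.closed)    # True, because we called infile.close()
```

Notice that filename and data are both strings, but they hold very different text: one is the name of the file and the other is what was inside it.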

Commands for reading in files

There are a variety of commands that allow us to read in data from files.
infile.read() will read in the entire file as a single string of text.
infile.readline() will read in one line at a time (each time you call this command, it reads in the next line).
infile.readlines() will read all of the lines into a list, where each line of the file is an item in the list.
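
The three commands can be compared side by side on a small file. The sketch below assumes a made-up three-line practice file (read_demo.txt, written to the temporary folder) and reopens the file before each command so that every read starts from the beginning.

```python
import os
import tempfile

# A made-up three-line practice file, created so the sketch is self-contained
filename = os.path.join(tempfile.gettempdir(), 'read_demo.txt')
outfile = open(filename, 'w')
outfile.write('line one\nline two\nline three\n')
outfile.close()

# read(): the whole file as one string
infile = open(filename, 'r')
data = infile.read()
infile.close()
print(repr(data))

# readline(): one line per call
infile = open(filename, 'r')
first = infile.readline()
second = infile.readline()
infile.close()
print(repr(first), repr(second))

# readlines(): a list with one item per line
infile = open(filename, 'r')
lines = infile.readlines()
infile.close()
print(lines)
```

Using repr() when printing makes the newline character at the end of each line visible as \n.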

Mixing these commands can have some unexpected results.

    #Create a variable for the file name
    filename = 'Plates_output_simple.csv'

    #Open the file
    infile = open(filename, 'r')

    #Print the first two lines of the file
    print(infile.readline())
    print(infile.readline())

    #call infile.read()
    print(infile.read())

    #close the file
    infile.close()

Notice that the infile.read() command started at the third line of the file, where the first two infile.readline() commands left off.

Think of it like this: when the file is opened, a pointer is placed at the top left corner of the file at the beginning of the first line. Any time a read function is called, the cursor or pointer advances from where it already is. The first infile.readline() started at the beginning of the file and advanced to the end of the first line. Now, the pointer is positioned at the beginning of the second line. The second infile.readline() advanced to the end of the second line of the file, and left the pointer positioned at the beginning of the third line. infile.read() began from this position, and advanced through to the end of the file.

In general, if you want to switch between the different kinds of read commands, you should close the file and then open it again to start over.
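
One consequence of the pointer is that reading a file twice without reopening it gives nothing the second time. A minimal sketch, again assuming a made-up practice file (pointer_demo.txt) in the temporary folder:

```python
import os
import tempfile

# A made-up three-line practice file so the sketch is self-contained
filename = os.path.join(tempfile.gettempdir(), 'pointer_demo.txt')
outfile = open(filename, 'w')
outfile.write('line one\nline two\nline three\n')
outfile.close()

infile = open(filename, 'r')
everything = infile.read()    # the pointer is now at the end of the file
leftover = infile.read()      # nothing is left to read, so this is an empty string
infile.close()

print(repr(leftover))

# Close and reopen to send the pointer back to the beginning
infile = open(filename, 'r')
first_line = infile.readline()
infile.close()

print(repr(first_line))
```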

Reading all of the lines of a file into a list

infile.readlines() will read all of the lines into a list, where each line of the file is an item in the list. This is extremely useful, because once we have read the file in this way, we can loop through each line of the file and process it. This approach works well on data files where the data is organized into columns similar to a spreadsheet, because it is likely that we will want to handle each line in the same way.

The example below demonstrates this approach:

    #Create a variable for the file name
    filename = "Plates_output_simple.csv"

    #Open the file
    infile = open(filename, 'r')

    lines = infile.readlines()

    for line in lines: #lines is a list with each item representing a line of the file
        if 'control' in line:
            print(line) #print lines for control condition

    infile.close() #close the file when you're done!

Using .split() to separate "columns"

Since our data is in a .csv file, we can use the split command to separate each line of the file into a list. This can be useful if we want to access specific columns of the file.

    #Create a variable for the file name
    filename = "Plates_output_simple.csv"

    #Open the file
    infile = open(filename, 'r')

    lines = infile.readlines()

    for line in lines:
        sline = line.split(',') # separates line into a list of items.  ',' tells it to split the lines at the commas
        print(sline) #each line is now a list

    infile.close() #Always close the file!

Consistent names, again

At first glance, the variable name sline in the example above may not make much sense. In fact, we chose it to be an abbreviation for "split line", which exactly describes the contents of the variable.

You don't have to use this naming convention if you don't want to, but you should work to use consistent variable names across your code for common operations like this. It will make it much easier to open an old script and quickly understand exactly what it is doing.

Converting text to numbers

When we called the readlines() command in the previous code, Python read in the contents of the file as strings. If we want our code to recognize something in the file as a number, we need to tell it this!

For example, float('5.0') will tell Python to treat the text string '5.0' as the number 5.0. int(sline[4]) will tell our code to treat the text string stored in the fifth position of the list sline as an integer (non-decimal) number.
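
The difference between a number and a string that looks like a number shows up as soon as we try to do arithmetic. The sketch below uses a single row, typed in by hand to match a row of this lesson's file after splitting; it does not read the file itself.

```python
# One row of the data after .split(','), typed in by hand for illustration
sline = ['colonies02.tif', '2', 'exp', '24', '84', '3.2', '22\n']

colonyCount = int(sline[4])       # the string '84' becomes the integer 84
avgColonySize = float(sline[5])   # the string '3.2' becomes the decimal number 3.2

print(colonyCount + 1)    # arithmetic on the number: 85
print(sline[4] + '1')     # "adding" strings just glues them together: 841

# int() only accepts text that looks like a whole number
try:
    int(sline[5])
except ValueError as error:
    print('Could not convert to an integer:', error)
```

For text with a decimal point such as '3.2', use float() instead of int().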

For each line in the file, the ColonyCount is stored in the 5th column (index 4 with our 0-based counting).
Change the code above to print the line only if the ColonyCount is greater than 30.

Solution

    #Create a variable for the file name
    filename = 'Plates_output_simple.csv'

    #Open the file
    infile = open(filename, 'r')

    lines = infile.readlines()

    for line in lines[1:]: #skip the first line, which is the header
        sline = line.split(',') # separates line into a list of items.  ',' tells it to split the lines at the commas
        colonyCount = int(sline[4]) #store the colony count for the line as an integer
        if colonyCount > 30:
            print(sline)

    #close the file
    infile.close()

Writing data out to a file

Often, we will want to write data to a new file. This is especially useful if we have done a lot of computations or data processing and we want to be able to save it and come back to it later.

Writing a file is the same multi-step process

Just like reading a file, we will open and write the file in multiple steps.

  1. Create a variable to hold the name of the file that we want to open. Often, this will be a new file that doesn't yet exist.
  2. Call a function to open the file. This time, we will specify that we are opening the file to write into it!
  3. Write the data into the file. This requires some careful attention to formatting.
  4. When we are done, we should remember to close the file!

The code below gives an example of writing to a file:

    filename = "output.txt"

    #w tells python we are opening the file to write into it
    outfile = open(filename, 'w')

    outfile.write("This is the first line of the file")
    outfile.write("This is the second line of the file")

    outfile.close() #Close the file when we're done!

Where did my file end up?

Any time you open a new file and write to it, the file will be saved in your current working directory, unless you specified a different path in the variable filename.
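
If you cannot find a file you wrote, you can ask Python to show the complete path that a bare file name refers to. A minimal sketch, assuming the standard os module; it only works out the path and does not create or change any file.

```python
import os

filename = 'output.txt'

# A bare file name is interpreted relative to the current working directory
full_path = os.path.abspath(filename)

print(os.getcwd())   # the folder Python is working in
print(full_path)     # where 'output.txt' would be written
```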

Newline characters

When you examine the file you just wrote, you will see that all of the text is on the same line! This is because we must tell Python when to start on a new line by using the special string character '\n'. This newline character will tell Python exactly where to start each new line.

The example below demonstrates how to use newline characters:

    filename = 'output_newlines.txt'

    #w tells python we are opening the file to write into it
    outfile = open(filename, 'w')

    outfile.write("This is the first line of the file\n")
    outfile.write("This is the second line of the file\n")

    outfile.close() #Close the file when we're done!

Go open the file you just wrote and check that the lines are spaced correctly.

Dealing with newline characters when you read a file

You may have noticed in the last file reading example that the printed output included newline characters at the end of each line of the file:

['colonies02.tif', '2', 'exp', '24', '84', '3.2', '22\n']
['colonies03.tif', '3', 'exp', '24', '792', '3', '78\n']
['colonies06.tif', '2', 'exp', '48', '85', '5.2', '46\n']

We can remove these newlines with the .strip() function, which strips whitespace (including newline characters) from both ends of a string:

    #Create a variable for the file name
    filename = 'Plates_output_simple.csv'

    #Open the file
    infile = open(filename, 'r')

    lines = infile.readlines()

    for line in lines[1:]: #skip the first line, which is the header
        sline = line.strip() #get rid of trailing newline characters at the end of the line
        sline = sline.split(',') # separates line into a list of items.  ',' tells it to split the lines at the commas
        colonyCount = int(sline[4]) #store the colony count for the line as an integer
        if colonyCount > 30:
            print(sline)

    #close the file
    infile.close()
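
It can help to try .strip() on a single string before using it in a loop. The sketch below uses one line of text typed in by hand (rebuilt from a row of this lesson's file) plus a made-up padded string, and also shows .rstrip('\n'), which removes only newline characters from the right-hand end.

```python
# One line of the file as readlines() would return it, typed in by hand
line = 'colonies02.tif,2,exp,24,84,3.2,22\n'

stripped = line.strip()
print(repr(stripped))          # the trailing newline is gone

sline = stripped.split(',')
print(sline)                   # the last item is now '22' rather than '22\n'

# strip() removes spaces and tabs too, from both ends
padded = '  22 \n'
print(repr(padded.strip()))        # '22'
print(repr(padded.rstrip('\n')))   # '  22 ' (only the newline is removed)
```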

Writing numbers to files

Just as Python automatically reads files in as strings, the write() function expects to write only strings. If we want to write numbers to a file, we will need to "cast" them as strings using the function str().

The code below shows an example of this:

    numbers = range(0, 10)

    filename = "output_numbers.txt"

    #w tells python we are opening the file to write into it
    outfile = open(filename, 'w')

    for number in numbers:
        outfile.write(str(number))

    outfile.close() #Close the file when we're done!
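
To see why str() is needed, we can try passing a number straight to write() and catch the error Python raises. This is only a sketch: the file name write_number_demo.txt is made up, and the file is written to the temporary folder so that your own output files are left alone.

```python
import os
import tempfile

# A made-up file name in the system's temporary folder
filename = os.path.join(tempfile.gettempdir(), 'write_number_demo.txt')
outfile = open(filename, 'w')

number = 5
try:
    outfile.write(number)        # passing a number directly fails
except TypeError as error:
    print('write() refused the number:', error)

outfile.write(str(number))       # casting to a string first works
outfile.close()

# Read the file back to confirm what was written
infile = open(filename, 'r')
data = infile.read()
infile.close()
print(repr(data))
```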

Writing new lines and numbers

Go open and examine the file you just wrote. You will see that all of the numbers are written on the same line.

Modify the code to write each number on its own line.

Solution

    numbers = range(0, 10) #Create the range of numbers

    filename = "output_numbers.txt" #provide the file name

    #open the file in 'write' mode
    outfile = open(filename, 'w')

    for number in numbers:
        outfile.write(str(number) + '\n')

    outfile.close() #Close the file when we're done!

The file you just wrote should be saved in your working directory. Open the file and check that the output is correctly formatted with one number on each line.

Opening files in different 'modes'

When we have opened files to read or write data, we have used the function parameter 'r' or 'w' to specify which "mode" to open the file in.
'r' indicates we are opening the file to read data from it.
'w' indicates we are opening the file to write data into it.

Be very, very careful when opening an existing file in 'w' mode.
'w' will overwrite any data that is already in the file! The overwritten data will be lost!

If you want to add on to what is already in the file (instead of erasing and overwriting it), you can open the file in append mode by using the 'a' parameter instead.
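
The difference between 'w' and 'a' is easiest to see by writing to the same file several times. The sketch below assumes a made-up file (append_demo.txt) in the temporary folder and a small helper, read_back, that exists only for this illustration.

```python
import os
import tempfile

# A made-up file name in the system's temporary folder
filename = os.path.join(tempfile.gettempdir(), 'append_demo.txt')

def read_back(name):
    """Hypothetical helper for this sketch: return the whole file as a string."""
    infile = open(name, 'r')
    data = infile.read()
    infile.close()
    return data

# 'w' creates the file (or empties it if it already exists)
outfile = open(filename, 'w')
outfile.write('first run\n')
outfile.close()

# 'a' keeps what is there and adds to the end
outfile = open(filename, 'a')
outfile.write('second run\n')
outfile.close()
after_append = read_back(filename)
print(repr(after_append))

# Opening in 'w' mode again erases both earlier lines
outfile = open(filename, 'w')
outfile.write('overwritten\n')
outfile.close()
after_overwrite = read_back(filename)
print(repr(after_overwrite))
```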

Pulling it all together

Read in the data from the file Plates_output_simple.csv that we have been working with. Write a new csv-formatted file that contains only the rows for control plates.
You will need to do the following steps:

  1. Open the file.
  2. Use .readlines() to create a list of lines in the file. Then close the file!
  3. Open a file to write your output into.
  4. Write the header line of the output file.
  5. Use a for loop to allow you to loop through each line in the list of lines from the input file.
  6. For each line, check if the growth condition was experimental or control.
  7. For the control lines, write the line of data to the output file.
  8. Close the output file when you're done!

Solution

Here's one way to do it:

    #Create a variable for the file name
    filename = 'Plates_output_simple.csv'

    #Open the file
    infile = open(filename, 'r')

    lines = infile.readlines() #We will process the lines of the file later

    #close the input file
    infile.close()

    #Create the file we will write to
    filename = 'ControlPlatesData.txt'
    outfile = open(filename, 'w')

    outfile.write(lines[0]) #This will write the header line of the file

    for line in lines[1:]: #skip the first line, which is the header
        sline = line.split(',') # separates line into a list of items.  ',' tells it to split the lines at the commas
        condition = sline[2] #store the condition for the line as a string
        if condition == "control":
            outfile.write(line) #The variable line is already formatted correctly!

    outfile.close() #Close the file when we're done!

Challenge Problem

Open and read in the data from Plates_output_simple.csv. Write a new csv-formatted file that contains only the rows for the control condition and includes only the columns for Time, colonyCount, avgColonySize, and percentColonyArea. Hint: you can use the .join() function to join a list of items into a string.

    names = ['Erin', 'Mark', 'Tessa']
    nameString = ', '.join(names) #the ', ' tells Python to join the list with each item separated by a comma + space
    print(nameString)

Erin, Mark, Tessa

Solution

    #Create a variable for the input file name
    filename = 'Plates_output_simple.csv'

    #Open the file
    infile = open(filename, 'r')

    lines = infile.readlines() #We will process the lines of the file later

    #close the file
    infile.close()

    #Create the file we will write to
    filename = 'ControlPlatesData_Reduced.txt'
    outfile = open(filename, 'w')

    #Write the header line
    headerList = lines[0].split(',')[3:] #This will return the list of column headers from 'time' on
    headerString = ','.join(headerList) #join the items in the list with commas
    outfile.write(headerString) #There is already a newline at the end, so no need to add one

    #Write the remaining lines
    for line in lines[1:]: #skip the first line, which is the header
        sline = line.split(',') # separates line into a list of items.  ',' tells it to split the lines at the commas
        condition = sline[2] #store the condition for the line as a string
        if condition == "control":
            dataList = sline[3:]
            dataString = ','.join(dataList)
            outfile.write(dataString) #The last item still ends with a newline, so no need to add one

    outfile.close() #Close the file when we're done!

Key Points

  • Opening and reading a file is a multistep process: defining the filename, opening the file, and reading the data

  • Data stored in files can be read in using a variety of commands

  • Writing data to a file requires attention to data types and formatting that isn't necessary with a print() statement


Source: https://eldoyle.github.io/PythonIntro/08-ReadingandWritingTextFiles/
