In this tutorial you will learn how to build a full Pandas DataFrame from a for loop.
Steps
- Import the pandas package.
- Iterate over the sequence (list, dictionary, etc.) of values that you want in your DataFrame.
- Add each iteration's value as a row of the DataFrame, placing it in its respective column.
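The steps above can be sketched as follows. This is only a minimal illustration: the `temperatures` dictionary and the column names `city` and `temp_c` are made up for the example and do not come from any real dataset.

```python
import pandas as pd

# Hypothetical input data: city names mapped to temperatures
temperatures = {"Oslo": 4, "Cairo": 31, "Lima": 19}

# Collect one row per loop iteration
rows = []
for city, temp in temperatures.items():
    rows.append([city, temp])

# Build the DataFrame once, naming each column
df = pd.DataFrame(rows, columns=["city", "temp_c"])
print(df)
```

Collecting the rows in a plain list and calling the constructor once keeps the loop simple and avoids resizing the DataFrame on every iteration.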
Table of Contents
- Introduction to Pandas DataFrame
- How does a for loop in Python work?
- Building a Pandas DataFrame from a for loop
- Conclusion
Introduction to Pandas DataFrame
A DataFrame is a two-dimensional, labeled data structure with rows and columns. It can also be thought of as a collection of one or more Series that share the same index, with each Series forming one column. We must import pandas in order to construct a DataFrame.
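To make the "collection of Series" idea concrete, here is a small sketch that builds a DataFrame from two Series. The variable names and values (`names`, `ages`, `people`) are invented for illustration.

```python
import pandas as pd

# Two Series that share the same default index (hypothetical example data)
names = pd.Series(["Ann", "Bob", "Cy"])
ages = pd.Series([31, 25, 40])

# Each Series becomes one column of the DataFrame
people = pd.DataFrame({"name": names, "age": ages})
print(people.shape)  # (3, 2): three rows, two columns
```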
The pd.DataFrame() constructor can be used to build DataFrames. All of its parameters are optional; the most commonly used are data, index, and columns. The first, data, is the information that needs to be entered into the DataFrame table.
Let us create a simple pandas DataFrame using the DataFrame constructor to get familiar with it.
# import pandas
import pandas as pd
# creating empty DataFrame
df = pd.DataFrame()
# printing the DataFrame
print(df)
Output:
Empty DataFrame
Columns: []
Index: []
As you can see, the above code has created an empty pandas DataFrame. Now, let us create a DataFrame by adding some data.
First, we will create a list of numbers as data points and then will define the column name while initializing the DataFrame as shown below:
# initialize data points
data_points = [1, 2, 4, 5, 3, 6, 7]
# initializing DataFrame and giving the column name
df = pd.DataFrame(data_points, columns=['values'])
print(df)
Output:
   values
0       1
1       2
2       4
3       5
4       3
5       6
6       7
As you can see, this time we were able to add data to the pandas DataFrame. You can confirm that the object is a DataFrame by printing its type with the following command.
# printing the type
print(type(df))
Output:
<class 'pandas.core.frame.DataFrame'>
As you can see, the type is pandas DataFrame.
It is easy to create a small DataFrame by manually adding elements or data points as we did in the above example, but if we need a large dataset, then adding elements manually will take a lot of time.
In such cases, we can use a for loop to generate the elements for the DataFrame.
How does a for loop in Python work?
A for loop in Python steps through the items of an iterable, one at a time, until none are left. For example, you can iterate through the items in a list or the characters in a string. The for loop’s syntax is “for item in iterable:”, where “iterable” refers to the object you want to loop through.
It can also be used to repeat a piece of code once for each value in a specified range. The following diagram shows how a Python for loop works.
Each time the for loop runs, it checks whether the iterable has another item. If it does, the code inside the loop is executed with that item; once the last item has been processed, the loop ends.
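The check described above can be written out by hand with the built-in `iter()` and `next()` functions. This is a sketch of roughly what a for loop does behind the scenes, not something you would normally write; the `letters` list is a made-up example.

```python
# What a for loop does behind the scenes, written out by hand
letters = ["a", "b", "c"]  # hypothetical example list
seen = []

iterator = iter(letters)       # ask the iterable for an iterator
while True:
    try:
        item = next(iterator)  # fetch the next item
    except StopIteration:      # no items left: the loop ends
        break
    seen.append(item)          # the "loop body"

print(seen)  # ['a', 'b', 'c']
```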
Let us take an example of a Python for loop that iterates over range(1, 10) and prints each number.
# for loop to print numbers
for i in range(1, 10):
    print(i)
Output:
1
2
3
4
5
6
7
8
9
As you can see, although we defined the range from 1 to 10, the number 10 was not printed. This is because range(1, 10) includes the start value but stops just before the stop value, so the loop runs for 1 through 9.
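If you do want the number 10 included, the stop value has to be one larger. A short sketch (the `numbers` list is just for illustration):

```python
# range(start, stop) yields start, start + 1, ..., stop - 1
numbers = []
for i in range(1, 11):  # stop is 11, so the last value is 10
    numbers.append(i)

print(numbers)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```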
Building a Pandas DataFrame from a for loop
Now that we are familiar with the pandas DataFrame and have learned how the for loop works in Python, it is time to jump into the actual purpose of this article and build a pandas DataFrame from a for loop.
We will use the following different methods to build a pandas DataFrame from a for loop.
- For loop to create a list
- List comprehension
- Adding elements to pandas DataFrame using for loop
Example-1: For loop to create list
As we saw in the earlier example, a Python list can easily be converted into a pandas DataFrame, and we are going to do the same thing in this section.
But instead of creating the list manually, we will use a Python for loop to create it, and then use pandas to convert the list into a DataFrame.
# creating a list for the rows
rows = []
# using for loop to append values in rows
for i in range(10):
    # appending the values to the row
    rows.append(i)
# converting the list into DataFrame
df = pd.DataFrame(rows, columns=["column_1"])
print(df)
Output:
   column_1
0         0
1         1
2         2
3         3
4         4
5         5
6         6
7         7
8         8
9         9
As you can see, we were able to create a pandas DataFrame using a Python for loop. We first defined a for loop that appends values to an empty list, and then converted that list into a pandas DataFrame.
But notice that the above code can only create a pandas DataFrame with one column. We can modify this code to create a DataFrame with multiple columns as well. Let us now create a DataFrame with multiple columns.
# creating a list for the rows
rows = []
# using for loop to append values in rows
for i in range(10):
    # appending the values to the row
    rows.append([i + 1, i + 1, i])
# converting the list into DataFrame
df = pd.DataFrame(rows, columns=["column_1", "column_2", "column_3"])
print(df)
Output:
   column_1  column_2  column_3
0         1         1         0
1         2         2         1
2         3         3         2
3         4         4         3
4         5         5         4
5         6         6         5
6         7         7         6
7         8         8         7
8         9         9         8
9        10        10         9
As shown above, we were able to create a pandas DataFrame with multiple columns using a for loop.
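A variation on the same idea, offered here only as a sketch, is to append one dictionary per row instead of a list. The keys then become the column names, so no separate columns argument is needed. The column names `n`, `square`, and `is_even` are hypothetical.

```python
import pandas as pd

records = []
for i in range(3):
    # Each dictionary is one row; its keys become the column names
    records.append({"n": i, "square": i * i, "is_even": i % 2 == 0})

df = pd.DataFrame(records)
print(df)
```

This form can be easier to read when there are many columns, because every value sits next to its column name.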
Example-2: Using list comprehensions
A Python list comprehension consists of brackets containing an expression followed by a for clause; the expression is evaluated once for each element that the for clause iterates over.
List comprehensions provide a much shorter syntax for creating a new list based on the values of an existing iterable. To understand this better, let us first create a list using a list comprehension.
# list comprehension method
L = [i for i in range(10)]
# printing list
print(L)
Output:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
As you can see, the above code has created a list. Now, similar to the above example, we can convert this list into a pandas DataFrame.
# list comprehension method
L = [i for i in range(10)]
# converting the list into DataFrame
df = pd.DataFrame(L, columns=["column_1"])
print(df)
Output:
   column_1
0         0
1         1
2         2
3         3
4         4
5         5
6         6
7         7
8         8
9         9
As you can see, we have created a pandas DataFrame using the list comprehension method. The advantage of using list comprehension is that we use fewer lines of code.
Now, let us also create a DataFrame with multiple columns.
# list comprehension method
L = [[i, i*i] for i in range(10)]
# converting the list into DataFrame
df = pd.DataFrame(L, columns=["column_1", "column_2"])
print(df)
Output:
   column_1  column_2
0         0         0
1         1         1
2         2         4
3         3         9
4         4        16
5         5        25
6         6        36
7         7        49
8         8        64
9         9        81
The above output shows the DataFrame created using the list comprehension method.
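A list comprehension can also filter while it builds the rows, which is handy when the new list is based on an existing one. The sketch below assumes a made-up `words` list and keeps only the words longer than three letters.

```python
import pandas as pd

words = ["pandas", "loop", "row", "column"]  # hypothetical input list

# Keep only words longer than three letters, paired with their length
rows = [[w, len(w)] for w in words if len(w) > 3]

df = pd.DataFrame(rows, columns=["word", "length"])
print(df)
```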
Example-3: Adding elements to Pandas DataFrame using for loop
In the above examples, we first created a list using a for loop and then converted that list into a pandas DataFrame. Now we are going to do the opposite.
We will first create an empty pandas DataFrame and then use a for loop to fill it with values, one column per iteration.
# creating a DataFrame
df = pd.DataFrame(columns=["Column_1", "column_2", "column_3"])
# using for loop to iterate
for i in range(3):
    column = df.columns[i]
    df[column] = [i*i, i+1, i]
# printing
print(df)
Output:
   Column_1  column_2  column_3
0         0         1         4
1         1         2         3
2         0         1         2
As you can see, we first created an empty DataFrame and then added values to it using a for loop. Note that each iteration here fills an entire column rather than adding a row.
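If you would rather add one row per iteration, one possible approach is to assign to `df.loc` using the next unused row label. This is a sketch under the assumption that the DataFrame keeps its default integer index; the column names `n` and `square` are hypothetical.

```python
import pandas as pd

# Start with an empty DataFrame that only has column names
df = pd.DataFrame(columns=["n", "square"])

for i in range(3):
    # len(df) is the next unused row label of the default index
    df.loc[len(df)] = [i, i * i]

print(df)
```

Growing a DataFrame row by row is generally slower than collecting the rows in a list and building the DataFrame once, so this pattern is best kept for small loops.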
Conclusion
A pandas DataFrame is a two-dimensional data structure, like a 2-dimensional array or a table with rows and columns.
We can use the pd.DataFrame() constructor from the pandas module to create a pandas DataFrame. In this short article, we learned how to create a pandas DataFrame from a for loop in Python using various methods.
Tanner Abraham
Data Scientist and Software Engineer with a focus on experimental projects in new budding technologies that incorporate machine learning and quantum computing into web applications.