pandas Basics - Joining DataFrames; Concatenating DataFrames
February 26, 2025
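Joining DataFrames
Data for an analysis is often spread across multiple DataFrames: for example, one DataFrame for county-level data and another DataFrame for geographic information, such as longitude and latitude.
We often need to join DataFrames based on common data values in those DataFrames.
This can be done with the merge() method in Pandas.
The variables used to connect a pair of DataFrames are called keys.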
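As a minimal sketch of joining on a key, the example below builds two small DataFrames by hand. The names county_df and geo_df, the fips key column, and every value are made up for illustration; they are not the course data.

```python
import pandas as pd

# Hypothetical county-level data (illustrative values only)
county_df = pd.DataFrame({
    "fips": [1, 2, 3],
    "population": [1000, 2500, 400],
})

# Hypothetical geographic information for some of the same counties
geo_df = pd.DataFrame({
    "fips": [1, 2, 4],
    "longitude": [-77.8, -76.1, -73.9],
    "latitude": [42.8, 43.0, 40.7],
})

# "fips" is the key: merge() matches rows whose key values are equal.
# With no `how` argument, merge() performs an inner join, so only keys
# present in both DataFrames (1 and 2) appear in the result.
joined = county_df.merge(geo_df, on="fips")
print(joined)
```

Keys 3 and 4 each appear in only one of the two DataFrames, so they are dropped by the default inner join.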
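Joining DataFrames with merge()
Below, x denotes the left DataFrame and y the right DataFrame.
A left join keeps all observations in x.
A right join keeps all observations in y.
A full (outer) join keeps all observations in x and y.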
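The join types can be compared side by side through the how parameter of merge(). This is a sketch on toy data: the column names key, val_x, and val_y are assumptions chosen for illustration.

```python
import pandas as pd

# x is the left DataFrame, y is the right DataFrame (toy data)
x = pd.DataFrame({"key": [1, 2, 3], "val_x": ["x1", "x2", "x3"]})
y = pd.DataFrame({"key": [1, 2, 4], "val_y": ["y1", "y2", "y4"]})

left = x.merge(y, on="key", how="left")    # every key in x: 1, 2, 3
right = x.merge(y, on="key", how="right")  # every key in y: 1, 2, 4
full = x.merge(y, on="key", how="outer")   # every key in x or y: 1, 2, 3, 4

# Rows without a match on the other side are filled with NaN
print(left)
print(right)
print(full)
```

In the left join, key 3 has no partner in y, so its val_y is NaN; the right join does the same for key 4 and val_x.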
One DataFrame has duplicate keys (a one-to-many relationship).
Both DataFrames have duplicate keys (a many-to-many relationship).
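To see what duplicate keys do to the number of rows, here is a small sketch. The DataFrames x, y, and y_dup and their values are hypothetical.

```python
import pandas as pd

# Key 2 appears twice in x
x = pd.DataFrame({"key": [1, 2, 2], "val_x": ["x1", "x2", "x3"]})

# One-to-many: y has each key once, so the y row for key 2
# is repeated for both matching rows of x
y = pd.DataFrame({"key": [1, 2], "val_y": ["y1", "y2"]})
one_to_many = x.merge(y, on="key", how="left")
print(len(one_to_many))  # 3

# Many-to-many: key 2 is duplicated on both sides, so every x row
# with key 2 is paired with every y_dup row with key 2 (2 * 2 rows)
y_dup = pd.DataFrame({"key": [2, 2], "val_y": ["y2a", "y2b"]})
many_to_many = x.merge(y_dup, on="key", how="inner")
print(len(many_to_many))  # 4
```

A many-to-many join multiplies rows, so it is worth checking the row count of the result against what you expect.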
If the key columns have different names in the two DataFrames, use the left_on and right_on parameters instead of on.
Let’s do Part 2 of Classwork 7!
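Returning to the left_on and right_on parameters, the sketch below joins two DataFrames whose key columns are named differently. The column names county_id and fips and the values are assumptions for illustration.

```python
import pandas as pd

# The key is called "county_id" in x but "fips" in y (hypothetical names)
x = pd.DataFrame({"county_id": [1, 2, 3], "population": [1000, 2500, 400]})
y = pd.DataFrame({"fips": [1, 2, 4], "latitude": [42.8, 43.0, 40.7]})

# left_on names the key in the left DataFrame, right_on the key in the right
joined = x.merge(y, left_on="county_id", right_on="fips", how="left")

# Both key columns are kept in the result; "fips" is NaN where
# the left key (county_id 3) has no match in y
print(joined)
```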
Concatenating DataFrames
Concatenation combines DataFrames by adding rows or columns. This method is useful:
import pandas as pd

df1 = pd.read_csv('https://bcdanl.github.io/data/concat_1.csv')
df2 = pd.read_csv('https://bcdanl.github.io/data/concat_2.csv')
df3 = pd.read_csv('https://bcdanl.github.io/data/concat_3.csv')
We will use .index and .columns in this section.
Stacking DataFrames on top of each other uses the concat() method.
The DataFrames to be concatenated are passed in a list.
pd.concat([DataFrame_1, DataFrame_2, ... , DataFrame_N])
Concatenating columns uses the axis parameter in the concat() method.
The default value of axis is 0 (or axis = "index"), so it will concatenate data in a row-wise fashion.
If we pass axis = 1 (or axis = "columns") to the function, it will concatenate data in a column-wise manner.
pd.concat([df1, df2, df3], axis = "columns")
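The effect of the axis parameter is easiest to see on data small enough to read at a glance. The sketch below uses two hand-built DataFrames in place of the CSV files; the names top and bottom and the column names A and B are assumptions.

```python
import pandas as pd

# Small stand-ins for the course DataFrames (hypothetical values)
top = pd.DataFrame({"A": ["a0", "a1"], "B": ["b0", "b1"]})
bottom = pd.DataFrame({"A": ["a2", "a3"], "B": ["b2", "b3"]})

# Default axis=0 (or axis="index"): stack row-wise
rows = pd.concat([top, bottom])
print(rows.shape)             # (4, 2)
print(rows.index.tolist())    # [0, 1, 0, 1]  original row labels are kept

# axis=1 (or axis="columns"): place side by side
cols = pd.concat([top, bottom], axis="columns")
print(cols.shape)             # (2, 4)
print(cols.columns.tolist())  # ['A', 'B', 'A', 'B']  labels repeat
```

Inspecting .index and .columns after each call shows that concat() keeps the original labels, which is why duplicates appear along the concatenation axis.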
pd.concat([df1, df2, df3], axis = "columns", ignore_index = True)
Pass ignore_index=True to reset the column indices.
Let’s create a new Series and concatenate it with df1:
# create a new row of data
new_row_series = pd.Series(['n1', 'n2', 'n3', 'n4'])
# attempt to add the new row to a DataFrame
df = pd.concat([df1, new_row_series])
To add the values as a row, first turn the Series into a DataFrame.
This DataFrame contains one row of data, and the column names are the ones the data will bind to.
pd.concat([df1, new_row_series], axis = 1)
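The following sketch shows why the bare Series does not land as a row and one way to turn it into a one-row DataFrame. The stand-in df1 and its column names A to D are assumptions (the real df1 comes from concat_1.csv), and new_row_df and fixed are hypothetical names.

```python
import pandas as pd

# Stand-in for df1; column names A-D are assumed for illustration
df1 = pd.DataFrame(
    [["a0", "b0", "c0", "d0"], ["a1", "b1", "c1", "d1"]],
    columns=["A", "B", "C", "D"],
)
new_row_series = pd.Series(["n1", "n2", "n3", "n4"])

# Row-wise concat with the bare Series: the Series has no column names
# to align on, so its values go into a new column labeled 0 and the
# other cells are filled with NaN
attempt = pd.concat([df1, new_row_series])
print(attempt.shape)  # (6, 5)

# Build a one-row DataFrame whose column names match df1
new_row_df = pd.DataFrame([new_row_series.tolist()], columns=df1.columns)

# ignore_index=True renumbers the rows 0, 1, 2 after stacking
fixed = pd.concat([df1, new_row_df], ignore_index=True)
print(fixed.shape)  # (3, 4)
```

Wrapping the values in an outer list is what makes them a single row; passing the flat list would instead create one column with four rows.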
Let’s do Part 3 of Classwork 7!