pandas Basics - Sorting Methods; Setting a New Index; Locating Observations/Values
February 14, 2025
nba DataFrame
Let’s read nba.csv as nba:
# Below is to import the pandas library as pd
import pandas as pd
# Below is for an interactive display of DataFrame in Colab
from google.colab import data_table
data_table.enable_dataframe_formatter()
# Below is to read nba.csv as nba DataFrame
nba = pd.read_csv("https://bcdanl.github.io/data/nba.csv",
                  parse_dates = ["Birthday"])

sort_values()
The sort_values() method’s first parameter, by, accepts the variables that pandas should use to sort observations.
The sort_values() method’s ascending parameter determines the sort order.
ascending has a default argument of True.
DataFrame has various methods that modify the existing DataFrame.

nsmallest() and nlargest()
nsmallest() is useful to get the first n observations ordered by a variable in ascending order.
nlargest() is useful to get the first n observations ordered by a variable in descending order.
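As a minimal sketch of nsmallest() and nlargest(), the example below uses a small made-up DataFrame called toy (hypothetical data, not taken from nba.csv) so that it runs without a network connection. The last call also passes keep="all", which retains every row tied with the selected values.

```python
import pandas as pd

# Hypothetical toy data for illustration only (not from nba.csv)
toy = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C", "Player D"],
    "Salary": [300, 100, 300, 200],
})

# Two observations with the smallest Salary, in ascending order
lowest_two = toy.nsmallest(2, "Salary")

# One observation with the largest Salary (the first of the tied rows)
highest_one = toy.nlargest(1, "Salary")

# keep="all" returns every row tied for the largest Salary
highest_ties = toy.nlargest(1, "Salary", keep="all")

print(lowest_two)
print(highest_one)
print(highest_ties)
```

With the real data, the same calls would be made on nba with a column such as "Salary", assuming that column exists in the file.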
In nsmallest() and nlargest(), keep = "all" keeps all duplicates, even if it means selecting more than n observations.

sort_values()
We can sort a DataFrame by multiple columns by passing a list to the by parameter.
We can pass a single Boolean to the ascending parameter to apply the same sort order to each variable.
To sort each variable in a different order, we can pass a list of Booleans to the ascending parameter.
Q. Which players on each team are paid the most?
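One possible way to approach the question above is to sort by team in ascending order and by salary in descending order. The sketch below illustrates the by and ascending options on a small made-up DataFrame called toy; the team names and salary values are hypothetical and only stand in for the columns of nba.

```python
import pandas as pd

# Hypothetical toy data standing in for nba (values are made up)
toy = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C", "Player D"],
    "Team": ["Hawks", "Celtics", "Hawks", "Celtics"],
    "Salary": [300, 100, 500, 200],
})

# Single variable, descending order
by_salary_desc = toy.sort_values(by="Salary", ascending=False)

# Multiple variables, same (ascending) order for each
by_team_salary = toy.sort_values(by=["Team", "Salary"])

# Multiple variables, a different order for each:
# Team from A to Z, then Salary from highest to lowest within each team
top_paid_first = toy.sort_values(by=["Team", "Salary"],
                                 ascending=[True, False])

print(top_paid_first)
```

In top_paid_first, the first row within each team is that team's highest-paid player in the toy data.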
sort_index()
If we reassign nba to the nba DataFrame sorted by “Name”, how can we return it to its original form?
The nba DataFrame still has its numeric index labels.
sort_index() sorts observations by their index labels (row names).
The sort_index() method can also be used to put the variables (columns) in alphabetical order.
To do so, we need to add the axis parameter and pass it an argument of "columns" or 1.

We use the set_index() method when we want to change the current index of a DataFrame to one or more existing columns.
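Before changing the index with set_index(), the sort_index() behavior described above can be sketched on a small made-up DataFrame called toy (hypothetical data, not from nba.csv). The example sorts by a column, restores the original row order from the numeric index labels, and then sorts the columns alphabetically.

```python
import pandas as pd

# Hypothetical toy data for illustration only
toy = pd.DataFrame({
    "Name": ["Player C", "Player A", "Player B"],
    "Team": ["Hawks", "Celtics", "Hawks"],
    "Salary": [500, 300, 100],
})

# Sorting by Name reorders the rows but keeps the numeric index labels
sorted_by_name = toy.sort_values("Name")

# Sorting by index labels brings the rows back to their original order
restored = sorted_by_name.sort_index()

# axis="columns" (or axis=1) sorts the variables alphabetically
cols_sorted = toy.sort_index(axis="columns")

print(sorted_by_name.index.tolist())
print(cols_sorted.columns.tolist())
```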
set_index()
The set_index() method returns a new DataFrame with a given column set as the index.
Its first parameter, keys, accepts the column name.

reset_index()
The reset_index() method moves the current index into a regular DataFrame column and replaces it with the default numeric index.
With inplace=True, the operation alters the original DataFrame directly instead of returning a new one.

We can extract observations, variables, and values from a DataFrame by using the loc[] and iloc[] accessors.
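Before moving on to loc[] and iloc[], here is a minimal sketch of set_index(), reset_index(), and inplace=True. It uses a small made-up DataFrame called toy (hypothetical data, not from nba.csv).

```python
import pandas as pd

# Hypothetical toy data for illustration only
toy = pd.DataFrame({
    "Name": ["Player A", "Player B"],
    "Salary": [300, 100],
})

# set_index() returns a new DataFrame; toy itself is unchanged
indexed = toy.set_index("Name")

# reset_index() turns the Name index back into a column
# and restores the default numeric index 0, 1, ...
restored = indexed.reset_index()

# With inplace=True the DataFrame is altered directly and None is returned
toy_copy = toy.copy()
toy_copy.set_index("Name", inplace=True)

print(indexed)
print(restored)
```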
.loc[Index Labels]
Let’s consider nba with the Name index.
# The two lines below are equivalent; run only one of them,
# because "Name" is no longer a column after the first call
nba = nba.set_index("Name")
# nba.set_index("Name", inplace = True)
The .loc attribute extracts an observation by index label (row name).
We can use slicing syntax with loc[:] to pull rows from a given index label in the DataFrame to its end, or from the beginning of the DataFrame to a specific index label.

.iloc[Index Positions]
The .iloc (index location) attribute locates rows by index position.
.iloc[:] is similar to the slicing syntax with strings/lists.
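The row selections described above can be sketched as follows, using a small made-up DataFrame called toy with Name as its index (hypothetical data, not from nba.csv). One difference worth noting: a .loc slice includes its end label, while an .iloc slice excludes its end position, as with strings and lists.

```python
import pandas as pd

# Hypothetical toy data for illustration only, indexed by Name
toy = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C", "Player D"],
    "Team": ["Hawks", "Celtics", "Hawks", "Celtics"],
    "Salary": [300, 100, 500, 200],
}).set_index("Name")

# .loc: select by index label
one_row = toy.loc["Player B"]                  # a single observation (Series)
several = toy.loc[["Player D", "Player A"]]    # several labels, in the given order
from_c = toy.loc["Player C":]                  # from a label to the end
up_to_b = toy.loc[:"Player B"]                 # from the start to a label (inclusive)

# .iloc: select by index position
first_row = toy.iloc[0]                        # first observation
first_two = toy.iloc[:2]                       # positions 0 and 1 (2 is excluded)
last_row = toy.iloc[-1]                        # last observation

print(up_to_b)
print(first_two)
```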
Let’s do Questions 4-7 in Classwork 5!
loc[Rows, Columns] or iloc[Rows, Columns]
Both the .loc and .iloc attributes accept a second argument representing the column(s) to extract.
If we are using .loc, we have to provide the column names.
If we are using .iloc, we have to provide the column positions.
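As a short sketch of the two-argument form, the example below selects rows and columns together on a small made-up DataFrame called toy with Name as its index (hypothetical data, not from nba.csv).

```python
import pandas as pd

# Hypothetical toy data for illustration only, indexed by Name
toy = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C"],
    "Team": ["Hawks", "Celtics", "Hawks"],
    "Salary": [300, 100, 500],
}).set_index("Name")

# .loc[Rows, Columns]: row labels and column names
team_of_b = toy.loc["Player B", "Team"]
subset = toy.loc[["Player A", "Player C"], ["Team", "Salary"]]

# .iloc[Rows, Columns]: row positions and column positions
same_value = toy.iloc[1, 0]     # second row, first column
block = toy.iloc[:2, :1]        # first two rows, first column only

print(team_of_b)
print(subset)
print(block)
```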