pandas Basics - Loading Data
February 10, 2025
danl-210-lec-08-2025-0210.ipynb
Remove the .ipynb files you do not use for your website.
Edit the notebook (.ipynb) on Google Colab. Then, download the Jupyter Notebook from Google Colab.
Move the notebook (.ipynb) to your website project directory. (If it is for a blog post, create a subdirectory in the posts directory, and move it to the subdirectory.)
Update _quarto.yml properly. Save the changes by clicking the floppy disk icon (💾).
Run quarto render.
Once quarto render completes, view the index.html in your website working directory to see the HTML output.
Use git commands (add-commit-push) on Terminal to update your online website.
DataFrame with read_csv()
info() and describe()
[]
value_counts(), nunique(), and count()
sort_values() and sort_index()
set_index() and reset_index()
loc[] and iloc[]
.astype()
DataFrames with .melt() and .pivot()
DataFrames with .merge()
DataFrames
.read_html()
selenium
DataFrames with groupby(), .agg(), and .transform()
DataFrames with seaborn
Series and DataFrame
Series: a one-dimensional object containing a sequence of values.
DataFrame: a collection of Series columns with an index.
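As a minimal sketch of these two definitions, the example below builds a Series and a DataFrame by hand. The player names and salary numbers are made-up illustration values, not data from nba.csv.

```python
import pandas as pd

# A Series: a one-dimensional sequence of values with an index.
# The values below are invented for illustration.
salaries = pd.Series([100, 200, 300], name="Salary")

# A DataFrame: a collection of Series columns sharing one index.
players = pd.DataFrame(
    {
        "Name": ["Player A", "Player B", "Player C"],
        "Salary": [100, 200, 300],
    }
)

# Selecting one column of a DataFrame gives back a Series.
salary_column = players["Salary"]

print(type(salaries).__name__)       # Series
print(type(salary_column).__name__)  # Series
print(players.shape)                 # (3, 2)
```

Each column of players is itself a Series, which is why single-column selection returns one.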
read_csv()
A CSV (comma-separated values) file is a plain-text file that uses commas to separate values (e.g., nba.csv).
The CSV format is widely used for storing data, and we will use it throughout the module.
We use the read_csv() function to load a CSV data file.
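A minimal sketch of read_csv(), assuming a small made-up CSV held in memory so that the example runs without any file on disk. With a real file, you would pass its path (for example "nba.csv") instead of the StringIO object. The column names and values here are hypothetical.

```python
import io

import pandas as pd

# Hypothetical CSV text; a CSV file on disk has the same comma-separated layout.
csv_text = """Name,Team,Salary
Player A,Team X,100
Player B,Team Y,200
"""

# read_csv() accepts a file path or a file-like object and returns a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)          # (2, 3)
print(list(df.columns))  # ['Name', 'Team', 'Salary']
```

The first line of the CSV becomes the column labels, and each later line becomes one row.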
The DataFrame is the workhorse data structure of the pandas library.
The parse_dates parameter of read_csv() coerces the values into datetimes.
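To illustrate parse_dates, here is a short sketch that reads the same made-up CSV text twice, once without and once with the parameter. The Birthday column and its values are assumptions chosen for the example.

```python
import io

import pandas as pd

# Hypothetical CSV text with a date-like column.
csv_text = """Name,Birthday
Player A,1990-01-15
Player B,1995-07-04
"""

# Without parse_dates, the Birthday column is read as plain strings.
raw = pd.read_csv(io.StringIO(csv_text))

# With parse_dates, read_csv() converts the listed columns to datetimes.
parsed = pd.read_csv(io.StringIO(csv_text), parse_dates=["Birthday"])

# Datetime columns expose the .dt accessor (year, month, day, and so on).
print(parsed["Birthday"].dt.year.tolist())  # [1990, 1995]
```

Passing a list of column names keeps the conversion explicit, so only the columns you name are treated as dates.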
drive.mount('/content/drive')
files.upload()
drive ➡️ MyDrive …
from google.colab import data_table
data_table.enable_dataframe_formatter()  # Enable an interactive DataFrame display
nba
The data_table formatter turns DataFrames into interactive displays.
# !pip install itables
from itables import init_notebook_mode, show
init_notebook_mode(all_interactive=False)
show(nba)
itables provides similar interactive displays for DataFrames.
The interactive displays from itables may work better than those from google.colab.