import pandas as pd
import numpy as np
# Enable the interactive DataFrame display in Colab
from google.colab import data_table
data_table.enable_dataframe_formatter()
Classwork 16
Pandas Fundamental V-2: Joining DataFrames
The DataFrames below are related, as described above.
flights = pd.read_csv("https://bcdanl.github.io/data/flights.zip")
airlines = pd.read_csv("https://bcdanl.github.io/data/flights-airlines.csv")
airports = pd.read_csv("https://bcdanl.github.io/data/flights-airports.csv")
planes = pd.read_csv("https://bcdanl.github.io/data/flights-planes.csv")
weather = pd.read_csv("https://bcdanl.github.io/data/flights-weather.csv")
Variables in flights DataFrame
- year, month, day: Date of departure.
- dep_time, arr_time: Actual departure and arrival times (format HHMM or HMM), local tz.
- sched_dep_time, sched_arr_time: Scheduled departure and arrival times (format HHMM or HMM), local tz.
- dep_delay, arr_delay: Departure and arrival delays, in minutes. Negative times represent early departures/arrivals.
- carrier: Two letter carrier abbreviation. See airlines DataFrame to get full names.
- flight: Flight number.
- tailnum: Plane tail number. See planes DataFrame for additional metadata.
- origin, dest: Origin and destination. See airports DataFrame for additional metadata.
- air_time: Amount of time spent in the air, in minutes.
- distance: Distance between airports, in miles.
- hour, minute: Time of scheduled departure broken into hour and minutes.
- time_hour: Scheduled date and hour of the flight as a datetime64. Along with origin, can be used to join flights data to weather DataFrame.
Question 1
The following is the flights DataFrame:
The following is the weather DataFrame:
Merge the flights and weather DataFrames so that all observations from flights are retained in the resulting DataFrame.
Answer:
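One possible approach is a left join, which keeps every row of the left DataFrame whether or not it finds a match on the right. The sketch below uses origin and time_hour as the join keys, as the variable description suggests. The small flights_toy and weather_toy frames, their values, and the temp column are made up for illustration and stand in for the real flights and weather DataFrames.

```python
import pandas as pd

# Hypothetical toy stand-ins for flights and weather (values are made up)
flights_toy = pd.DataFrame({
    "flight": [101, 202, 303],
    "origin": ["EWR", "LGA", "JFK"],
    "time_hour": ["2013-01-01 05:00:00", "2013-01-01 05:00:00", "2013-01-01 06:00:00"],
})
weather_toy = pd.DataFrame({
    "origin": ["EWR", "LGA"],
    "time_hour": ["2013-01-01 05:00:00", "2013-01-01 05:00:00"],
    "temp": [39.0, 40.0],  # assumed weather column, for illustration only
})

# how="left" retains every row of flights_toy; unmatched rows get NaN weather values
flights_weather = flights_toy.merge(
    weather_toy,
    on=["origin", "time_hour"],
    how="left",
)
print(flights_weather)
```

With the real data, the same call would be flights.merge(weather, on=["origin", "time_hour"], how="left"). If the two tables share other column names, pandas appends the suffixes _x and _y to tell them apart.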
Question 2
The following is the airlines DataFrame:
Identify the airlines (by full name) that have the top five dep_delay values in the flights DataFrame.
Answer:
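One way to approach this is to select the five rows with the largest dep_delay first, then join them to the airline lookup table on carrier. The sketch below uses made-up toy rows, and it assumes the airlines table stores the full name in a column called name; check airlines.columns before relying on that.

```python
import pandas as pd

# Hypothetical toy stand-ins (delay values are made up)
flights_toy = pd.DataFrame({
    "carrier": ["DL", "UA", "AA", "DL", "UA", "AA", "DL"],
    "dep_delay": [95.0, 240.0, 12.0, 180.0, None, 310.0, 60.0],
})
airlines_toy = pd.DataFrame({
    "carrier": ["AA", "DL", "UA"],
    "name": [  # "name" is an assumed column name
        "American Airlines Inc.",
        "Delta Air Lines Inc.",
        "United Air Lines Inc.",
    ],
})

top5 = (
    flights_toy
    .dropna(subset=["dep_delay"])    # ignore flights with a missing delay
    .nlargest(5, "dep_delay")        # five largest departure delays
    .merge(airlines_toy, on="carrier", how="left")  # attach full airline names
)
print(top5[["name", "dep_delay"]])
```

Filtering before merging keeps the join small, since only five rows need a name lookup.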
Question 3
Consider the following two airlines:
- Delta Air Lines Inc. (DL)
- United Air Lines Inc. (UA)
Determine which airline has a higher proportion of flights with a dep_delay greater than 30 minutes.
Hint: np.where() can be very useful.
Answer:
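Following the hint, one possible sketch flags each flight with np.where() and then averages the flag within each carrier, since the mean of a 0/1 indicator is a proportion. The toy rows and the late_30 column name below are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy stand-in for flights (values are made up)
flights_toy = pd.DataFrame({
    "carrier": ["DL", "DL", "DL", "DL", "UA", "UA", "UA", "UA", "AA"],
    "dep_delay": [45.0, 5.0, -3.0, 31.0, 60.0, 0.0, 10.0, 30.0, 120.0],
})

# Keep only the two airlines being compared
dl_ua = flights_toy[flights_toy["carrier"].isin(["DL", "UA"])].copy()

# 1 if the departure delay exceeds 30 minutes, 0 otherwise
dl_ua["late_30"] = np.where(dl_ua["dep_delay"] > 30, 1, 0)

# The mean of the 0/1 flag is the proportion of late flights per carrier
prop_late = dl_ua.groupby("carrier")["late_30"].mean()
print(prop_late)
print("Higher proportion:", prop_late.idxmax())
```

Note that a missing dep_delay compares as False, so such flights count as not delayed in this sketch. Dropping them first with dropna(subset=["dep_delay"]) is an alternative if they should be excluded from the denominator.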
Discussion
Welcome to our Classwork 16 Discussion Board!
This space is designed for you to engage with your classmates about the material covered in Classwork 16.
Whether you are looking to delve deeper into the content, share insights, or have questions about the content, this is the perfect place for you.
If you have any specific questions for Byeong-Hak (@bcdanl) regarding the Classwork 16 materials or need clarification on any points, don't hesitate to ask here.
All comments will be stored here.
Let's collaborate and learn from each other!