Midterm Exam I
Spring 2025, DANL 210-01
Descriptive Statistics
The distribution of scores for Midterm Exam I is shown below:

The distribution of scores for Section 1 in Midterm Exam I is shown below:

The distribution of scores for Section 2 in Midterm Exam I is shown below:

The following provides the descriptive statistics for each part of the Midterm Exam I:
Section 1. Python Basics - Multiple Choice Questions (Questions 1-6)
Question 1
What is the output of the following code?
= ['Alice', 'David', 'Ethan']
name_lst print(name_lst[-1][1])
A. a
B. v
C. t
D. h
E. d
Click to Check the Answer!
C
Question 2
How many times will the following loop execute the print()
statement?
= 2
a
while True:
if a == 6:
print("Number", a, "reached")
break
elif a < 6:
+= 2
a print("Adding 2... Reached", a)
else:
= a - 1
a print("Subtracting 1... Reached", a)
A. 1
B. 2
C. 3
D. 4
E. 5
Click to Check the Answer!
C
Question 3
What is the output of the following code?
= [2, 3, 4, 11, 5]
numbers = 0
i = 0
total
while i < len (numbers) :
if numbers[i] == 11:
break
else:
+= numbers[i]
total = i + 1
i
print (total)
A. 0
B. 8
C. 9
D. 10
E. 11
Click to Check the Answer!
C
Question 4
What is the output of the following code?
= {"x": 10, "y": 20, "z": 30}
data = ["y", "a", "x", "z"]
keys = 10
result
for key in keys:
try:
if key == "x":
= result - data[key] * 2
result break
elif key == "z":
= result + data[key]
result else:
= result - data[key]
result except KeyError:
= result + 10
result
print(result)
A. 0
B. 10
C. -10
D. 20
E. -20
Click to Check the Answer!
E
Question 5
What is the output of the following code?
= []
result for i in range(5):
if i == 2:
continue
elif i == 4:
pass
result.append(i)
print(result)
A. [0, 1, 3, 4]
B. [0, 1, 2, 3, 4]
C. [0, 1, 3]
D. [0, 1, 3, 4, 5]
E. Error
Click to Check the Answer!
A
Question 6
= []
result for i in range(0, 5):
if i ** 2 < 10:
* 2) result.append(i
Which of the following list comprehensions produces the same output as the above code?
A.
= [i * 2 for i in range(5) if i * i < 9] result
B.
= [i * 2 for i in range(5) if i in [0, 1, 2, 3, 4]] result
C.
= [i * 2 for i in range(5) if i ** 2 != 32] result
D.
= [i * 2 for i in range(5) if i ** 2 < 10] result
E.
= [i * 2 for i in range(5) if i ** 2 <= 8] result
Click to Check the Answer!
D
Section 2. Pandas Basics – Coding Questions (Questions 7-17)
import pandas as pd
import numpy as np
For this Section, use the above Python libraries and the following four DataFrames from Global Gourmet Selections, which represent the company’s operations:
customers
– Contains customer-related information, including unique customer IDs, names, and contact details.orders
– Provides an overview of order details, including relevant transaction information.order_details
– Records individual order line items by linking each product to a specific order.products
– Stores product-related details, such as product IDs, names, and pricing.
Additionally, the categories
DataFrame offers metadata on product categories, including unique identifiers, names, and descriptions.
DataFrames
= pd.read_csv('https://bcdanl.github.io/data/gourmet-Categories.csv')
categories = pd.read_csv('https://bcdanl.github.io/data/gourmet-Customers2.csv')
customers = pd.read_csv('https://bcdanl.github.io/data/gourmet-Orders2.csv')
orders = pd.read_csv('https://bcdanl.github.io/data/gourmet-OrderDetails2.csv')
order_details = pd.read_csv('https://bcdanl.github.io/data/gourmet-Products2.csv') products
0. categories
DataFrame (containing 8 observations)
CategoryID | CategoryName | Description |
---|---|---|
1 | Beverages | Soft drinks coffees teas beers and ales |
2 | Condiments | Sweet and savory sauces relishes spreads and seasonings |
3 | Confections | Desserts candies and sweet breads |
4 | Dairy Products | Cheeses |
5 | Grains/Cereals | Breads crackers pasta and cereal |
6 | Meat/Poultry | Prepared meats |
7 | Produce | Dried fruit and bean curd |
8 | Seafood | Seaweed and fish |
CategoryID
(int64
): Unique identifier for each product category in thecategories
DataFrame.CategoryName
(object
): The name of the category.Description
(object
): A textual description providing details about the category.
1. customers
DataFrame with its first 15 observations
CustID | CustName | ContactName | City | PostalCode | Country |
---|---|---|---|---|---|
1 | Alfreds Futterkiste | Maria Anders | Berlin | 12209 | Germany |
2 | Ana Trujillo Emparedados y helados | Ana Trujillo | Mexico D.F. | NA | Mexico |
3 | Antonio Moreno Taqueria | Antonio Moreno | Mexico D.F. | NA | Mexico |
4 | Around the Horn | Thomas Hardy | London | WA1 1DP | UK |
5 | Berglunds snabbkop | Christina Berglund | Lulea | S-958 22 | Sweden |
6 | Blauer See Delikatessen | Hanna Moos | Mannheim | 68306 | Germany |
7 | Blondel pere et fils | Frederique Citeaux | Strasbourg | 67000 | France |
8 | Bolido Comidas preparadas | Martin Sommer | Madrid | 28023 | Spain |
9 | Bon app’ | Laurence Lebihans | Marseille | 13008 | France |
10 | Bottom-Dollar Marketse | Elizabeth Lincoln | Tsawassen | T2F 8M4 | Canada |
11 | B’s Beverages | Victoria Ashworth | London | EC2 5NT | UK |
12 | Cactus Comidas para llevar | Patricio Simpson | Buenos Aires | 1010 | Argentina |
13 | Centro comercial Moctezuma | Francisco Chang | Mexico D.F. | 5022 | Mexico |
14 | Chop-suey Chinese | Yang Wang | Bern | 3012 | Switzerland |
15 | Comercio Mineiro | Pedro Afonso | Sao Paulo | 05432-043 | Brazil |
CustID
(int64
): Unique identifier for each customer in thecustomers
DataFrame.CustName
(object
): The name of the customer’s company.ContactName
(object
): The name of the primary contact person at the company.City
(object
): The city where the customer is located.PostalCode
(object
): The postal or ZIP code of the customer’s address.Country
(object
): The country of the customer.
2. orders
DataFrame with its first 15 observations
OrderID | CustID | EmployeeID | OrderDate |
---|---|---|---|
10248 | 90 | 5 | 7/4/96 |
10249 | 81 | 6 | 7/5/96 |
10250 | 34 | 4 | 7/8/96 |
10251 | 84 | 3 | 7/8/96 |
10252 | 76 | 4 | 7/9/96 |
10253 | 34 | 3 | 7/10/96 |
10254 | 14 | 5 | 7/11/96 |
10255 | 68 | 9 | 7/12/96 |
10256 | 88 | 3 | 7/15/96 |
10257 | 35 | 4 | 7/16/96 |
10258 | 20 | 1 | 7/17/96 |
10259 | 13 | 4 | 7/18/96 |
10260 | 55 | 4 | 7/19/96 |
10261 | 61 | 4 | 7/19/96 |
10262 | 65 | 8 | 7/22/96 |
OrderID
(int64
): Unique identifier for each order in theorders
DataFrame.CustID
(int64
): Identifier linking to the customer who placed the order in thecustomers
DataFrame.EmployeeID
(int64
): Identifier for the employee handling the order.OrderDate
(object
): The date when the order was placed.
3. order_details
DataFrame with its first 15 observations
OrderDetail_ID | OrderID | ProductID | Quantity |
---|---|---|---|
1 | 10248 | 11 | 12 |
2 | 10248 | 42 | 10 |
3 | 10248 | 72 | 5 |
4 | 10249 | 14 | 9 |
5 | 10249 | 51 | 40 |
6 | 10250 | 41 | 10 |
7 | 10250 | 51 | 35 |
8 | 10250 | 65 | 15 |
9 | 10251 | 22 | 6 |
10 | 10251 | 57 | 15 |
11 | 10251 | 65 | 20 |
12 | 10252 | 20 | 40 |
13 | 10252 | 33 | 25 |
14 | 10252 | 60 | 40 |
15 | 10253 | 31 | 20 |
OrderDetail_ID
(int64
): Unique identifier for each record in theorder_details
DataFrame.OrderID
(int64
): Identifier linking each record to a specific order in theorders
DataFrame.ProductID
(int64
): Identifier linking the record to a product in theproducts
DataFrame.Quantity
(int64
): The number of units ordered.
4. products
DataFrame with its first 15 observations
ProductID | ProductName | CategoryID | Unit | Price |
---|---|---|---|---|
1 | Chais | 1 | 10 boxes x 20 bags | 18.00 |
2 | Chang | 1 | 24 - 12 oz bottles | 19.00 |
3 | Aniseed Syrup | 2 | 12 - 550 ml bottles | 10.00 |
4 | Chef Anton’s Cajun Seasoning | 2 | 48 - 6 oz jars | 22.00 |
5 | Chef Anton’s Gumbo Mix | 2 | 36 boxes | 21.35 |
6 | Grandma’s Boysenberry Spread | 2 | 12 - 8 oz jars | 25.00 |
7 | Uncle Bob’s Organic Dried Pears | 7 | 12 - 1 lb pkgs. | 30.00 |
8 | Northwoods Cranberry Sauce | 2 | 12 - 12 oz jars | 40.00 |
9 | Mishi Kobe Niku | 6 | 18 - 500 g pkgs. | 97.00 |
10 | Ikura | 8 | 12 - 200 ml jars | 31.00 |
11 | Queso Cabrales | 4 | 1 kg pkg. | 21.00 |
12 | Queso Manchego La Pastora | 4 | 10 - 500 g pkgs. | 38.00 |
13 | Konbu | 8 | 2 kg box | 6.00 |
14 | Tofu | 7 | 40 - 100 g pkgs. | 23.25 |
15 | Genen Shouyu | 2 | 24 - 250 ml bottles | 15.50 |
ProductID
(int64
): Unique identifier for each product in theproducts
DataFrame.ProductName
(object
): The name of the product.CategoryID
(int64
): Identifier linking the product to a category in thecategories
DataFrame.Unit
(object
): The packaging or standard unit in which the product is soldPrice
(float
): The cost of a single unit of the product (in USD)
Question 7
- Write a three-line code snippet to display the following summaries of the
orders
DataFrame:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 196 entries, 0 to 195
Data columns (total 4 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 OrderID 196 non-null int64
1 CustID 196 non-null int64
2 EmployeeID 196 non-null int64
3 OrderDate 196 non-null object
dtypes: int64(3), object(1)
memory usage: 6.3+ KB
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 196 entries, 0 to 195
Data columns (total 4 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 OrderID 196 non-null int64
1 CustID 196 non-null int64
2 EmployeeID 196 non-null int64
3 OrderDate 196 non-null datetime64[ns]
dtypes: datetime64[ns](1), int64(3)
memory usage: 6.3 KB
Click to Check the Answer!
orders.info()'OrderDate'] = orders['OrderDate'].astype('datetime64[ns]')
orders[ orders.info()
Question 8
Write a code snippet to produce the following output of descriptive statistics:
ProductName Price
count 77 77.000000
unique 77 NaN
top Chais NaN
freq 1 NaN
mean NaN 28.866364
std NaN 33.815111
min NaN 2.500000
25% NaN 13.250000
50% NaN 19.500000
75% NaN 33.250000
max NaN 263.500000
Click to Check the Answer!
'ProductName', 'Price']].describe(include="all") products[[
Question 9
Fill in the blank to count the number of observations from the customers
DataFrame where the customer’s Country is not USA.
0] customers[_______________________________________________].shape[
Click to Check the Answer!
'Country'] != 'USA'].shape[0] customers[customers[
Question 10
Fill in the blank to count the number of observations from the customers
DataFrame where the customer’s ContactName starts with B or P.
customers[_______________________________________________]
Click to Check the Answer!
'ContactName'].between('B', 'C') | customers['ContactName'].between('P', 'Q') ] customers[customers[
Question 11
Write a code snippet to identify the country or countries with the highest number of missing values in the PostalCode variable of the customers
DataFrame.
Click to Check the Answer!
('PostalCode'].isna()]['Country']
customers[customers[
.value_counts()
.reset_index()1, 'count', keep = 'all')
.nlargest( )
Question 12
Write a code snippet using the order_details
DataFrame to count the number of unique products ordered.
The total number of distinct products ordered is 77.
Click to Check the Answer!
'ProductID'].nunique() order_details[
Question 13
Write a code snippet to reshape the order_details
DataFrame so that each OrderID is a row and each ProductID becomes a column with values equal to the corresponding Quantity.
OrderID 1 2 3 4 5 6 7 ... 70 71 72 73 74 75 76 77
0 10248 NaN NaN NaN NaN NaN NaN NaN ... NaN NaN 5.0 NaN NaN NaN NaN NaN
1 10249 NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN
2 10250 NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN
3 10251 NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN
4 10252 NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN
.. ... .. ... .. .. .. .. .. ... .. .. ... .. ... .. .. ..
191 10439 NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN 30.0 NaN NaN NaN
192 10440 NaN 45.0 NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN
193 10441 NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN
194 10442 NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN
195 10443 NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN
[196 rows x 78 columns]
- The following displays the first 10 rows and the first 6 columns of the resulting DataFrame:
Click to Check the Answer!
(
order_details='OrderID',
.pivot(index='ProductID',
columns='Quantity')
values
.reset_index() )
Question 14
Write a code snippet to display the following new DataFrame showing each category along with the number of unique products in the products
DataFrame.
CategoryName | count |
---|---|
Confections | 13 |
Beverages | 12 |
Condiments | 12 |
Seafood | 12 |
Dairy Products | 10 |
Grains/Cereals | 7 |
Meat/Poultry | 6 |
Produce | 5 |
Click to Check the Answer!
(
products="CategoryID", how="left")[['CategoryName']]
.merge(categories, on
.value_counts()
.reset_index() )
Question 15
Write a code snippet to create the following new DataFrame, ProductID_11
, showing the Quantity ordered for ProductID 11. Then, write a one-line Pandas code to compute the total quantity of this product ordered using the ProductID_11
DataFrame.
OrderDetail_ID | OrderID | ProductID | ProductName | Quantity |
---|---|---|---|---|
1 | 10248 | 11 | Queso Cabrales | 12 |
130 | 10296 | 11 | Queso Cabrales | 12 |
211 | 10327 | 11 | Queso Cabrales | 50 |
281 | 10353 | 11 | Queso Cabrales | 12 |
314 | 10365 | 11 | Queso Cabrales | 24 |
426 | 10407 | 11 | Queso Cabrales | 30 |
492 | 10434 | 11 | Queso Cabrales | 6 |
514 | 10442 | 11 | Queso Cabrales | 30 |
517 | 10443 | 11 | Queso Cabrales | 6 |
The total quantity ordered for ProductID 11 is 182.
Note that 182 is calculated as follows: \[ 182 = 12 + 12 + 50 + 12 + 24 + 30 + 6 + 30 + 6 \]
Click to Check the Answer!
= (
ProductID_11
order_details
.merge(products, ="ProductID",
on="left")
how'ProductID == 11')
.query('OrderDetail_ID', 'OrderID', 'ProductID', 'ProductName', 'Quantity']]
[[
)
'Quantity'].sum() ProductID_11[
Question 16
- Write a code snippet to determine how many unique CustomerID-EmployeeID pairs exist in the
orders
DataFrame. Then, count how many times each EmployeeID appears in these unique pairs and return the resulting DataFrame as below:
EmployeeID | count |
---|---|
1 | 23 |
2 | 18 |
3 | 26 |
4 | 31 |
5 | 10 |
6 | 17 |
7 | 14 |
8 | 25 |
9 | 6 |
- What does this
count
variable above tell us?
Click to Check the Answer!
('EmployeeID', 'CustID']]
orders[[
.value_counts()'EmployeeID']]
.reset_index()[[
.value_counts()
.reset_index()
)
# `count` variable tells us the number of customers whose orders are handled by each employee.
Question 17
- Write a code snippet to compute the total order cost for customers in Mexico, expressed in Mexican Pesos (MXN).
- Begin by adding a new variable, PriceMXN, to the
products
DataFrame, which converts the Price to pesos by multiplying it by 20.604.
- Begin by adding a new variable, PriceMXN, to the
Total cost of orders made by customers in Mexico: 120771.58 MXN
Click to Check the Answer!
'PriceMXN'] = products['Price'] * 20.604
products[
= (
q17
order_details="ProductID", how="left")
.merge(products, on="OrderID", how="left")
.merge(orders, on="CustID", how="left")
.merge(customers, on'Country == "Mexico"')
.query(
)
'Cost'] = q17['PriceMXN'] * q17['Quantity']
q17['Cost'].sum() q17[