Data Collection III: Collecting Data with requests and APIs
March 4, 2026
Methods: The “verbs” that define what action the client wants the server to perform.
| Method | Purpose |
|---|---|
| GET | Retrieve data from a server ✅ (main one we use) |
| POST | Send new data to a server |
| PUT | Update existing data on a server |
| DELETE | Remove data from a server |
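Each of these verbs maps to a function of the same name in Python's requests library. A minimal sketch, using the httpbin.org testing service (an assumption on my part; any test server works):

```python
import requests

# GET: retrieve data (the method we will use most)
r = requests.get("https://httpbin.org/get")
print(r.status_code)  # 200 if the request succeeded

# POST: send new data to the server
r = requests.post("https://httpbin.org/post", data={"name": "NYC"})

# PUT: update existing data; DELETE: remove data
r = requests.put("https://httpbin.org/put", data={"name": "NYC"})
r = requests.delete("https://httpbin.org/delete")
```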
Request: What the client sends to the server — specifying the method, endpoint URL, and optional parameters (e.g., search terms, API keys).
Response: What the server sends back, consisting of three parts: a status code (e.g., 200 OK, 404 Not Found, 401 Unauthorized), headers, and a body containing the requested data.

When we type a URL starting with https://, the browser sends a GET request to that server on our behalf.
requests Methods

Why requests? It supports all the main HTTP methods: GET, POST, PUT, DELETE, etc.

- requests.get(): Sends a GET request to retrieve data from a specified URL.
- response.status_code: Returns the HTTP status code (e.g., 200 for success).
- response.json(): Converts JSON-format data into a Python dictionary.
- response.text: The decoded text version of the response content.

Check status_code before processing the response, and use response.json() for easier handling of JSON data.

NYC Open Data (https://opendata.cityofnewyork.us) is free public data published by NYC agencies and other partners.
Many other metropolitan cities have Open Data websites too.
```python
import requests
import pandas as pd

endpoint = 'https://data.cityofnewyork.us/resource/ic3t-wcy2.json'  ## API endpoint
response = requests.get(endpoint)
content = response.json()  # convert JSON response data to a dictionary
df = pd.DataFrame(content)
```

The requests.get() method sends a GET request to the specified URL, and response.json() automatically converts the JSON data into a dictionary or a list of dictionaries.
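NYC Open Data is served through the Socrata Open Data API, which also accepts query parameters; for example, $limit (a Socrata parameter, not shown in the example above) caps the number of rows returned. A sketch:

```python
import requests
import pandas as pd

endpoint = 'https://data.cityofnewyork.us/resource/ic3t-wcy2.json'

# Pass Socrata query parameters via `params`; $limit caps the row count
response = requests.get(endpoint, params={'$limit': 5})
df = pd.DataFrame(response.json())
print(df.shape[0])  # at most 5 rows
```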
Most API interfaces will only let you access and download data after you have registered an API key with them.
Let’s download economic data from FRED (https://fred.stlouisfed.org) using its API.
You need to create a FRED account (https://fredaccount.stlouisfed.org/login/) to get an API key.
As with all APIs, a good place to start is the FRED API developer docs https://fred.stlouisfed.org/docs/api/fred/.
We are interested in series/observations https://fred.stlouisfed.org/docs/api/fred/series_observations.html
The parameters that we will use are api_key, file_type, and series_id.
Replace “YOUR_API_KEY” with your actual API key in the following web address: https://api.stlouisfed.org/fred/series/observations?series_id=GNPCA&api_key=YOUR_API_KEY&file_type=json
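Instead of editing the URL by hand, you can let requests build the query string from a dictionary. A minimal sketch that only constructs (and never sends) the request:

```python
import requests

params = {
    'series_id': 'GNPCA',
    'api_key': 'YOUR_API_KEY',  # placeholder, as in the address above
    'file_type': 'json',
}

# Build the request object without sending it
req = requests.Request(
    'GET',
    'https://api.stlouisfed.org/fred/series/observations',
    params=params,
)
prepared = req.prepare()
print(prepared.url)  # the hand-built web address, query string included
```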
```python
import requests  # to handle API requests
import json      # to parse JSON response data
import pandas as pd

param_dicts = {
    'api_key': 'YOUR_FRED_API_KEY',  ## Change to your own key
    'file_type': 'json',
    'series_id': 'GDPC1'             ## ID for US real GDP
}

url = "https://api.stlouisfed.org/"
endpoint = "series/observations"
api_endpoint = url + "fred/" + endpoint  # sum of strings
response = requests.get(api_endpoint, params=param_dicts)
```

We first import the requests, json, and pandas libraries. requests comes with a variety of features that allow us to interact more flexibly and securely with web APIs.

```python
# Convert JSON response to a Python dictionary.
content = response.json()

# Extract the "observations" list element.
df = pd.DataFrame(content['observations'])
```

response.json() converts the JSON into a Python dictionary object (or a list of dictionaries).
By default, all columns in the DataFrame from content are string-type.
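Since the columns arrive as strings, a common next step is explicit type conversion. A sketch using stand-in data with the date and value columns of a FRED observations response (the numbers below are made up):

```python
import pandas as pd

# Stand-in for the string-typed DataFrame returned by the API
df = pd.DataFrame({
    'date': ['2024-01-01', '2024-04-01'],
    'value': ['22112.329', '22225.350']
})

# Convert string columns to proper dtypes
df['date'] = pd.to_datetime(df['date'])
df['value'] = pd.to_numeric(df['value'])

print(df.dtypes)
```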
Let’s do Classwork 9!
pynytimes

While the NYTimes Developer Portal provides API documentation, it is time-consuming to go through the documentation.
There is an unofficial Python library called pynytimes that provides a more user-friendly interface for working with the NYTimes API.
To get started, check out Introduction to pynytimes.
```python
# Settings
from pynytimes import NYTAPI

# Initialize API with your key
nyt = NYTAPI("YOUR_NYTIMES_API_KEY", parse_dates=True)
```

Most industry-scale websites display data from their database servers.
Sometimes, it is possible to find their hidden APIs to retrieve data!
Examples:
- a JSON-type response that seems to have the data

requests Methods

Our course covers only the basics of the requests library. For those who are interested in Python's requests, I recommend the following references:

Web-scraping Decision