LaTeX Math Notation and Quarto Workflow for Capstone Papers
May 4, 2026
In empirical research papers, mathematical notation helps us describe models, variables, and results clearly.
For example, instead of writing:
Outcome equals intercept plus beta times treatment plus error.
We can write:
\[ Y_{it} = \beta_0 + \beta_1 X_{it} + \varepsilon_{it} \]
This is shorter, cleaner, and more professional.
Use single dollar signs for math that appears inside a sentence.
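For instance, the sentence rendered below could be typed as follows (a sketch of the source, reconstructed from the rendered output):

```latex
The coefficient $\beta_1$ measures the relationship between $X_{it}$ and $Y_{it}$.
```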
This renders as:
The coefficient \(\beta_1\) measures the relationship between \(X_{it}\) and \(Y_{it}\).
Use double dollar signs for equations that should appear on their own line.
$$
\begin{aligned}
Y_{it} &= \beta_0 + \beta_1 X_{it} + \varepsilon_{it} \\
Y_{it} &= \beta_0 + \beta_1 X_{it} + \beta_2 X_{it}\times Z_{i} + \varepsilon_{it} \\
\log(Y_{it}) &= \beta_0 + \beta_1 X_{it} + \varepsilon_{it}
\end{aligned}
$$
This renders as:
\[ \begin{aligned} Y_{it} &= \beta_0 + \beta_1 X_{it} + \varepsilon_{it}\\ Y_{it} &= \beta_0 + \beta_1 X_{it} + \beta_2 X_{it}\times Z_{i} + \varepsilon_{it}\\ \log(Y_{it}) &= \beta_0 + \beta_1 X_{it} + \varepsilon_{it} \end{aligned} \]
Many statistical and econometric models use Greek letters.
| LaTeX code | Output | Meaning |
|---|---|---|
| `\alpha` | \(\alpha\) | alpha |
| `\beta` | \(\beta\) | beta |
| `\gamma` | \(\gamma\) | gamma |
| `\delta` | \(\delta\) | delta |
| `\varepsilon` | \(\varepsilon\) | error term |
| `\sigma` | \(\sigma\) | standard deviation |
| `\mu` | \(\mu\) | mean |
Example:
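As a small illustration, several of these letters can appear in one expression. The regression form here is an assumption, borrowed from the equations used elsewhere in this handout:

```latex
$$
Y_{it} = \alpha + \beta X_{it} + \gamma Z_{it} + \varepsilon_{it}
$$
```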
Use _ for subscripts and ^ for superscripts.
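A sketch of the source for the four expressions rendered below, reconstructed from their rendered form:

```latex
$Y_{it}$
$X_{it}$
$R^2$
$X_{it}^2$
```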
These render as:
\(Y_{it}\)
\(X_{it}\)
\(R^2\)
\(X_{it}^2\)
Use curly braces {} when the subscript or superscript has more than one character.
Correct:
Less clear:
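To illustrate the contrast with a symbol chosen here (the panel outcome \(Y_{it}\) from the equations above): the braced form puts both characters in the subscript, while the unbraced form subscripts only the first character.

```latex
$Y_{it}$   % correct: both i and t are in the subscript
$Y_it$     % less clear: only i is subscripted; t sits on the baseline
```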
Use \frac{numerator}{denominator}.
This renders as:
\[ \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_{i} \]
Use \sum.
This renders as:
\[ \sum_{i=1}^{n} X_{i} \]
Example: sample mean
\[ \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_{i} \]
Use \sqrt{}.
This renders as:
\[ RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(Y_{i} - \hat{Y}_{i})^2} \]
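The three commands above (`\frac`, `\sum`, and `\sqrt`) can be tried on their own with generic symbols. The letters \(a\), \(b\), and \(x_i\) are placeholders chosen for this sketch, not variables from a particular model:

```latex
$$ \frac{a}{b} $$
$$ \sum_{i=1}^{n} x_i $$
$$ \sqrt{\frac{a}{b}} $$
```

Commands can be nested, as in the last line, where a fraction sits inside a square root.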
| LaTeX code | Output | Common meaning |
|---|---|---|
| `\bar{X}` | \(\bar{X}\) | sample mean |
| `\widehat{Y}` | \(\widehat{Y}\) | predicted value |
| `\widetilde{X}` | \(\widetilde{X}\) | transformed variable |
| `\widehat{\beta}` | \(\widehat{\beta}\) | estimated coefficient |
Example:
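One possible illustration, assuming a simple fitted regression line (the equation is chosen for this sketch), combines a predicted value with estimated coefficients:

```latex
$$ \widehat{Y}_i = \widehat{\beta}_0 + \widehat{\beta}_1 X_i $$
```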
Common symbols:
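A sketch of the source for the four expressions rendered below, reconstructed from their rendered form. Note that literal curly braces need a backslash, because plain `{}` is used by LaTeX for grouping:

```latex
$(x)$
$[x]$
$\{x\}$
$\left( \frac{x}{y} \right)$
```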
These render as:
\((x)\)
\([x]\)
\(\{x\}\)
\(\left( \frac{x}{y} \right)\)
The commands \left and \right automatically adjust the size of parentheses.
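To see the difference, compare plain parentheses with the `\left` and `\right` versions around the same fraction (a minimal sketch using the \(x\) and \(y\) from above):

```latex
$$ ( \frac{x}{y} ) $$              % parentheses stay at normal text height
$$ \left( \frac{x}{y} \right) $$   % parentheses stretch to the height of the fraction
```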
A simple regression model can be written as:
\[ Y_{it} = \beta_0 + \beta_1 X_{it} + \varepsilon_{it} \]
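where \(Y_{it}\) is the outcome, \(\beta_0\) is the intercept, \(\beta_1\) is the coefficient on \(X_{it}\), and \(\varepsilon_{it}\) is the error term.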
A multiple regression model can be written as:
\[ Y_{i} = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \varepsilon_{i} \]
A more compact version is:
\[ Y_{i} = \beta_0 + \mathbf{X}_{i}'\boldsymbol{\beta} + \varepsilon_{i} \]
where \(\mathbf{X}_i\) is a vector of explanatory variables and \(\boldsymbol{\beta}\) is the corresponding vector of coefficients.
For panel data, we often use two subscripts: \(i\) for the unit (such as a person, firm, or region) and \(t\) for the time period.
Example:
\[ Y_{it} = \beta_0 + \beta_1 X_{it} + \gamma' Z_{it} + \alpha_i + \lambda_t + \varepsilon_{it} \]
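where \(Z_{it}\) is a vector of control variables with coefficient vector \(\gamma\), \(\alpha_i\) is a unit fixed effect, and \(\lambda_t\) is a time fixed effect.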
A log-linear model:
\[ \log(Y_{it}) = \beta_0 + \beta_1 X_{it} + \varepsilon_{it} \]
A log-log model:
\[ \log(Y_{it}) = \beta_0 + \beta_1 \log(X_{it}) + \varepsilon_{it} \]
In a log-log model, \(\beta_1\) can often be interpreted as an elasticity:
A 1% increase in \(X\) is associated with approximately a \(\beta_1\)% change in \(Y\).
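The elasticity reading follows from differentiating the log-log model. A short sketch of the step, holding the error term fixed and considering small changes:

```latex
$$
\beta_1
= \frac{\partial \log(Y_{it})}{\partial \log(X_{it})}
= \frac{\partial Y_{it} / Y_{it}}{\partial X_{it} / X_{it}}
\approx \frac{\%\Delta Y}{\%\Delta X}
$$
```

The approximation is closest for small percentage changes in \(X\).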
Incorrect:
Correct:
Less ideal:
Better:
Less clear:
Better:
Quarto allows you to combine narrative text, code, results such as tables and figures, and equations, all in one reproducible document.
A capstone paper should not only show results. It should also show a clear and reproducible workflow.
For this course, your personal GitHub website folder may contain a separate subdirectory for the capstone project.
A good website folder may look like this:
In this structure, capstone-project/ is a subdirectory of the website directory.
This is useful because your main website can contain many pages, and the capstone project can live inside its own folder.
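One possible layout is sketched below. The folder names `website/`, `capstone-project/`, and `data/` and the file `cleaned_data.csv` come from this handout; the two `index.qmd` files are hypothetical placeholders.

```text
website/
├── index.qmd              # hypothetical home page
├── data/
│   └── cleaned_data.csv
└── capstone-project/
    └── index.qmd          # hypothetical capstone page
```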
A common mistake is putting every file inside the capstone project folder.
That is not always necessary.
For example, suppose your capstone project page is here:
Your data file can be outside the capstone-project/ folder but still inside the broader website folder:
In your Quarto file inside capstone-project/, you can read the data using a relative path:
The .. means “go up one folder.”
So the path ../data/cleaned_data.csv means:
1. Start in website/capstone-project/.
2. Go up one folder to website/.
3. Go into data/.
4. Read cleaned_data.csv.

This is the key idea: the data file does not have to be inside the same folder as the .qmd file.
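The relative path can be tried out in a throwaway folder. The sketch below is an illustration only: it builds a hypothetical copy of the layout inside a temporary directory, writes a tiny made-up data file, and then reads it from `capstone-project/` using `..`. Quarto runs code from the folder that contains the .qmd file by default, which the `setwd()` call imitates here.

```r
# Hypothetical layout, built in a temporary folder so the example is self-contained:
#   website/capstone-project/      (where the .qmd file would live)
#   website/data/cleaned_data.csv  (made-up data for illustration)
root <- file.path(tempdir(), "website")
dir.create(file.path(root, "capstone-project"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "data"), showWarnings = FALSE)
write.csv(
  data.frame(id = 1:3, price = c(100, 150, 200)),
  file.path(root, "data", "cleaned_data.csv"),
  row.names = FALSE
)

# Imitate rendering a .qmd file that sits in capstone-project/.
old_wd <- setwd(file.path(root, "capstone-project"))

# ".." goes up to website/, then "data/" goes down into the sibling folder.
df <- read.csv("../data/cleaned_data.csv")
print(dim(df))

# Restore the original working directory.
setwd(old_wd)
```

In a real project you would keep only the `read.csv()` line; the rest exists to make the example runnable anywhere.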
There are several reasons:
For a public project website, it is often better to keep large or private data outside the public website repository and provide a separate data file or secure link in the final submission.
If your data is public, you can also read it from a URL.
This can be useful when the data is publicly available and stable.
During early development, you may use an absolute path on your computer:
However, this is not ideal for final submission because this path only works on your own computer.
A better final version is usually a relative path:
Or a public/shareable URL:
Recommended data workflow:
Example:
A typical empirical capstone paper may include:
Example R code chunk:
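A minimal sketch of what such a chunk might contain is shown below. It uses the built-in `mtcars` data as a stand-in for your own cleaned data, and the chunk label is a hypothetical name. In a .qmd file the opening fence is written with `{r}` in braces, and the `#|` lines set chunk options.

```r
#| label: fit-model
#| echo: true

# mtcars ships with R and stands in for your own cleaned data set.
model <- lm(mpg ~ wt, data = mtcars)

# Show the coefficient table (estimates, standard errors, t values, p values).
summary(model)$coefficients
```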
Your final capstone paper should include polished results.
It should not include every piece of trial-and-error code.
For example, do not include five failed versions of the same graph in the final paper. Keep only the final version and explain why it matters.
Do not wait until the deadline to render your Quarto file.
Render frequently while working.
This helps you catch problems early, such as:
Before submitting, check:
Here is an example paragraph for a methods section:
To examine the relationship between neighborhood characteristics and housing prices, I estimate the following regression model:
\[ \log(Price_i) = \beta_0 + \beta_1 SchoolQuality_i + \beta_2 Size_i + \beta_3 Age_i + \varepsilon_{i} \]
The dependent variable is the log of housing price. The main explanatory variable is school quality. The model also controls for house size and age. The coefficient \(\beta_1\) measures the association between school quality and housing prices, holding the other included variables constant. One limitation of this model is that omitted variables (such as neighborhood amenities) may bias the estimated relationship.
Suppose the estimated model is:
\[ \log(Price_i) = 2.1 + 0.04 SchoolQuality_i + 0.002 Size_i - 0.01 Age_i \]
Then one possible interpretation is:
Holding house size and age constant, a one-unit increase in school quality is associated with an approximately 4% increase in housing price.
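The 4% figure is the usual approximation \(100 \times \beta_1\). As a worked check, the exact percentage change implied by a coefficient of 0.04 in a log-linear model is:

```latex
$$
100 \times \left( e^{0.04} - 1 \right) \approx 4.08\%
$$
```

For small coefficients the two numbers are close, which is why the approximate reading is common.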
For a predictive modeling project, you may write:
\[ \hat{Y}_i = f(X_{i}) \]
where \(f(\cdot)\) is a prediction function learned from the training data.
For example, in a random forest model:
\[ \hat{Y}_i = \frac{1}{B}\sum_{b=1}^{B} T_b(X_{i}) \]
where \(T_b(X_{i})\) is the prediction from tree \(b\), and \(B\) is the number of trees.
Good when data/ is a sibling folder of capstone-project/:
Less ideal for final submission:
If your data contains sensitive or private information, do not publish it on GitHub Pages.
Instead, submit the data privately through Brightspace or share it through a secure cloud link.
Good file names:
Avoid file names like:
This is better:
This can cause problems:
Good comments explain the purpose of the code.
Avoid comments that simply repeat the code.
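The difference might look like this in practice. The `homes` data frame and its values are hypothetical, made up only so the example runs:

```r
# Hypothetical data, used only to make the example runnable.
homes <- data.frame(
  price = c(250000, NA, 310000),
  size  = c(1400, 1600, 1800)
)

# Good: explains why the step is needed.
# Drop listings with a missing price so the regression uses complete cases.
homes <- homes[!is.na(homes$price), ]

# Avoid: only restates what the code already says.
# Take the log of price.
homes$log_price <- log(homes$price)
```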
Use one # for major sections:
Use two ## for subsections:
Use three ### for smaller subsections:
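A sketch of the three levels together, with hypothetical section titles:

```markdown
# Data

## Data sources

### Cleaning steps
```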
Write your model equation in LaTeX math notation:
Write a paragraph for your model in the methods section.
Fill in the following model using variables from your own capstone project:
\[ Y_{it} = \beta_0 + \beta_1 X_{it} + \varepsilon_{it} \]
Then explain: