Syllabus

DANL 320: Big Data Analytics

Author: Byeong-Hak Choe

Published: March 23, 2026

DANL 320-01: Big Data Analytics

(3 credits)

Course Information

Item | Details
Semester | Spring 2026
Class Location | South 338
Class Hours | MW 3:30 P.M. – 4:45 P.M.

Instructor Information

Item | Details
Name | Byeong-Hak Choe
Email (preferred) |
Phone (alternative) | (545) 245-5425
Office Location | South 227B
Office Hours | MWF 9:15 A.M. – 10:15 A.M.

Course Access and Orientation

Course Description

This course introduces students to practical machine learning for analyzing large, real-world datasets using R. You will learn how to design end-to-end analytics workflows that scale to big-data settings, where data are large, messy, high-dimensional, or computationally demanding, by using efficient data wrangling, feature engineering, and modeling practices in R.

The course covers core supervised learning methods including linear regression, logistic regression, regularization, and tree-based models such as random forests. It also introduces unsupervised learning methods such as clustering and principal component analysis. In addition, students will explore text as data, learning how to preprocess large collections of documents and build baseline text models.

Throughout the course, students will gain hands-on experience building reproducible machine learning pipelines, validating and tuning models, interpreting results, and communicating findings clearly for decision-making.

Prerequisites/Corequisites

DANL 300

Communication Guidelines

The instructor checks email daily (Mon–Fri). Expect a response within 24–72 hours.

Syllabus Statement

This syllabus is a working document and subject to change. Updates will be communicated in class meetings and/or Brightspace announcements.

Required Materials

  • Hands-On Machine Learning with R (free)

Optional Readings

  • R for Data Science (2e) (free)

Technology Requirements with Privacy Policies

  • Brightspace
  • Microsoft Teams
  • RStudio Desktop
  • GitHub

Bachelor of Arts in Data Analytics Program Competency Goals (CGs)

  • Competency Goal 1: Our learners will have strong analytical skills.
  • Competency Goal 2: Our learners will have strong quantitative skills.
  • Competency Goal 3: Our learners will have effective communications skills.
  • Competency Goal 4: Our learners will have a thorough understanding of various functional areas of business.
  • Competency Goal 5: Our learners will have a multidimensional understanding of social responsibility.

Course Objectives

  • Build scalable supervised learning models on large, messy datasets. (CG1, CG2)
  • Evaluate and tune models at scale with train/validation/test design, cross-validation strategies, and metrics appropriate for big, imbalanced data. (CG1, CG2)
  • Apply unsupervised learning for high-dimensional data to uncover structure and reduce complexity in large datasets. (CG1, CG2, CG4)
  • Deliver reproducible, responsible analytics by communicating results clearly and addressing limitations, bias, and interpretability in big-data settings. (CG1, CG2, CG3, CG4, CG5)

General Education (GLOBE) Learning Outcomes

N/A

The SUNY Geneseo School of Business

School of Business Mission

The School of Business at SUNY Geneseo is committed to exceptional business and economics education within the context of a strong liberal arts tradition. The School is distinguished by a uniquely accomplished and dedicated faculty, motivated and capable students, a robust professional development program, and the engaged support of alumni, employers, and business leaders.

Students acquire strong quantitative, analytical, and communication skills while preparing for professional success as socially conscious contributors. We strive for teaching excellence, and we recognize that high-quality faculty scholarship and professional activities increase our impact on knowledge, practice, and pedagogy.

Course Schedule

Week | Module | Assignments | Due Dates
1–2 | Building and managing a data portfolio website with Git, GitHub, and Quarto | |
2–4 | Linear Regression | HW 1 | Feb 11
5–6 | Logistic Regression | HW 2 | Feb 25
7–8 | Classification | HW 3 | Mar 9
8 | Midterm Exam | | Mar 11
9 | Spring Break | | Mar 14–21
10–11 | Regularization | HW 4 | Apr 8
12–13 | Trees and Forests | HW 5 | Apr 27
14–15 | Unsupervised Learning (e.g., Clustering, Principal Component Analysis, Association Rules) | HW 6 | May 6
16 | Machine Learning Project Presentation | Machine Learning Project Presentation | May 4–6
17 | Final Exam | Final Exam | May 13
17 | Machine Learning Project Report | Machine Learning Project Report | May 14

Key Dates (Summary)

  • Midterm Exam: March 11 (during class time)
  • Spring Break: March 14–21 (no classes)
  • GREAT Day: April 22 (no class)
  • Final Exam: May 13, 3:30 P.M. – 5:30 P.M.
  • Machine Learning Project Report: May 14, 11:59 P.M.

Website

You will build and publish your own course website using Quarto, RStudio, and Git/GitHub. Throughout the semester, you will post reports that include R code, analysis, and visualizations on your Quarto site.

Your website will be hosted on GitHub Pages. We will also cover the basics of Markdown, and introduce essential concepts in HTML and CSS for simple customization.

Group Project

Each project team will consist of one or two students. Your group will choose a dataset for the project, and the dataset must be approved by the instructor before you begin.

The final capstone write-up must include:

  • Exploratory Data Analysis (EDA), including descriptive statistics, data transformation, and multiple visualizations
  • Machine learning analysis

All reports must be published on each team member’s website. Any change in group membership or project topic must be approved by the instructor.

Grading

Grade Components

  • Attendance (5%)
  • Participation (5%)
  • Group Project (20%)
  • Total Homework (20%)
  • Total Exam (50%)

Grading Details

  • The single lowest homework score is dropped when calculating the Total Homework Grade.
  • The Total Exam Grade is the higher of:
    • Simple average of Midterm Exam and Final Exam scores, and
    • Weighted average of Midterm Exam (33%) and Final Exam (67%)
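
To make the two rules above concrete, here is a minimal Python sketch. The function names and the example scores are hypothetical, and the sketch assumes that all homework assignments carry equal weight, which the syllabus does not state. Treat it as an illustration of the arithmetic, not as the official grade calculation.

```python
def total_homework(scores):
    """Average homework scores after dropping the single lowest one.

    Assumes every homework carries equal weight (an assumption; the
    syllabus does not say how homework scores are combined).
    """
    if len(scores) < 2:
        raise ValueError("need at least two scores to drop one")
    kept = sorted(scores)[1:]  # drop the single lowest score
    return sum(kept) / len(kept)


def total_exam(midterm, final):
    """Return the higher of the simple and the weighted exam average."""
    simple = (midterm + final) / 2
    weighted = 0.33 * midterm + 0.67 * final
    return max(simple, weighted)


# Hypothetical scores, for illustration only.
print(total_homework([70, 80, 90, 100, 85, 95]))  # the 70 is dropped
print(total_exam(midterm=70, final=90))  # weighted average is higher
print(total_exam(midterm=90, final=70))  # simple average is higher
```

In this sketch, the weighted average is the larger of the two only when the Final Exam score is higher than the Midterm Exam score, so a strong final can offset a weaker midterm.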

Group Project Grade

  • Peer evaluation on group presentation (5%)
  • Instructor evaluation (95%)
    • Descriptive statistics (5%)
    • Data transformation (5%)
    • Data visualization (10%)
    • Data storytelling (10%)
    • Machine learning analysis (20%)
    • Presentation slides (10%)
    • Presentation (30%)
    • Code (10%)

Grading Scale

  • A = 93–100%
  • A– = 90–92%
  • B+ = 87–89%
  • B = 83–86%
  • B– = 80–82%
  • C+ = 77–79%
  • C = 73–76%
  • C– = 70–72%
  • D = 60–69%
  • E = 0–59%

Course Policies

Late Work

Late work is accepted up to 3 days after the deadline, with a 30% penalty.

Make-up Work

Make-up exams will not be given unless you have either a medically verified excuse or an absence excused by the University.

If you cannot take exams because of religious obligations, notify me by email at least two weeks in advance so that an alternative exam time may be set.

A missed exam without an excused absence earns a grade of zero.

Late submissions for homework assignments will be accepted with a penalty. A zero will be recorded for a missed assignment.

Attendance & Participation

The knowledge and skills you gain in this course depend heavily on your participation in in-class learning activities, because this is an in-person class. For that reason, you are expected to attend all class sessions unless you are ill or have another valid reason for missing.

If you are sick or have another valid reason for missing, you must email me before the absence. Any notifications after the absence will be disregarded.

Attendance will be taken during class via a sign-in sheet. You must sign in to be credited for attending class. If you attend class but do not sign in, it will be counted as an absence.

You are allowed 5 unexcused absences per semester. Each additional unexcused absence reduces your Total Percentage Grade by one percentage point (6 total absences = 1 percentage point, 7 total absences = 2 percentage points, and so on).
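
The absence rule above can be written as a short Python sketch. The function and constant names are hypothetical; the allowance of 5 and the one-point-per-absence deduction come from the policy itself.

```python
FREE_UNEXCUSED_ABSENCES = 5  # allowance stated in the attendance policy


def attendance_penalty(unexcused_absences):
    """Percentage points deducted from the Total Percentage Grade."""
    # Only absences beyond the allowance cost one point each.
    return max(0, unexcused_absences - FREE_UNEXCUSED_ABSENCES)


# Hypothetical absence counts, for illustration only.
for absences in (3, 5, 6, 7):
    print(absences, attendance_penalty(absences))
```

For example, 6 unexcused absences give a deduction of 1 percentage point and 7 give a deduction of 2, matching the examples in the policy.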

For extended absences, such as more than a couple of days of classes, you should contact the Dean of Students, who can assist with contacting your faculty.

Netiquette Policy

  • Before contributing to a discussion board, verify if your question has already been asked and answered. Avoid repeating topics as you would in a real-life conversation.
  • Remain focused on the topic. Refrain from posting unrelated links, comments, thoughts, or images.
  • Avoid typing in all caps, as it may appear as if you are shouting.
  • Steer clear of writing anything that might be interpreted as angry or sarcastic, particularly as tone is hard to convey online.
  • Always use “Please” and “Thank you” when requesting assistance from peers or instructors.
  • Respect differing viewpoints. If disagreeing, do so respectfully and acknowledge the merits of your classmates’ arguments.
  • Ensure accuracy when responding to a peer’s query. If unsure, especially about deadlines, it is better not to guess to avoid confusion.
  • Be concise in your responses. Lengthy replies to simple questions might not be read fully.
  • If multiple responses are received to your question, consider summarizing them for the benefit of the entire class.
  • Avoid derogatory comments or insulting others’ intelligence. Disagree with ideas, not individuals.
  • When referencing a previous discussion, quote only the essential lines to provide context without requiring others to search for the original post.
  • Practice forgiveness. If a peer makes an error, do not dwell on it. Everyone makes mistakes.
  • Before posting, run a spell and grammar check. This small effort can significantly impact how your message is perceived.

Accessibility Statement

SUNY Geneseo is dedicated to providing an equitable and inclusive educational experience for all students, which includes upholding the principles of Title II of the Americans with Disabilities Act (ADA). The Office of Accessibility (OAS) will coordinate reasonable accommodations for persons with disabilities to ensure equal access to academic programs, activities, and services offered by SUNY Geneseo.

Students with approved accommodations may submit a semester request to renew their academic accommodations. More information on the process for requesting academic accommodations is on the OAS website.

As a student in this course, it is important to recognize your role in ensuring that all classmates, including those who use assistive technologies, can fully engage with and comprehend the course content. Any digital materials you create and share, such as assignments, presentations, or shared documents, must be designed to be digitally accessible using the most up-to-date version of the WCAG 2.1 Level AA guidelines.

Accessible practices include, but are not limited to, providing alternative text for images, using clear heading structures, and ensuring captions for any video or audio you incorporate. Guidance on making your digital content accessible can be found at go.geneseo.edu/titleii.

Religious Observations and Class Attendance

New York State Education Law 224-a stipulates that “any student in an institution of higher education who is unable, because of [their] religious beliefs, to attend classes on a particular day or days shall, because of such absence on the particular day or days, be excused from any examination or any study or work requirements” (see https://www.geneseo.edu/apca/classroom-policies).

SUNY Geneseo has a commitment to inclusion and belonging, and I want to stress my respect for the diverse identities and faith traditions of students in my class. If you anticipate an absence due to religious observations, please contact me as soon as possible in advance to discuss your needs and arrange make-up plans.

The New York State Department of Civil Service maintains a calendar of major religious observations found on their website.

Military Obligations and Class Attendance

Federal and New York State law requires institutions of higher education to provide an excused leave of absence from classes without penalty to students enrolled in the National Guard or armed forces reserves who are called to active duty.

If you are called to active military duty and need to miss classes, please let me know and consult as soon as possible with the Dean of Students.

Academic Integrity and Plagiarism

Academic dishonesty includes cheating, knowingly providing false information, plagiarizing, and any other form of academic misrepresentation.

The School of Business regards all acts of cheating and/or plagiarism on tests or any other assignments as unprofessional and unethical behavior that violates College policies as stated in the Student Handbook. Students are expected to be aware of and to obey the College policies concerning academic dishonesty.

Any alleged cheating or plagiarism may be dealt with by the School as a disciplinary problem in accordance with College policies:

https://www.geneseo.edu/handbook/academic-dishonesty-policy

Plagiarism is the representation of someone else’s words or ideas as one’s own, or the arrangement of someone else’s material(s) as one’s own. In this course, such misrepresentation may be sufficient grounds for a student’s receiving a grade of E for the paper or presentation involved or may result in an E being assigned as the final grade for the course.

Any one of the following constitutes evidence of plagiarism:

  • Direct quotation without identifying punctuation and citation of source
  • Paraphrase of expression or thought without proper attribution
  • Unacknowledged dependence upon a source in plan, organization, or argument

Use of AI in Coursework

This course encourages you to use Generative AI tools like ChatGPT or Gemini to support your work. To maintain academic integrity, you must disclose and properly attribute any AI-generated material you use, including in-text citations, quotations, and references.

Guidance for citing AI-generated content is available at:

https://apastyle.apa.org/blog/how-to-cite-chatgpt

If I discover the use of generative AI without proper citation, the case will be treated as an instance of academic dishonesty and will be subject to the processes outlined in the SUNY Geneseo Academic Dishonesty Policy.

Generative AI Statement: Use of generative AI tools (e.g., ChatGPT, Gemini) must be disclosed and cited. Unauthorized use may violate academic integrity.

Student Success Resources

  • Academic Support Services
  • Library Research Help
  • Technology Support
  • SUNY Geneseo Counseling Resources
  • Knight’s Harvest Food Assistance

DEI Statement

This course values diverse perspectives and encourages inclusive dialogue. We aim to create a respectful and equitable learning environment.
