## Data Science Major

Data Science Faculty: Anny-Claude Joseph (MATH), Eni Mustafaraj (CS), Patrick McEwan (ECON), Casey Pattanayak (MATH/QR), Wendy Wang (MATH), Jeremy Wilmer (PSYC)

Data Science Director: Casey Pattanayak (MATH/QR)

The Data Science major consists of ten (10) courses plus a 0.5 credit capstone experience. A student can begin the major requirements in the first or second year. Students are encouraged to explore the field of data science by taking an introductory computer science course and an introductory statistics course as early as possible. If needed, they can take MATH 115 and/or MATH 116 in their first year as prerequisites for MATH 205. The following must be taken at Wellesley: CS 230; at least one of STAT 260 and STAT 318; at least one 300-level course in computer science; and at least one 300-level course in statistics (which may be STAT 318).

### Goals of the major:

Data Science lies at the intersection of computer science, mathematics, and statistics. A student pursuing a structured individual major in Data Science will develop a strong foundation in all three areas and complete coursework that emphasizes the integration of the three. The capstone will ensure that students experience the challenges of Data Science research. Students will graduate with the critical thinking needed to pose and refine questions that can be answered with data in an ethical way, the statistical skills needed to draw meaning from data appropriately, the computational skills needed to tackle practical data challenges, and the ability to collaborate, communicate, and critique in the context of modern data.

### Major requirements:

1. Eight (8) foundational courses:

- Introductory Statistics: Any one of STAT 160, STAT 218, BISC 198, ECON 103, POL 299, PSYC 105, or SOC 190
- Intermediate Statistical Modeling: QR/STAT 260 (requires introductory statistics)
- Advanced Statistical Modeling: STAT 318 (requires introductory statistics and linear algebra)
- Introduction to Programming: CS 111
- Data Structures: CS 230 (requires CS 111)
- Machine Learning: Choose from CS 305, CS 313, CS 315, or CS 333
- Multivariable Calculus: MATH 205 (requires MATH 116)
- Linear Algebra: MATH 206 (requires MATH 205)

If a student places out of CS 111, they must choose an additional CS elective, as listed in (2). If a student places out of MATH 205 and/or MATH 206, they must choose an additional MATH elective in consultation with an advisor, usually MATH/STAT 220 or MATH 225. If a student substitutes a Quantitative Analysis Institute Summer Program Certificate for QR/STAT 260, they must choose an additional STAT elective, as listed in (2). After any such substitutions, the Data Science major must still total ten courses plus the capstone (10.5 units).

2. Two (2) electives, one from statistics and one from computer science, usually chosen from the following list. All CS courses require CS 230 as a prerequisite. Prerequisites for statistics courses vary; see course descriptions.

- CS 231: Foundational Algorithms (requires MATH 225)
- CS 232: Artificial Intelligence
- CS 234: Data, Analytics, and Visualization
- CS 304: Databases with Web Interfaces
- CS 305: Machine Learning
- CS 313: Computational Biology
- CS 315: Data and Text Mining for the Web
- CS 331: Advanced Algorithms
- CS 333: Natural Language Processing
- STAT 220: Probability
- STAT 221: Statistical Inference
- STAT 228: Multivariate Data Analysis
- STAT/QR 309: Causal Inference
- STAT 320: Introduction to Bayesian Statistical Methods

3. Students will complete an experiential capstone as part of the Data Science major. The capstone must be approved by the data science directors. Students are required to participate in a half-credit capstone course during the senior year and present their capstone projects at a poster session. Details on the capstone requirement can be found on the Data Science Major website.

### Honors

A student may achieve honors by writing a thesis, provided the student’s GPA in major courses above the 100 level meets the college’s requirements. See Academic Distinctions.

### Further information:

For further information—e.g., thesis guidelines—see the Data Science Major website. Students who would like to minor or double-major in a field that is closely tied to data science, such as math, statistics, or computer science, should speak to an advisor about which combinations are recommended, noting that no course can be counted toward two sequences.

Transition from previously approved individual structured major in Data Science: Students entering in Fall 2023 or later will complete the 10.5-unit major in Data Science rather than the previous individual structured major. Any student who entered before Fall 2023 but did not have an individual major proposal approved before Fall 2023 will complete the 10.5-unit major. Students whose proposals for individual structured majors in data science were approved prior to Fall 2023 may either complete their planned individual major sequences or speak to their advisors about shifting to the 10.5-unit major. Changing from an approved individual major to the 10.5-unit major requires the approval of the Data Science Director.