Introduction to Data Science
(Same as Electrical and Computer Engineering M148.) Lecture, four hours; discussion, two hours; outside study, six hours. Requisites: course 31 or Program in Computing 10A, and 10B, and one course from Civil and Environmental Engineering 110, Electrical and Computer Engineering 131A, Mathematics 170A, Mathematics 170E, or Statistics 100A. How to analyze data arising in real world so as to understand corresponding phenomenon. Covers topics in machine learning, data analytics, and statistical modeling classically employed for prediction. Comprehensive, hands-on overview of data science domain by blending theoretical and practical instruction. Data science lifecycle: data selection and cleaning, feature engineering, model selection, and prediction methodologies. Letter grading.
Review Summary
- Clarity: 10.0 / 10
- Organization: 10.0 / 10
- Time: 5-10 hrs/week
- Overall: 10.0 / 10
Reviews
Prof. Mirzasoleiman is a very nice person and a great professor. She is always calm. I really enjoyed her lectures. She was also always available to help students and answer their questions. She is clearly an expert in this area and she enjoys explaining it to students.
There are two separate things to discuss in this review, the professor and the course content.
Mirzasoleiman seems to care a lot about student learning, but that's the only redeeming quality for this dogshit class. The lectures are slow, bordering on boring. Mirzasoleiman spends half of each lecture recapping the previous lecture's content, so you only learn about half of what could be included in this course.
The exams and homework assignments are something else entirely. The exams were utter dogshit. Questions were written poorly and the grading was utterly pedantic. If you actually understand anything about data science, you should just forget it for this class titled "Data Science Fundamentals," because you can't assume that the graders know anything about data science. If you don't write exactly what the official solution says, word for word (and I actually mean word for word), then expect to get severely penalized. That's not to mention that the official solutions were sometimes just outright wrong for significant parts of the final, or made leaps of logic that are utterly unjustifiable.
Project 3 this year was ludicrous. It was assigned way too early, before half of the requisite information was even taught in lecture, and the task was actually preposterous. We were given a dataset with a train/test split that had a 20% swing in class balance between train and test, and we were expected to train a classifier that performed well on the test set. One of the fundamental assumptions of data science is identically distributed examples, which was clearly violated and made the problem almost impossible to actually solve.
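A mismatch in class balance like the one described above can be checked by comparing label proportions across the two splits. The following is a minimal sketch, not the project's code or data: the function names `class_balance` and `balance_shift` are hypothetical, and the toy labels are invented so that the positive class makes up 70% of the training labels and 50% of the test labels.

```python
from collections import Counter


def class_balance(labels):
    """Return the fraction of examples belonging to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


def balance_shift(train_labels, test_labels):
    """Return (test fraction - train fraction) for every class in either split."""
    train = class_balance(train_labels)
    test = class_balance(test_labels)
    classes = sorted(set(train) | set(test))
    return {cls: test.get(cls, 0.0) - train.get(cls, 0.0) for cls in classes}


# Toy labels invented for illustration only (assumption, not the real dataset).
train_labels = [1] * 7 + [0] * 3  # 70% positive
test_labels = [1] * 5 + [0] * 5   # 50% positive

shift = balance_shift(train_labels, test_labels)
for cls, delta in shift.items():
    # A large absolute value means the splits are not identically distributed
    # with respect to the label.
    print(f"class {cls}: {delta:+.0%}")
```

A large per-class difference suggests the label distribution changed between the splits, in which case accuracy measured on the training distribution may not carry over to the test set.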
The course content is an entirely separate issue. Most of the course is the exact same content as CS M146 Machine Learning, taught at half the speed and with a quarter of the mathematical depth. The only content unique to this class is interpretation of coefficients, which could probably be taught in 2 lectures at most. There's no reason to take both classes (except that the Data Science Engineering minor requires both), and the department doesn't seem to care that it's wasting students' time by teaching the same class twice.
It's hard to say anything good about this class. It's poorly taught, poorly graded, poorly structured, and poorly conceived. It is a pale imitation of what a data science curriculum should be, even ignoring all the problems with course logistics. The only reason to take it is that it's a required minor course. Otherwise, take CS M146 instead.
Course
Grading Information
- No group projects
- Attendance not required
- No midterms
- Finals week final
- 50% recommend the textbook
Previous Grades
Grade distributions not available.