Pitfalls

The Problem

In the age of AI, we need to “change course” and update data science education so that it teaches students to think critically rather than to apply technical knowledge mechanically. Too often, data science education focuses solely on technical skills such as statistics and programming. This narrow perspective overlooks the core skills that make a competent data scientist: creative problem-solving, a scientific mindset, and critical thinking. Data scientists who mindlessly apply their technical skills are basically chatbots with a salary. Moreover, when critical thinking is absent from the development and release of algorithms, there is potential for ethical harm.

The Solution

This course is designed to tackle these concerns head-on. By incorporating lessons from the corporate world and other real-world examples, we aim to cultivate students’ critical thinking skills and raise awareness of the ethical implications of algorithm development. The course will include hands-on exercises in which students learn to analyze data using SQL with the help of ChatGPT. In these exercises, students will be led into cognitive or statistical pitfalls. Once encountered, each pitfall will be analyzed in an in-depth lecture, and students will get a second chance at the analysis with newfound awareness and caution. Pitfalls to be explored include regression toward the mean, selecting on the dependent variable, using bad data, confusing correlation with causation, and the potential for doing harm.
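To make the first pitfall on this list concrete, here is a minimal Python sketch of regression toward the mean. It is not taken from the course materials, which use SQL. The function name, the score model (a fixed skill plus independent noise on each test), and every parameter value are assumptions chosen purely for illustration.

```python
import random
import statistics


def simulate_regression_to_mean(n_students=10_000, top_fraction=0.1, seed=0):
    """Hypothetical simulation: two noisy test scores per student."""
    rng = random.Random(seed)

    # Assumed model: each student has a fixed skill, and each test score
    # is that skill plus independent random noise.
    skill = [rng.gauss(70, 10) for _ in range(n_students)]
    test1 = [s + rng.gauss(0, 10) for s in skill]
    test2 = [s + rng.gauss(0, 10) for s in skill]

    # Select the top scorers using the first test only.
    n_top = int(n_students * top_fraction)
    cutoff = sorted(test1, reverse=True)[n_top - 1]
    top = [i for i in range(n_students) if test1[i] >= cutoff]

    return {
        "overall_mean": statistics.mean(test1),
        "top_group_test1": statistics.mean(test1[i] for i in top),
        "top_group_test2": statistics.mean(test2[i] for i in top),
    }


result = simulate_regression_to_mean()
for name, value in result.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions, the group selected for high scores on the first test scores lower on average on the second test, while still scoring above the overall mean. Nothing about the students changed between the two tests. An analyst who overlooks this could mistake the drop for a real decline in performance.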

The Impact

Through this course, students will acquire basic SQL skills, learn to work effectively with ChatGPT, and come to understand that pitfalls cannot be avoided by blindly applying information from chatbots. Students will learn the importance of formulating the right questions and the necessity of critical thinking, since data will not “speak for itself” and interpretations are not always straightforward. They will also discover the importance of skepticism and the dangers of manipulating data to support preconceived notions. As they become adept experimenters, they will learn to recognize the signs of lurking variables when randomized controls are not used. By instilling these principles, this course aims to put the ‘science’ into data science, equipping students with the skills to avoid pitfalls and solve real-world data problems. Students can grow into data scientists who are not only technically adept but also ethically conscious and critical in their thinking.

We explain here how these materials will differ from existing ones.

Becoming a Data Scientist in the Age of AI: Developing Critical Skills Beyond Chatbots

I. Course Materials

5. Evaluation

II. Framework: How To Do This Right

XX

III. Pitfalls: What Can (and Will!) Go Wrong

XX

IV. Chatbots & SQL

1. SQL Basics 1 – Tables and fields, data types, SELECT, FROM, WHERE, LIKE, ORDER BY, subqueries & Common Table Expressions, creating and populating tables

2. SQL Basics 2 – GROUP BY, HAVING, aggregate functions (AVG, SUM, COUNT), calculations and manipulations, commenting, aliases, singletons, string functions

3. SQL Basics 3 – Implicit GROUP BY, COUNT, COUNT DISTINCT, more aggregations, column numbers, grouping with CASE statements, JOINS, UNIONS

4. SQL Basics 4 – More about JOINS, COALESCE

5. SQL Basics 5 – Case Study: The challenges of determining the NBA GOAT

6. SQL Basics 6 – Learning mindset vs. analytic mindset, uses for subqueries, issues with joins to watch out for, relationships between tables
