The University of California, Riverside, offers many opportunities to all types of students. We host several clubs and organizations that are open to anyone willing to participate. With the Data Science major growing quickly at UC Riverside, the Data Science Academy stands at the forefront of change, giving students a window into many aspects of Data Science.
Hosted at the University of California, Riverside, the Data Science Academy was started by Professor Paea LePendu. Our aim is to introduce one of the most popular and insightful fields in Computer Science to K-12 students.
At the Data Science Academy, students meet current Computer Science university students and professors. Each day, students get a combination of interactive lessons and activities on different topics in computer science. Topics we cover include cryptography and data analysis. By the end of the program, students will have a project that they can present in the showcase. These projects range from chatbots to word clouds and ASCII art.

In previous cohorts, the Data Science Academy took place at the host school. Professor LePendu and volunteers traveled to the school and gave students lessons on Data Science.
Since we provide Data Science instruction to K-12 students, we have visited several high schools, such as Ramona High. In 2020, amid the pandemic, we hosted a cohort online. More than 100 students from several different high schools and middle schools participated in this cohort.
We use the Google Colab platform to program in Python. We have found that students greatly enjoy this platform because it is both intuitive and easy to access. Colab also provides several built-in tools that allow us to teach students about Data Science without having to install many programs and packages.
Finally, throughout the program, students are encouraged to fill out surveys about their thoughts on the program. We use these surveys in several ways: we perform data analysis on them, use the data to teach students, and draw on them for our own insight into how to improve the program.
The Data Science Academy focuses on a variety of issues. First and foremost, we aim to give prospective college students a view into how Data Science is "done" in the real world.
To give these K-12 students a view into the Data Science profession, we implement a variety of strategies. (1) We host lessons in Data Science topics (view the lessons header to learn more). (2) We host activities for fun. These activities are meant not only to motivate students about Computer Science and Data Science, since some of them are implementations of the concepts we are learning, but also to give students a break from the high-intensity learning we do in our main sessions. (3) We host a showcase for students to demonstrate what they have learned throughout the cohort. The purpose of the showcase is threefold. First, it helps students get motivated about Data Science and Computer Science. Second, it enables students to demonstrate what they have learned throughout the cohort, since they present their projects in front of everyone. Finally, it allows students to practice their public speaking and presentation skills.

Over the years, our lessons have changed and adapted to the times and the context of society. In 2020, our Data Science Academy had an emphasis on homelessness, in view of the growing homeless population in Riverside, CA. In our current cohort, we want to emphasize populations that are underrepresented among Data Science and Computer Science students. We are also introducing a new ethics module to teach students how large a part ethics plays in Data Science.
Our current lesson modules include the following:
 
 This lesson aims to teach students about conditionals, variables, loops, and functions. Python is one of the most popular programming languages that Data Scientists use to analyze and visualize data, which is why we use it as our programming language. Additionally, this lesson uses encryption to reinforce these concepts, since basic encryption methods rely on the same building blocks (conditionals, variables, loops, functions). It is useful for students to have an understanding of Python before jumping into Data Science.
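As an illustration of how a basic encryption method exercises all four concepts, here is a minimal sketch of a Caesar cipher. The cipher choice and the function name `caesar_shift` are assumptions made for this example; they are not taken from the Academy's lesson materials.

```python
def caesar_shift(text, shift):
    """Shift each English letter by `shift` places, wrapping around the alphabet.

    Hypothetical helper for illustration; not the Academy's actual lesson code.
    """
    result = ""                          # variable that accumulates the output
    for ch in text:                      # loop over every character
        if "a" <= ch <= "z":             # conditional: lowercase letter
            base = ord("a")
        elif "A" <= ch <= "Z":           # conditional: uppercase letter
            base = ord("A")
        else:                            # anything else is copied unchanged
            result += ch
            continue
        # Map the letter to 0..25, shift it, wrap with modulo, map back.
        result += chr((ord(ch) - base + shift) % 26 + base)
    return result


message = "Hello, World!"
secret = caesar_shift(message, 3)
print(secret)                    # Khoor, Zruog!
print(caesar_shift(secret, -3))  # Hello, World!
```

Decryption reuses the same function with a negative shift, which shows why wrapping a piece of logic in a function is useful.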
 
 In this module, we focus on the main Data Science topics that students must know coming out of the program. First, we touch on the Pandas package, which is a table-handling package for Python. Using the same library, we cover matrices and vectors. Data cleaning, which plays an incredibly important role in proper Data Science operations, is also taught in this module. Finally, we cover ethics in Data Science (and Computer Science at large).
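To make the data cleaning step concrete, the following is a small sketch using Pandas. The table, its column names, and the helper `clean` are invented for illustration, and filling missing values with the column mean is only one possible choice, not necessarily the one taught in the module.

```python
import pandas as pd

# Made-up example data: one duplicated row and two missing values.
raw = pd.DataFrame({
    "student": ["Ana", "Ben", "Ben", "Cy", "Dee"],
    "grade": ["9", "10", "10", None, "11"],
    "hours_coding": [2.0, 5.5, 5.5, 1.0, None],
})


def clean(df):
    """Hypothetical cleaning routine for the example table above."""
    cleaned = df.drop_duplicates()                  # remove exact duplicate rows
    cleaned = cleaned.dropna(subset=["grade"])      # drop rows with no grade
    cleaned = cleaned.assign(grade=cleaned["grade"].astype(int))  # text -> integer
    # Fill missing hours with the mean of the remaining rows (an assumption).
    mean_hours = cleaned["hours_coding"].mean()
    cleaned = cleaned.fillna({"hours_coding": mean_hours})
    return cleaned.reset_index(drop=True)


tidy = clean(raw)
print(tidy)
```

Each line of `clean` handles one common problem (duplicates, missing values, wrong data types), so students can run the steps one at a time in a Colab cell and inspect the table after each.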
 
 A common method of analyzing textual data (for example, books or articles) is to create word clouds. Examples of word clouds can be found here. Because we aim to give students several ways to visualize data, we do not want to focus merely on numerical data. Word clouds are therefore an important concept to know, and they play a critical role in presenting data.
We first introduce the concept of word clouds and why they are important. We then show students several Python packages they will need when creating basic word clouds. Finally, to give students some time to create their projects and have fun, we ask them to improve their word clouds with colors, different shapes, and different silhouette images.
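Underneath any word cloud is a table of word counts: the more often a word appears, the larger it is drawn. The sketch below shows only that counting step, using the Python standard library rather than a plotting package. The stop-word list, the sample sentence, and the function name `word_frequencies` are assumptions made for this example.

```python
import re
from collections import Counter

# Small, hand-picked list of common words to ignore (an assumption).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "into", "is", "it"}


def word_frequencies(text, top_n=5):
    """Return the `top_n` most frequent non-stop words as (word, count) pairs."""
    words = re.findall(r"[a-z']+", text.lower())    # split text into lowercase words
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)


sample = "Data is fun. Data science turns data into stories, and stories into insight."
top_words = word_frequencies(sample, top_n=2)

# Text-only preview: one '#' per occurrence stands in for font size.
for word, count in top_words:
    print(f"{word:<10}{'#' * count}")
```

A word cloud package would take counts like these and turn them into font sizes, colors, and shapes, which is where students can start customizing their projects.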
We host a variety of activities in the Data Science Academy. The purpose of these activities is to entertain students before and after each lesson. We have found that these activities help students ease into the program. For instance, the tour of the campus gives students a well-deserved break from the intensity of the main lesson, while also introducing them to life at UC Riverside.



