2021-09-15: Data Science High School Summer Camp 2021
Data Science is becoming the lingua franca of the 21st century as there are more data sources today than ever before. To prepare the younger generation for the world of data science and artificial intelligence, Dr. Sampath Jayarathna at Old Dominion University (ODU) in collaboration with the ODU Computer Science (CompSci) Department organized a 10-day data science summer camp for about 14 high school students in Norfolk, Virginia from August 9th to August 20th, 2021, funded by PRA Group Inc. The program was a hybrid of onsite and virtual training sessions.
This was an intensive training program intended to prepare high school students to work with different kinds of data, both structured data and unstructured data such as text and images. The students also learned Python, one of the most popular programming languages for data science. Because the training was more hands-on than theoretical, each session included activities based on the topics covered.
The first two days of the summer camp were conducted on campus. On the first day, Dr. Jayarathna introduced the students to the basics of Python programming, including variables, strings, lists, and functions. In the first half of the second day, the students were introduced to conditional statements in Python. They learned about “if-elif-else” and how to use it to solve real-life problems.
Now @kritika_garg is introducing the conditional statements in #Python to the students at @ODU #DataScience camp 2021 organized by @OpenMaze @NirdsLab. @PRAGroupInc @AjayiKehindep @WebSciDL pic.twitter.com/0PgZ5bR0eQ
— Himarsha R. Jayanetti (@HimarshaJ) August 10, 2021
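As an illustration of the “if-elif-else” pattern mentioned above, here is a minimal sketch. The `grade_letter` function and its score thresholds are hypothetical and are not taken from the camp material.

```python
def grade_letter(score):
    """Map a numeric score (0-100) to a letter grade (hypothetical thresholds)."""
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    elif score >= 70:
        return "C"
    else:
        return "needs more practice"


# Try the function on a few made-up scores
for score in [95, 82, 67]:
    print(score, "->", grade_letter(score))
```

Python checks the conditions from top to bottom and runs only the first branch that matches, which is why the order of the thresholds matters.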
During the second half of the second day, the students were introduced to NumPy (Numerical Python), one of the most fundamental data science libraries. They learned about NumPy arrays and how to use them in data science projects.
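To give a flavor of what working with NumPy arrays looks like, here is a small sketch. The temperature readings are made up for illustration and are not from the camp exercises.

```python
import numpy as np

# Made-up temperature readings in Fahrenheit (illustration only)
temps_f = np.array([68.0, 77.0, 86.0, 95.0])

# Arithmetic applies to every element at once, with no explicit loop
temps_c = (temps_f - 32) * 5 / 9

# A boolean mask keeps only the readings above 28 degrees Celsius
hot_days = temps_c[temps_c > 28]

print(temps_c)         # all readings converted to Celsius
print(hot_days)        # only the warm readings
print(temps_c.mean())  # average temperature in Celsius
```

Element-wise arithmetic and boolean masking like this are two of the array features that make NumPy convenient for data work.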
After the second day, the program continued virtually for five days. It was amazing how the students were still able to grasp these high-level concepts via online meetings, as was evident from their engagement in the class activities. They began the virtual classes with another fundamental data science library, pandas. pandas is a Python data analysis library widely used by data scientists and data analysts for manipulating and analyzing structured data. In this session, the students were taught how to create their own data in table format. They also learned how to load data in different formats, both from their local computer and from online sites such as GitHub.
After the lunch break, @AjayiKehindep continued teaching more neat stuff which can be done using #Pandas.
Day 3 of the camp was virtual and the students made use of piazza to participate in the activities. @oducs @ODUSCI @PRAGroupInc @OpenMaze @NirdsLab @WebSciDL pic.twitter.com/v6h1ABrlru
— Himarsha R. Jayanetti (@HimarshaJ) August 11, 2021
The pandas session concluded with data understanding and data manipulation, in which the students used the skills learned during the session to investigate different kinds of data and answer some business questions.
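The pandas skills described in this session (building a table, loading data, and asking simple questions of it) can be sketched as follows. All names and values here are made up for illustration. The CSV is read from an in-memory string so the example runs offline; `pd.read_csv` also accepts a local file path or a URL to a raw CSV file.

```python
import io

import pandas as pd

# Create a small table by hand (made-up values, illustration only)
students = pd.DataFrame({
    "name": ["Ada", "Grace", "Alan"],
    "grade": [10, 11, 12],
    "hours_coded": [5, 8, 3],
})

# Load CSV data; a file path or URL could be passed instead of StringIO
csv_text = "item,price\nnotebook,2.5\npencil,0.5\n"
items = pd.read_csv(io.StringIO(csv_text))

# Answer simple questions by filtering rows and summarizing columns
upper_grades = students[students["grade"] >= 11]
total_price = items["price"].sum()

print(upper_grades)
print("Total price:", total_price)
```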
On day 4 of week 1, the students were introduced to data visualization using seaborn. Seaborn is a Python data visualization library based on matplotlib that provides a high-level interface for drawing attractive and informative statistical graphics. In my opinion, data visualization is one of the most sought-after skills among data analysis enthusiasts, because it helps uncover hidden insights in a given dataset.
In this session, the students learned how to create various visualizations with structured data. They learned how to create visualizations for univariate data, such as bar graphs for a single categorical variable and histograms for a single quantitative variable. They also learned how to create scatter plots for two numeric variables to study the relationship between them. This session gave the students valuable insight into how to use visualization to tell stories and share insights about the data.
I'm excited to teach #dataviz using the #seaborn library to the high school students at @oducs Data Science Camp 2021!
Thank you for the opportunity, @OpenMaze @PRAGroupInc @NirdsLab @WebSciDL pic.twitter.com/bWL7oanM3X
— Himarsha R. Jayanetti (@HimarshaJ) August 12, 2021
The students had a very interesting final day of week 1. Day 5 started with a talk by the invited speaker Dr. Wu of WS-DL about Natural Language Processing and "Training Computers to Understand Humans".
Day 5, Session 1 @oducs @ODUSCI Data Science Summer Camp 2021. Dr. Wu @fanchyna is talking about interesting concepts in text processing. @OpenMaze @NirdsLab @PRAGroupInc @WebSciDL pic.twitter.com/VSiIuIoCYI
— Himarsha R. Jayanetti (@HimarshaJ) August 13, 2021
After lunch, the students had the opportunity to listen to an inspiring talk by another invited speaker, Dr. Meghan Chandarana, about her journey to becoming a NASA engineer. The high school students were also introduced to data storage topics and to LaTeX using Overleaf on the same day.
Thank you @meghanch23 for #inspiring the students by sharing your journey of becoming a @NASA engineer.
Day 5 of @ODU #DataScience Summer Camp 2021 by PRAGroupInc and @oducs.@NirdsLab @OpenMaze @WebSciDL pic.twitter.com/COqRMFNMGi
— kritika garg (@kritika_garg) August 13, 2021
The second week of the summer camp started with the topic of data wrangling by Dr. Jayarathna. In this session, he introduced how to perform data cleaning with Python pandas. He also introduced the concept of machine learning and the various processes involved in building a simple machine learning model.
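For readers curious about what data cleaning with pandas can look like, here is a minimal sketch of a few common steps. The table, column names, and the choice to fill missing values with the median are assumptions made for illustration, not the exact steps taught in the session.

```python
import pandas as pd

# Made-up messy table: a duplicate row, missing values, numbers stored as text
raw = pd.DataFrame({
    "county": ["A", "B", "B", "C", None],
    "cases": ["10", "25", "25", None, "7"],
})

clean = (
    raw.drop_duplicates()                 # remove the repeated "B" row
       .dropna(subset=["county"])         # drop rows with no county name
       .assign(cases=lambda d: pd.to_numeric(d["cases"]))  # text -> numbers
)

# Fill the remaining missing case count with the median (one possible choice)
clean["cases"] = clean["cases"].fillna(clean["cases"].median())

print(clean)
```

Whether to drop or fill missing values depends on the dataset and the question being asked; the median is used here only as an example.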
.@OpenMaze teaching the importance of cleaning and structuring your data to high schoolers.#DataWrangling in week 2 of @ODU #DataScience #camp 2021!@PRAGroupInc @oducs @NirdsLab @WebSciDL pic.twitter.com/BIYudwqdDa
— kritika garg (@kritika_garg) August 16, 2021
During the second week, the students were also taught how to use Weka (machine learning software) to build machine learning models and design workflows using different kinds of data. There were also research presentations on eye tracking by PhD students in the Web Science and Digital Libraries group (WS-DL) at ODU: Yasith Jayawardana, Gavindya, and Bhanuka Mahanama.
The students were also introduced to EEG recording devices by @yasithmilinda. The high schoolers were amazed by how the brain signals were detected by those high end research devices. @NirdsLab @OpenMaze @PRAGroupInc @Gavindya2 @mahanama94 @WebSciDL @ODUSCI @oducs pic.twitter.com/3qbhXEpy6w
— Himarsha R. Jayanetti (@HimarshaJ) August 18, 2021
The second week also featured a talk by Dr. Sawood Alam, an alumnus of the Web Science and Digital Libraries group (WS-DL) at Old Dominion University (ODU). During his presentation, he shared his experiences as a Web and Data Scientist at the Internet Archive.
Very excited to welcome, a proud product of @oducs @WebSciDL, Dr. Sawood Alam @ibnesayeed, currently a Data Scientist at @internetarchive @waybackmachine talking to our ODU Summer Camp students about Web & Data Science at IA. Thank you! @PRAGroupInc @ODUSCI https://t.co/gN6jf1JyA9
— Sampath Jayarathna (@OpenMaze) August 19, 2021
In addition to the regular lectures, students were teamed up to work on a project using the Google Colab environment. On Day 4 of week 2, they were given a list of 15 topics and datasets from which to choose and design a project. Over the last two days, they applied the coding and data science skills acquired during the camp to experiment with the chosen data. On the final day, the students presented their projects in nine teams.
Team 1: Trending YouTube Videos
Team 2: Predicting Prices Of Bitcoin
Team 3: Analysis Of Covid Cases By Counties
Team 4: Analysis Of Covid Cases By Counties
Team 5: Analysis Of Covid Cases By Counties
Team 6: A virus with no boundary: Visualizing Covid-19's impact on the World
Team 7: Analysis Of Covid Cases By Counties
Team 8: White Wine Samples
Team 9: Covid Death and Summarization
The students favored topics based on “Covid-19”, “YouTube videos”, and “wine samples”. Students enjoyed creating data visualizations, such as bar plots showing the view counts of trending YouTube channels and Covid cases per county. Some students even used advanced concepts like text analysis and Web scraping. It was impressive to see the growth in coding skills and confidence of these high school students. Some students went from writing their first “hello world” program in Python to experimenting with datasets and creating visualizations.
.@oducs #DataScience Camp ‘21 final day. It’s time for the students to showcase what they learned over the 2 weeks.
We are making this an opportunity to thank @PRAGroupInc who generously funded this project. @OpenMaze @NirdsLab @kritika_garg @AjayiKehindep @WebSciDL @ODUSCI pic.twitter.com/mp5V0nTqMO
— Himarsha R. Jayanetti (@HimarshaJ) August 20, 2021
By: Ajayi Kehinde Peter (@AjayiKehindep), Himarsha Jayanetti (@HimarshaJ), & Kritika Garg (@kritika_garg)
ODU College of Science News article of this work is also available at: "Norfolk High School Students Unlock Potential in Data Science Summer Camp"