Thursday, May 7, 2015

2015-05-07: Teaching Undergraduate Computer Science Using GitHub and Docker

Mat Kelly taught CS418 - Web Programming at Old Dominion University in Spring 2015. This blog post highlights some teaching methods and technologies used (namely, Docker and GitHub) and how he integrated their usage into the flow of the course.

For Spring Semester at Old Dominion University I taught CS418 - Web Programming with some updated methods and content. This course has been previously taught by various members of ODU WS-DL (2014, 2013, 2012).

The first deviation from previous offerings of the course was to change the subject of the project. Previously, CS418 students were asked to progressively build an online forum like phpBB. Web sites resembling this medium are no longer as common as they once were on the Web, so a refresh was needed to keep the project familiar and relevant.

For Spring, I asked students to build a Question-and-Answer website. Being students of computer science, all were familiar with the contemporary model of online discussions and soliciting help from others experienced in an area (e.g., computer programming).

We followed an initial coursework flow with lectures about Web Fundamentals, followed by more technical lectures on PHP, MySQL, JavaScript, and an HTML/CSS Primer for those students who had programmed but never created a web page. The lectures were old news for some students, who were already employed (CS418 is a senior-level course), and completely new for others, who had programmed but never for the Web.

The delivery of the project is an aspect that made this semester's course unique. In a preliminary assignment very early in the semester, I required each student to:

  1. Fork the class GitHub repository
  2. Pull a working copy to their system
  3. Add a single file to the repository
  4. Commit the change
  5. Submit a pull request to the class repository
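The local half of this workflow (steps 2 through 4, plus the push that must happen before a pull request can be opened) can be sketched as a list of Git commands. This is only an illustration: the function name, fork URL, directory, and file name below are hypothetical, and forking (step 1) and opening the pull request (step 5) are assumed to be done through the GitHub web interface.

```python
from typing import List


def preliminary_assignment_commands(
    fork_url: str, workdir: str, filename: str
) -> List[List[str]]:
    """Return the git commands for the local part of the assignment.

    All arguments are placeholders supplied by the student; nothing here
    is specific to the actual class repository.
    """
    return [
        # Step 2: pull a working copy of the student's fork.
        ["git", "clone", fork_url, workdir],
        # Step 3: stage the single new file (it must already exist in workdir).
        ["git", "-C", workdir, "add", filename],
        # Step 4: commit the change.
        ["git", "-C", workdir, "commit", "-m", f"Add {filename}"],
        # Publish the commit to the fork so a pull request can be opened.
        ["git", "-C", workdir, "push", "origin", "HEAD"],
    ]


if __name__ == "__main__":
    for cmd in preliminary_assignment_commands(
        "https://github.com/example-student/class-repo.git",  # hypothetical fork
        "class-repo",
        "example-student.txt",
    ):
        print(" ".join(cmd))
```

Each inner list could be passed to `subprocess.run(cmd, check=True)` to actually execute the step.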

This ensured a baseline knowledge of version control and also required each student to provide, in the single file submitted, a reference to the repository for their class project. A student's project repository was separate from their fork of the class repository.

GitHub inherently facilitates sharing of source code - an aspect that I did not particularly want to encourage with the individual students' projects. The GitHub Student Developer Pack provided a solution for this. Each student who contacted GitHub and provided proof of being a student was supplied a small number of private repositories, which would normally require a monthly fee. The program also offers many other benefits free of charge to students, such as credit on a cloud hosting platform, a free domain name on the .me TLD, and private builds from one of the more popular continuous integration services.

Along with submitting a pull request, I also asked the students to add me as a "collaborator" on GitHub for the repository they each specified, allowing access for grading whether or not the student decided to take advantage of the Student Developer Pack.

As the students began to build features for each of the four milestone requirements in the course, I reiterated that they would be graded on whatever was checked into their GitHub repository come demo day. This circumvented "my computer crashed", "the dog ate my homework", and similar excuses, but introduced the issue of "I must have forgot to check my updated code in". To remedy this, but mainly to allow students to verify that their code would work as expected on demo day, I put together a demo day deployment system using Docker.

Docker allows easy, systematic deployment of software that is sandboxed from a host system yet extensible to communicate between multiple instances ("containers" in Docker jargon). Using Docker allowed a student to iteratively test the code they had checked into their GitHub repository from the comfort of their home while instilling confidence in the correctness of the features they had implemented thus far. While previous offerings of the class provided students with a Virtual Machine (VM) on which to develop their project, I opted to use Docker instead, as it provides an isolated environment for each student with a freshly installed OS each time their code is deployed. Docker also allowed the packages and libraries needed by the students for production to be parameterized. A downside to using Docker over a VM is the students' reliance on a central server for deployment. However, this "benefit" of VMs does not guarantee the consistency of presentation for demo day, as a local VM might be configured differently than the demo day machine.

Our Docker Deployment System was hosted on a server at ODU but was accessible to the world. Each student was supplied a unique port number, allowing students to simultaneously use the system without fear of clashing with other students testing. The system evolved as the semester drew on and I continuously developed it. Using the system is fairly easy and intuitive.

The student first enters their ODU CS username in a text field.

The Docker Deployment System dynamically queries the class GitHub repository to link the CS username to a student's GitHub repository, as previously submitted. This cross-reference prevented abuse by GitHub users who were not registered for the class and required students to complete the preliminary assignment as a prerequisite for demo day (i.e., submission of assignments).
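The lookup can be illustrated with a minimal sketch. The file layout is an assumption made for the example: each student's submitted file is taken to be named `<cs_username>.txt` and to contain the project repository URL on its first non-empty line. The post does not state the real format, and the function name and the `submissions` mapping (standing in for file contents fetched from the class repository) are hypothetical.

```python
from typing import Dict, Optional


def lookup_project_repo(
    cs_username: str, submissions: Dict[str, str]
) -> Optional[str]:
    """Map a CS username to the project repository the student submitted.

    `submissions` maps file names in the class repository to their text
    content. Returns None when the user never submitted a file, which is
    how unregistered users would be turned away in this sketch.
    """
    content = submissions.get(f"{cs_username}.txt")  # assumed naming scheme
    if content is None:
        return None
    # Take the first non-empty line as the repository reference.
    lines = [line.strip() for line in content.splitlines() if line.strip()]
    return lines[0] if lines else None


if __name__ == "__main__":
    files = {"jdoe.txt": "https://github.com/jdoe/cs418-project\n"}
    print(lookup_project_repo("jdoe", files))
    print(lookup_project_repo("stranger", files))
```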

The user can then authenticate with GitHub by clicking a button, which brings up the login dialog on the GitHub website.

Upon successful login, the user is returned to the Docker Deployment System interface with the same button now reading "Dockerize my code". Selecting this button invokes a server-side scripted process.

Brief messages are shown to the user to indicate the script process that is being followed on the server. In sequence, the script:

  1. Deletes any old remnants from previous deployments by the student
  2. Clones the user's repository using Git and the GitHub API access token, obtained from the user logging in (this is critical if the user's project repository is private)
  3. Kills previously deployed Docker instances spawned by the user
  4. Removes the previously deployed instances to ensure a fresh copy is used by Docker
  5. Fires up a new container in the Docker Deployment System using the student's latest code (from the above repo clone)
  6. Provides HTML links to the student to test their code.

Docker containers are defined in a Dockerfile, a standard format that references the base OS and any packages required for the container. For the students' deployment, I used Ubuntu as the base along with Apache, PHP, and MySQL in the CS418 Dockerfile. A directive in the Dockerfile also provides the hook to allow the directory containing the student's code to be used as the default "website" for Apache. The students provided a MySQL database dump in the root of their project repository, which is loaded when the container for their project is instantiated.
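To give a sense of what such a Dockerfile contains, here is a small sketch that renders one from parameters. It is not the CS418 Dockerfile: the Ubuntu tag, the package names, the use of `VOLUME` as the hook for the student's code, and the document root are all assumptions chosen for illustration. Starting the services and loading the database dump are deliberately left out.

```python
from typing import Sequence


def render_dockerfile(
    base_image: str = "ubuntu:14.04",  # assumed tag; the post only says "Ubuntu"
    packages: Sequence[str] = (        # assumed package names for Apache/PHP/MySQL
        "apache2", "php5", "libapache2-mod-php5", "php5-mysql", "mysql-server",
    ),
    docroot: str = "/var/www/html",    # assumed Apache document root
) -> str:
    """Return Dockerfile text for a base OS plus a parameterized package list."""
    lines = [
        f"FROM {base_image}",
        # Install the requested packages without interactive prompts.
        "RUN apt-get update && DEBIAN_FRONTEND=noninteractive "
        "apt-get install -y " + " ".join(packages),
        # Mount point where the student's code would be attached.
        f"VOLUME {docroot}",
        "EXPOSE 80",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dockerfile())
```

Keeping the package list as a parameter mirrors the point made earlier that the packages and libraries needed for production could be parameterized.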

For the most part, the initial bumps for the students to effectively use the system were overcome. Students reiterated throughout the semester that the tool was extremely useful in testing their code and ensuring that nothing unexpected would occur on demo day.

In summary, the Docker Deployment System developed for the Spring 2015 session of CS418 Web Programming at Old Dominion University, together with the required submission of coursework via GitHub, allowed students to gain experience with tools and iterative testing that previous models of verifying code submissions (e.g., "magic" laptops and e-mailing code, respectively) could not effectively facilitate. The project-based nature of CS418 made it an appropriate testbed for developing both the system and the workflow. In the future, I hope to reuse the system and workflow to teach a less technically driven course to evaluate the portability of the methods.

Special thanks to Sawood Alam (@ibnesayeed) for his technical assistance in working with Docker throughout the semester and Minhao Dong for being the ODUCS access point to ensure that students' project deployment did not compromise the university network.

Mat (@machawk1)
