Saturday, December 17, 2011

2011-12-15: 2011 NFL Season Week 15

So far this year all three of the prediction algorithms are 68% correct straight up. This is better than the predictions of most of the NFL "experts" such as the guys at ESPN. Last year we also ended up just below 70% correct. Breaking the 70% barrier over a full season seems to be rather hard to do, as seen on the Prediction Tracker. Looking into the statistics of the games we predicted incorrectly reveals some interesting information: in the majority of those games, the losing team had the better box score but still lost. We had thought that incorporating the betting line data this year would have an impact, but the accuracy of the straight-up predictions is not significantly better than last year.

The season isn't over yet and anything can happen so here are the predictions for week 15.


Favorite  Spread  Underdog  Discrete  Pagerank
DAL       7       at TB     DAL       DAL
at NYG    10      WAS       NYG       NYG
GB        9       at KC     GB        GB
NO        9       at MIN    NO        NO
at CHI    3       SEA       CHI       SEA
at BUF    2       MIA       BUF       BUF
at HOU    4       CAR       HOU       HOU
TEN       4       at IND    TEN       TEN
CIN       7       at STL    CIN       CIN
at OAK    4       DET       OAK       OAK
NE        9       at DEN    NE        NE
at PHI    3       NYJ       PHI       NYJ
at ARI    3       CLE       ARI       ARI
at SD     4       BAL       BAL       BAL
PIT       2       at SF     PIT       PIT


-- Greg Szalkowski

Wednesday, December 14, 2011

2011-12-14 Python & Memento Presentation for the ODU ACM

Earlier this semester, I was invited to present Python at an ODU ACM meeting. I gave a brief overview of the Python language and followed up with a walkthrough of the code I use to parse Memento timemaps in my current research.

Python, of course, has advantages and disadvantages compared to other languages. Since most ODU undergrads have experience with C++, the presentation describes Python relative to C++. Python's advantages include a fast development cycle and an extensive collection of community libraries. Its primary disadvantage compared to C++ is execution speed. My experience is that Python is sometimes over 100 times slower.

Python's basic syntax and semantics are straightforward, so the presentation focused on the Python equivalents of commonly used C++ constructs and the differences between static (C++) and dynamic (Python) typing. Python's implementation of high-level data types (lists, dictionaries, tuples, and sets) and functional code were compared to the complexity of the C++ equivalents.
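As a rough illustration of the data types and functional style mentioned above (this is not taken from the presentation slides; the word list is made-up sample data), a few lines of Python cover what would take container declarations and explicit types in C++:

```python
# Made-up sample data, for illustration only.
words = ["memento", "timemap", "memento", "uri", "timemap", "memento"]

# list: ordered and mutable; no element type or size is declared
lengths = [len(w) for w in words]

# dict: key -> value mapping (roughly std::map or std::unordered_map in C++)
counts = {}
for w in words:
    counts[w] = counts.get(w, 0) + 1

# tuple: small immutable record (roughly std::pair or std::tuple)
most_common = max(counts.items(), key=lambda kv: kv[1])

# set: unique elements only (roughly std::set)
unique = set(words)

# functional style: filter with an anonymous function
long_words = list(filter(lambda w: len(w) > 3, words))

# dynamic typing: the same name can be rebound to a value of another type
value = 42
value = "forty-two"
```

The C++ equivalent of `counts` alone would need an include, a fully typed container declaration, and an explicit loop, which is the kind of contrast the talk drew.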



To bring all the pieces together, I did a walkthrough of the python.py module I use to parse Memento timemaps (see the Memento Introduction and Internet Draft for more information). The module has two classes. The TimeMap class is a parser and dictionary for timemap data. The TimeMapTokenizer class is a tokenizer for link-style timemaps.

To load a timemap, a new instance of TimeMap is created using the timemap's URI, which is the constructor's only argument. A TimeMapTokenizer instance returns individual tokens, simplifying the parsing code in the get_next_link function. TimeMap implements the __getitem__ method, allowing it to act as a Python dictionary. TimeMapTokenizer implements the __iter__ and next methods, which allow the use of Python iteration constructs over the list of tokens.
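The following is a minimal sketch of that two-class design, not the actual module. It assumes Python 3 (where the iterator method is spelled __next__ rather than next), takes the timemap text directly instead of fetching a URI, and indexes links by their rel values; the helper name _parse_links, the dictionary keys, and the sample timemap are all hypothetical.

```python
import re


class TimeMapTokenizer:
    """Iterates over the tokens of a link-style timemap string."""

    # A token is a <URI>, a quoted string, a bare word, or one delimiter.
    _TOKEN = re.compile(r'<[^>]*>|"[^"]*"|[^\s,;="<>]+|[,;=]')

    def __init__(self, text):
        self._tokens = self._TOKEN.findall(text)
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):  # named `next` in Python 2
        if self._pos >= len(self._tokens):
            raise StopIteration
        token = self._tokens[self._pos]
        self._pos += 1
        return token


class TimeMap:
    """Parses timemap text and exposes the links by rel, dictionary-style."""

    def __init__(self, text):
        self._by_rel = {}
        for link in self._parse_links(TimeMapTokenizer(text)):
            # A rel attribute may hold several values, e.g. "first memento".
            for rel in link.get("rel", "").split():
                self._by_rel.setdefault(rel, []).append(link)

    @staticmethod
    def _parse_links(tokens):
        links, name = [], None
        for tok in tokens:
            if tok.startswith("<"):        # a new link begins with its URI
                links.append({"uri": tok[1:-1]})
            elif tok.startswith('"'):      # quoted value for the pending name
                if links and name is not None:
                    links[-1][name] = tok[1:-1]
                name = None
            elif tok not in (",", ";", "="):
                name = tok                 # attribute name such as rel
        return links

    def __getitem__(self, rel):
        return self._by_rel[rel]


# Hypothetical two-link timemap used only to exercise the sketch.
SAMPLE = (
    '<http://example.com/>;rel="original",'
    '<http://archive.example.org/20111201000000/http://example.com/>'
    ';rel="first memento";datetime="Thu, 01 Dec 2011 00:00:00 GMT"'
)
timemap = TimeMap(SAMPLE)
first = timemap["memento"][0]
```

Because the tokenizer is an iterator, the parser can consume it with an ordinary for loop, and because TimeMap defines __getitem__, callers can write timemap["memento"] as they would with a dictionary.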

— Scott G. Ainsworth

2011-12-14: CS 495/595 Web Server Development for Spring 2012

The only WS-DL related class that will be offered in spring 2012 is CS 495/595 "Web Server Development". I had planned to offer CS 751/851 "Introduction to Digital Libraries", but I've taught that the last two springs and it has been a while since I've taught the web server development class (the last offering was actually from Martin Klein in spring 2010).

The premise of this course is that the best way to really get to know HTTP is to build a fully-functional web server from scratch in the language of your choice. That sounds simple enough, but it becomes quite challenging, in part because if you do a poor job at design at the beginning you have to live with the consequences the entire semester. On the other hand, do a good job up front and each assignment will just drop into place (hello, software design). Along the way, you'll also become quite familiar with reading RFCs and the REST architectural model.
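To give a flavor of what "from scratch" means, here is a small sketch in Python (one possible language choice; it is not part of the course materials). It only parses the request line and builds a response as bytes, and the function names and the fixed "hello" body are assumptions for illustration. Wiring it to a socket, handling headers, and everything else the RFCs require is left out.

```python
def parse_request_line(line):
    """Split 'GET /path HTTP/1.1' into (method, target, version)."""
    method, target, version = line.strip().split(" ")
    if not version.startswith("HTTP/"):
        raise ValueError("not an HTTP request line")
    return method, target, version


def build_response(status, reason, body, content_type="text/plain"):
    """Serialize a status line, a few headers, and the body to bytes."""
    body_bytes = body.encode("utf-8")
    head = (
        f"HTTP/1.1 {status} {reason}\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body_bytes)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body_bytes


def handle_request(raw_request):
    """Return the response bytes for one raw request (GET only)."""
    first_line = raw_request.split(b"\r\n", 1)[0].decode("iso-8859-1")
    try:
        method, target, version = parse_request_line(first_line)
    except ValueError:
        return build_response(400, "Bad Request", "bad request\n")
    if method != "GET":
        return build_response(501, "Not Implemented", "not implemented\n")
    return build_response(200, "OK", "hello\n")


response = handle_request(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n")
```

Even in a toy like this, the early design decisions (where parsing ends, how responses are assembled) shape how easily later features such as conditional requests or content negotiation can be added.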

Take a look at past offerings of the class for an idea of what the structure will be. The CRNs are 35757 (CS 495) and 35758 (CS 595). The class will be on Tuesdays, 4:20 -- 7:00 pm in room 2120.

--Michael

2012-01-09 edit: The class homepage is now available.

Thursday, December 8, 2011

2011-12-08: Summer Microsoft Internship

It all started in the San Francisco airport while I was waiting for my luggage on my way to the PDA2011 conference. The recruiter from Microsoft called to inform me that I had been accepted to intern at Microsoft Silicon Valley this summer. I was ecstatic, and after a couple of months of bureaucracy and a ton of documents I was ready to leave Norfolk by the end of May. Since I hadn't been on an adventure or a trip for a long time, and since I would definitely need a car in California for the three months of the summer, I decided to drive my car all the way across the continent. I have always wanted to make a road trip like that, where I can stop in every city or town along the way, check out the attractions, and sample the authentic local cuisine.

At the same time, our colleague and best friend Moustafa Aly managed to secure a job at Amazon’s engineering office in San Francisco. So when he learned I was going to drive all the way there he told me: “Forget the plane, I will join you!”

We left Norfolk on the 24th and set the car's odometer to 0. Since we are information retrieval and social networking people, we decided to make our status updates and check-ins on Facebook the trip’s record keeper. We picked the route, filled up the car, and drove. From Norfolk, we stopped at Richmond and Nashville, drove through a tornado while crossing Tennessee, almost ran out of gas in the middle of nowhere in Texas, changed the clock twice in one day, ate the best steak I have ever had in Texas and the best burritos on earth in Las Cruces, played with rockets at White Sands Missile Range, passed over the Hoover Dam, and burned out the car’s AC compressor in the Nevada desert before finally making it to Las Vegas, where we wanted to spend an entire day relaxing. The next day we started driving again and after 9 more hours made it to San Francisco, finishing 3559.6 miles in 5.5 days.

Working at Microsoft Silicon Valley definitely has its perks. The location was amazing and the engineers there are really incredible. I joined the Office 365 server-side team for PowerPoint, where I shared an office with another intern from UC Berkeley. Working with this team, I had the most freedom I have had in years of working for companies. We sat together and set the goals I needed to reach during the internship, and they gave me complete freedom to pick the way I was going to build it, which suits my working style. I was supposed to start the implementation of a certain fraction of the distribution and investigate two other things, but to my surprise they liked what I did with the first task so much that they decided to modify my internship goals: finish this project completely, reach ship quality, and release it in the next version. As a result I went through all the phases of software development, from meeting with managers, architects, and program managers, to setting the design, to development, and finally to quality and integration testing. At the end I had to demo my work to the three department managers to see whether it could be incorporated in the next shipping release, and to my delight they were fascinated by it and it will be shipped!

On the first day I attended the orientation, where they gave us an overview of what we would be doing this summer and how we would be evaluated. Our mentors then came to collect us and I was introduced to my team, the PowerPoint team. Right after that I was introduced to the available projects and chose the one that appealed to me most, and I was then granted permission to access the codebase. Imagine having the source code of both PowerPoint and the server cloud back-end; it felt awesome! For the next two weeks I tried to break into the thousands of lines of code and produced a prototype as a proof of concept that I was on the right track. By the end of the first week I had set my internship goals with my mentor, but after the quick prototype I was called to a meeting with both the test and the product management teams, where I represented the development team. They decided to change my goals completely: I would build the entire feature and its backend support from scratch and have the opportunity to ship it.

Given the task at hand, rebuilding the PowerPoint backend on the cloud with an interface to match the latest award-winning rich-client application, I had to go back to basics. I had several one-on-ones with the PowerPoint client-side development team to understand, piece by piece, the functionality of each module of the application. The problem with a project like PowerPoint is that it is fairly old and fairly stable, with more than 20 years of development and thousands of lines of legacy code. I was completely lost in the beginning, but my mentor didn't let me stumble much; I practically lived in his office for the first couple of weeks. We used C++ and C in the backend with JavaScript and C# for the matching interface. This was the trickiest part: matching functionality between two very different frameworks.

At a certain point I found a severe gap in the design document related to the functionality. I talked with my manager and he told me that a change like the one I wanted in the design document needed to be escalated. A couple of hours later I was sitting in a room full of Microsoft's elite developers, testers, PMs, and managers, the least experienced of whom had 7 years of work experience under his belt... and me! That is what I loved about Microsoft: even though I was just an intern, I owned the project and they appreciated that. I explained my case, it was approved, and the design document was changed! I was so proud of myself that day.

The atmosphere within the office was relaxing, cool, upbeat, and always challenging. I can fairly say I was spoiled this summer. I stayed in the corporate housing complex, where I got a spacious, fully furnished studio apartment with weekly maid service. Courts, a swimming pool, and a huge hot tub were all provided for free within the apartment complex. Every other week the recruiters and the PR managers organized an event, party, or outing for all the interns on campus. We went hiking, bowling, and to the movies, and they even flew us to Seattle to visit the headquarters for the summer intern event. They paid for the flights, the luxury hotel, and even a rental car. Steven Sinofsky gave us a wonderful presentation with confidential sneak peeks of the all-new Windows 8, and I was genuinely impressed. At the company store we got lots of t-shirts, games, and gadgets with our employee discount. After that they rented the zoo for us, since we were about 1000 interns from all over the country, brought in the Dave Matthews Band, and gave each one of us a brand new Xbox 360 with Kinect!


It was definitely unique and rewarding to work with all those interns from top universities all over the country: MIT, UC Berkeley, Stanford, etc. I asked around and found that I was the only representative from ODU, so I was definitely proud and tried to behave. The other interns and I became friends, and since most of us were staying in the same apartment complex we gathered almost every night, and on the weekends we went out and discovered the city and the surrounding area. Unfortunately I didn’t join them on the Yosemite hiking/camping trip, as I was sick that day. One day we all decided to wear suits and sunglasses all day at work and call it "Brogramming" day. Someone took a photo of us and it went viral on Twitter and Facebook!

In conclusion, I feel honored and blessed to have been able to work at this wonderful, fascinating place with all those extremely intelligent colleagues. On my first day my manager/team lead told me one thing that I believe changed everything. He said, "You were only an intern during the 2-hour orientation session; now consider yourself a full-time software engineer and own your work." This definitely helped me to shine, participate, own my work, and suggest enhancements, which were actually considered and led to changes in the design document. Now I can proudly say that my product is currently being used by millions of users; you are probably using it right now!

-- Hany SalahEldeen

Wednesday, December 7, 2011

2011-12-07: 2011 NFL Season Week 14

Week 14 of the 2011 NFL season is upon us. Talk of playoff teams and Super Bowl probabilities fills the airwaves even more than Christmas music. Sitting in traffic on the drive home from work tonight, I was listening to a few on-air personalities discussing Green Bay and New England for the Super Bowl. Green Bay has already clinched a playoff berth and many people would say they are headed to the Super Bowl this year. The comment that caught my attention was that the defenses of both teams were terrible this year and the only reason the teams were doing well is that their offenses were so good that they could "outscore their mistakes".

This led me to think about the Colts without Peyton Manning this year. For the past 3 or 4 years the Colts, with Manning as their quarterback, have dominated the sport. It would seem that they built the entire team around Manning. The Colts would run up the score on offense and then the opposing team would be forced to pass often just to catch up. Then the Colts defense would focus on the opposing team's quarterback to keep him from making plays. Now, this year, without Manning the Colts have no game. Are Green Bay and New England in a similar situation?

Contemplating statistics during rush hour traffic is a good way to become a statistic so I did not get much more in depth listening to the show, but after arriving home I ran some SQL queries to check the veracity of the claims made by the radio show.

Indeed, the defenses of both Green Bay and New England have given up more than the average number of yards this year. Both rank near the bottom of the league in defensive performance, with Green Bay dead last. Here is a list of the teams with the average number of yards given up per play, counting both passing and rushing plays.


Team            Yards given up per play
Atlanta         4.3638
Pittsburgh      4.7886
Baltimore       4.8224
Houston         4.9090
Cincinnati      5.1182
San Francisco   5.1645
New York        5.1786
Cleveland       5.2142
Jacksonville    5.2292
Seattle         5.3491
Tennessee       5.3568
Washington      5.4359
Detroit         5.5313
Arizona         5.5506
Miami           5.5726
Denver          5.6300
Kansas City     5.6887
Chicago         5.6887
Oakland         5.7423
Dallas          5.7464
San Diego       5.7867
Minnesota       5.8281
St. Louis       5.8436
Indianapolis    5.8923
Philadelphia    6.0145
Buffalo         6.0374
New York        6.0508
New Orleans     6.0600
Carolina        6.3130
New England     6.3642
Tampa Bay       6.4829
Green Bay       6.5041
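For readers curious how a per-team figure like this can be computed, here is a rough sketch using Python's built-in sqlite3 module. It is not the query I actually ran: the plays table, its column names, and the handful of rows are invented purely for illustration.

```python
import sqlite3

# Hypothetical schema: one row per play, tagged with the defending team.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE plays (defense TEXT, play_type TEXT, yards INTEGER)"
)

# Made-up rows, not real NFL data.
conn.executemany(
    "INSERT INTO plays VALUES (?, ?, ?)",
    [
        ("Team A", "pass", 8),
        ("Team A", "rush", 3),
        ("Team A", "pass", 10),
        ("Team B", "pass", 4),
        ("Team B", "rush", 2),
        ("Team B", "punt", 40),  # excluded: not a passing or rushing play
    ],
)

# Average yards allowed per play, best defense first.
rows = conn.execute(
    """
    SELECT defense, AVG(yards) AS yards_per_play
    FROM plays
    WHERE play_type IN ('pass', 'rush')
    GROUP BY defense
    ORDER BY yards_per_play
    """
).fetchall()

for defense, yards_per_play in rows:
    print(f"{defense:<15} {yards_per_play:.4f}")
```

Grouping by the defending team and filtering to passing and rushing plays is what keeps special-teams plays from skewing the average.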

I have a feeling that a team with a balanced offense and a good pass defense, like Pittsburgh or Baltimore, could give New England and/or Green Bay a tough time in the postseason, but maybe we will cover that next week.

The predictions for week 14 are:

Favorite  Spread  Underdog  Discrete  Pagerank
at PIT    9       CLE       PIT       PIT
at BAL    7       IND       BAL       BAL
HOU       5       at CIN    CIN       HOU
at GB     15      OAK       GB        GB
at NYJ    7       KC        NYJ       NYJ
at DET    7       MIN       DET       DET
NO        2       at TEN    NO        TEN
PHI       9       at MIA    PHI       PHI
NE        15      at WAS    NE        NE
ATL       4       at CAR    ATL       ATL
at JAX    5       TB        TB        JAX
SF        3       at ARI    SF        SF
DEN       2       CHI       DEN       CHI
at SD     12      BUF       SD        BUF
at DAL    3       NYG       DAL       DAL
at SEA    4       STL       SEA       SEA


-- Greg Szalkowski

Thursday, December 1, 2011

2011-12-01: 2011 NFL Season Week 13

Week 13 of the 2011 NFL season is upon us. This week New England is a 20 point favorite over Indianapolis. 20 points is rather significant for a line value. In fact, since 2002 there have only been six games with a line value of 20 or greater. Of those six games, New England was the favorite in five. In none of the five games did New England cover the spread, but they came close in the 2007 game against Miami, winning by 21 points with a 22 point line value.
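For anyone unfamiliar with the term, "covering the spread" just means the favorite won by more than the line value. A tiny sketch makes the rule concrete (the function name and return strings are my own choice for illustration, not part of our prediction code):

```python
def spread_result(margin, line):
    """Classify a game from the favorite's point of view.

    margin: favorite's score minus underdog's score (negative if upset)
    line:   number of points the favorite was expected to win by
    """
    if margin > line:
        return "cover"
    if margin == line:
        return "push"
    return "no cover"


# The 2007 New England vs. Miami game mentioned above:
# a 21 point win against a 22 point line.
result_2007 = spread_result(21, 22)
```

Here result_2007 is "no cover": a three-touchdown win that still fell one point short of the line.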




Favorite  Spread  Underdog  Discrete  Pagerank
PHI       5       at SEA    PHI       SEA
TEN       3       at BUF    BUF       TEN
at CHI    4       KC        CHI       CHI
MIA       7       at OAK    MIA       OAK
at PIT    6       CIN       PIT       PIT
BAL       1       at CLE    BAL       BAL
NYJ       1       WAS       NYJ       NYJ
at HOU    7       ATL       ATL       HOU
CAR       6       at TB     TB        CAR
at NO     7       DET       NO        NO
at MIN    6       DEN       DEN       DEN
at SF     10      STL       SF        SF
DAL       8       at ARI    DAL       DAL
GB        2       NYG       GB        GB
NE        10      IND       NE        NE
SD        4       JAX       SD        SD


-- Greg Szalkowski