程序员三大浪漫: 编译原理+操作系统+图形学

CS 444, 452, and 488 represents the “big three” in Computer Science:

  • CS 444: Compiler Construction
  • CS 452: Real-Time Programming (Operating Systems)
  • CS 488: Introduction to Computer Graphics

A lot of people warn against take them at the same time as they are all difficult and time-consuming courses. But as a SE student with limited scheduling flexibility, I had to take them all in the two terms and here is my experience.

CS 488: Introduction to Computer Graphics

I took it the first, in 4A (Spring 2023), along with 5 other courses. Passed with 100%.

I think among the three, this one has the lowest workload. We were taught by Toshiya Hachisuka and there was no final exam. My prior experience with C++, OpenGL and graphics has helped a lot. The assignments were overall not difficult and fun to work with. They were well structured and each step could give you a great sense of achievement. The final project was very open-ended and can be pretty much anything. I made a Voxel Raytracer based on SVO/SVDAG and OpenGL compute shader.

Hint: RenderDoc and debugger might help

Voxel Raytracer Voxel Raytracer Voxel Raytracer Voxel Raytracer Voxel Raytracer Voxel Raytracer

CS 444: Compiler Construction

I took it in 4B, alongside with CS 452 and 3 other courses. Passed with 96%.

You basically make a compiler for a subset of Java 1.3. Teams can have up to 4 people but ours only had 3.

You will have chance to implement:

  • AST
  • Syntax and semantic checking
  • Type checking
  • Name resolution
  • Reachability analysis
  • Code generation
  • Compile to x86
  • Object-oriented features
  • Register allocations

You can use:

  • Lexer
  • Parser generator
  • nasm as an assembler (asm->binary)

Compared to CS241E which also touches compiler, this one didn’t give you any starter code and is a lot more in depth.

You can choose whatever programming language that you prefer, but our team used OCaml because it’s functional. Only one member of our team had prior experience with OCaml from Jane Street, and the other two of us had to learn it from scratch. For me that’s a great execuse for learning a new programming language, and a new paradigm too, though the instructor explicitly warned us not to choose a programming language you were not familiar with, so your mileage may vary.

One thing I didn’t like about the course was that we had to write reports which was not quite as fun as writing code.

My biggest suggestion for this course is to find good, reliable teammates that you want to work with, and also establish a good testing framework which you can use to test your code locally. The course will give you a lot of public test cases as well. Also, the assignments were released on day 1, start early and don’t wait for the lectures.

Overall, not super hard, workload is somewhat above average, and I think it’s pretty fun to build something that can compile Java code to executables from scratch.

compiler

CS 452: Real-Time Programming

Took in 4B, passed with 91%.

You probably have heard a lot of things about this course. Good things, and the bad things. What you heard was true.

This one single course has the highest workload among the three, or even any course I have taken during my undergrad. It was also the first time when I felt I didn’t have enough time to finish the assignments.

High-level description

You will work in a team of two to write a real-time operating system to control model trains. The course could roughly be divided in two parts, the kernel part where you implement the basics of a kernel such as task scheduling, interrupt handling, syscalls, etc., and the user part where you implement the user tasks that navigate and control the trains. It sounds fun and it can be fun if you know what you are getting yourself into.

Did I mention you will get to play with toy trains? You’ll get to play with toy trains.

Device

They change the device, the track, and the train from time to time. For our term, we used RPI4b and BCM2711 (armv8 64bits).

The A0

The course welcomes you with an A0, where you write a polling loop that provides an UI to control the train. You only have one week to do it, and you would have to work alone. There are a lot features to be implemented and there is even a purposefully introduced bug in the provided code that you have to fix.

I think A0 serves as a warning of what you’re getting into and “encourages” unprepared innocent people to drop the course.

Track Access

Our class was relatively large but there were only two tracks. On the days before the deadline, the queue can be an hour long. Try to make your own emulators and test cases so that you don’t have to wait for the track. Alternatively, come to the lab early in the morning.

Language

You can use either C or C++. We used C++20 and I think it was a good choice. Most of the standard libraries are not available though (because of lack of general-purpose allocator, filesystem, etc., or even floating point operations), but the type safety and things like OO features, std::optional, std::variant, std::string_view are still helpful. Rust wasn’t allowed, however.

The trains, and the tracks

The trains… are not quite something that are pleasant to model. They run at different velocities at different parts of the track, for each different orientation, and of course for each different train. They also wear and break so we you come to the lab the second day the paramters can be totally different. They might also ignore your reverse commands if they feel like it.

The tracks. Expect the sensors to randomly fail. There are also “dead-spots” in the track where if your train happens to stop there, it will not be moved unless you manually push it. Better hope that doesn’t happen during the demo, or you have really robust handling for them.

I think you’ll have a fun time working with them.

memes

MMU

I heard they are considering adding MMU into the starter code, but if it’s not there, it might be a good idea to implement it yourself.

Data cache is only available with MMU and so enabling it makes your code run much faster. You can also start to have segmentation fault so illegal access / nullptr access can be caught easier.

Time Management

Start the train control part early. It is way harder (in terms of workload and bugs you will find) than the kernel part.

And really, start everything early. Start even before the assignments are released (you can use the assignments from previous terms as a start point). The course is not something you can finish with a few all-nighters.

I personally never stayed up late for this course but I heard the lab was full of people the night before demos.

Objectives and Expectations

It’s unlikely that you will complete the project that you initially proposed, or even just the TC2 perfectly. Don’t worry too much if that happens. Our TC2 demo was miserable and the final demo was not quite satisfactory either. Frustrations and failures are part of the experience.

Reports

On top of all of those, you will also have to write reports. Have fun!

Final exam

Did I mention there is a final exam? It’s open book and not very difficult at least.

Debugging and Testing / Logging / Assertion

My biggest suggestion, and what I would do if I were to take it again, is to have a robust testing framework.

Testing is really important in this course, because with routing and multiple trains, there are so many corner cases, so many different ways that things can go wrong. When I write a piece of code, the first trial could have three different bugs, and in attempt to fix those, I could introduce two more. Some of them only happens on rare and nasty conditions that you don’t know how to reproduce but comes to you from time to time.

We had qemu working so most of the time the real PI isn’t needed for basic tests, but I don’t think that’s enough. What could have helped us a lot is to have a simulator like MarklinSim. It doesn’t have to match the real track perfectly, but it should be able to match the idealized model with all the assumptions that your software is making.

The benefits it could have is that you can now test (reproducibly!) thousands of test setups (different start / target positions and different number of trains) in a few seconds. This kind of setup could have caught 90% of our TC bugs and saved us a ton of time. This should give you the level of confidence that if something went wrong yet it passed all test cases, it’s your assumption that was wrong.

When others are fighting for the track the day before the demo, you would thank yourself for making this.

On top of that, make sure you have good logging and use assertions liberally. You probably can’t attach a debugger on a real PI so that would be really helpful for post-mortem analysis.

Weird bugs

I encountered some of the most bizarre bugs in this course. Here are some of my favorite. See if you can guess what happened here:

Bug 1: Adding print fixes your code

  1. Program works if I have a printf() (Note only one core is enabled so it can’t be race conditions).
  2. Turns out you don’t have to call the printf. Leaving the arguments here is enough to keep the code running. (It changes the length of .rodata)
  3. It crashed when trying to get instance for a singleton. e.g. T& getInstance() {static T instance; return T;}
Click here to see the root causeI forgot to clear bss and the guard variable for the static variable isn't initialized. Adding things to rodata might move the bss section to a different location such that the guard variable happens to be 0.

Bug 2: Adding nop fixes your code

When I was implementing some feature for the kernel, I had another mysterious bug that happens intermittently. After some debugging, I found that adding a nop insturuction at random places in the code might fix the issue.

More specifically, the program appears to hang if the following conditions are met:

  1. The ADR insturction uses X0 register (all other registers are okay)
  2. There is an odd number of instructions between ADR and the label it references. (that’s why adding nop sometimes fixes it)
  3. You read from the stack.
Click here to see the root causegcc compiled `__asm` blocks to use `X0` which caused conflicts in registers (I probably forgot to add it to the clobber list). The result of `ADR` ended up as the value of `SP`. Exception occurs if you use `SP` if its not properly aligned (so adding a `nop` might make it align).

Screenshots

ui

Summary / TLDR

They are all great courses. I don’t regret taking any of them. Should you take them too? You probably already know the answer.

Taking more than one of them at the same time isn’t impossible, but I wouldn’t recommend it.