rakshit

writing

my thoughts and blabberings.

CS234 Assignment 1, Part 1: Horizon, Discounting, and Reward Hacking

A few weeks ago I started working through Stanford's CS234 course on Reinforcement Learning. I have been meaning to get into RL properly for a while and decided that working through a real course with real problem sets was the best way to do it. This post covers my solutions to Problems 1 and 2 from...

CS234 Lecture Notes: Foundations of RL and MDPs (Lectures 1 & 2)

I have been working through Stanford's CS234 course on Reinforcement Learning, taught by Professor Emma Brunskill, as part of building a solid theoretical foundation in RL. These are my notes from the first two lectures. Lecture 1 covers the framing of RL and builds up to Markov Reward Processes. Le...

Translating RL Math into JAX: What I Learned Working Through 10 Problems

I've been trying to get more serious about reinforcement learning, not just the conceptual side but actually being able to implement things from papers. One thing I kept running into is the gap between reading an equation in a paper and knowing what to do with it in code. You see a summation with so...

Building a Unix Shell Part 2: Background Processes and the C++ Migration

In Part 1 (https://boringblog.vercel.app/posts/building-a-unix-shell-a-deep-dive-into-process-management-part-1), we built a basic shell capable of executing commands using fork and exec. It worked, but it had a major limitation: it was strictly synchronous. You ran a command, the shell froze, and you ...

Building a UNIX Shell: A Deep Dive into Process Management (Part 1)

I’ve recently started diving deep into the internals of Operating Systems using the excellent IIT Bombay OS Course (https://www.cse.iitb.ac.in/mythili/os/) by Prof. Mythili Vutukuru. There is no better way to understand how an OS manages processes than by building the one tool developers use every day:...
