diff --git a/.vscode/ltex.dictionary.en-US.txt b/.vscode/ltex.dictionary.en-US.txt
index 84d8f197273372afb2410f6067752f0ea7e41e1c..eb4312f137a00caa8faed0c39cd0c74a5c6ee617 100644
--- a/.vscode/ltex.dictionary.en-US.txt
+++ b/.vscode/ltex.dictionary.en-US.txt
@@ -71,3 +71,4 @@ unprefixed
 jankiness
 indirections
 customasm
+debouncing
diff --git a/.vscode/ltex.hiddenFalsePositives.en-US.txt b/.vscode/ltex.hiddenFalsePositives.en-US.txt
index ca702458a5dd8206179013aea4d2573d943c695a..33265d33987cc436cd81901fe10b0d742c084f71 100644
--- a/.vscode/ltex.hiddenFalsePositives.en-US.txt
+++ b/.vscode/ltex.hiddenFalsePositives.en-US.txt
@@ -14,3 +14,4 @@
 {"rule":"DT_PRP","sentence":"^\\QThe following steps are executed in the specified order with the I²C/SPI clock as well as \\E(?:Dummy|Ina|Jimmy-)[0-9]+\\Q enabled:\\E$"}
 {"rule":"DT_PRP","sentence":"^\\QThe following steps are executed in the specified order with the I²C/SPI clock enabled and \\E(?:Dummy|Ina|Jimmy-)[0-9]+\\Q disabled:\\E$"}
 {"rule":"EN_A_VS_AN","sentence":"^\\QThe other flags are set according to an \\E(?:Dummy|Ina|Jimmy-)[0-9]+\\Q instruction for \\E(?:Dummy|Ina|Jimmy-)[0-9]+\\Q and an \\E(?:Dummy|Ina|Jimmy-)[0-9]+\\Q for \\E(?:Dummy|Ina|Jimmy-)[0-9]+\\Q.\\E$"}
+{"rule":"UPPERCASE_SENTENCE_START","sentence":"^\\QµArch Simulator\\E$"}
diff --git a/docs/alu/overview.md b/docs/alu/overview.md
index 467151fe71ea3e1898a5509c0b1447961570ef87..6157bb08220b00a81a71a9d9f1736a40e69eb7b9 100644
--- a/docs/alu/overview.md
+++ b/docs/alu/overview.md
@@ -91,6 +91,7 @@ Adrian and Ivo spent a long time on hardware problems, which were very hard to f
 :::warning Errata
 During testing we noticed a few bugs that we've worked around instead of bothering with a proper fix:
 - We've accidentally swapped ALU operands in the build and worked around it with updated microcode.
+- The ALU breadboards use a different bit order (0, 1, ..., 7) than the rest of the build. We work around this with a twisted bus connector cable, but a unified bit order would've been nice.
 :::
 
 
diff --git a/docs/memory.md b/docs/memory.md
index a8ddb9ca4409594da0324cd1d97d121a3978bcd9..ac8bb97556e8f867a3ca712a0172af3230ff3fb4 100644
--- a/docs/memory.md
+++ b/docs/memory.md
@@ -33,12 +33,12 @@ We are using a [AS6C1008](https://www.mouser.de/datasheet/2/12/AS6C1008feb2007-1
 Both chips have tristate in-/outputs which allows us to directly connect them to the data bus.
 Because both chips support an address space greater than 16 bits, we fixed superfluous bits on the chips to zero.
 As mentioned before, we decided to split the address space such that addresses `0x0000` to `0x0FFF` map to the ROM and that addresses `0x1000` to `0xFFFF` map to the RAM.
-This allows us to use the 4096 addressable ROM bytes for a small operating system, which is why we call the ROM from now on *bootup ROM*.
+This allows us to use the 4096 addressable ROM bytes for a small operating system, which is why we call the ROM *bootup ROM* from now on.
 Using the lowermost 12-bit addresses for the bootup ROM brings the additional benefit that the OS program can begin at address `0x0000` which simplifies the reset logic significantly.
 The remaining address space is used as the main system RAM.
 
-Data flow from the chips to the data bus is managed by the `~MEM_TO_DBUS` control line which toggles the output enable pins of both chips.
-The selection of the RAM or ROM chip is decided using the four most significant address bits.
+Data flow from the chips to the data bus is managed by the `~MEM_TO_DBUS` control line, which toggles the output enable pins of both chips.
+We select either the RAM or the ROM chip based on the four most significant address bits:
 If and only if all four bits are zero, the bootrom chip is selected, otherwise the main RAM gets selected.
 The value stored at the location represented by the address latch state will be emitted to the data bus as long as the `~MEM_TO_DBUS` control line is active (low).
 
@@ -55,6 +55,7 @@ To save a chip on the memory board, we used the second quad-nor gate in the 74-4
 We solved this problem by reserving a block of eight neighboring breadboard pin rows on both board halves where we connected all address lines in the correct bit order.
 
 ## Testing
+
 For testing, we used an [Arduino Mega](https://store.arduino.cc/products/arduino-mega-2560-rev3) microcontroller that emulated the control lines, data bus and the address latch.
 We programmed the bootup ROM to contain a repeating pattern of values that are easy to verify knowing their addresses.
 Our test program then wrote data to a set of random locations withing the address space and afterwards read from all affected addresses.
diff --git a/docs/registers.md b/docs/registers.md
index c0be6ff54bc07c482da41f4fc27fb650acf6d06b..fd131f21f224319ac7656d9ad74a88f0aac2e1e8 100644
--- a/docs/registers.md
+++ b/docs/registers.md
@@ -24,7 +24,7 @@ Our processor has a register file consisting of 16 8-bit registers. For certain
 
 Apart from these registers, there also are the 4-bit wide `FLAGS` and the 8-bit wide `ACCU` register that are part of the [ALU](/docs/alu/overview).
 
-Next to the register file the module also contains the address latch.
+This module also contains the address latch, which is placed right next to the register file.
 It is a 16-bit buffer with the added functionality of incrementing and decrementing.
 The address latch is transparent to the [ISA](/docs/isa/overview).
 Its stored value is permanently emitted to the [memory unit](/docs/memory) and to the [I/O devices](/docs/io/overview).
@@ -76,6 +76,7 @@ These are denoted in *italics font style* within the schematics.
 :::
 
 ### Register File
+
 We split the register file into two halves (*HI* and *LO*) and each half into a register block denoted by a Greek letter ($\alpha$, $\beta$, $\gamma$ and $\delta$).
 This allows us to map each block to a pair of [74-670](https://www.ti.com/lit/gpn/sn74ls670) quad four-bit registers (`RegAlpha0`, `RegAlpha1`, `RegBeta0`, ..., `RegDelta1`).
 On a micro-architectural level, each register is identified by a combination of `REG_R_IDX[0:2]` and `~REG_LO_DBUS`/`~REG_HI_DBUS` (for reading), or `REG_W_IDX[0:2]` and `REG_W_SEL_LO`/`REG_W_SEL_HI` (for writing).
@@ -87,6 +88,7 @@ The following graphic visualizes the register addressing.
 ![Register Blocks](/img/registers/blocks.svg)
 
 #### Writing into registers
+
 At rising clock edges, register can be filled with values coming from either the data bus or from the address latch.
 The selection is implemented using two pairs of [74-245](http://www.ti.com/lit/gpn/sn74ls245) bus transceivers (`HiDataIn` and `LoDataIn` for the data bus, as well as `HiAddrIn` and `LoAddrIn` for the address latch).
 The correct transceiver pair is selected from the [combination of `REG_W_SEL_LO` and `REG_W_SEL_HI`](#reg_w_sel) which gets decoded into the internal control lines `~REG_R_DATA` and `~REG_R_ADDR` within the [control decoder](#control-line-decoder).
@@ -94,22 +96,26 @@ Because writing on the 74-670 register chips used is level-triggered, we built a
 This allows the control lines to reach a stable state before the actual write happens such that no other registers get overwritten.
 
 #### Emitting register contents
+
 Register values can be emitted to either the data bus or for storing in the address latch.
 In both cases, both the *HI* and *LO* register blocks addressed by `REG_R_IDX[2]` are enabled to output their contents.
 If the data is loaded into the address latch, outputs from both halves are used.
 For data bus output, `~REG_HI_TO_DBUS` and `~REG_LO_TO_DBUS` are used to select the register block that is emitted onto the data bus.
 
 ### Address Latch
+
 The address latch is implemented as a set of four [74-191](http://www.ti.com/lit/gpn/sn74ls191) presettable four-bit counters.
 It can either be filled with a value from a register pair using `~REG_LATCH_LOAD`, or it can be in-/decremented using `~REG_LATCH_COUNT` where the direction can be selected using `REG_LATCH_UP/~DOWN`.
 
 ### Control Line Decoder
+
 The control line decoder (CLD) has two major purposes:
 1. It contains an edge-detector with additional delay that allows to load data into registers and the address latch after control lines have stabilized.
 2. It decodes the incoming control lines used for register addressing into control lines for each physical register chip. This is done to reduce the number of control lines needed as there are plenty of unused combinations of register input signals.
 
 
 ## Tips for Reproduction
+
 - We initially had major problems with data bleeding over into registers that weren't actually selected on writes.
 This was due to control lines not having settled into their final states when the register chips' write inputs were enabled.
 A fix for this problem is to minimally delay the write signal.
@@ -117,7 +123,7 @@ We implemented this as part of the edge-detector circuit.
 - Our design of a register file is modular allowing for testing each register block individually before connecting them to form the register file.
 We initially skipped testing the individual blocks leading to us having to deconstruct the register file, test and fix each block before reconstructing the whole register file.
 Therefore, we recommend taking the time to already test early during construction.
-- If breadboard space and the number of used chips is less limited than in our build, we recommend building individual registers from 8-bit register chips (e.g. 74-377) and bus transceivers.
+- If you have more breadboard space and chips available than we did in our build, we recommend building individual registers from 8-bit register chips (e.g. 74-377) and bus transceivers.
 This allows to permanently display the register contents which makes debugging significantly easier.
 - Bit orders tend to get messy quickly. Always make sure to write down where the most and the least significant bits are stored and think about all places where the order matters (e.g. the order of the address latch counter chips, especially the direction of the carry signals).
 - The register file consists of hundreds of cables which means that some misplaced or missing cables are to be expected.
@@ -131,6 +137,7 @@ To solve this problem, we decided to extend our wiring into the third dimension.
 By introducing three different height levels, we were able to put wires that don't necessarily need to be accessed, such as 5V and ground, on the lowest level, anything related to data flow that might require fixes on the middle level, and all the control lines that frequently have to be disconnected and reconnected during troubleshooting on the top.
 
 ## Testing
+
 For testing, we used an [Arduino Mega](https://store.arduino.cc/products/arduino-mega-2560-rev3) microcontroller that emulated the control lines and data bus.
 We wrote a fuzzer that filled registers in all possible orders with random data and afterwards checked the registers to hold the expected values.
 During the problem of values bleeding into unselected registers, we expanded the fuzzer to use values following fixed patterns such that we could trace data bleeds.
diff --git a/docs/software/uarch_sim.md b/docs/software/uarch_sim.md
index b00686354dadcd3c3af363d7b43b72821e650acb..2bbb599ec39b2cf23b5053680796ae3f79d448b1 100644
--- a/docs/software/uarch_sim.md
+++ b/docs/software/uarch_sim.md
@@ -6,7 +6,7 @@ title: µArch Simulator
 
 ## Installation
 
-The hardware modules are described using the [amaranth HDL](https://github.com/amaranth-lang/amaranth).
+The hardware modules are described using the [Amaranth HDL](https://github.com/amaranth-lang/amaranth).
 Install the `amaranth` package into your user environment (`pip3 install --user amaranth`) or use a virtual environment if you don't want to pollute your user environment:
 
 ```bash
@@ -17,7 +17,7 @@ pip3 install amaranth
 
 ## Usage
 
-`hdl/board.py` is the main entrypoint for the simulator and includes a small CLI:
+`hdl/board.py` is the main entry point for the simulator and includes a small CLI:
 
 ```
 ⋊> ~/_/u/8/hdl on main ⨯ python3 board.py --help
@@ -33,7 +33,7 @@ options:
   --write-vcd           Write out a vcd trace of the simulation (board.vcd) (default: False)
 ```
 
-The simulator currently supports two output formats. One for microarch debugging, and a one-line-per-instruction trace that hides the inner workings:
+The simulator currently supports two output formats, one for microarchitecture debugging and a one-line-per-instruction trace that hides the inner workings:
 
 ```
 ⋊> ~/_/u/8bit-main on main ⨯ python3 hdl/board.py --trace=isa --rom=isa/bootrom-test.bin