The original IBM PC had 64 kilobytes of dynamic RAM (DRAM), all of it refreshed by a small circuit that used one channel of the 8237A DMA controller and one timer of the 8254 programmable interval timer (PIT) to access this memory repeatedly in the background. Introduced at a time when static RAM (SRAM) was thought to be the only reliable memory, DRAM was IBM’s calculated risk, and it proved to be the right one, enabling mobile computing some two decades later.
Dynamic RAM Introduces Complexity to the PC
The reason for this starts with how memory cells work. SRAM used bipolar technology that maintained its state with electric current. As long as a current flowed, the RAM cell (the circuit that represents the state of one bit) maintained its content. In contrast, DRAM was based on field effect technology—it was an electric field (essentially, constant voltage) that kept a cell at a given state, not a flowing current. Electric current was only used to change the state of a cell. DRAMs, with their relatively low current requirement, were orders of magnitude less hungry for power.
Another reason DRAM technology is so favorable for use in mobile equipment is that it uses fewer transistors to implement a cell, allowing cells to be packed more densely. This helps in two ways: a given capacity fits in a smaller form factor, and the savings are so great that more capacity becomes affordable.
The laws of thermodynamics play a big role in laptop design these days. They say you can’t get something for nothing: if you want lower power and more density, you’ll have to give up something. The move from SRAM to DRAM had two major costs: performance was reduced, and complexity was increased.
Among other things, the complexity of DRAM arises because it requires an external circuit to read all of the cells periodically; this is called the refresh circuit. The refresh operation on the original IBM PC used about 8% of the bandwidth of memory, using the PIT to drive periodic VERIFY transfers (reads that don’t deposit the data anywhere) to DRAM. Today, consuming even a few percent of memory bandwidth is noticeable, and refresh complexity is compounded by a special mode called self refresh.
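The refresh overhead can be estimated with a little arithmetic. The sketch below derives the processor and timer clocks from the PC's 14.31818 MHz crystal. The timer count of 18 is a commonly cited value for the refresh channel, and the cost of one refresh cycle in processor clocks is left as a parameter; both are assumptions, not figures from this article.

```python
# Back-of-the-envelope refresh overhead for the original IBM PC.
# The divisor and the per-refresh cycle cost are assumptions, not measurements.

CRYSTAL_HZ = 14_318_180
CPU_HZ = CRYSTAL_HZ / 3    # about 4.77 MHz processor clock
PIT_HZ = CRYSTAL_HZ / 12   # about 1.19 MHz timer input clock


def refresh_interval_us(pit_divisor: int) -> float:
    """Time between two refresh requests, in microseconds."""
    return pit_divisor / PIT_HZ * 1e6


def cpu_clocks_per_interval(pit_divisor: int) -> float:
    """Processor clocks that elapse between two refresh requests."""
    return pit_divisor * CPU_HZ / PIT_HZ


def refresh_overhead(pit_divisor: int, clocks_per_refresh: int) -> float:
    """Fraction of bus time consumed by refresh cycles."""
    return clocks_per_refresh / cpu_clocks_per_interval(pit_divisor)


if __name__ == "__main__":
    divisor = 18  # assumed timer count for the refresh channel
    print(f"refresh interval: {refresh_interval_us(divisor):.2f} us")
    for clocks in (4, 5, 6):  # assumed cost of one refresh, in CPU clocks
        print(f"{clocks} clocks per refresh -> {refresh_overhead(divisor, clocks):.1%}")
```

Under these assumptions a refresh request arrives every 72 processor clocks, so a cost of five or six clocks per refresh puts the overhead in the 7% to 8% range, in line with the figure quoted above.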
As DRAM technology evolved, we had FPM (fast page mode) memory, EDO (extended data out) memory, and many other optimizations. Modern memory sticks have a built-in state machine and an EEPROM called an SPD (serial presence detect) that specifies the stick’s technologies and their timing. When DRAM was soldered down on the IBM PC motherboard and refreshed with the 8254 and 8237A, no memory controller was needed. But with the complexities of today’s DRAM, which are the cost of squeezing the last drops of performance from it, memory controllers are standard. While a chipset’s north bridge contained the memory controller in simpler systems, the memory controller is now being moved into the processor in many cases, to further optimize the path from CPU to memory.
The BIOS technology that handles detection of memory sticks and programming of the memory controller to handle various types of memory used to be simple. With early SIMMs (single in-line memory modules) and DIMMs (dual in-line memory modules), there was no SPD, so the BIOS module that programmed the north bridge to set up the memory controller had to try different configurations and test the memory to see whether it was contiguous and working. This problem, determining memory geometry, was hard to implement correctly. Chipset manufacturers provided sample code called memory reference code (MRC) and let the BIOS vendor derive better implementations from it to handle all the memory configurations and integrate it into the rest of the BIOS, for example by tying it to setup fields for over-clockers.
As memory bus speeds increased, timing became more stringent. Today, the memory controller must be told about the timing of the data lines going to the memory sockets. The longer the trace on the motherboard, the longer the signals take to propagate. In fact, the signals move so fast that they look like waves with a rise, a crest, and a tail, so the memory controller must be “trained” to actively find where the center of the wave is. This has to be done for multiple signal groups, because not all 128 or so data lines can be routed exactly the same distance on the board.
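The idea of finding the center of the wave can be illustrated with a toy model. The sketch below sweeps a delay setting for one signal group, records which settings pass a pattern test, and picks the middle of the widest passing window. The function names, the number of delay taps, and the simulated lanes are all hypothetical; real MRC works against controller registers and is far more involved.

```python
# Toy model of training one signal group: sweep a delay, find the widest
# passing window, and park the delay at its center. All values are made up.
from typing import Callable, List, Optional, Tuple


def widest_passing_window(passes: List[bool]) -> Optional[Tuple[int, int]]:
    """Return (first, last) index of the longest run of True, or None."""
    best = None
    start = None
    for i, ok in enumerate(passes + [False]):  # sentinel closes a trailing run
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if best is None or (i - start) > (best[1] - best[0] + 1):
                best = (start, i - 1)
            start = None
    return best


def train_lane(test_at_delay: Callable[[int], bool], num_taps: int) -> Optional[int]:
    """Return the delay tap at the center of the widest passing window."""
    passes = [test_at_delay(tap) for tap in range(num_taps)]
    window = widest_passing_window(passes)
    if window is None:
        return None  # no setting worked; the lane cannot be trained
    first, last = window
    return (first + last) // 2


def make_lane(eye_open: int, eye_close: int) -> Callable[[int], bool]:
    """Simulated lane whose data is valid only for eye_open <= tap <= eye_close."""
    return lambda tap: eye_open <= tap <= eye_close


if __name__ == "__main__":
    # Two lanes with different trace lengths end up with different delays.
    lanes = {"lane 0": make_lane(10, 30), "lane 1": make_lane(14, 36)}
    for name, lane in lanes.items():
        print(name, "-> delay tap", train_lane(lane, num_taps=64))
```

Each signal group gets its own result, which is why the search has to be repeated per group rather than done once for the whole bus.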
This level of timing (in the picosecond range), along with other kinds of timing, is programmed by the BIOS into the memory controller during POST. Whereas early MRC for EDO sticks might have taken a few hundred lines of assembly code, today it is not unheard of to have 20,000 lines of C code doing the same job, cycling through each memory stick.
This does not count the complexity involved in making this work in the special suspend to RAM (STR) case, where the BIOS programs the memory controller to initiate self refresh in each memory stick, and the memory controller sends commands to the memory sticks to enter that state. Performing this high-wire act without missing a beat of refreshing, along with coming out of self refresh and handing refresh back to the memory controller, is the job of the BIOS POST, which can’t afford to get it wrong, or your PC will surely malfunction: sometimes early, but sometimes later, after Windows is loaded.
This is one of many jobs that POST has to perform with precision, using MRC that often cannot be altered by the BIOS vendor because the silicon vendor has validated it on so many systems with such close tolerances. It also takes time to run, sometimes more than a second.
DRAM Technology Has Been Earning Our Trust
Another big change from the early IBM PC to modern notebooks is how much we trust DRAM technology, including its refresh logic. In the IBM PC, we had a parity bit to let us know when data had been damaged in RAM. The memory support logic raised the non-maskable interrupt (NMI) line on the processor, which had the same effect as executing an INT 02h instruction, and the BIOS displayed a short message reporting a parity error. The system then hung. So, it was pretty important to test memory during POST.
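A parity bit is simple enough to show in a few lines. The sketch below stores a ninth bit alongside each byte and checks it on read. Odd parity is assumed here for illustration, and the function names are hypothetical. It also shows the scheme's limit: two flipped bits cancel out and go unnoticed.

```python
# Minimal parity sketch: one extra bit per byte, checked on every read.
# Odd parity is an assumption made for this example.

def parity_bit(byte: int) -> int:
    """Ninth bit that makes the total number of 1 bits odd."""
    ones = bin(byte & 0xFF).count("1")
    return 0 if ones % 2 == 1 else 1


def parity_ok(byte: int, stored_parity: int) -> bool:
    """True when the byte read back still matches its stored parity bit."""
    return parity_bit(byte) == stored_parity


if __name__ == "__main__":
    original = 0x5A
    stored = parity_bit(original)  # written to the ninth bit

    one_flip = original ^ 0x01     # a single damaged bit
    two_flips = original ^ 0x03    # two damaged bits

    print("clean read ok:    ", parity_ok(original, stored))
    print("one flip detected:", not parity_ok(one_flip, stored))
    print("two flips missed: ", parity_ok(two_flips, stored))
```

Parity can only report that something went wrong, which is why the only safe response on the IBM PC was to stop.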
As memory technology advanced, we gained error correcting code (ECC) support, allowing the memory controller and memory itself to detect and often correct memory errors (depending on how many bits were damaged). Although not perfect, it went a long way toward improving reliability. Supporting ECC requires that all memory bits be initialized to some state, a really lengthy operation for today’s memory sizes in the gigabyte range. It could take a minute for a modern processor to flood-fill all of this memory, even using optimized instructions. Fortunately, our memory controllers include special circuitry that initializes ECC memories with a special burst mode, sometimes taking only 3 to 6 seconds or so.
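The phrase "depending on how many bits were damaged" can be made concrete with a scaled-down example. The sketch below is a textbook extended Hamming code that protects 4 data bits with 4 check bits. It corrects any single flipped bit and detects, but cannot correct, any two. This is an illustration only: ECC memory typically protects 64 data bits with 8 check bits, and the code used by any particular memory controller is not described in this article.

```python
# Scaled-down SECDED (single error correct, double error detect) sketch
# using an extended Hamming(8,4) code. Not any real controller's code.

def encode(nibble: int) -> list:
    """Encode 4 data bits into an 8-bit codeword (list of 0/1).

    Index 0 holds overall parity; indexes 1..7 are Hamming(7,4) positions,
    with check bits at 1, 2, 4 and data bits at 3, 5, 6, 7.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2  # makes the total number of 1 bits even
    return code


def decode(code):
    """Return (nibble, status); status is ok, corrected, or uncorrectable."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos          # XOR of the positions of all set bits
    overall = sum(code) % 2          # 0 while overall parity still holds
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: treat as a single error and repair it.
        code[syndrome] ^= 1          # syndrome 0 points at the parity bit itself
        status = "corrected"
    else:
        return None, "uncorrectable"  # two flips: detected, not repairable
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status


if __name__ == "__main__":
    word = encode(0b1011)
    damaged = list(word)
    damaged[6] ^= 1
    print(decode(word), decode(damaged))
    damaged[2] ^= 1
    print(decode(damaged))
```

Note that the check bits are only meaningful once the data they cover has been written, which is one way to see why ECC memory has to be initialized before it is read.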
The fact is, memory manufacturing technology has improved considerably since 1981. We no longer use the 8237A and 8254 to perform the memory refresh cycle (this is done by the memory controller), and memory itself is more reliable. So reliable, in fact, that for a decade or so memory on our laptops has not been tested as rigorously as it was on the IBM PC. With gigabytes to test, this simply isn’t practical on a modern laptop. It is certainly possible on a server, where the issue is stability rather than boot time.
On laptops, while full memory tests are available in core BIOSes such as Phoenix SecureCore Tiano™, OEMs normally leave them off, or at least make them optional. Instead, the BIOS “sizes up” the memory by testing only the first few bytes of each memory block across the whole memory array. The size of such a memory block is up to the BIOS vendor, and so is the test itself. Early memory tests on PC clones tried the various settings by doing the same thing, but also checked for a special problem called “aliasing”, where writing to memory at one location would cause that same data to appear at regular intervals elsewhere in memory. The data didn’t actually propagate to other cells; the same set of cells simply appeared repeatedly in the address space, in patterns that were a function of the underlying memory geometry and the (incorrect) setting used to select a different geometry. Only when the right settings were used would this aliasing go away.
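Aliasing is easy to reproduce in a simulation. The sketch below models a bank in which only part of the assumed address range really exists, so higher addresses wrap around onto the same cells. The class and function names are hypothetical, the sizes are made up, and the real size is assumed to be a power of two. The check writes a marker at each power-of-two offset and looks for it at address 0.

```python
# Simulated aliasing check. The memory model and sizes are illustrative only.

class AliasedMemory:
    """Memory array in which only real_size bytes exist.

    Addresses beyond real_size wrap around, the way a bank programmed with
    the wrong geometry shows the same cells at several addresses.
    """

    def __init__(self, real_size: int):
        self.real_size = real_size
        self.cells = bytearray(real_size)

    def write(self, addr: int, value: int) -> None:
        self.cells[addr % self.real_size] = value & 0xFF

    def read(self, addr: int) -> int:
        return self.cells[addr % self.real_size]


def find_alias(mem: AliasedMemory, assumed_size: int):
    """Return the lowest power-of-two offset that aliases address 0, or None.

    The test is destructive: it overwrites address 0 and each probed offset.
    """
    offset = 1
    while offset < assumed_size:
        mem.write(0, 0x00)
        mem.write(offset, 0xA5)    # marker value
        if mem.read(0) == 0xA5:    # address 0 changed: it is the same cell
            return offset
        offset <<= 1
    return None


if __name__ == "__main__":
    wrong = AliasedMemory(real_size=16 * 1024)
    print("alias at offset:", find_alias(wrong, assumed_size=64 * 1024))
    right = AliasedMemory(real_size=64 * 1024)
    print("alias at offset:", find_alias(right, assumed_size=64 * 1024))
```

With the wrong setting the marker shows up at address 0 as soon as the offset reaches the real size; with the right setting no offset aliases and the function returns None.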
This once complex scan has been eliminated with SPDs, as memory sticks now tell us their exact organization as well as timing. This means the scan doesn’t have to account for geometry problems; it just has to find the end of memory.
SecureCore Tiano’s memory scan is optimized for performance, while finding that last byte in a fully-optimal way, using a binary search, so that no memory goes undetected. In legacy BIOSes and some UEFI BIOSes, the memory scan is highly memory-technology specific. Unless a user actually selects a full memory test in the setup system (as they might if their laptop was misbehaving), this optimization contributes to the SecureCore Tiano boot time of around one second.
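The article does not show how SecureCore Tiano implements its scan, so the following is only a generic binary search for the end of memory. It assumes memory is contiguous from block 0 up to some top block, and that a hypothetical probe, block_works, reports whether a given block responds.

```python
# Generic binary search for the top of memory. block_works is a hypothetical
# probe; the contiguity assumption is what makes binary search valid here.
from typing import Callable


def find_memory_size(block_works: Callable[[int], bool], max_blocks: int) -> int:
    """Return the number of working blocks, assuming they are contiguous from 0."""
    lo, hi = 0, max_blocks  # invariant: blocks below lo work, blocks from hi up do not
    while lo < hi:
        mid = (lo + hi) // 2
        if block_works(mid):
            lo = mid + 1    # the top is above mid
        else:
            hi = mid        # the top is at or below mid
    return lo


if __name__ == "__main__":
    installed = 3072        # simulated number of populated blocks
    probes = []

    def probe(block: int) -> bool:
        probes.append(block)
        return block < installed

    print("blocks found:", find_memory_size(probe, max_blocks=4096))
    print("probes used: ", len(probes))
```

A linear scan over 4096 possible blocks would touch every block, while the binary search needs at most 13 probes, which is the kind of saving that matters when the whole boot is budgeted at about a second.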


