The Fourth Laboratory ("Lab 4")

Setup

The start-up script has created the directory smd137/lab4 and placed the files "mips_pipe_extended.xml" and the template for your *.s file "lab4.s" there. Compile using:

	sde-as -mcpu=r3k -O0 lab4.s -o test.o
	sde-ld -T linker_script test.o -o a.out
	sde-objdump -h -z -s -d -t a.out > mips_pipe_extended_program.objdump
	rm a.out test.o
Alternatively, run the same steps through the Makefile:
	make as

General information on lab 4

You will turn in a single source file for SyncSim "Mips Pipe Extended" mode,

       DT_LABNUMMER     4
       DT_LABDEL        1

You MAY use MIPS pseudo-ops in this lab.
You will submit your lab by electronic mail as usual; the details are found here.

What the Lab is About

We continue your training in the handling of interrupts and exceptions. We will observe what actually happens (at the processor/software interface) when some outside agent wants to get the attention of the processor: for example, a user striking a key on the keyboard, a memory subsystem which has detected a TLB miss, a disk drive which has just completed a previously initiated DMA read operation, or a message which has just arrived over the network interface. The same applies to any other event whose time of occurrence (if it occurs at all) your program cannot predict. This means it would be too costly for your program to "check every now and then" whether the event has occurred (this is called "polling"). Polling has the disadvantage that if you do not check "often enough", then the processor (as perceived by the outside agent) will respond slowly to the event. Polling also consumes time which could have been used for something else, since most polls come back negative, that is, no event was detected.

Polling is the processor going around and checking on all the outside agents, and servicing them if it finds an agent which needs help. This is usually done by reading a "status register" associated with the device (using a lw instruction in our case), and then interpreting the status register. The more devices there are, the more time polling will take.
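A polling loop of this kind might look like the sketch below. The address is the input control register used later in this lab, but the meaning of READY_BIT (bit 0) is an assumption made purely for illustration and is not documented here; "device_ready" is a hypothetical label.

```asm
        .set noreorder
        .equ  READY_BIT, 0x1         # ASSUMPTION: bit 0 means "device needs service"

poll:   li    $t0, 0xFFFF0000        # address of the device's status register
        lw    $t1, 0($t0)            # read the status register
        nop                          # load delay slot
        andi  $t1, $t1, READY_BIT    # isolate the (assumed) ready bit
        beq   $t1, $zero, poll       # nothing pending: poll again
        nop                          # branch delay slot
device_ready:                        # hypothetical: service the device here
```

Every trip around this loop costs processor cycles even when nothing has happened, which is exactly the overhead described above.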

But there is another way: let the device come to the processor and disturb the processor's normal operation. As seen by the software, the mechanism involves taking control away from the currently executing program. What you will learn in this lab is typical of present-day processors. Once control is taken from the currently executing user process, a special part of the machine's program takes over, known by various names: "kernel", "kernel code", "operating system", "executive", and others. We will use these terms interchangeably to mean the code which is assembled into ".ktext", together with any data in ".kdata", in the assembler.

Once the kernel gains control, it can service the requesting external agent, and then return control to the interrupted process. The interrupted process has no way to "know" that it has been interrupted. Alternatively, the kernel can control the way in which external events cause the user process to be restarted, which you will learn about in the "Reactive Programming" course. For example, the kernel does not need to return control to the interrupted user process; instead it can give the processor to some other process. In this lab you will write a system having three such time-sliced processes. If this "context switching" happens quickly enough, it gives the illusion (from the outside) that the processor is working on many processes at the same time (although it is actually just interleaving them in small "time slices"). This is called multiprogramming, something you will learn about when studying the "Operating Systems" course. The next step is: if we can have many user processes all sharing a single processor, why not have those same user processes share multiple processors? This is called multiprocessing and multicomputing and is in the realm of the "Computer Architecture" course. All of these advanced studies depend on your having understood the basic mechanism which is taught in this lab. You will learn:

What to Code

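In this lab you will write a mini-operating-system for the Mips Pipe Extended model, which is able to "handle" the following two kinds of "exceptions":

  - a SYSCALL executed by a user process with $v0 = 0x102 (print the character in $a0)
  - a clock (timer) interrupt, on which the kernel performs a context switch to the next process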

Your operating system must branch to "kernel_loop" on any other type of exception. Be sure to test for this by striking some key in the input window, as you did in Lab 3. Your program should then reach "kernel_loop" through some branch instruction in your kernel.

Here's the "shell" lab4.s (in smd137/lab4 directory) of what your source file will look like. Explanations follow. It is OK to use pseudo-instructions.


# first section:  user data area,
# each process has its own "private" area.

        .data	
glob1:  .byte   'A'
        .space  999
glob2:  .byte   '0'
        .space  999
glob3:  .byte   'a'
        .space  999

# beginning of user code.
#
# The first instructions initialize IO/Timer.
# The last instruction in main is only for start-up,
# and initializes the first process.  No such
# instruction appears for the second or third
# process !

        .text
	.set noreorder
main:
        li    $t0, 0xFFFF0010    # address to Timer registers:
                                 # +0: Timer control register
				 # +4: Timer count register

        li    $t1, 999           # count to 0, i.e. 1000 times
        sw    $t1, 4($t0)        # count register := 999

        li    $t1, 0b110         # "110": interrupt enable,
                                 #        count enable
        sw    $t1, 0($t0)        # control register := "110"

        li    $t0, 0xFFFF0000    # address to I/O registers:
                                 # +0: Input control register
 
        li    $t1, 0b10          # "10": interrupt enable
        sw    $t1, 0($t0)        # control register := "10"

        li    $t0, 0x0C03        # enable Int1, Int0 ("C"),
                                 # Int1: I/O
				 # Int0: Timer
                                 # set user mode, IE ("3")
        mtc0  $t0, $12           # CP0 status := 0xC03
	la    $gp, glob1	 # dirty setup for process 1

         # +++++++++ first process ++++++++++
proc1:      

  # description: proc1 reads the byte stored at 0($gp),
  # prints it, increments to the next character,
  # saving that back into 0($gp).  After printing 'Z',
  # this process should then start over again with 'A', an
  # endless loop.  symbol "glob1" never used in this code.              


         # +++++++++ second process ++++++++++
proc2:

  # description: almost identical code to the above,
  # only two lines should differ.  Prints '0' through '9'
  # as endless loop.  uses only $gp, symbol "glob2" never used.


         # +++++++++ third process ++++++++++
proc3:

  # description: almost identical code to the above,
  # only two lines should differ.  Prints 'a' through 'z'
  # as endless loop.  uses only $gp, symbol "glob3" never used.

  
# end of three user processes.
# now come the data structures for the
# "operating system":
# Process Control Block (PCB) is three words:
# pcb: .word  (next Program Counter for this process)
#      .word  (contents of $gp for this process)
#      .word  (contents of $sp for this process)
# all other context is saved on the process' own stack
# during exception handling and SYSCALL.


        .section .kdata
curpcb: .word  pcb1
pcb1:   .word  0, 0, 0
pcb2:   .word  proc2, glob2, 0x7fffbf94
pcb3:   .word  proc3, glob3, 0x7fff7f94
	
        .section .ktext , "xa"
	.set noreorder
# the operating system code begins here:


kernel_loop:
	b kernel_loop
	nop


Now let's see if I can explain all this so that you can do the lab. First: pretend that you are one of the user processes, for example "proc2". When you are "awake," then you will see your own personal stack, and your own personal contents in register $gp which POINTS to the top of your global area. The idea here is: you must always access $gp-relative if you want to use the global area. NEVER use absolute addressing using the symbol "glob2"... just pretend that you can't see that symbol. This is because you do not know where your global data is located at compile-time, you only know that the operating system will assign a $gp for you before your program is started up. It may not even be the same value for any two given runs; and it shouldn't make any difference either; just as long as once it has been assigned, it doesn't change for that run.
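In code, the difference between the two addressing styles is small. A minimal sketch (the choice of $t0 and $t1 is arbitrary):

```asm
        # Allowed: reach the global area through $gp only.
        lbu   $t0, 0($gp)        # first byte of *this* process' global area
        nop                      # load delay slot

        # Not allowed in this lab: hard-wiring the symbol into user code.
        # la    $t1, glob2
        # lbu   $t0, 0($t1)
```

The first form keeps working wherever the operating system decides to place the global area; the second breaks as soon as $gp is assigned a different value.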

So, since you "are" proc2, your job is: read the first byte in your global area, print that character, then "bump" to the next character (wrapping back to the start of your sequence as necessary) and store the character back to the global area. You must take the next character to print from the global area, not from some old copy sitting in a register. The idea is to practice $gp-relative memory addressing. The program is a simple endless loop.

To print the single character, "you" (proc2) must put the character into $a0, put the special code 0x102 into $v0, and then execute the SYSCALL instruction. This raises an exception that transfers control to the operating system, which identifies that the current user process has just executed a SYSCALL and then services the request.
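Put together, the loop for the digit process could be sketched as follows. This is one possible arrangement, not the required solution; "digit_loop" is a hypothetical label, and the nop-free scheduling assumes the usual R3000 one-slot load delay.

```asm
        .set noreorder
digit_loop:                          # hypothetical label
        lbu   $a0, 0($gp)            # character to print, always taken from memory
        li    $v0, 0x102             # service code "print character" (fills the load delay slot)
        syscall                      # trap to the kernel; no delay slot

        lbu   $t0, 0($gp)            # re-read: do not trust a stale register copy
        li    $t1, '9'               # last character of the sequence (fills the load delay slot)
        bne   $t0, $t1, 1f           # not at the end: keep the incremented value
        addiu $t0, $t0, 1            # branch delay slot: executes on both paths
        li    $t0, '0'               # at the end: wrap to the first character
1:      sb    $t0, 0($gp)            # store the next character back
        b     digit_loop
        nop                          # branch delay slot
```

Note that $v0 is reloaded on every pass, since the kernel may return a result in it.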

Now let's change our viewpoint and pretend that we are the operating system. There are three "user processes" out there, our job is to service all exceptions as best we can, and when the next "clock tick" arrives, to bundle up the current process, tuck him away into his PCB, select the next process, unbundle him from his PCB, and turn the CPU over to him.
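The "bundle up, select next, unbundle" step can be sketched against the PCB layout in the shell above. This fragment is meant to sit inside ".ktext" and relies on the shell's labels curpcb, pcb1 and pcb3. It assumes that EPC is CP0 register $14 (as on the R3000), that the three PCBs are laid out back to back as in the shell, and that the register frame has already been pushed. Acknowledging the timer device is left out.

```asm
        .set noreorder
        # ... the 27 user registers have already been pushed on the user stack ...

        la    $k0, curpcb
        lw    $k1, 0($k0)            # $k1 -> PCB of the interrupted process
        mfc0  $k0, $14               # EPC: where that process must resume
        nop                          # mfc0 result is not usable in the next slot
        sw    $k0, 0($k1)            # pcb.pc
        sw    $gp, 4($k1)            # pcb.gp
        sw    $sp, 8($k1)            # pcb.sp

        addiu $k1, $k1, 12           # step to the next 3-word PCB
        la    $k0, pcb3 + 12         # one past the last PCB
        bne   $k1, $k0, 1f
        nop                          # branch delay slot
        la    $k1, pcb1              # wrap around to the first process
1:      la    $k0, curpcb
        sw    $k1, 0($k0)            # curpcb := PCB of the new process

        lw    $gp, 4($k1)            # unbundle the new process
        lw    $sp, 8($k1)
        lw    $k0, 0($k1)            # resume address, for the final jr
        # ... pop the 27 registers from the new stack, then return to user mode ...
```

Only $k0 and $k1 are used as scratch registers, so nothing belonging to either process is disturbed between the push and the pop.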

In order for us to know which user process was in control of the machine at the time of an exception, we have a variable called "curpcb" (actually, "current process") which points to the PCB of the current process. That is why it is initialized to "pcb1", because SyncSim starts up "proc1" as the initial process. Notice that each PCB has room for three words, as documented above. Notice also that the PCB for "proc1" is uninitialized, but that doesn't matter, because the current values of $gp, $sp, and PC for "proc1" were in the machine registers at the time of the exception. For the other two processes however, they must be in the respective PCBs, since those two processes were not in control of the machine when the exception occurred: "curpcb" was not pointing to their PCB. Keeping in mind that the simulator initializes $sp to 0x80000000, you should be able to see how the operating system is not only assigning separate areas of memory for "global area" $gp, but is also assigning different areas of memory for each user process stack as well.
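One way to read the two initial stack pointers, offered as an inference from the numbers rather than something stated in this handout: each process gets a 0x4000-byte slice below 0x80000000, and the stored $sp already sits 27 words (0x6C bytes) below the top of its slice, so that the first "unbundle" can pop a full register frame. The names STACK_TOP, STACK_SLICE and FRAME_BYTES are made up for this sketch.

```asm
        .equ  STACK_TOP,   0x80000000    # simulator's initial $sp
        .equ  STACK_SLICE, 0x4000        # ASSUMED per-process stack size
        .equ  FRAME_BYTES, 27 * 4        # 27 saved registers = 0x6C bytes

        # 0x80000000 - 1*0x4000 - 0x6C = 0x7fffbf94   (pcb2)
        # 0x80000000 - 2*0x4000 - 0x6C = 0x7fff7f94   (pcb3)
        .section .kdata
pcb2:   .word  proc2, glob2, STACK_TOP - 1*STACK_SLICE - FRAME_BYTES
pcb3:   .word  proc3, glob3, STACK_TOP - 2*STACK_SLICE - FRAME_BYTES
```

These two lines are an equivalent spelling of the pcb2 and pcb3 definitions in the shell, not additional data.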

The first thing the operating system must do is to save all register contents onto the stack. There will be 27 registers to save, because $zero is obviously always zero, $gp and $sp will be effectively stored into the PCB, and $k0 and $k1 are for the exclusive use of the operating system (user processes may not use them, but notice that a user process may freely write to them: so don't rely on the user process not being able to clobber $k0 or $k1 -- because it can). The reason you are saving all 27 registers is because, as the operating system, you do not "know" how many of the legal user-available registers the process out there is actually using: so you must save them all. And not just the ones which the operating system needs either: because if you do a context switch, the new process must get back ALL of its own registers.
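A sketch of the save sequence follows. The order of the registers within the frame is a free choice made here for illustration; the only requirement is that the restore sequence uses the same offsets. ".set noat" is needed because the code names $at explicitly.

```asm
        .set noreorder
        .set noat                    # we save $at ourselves
        addiu $sp, $sp, -108         # room for 27 words on the user stack
        sw    $at,   0($sp)
        sw    $v0,   4($sp)
        sw    $v1,   8($sp)
        sw    $a0,  12($sp)
        sw    $a1,  16($sp)
        sw    $a2,  20($sp)
        sw    $a3,  24($sp)
        sw    $t0,  28($sp)
        sw    $t1,  32($sp)
        sw    $t2,  36($sp)
        sw    $t3,  40($sp)
        sw    $t4,  44($sp)
        sw    $t5,  48($sp)
        sw    $t6,  52($sp)
        sw    $t7,  56($sp)
        sw    $s0,  60($sp)
        sw    $s1,  64($sp)
        sw    $s2,  68($sp)
        sw    $s3,  72($sp)
        sw    $s4,  76($sp)
        sw    $s5,  80($sp)
        sw    $s6,  84($sp)
        sw    $s7,  88($sp)
        sw    $t8,  92($sp)
        sw    $t9,  96($sp)
        sw    $fp, 100($sp)
        sw    $ra, 104($sp)
        .set at
```

ADDIU is used for the stack adjustment so that the kernel cannot raise an overflow exception of its own.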

After that, the operating system must determine what it is that has just happened: why did CPU control jump to the (single) entry point for the operating system ? This requires collecting information from special hardware registers, and interpreting them.
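The decoding step might be sketched as below. It assumes the standard R3000 Cause register layout (CP0 register $13, ExcCode in bits 6..2, ExcCode 0 for an external interrupt and 8 for SYSCALL, pending bit 10 for Int0); verify these against the reference before relying on them. "on_interrupt", "on_timer" and "on_syscall" are hypothetical labels, and "kernel_loop" is the one from the shell.

```asm
        .set noreorder
        mfc0  $k0, $13               # Cause register
        nop                          # mfc0 result is not usable in the next slot
        andi  $k0, $k0, 0x7C         # keep ExcCode, bits 6..2
        beq   $k0, $zero, on_interrupt   # ExcCode 0: external interrupt
        li    $k1, 0x20              # delay slot: ExcCode 8 (Sys), already shifted left by 2
        beq   $k0, $k1, on_syscall
        nop                          # branch delay slot
        b     kernel_loop            # anything else ends up in the kernel loop
        nop

on_interrupt:
        mfc0  $k0, $13               # look at the pending-interrupt bits
        nop
        andi  $k0, $k0, 0x0400       # bit 10: Int0, the timer
        beq   $k0, $zero, kernel_loop    # not the timer (for example a keystroke)
        nop
on_timer:                            # hypothetical: context switch goes here
        b     kernel_loop            # placeholder so the sketch does not fall through
        nop
on_syscall:                          # hypothetical: SYSCALL service goes here
        b     kernel_loop            # placeholder
        nop
```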

A note here: if your operating system determines that it is going to return control to the same process which was just interrupted, then it does not have to save ALL the registers, just the ones it needs to use itself. Also, see the discussion below about how a real operating system would probably change the stack pointer to use its own private stack. We aren't going to require this of you, but you must understand why this is normally done in "real" operating systems.

On a SYSCALL with $v0 = 0x102, you must print the character in the user process' $a0 out to the output window of SyncSim. On a clock interrupt, you must perform a "context switch" and give control to the next process. Any other exception must cause a jump to the infinite loop "kernel_loop:" in the kernel.
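For the SYSCALL case, a short path that touches only $k0 and $k1 could look like the sketch below. It assumes that $v0 and $a0 still hold the user's values, that the SYSCALL was not placed in a branch delay slot, that EPC is CP0 register $14, and that the usual R3000 "jr / rfe" pair is the way back to user mode. Sending an unknown service code to "kernel_loop" is also an assumption; the handout does not say what should happen in that case. "print_char" is a hypothetical label.

```asm
        .set noreorder
print_char:                          # hypothetical label
        li    $k0, 0x102
        bne   $v0, $k0, kernel_loop  # ASSUMPTION: unknown service codes are fatal
        lui   $k1, 0xFFFF            # delay slot: $k1 = 0xFFFF0000, I/O base address
        sb    $a0, 8($k1)            # write the character to 0xFFFF0008
        mfc0  $k0, $14               # EPC = address of the SYSCALL itself
        nop                          # mfc0 result is not usable in the next slot
        addiu $k0, $k0, 4            # resume at the instruction after it
        jr    $k0
        rfe                          # delay slot: restore the previous mode and interrupt bits
```

Because the same process is resumed and no user register is modified, nothing needs to be pushed on the stack for this path.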

What You Should See

That, together with the course material and the information on this page, should be enough for you to code the lab completely. To run it, open the Mips Pipe Extended model, open the output window and press play; you should see the sequence of capital letters ABC...ZABC... in the output window. When a clock tick occurs (1000 cycles after the timer count has been enabled), the next process should be started and the sequence 012...9012... should be printed. You should see something like this in your output window:

ABCDEFGHIJKLMNOPQ0123456789012345abcdefghijklmno
RSTUVWXYZABCDEFG67890123456789012pqrstuvwxyzabcd
HIJKLMNOPQRSTUV34567890123456789efghijklmnopqrst
WXYZABCDEFGHIJK0123456789123456uvwxyzabcdefghiLM
NOPQRSTUVWXYZ7891234567891234jkl................

You must be certain that no character in any sequence is missed, otherwise you may have corrupted a user process context.

Some Pedagogical Reminders

Details and Hints

Remember:

SyncSim's "Mips Pipe Extended" model can simulate the following types of exceptions. Exceptions always cause a branch to the first instruction of ".ktext".

and the following three CP0-registers have been implemented (as detailed in the "MIPS 33000 Reference").

Note on External Interrupts: SyncSim (Mips-Pipe Extended) starts your program at the instruction that the PC is initialized to in the design file (default 0x00000000), in kernel mode (CP0 Status Reg bit KUc = 0), and with the overall interrupt system disabled (bit IEc = 0). This allows you to initialize the coprocessor and external devices, and finally to switch to user mode. (Note that this is NOT how it is done in a real MIPS; there you would use an external RESET signal to cause a reset interrupt, which would take care of system initialization.)
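The value 0x0C03 written to the Status register in the shell can be read bit by bit. The sketch below names the bits; the positions follow the R3000 Status layout and the Int0/Int1 assignment follows the comments in the shell, while the ST_* names themselves are invented here.

```asm
        .equ  ST_IEc,     0x0001     # bit 0:  interrupt enable (current)
        .equ  ST_KUc,     0x0002     # bit 1:  1 = user mode (current)
        .equ  ST_IM_INT0, 0x0400     # bit 10: mask bit for Int0 (timer)
        .equ  ST_IM_INT1, 0x0800     # bit 11: mask bit for Int1 (I/O)

        li    $t0, ST_IM_INT1 | ST_IM_INT0 | ST_KUc | ST_IEc   # = 0x0C03
        mtc0  $t0, $12               # CP0 Status
```

Writing the constant this way makes it easier to see which single bit to drop when, for example, testing with the I/O interrupt masked.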

In order for an external interrupt to actually occur, three necessary conditions must be met:

Once the external interrupt has been detected in the kernel, the device that caused the corresponding interrupt should be acknowledged.

Note on other Exceptions: There is no way to "mask" the other exceptions; for example, an arithmetic overflow from an ADD instruction will always cause a branch to the kernel code. Also, write your kernel so that it is impossible for it to generate its own exceptions: use ADDU rather than ADD, ADDIU rather than ADDI, and so on.
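A small illustration of the difference, using $k0 as an arbitrary scratch register:

```asm
        li    $k0, 0x7FFFFFFF        # largest positive 32-bit value
        addiu $k0, $k0, 1            # wraps to 0x80000000, no exception
        # addi  $k0, $k0, 1          # same operands would raise an overflow exception
```

The unsigned forms compute the same bit pattern; they simply never trap, which is what kernel address arithmetic needs.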

Note on single character input-output: When a keystrike exception occurs, then there is a single ASCII character waiting, which may be retrieved by doing a LBU from address 0xffff0004. (That means that only the operating system can read this address). You may only read the character once. In general, you should only try to read from this address when a keystrike interrupt has just occurred. Also, if a keystrike interrupt has occurred, normally you MUST read the waiting character, in order to clear the pending interrupt condition. In nearly all machines, unless you do something to clear the interrupt condition, then the same interrupt will just hit you again as soon as you return the CPU back to user mode. Writing a byte (SB) to 0xffff0008 causes the character to be printed to the output window. You do not have to wait, or test any "flag" bit for the character output device: simply assume that it always succeeds.
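The two device accesses described here can be sketched as below. Echoing the character is only an illustration of the two addresses; in this lab a keystroke is supposed to end up in "kernel_loop" instead.

```asm
        .set noreorder
        lui   $k1, 0xFFFF            # $k1 = 0xFFFF0000, I/O base address
        lbu   $k0, 4($k1)            # read the waiting character from 0xFFFF0004
        nop                          # load delay slot
        sb    $k0, 8($k1)            # print it by writing to 0xFFFF0008
```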

Note on SYSCALL instruction: The description of this instruction in the LSI-LOGIC book indicates that it is possible to read the SYSCALL instruction which caused the exception, and examine some bits which are stored in that instruction. You don't have to do that.

Instead we will treat "SYSCALL" as a sort of subroutine call: we put parameters into a0/v0, execute SYSCALL (instead of "bal"), which then passes control to the kernel code, and the kernel code (later) returns to the user program with the result in v0. SYSCALL is our "subroutine call" to the operating system, and a0/v0 tells the operating system what service we want. SYSCALL, unlike a branch or jump, does NOT have a delay slot (since SYSCALL causes an exception which kills the following instruction). SYSCALL should cause return to the instruction IMMEDIATELY following the SYSCALL instruction. Some details:

Do NOT erase your lab after you turn it in!


Last modified 2005-12-05 by pln