Here are some things, in no specific order, that I'd like to fix or
implement in GXemul. Some items in this list are perhaps already fixed.
Legend:
    [ ] = not started yet, or just planning/thinking about it
    [/] = started, either "on paper" or actual implementation
    [X] = done (but usually these entries are removed from the TODO list)

------------------------------------------------------------------------

[/] Intel i960 CPU and machine modes.
    [/] i960 instruction disassembly.
        [X] i960CA
        [ ] Other variants.
    [/] i960 instruction execution.
        [ ] Enough to run the HP 700/RX ROM.
        [ ] Enough to run uClinux/i960.
    [ ] HP 700/RX emulation:
        [ ] Figure out more of the memory map by running the ROM image
            in the emulator.
        [ ] Interrupt controller(s)?
        [ ] Graphics
        [ ] Serial port(s)
        [ ] Keyboard, mouse, ...
        [ ] Ethernet
    [ ] Cyclone/VH i960 evaluation board emulation (for uClinux/i960).

[ ] Use std::shared_ptr instead of my own hack (refcount_ptr)?

[/] Cache components. This is hopefully a good way to get started working
    on GXemul again, after the long break.
    Sub-tasks: (write unit tests for EACH of these, while going along)
    [/] State consists of
        [/] size (e.g. 4 KB, or 1.75 MB)
        [/] associativity (1 = direct mapped, 4 = 4-way,
            0 = fully associative?)
        [/] cache-line length (e.g. 4 bytes, or 64 bytes, or more)
        [ ] replacement policy: ?
        [ ] write-policy: write-through vs write-back vs copy-back
    [ ] Small memory area (the cache memory) itself, consisting of cache
        lines. Cache lines are:
        [ ] data
        [ ] tag/index(offset?)
        [ ] valid bit
        [ ] dirty bit
        [ ] PC of last writer?
    [ ] Implement the addressable data bus interface, but if data is
        contained in the cache (the small memory state belonging to the
        cache component), do not call the main memory.
    [ ] How about simulating cycle delays? A read or write in this type of
        model is not instantaneous, but rather a "request", which may
        take an arbitrarily long time to execute and return a result.
    [ ] If possible, for each cache-line, store statistics:
            nr of hits (reads, writes)
            nr of misses (reads, writes)
        on writes, store the PC of the writer, so that on misses,
        statistics can be gathered.
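
A minimal sketch of the cache state listed above, written in Python for
brevity rather than in GXemul's C++. Everything here is an assumption made
for illustration: the class and method names, LRU as the replacement
policy, write-back as the write policy, and a plain bytearray standing in
for the main memory behind the cache. It only shows how size,
associativity, line length, valid/dirty bits and hit/miss counters fit
together.

```python
from dataclasses import dataclass, field


@dataclass
class CacheLine:
    valid: bool = False
    dirty: bool = False
    tag: int = 0
    data: bytearray = field(default_factory=bytearray)
    last_use: int = 0  # timestamp for LRU (assumed replacement policy)


class Cache:
    """Hypothetical write-back, LRU cache in front of a bytearray."""

    def __init__(self, size, associativity, line_length, backing):
        if associativity == 0:  # 0 = fully associative, as in the list above
            associativity = size // line_length
        assert size % (associativity * line_length) == 0
        self.line_length = line_length
        self.nr_of_sets = size // (associativity * line_length)
        self.backing = backing  # stands in for main memory
        self.sets = [[CacheLine() for _ in range(associativity)]
                     for _ in range(self.nr_of_sets)]
        self.stats = {"read_hits": 0, "read_misses": 0,
                      "write_hits": 0, "write_misses": 0}
        self.clock = 0

    def _lookup(self, addr, kind):
        self.clock += 1
        line_nr = addr // self.line_length
        index = line_nr % self.nr_of_sets
        tag = line_nr // self.nr_of_sets
        lines = self.sets[index]
        for line in lines:
            if line.valid and line.tag == tag:
                self.stats[kind + "_hits"] += 1
                line.last_use = self.clock
                return line
        self.stats[kind + "_misses"] += 1
        # Victim: an invalid line if there is one, else least recently used.
        victim = min(lines, key=lambda l: (l.valid, l.last_use))
        if victim.valid and victim.dirty:
            # Write the evicted dirty line back to main memory.
            old = (victim.tag * self.nr_of_sets + index) * self.line_length
            self.backing[old:old + self.line_length] = victim.data
        base = line_nr * self.line_length
        victim.valid, victim.dirty, victim.tag = True, False, tag
        victim.data = bytearray(self.backing[base:base + self.line_length])
        victim.last_use = self.clock
        return victim

    def read_byte(self, addr):
        return self._lookup(addr, "read").data[addr % self.line_length]

    def write_byte(self, addr, value):
        line = self._lookup(addr, "write")
        line.data[addr % self.line_length] = value
        line.dirty = True
```

Per-line statistics and the PC of the last writer could be added as extra
CacheLine fields in the same way; the sketch keeps one counter set per
cache only to stay short.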
    [ ] Caches should be added _onto_ a cpu. The cpu should first look for
        a cache as an address data bus, THEN for a parent.
    [ ] The cache must then go downwards TWO steps (../..) to get past the
        owning cpu, when trying to reach the mainbus/ram.
    [ ] How about multi-level caches that are shared between CPUs/cores?
        Think through this. Or:
            machine
                mainbus
                    ram
                    rom
                    l2cache0
                        cpu0
                            l1icache
                            l1dcache
                        cpu1
                            l1icache
                            l1dcache
                    l2cache1
                        cpu2
                            l1icache
                            l1dcache
                        cpu3
                            l1icache
                            l1dcache
        if e.g. a 4-core CPU has each core having its own L1 caches (I and
        D), but sharing L2 caches between pairs of cores.
        Problem: What if there is hyperthreading, and the hyperthread
        "cpus" share L1I _AND_ L1D? Then the L1 caches cannot be children
        of both cpus.
    [ ] Test with caches attached to M88K and MIPS cpus.
    [ ] How about a command line switch (or other command) to quickly turn
        caches on or off? Or maybe just an optional argument to the
        machine templates (caches=true or false), but that makes it hard
        to switch during runtime. (Switching during runtime from no caches
        to caches is easiest (?), but from caches to non-cached will
        require cache flushes...)

[/] Improve dyntrans CTRL-C behavior! Abort quicker.
    [X] Implement an abort instruction call
    [X] sync pc should take this kind of abort into consideration.
    [X] Rename exceptionInDelaySlot -> exceptionOrAbortInDelaySlot
    [X] Get rid of abort_in_delay_slot (handled by abort).
    [/] Unit tests for the above (M88K).
    [/] Function call trace -> abort quickly on ctrl-c.
    [ ] There is still a crash bug; try interleaving continue + CTRL-C +
        continue + CTRL-C, with trace enabled!

[ ] Components! instead of the planned plugin stuff?!
    [ ] Pretty print of files and disk images:
            cpu0  (5KE, 100 MHz)
            \-- file  (netbsd-GENERIC.gz)
        or
            wd0  (ATAPI primary harddisk)
            \-- image  (cow, netbsd.img)
        Should these be added without digits primarily? And then starting
        from 1. "file", "file1", "file2" etc. Or is file0 better? Just
        "file" is slightly clearer.

        When adding e.g.
        a machine with a specific ROM, the ROM file could be looked for in
        standard places (/usr/blah/share/gxemul/rom/therom.bin,
        ~/.gxemul/rom/therom.bin, ./therom.bin, etc) and if not found
        anywhere, give an error unless overridden using arguments to the
        machine template. Like:
            gxemul -e "sgi_ip32(prom=my_therom.bin)" netbsd.gz

        New syntax:
            path_to_add_to:component(args)
        or  path_to_add_to:filename
            args is: "name=filename" or filename
            cpu0 is the path to add to

        I.e. DON'T make an "attach" command, just use the add command.
        Maybe change it to:
            add component_type [at existing_component_path]
        and/or support the colon-style notation as well?
            add [existing_component_path:]component_type(args)
        or  add existing_component_path:filename
        Think about this!!!

            $ gxemul -e testmips netbsd-GENERIC.gz
            $ gxemul -e testmips netbsd-GENERIC.gz wd0:netbsd.img fb_videoram0:sdl()
            $ gxemul -e testmips netbsd-GENERIC.gz file(type=raw,vaddr=0xbfc00000,name=prom.bin)
        Same as:
            add file(name=netbsd-GENERIC.gz) cpu0
            add disk(name=netbsd.img) wd0
            add file(type=raw,vaddr=0xbfc00000,name=prom.bin) cpu0
            add sdl() fb_videoram0

    [ ] What should a component be able to do? [as a plugin]
        [ ] Attach/load/add
        [ ] On-early-reset (clear memory etc)
        [ ] On-late-reset (fill memory with file contents etc)
        [ ] Detach
        [ ] Monitor changes in component(s)' state
        [ ] Periodical updates (e.g. framebuffers)
        [ ] Run in a different thread (pthread multithreading?)
        [ ] Insert stuff into the event queue (e.g. keyboard keypresses
            from a framebuffer/keyboard plugin)
    [ ] The Snapshots (the clones) release the plugin connection; the
        plugins only work on the main tree.
    [ ] Plugins have to be compatible with replaying during reverse
        execution! (Maybe they should be disconnected while re-executing?)
    [ ] Think about how events should contain changes such as breakpoints,
        added/removed plugins, etc.
    [ ] Example plugins:
        [ ] File loaders
            [ ] Move these from src/main/fileloaders to
                src/components/file ...
            [ ] raw mode: file(raw(vaddr[,skiplen[,entry]]))
        [ ] Disk image mappers: src/components/image
        [ ] Framebuffer displays
            [ ] Think about VGA (charcell AND video framebuffer, being
                able to switch live)
        [ ] Keyboards (host -> emulated)
        [ ] Serial controller -> scrollback buffer -> xterm / other window
            or even connected to stdin/stdout or files or terminals.
        [ ] Audio (emulated -> host)
        [ ] Cache statistics viewer.
    [ ] com0:stdio() should not be needed, if it is the default choice.
        com0:null() would disconnect com0 from any output.
        stdio() would then have to be a plugin which handles multiplexing
        of multiple outputs, and allows one input serial console.
    [ ] How about "-X" behavior of the legacy modes? I.e. a machine, which
        when run with -X will set up PROM emulation variables so that the
        guest OS uses graphics, and without -X to use serial console...

[/] The testm88k machine.
    [X] Implement 88100 instruction disassembly.
    [X] Implement basic 88K instruction execution, enough to run the
        rectangle drawing demo.
    [ ] Framebuffer output?
        [ ] Start thinking about the plugin framework, use sdl for video
            output!

[ ] SPARC64? Would be nice; multithreaded Niagara emulation etc.

[/] STEP EXECUTIONS, REVERSE EXECUTION, SNAPSHOTTING, ...
    [ ] Try to get away from using doubles for scheduling! The scheduling
        must be made more stable, even in the presence of rounding errors.
        See e.g. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=570899
    [ ] Compact mode trace in the new framework: a column to the right
        with delta EXCEPT program counter deltas. This allows for
        instruction trace with one instruction per line, like in the
        legacy modes.
    [/] Continuous execution forward.
        [ ] "Sloppy" accuracy mode: it could be possible to interleave
            large chunks even when components "collide" in time. (This
            will cause reverse execution support to be turned off.)
    [ ] step recalculation:
        [ ] When a component is added to the tree, recalculate its nr of
            steps!
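
One possible way to get away from doubles in the scheduling item above,
sketched in Python. This is not how GXemul computes steps today; it only
illustrates the idea of comparing "time of next cycle" values as exact
rationals (cycle count divided by an integer frequency in Hz), so that the
execution order cannot drift because of rounding. The function names and
the tie-breaking rule (alphabetical by component name) are assumptions.

```python
from fractions import Fraction


def next_component(freqs_hz, cycles_done):
    """Return the name of the component whose next cycle comes first.

    Times are exact rationals, so ordering and ties never depend on
    floating point rounding. Ties go to the alphabetically first name.
    """
    def next_time(name):
        return Fraction(cycles_done[name] + 1, freqs_hz[name])

    return min(sorted(freqs_hz), key=next_time)


def run(freqs_hz, total_cycles):
    """Execute total_cycles cycles; return (execution order, cycle counts)."""
    cycles_done = {name: 0 for name in freqs_hz}
    order = []
    for _ in range(total_cycles):
        name = next_component(freqs_hz, cycles_done)
        cycles_done[name] += 1
        order.append(name)
    return order, cycles_done
```

With a 3 Hz cpu0 and a 1 Hz timer0, cpu0 runs exactly three cycles for
every timer0 cycle, no matter how long the loop runs. Adding, pausing or
removing a component only changes the freqs_hz dictionary, which is where
a "recalculate nr of steps" hook would go.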
        [/] "paused" cpu state, in SMP machines during bootup.
            (Non-executing components.)
        [ ] When a component changes state from paused = true to false,
            recalculate its number of steps!
        [ ] Removing components (at least if the Fastest one is removed)
            should also trigger recalculation.
        [ ] When a component's frequency is changed, recalculate its nr of
            steps! (This requires variable write handlers, including
            inheritance!) Frequency changes are tricky, especially when it
            comes to reverse execution, so think CAREFULLY about this!
    [/] Reverse execution/debugging:
        [ ] When running in reverse, debug output should be disabled!
            (Also, update calls to plugins (e.g. framebuffers) should be
            disabled, until the target step is reached.)
        [ ] Regular saving of state, and limiting the number of such saves
            by logarithmic removal.
        [ ] Disk image overlays!
        [ ] Unit tests.
        [ ] In addition to saving state, all modifications done to the
            component tree, and e.g. breakpoints, need to be stored
            together with their time stamp ("step"). And also all
            "external events", of course.
            * I.e. if a USB storage device is connected at step 1000, and
              we're running in reverse to step 800 to check something,
              and then continue again, the default behaviour should be
              that at step 1000 the USB device is connected again.
            * Or if we run 10 steps, change cpu0.pc, and run 10 more
              steps, then running backwards 10 steps could work, but not
              more? Think about this!
    [ ] Simulation across multiple hosts?
    [ ] Cycle accurate slowdown? E.g. when simulating a 1 MHz
        microcontroller, and a TV display, maybe the host is much faster
        than what is required. The gxemul process could then yield some
        of its CPU time, just about enough to make the process run at
        real-time speed. (With warnings printed whenever real-time is not
        possible, because the host is too slow.) This would be useful for
        some simulations involving real-world Sound producing devices.

[/] NEW DYNTRANS IMPLEMENTATION, ...
    [/] Basic run loop.
        [X] PC to pointers
        [X] Dyntrans page allocation
        [/] Large chunks is safe vs "single stepping" dyntrans...
            [ ] Only "size 2" so far, i.e. delay slots. No support for
                longer instruction combos has been implemented yet.
        [/] Unit tests.
    [ ] Memory writes => invalidate corresponding dyntrans translation
        pages! (needed for full guest OS emulation, and self-modifying
        code)
    [ ] Interrupts should be cycle accurate, and affect the cpu's m_nextIC
        (and other state, such as processor status registers)
        _immediately_.
    [/] Helpers for CPU implementations:
        [X] dyntrans pages?
        [ ] pc to dyntrans page lookups? 32-bit and 64-bit
        [ ] memory to direct host page lookups, TLB-entry based
    [ ] For instruction combination implementations, the first thing
        checked is whether instr combos are allowed; if not, call the
        original function instead. This makes it possible to run fast,
        and then "break slowly" before a hazard.
    [ ] Try variable-length ISAs early on. Maybe amd64? Or AVR32?
    [ ] Multi-encoding ISAs, such as MIPS (MIPS16), ARM (Thumb).
        Commands used to compile MIPS16 test programs:
            mips64-unknown-elf-gcc-4.3.2 -mips16 -O3 urk.c -c
            mips64-unknown-elf-ld urk.o -o urk -e f -Ttext 0xffffffff80004000
    [ ] Optional native code generation: Probably don't have time to
        implement this in the near future; everything else also needs to
        be prioritized against this feature, so...
    [ ] Instruction cache emulation
        [ ] cache component on cpus
        [ ] For each emulated instruction, call a stub handler with the
            current pc. It is responsible for updating statistics etc.
            (Only for cycle accurate emulation, not for sloppy emulation?)
            (Or should there be a separate setting for cache vs non-cache
            emulation?)
    [ ] Generic bus load/store access:
        [ ] Pointers to host memory pages, for fast loads and stores:
            [ ] host_load and host_store (as before)
            [ ] host_load_user and host_store_user (as before, for those
                archs that need it, e.g. ARM and M88K?)
        [ ] Pointers to handler functions, for:
            [ ] device access
            [ ] data cache emulation
            [ ] breakpoints
        [ ] Investigate:
            http://vm-kernel.org/blog/2009/07/10/qemu-internal-part-2-softmmu/
            indicates that QEMU uses lookup mechanisms similar to
            GXemul's, but uses an additional bit in the looked up page
            address to indicate I/O space. Investigate whether this is
            faster than having a pure pointer, and a separate lookup when
            the pointer is NULL!
    [ ] Conditional breakpoints before AND after device accesses!
    [ ] Conditional breakpoints before AND after CPU instruction execution
        (somewhat different from generic device access).

[/] The MVME187 machine.
    [X] Loading OpenBSD/mvme88k bsd and bsd.rd executables! I.e. implement
        an a.out file loader.
    [ ] More initial register contents upon reset: r31 isn't enough for
        mvme187. See old machine_mvme88k.cc for details.
    [ ] Soft PROM emulation, if no PROM file is found.
        MAYBE implement PROM emulation as a plugin? Like
            attach cpu0:mvmeprom(187)
        or even just:
            attach mvmeprom(187)
        since cpu0 would be the default component. The plugin would then
        be responsible for on-reset behavior, AND handling of PROM calls.
        If a software PROM emulation plugin is used for emulating
        different boot loaders, then that can be specified as arguments,
        like: bootregs=linux or bootregs=netbsd or similar.
    [ ] Ability to load raw files (e.g. PROM).
    [ ] Serial controller component.
        [ ] Plugin (?) for serial output (i.e. a terminal window), or
            possibility to connect to stdin/stdout. Should be enough to
            show boot messages.
    [ ] Separate 88200 CMMUs? So that the two are really two separate
        devices.
    [ ] Implement 88K virtual to physical memory translation.
    [/] Implement 88100 instruction execution (completely).
    [ ] Implement 88110 instruction disassembly (e.g. "graphics"
        instructions)
    [ ] Implement 88110 instruction execution.
    [ ] Disk bootblock boot.
    [ ] Reimplement everything from the old mvme187 implementation.
    [ ] Ethernet :)
    [ ] Dyntrans "user memory" implementation for M88K!
    [ ] xmem emulation (set transaction registers)
    [ ] Instruction trace by using bits of ??IP control regs.
    [ ] Breakpoints: How to indicate user space vs supervisor?
        (Also the "dump" instruction and other things need this support.)
    [ ] Instruction disassembly, and implementation:
        o) See http://www.panggih.staff.ugm.ac.id/download/GCC/info/gcc.i5
           for some strange cases of when "div" can fail (?)
        o) Floating point stuff:
            +) Refactor all the fsub, fadd, fmul etc. They are currently
               quite horribly redundant.
            +) Exceptions for overflow etc!

[ ] Address formats?
        "xxxx:yyyy", "xxxx:yyyyyyyy", "zzzzzzzzzzzzzzzz" for amd64,
            plus i/o ports
        "uxxxxxxxx" vs "sxxxxxxxx" for M88K
        most likely others for "bank select" embedded
            archs/microcontrollers.

[ ] amd64/x86 emulation:
    a) Non-dyntrans emulation mode, as a "proof" that CPUs can be
       implemented using slow interpretation, and do not need to be
       complex dyntrans implementations. (PC emulation isn't exactly
       GXemul's niche, anyway, so it is ok that it is extremely slow.
       There are other emulators and virtualizers for users who need a
       fast x86/amd64 experience.)
    b) interesting because of mode switches (16-bit, 32-bit, and long
       64-bit)
    c) interesting because of odd address format (non-RISC style)
    d) interesting because there are lots of guest OSes and other
       software to test with
    e) how about making the name "pc" in CPUComponent overridable? on
       amd64, it is either ip, eip, or rip, depending on mode.
    f) override bus reads/writes, because amd64 transparently allows
       unaligned loads/stores.

[ ] Memory/bus reads in "no exception" mode (for disasm and memory
    dumping, and other things).

[/] Symbol registry.
    [X] Add ELF symbols (see end of FileLoader_ELF.cc).
    [X] Load a.out symbols.
    [ ] Include device addresses and hardware registers in the symbol
        registry (as in the legacy Dreamcast mode).
    [ ] Cloning a CPUComponent should also clone its symbol registry!
        Think about serializing symbols...

[ ] Make sure that BOTH old and new configuration files work!

[ ] Niceness fixes:
    [/] make refcount_ptr<T> support const T as well, so that e.g. code in
        Component::LookupPath can be made to work without a const_cast!
    [ ] avoid exceptions (better to return failure some other way)

[ ] Call m_gxemul->GetRootComponent()->FlushCachedState(); when a
    component is e.g. moved or copied? Example: cpu0 in machine0 is moved
    to machine1, and then cpu.dump is executed. It should then use the
    mainbus of machine1, not machine0! (Maybe not necessary for
    interactive use; FlushCachedState is called by consoleui before every
    command... hm.)

[ ] Cache short path names if evaluation/generation of them becomes too
    heavy.

[/] Mainbus component:
    [ ] Maybe this should simply be a "Bus" base class? That way, it could
        be reused by PCI Busses, VME busses, and others. HOWEVER, the
        concept of address range and overlap may differ between bus
        types, or maybe not even exist in some busses.
    [ ] AddressDataBus should be extended to allow for direct page
        mapping.
    [ ] Unit tests for the above!

[ ] Interrupt subsystem
    [ ] Components exposing an InterruptPin interface? Think about this...

[ ] Timers
    [ ] Host speed approaching timers: a device that wishes to cause
        interrupts 100 times per emulated second will interrupt
        (approximately) 100 times per host second. This is most useful
        for running full guest operating systems at full speed, "virtual
        machine" mode.
    [ ] Exact emulation: a device wishing to cause 100 interrupts per
        emulated second may take much more or much less than one host
        second to execute. ("Cycle accurate" mode.)

[/] State/model:
    [X] Variable write handlers.
    [ ] root.step to move to a certain execution step!
        [X] Change backward-step to set root.step = root.step - 1.
        [ ] Only allow decreasing root.step if snapshots are enabled.
        [ ] Allow increasing root.step always: it means to continue until
            the number of steps has been executed (a kind of special
            breakpoint!)
        [X] Do not allow writes to *.step for non-root components!
        [X] Do not check writes to *.step during deserialization!
    [X] cpu.model = "R4000" <== assigning to the "model" variable should:
        [X] handle writes specifically
        [X] lookup if R4000 is a valid model for the cpu architecture
    [ ] Better error reporting when supplying model using e.g. the command
        line -e "testmips(cpu=R1000)" should show the same error message
        as when running cpu.model="R1000".
    [ ] MIPS ABI selection now works (cpu.abi="o32" vs n32 and n64).
        However, only the NEW names are registered as state variable
        names! How should this be handled? Custom "aliasing" variables?
    [ ] Custom ToString variants? Useful for bit fields, control
        registers, etc. "Verbose tostring"?
    [ ] Loaded files should be part of state/model! But not part of the
        component tree. I.e. state = components + other configuration!
    [ ] Disk images: reverse execution should reverse disk contents, i.e.
        overlays must be used if reverse execution support is enabled.
    [ ] Serialize symbols
    [ ] Serialize breakpoints

[ ] File loaders:
    [ ] automagic .gz (and .bz2) file extraction into /tmp or $TMP or
        such.
    [ ] symbol handling, line number info, data types?
    [ ] argument handling! (arg count, at least)
    [ ] all the others (macho, ecoff, srec, ...)

[ ] Disk images.
    [ ] r vs R modifier: read only disk images cause writes to fail,
        while R could create an implicit empty overlay in the tmp
        directory, so that writes within a session will succeed. At
        reboot/reset they'd be lost.
    [ ] Make sure that there is either
        a) a sync after each write to make sure that the data is
           consistently written, or that
        b) (for the test devices at least, or perhaps some others) a
           mechanism is available to turn off the
           write-after-every-access policy but then the data must be
           manually synched by the guest OS every now and then.
        (This will be a requirement for e.g. a persistent Single Storage
        guest OS.)

[ ] Breakpoints
    Breakpoint = some form of expression, which will be evaluated after
    (or before?) running each cycle. (*)

    (*) Implementation-wise, it may be optimized away, but the semantics
        should be this.

    NOTE: It will not be possible to break INSIDE a component's step, but
    only before or after all components within that step have executed.
    I.e.:
        o) when single-stepping, the breakpoint may simply set a flag
           during execution, and when all components within that step
           have executed, the resulting triggered breakpoints may be
           displayed.
        o) when running continuously, the breakpoint may still not break
           immediately (?) because components may be mixed within the
           last step?
    TODO: Think about this.
    TODO: Any state variable change in any component? How about
          RAM/custom changes? How about register _READS_ or custom reads.
          All Load/stores to virtual addresses?

    [ ] For all breakpoints, it should be possible to break both _before_
        and _after_ a change has occurred!
    [ ] For all breakpoints/watchpoints, a Count should be kept, and the
        emulation should only break once the Count has reached a limit.
        (Usually 1, but should be possible to set to any positive value.)
    [ ] Worst case: Checked on device cycle execution, if necessary.
    [ ] Per memory-mapped/addressable device
        Checked on load/store device access.
    [ ] Per instruction (for CPU components)
        It may be easiest to simply turn off native code generation,
        except for stuff like "check if pc = xyz"-style breakpoints.
        Those can be embedded.

[/] Function call trace etc.
    [ ] string lookup for args
    [ ] symbol lookup for args
    [ ] Document in doc/components/component_cpu.html how to use tracing,
        what the showFunction* state variables do, etc.
    [ ] trace command (taking an argument: nr of calls, default is 1?)
        which is the same as setting a breakpoint for
            cpuX.nrOfFunctionCalls = Z
        where Z is one more than it was before running the command.
        And each function call increases that variable. trace then
        temporarily turns on showFunctionTraceCall and runs until the
        breakpoint hits (or CTRL-C), and then removes the breakpoint
        automatically (even if CTRL-C was hit), and restores
        showFunctionTraceCall.
        Variants:
            trace on    turns on tracing for all CPUs (but doesn't run
                        anything)
            trace off   turns off tracing for all CPUs
            trace [n]   runs with trace enabled for n function calls,
                        n = 10 by default?
        showFunctionCallReturn should be false by default.
    [ ] -t command line option, for backward compatibility?

[ ] Stack dump (of the emulated machine)
    [ ] A method on CPU components?

[ ] Components (general):
    [ ] "help" about components could show a file:/// link to the
        installed documentation about that component or machine! (If it
        exists.)
    [ ] Limit where components can be added. Like "drop targets"? E.g. a
        machine can be added at the root, but a machine can not be added
        on a pcibus. Similarly, a pcibus can not be added at the root. It
        has to be in a machine of some kind. Think about this. Perhaps as
        simple as a "if parent class is not blah blah then disallow
        adding".
            A machine can be added into a dummy component.
            A dummy component can be added into a dummy component.
            A pcibus can be added into a mainbus (in a machine.)
            etc.
            bool IsItOkToAddItToThisProposedParentComponent(proposedParent) const;
    [ ] Exceptions/traps to CPUs, could perhaps be generalized to sending
        "messages" or "interrupts" to any device/component. How should
        this be done manually from the command interpreter?
            cpu0.break              cause a breakpoint exception?
                                    if the cpu supports it
            cpu0.exception [args]   where args is VERY dependent on the
                                    exception

[ ] ConsoleUI and/or CommandInterpreter: Make use of COLUMNS environment
    variable when printing e.g. tab completion tables.

[ ] Command interpreter:
    [/] State variable assignments:
        [ ] Expression evaluator, +-*/, string concatenation, type
            correctness (e.g. bool vs int), hex vs decimal, prefixes like
            M, K etc.
            (4M = 4*1048576 ...) (StateVariable::EvaluateExpression)
        [ ] Execution of _expressions_, not just variables. e.g.:
                cpu.pc          prints the pc value
                cpu.pc+4        should print the _expression_ pc+4
                cpu.pc=expr     assignment
    [ ] Minimal paths ala machine0.cpu0 would be cool if they worked.
        I.e. if there's machine0.mainbus0.cpu0 and machine1.mainbus0.cpu0,
        but no mainbus1, then machineX.cpu0 would be shorter, and still
        uniquely identify the cpu.
    [ ] Help on methods and method arguments? E.g. cpu.unassemble [addr]
    [ ] Right now, entering a command such as "c" says "unknown command".
        It should say "ambiguous command"!
    [/] Tab completion for everything:
        [ ] tab completion should use the Shortest Path, not the full
            path. E.g.: cpu + TAB should expand to cpu0 only in a default
            testm88k machine, NOT root.machine0.mainbus0.cpu0!
            This will most likely require a change in unit tests etc.
        [ ] syntax based completion? e.g.:
                help [cmd]      tab completes the first argument as a
                                command
                add component-name                  -- component name
                load [filename [component-path]]    -- filename etc.
            This will require a uniform way of describing arguments, and
            whether or not they should be optional. The tab completer
            must then parse the command line, including figuring out
            which arguments were optional, etc.
            Also, when such syntax is taken into account, the
            CommandInterpreter can check syntax _before_ running
            Command::Execute. That means that individual Commands do not
            have to do manual checking on entry that the number of
            arguments is correct etc.
            [ ] filename
            [ ] command name
            [ ] component path
            [ ] optional vs mandatory args...?
            [ ] scan all commands' args at startup, and have an assert()
                in place, so that unknown arg types are caught during
                development!
    [ ] Command aliases? e.g.
            d = cpu0.dump
            c = continue
            s = step
            u = cpu.unassemble
        But maybe the cpu in question may be changed with a "focus"
        command? Otherwise it would only work with 1 cpu.
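
For the "unknown command" vs "ambiguous command" item above, the
distinction comes down to counting prefix matches. A minimal Python sketch
(not the CommandInterpreter's actual C++ code; the function name and the
command list used in the test are hypothetical):

```python
def resolve_command(prefix, commands):
    """Classify a typed command prefix.

    Returns (status, matches) where status is "ok", "ambiguous" or
    "unknown". An exact match always wins, even if it is also a prefix
    of a longer command name.
    """
    if prefix in commands:
        return "ok", [prefix]
    matches = sorted(c for c in commands if c.startswith(prefix))
    if len(matches) == 1:
        return "ok", matches
    if matches:
        return "ambiguous", matches
    return "unknown", []
```

The list of matches returned in the ambiguous case is the same list a tab
completer would want to display, so the two features could share this
lookup.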
    [ ] recursive component.var
        Dumps a tree of the "var" variable _FROM_ the component, i.e.
        including all children. E.g.
            recursive mainbus0.memoryMappedAddr
        would dump
            mainbus0
            |-- ram0    memoryMappedAddr = 0
            |-- rom0    memoryMappedAddr = 0x1fc00000
            \-- com0    memoryMappedAddr = 0x190003f8
        or something.
    [ ] recursive component.var = value
        would set the "var" variable in the component including all sub
        components. Not all components may have the variable, so debug
        output should indicate which variables were set and which were
        not set.

[/] RAM component:
    [ ] Make the save/load of state more efficient. Right now, it's a hex
        dump! Yuck.
    [ ] methods for searching for values (strings, words, etc?)
    [ ] methods for bulk fill/copy [from other address/data busses?]
    [ ] "put" command.

[ ] Floating point helper.
    Make this more complete/accurate than the old one, i.e. support
    inf/nan, exceptions, signaling stuff, denormalized/normalized?
    [ ] non-IEEE modes (i.e. x86/vax/...)?
    [ ] Unit tests

[ ] Userland emulation
    [ ] Begin with e.g. FreeBSD/amd64, or FreeBSD/alpha, NetBSD/something,
        or Linux/something.
    [ ] Try to prefix "/emul/mips/" or similar to all filenames, and only
        if that fails, try the given filename. Read this setting from an
        environment variable, and only if there is none, fall back to
        hardcoded string.
    [ ] File descriptor (0,1,2) assumptions? Think about this.
    [ ] Dynamic linking! (libs from /emul/xxxx etc)
    [ ] Initial register/stack contents (environment, command line args).
    [ ] Return value (from main).
    [ ] mmap emulation layer
    [ ] errno emulation layer
    [ ] ioctl emulation layer for all devices :-[
    [ ] struct conversions for many syscalls
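
The filename prefixing described under Userland emulation above could look
roughly like this. It is a Python sketch of the lookup order only; the
environment variable name GXEMUL_EMUL_ROOT and the function name are made
up for the example, and "/emul/mips" is just the hardcoded fallback
mentioned in the item.

```python
import os


def resolve_guest_path(filename, default_root="/emul/mips",
                       env_var="GXEMUL_EMUL_ROOT"):
    """Map a guest filename to a host filename.

    The emulation root comes from the (hypothetical) environment variable
    if set, otherwise from the hardcoded default. The prefixed path is
    tried first; if it does not exist, the given filename is used as-is.
    """
    root = os.environ.get(env_var, default_root)
    candidate = os.path.join(root, filename.lstrip("/"))
    if os.path.exists(candidate):
        return candidate
    return filename
```

A guest open("/etc/hosts") would then hit /emul/mips/etc/hosts when that
file exists, and the host's own /etc/hosts otherwise.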

Debugger:
    Extend the
        put [b|h|w|d|q] addr, data    (modify emulated memory contents)
    command with an "s" (string) mode, where data is a string. Also "z"
    which puts a nul-terminated string in memory. It should put the
    string there one byte at a time.
        put s 0x80008000, "apa"   or   put z 0x80008000, "apa"
    Extend the debugger with a "find" command as well, similar to put
    but with a range?
        find [b|h|w|d|q|s|z] startaddr, endaddr, data

Disk image end-of-file bug:
    Triggered when using e.g. a tar.gz file as a disk image in
    NetBSD/pmax. NetBSD reads beyond the end of the file. Must be fixed!
    (Once fixed, documentation can be simplified for some guest OS
    installations.)

GXsatellite -- Gavare's eXperimental Stand-Alone Test Environment for
    Low-Level Interactive Testing of Emulators.
    Share or reuse code with YCX5?

Floating point:
    More tests.

Documentation:
    Add the Android ARM machine modes to the documentation,
    machine_androidarm, color red, once the Linux device tree has been at
    least implemented enough to see some boot messages or so.
    Add the Alpha mode too even though probably almost nothing works at
    all. Also red. Try to include instructions for NetBSD, OpenBSD and
    FreeBSD although the latter has been discontinued.
    Add FreeBSD/mips to testmips section?
    https://wiki.freebsd.org/FreeBSD/MipsEmulation If it works.

Disk image options:
    'R' (uppercase): don't allow writes to the disk image file, but let
    the guest OS think that the device is writable, by using a temporary
    overlay (created automatically in /tmp) which is removed when the
    emulator exits. If this is implemented, update the misc
    documentation, and the man page.

X11:
    -y n: Command line option for taking framebuffer screen shots n
    times per second? Look for "ppm image dumps" in dev_fb.cc.html.
    If implemented, add to man page.

MIPS:
    o) ALU operations (typically addiu, or, etc) can be hardcoded/inlined
       for common register pairs or even triples. Classic
       generate_blahblah and an array.
       Reduces memory accesses for common ALU instructions.
    o) Profile and make newer instruction combinations for up-to-date
       versions of NetBSD (full install, building GXemul inside NetBSD,
       etc). Make sure to go through all common function cores such as
       memset, memcpy, memmove, strlen, strcmp, memcmp, and idle
       functions, for both R3000 and R4400 etc.
    o) Floating point exception correctness. Compare to real hardware!
    o) Nicer MIPS status bits in register dumps.
    o) Some more work on opcodes.
        x) MIPS64 revision 2.
            o) Find out which actual CPUs implement the rev2 ISA!
            o) DINS, DINSM, DINSU etc
            o) DROTR32 and similar MIPS64 rev 2 instructions, which have
               a rotation bit which differs from previous ISAs.
    o) NetBSD has a patch for NOFPU flag for certain CPUs. Investigate
       and apply if correct.
    o) Dyntrans: Count register updates are probably not 100% correct
       yet.
    o) Coprocessor 1x (i.e. 3) should cause cp1 exceptions, not 3?
       (See http://lists.gnu.org/archive/html/qemu-devel/2007-05/msg00005.html)
    o) R4000 and others:
        x) watchhi/watchlo exceptions, and other exception handling
           details
    o) MIPS 5K* have 42 physical address bits, not 40/44?
    o) R10000 and others: (R12000, R14000 ?)
        x) The code before the line
               /* reg[COP0_PAGEMASK] = cpu->cd.mips.coproc[0]->tlbs[0].mask & PAGEMASK_MASK; */
           in cpu_mips.c is not correct for R10000 according to Lemote's
           Godson patches for GXemul.
           TODO: Go through all register definitions according to
           http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi/hdwr/bks/SGI_Developer/books/R10K_UM/sgi_html/t5.Ver.2.0.book_263.html#HEADING334
           and make sure everything works with R10000.
           Then test with OpenBSD/sgi?
        x) Entry LO mask (as above).
        x) memory space, exceptions, ...
        x) use cop0 framemask for tlb lookups [maybe already working
           correctly?]
           (http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi/hdwr/bks/SGI_Developer/books/R10K_UM/sgi_html/t5.Ver.2.0.book_284.html)

SuperH:
    x) Auto-generation of loads/stores!
       This should get rid of at least the endianness check in each
       load/store.
    x) Experiment with whether or not correct ITLB emulation is actually
       needed. (20070522: I'm turning it off today.)
    x) SH4 interrupt controller:
        x) MASKING should be possible!
    x) SH4 UBC (0xff200000)
    x) SH4 DMA (0xffa00000): non-dreamcast-PVR modes
    x) Store queues can copy 32 bytes at a time, there's no need to copy
       individual 32-bit words. (Performance improvement.)
       (Except that e.g. the Dreamcast TA currently receives using
       32-bit words... hm)
    x) SH4 BSC (Bus State Controller)
    x) Instruction tracing should include symbols for branch targets,
       and so on, to make the output more human readable.
    x) SH3-specific devices: Pretty much everything!
    x) NetBSD/evbsh3, hpcsh! Linux?
    x) Floating point speed!
    x) Floating point exception correctness.
       E.g. fipr and the other "geometric" instructions should throw an
       exception if the "precision" bit is wrong (since the geometric
       instructions lose precision). See the manual about this!
    x) Exceptions for unaligned load/stores. OpenBSD/landisk uses this
       mechanism for its reboot code (machine_reset).

Dreamcast:
    x) Try to make the ROM from my real Dreamcast boot correctly.
    x) PVR: Lots of stuff. See dev_pvr.c.
        x) DMA to non-0x10000000
        x) Textures...
        x) Make it fast! if possible
    x) G2 DMA
    x) SPU: Sound emulation (ARM cpu).
    x) LAN adapter (dev_mb8696x.c). NetBSD root-on-nfs.
    x) Better GDROM support
    x) Modem
    x) PCI bridge/bus?
    x) Maple bus:
        x) Correct controller input
        x) Mouse input
    x) Software emulation of BIOS calls:
        x) GD-ROM emulation: Use the GDROM device.
        x) Use the VGA font as a fake ROM font. (Better than nothing.)
    x) Make as many as possible of the KOS examples run!
    x) More homebrew demos/games.
    x) VME processor emulation? "(Sanyo LC8670 "Potato")" according to
       Wikipedia, LC86K87 according to Comstedt's page. See
       http://www.maushammer.com/vmu.html for a good description of the
       differences between LC86104C and the one used in the VME.

POWER/PowerPC:
    x)  Fix DECR timer speed, so it matches the host.
    x)  NetBSD/prep 3.x triggers a possible bug in the emulator:
            <0x26c550(&ata_xfer_pool,2,0,8,..)>
            <0x35c71c(0x3f27000,0,52,8,..)>
            <__wdccommand_start(0xd005e4c8,0x3f27000,0,13,..)>
            [ wdc: write to SDH: 0xb0 (sectorsize 2, lba=1, drive 1, head 0) ]
            <0x198120(0xd005e4c8,72,64,0xbb8,..)>
        Note:
    x)  PPC optimizations; instr combs
    x)  64-bit stuff: either Linux on G5, or perhaps some hobbyist
        version of AIX? (if there exists such a thing)
    x)  macppc: adb controller; keyboard (for framebuffer mode)
    x)  make OpenBSD/macppc work (PCI controller stuff)
    x)  Floating point exception correctness.
    x)  Alignment exceptions.

PReP:
    x)  Clock time! ("Bad battery blah blah")

Algor:
    o)  Other models than the P5064?
    o)  PCI interrupts... needed for stuff like the tlp NIC?

Malta:
    o)  Try FreeBSD/malta:
        https://wiki.freebsd.org/FreeBSD/MipsEmulation
    o)  Malta can perhaps have up to 2 GB RAM? Try same hack as for
        the SGI O2.
    o)  Reconfigurable PCI memory space would be nice (just like
        SGI O2).

SGI O2:
    differences between real hardware and NetBSD's header files?
        RED/GREEN LED: 1 on real hardware turns off, not on!
        nr of bits for the tile ptr 20008 vs 30008 in crmfb?
        CRMFB_CMAP_OVL = 0x00051400. should be 0x54400?
            for "color map 17" for overlays?
        crime time bit mask?
        NetBSD's crmfb.c crmfb_set_palette says rgb in reverse maybe?:
            val = (r << 8) | (g << 16) | (b << 24)
    clocks: Both NetBSD and OpenBSD drift over time.
    bus_pci / O2's pci: reconfigurable memory space redirect?
    ahc scsi controller! this will be very time consuming.
        http://mail-index.netbsd.org/port-sgimips/2015/09/24/msg000711.html
        has some "pcictl pci0 dump -d 1" output which may be worth
        comparing against.
    ds2502_get_eaddr: ds2502_read_rom failed! PROM complains during
        bootup. Needed to get further with bootp() diskless booting.
        Onewire protocol that depends on microsecond timing?
    netbsd starts in "enter pathname of shell" mode; should start
        netbooting in a more automated fashion?
    netbsd randomly quits 'startx' without showing anything?
        sometimes also randomly places windows differently.
    ps2 8242: openbsd's X11 doesn't detect keyboard/mouse?
    PROM in GXemul says "SGI-CRM, Rev B", but my real O2 says Rev C?
    graphics:
        allow other resolutions than 1280x1024? netbsd seems to
            support it (?).
        netbsd maybe still triggers some acceleration bugs when
            moving X11 windows?
        horrible_getputpixel: GBE_CMODE_RGB10 etc from openbsd's
            header file?
        performance?
        actually emulate "pipeline" and detect pipeline overruns,
            i.e. require guest OSes to wait? but then, when to
            execute commands in the pipeline? (low-prio)
            get -w 0xb5004000:
                            LEVEL  RD_PTR  WR_PTR  BUF_START
                0x1e029a68  00     29      29      28
                0x1e02baea  00     2b      2b      2a
                0x1e03befa  00     3b      3b      3a
                0x1e029a68  00     29      29      28
                0x1e00a289  00     0a      0a      09
                0x1e03df7c  00     3d      3d      3c
                0x1e00003f  00     00      00      3f
                0x1e02fbee  00     2f      2f      2e
                0x1e02cb2b  00     2c      2c      2b
                0x1e00b2ca  00     0b      0b      0a
                0x1e008207  00     08      08      07
                0x1e = all idle
        i2c vga data, for NetBSD etc.
        3D graphics? i.e. depth buffers, triangles, etc.
    audio? would be the first audio related thing in gxemul...
    VICE?
    video? probably too complicated.
    IRIX? Networking? SCSI? Currently panics due to root vfs not
        available.

HPCmips:
    x)  Clock? Is it running at correct speed?
    x)  Mouse/pad support! :)
    x)  A NIC? (As a PCMCIA device?)
    x)  Investigate why not all MobilePro models work any longer with
        recent NetBSD versions.

ARM:
    o)  Big endian does not really work: loads and stores are little
        endian!
    o)  More THUMB disassembly?
    o)  More THUMB execution.
    o)  0xf "condition" execution: see
        http://engold.ui.ac.ir/~nikmehr/Appendix_B2.pdf
    o)  Android devices.
    o)  See netwinder_reset() in NetBSD; the current
        "an internal error occured" message after reboot/halt is too
        ugly.
    o)  Generic ARM "wait"-like instruction?
    o)  try to get netbsd/evbarm 3.x or 4.x running (iq80321)
    o)  netbsd/iyonix?
        the i80321 device currently tells netbsd that RAM starts at
        0xa0000000, but that is perhaps not correct for the iyonix.
    o)  make the xscale counter registers (ccnt) work
    o)  make the ata controller usable for FreeBSD!
    o)  Debian/cats crashes because of unimplemented coproc stuff.
        fix this?

Test machines:
    o)  dev_fb 2D acceleration functions, to make dev_fb useful for
        simple graphical OSes:
        x)  block fill and copy
        x)  draw characters (from the built-in font)?
    o)  dev_fb input device? mouse pointer coordinates and buttons
        (allow changes in these to cause interrupts as well?)
    o)  Redefine the halt() function so that it stops "sometime
        soon", i.e. usage in demo code should be:
            for (;;) { halt(); }
    o)  More demos/examples.

Dyntrans:
    x)  Try to make the vaddr fix O(1) again instead of O(n), if it
        is possible to see if a non-canonical address has been
        inserted into the caches. In other words, keep track of
        whether a full O(n) search is really needed or not.
    x)  For 32-bit emulation modes, that have emulated TLBs: tlbindex
        arrays of mapped pages? Things to think about:
        x)  Only 32-bit mode! (64-bit => too much code)
        x)  One array for global pages, and one array _PER ASID_,
            for those archs that support that. On M88K, there
            should be one array for userspace, and one for
            supervisor, etc.
        x)  Larger-than-4K-pages must fill several bits in the
            array.
        x)  No TLB search will be necessary.
        x)  Total host space used, for 4 KB pages: 1 MB per table,
            i.e. 65 MB for 32-bit MIPS, 2 MB for M88K, if one byte
            is used as the tlb index.
        x)  (The index is actually +1, so that 0 means no hit.)
    x)  "Merge" the cur_physpage and cur_ic_page variables/pointers to
        one? I.e. change cur_ic_page to cur_physpage.ic_page or
        something.
    x)  Instruction combination collisions? How to avoid easily...
    x)  superh -- no hostpage for e.g. 0x8c000000. devices as ram!
    x)  Think about how to do both SHmedia and SHcompact in a
        reasonable way! (Or AMD64 long/protected/real, for that
        matter.)
    x)  68K emulation; think about how to do variable instruction
        lengths across page boundaries.
    x)  Dyntrans with valgrind-inspired memory checker. (In memory_rw,
        it would be reasonably simple to add; in each individual fast
        load/store routine = a lot more work, and it would become
        kludgy very fast.)
        o)  Mark every address with bits which tell whether or not
            the address has been written to.
        o)  What should happen when programs are loaded? Text/data,
            bss (zero filled). But stack space and heap are
            uninitialized.
        o)  Uninitialized local variables:
            A load from a place on the stack which has not
            previously been stored to => warning. Increasing the
            stack pointer using any available means should reset
            the memory to uninitialized.
        o)  If calls to malloc() and free() can be intercepted:
            o)  Access to a memory area after free() => warning.
            o)  Memory returned by malloc() is marked as
                not-initialized.
            o)  Non-passive, but good to have: Change the argument
                given to malloc, to return a slightly larger
                memory area, i.e. margin_before + size +
                margin_after, and return the pointer +
                margin_before. Any access to the margin_before or
                _after space results in warnings. (free() must be
                modified to free the actually allocated address.)
    x)  Dyntrans with SMP... lots of work to be done here.
    x)  Dyntrans with cache emulation... lots of work here as well.
    x)  Remove the concept of base RAM completely; it would be more
        generic to allow RAM devices to be used "anywhere".
    o)  dev_mp doesn't work well with dyntrans yet
    o)  In general, IPIs, CAS, LL/SC etc must be made to work with
        dyntrans
    x)  Redesign/rethink the delay slot mechanism used for e.g. MIPS,
        so that it caches a translation (that is, an instruction word
        and the instr_call it was translated to the last time), so
        that it doesn't need to do slow to_be_translated for each end
        of page?
    x)  Program Counter statistics: Per machine? What about SMP?
        All data to the same file?
        It should be possible to use a debugger command to
        enable/disable statistics gathering.
        Configuration file option!
    x)  Breakpoints:
        o)  Physical vs virtual addresses!
        o)  32-bit vs 64-bit sign extension for MIPS, and others?
    x)  INVALIDATION should cause translations in _all_ cpus to be
        invalidated, e.g. on a write to a write-protected page
        (containing code)
    x)  16-bit encodings? (MIPS16, ARM Thumb, etc)
    x)  Lots of other stuff: see src/cpus/README_DYNTRANS
    x)  Native code generation backends... think _carefully_ about
        this. (Not a priority right now.)

Better CD Image file support:
    x)  Support CD formats that contain more than 1 track, e.g.
        CDI files (?). These can then contain a mixture of e.g. sound
        and data tracks, and booting from an ISO filesystem path
        would boot from [by default] the first data track.
        (This would make sense for e.g. Dreamcast CD images, or
        possibly other live-CD formats.)

Networking:
    x)  Redesign of the networking subsystem, at least the NAT
        translation part. The current way of allowing raw ethernet
        frames to be transferred to/from the emulator via UDP should
        probably be extended to allow the frames to be transmitted
        other ways as well.
    x)  Also adding support for connecting ttys (either to xterms, or
        to pipes/sockets etc, or even to PPP->NAT or SLIP->NAT :-).
    x)  Documentation updates (!) are very important, making it
        easier to use the (already existing) network emulation
        features.
    x)  Fix performance problems caused by only allowing a single TCP
        packet to be unacked.
    x)  Don't hardcode offsets into packets!
    x)  Test with lower than 100 max tcp/udp connections, to make
        sure that reuse works!
    x)  Make OpenBSD work better as a guest OS!
    x)  DHCP? Debian doesn't actually send DHCP packets, even though
        it claims to? So it is hard to test.
    x)  Multiple networks per emulation, and let different NICs in
        machines connect to different networks.
    x)  Support VDE (vde.sf.net)?
        Easiest/cleanest (before a redesign of the network framework
        has been done) is probably to connect it using the current
        (udp) solution.
    x)  Allow SLIP connections, possibly PPP, in addition to
        ethernet?

PCI:
    o)  Big-endian Malta access?
    x)  Pretty much everything related to runtime configuration,
        device slots, interrupts, etc must be redesigned/cleaned up.
        The current code is very hardcoded and ugly.
        o)  Allow cards to be added/removed during runtime more
            easily.
        o)  Allow cards to be enabled/disabled (i/o ports, etc,
            like NetBSD needs for disk controller detection).
        o)  Allow devices to be moved in memory during runtime.
        o)  Interrupts per PCI slot, etc. (A-D).
        o)  PCI interrupt controller logic... very hard to get
            right, because these differ a lot from one machine to
            the next.
    x)  last write was ffffffff ==> fix this, it should be used
        together with a mask to get the correct bits. also, not ALL
        bits are size bits! (lowest 4 vs lowest 2?)
    x)  add support for address fixups
    x)  generalize the interrupt routing stuff (lines etc)

Clocks and timers:
    x)  Fix the PowerPC DECR interrupt speed! (MacPPC and PReP speed,
        etc.)
    x)  DON'T HARDCODE 100 HZ IN cpu_mips_coproc.c!
    x)  NetWinder timeofday is incorrect! Huh? grep -R for
        ta_rtc_read in NetBSD sources; it doesn't seem to be
        initialized _AT ALL_?!
    x)  Cobalt TOD is incorrect!
    x)  Go through all other machines, one by one, and fix them.

ASC SCSI controller:
    x)  NetBSD/arc 2.0 uses the ASC controller in a way which GXemul
        cannot yet handle. (NetBSD 1.6.2 works ok.) (Possibly a
        problem in NetBSD itself,
        http://mail-index.netbsd.org/source-changes/2005/11/06/0024.html
        suggests that.) NetBSD 4.x seems to work? :)

Better framebuffer and X-windows functionality:
    o)  Do a complete rewrite of the framebuffer/console stuff, so
        that:
        1)  It does not rely on X11 specifically.
        2)  It is possible to interact with emulated framebuffers
            and consoles "remotely", e.g. via a web page which
            controls multiple virtualized machines.
        3)  It is possible to run on (hypothetical) non-X11
            graphics systems.
    o)  Generalize the update_x1y1x2y2 stuff to an extend-region()
        function...
    o)  -Yx sometimes causes crashes.
    o)  Simple device access to framebuffer_blockcopyfill() etc,
        and text output (using the built-in fonts), for dev_fb.
    o)  CLEAN UP the ugly event code
    o)  Mouse clicks can be "missed" in the current system; this is
        not good. They should be put on a stack of some kind.
    o)  More 2D and 3D framebuffer acceleration.
    o)  Non-resizable windows? Or choose scaledown depending
        on size (and center the image, with a black border).
    o)  Different scaledown on different windows?
    o)  Non-integral scale-up? (E.g. 640x480 -> 1024x768)
    o)  Switch scaledown during runtime? (Ala CTRL-ALT-plus/minus)
    o)  Bug reported by Elijah Rutschman on MacOS with weird
        keys (F5 = cursor down?).
    o)  Keyboard and mouse events:
        x)  Do this for more machines than just DECstation
        x)  more X11 cursor keycodes
        x)  Keys like CTRL, ALT, SHIFT do not get through
            by themselves (these are necessary for example
            to change the font of an xterm in X in the
            emulator)
    o)  Generalize the framebuffer stuff by moving _ALL_ X11
        specific code to a separate module.
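
For the "missed" mouse clicks item above, one possible shape for the
buffering is sketched below. All names (click_event, click_queue and the
three functions) and the capacity of 64 are hypothetical and not part of
the current GXemul code. The sketch uses a FIFO ring buffer rather than a
literal stack, on the assumption that the emulated device should see
presses and releases in the order they happened.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CLICK_QUEUE_SIZE 64    /* assumed capacity */

struct click_event {
    int x, y;       /* pointer position when the event occurred */
    int button;     /* button number */
    bool pressed;   /* true = press, false = release */
};

struct click_queue {
    struct click_event ev[CLICK_QUEUE_SIZE];
    size_t head;    /* index of the oldest queued event */
    size_t count;   /* number of queued events */
};

static void click_queue_init(struct click_queue *q)
{
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
}

/* Called from the host event handler. Returns false if the queue is full. */
static bool click_queue_push(struct click_queue *q, struct click_event e)
{
    assert(q != NULL);
    if (q->count == CLICK_QUEUE_SIZE)
        return false;
    q->ev[(q->head + q->count) % CLICK_QUEUE_SIZE] = e;
    q->count++;
    return true;
}

/* Called by the emulated device. Returns false if nothing is queued. */
static bool click_queue_pop(struct click_queue *q, struct click_event *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0)
        return false;
    *out = q->ev[q->head];
    q->head = (q->head + 1) % CLICK_QUEUE_SIZE;
    q->count--;
    return true;
}
```

The host side would push one event per press or release, and the emulated
mouse or keyboard device would pop events at its own pace, so a quick
press-and-release between two device polls is no longer lost.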