Andrew Bardsley [Tue, 2 Dec 2014 11:08:11 +0000 (06:08 -0500)]
arm: Fix TLB ignoring faults when table walking
This patch fixes a case where the Minor CPU can deadlock due to the lack
of a response to a TLB request because of a bug in fault handling in the ARM
table walker.
TableWalker::processWalkWrapper is the scheduler-called wrapper which
handles deferred walks which calls to TableWalker::wait cannot immediately
process. The handling of faults generated by processWalk{AArch64,LPAE,}
calls in those two functions is different. processWalkWrapper ignores
fault returns from processWalk... which can lead to ::finish not being
called on a translation.
This fix provides fault handling in processWalkWrapper similar to that
found in the leaf functions which call BaseTLB::Translation::finish.
Andrew Bardsley [Tue, 2 Dec 2014 11:08:09 +0000 (06:08 -0500)]
config: Fix to SystemC example's event handling
This patch fixes checkpoint restore in the SystemC hosting example by handling
early PollEvent events correctly before any EventQueue events are posted.
The SystemC event queue handler (SCEventQueue) reports an error if the event
loop is entered with no Events posted. It is possible for this to happen
after instantiate due to PollEvent events. This patch separates out
`external' events into a different handler in sc_module.cc to prevent the
error from occurring.
This fix also improves the event handling of asynchronous events by:
1) Making asynchronous events 'catch up' gem5 time to SystemC
time to avoid the appearance that events have been lost
while servicing an asynchronous event that schedules an
event loop exit event
2) Adding an in_simulate data member to Module to allow the event
loop to check whether events should be processed or deferred
until the next time Module::simulate is entered
3) Cancelling pending events around the entry/exit of the event loop
in Module::simulate
4) Moving the state initialisation of the example entirely into
run to correct a problem with early events in checkpoint
restore.
It is still possible to schedule asynchronous events (and take PollQueue
actions) while simulate is not running. This behaviour may still cause
some problems.
Andrew Bardsley [Tue, 2 Dec 2014 11:08:06 +0000 (06:08 -0500)]
config: SystemC Gem5Control top level additions
This patch cleans up a few style issues and adds a few capabilities to the
SystemC top level 'Gem5Control/Gem5System' mechanism. These include:
1) A space to store/retrieve a version string for a model
2) A mechanism for registering functions to be called at the end of
elaboration to perform simulation setup tasks in SystemC
3) Adding setGDBRemotePort to the Gem5Control
4) Changing the sc_set_time_resolution behaviour to instead check that
the SystemC time resolution is already acceptable
Andreas Hansson [Tue, 2 Dec 2014 11:08:05 +0000 (06:08 -0500)]
stats: Bump stats for o3 LSQ changes
Marco Elver [Tue, 2 Dec 2014 11:08:03 +0000 (06:08 -0500)]
cpu, o3: Ignored invalidate causing same-address load reordering
In case the memory subsystem sends a combined response with invalidate
(e.g. ReadRespWithInvalidate), we cannot ignore the invalidate part
of the response.
If we were to ignore the invalidate part, under certain circumstances
this effectively leads to reordering of loads to the same address
which is not permitted under any memory consistency model implemented
in gem5.
Consider the case where a later load's address is computed before an
earlier load in program order, and is therefore sent to the memory
subsystem first. At some point the earlier load's address is computed
and in doing so correctly marks the later load as a
possibleLoadViolation. In the meantime some other node writes and
sends invalidations to all other nodes. The invalidation races with
the later load's ReadResp, and arrives before ReadResp and is
deferred. Upon receipt of the ReadResp, the response is changed to
ReadRespWithInvalidate, and sent to the CPU. If we ignore the
invalidate part of the packet, we let the later load read the old
value of the address. Eventually the earlier load's ReadResp arrives,
but with new data. As there was no invalidate snoop (sunk into the
ReadRespWithInvalidate), and if we did not process the invalidate of
the ReadRespWithInvalidate, we obtain a load reordering.
A similar scenario can be constructed where the earlier load's address
is computed after ReadRespWithInvalidate arrives for the younger
load. In this case hitExternalSnoop needs to be set to true on the
ReadRespWithInvalidate, so that upon knowing the address of the
earlier load, checkViolations will cause the later load to be
squashed.
Finally we must account for the case where both loads are sent to the
memory subsystem (reordered), a snoop invalidate arrives and correctly
sets the later load's fault to ReExec. However, before the CPU
processes the fault, the later load's ReadResp arrives and the
writeback discards the outstanding fault. We must add a check to
ensure that we do not skip any unprocessed faults.
Andreas Hansson [Tue, 2 Dec 2014 11:08:00 +0000 (06:08 -0500)]
cpu: Always mask the snoop address when performing lock check
Ensure the snoop address check is always using a cache-block aligned
address. This patch updates Alpha and Mips to match the other ISAs.
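A minimal sketch of the masking described above. The helper names
(blockAlign, snoopHitsLock) and the 64-byte block size used in the usage note
are illustrative assumptions, not the functions touched by the patch.

```cpp
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t;

// Round an address down to the start of its cache block.
// blockSize must be a non-zero power of two.
inline Addr blockAlign(Addr addr, Addr blockSize)
{
    assert(blockSize != 0 && (blockSize & (blockSize - 1)) == 0);
    return addr & ~(blockSize - 1);
}

// Hypothetical lock check: compare the locked address and the snooped
// address at cache-block granularity rather than byte granularity.
inline bool snoopHitsLock(Addr lockedAddr, Addr snoopAddr, Addr blockSize)
{
    return blockAlign(lockedAddr, blockSize) ==
           blockAlign(snoopAddr, blockSize);
}
```

With a 64-byte block, a snoop to 0x123f matches a lock held on 0x1200, while
a snoop to 0x1240 falls in the next block and does not.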
Stephan Diestelhorst [Tue, 2 Dec 2014 11:07:58 +0000 (06:07 -0500)]
cpu: Move packet deallocation to recvTimingResp in the O3 CPU
Move the packet deallocations in the O3 CPU so that the completeDataAccess
deals only with the LSQ specific parts and the generic recvTimingResp frees the
packet in all other cases.
Andreas Hansson [Tue, 2 Dec 2014 11:07:56 +0000 (06:07 -0500)]
mem: Relax packet src/dest check and shift onus to crossbar
This patch allows objects to get the src/dest of a packet even if it
is not set to a valid port id. This simplifies (ab)using the bridge as
a buffer and latency adapter in situations where the neighbouring
MemObjects are not crossbars.
The checks that were done in the packet are now shifted to the
crossbar where the fields are used to index into the port
arrays. Thus, the carrier of the information is not burdened with
checking, and the crossbar can check not only that the destination is
set, but also that the port index is within limits.
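The division of responsibility can be sketched roughly as follows. ToyPacket
and ToyCrossbar are hypothetical stand-ins written for this note, and
validDest is an assumed name; the real classes are considerably larger.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using PortID = std::int16_t;
constexpr PortID InvalidPortID = -1;

// Toy packet: the getter hands back the raw field without checking it.
struct ToyPacket
{
    PortID dest = InvalidPortID;
    PortID getDest() const { return dest; }
};

// Toy crossbar: the only place where the field is used as an index,
// so the only place that needs to validate it.
class ToyCrossbar
{
    std::vector<int> ports;  // stand-in for the real port objects

  public:
    explicit ToyCrossbar(std::size_t numPorts) : ports(numPorts) {}

    // True if the destination is set and indexes an existing port.
    bool validDest(const ToyPacket &pkt) const
    {
        const PortID dest = pkt.getDest();
        return dest != InvalidPortID && dest >= 0 &&
               static_cast<std::size_t>(dest) < ports.size();
    }
};
```

A bridge-like object can read getDest() on a packet whose destination was
never set, while the crossbar rejects both an unset and an out-of-range value
before indexing.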
Andreas Hansson [Tue, 2 Dec 2014 11:07:54 +0000 (06:07 -0500)]
mem: Clean up packet data allocation
This patch attempts to make the rules for data allocation in the
packet explicit, understandable, and easy to verify. The constructor
that copies a packet is extended with an additional flag "alloc_data"
to enable the call site to explicitly say whether the newly created
packet is short-lived (a zero-time snoop), or has an unknown life-time
and therefore should allocate its own data (or copy a static pointer
in the case of static data).
The tricky case is the static data. In essence this is a
copy-avoidance scheme where the original source of the request (DMA,
CPU etc) does not ask the memory system to return data as part of the
packet, but instead provides a pointer, and then the memory system
carries this pointer around, and copies the appropriate data to the
location itself. Thus any derived packet actually never copies any
data. As the original source does not copy any data from the response
packet when arriving back at the source, we must maintain the copy of
the original pointer to not break the system. We might want to revisit
this one day and pay the price for a few extra memcpy invocations.
All in all this patch should make it easier to grok what is going on
in the memory system and how data is actually copied (or not).
Andreas Hansson [Tue, 2 Dec 2014 11:07:52 +0000 (06:07 -0500)]
mem: Cleanup Packet::checkFunctional and hasData usage
This patch cleans up the use of hasData and checkFunctional in the
packet. The hasData function is unfortunately suggesting that it
checks if the packet has a valid data pointer, when it does in fact
only check if the specific packet type is specified to have a data
payload. The confusion led to a bug in checkFunctional. The latter
function is also tidied up to avoid name overloading.
Andreas Hansson [Tue, 2 Dec 2014 11:07:50 +0000 (06:07 -0500)]
mem: Make the requests carried by packets const
This adds a basic level of sanity checking to the packet by ensuring
that a request is not modified once the packet is created. The only
issue that had to be worked around is the relaying of
software-prefetches in the cache. The specific situation is now solved
by first copying the request, and then creating a new packet
accordingly.
Andreas Hansson [Tue, 2 Dec 2014 11:07:48 +0000 (06:07 -0500)]
mem: Make Request getters const
This patch tidies up the Request class, making all getters const. The
odd one out is incAccessDepth which is called by the memory system as
packets carry the request around. This is also const to enable the
packet to hold on to a const Request.
Andreas Hansson [Tue, 2 Dec 2014 11:07:46 +0000 (06:07 -0500)]
mem: Add checks and explanation for assertMemInhibit usage
Andreas Hansson [Tue, 2 Dec 2014 11:07:43 +0000 (06:07 -0500)]
mem: Assume all dynamic packet data is array allocated
This patch simplifies how we deal with dynamically allocated data in
the packet, always assuming that it is array allocated, and hence
should be array deallocated (delete[] as opposed to delete). The only
uses of dataDynamic were in the Ruby testers.
The ARRAY_DATA flag in the packet is removed accordingly. No
defragmentation of the flags is done at this point, leaving a gap in
the bit masks.
As the last part of the patch, it renames dataDynamicArray to dataDynamic.
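To illustrate the ownership rule, here is a small sketch. ToyPacket is a
hypothetical class written for this note; only the names dataDynamic and
dataStatic are borrowed from the text, and the bodies are assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a packet that may or may not own its data buffer.
class ToyPacket
{
    std::uint8_t *data = nullptr;
    bool owned = false;

  public:
    ToyPacket() = default;
    ToyPacket(const ToyPacket &) = delete;
    ToyPacket &operator=(const ToyPacket &) = delete;

    // Take ownership of a buffer that was allocated with new[].
    void dataDynamic(std::uint8_t *p)
    {
        assert(data == nullptr);
        data = p;
        owned = true;
    }

    // Point at caller-owned storage; it is never freed here.
    void dataStatic(std::uint8_t *p)
    {
        assert(data == nullptr);
        data = p;
        owned = false;
    }

    bool ownsData() const { return owned; }

    ~ToyPacket()
    {
        if (owned)
            delete[] data;  // array form, matching the new[] allocation
    }
};
```

Because every dynamic buffer is assumed to come from new[], a single
ownership flag is enough and no separate array flag is needed.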
Andreas Hansson [Tue, 2 Dec 2014 11:07:41 +0000 (06:07 -0500)]
mem: Remove redundant Packet::allocate calls
This patch cleans up the packet memory allocation confusion. The data
is always allocated at the requesting side, when a packet is created
(or copied), and there is never a need for any device to allocate any
space if it is merely responding to a packet. This behaviour is in line
with how SystemC and TLM work as well, thus increasing
interoperability, and matching established conventions.
The redundant calls to Packet::allocate are removed, and the checks in
the function are tightened up to make sure data is only ever allocated
once. There are still some oddities in the packet copy constructor
where we copy the data pointer if it is static (without ownership),
and allocate new space if the data is dynamic (with ownership). The
latter is being worked on further in a follow-on patch.
Andreas Hansson [Tue, 2 Dec 2014 11:07:38 +0000 (06:07 -0500)]
mem: Use const pointers for port proxy write functions
This patch changes the various write functions in the port proxies
to use const pointers for all sources (similar to how memcpy works).
The one unfortunate aspect is the need for a const_cast in the packet,
to avoid having to juggle a const and a non-const data pointer. This
design decision can always be re-evaluated at a later stage.
Andreas Hansson [Tue, 2 Dec 2014 11:07:36 +0000 (06:07 -0500)]
mem: Add const getters for write packet data
This patch takes a first step in tightening up how we use the data
pointer in write packets. A const getter is added for the pointer
itself (getConstPtr), and a number of member functions are also made
const accordingly. In a range of places throughout the memory system
the new member is used.
The patch also removes the unused isReadWrite function.
Andreas Hansson [Tue, 2 Dec 2014 11:07:34 +0000 (06:07 -0500)]
mem: Remove null-check bypassing in Packet::getPtr
This patch removes the parameter that enables bypassing the null check
in the Packet::getPtr method. A number of call sites assume the value
to be non-null.
The one odd case is the RubyTester, which issues zero-sized
prefetches(!), and despite being reads they had no valid data
pointer. This is now fixed, but the size oddity remains (unless anyone
objects or has any good suggestions).
Finally, in the Ruby Sequencer, appropriate checks are made for flush
packets as they have no valid data pointer.
Omar Naji [Tue, 2 Dec 2014 11:07:32 +0000 (06:07 -0500)]
mem: Add a GDDR5 DRAM config
This patch adds a first cut GDDR5 config to accommodate the users
combining gem5 and GPUSim. The config is based on a SK Hynix
datasheet, and the Nvidia GTX580 specification. Someone from the
GPUSim user-camp should tweak the default page-policy and static
frontend and backend latencies.
Andreas Hansson [Mon, 24 Nov 2014 14:03:39 +0000 (09:03 -0500)]
stats: Bump stats after static analysis fixes
Fixing up the uninitialised values changes two of the x86 Linux boot
regressions slightly.
Andreas Hansson [Mon, 24 Nov 2014 14:03:38 +0000 (09:03 -0500)]
misc: Another round of static analysis fixups
Mostly addressing uninitialised members.
Alexandru Dutu [Mon, 24 Nov 2014 02:01:09 +0000 (18:01 -0800)]
mem: Page Table map api modification
This patch adds uncacheable/cacheable and read-only/read-write attributes to
the map method of PageTableBase. It also modifies the constructor of TlbEntry
structs for all architectures to consider the new attributes.
Alexandru Dutu [Mon, 24 Nov 2014 02:01:09 +0000 (18:01 -0800)]
mem: Multi Level Page Table bug fix
The multi level page table was giving false positives for already mapped
translations. This patch fixes the bogus behavior.
Alexandru Dutu [Mon, 24 Nov 2014 02:01:09 +0000 (18:01 -0800)]
mem: Page Table long lines
Trimmed down all the lines greater than 78 characters.
Alexandru Dutu [Mon, 24 Nov 2014 02:01:08 +0000 (18:01 -0800)]
config, kvm: Enabling KvmCPU in SE mode
This patch modifies se.py such that it can now use kvm cpu model.
Alexandru Dutu [Mon, 24 Nov 2014 02:01:08 +0000 (18:01 -0800)]
x86: Segment initialization to support KvmCPU in SE
This patch sets up low and high privilege code and data segments and places them
in the following order: cs low, ds low, ds, cs, in the GDT. Additionally, a
syscall and page fault handler for KvmCPU in SE mode are defined. The order of
the segment selectors in GDT is required in this manner for interrupt handling
to work properly. Segment initialization is done for all the thread
contexts.
Alexandru Dutu [Mon, 24 Nov 2014 02:01:08 +0000 (18:01 -0800)]
kvm, x86: Adding support for SE mode execution
This patch adds methods in KvmCPU model to handle KVM exits caused by syscall
instructions and page faults. These types of exits will be encountered if
KvmCPU is run in SE mode.
Alexandru Dutu [Mon, 24 Nov 2014 02:01:08 +0000 (18:01 -0800)]
cpuid, x86: Enabling more features in CPUid
Adding more features in the CPUid with the purpose of supporting running the
KvmCPU in SE mode.
Steve Reinhardt [Mon, 24 Nov 2014 02:00:47 +0000 (18:00 -0800)]
Backed out prior changeset f9fb64a72259
Back out use of importlib to avoid implicitly creating
dependency on Python 2.7.
Gabe Black [Sun, 23 Nov 2014 13:55:26 +0000 (05:55 -0800)]
config: ruby: Get rid of an "eval" and an "exec" operating on generated code.
We can get the same result using importlib.
Gabe Black [Sat, 22 Nov 2014 01:22:19 +0000 (17:22 -0800)]
x86: Update stats for the new Linux delay port.
Gabe Black [Sat, 22 Nov 2014 01:22:02 +0000 (17:22 -0800)]
x86: pc: Put a stub IO device at port 0xed which the kernel can use for delays.
There was already a stub device at 0x80, the port traditionally used for an IO
delay. 0x80 is also the port used for POST codes sent by firmware, and that
may have prompted adding this port as a second option.
Nilay Vaish [Wed, 19 Nov 2014 01:17:29 +0000 (19:17 -0600)]
configs: small fix to ruby portion of fs.py and se.py
In fs.py the io port controller was being attached to the iobus multiple
times. This should be done only once. In se.py, the option use_map
was being set which no longer exists.
Gabe Black [Tue, 18 Nov 2014 10:38:23 +0000 (02:38 -0800)]
dev: Use fixed size member variables to describe fixed size PL111 registers.
Gabe Black [Mon, 17 Nov 2014 09:45:42 +0000 (01:45 -0800)]
vnc: Add a conversion function for bgr888.
Gabe Black [Mon, 17 Nov 2014 09:00:53 +0000 (01:00 -0800)]
x86: Fix setting segment bases in real mode.
The data size used for actually writing the base value for the segment was the
default size, but really it should set the entire value without any possible
truncation.
Gabe Black [Mon, 17 Nov 2014 08:20:01 +0000 (00:20 -0800)]
x86: Fix some bugs in the real mode far jmp instruction.
The far pointer should be shifted right to get the selector value, not left.
Also, when calculating the width of the offset, the wrong register was used in
one spot.
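As a rough illustration of the shift direction, assume the far pointer is
packed into one integer with the offset in the low bits and the 16-bit
selector directly above it. That layout and the names splitFarPointer and
realModeBase are assumptions for this sketch; the actual fix is in the x86
microcode.

```cpp
#include <cassert>
#include <cstdint>

struct FarPointer
{
    std::uint16_t selector;
    std::uint32_t offset;
};

// Split a packed far pointer. offsetBits is the operand-size dependent
// width of the offset (16 or 32 in this sketch).
inline FarPointer splitFarPointer(std::uint64_t packed, unsigned offsetBits)
{
    assert(offsetBits == 16 || offsetBits == 32);
    const std::uint64_t offsetMask = (std::uint64_t{1} << offsetBits) - 1;
    FarPointer fp;
    // The selector sits above the offset, so it is recovered by
    // shifting right, not left.
    fp.selector = static_cast<std::uint16_t>(packed >> offsetBits);
    fp.offset = static_cast<std::uint32_t>(packed & offsetMask);
    return fp;
}

// In real mode the segment base is the selector shifted left by four.
inline std::uint32_t realModeBase(std::uint16_t selector)
{
    return std::uint32_t{selector} << 4;
}
```

The only left shift in this picture is the one that turns a selector into a
real mode segment base.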
Gabe Black [Mon, 17 Nov 2014 08:19:07 +0000 (00:19 -0800)]
x86: APIC: Only set deliveryStatus if our IPI is going somewhere.
Otherwise the IPI which isn't sent will never arrive, and the deliveryStatus
bit will never be cleared.
Gabe Black [Mon, 17 Nov 2014 08:17:06 +0000 (00:17 -0800)]
x86: APIC: Fix the getRegArrayBit function.
The getRegArrayBit function extracts a bit from a series of registers which
are treated as a single large bit array. A previous change had modified the
logic which figured out which bit to extract from ">> 5" to "% 5" which seems
wrong, especially when other, similar functions were changed to use "% 32".
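A sketch of the indexing, assuming the registers are plain consecutive 32-bit
words (the real register file layout may differ): the register index is the
bit number divided by 32, and the position within that register is the bit
number modulo 32.

```cpp
#include <cassert>
#include <cstdint>

// Extract bit `bit` from consecutive 32-bit registers that are treated
// as one large bit array.
inline bool getRegArrayBit(const std::uint32_t *regs, unsigned bit)
{
    const unsigned reg = bit >> 5;     // which register: bit / 32
    const unsigned offset = bit % 32;  // position inside that register
    return (regs[reg] >> offset) & 1u;
}
```

Replacing ">> 5" with "% 5" would select the register by the remainder of a
division by five, which never reaches registers 5 and up and maps unrelated
bits onto the same register.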
Gabe Black [Mon, 17 Nov 2014 08:16:36 +0000 (00:16 -0800)]
x86: Update the stats for the x86 FS o3 boot test.
Gabe Black [Mon, 17 Nov 2014 07:12:42 +0000 (23:12 -0800)]
x86: Fix the CPUID Long Mode Address Size function.
The value in EAX has an 8 bit field for the linear address size and one for
the physical address size when calling that function. A recent change
implemented it but returned 0xff for both of those fields. That implies that
linear and physical addresses are 255 bits wide which is wrong. When using the
KVM CPU model this causes an error, presumably because some of those bits are
actually reserved, or the CPU or kernel realizes 255 bits is a bad value.
This change makes those values 48.
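For reference, a sketch of how the two fields pack into EAX, using the x86
layout of physical address bits in EAX[7:0] and linear address bits in
EAX[15:8]. The helper name addressSizeEax is made up for this note.

```cpp
#include <cassert>
#include <cstdint>

// Pack the EAX value of the Long Mode Address Size function:
// physical address bits in [7:0], linear address bits in [15:8].
inline std::uint32_t addressSizeEax(unsigned physBits, unsigned linearBits)
{
    assert(physBits <= 0xff && linearBits <= 0xff);
    return (linearBits << 8) | physBits;
}
```

With both sizes set to 48 the low half of EAX is 0x3030, whereas the earlier
0xff in each field produced 0xffff.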
Andrew Bardsley [Fri, 14 Nov 2014 08:54:02 +0000 (03:54 -0500)]
config: Fix checkpoint restore in C++ config example
This patch fixes the checkpoint restore option in the example of C++
configuration (util/cxx_config).
The fix introduces a call to config_manager->startup() (which calls startup
on all SimObjects managed by that manager) to replicate the loop of
SimObject::startup calls in src/python/m5/simulate.py::simulate guarded by
need_startup. As util/cxx_config/main.cc is a C++ analogue of
src/python/m5/simulate.py, it should make a similar set of calls.
Andreas Hansson [Fri, 14 Nov 2014 08:53:51 +0000 (03:53 -0500)]
arm: Fixes based on UBSan and static analysis
Another churn to clean up undefined behaviour, mostly ARM, but some
parts also touching the generic part of the code base.
Most of the fixes are simply ensuring proper initialisation. One
of the more subtle changes is the return type of the sign-extension,
which is changed to uint64_t. This is to avoid shifting negative
values (undefined behaviour) in the ISA code.
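A minimal sketch of a sign-extension helper with an unsigned return type.
This is written for illustration under the assumption of a compile-time
width N between 1 and 63, and is not a copy of the helper in the code base.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low N bits of val. The result is unsigned, so callers
// can shift it left without ever shifting a negative signed value.
template <int N>
inline std::uint64_t sext(std::uint64_t val)
{
    static_assert(N > 0 && N < 64, "width must be 1..63 in this sketch");
    const std::uint64_t lowMask = (std::uint64_t{1} << N) - 1;
    const bool negative = (val >> (N - 1)) & 1u;
    return negative ? (val | ~lowMask) : (val & lowMask);
}
```

For example, sext<8>(0xff) << 4 is a well-defined unsigned shift, whereas
shifting the equivalent int64_t value of -1 left is the undefined behaviour
the patch avoids.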
Andreas Hansson [Fri, 14 Nov 2014 08:53:48 +0000 (03:53 -0500)]
mem: Clarify unit of DRAM controller buffer size
Andreas Hansson [Wed, 12 Nov 2014 14:05:25 +0000 (09:05 -0500)]
stats: Bump regressions to match latest changes
Updates after timezone hiccup and sorting of dictionary items in the
SimObject.
Mitch Hayenga [Wed, 12 Nov 2014 14:05:23 +0000 (09:05 -0500)]
mem: Delete unused variable in Garnet NetworkLink
With recent changes OSX clang compilation fails due to an unused variable.
Ali Saidi [Wed, 12 Nov 2014 14:05:22 +0000 (09:05 -0500)]
arm: Fix timing wakeup with LLSC
Andreas Hansson [Wed, 12 Nov 2014 14:05:21 +0000 (09:05 -0500)]
sim: Sort SimObject descendants and ports
This patch fixes a number of occurrences where the sorting order of the
objects was implementation defined.
Andreas Hansson [Wed, 12 Nov 2014 14:05:20 +0000 (09:05 -0500)]
base: Revert 9277177eccff and use getenv/setenv for UTC time
This patch reverts changeset 9277177eccff which does not do what it
was intended to do. In essence, we go back to implementing mkutctime
much like the non-standard timegm extension.
Nilay Vaish [Tue, 11 Nov 2014 20:17:10 +0000 (14:17 -0600)]
stats: changes to x86 o3 fs and sparc fs regression tests.
Marc Orr [Thu, 6 Nov 2014 11:42:22 +0000 (05:42 -0600)]
x86 isa: This patch attempts an implementation at mwait.
Mwait works as follows:
1. A cpu monitors an address of interest (monitor instruction)
2. A cpu calls mwait - this loads the cache line into that cpu's cache.
3. The cpu goes to sleep.
4. When another processor requests write permission for the line, it is
evicted from the sleeping cpu's cache. This eviction is forwarded to the
sleeping cpu, which then wakes up.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Marc Orr [Thu, 6 Nov 2014 11:42:21 +0000 (05:42 -0600)]
tests: A test program for the new mwait implementation.
This is a simple test program for the new mwait implementation. It uses
m5threads to create two threads of execution in syscall emulation mode that
interact using the mwait instruction.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Andrew Lukefahr [Thu, 6 Nov 2014 11:42:21 +0000 (05:42 -0600)]
cpu: Minor Draining Bug
Fixes a bug where Minor drains in the midst of committing a
conditional store.
While committing a conditional store, lastCommitWasEndOfMacroop is true
(from the previous instruction) as we still haven't finished the conditional
store. If a drain occurs before the cache response, Minor would check just
lastCommitWasEndOfMacroop, which was true, and set drainState=DrainHaltFetch,
which increases the streamSeqNum. This caused the conditional store to be
squashed when the memory responded and it completed. However, to the memory
the store succeeded, while to the instruction sequence it never occurred.
In the case of an LLSC, the instruction sequence will replay the squashed
STREX, which will fail as the cache is no longer in LLSC. Then the
instruction sequence will loop back to a LDREX, which receives the updated
(incorrect) value.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Nilay Vaish [Thu, 6 Nov 2014 11:42:21 +0000 (05:42 -0600)]
stats: updates due to changes to ruby
Nilay Vaish [Thu, 6 Nov 2014 11:42:21 +0000 (05:42 -0600)]
ruby: provide a backing store
Ruby's functional accesses are not guaranteed to succeed as of now. While
this is not a problem for the protocols that are currently in the mainline
repo, it seems that coherence protocols for gpus rely on a backing store to
supply the correct data. The aim of this patch is to make this backing store
configurable i.e. it comes into play only when a particular option:
--access-backing-store is invoked.
The backing store has been there since M5 and GEMS were integrated. The only
difference is that earlier the system used to maintain the backing store and
ruby's copy was write-only. Sometime last year, we moved to data being
supplied by ruby in SE mode simulations. And now we have patches on
the reviewboard, which remove ruby's copy of memory altogether and rely
completely on the system's memory to supply data. This patch adds back a
SimpleMemory member to RubySystem. This member is used only if the option:
access-backing-store is set to true. By default, the memory would not be
accessed.
Nilay Vaish [Thu, 6 Nov 2014 11:42:21 +0000 (05:42 -0600)]
ruby: interface with classic memory controller
This patch is the final in the series. The whole series and this patch in
particular were written with the aim of interfacing ruby's directory controller
with the memory controller in the classic memory system. This is being done
since ruby's memory controller has not been kept up to date with the changes
going on in DRAMs. Classic's memory controller is more up to date and
supports multiple different types of DRAM. This also brings classic and
ruby ever closer. The patch also changes ruby's memory controller to
expose the same interface.
Nilay Vaish [Thu, 6 Nov 2014 11:42:20 +0000 (05:42 -0600)]
ruby: remove the function functionalReadBuffers()
This function was added when I had incorrectly arrived at the conclusion
that such a function can improve the chances of a functional read succeeding.
As was later realized, this is not possible in the current setup. While the
code using this function was dropped long back, this function was not. Hence
the patch.
Nilay Vaish [Thu, 6 Nov 2014 11:42:20 +0000 (05:42 -0600)]
ruby: coherence protocols: remove data block from directory entry
This patch removes the data block present in the directory entry structure
of each protocol in gem5's mainline. Firstly, this is required for moving
towards a common set of memory controllers for classic and ruby memory systems.
Secondly, the data block was being misused in several places. It was being
used for having free access to the physical memory instead of calling on the
memory controller.
From now on, the directory controller will not have a direct visibility into
the physical memory. The Memory Vector object now resides in the
Memory Controller class. This also means that some significant changes are
being made to the functional accesses in ruby.
Nilay Vaish [Thu, 6 Nov 2014 11:42:20 +0000 (05:42 -0600)]
ruby: slicc: allow adding a bool to an int, like C++.
Nilay Vaish [Thu, 6 Nov 2014 11:42:20 +0000 (05:42 -0600)]
ruby: remove sparse memory.
In my opinion, it creates needless complications in the rest of the code.
Also, this structure hinders the move towards a common set of code for
physical memory controllers.
Nilay Vaish [Thu, 6 Nov 2014 11:41:44 +0000 (05:41 -0600)]
ruby: single physical memory in fs mode
Both ruby and the system used to maintain memory copies. With the changes
carried for programmed io accesses, only one single memory is required for
fs simulations. This patch sets the copy of memory that used to reside
with the system to null, so that no space is allocated, but address checks
can still be carried out. All the memory accesses now source and sink values
to the memory maintained by ruby.
Nilay Vaish [Thu, 6 Nov 2014 06:55:09 +0000 (00:55 -0600)]
ruby: dma sequencer: remove RubyPort as parent class
As of now DMASequencer inherits from the RubyPort class. But the code in
RubyPort class is heavily tailored for the CPU Sequencer. There are parts of
the code that are not required at all for the DMA sequencer. Moreover, the
next patch uses the dma sequencer for carrying out memory accesses for all the
io devices. Hence, it is better to have a leaner dma sequencer.
Ali Saidi [Mon, 3 Nov 2014 16:14:42 +0000 (10:14 -0600)]
tests: Update stats no match.
Bootloader I had on my system was an older version with a couple of
instruction differences.
Ali Saidi [Thu, 30 Oct 2014 05:04:12 +0000 (00:04 -0500)]
arm, tests: Forgot the system.terminal files for the new regressions.
Ali Saidi [Thu, 30 Oct 2014 04:50:15 +0000 (23:50 -0500)]
arm, tests: Add 64-bit ARM regression tests
Ali Saidi [Thu, 30 Oct 2014 04:22:26 +0000 (23:22 -0500)]
automated merge
Ali Saidi [Thu, 30 Oct 2014 04:18:29 +0000 (23:18 -0500)]
tests: Update regressions for the new kernels and various preceding fixes.
Ali Saidi [Thu, 30 Oct 2014 04:18:27 +0000 (23:18 -0500)]
arm, tests: Update config files to more recent kernels and create 64-bit regressions.
This changes the default ARM system to a Versatile Express-like system that supports
2GB of memory and PCI devices and updates the default kernels/file-systems for
AArch64 ARM systems (64-bit) to support up to 32GB of memory and PCI devices. Some
platforms that are no longer supported have been pruned from the configuration files.
In addition a set of 64-bit ARM regressions have been added to the regression system.
Mitch Hayenga [Thu, 30 Oct 2014 04:18:27 +0000 (23:18 -0500)]
cpu: Add writeback modeling for drain functionality
It is possible for the O3 CPU to consider itself drained and
later have a squashed instruction perform a writeback. This
patch re-adds tracking of in-flight instructions to prevent
falsely signaling a drained event.
Mitch Hayenga [Thu, 30 Oct 2014 04:18:26 +0000 (23:18 -0500)]
cpu: Add drain check functionality to IEW
IEW did not check the instQueue and memDepUnit to ensure
they were drained. This caused issues when drainSanityCheck()
did check those structures after asserting IEW was drained.
Ali Saidi [Thu, 30 Oct 2014 04:18:26 +0000 (23:18 -0500)]
arm, mem: Fix drain bug and provide drain prints for more components.
Ali Saidi [Thu, 30 Oct 2014 04:18:26 +0000 (23:18 -0500)]
arm: Fix multi-system AArch64 boot w/caches.
Automatically extract cpu release address from DTB file.
Check SCTLR_EL1 to verify all caches are enabled.
Ali Saidi [Thu, 30 Oct 2014 04:18:26 +0000 (23:18 -0500)]
arm: fix bare-metal memory setup.
The bare-metal configuration option still configured memory with the old scheme
that no longer works. This change unifies the code so there aren't any differences.
Ali Saidi [Thu, 30 Oct 2014 04:18:24 +0000 (23:18 -0500)]
arm: Mark some miscregs (timer counter) registers as unverifiable.
The checker can't verify timer registers, so it should just grab the version
from the executing CPU, otherwise it could get a larger value and diverge
execution.
Ali Saidi [Thu, 30 Oct 2014 04:18:24 +0000 (23:18 -0500)]
cpu: Add support to checker for CACHE_BLOCK_ZERO commands.
The checker didn't know how to properly validate these new commands.
Andrew Bardsley [Thu, 30 Oct 2014 04:18:24 +0000 (23:18 -0500)]
cpu: Fix barrier push to store buffer when full bug in Minor
This patch fixes a bug where a completing load or store which is also a
barrier can push a barrier into the store buffer without first checking
that there is a free slot.
The bug was not fatal but would print a warning that the store buffer
was full when inserting.
Curtis Dunham [Tue, 21 Oct 2014 22:04:41 +0000 (17:04 -0500)]
mem: don't inhibit WriteInv's or defer snoops on their MSHRs
WriteInvalidate semantics depend on the unconditional writeback
or they won't complete. Also, there's no point in deferring snoops
on their MSHRs, as they don't get new data at the end of their life
cycle the way other transactions do.
Add comment in the cache about a minor inefficiency re: WriteInvalidate.
Curtis Dunham [Thu, 30 Oct 2014 04:18:24 +0000 (23:18 -0500)]
mem: have WriteInvalidate obsolete MSHRs
Since WriteInvalidate directly writes into the cache, it can
create tricky timing interleavings with reads and writes to the
same cache line that haven't yet completed. This patch ensures
that these requests, when completed, don't overwrite the newer
data from the WriteInvalidate.
Steve Reinhardt [Tue, 2 Sep 2014 21:07:50 +0000 (16:07 -0500)]
syscall_emul: add retry flag to SyscallReturn
This hook allows blocking emulated system calls to indicate
that they would block, but return control to the simulator
so that the simulation does not hang. The actual retry
functionality requires additional support, to be provided
in a future changeset.
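The idea of a retry flag can be sketched as below. ToySyscallReturn and
toyRead are hypothetical names invented for this note, and the interface is
an assumption rather than the one added by the changeset.

```cpp
#include <cassert>
#include <cstdint>

// Toy return type: either a completed result or a request to retry later.
class ToySyscallReturn
{
    std::int64_t value;
    bool retryFlag;

    ToySyscallReturn(std::int64_t v, bool r) : value(v), retryFlag(r) {}

  public:
    static ToySyscallReturn done(std::int64_t v)
    {
        return ToySyscallReturn(v, false);
    }

    static ToySyscallReturn retry() { return ToySyscallReturn(0, true); }

    bool needsRetry() const { return retryFlag; }

    // Only meaningful once the call has actually completed.
    std::int64_t returnValue() const
    {
        assert(!retryFlag);
        return value;
    }
};

// Hypothetical emulated read: instead of blocking the simulator when no
// data is available, report that the call should be retried.
inline ToySyscallReturn toyRead(bool dataReady, std::int64_t bytes)
{
    return dataReady ? ToySyscallReturn::done(bytes)
                     : ToySyscallReturn::retry();
}
```

The caller checks needsRetry() first and leaves the guest's return register
untouched when it is set; how and when the call is reissued is the part left
to the future changeset.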
Steve Reinhardt [Wed, 22 Oct 2014 22:53:34 +0000 (15:53 -0700)]
syscall_emul: minor style fix to LiveProcess constructor
Steve Reinhardt [Wed, 22 Oct 2014 22:53:34 +0000 (15:53 -0700)]
syscall_emul: devirtualize BaseBufferArg methods
Not clear why they were marked virtual to begin with,
but that doesn't appear to be necessary.
Steve Reinhardt [Wed, 22 Oct 2014 22:53:34 +0000 (15:53 -0700)]
syscall_emul: Put BufferArg classes in a separate header.
Move the BufferArg classes that support syscall buffer args
(i.e., pointers into simulated user space) out of syscall_emul.hh
and into a new header syscall_emul_buf.hh so they are accessible
to emulated driver implementations.
Take the opportunity to add some comments as well.
Steve Reinhardt [Wed, 22 Oct 2014 22:53:34 +0000 (15:53 -0700)]
syscall_emul: add EmulatedDriver object
Fake SE-mode device drivers can now be added by
deriving from this abstract object.
Nilay Vaish [Wed, 22 Oct 2014 20:59:57 +0000 (15:59 -0500)]
sim: revert 6709bbcf564d
The identifier SYS_getdents is not available on Mac OS X. Therefore, its use
results in compilation failure. It seems there is no straightforward way to
implement the system call getdents using readdir() or similar C functions.
Hence the commit
6709bbcf564d is being rolled back.
Andreas Hansson [Mon, 20 Oct 2014 22:03:56 +0000 (18:03 -0400)]
x86: Fixes to avoid LTO warnings
This patch fixes a few minor issues that caused link-time warnings
when using LTO, mainly for x86. The most important change is how the
syscall array is created. Previously gcc and clang would complain that
the declaration and definition types did not match. The organisation
is now changed to match how it is done for ARM, moving the code that
was previously in syscalls.cc into process.cc, and having a class
variable pointing to the static array.
With these changes, there are no longer any warnings using gcc 4.6.3
with LTO.
Andreas Hansson [Mon, 20 Oct 2014 22:03:55 +0000 (18:03 -0400)]
misc: Use gmtime for conversion to UTC to avoid getenv/setenv
This patch changes how we turn time into UTC. Previously we
manipulated the TZ environment variable, but this has issues as the
strings that are manipulated could be tainted (see e.g. CERT
ENV34-C). Now we simply rely on the built-in gmtime function and avoid
touching getenv/setenv altogether.
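As an illustration of the approach (not necessarily the exact code in the patch), the standard gmtime function yields a broken-down UTC time without reading or writing TZ. The helper name `toUtc` is hypothetical.

```cpp
#include <cassert>
#include <ctime>

// Convert seconds since the epoch to broken-down UTC without touching TZ.
std::tm toUtc(std::time_t seconds)
{
    const std::tm *utc = std::gmtime(&seconds);
    assert(utc != nullptr);  // gmtime returns null if conversion fails
    return *utc;             // copy out of the library's shared buffer
}
```

Note that std::gmtime returns a pointer to storage shared between calls, so the sketch copies the result before returning it.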
Omar Naji [Mon, 20 Oct 2014 22:03:55 +0000 (18:03 -0400)]
mem: Fix DRAM activation limit bug
Ensure that we do the proper event scheduling also when the activation
limit is disabled.
Andreas Hansson [Mon, 20 Oct 2014 22:03:54 +0000 (18:03 -0400)]
base: Fix for stats node on gcc < 4.6.3
This patch adds an explicit function to get the underlying node as gcc
4.6.1 and 4.6.2 have issues otherwise.
Andreas Hansson [Mon, 20 Oct 2014 22:03:53 +0000 (18:03 -0400)]
ext: Bump DRAMPower to avoid compilation issues
This patch bumps DRAMPower to commit
19433a6897ede4bbb19b06694faa8589b5a6569a which contains a small fix
for clang, and a work-around for LTO with gcc 4.6.
Omar Naji [Mon, 20 Oct 2014 22:03:52 +0000 (18:03 -0400)]
mem: Add DRAM device size and check against config
This patch adds the size of the DRAM device to the DRAM config. It
also compares the actual DRAM size (calculated using information from
the config) to the size defined in the system. If these two values do
not match gem5 will print a warning. In order to do correct DRAM
research the size of the memory defined in the system should match the
size of the DRAM in the config. The timing and current parameters
found in the DRAM configs are defined for a DRAM device with a
specific size and would differ for another device with a different
size.
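A rough sketch of such a consistency check. The struct, its fields, and the capacity formula (device size times devices per rank times ranks per channel) are assumptions made for illustration; the patch may compute the size differently.

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>

// Hypothetical subset of a DRAM config; field names are illustrative only.
struct SketchDramConfig
{
    uint64_t deviceSizeBytes;
    unsigned devicesPerRank;
    unsigned ranksPerChannel;
};

// Assumed formula: capacity of one channel built from identical devices.
uint64_t configuredCapacity(const SketchDramConfig &cfg)
{
    assert(cfg.devicesPerRank > 0 && cfg.ranksPerChannel > 0);
    return cfg.deviceSizeBytes * cfg.devicesPerRank * cfg.ranksPerChannel;
}

// Warn (without failing) when the system's memory size disagrees.
bool capacityMatches(const SketchDramConfig &cfg, uint64_t systemBytes)
{
    const uint64_t actual = configuredCapacity(cfg);
    if (actual != systemBytes) {
        std::cerr << "warn: DRAM capacity " << actual
                  << " does not match system size " << systemBytes << "\n";
        return false;
    }
    return true;
}
```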
Nilay Vaish [Mon, 20 Oct 2014 21:48:19 +0000 (16:48 -0500)]
stats: updates due to previous mmap and exit_group patches.
Nilay Vaish [Mon, 20 Oct 2014 21:47:55 +0000 (16:47 -0500)]
cpu: o3: corrects base FP and CC register index in removeThread()
Tom Jablin [Mon, 20 Oct 2014 21:45:25 +0000 (16:45 -0500)]
sim: invalid alignment checks in mmap and mremap
Presently, the alignment checks in the mmap and mremap implementations
in syscall_emul.hh are wrong. The checks are implemented as:
    if ((start % TheISA::PageBytes) != 0 ||
        (length % TheISA::PageBytes) != 0) {
        warn("mmap failing: arguments not page-aligned: "
             "start 0x%x length 0x%x",
             start, length);
        return -EINVAL;
    }
This checks both the start and the length arguments of the mmap
syscall for page alignment. However, the POSIX specification says:
The off argument is constrained to be aligned and sized according to the value
returned by sysconf() when passed _SC_PAGESIZE or _SC_PAGE_SIZE. When MAP_FIXED
is specified, the application shall ensure that the argument addr also meets
these constraints. The implementation performs mapping operations over whole
pages. Thus, while the argument len need not meet a size or alignment
constraint, the implementation shall include, in any mapping operation, any
partial page specified by the range [pa,pa+len).
So the length parameter should not be checked for page-alignment. By contrast,
the current implementation fails to check the offset argument, which must be
page aligned.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
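The rule described in the mmap alignment commit above can be sketched as follows. This is an illustration under stated assumptions, not the patched code: `SketchPageBytes` stands in for TheISA::PageBytes with an assumed value of 4096, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>

// Assumed page size; the real value comes from the simulated ISA.
constexpr uint64_t SketchPageBytes = 4096;

bool pageAligned(uint64_t v) { return (v % SketchPageBytes) == 0; }

// The length need not be aligned; the mapping covers any partial page.
uint64_t roundUpToPage(uint64_t length)
{
    assert(length <= UINT64_MAX - (SketchPageBytes - 1));  // avoid overflow
    return (length + SketchPageBytes - 1) / SketchPageBytes * SketchPageBytes;
}

// Returns 0 if the arguments are acceptable, -EINVAL otherwise.
int checkMmapArgs(uint64_t start, uint64_t offset, bool fixed)
{
    if (!pageAligned(offset))
        return -EINVAL;  // the offset must always be page-aligned
    if (fixed && !pageAligned(start))
        return -EINVAL;  // the address only matters with MAP_FIXED
    return 0;
}
```

The length is deliberately absent from the check and is instead rounded up to a whole number of pages.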
Michael Adler [Mon, 20 Oct 2014 21:45:08 +0000 (16:45 -0500)]
sim: mmap: correct behavior for fixed address
Change mmap fixed address request to return an error if the mapping is
impossible due to conflict instead of what I believe used to be silent
corruption.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
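A minimal sketch of rejecting a conflicting fixed-address request, as the commit above describes. The region bookkeeping, the names, and the choice of -EINVAL as the error code are all assumptions for illustration; the commit does not state which error is returned.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <vector>

// Hypothetical record of an existing mapping: [start, start + length).
struct SketchRegion
{
    uint64_t start;
    uint64_t length;
};

bool overlaps(const SketchRegion &a, const SketchRegion &b)
{
    return a.start < b.start + b.length && b.start < a.start + a.length;
}

// Map at exactly 'start', or fail if that would clash with a mapping.
int64_t mapFixed(std::vector<SketchRegion> &mapped,
                 uint64_t start, uint64_t length)
{
    assert(length > 0);
    const SketchRegion request{start, length};
    for (const auto &region : mapped) {
        if (overlaps(region, request))
            return -EINVAL;  // assumed error code; report, do not corrupt
    }
    mapped.push_back(request);
    return static_cast<int64_t>(start);
}
```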
Michael Adler [Mon, 20 Oct 2014 21:44:53 +0000 (16:44 -0500)]
sim: implement getdents/getdents64 in user mode
Has been tested only for alpha.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
x86: syscall: implementation of exit_group
On exit_group syscall, we used to exit the simulator. But now we will only
halt the execution of threads that belong to the group.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
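The exit_group behaviour described above can be sketched roughly as below. `SketchThread` and `haltThreadGroup` are hypothetical; the real implementation operates on the simulator's thread contexts.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical view of a simulated thread and its thread group id.
struct SketchThread
{
    int tgid;
    bool halted;
};

// Halt only the threads in the given group; return how many were halted.
std::size_t haltThreadGroup(std::vector<SketchThread> &threads, int tgid)
{
    assert(tgid >= 0);
    std::size_t count = 0;
    for (auto &t : threads) {
        if (t.tgid == tgid && !t.halted) {
            t.halted = true;
            ++count;
        }
    }
    return count;
}
```

Threads belonging to other groups keep running, so the simulation as a whole does not exit.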
Andreas Hansson [Thu, 16 Oct 2014 09:50:01 +0000 (05:50 -0400)]
mem: Modernise PhysicalMemory with C++11 features
Bring the PhysicalMemory up-to-date by making use of range-based for
loops and vector initialisation where possible.
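For readers unfamiliar with these features, a small generic example follows. It is not taken from PhysicalMemory; `SketchRange` and `totalSize` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical half-open address range [start, end).
struct SketchRange
{
    uint64_t start;
    uint64_t end;
};

// Range-based for loop with auto replaces explicit iterator boilerplate.
uint64_t totalSize(const std::vector<SketchRange> &ranges)
{
    uint64_t total = 0;
    for (const auto &r : ranges) {
        assert(r.end >= r.start);
        total += r.end - r.start;
    }
    return total;
}
```

With C++11 list initialisation a vector can be built in place, for example `std::vector<SketchRange> ranges{{0, 0x1000}, {0x2000, 0x2800}};`.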
Andreas Hansson [Thu, 16 Oct 2014 09:49:59 +0000 (05:49 -0400)]
misc: Move AddrRangeList from port.hh to addr_range.hh
The new location seems like a better fit. The iterator typedefs are
removed in favour of using C++11 auto.
Andreas Sandberg [Thu, 16 Oct 2014 09:49:58 +0000 (05:49 -0400)]
ext: Update fputils to rev 6a47fd8358
This patch updates fputils to the latest revision (6a47fd8358) from
the upstream repository (github.com/andysan/fputils). Most notably,
this includes changes that export a limited set of 64-bit float
manipulation functions and avoid a warning about unused 64-bit floats in clang.
Geoffrey Blake [Thu, 16 Oct 2014 09:49:57 +0000 (05:49 -0400)]
dev: refactor pci config space for sysfs scanning
Sysfs on Ubuntu scrapes the entire PCI config space
when it discovers a device, using 4-byte accesses.
This was not supported by our devices, in particular the NIC
that implemented the extended PCI config space. This change
allows the extended PCI config space to be accessed by
sysfs properly.
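A minimal sketch of serving 1-, 2- and 4-byte reads from a flat config space. The 256-byte size, the array layout, and the function name are assumptions for illustration and do not reflect the device classes changed by this patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed size of the config space backing store.
constexpr std::size_t SketchConfigSize = 256;

// Read 'size' bytes at 'offset', assembling them in little-endian order.
uint32_t readConfig(const std::array<uint8_t, SketchConfigSize> &space,
                    std::size_t offset, std::size_t size)
{
    assert(size == 1 || size == 2 || size == 4);
    assert(offset + size <= space.size());
    uint32_t value = 0;
    for (std::size_t i = 0; i < size; ++i)
        value |= static_cast<uint32_t>(space[offset + i]) << (8 * i);
    return value;
}
```

Handling every access width through one byte-wise path means a 4-byte scan of the whole space behaves the same inside and outside the standard header.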