Andreas Hansson [Tue, 26 May 2015 07:21:40 +0000 (03:21 -0400)]
base: Allow multiple interleaved ranges
This patch changes how the address range calculates intersection such
that a system can have a number of non-overlapping interleaved ranges
without complaining. Without this patch we end up with a panic.
Andreas Hansson [Tue, 26 May 2015 07:21:39 +0000 (03:21 -0400)]
stats: Update MinorCPU regressions after accounting fix
Andrew Bardsley [Tue, 26 May 2015 07:21:37 +0000 (03:21 -0400)]
cpu: Fix a bug in counting issued instructions in MinorCPU
The MinorCPU would count bubbles in Execute::issue as part of
the num_insts_issued and so sometimes reach the instruction
issue limit incorrectly.
Fixed by checking for a bubble in one new place.
Giacomo Gabrielli [Tue, 26 May 2015 07:21:35 +0000 (03:21 -0400)]
arm: Implement some missing syscalls (SE mode)
Adding a few syscalls that were previously considered unimplemented.
Andreas Hansson [Tue, 26 May 2015 07:21:34 +0000 (03:21 -0400)]
ruby: Deprecation warning for RubyMemoryControl
A step towards removing RubyMemoryControl and shift users to
DRAMCtrl. The latter is faster, more representative, very versatile,
and is integrated with power models.
Andreas Sandberg [Sat, 23 May 2015 12:50:57 +0000 (13:50 +0100)]
arm, stats: Update stats to reflect changes to generic timer
The addition of a virtual timer affects stats in minor and o3.
Andreas Sandberg [Sat, 23 May 2015 12:46:56 +0000 (13:46 +0100)]
arm, dev: Add support for a memory mapped generic timer
There are cases when we don't want to use a system register mapped
generic timer, but can't use the SP804. For example, when using KVM on
aarch64, we want to intercept accesses to the generic timer, but can't
do so if it is using the system register interface. In such cases,
we need to use a memory-mapped generic timer.
This changeset adds a device model that implements the memory mapped
generic timer interface. The current implementation only supports a
single frame (i.e., one virtual timer and one physical timer).
Andreas Sandberg [Sat, 23 May 2015 12:46:54 +0000 (13:46 +0100)]
arm: Get rid of pointless have_generic_timer param
The ArmSystem class has a parameter to indicate whether it is
configured to use the generic timer extension or not. This parameter
doesn't affect any feature flags in the current implementation and is
therefore completely unnecessary. In fact, we usually don't set it
even if a system has a generic timer. If we ever need to check if
there is a generic timer present, we should just request a pointer and
check if it is non-null instead.
Andreas Sandberg [Sat, 23 May 2015 12:46:53 +0000 (13:46 +0100)]
dev, arm: Add virtual timers to the generic timer model
The generic timer model currently does not support virtual
counters. Virtual and physical counters both tick with the same
frequency. However, virtual timers allow a hypervisor to set an offset
that is subtracted from the counter when it is read. This enables the
hypervisor to present a time base that ticks with virtual time in the
VM (i.e., doesn't tick when the VM isn't running). Modern Linux
kernels generally assume that virtual counters exist and try to use
them by default.
Andreas Sandberg [Sat, 23 May 2015 12:46:52 +0000 (13:46 +0100)]
dev, arm: Refactor and clean up the generic timer model
This changeset cleans up the generic timer a bit and moves most of the
register juggling from the ISA code into a separate class in the same
source file as the rest of the generic timer. It also removes the
assumption that there is always 8 or fewer CPUs in the system. Instead
of having a fixed limit, we now instantiate per-core timers as they
are requested. This is all in preparation for other patches that add
support for virtual timers and a memory mapped interface.
Andreas Sandberg [Sat, 23 May 2015 12:37:22 +0000 (13:37 +0100)]
kvm: Fix dumping code for large registers
The register dumping code in kvm tries to print the bytes in large
registers (128 bits and larger) instead of printing them as hex. This
changeset fixes that.
Andreas Sandberg [Sat, 23 May 2015 12:37:20 +0000 (13:37 +0100)]
kvm, x86: Guard x86-specific APIs in KvmVM
Protect x86-specific APIs in KvmVM with compile-time guards to avoid
breaking ARM builds.
Andreas Sandberg [Sat, 23 May 2015 12:37:18 +0000 (13:37 +0100)]
build: Don't test for KVM xsave support on ARM
The current build tests for KVM unconditionally check for xsave
support. This obviously never works on ARM since xsave is
x86-specific. This changeset refactors the build tests probing for KVM
support and moves the xsave test to an x86-specific section of
is_isa_kvm_compatible().
Andreas Sandberg [Sat, 23 May 2015 12:37:04 +0000 (13:37 +0100)]
arm: Workaround incorrect HDLCD register order in kernel
Some versions of the kernel incorrectly swap the red and blue color
select registers. This changeset adds a workaround for that by
swapping them when instantiating a PixelConverter.
Andreas Sandberg [Sat, 23 May 2015 12:37:03 +0000 (13:37 +0100)]
base: Redesign internal frame buffer handling
Currently, frame buffer handling in gem5 is quite ad hoc. In practice,
we pass around naked pointers to raw pixel data and expect consumers
to convert frame buffers using the (broken) VideoConverter.
This changeset completely redesigns the way we handle frame buffers
internally. In summary, it fixes several color conversion bugs, adds
support for more color formats (e.g., big endian), and makes the code
base easier to follow.
In the new world, gem5 always represents pixel data using the Pixel
struct when pixels need to be passed between different classes (e.g.,
a display controller and the VNC server). Producers of entire frames
(e.g., display controllers) should use the FrameBuffer class to
represent a frame.
Frame producers are expected to create one instance of the FrameBuffer
class in their constructors and register it with its consumers
once. Consumers are expected to check the dimensions of the frame
buffer when they consume it.
Conversion between the external representation and the internal
representation is supported for all common "true color" RGB formats of
up to 32-bit color depth. The external pixel representation is
expected to be between 1 and 4 bytes in either big endian or little
endian. Color channels are assumed to be contiguous ranges of bits
within each pixel word. The external pixel value is scaled to an 8-bit
internal representation using a floating multiplication to map it to
the entire 8-bit range.
Andreas Sandberg [Sat, 23 May 2015 12:37:01 +0000 (13:37 +0100)]
base: Clean up bitmap generation code
The bitmap generation code is hard to follow and incorrectly uses the
size of an enum member to calculate the size of a pixel. This
changeset cleans up the code and adds some documentation.
Joel Hestness [Tue, 19 May 2015 15:56:51 +0000 (10:56 -0500)]
ruby: Fix RubySystem warm-up and cool-down scope
The processes of warming up and cooling down Ruby caches are simulation-wide
processes, not just RubySystem instance-specific processes. Thus, the warm-up
and cool-down variables should be globally visible to any Ruby components
participating in either process. Make these variables static members and track
the warm-up and cool-down processes as appropriate.
This patch also has two side benefits:
1) It removes references to the RubySystem g_system_ptr, which are problematic
for allowing multiple RubySystem instances in a single simulation. Warmup and
cooldown variables being static (global) reduces the need for instance-specific
dereferences through the RubySystem.
2) From the AbstractController, it removes local RubySystem pointers, which are
used inconsistently with other uses of the RubySystem: 11 other uses reference
the RubySystem with the g_system_ptr. Only sequencers have local pointers.
Andreas Hansson [Fri, 15 May 2015 17:40:01 +0000 (13:40 -0400)]
arm: Identify table-walker requests
This patch ensures all page-table walks are flagged as such.
Andreas Hansson [Fri, 15 May 2015 17:39:53 +0000 (13:39 -0400)]
misc: Appease gcc 5.1
Three minor issues are resolved:
1. Apparently gcc 5.1 does not like negation of booleans followed by
bitwise AND.
2. Somehow the compiler also gets confused and warns about
NoopMachInst being unused (removing it causes compilation errors
though). Most likely a compiler bug.
3. There seems to be a number of instances where loop unrolling causes
false positives for the array-bounds check. For now, switch to
std::array. Potentially we could disable the warning for newer gcc
versions, but switching to std::array is probably a good move in
any case.
Andreas Sandberg [Fri, 15 May 2015 17:39:44 +0000 (13:39 -0400)]
sim: Don't clear the active CPU vector in System::initState
The system class currently clears the vector of active CPUs in
initState(). CPUs are added to the list by registerThreadContext()
which is called from BaseCPU::init(). This obviously breaks when the
System object is initialized after the CPUs. This changeset removes
the offending clear() call since the list will be empty after it has
been instantiated anyway.
Andreas Hansson [Fri, 15 May 2015 17:38:46 +0000 (13:38 -0400)]
config: Use null memory for DRAM sweep script
Do not waste time when we do not care about the data.
Wendy Elsasser [Fri, 15 May 2015 17:38:45 +0000 (13:38 -0400)]
config: Add new MemConfig options to DRAM sweep script
Update script to match current MemConfig options with
external_memory_system option set to 0.
Steve Reinhardt [Tue, 5 May 2015 16:25:59 +0000 (09:25 -0700)]
syscall_emul: fix warn_once behavior
The current ignoreWarnOnceFunc doesn't really work as expected,
since it will only generate one warning total, for whichever
"warn-once" syscall is invoked first. This patch fixes that
behavior by keeping a "warned" flag in the SyscallDesc object,
allowing suitably flagged syscalls to warn exactly once per
syscall.
Andreas Hansson [Tue, 5 May 2015 07:22:48 +0000 (03:22 -0400)]
stats, arm: Update stats for missing FPEXC.EN check
Only one regression is affected.
Andreas Hansson [Tue, 5 May 2015 07:22:45 +0000 (03:22 -0400)]
arm: Add missing FPEXC.EN check
Add a missing check to ensure that exceptions are generated properly.
Giacomo Gabrielli [Tue, 5 May 2015 07:22:42 +0000 (03:22 -0400)]
arm: enable DCZVA by default in SE mode
Andreas Hansson [Tue, 5 May 2015 07:22:39 +0000 (03:22 -0400)]
stats: Update stats to reflect cache changes
Stephan Diestelhorst [Tue, 17 Mar 2015 11:50:55 +0000 (11:50 +0000)]
mem: Create a request copy for deferred snoops
Sometimes, we need to defer an express snoop in an MSHR, but the original
request might complete and deallocate the original pkt->req. In those cases,
create a copy of the request so that someone who is inspecting the delayed
snoop can also inspect the request still. All of this is rather hacky, but the
allocation / linking and general life-time management of Packet and Request is
rather tricky. Deleting the copy is another tricky area, testing so far has
shown that the right copy is deleted at the right time.
Andreas Sandberg [Tue, 5 May 2015 07:22:34 +0000 (03:22 -0400)]
arm: Relax ordering for some uncacheable accesses
We currently assume that all uncacheable memory accesses are strictly
ordered. Instead of always enforcing strict ordering, we now only
enforce it if the required memory type is device memory or strongly
ordered memory.
Andreas Sandberg [Tue, 5 May 2015 07:22:33 +0000 (03:22 -0400)]
mem, cpu: Add a separate flag for strictly ordered memory
The Request::UNCACHEABLE flag currently has two different
functions. The first, and obvious, function is to prevent the memory
system from caching data in the request. The second function is to
prevent reordering and speculation in CPU models.
This changeset gives the order/speculation requirement a separate flag
(Request::STRICT_ORDER). This flag prevents CPU models from doing the
following optimizations:
* Speculation: CPU models are not allowed to issue speculative
loads.
* Write combining: CPU models and caches are not allowed to merge
writes to the same cache line.
Note: The memory system may still reorder accesses unless the
UNCACHEABLE flag is set. It is therefore expected that the
STRICT_ORDER flag is combined with the UNCACHEABLE flag to prevent
this behavior.
Andreas Sandberg [Tue, 5 May 2015 07:22:31 +0000 (03:22 -0400)]
mem, alpha: Move Alpha-specific request flags
Move Alpha-specific memory request flags to an architecture-specific
header and map them to the architecture specific flag bit range.
Andreas Hansson [Tue, 5 May 2015 07:22:30 +0000 (03:22 -0400)]
arm: Remove unnecessary boot uncachability
With the recent patches addressing how we deal with uncacheable
accesses there is no longer need for the work arounds put in place to
enforce certain sections of memory to be uncacheable during boot.
Andreas Hansson [Tue, 5 May 2015 07:22:29 +0000 (03:22 -0400)]
mem: Snoop into caches on uncacheable accesses
This patch takes a last step in fixing issues related to uncacheable
accesses. We do not separate uncacheable memory from uncacheable
devices, and in cases where it is really memory, there are valid
scenarios where we need to snoop since we do not support cache
maintenance instructions (yet). On snooping an uncacheable access we
thus provide data if possible. In essence this makes uncacheable
accesses IO coherent.
The snoop filter is also queried to steer the snoops, but not updated
since the uncacheable accesses do not allocate a block.
Andreas Hansson [Tue, 5 May 2015 07:22:27 +0000 (03:22 -0400)]
arch, cpu: Do not forward snoops to table walker
This patch simplifies the overall CPU by changing the TLB caches such
that they do not forward snoops to the table walker port(s). Note that
only ARM and X86 are affected.
There is no reason for the ports to snoop as they do not actually take
any action, and from a performance point of view we are better of not
snooping more than we have to.
Should it at a later point be required to snoop for a particular TLB
design it is easy enough to add it back.
Andreas Hansson [Tue, 5 May 2015 07:22:26 +0000 (03:22 -0400)]
mem: Pass shared downstream through caches
This patch ensures that we pass on information about a packet being
shared (rather than exclusive), when forwarding a packet downstream.
Without this patch there is a risk that a downstream cache considers
the line exclusive when it really isn't.
Ali Jafri [Tue, 5 May 2015 07:22:25 +0000 (03:22 -0400)]
mem: Add forward snoop check for HardPFReqs
We should always check whether the cache is supposed to be forwarding snoops
before generating snoops.
Andreas Hansson [Tue, 5 May 2015 07:22:24 +0000 (03:22 -0400)]
mem: Add missing stats update for uncacheable MSHRs
This patch adds a missing counter update for the uncacheable
accesses. By updating this counter we also get a meaningful average
latency for uncacheable accesses (previously inf).
Andreas Hansson [Tue, 5 May 2015 07:22:22 +0000 (03:22 -0400)]
mem: Tidy up BaseCache parameters
This patch simply tidies up the BaseCache parameters and removes the
unused "two_queue" parameter.
David Guillen [Tue, 5 May 2015 07:22:21 +0000 (03:22 -0400)]
mem: Remove templates in cache model
This patch changes the cache implementation to rely on virtual methods
rather than using the replacement policy as a template argument.
There is no impact on the simulation performance, and overall the
changes make it easier to modify (and subclass) the cache and/or
replacement policy.
Andreas Hansson [Tue, 5 May 2015 07:22:19 +0000 (03:22 -0400)]
cpu: Work around gcc 4.9 issues with Num_OpClasses
This patch fixes a recent issue with gcc 4.9 (and possibly more) being
convinced that indices outside the array bounds are used when
initialising the FUPool members.
Andreas Hansson [Tue, 5 May 2015 07:22:17 +0000 (03:22 -0400)]
stats: Bring regression stats in line with actual behaviour
Nilay Vaish [Thu, 30 Apr 2015 19:17:43 +0000 (14:17 -0500)]
stats: arm: updates
Nilay Vaish [Thu, 30 Apr 2015 03:35:23 +0000 (22:35 -0500)]
stats: x86: updates due to change in div latency
Ruslan Bukin [Thu, 30 Apr 2015 03:35:23 +0000 (22:35 -0500)]
arch, base, dev, kern, sym: FreeBSD support
This adds support for FreeBSD/aarch64 FS and SE mode (basic set of syscalls only)
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Rizwana Begum [Thu, 30 Apr 2015 03:35:22 +0000 (22:35 -0500)]
mem: Simplify page close checks for adaptive policies
Both open_adaptive and close_adaptive page polices keep the page
open if a row hit is found. If a row hit is not found, close_adaptive
page policy precharges the row, and open_adaptive policy precharges
the row only if there is a bank conflict request waiting in the queue.
This patch makes the checks for above conditions simpler.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Nilay Vaish [Thu, 30 Apr 2015 03:35:22 +0000 (22:35 -0500)]
ruby: set: replace long by unsigned long
UBSan complains about negative value being shifted
Nilay Vaish [Thu, 30 Apr 2015 03:35:22 +0000 (22:35 -0500)]
cpu: o3: replace issueLatency with bool pipelined
Currently, each op class has a parameter issueLat that denotes the cycles after
which another op of the same class can be issued. As of now, this latency can
either be one cycle (fully pipelined) or same as execution latency of the op
(not at all pipelined). The fact that issueLat is a parameter of type Cycles
makes one believe that it can be set to any value. To avoid the confusion, the
parameter is being renamed as 'pipelined' with type boolean. If set to true,
the op would execute in a fully pipelined fashion. Otherwise, it would execute
in an unpipelined fashion.
Nilay Vaish [Thu, 30 Apr 2015 03:35:22 +0000 (22:35 -0500)]
cpu: o3: single cycle default div microop latency on x86
This patch sets the default latency of the division microop to a single cycle
on x86. This is because the division instructions DIV and IDIV have been
implemented as loops of div microops, where each microop computes a single bit
of the quotient.
Nilay Vaish [Thu, 30 Apr 2015 03:35:22 +0000 (22:35 -0500)]
x86: change divide-by-zero fault to divide-error
Same exception is raised whether division with zero is performed or the
quotient is greater than the maximum value that the provided space can hold.
Divide-by-Zero is the AMD terminology, while Divide-Error is Intel's.
Andreas Hansson [Fri, 24 Apr 2015 07:30:08 +0000 (03:30 -0400)]
misc: Appease gcc 5.1 without moving GDB_REG_BYTES
This patch rolls back the move of the GDB_REG_BYTES constant, and
instead adds M5_VAR_USED.
bpotter [Thu, 23 Apr 2015 20:40:18 +0000 (13:40 -0700)]
config: enable setting SE-mode environment variables from file
Rene de Jong [Thu, 23 Apr 2015 17:37:50 +0000 (13:37 -0400)]
arm, dev: Add a UFS device
This patch introduces a UFS host controller and a UFS device. More
information about the UFS standard can be found at the JEDEC site:
http://www.jedec.org/standards-documents/results/jesd220
Note that the model does not implement the complete standard, and as
such is not an actual implementation of UFS. The following SCSI
commands are implemented: inquiry, read, read capacity, report LUNs,
start/stop, test unit ready, verify, write, format unit, send
diagnostic, synchronize cache, mode select, mode sense, request sense,
unmap, write buffer and read buffer. This is sufficient for usage with
Linux and Android.
To interact with this model a kernel version 3.9 or above is
needed.
Rene de Jong [Thu, 23 Apr 2015 17:37:49 +0000 (13:37 -0400)]
arm, dev: Add a NAND flash timing model
This adds a NAND flash timing model. This model takes the number of
planes into account and is ultimately intended to be used as a
high-level performance model for any device using flash. To access the
memory, use either readMemory or writeMemory.
To make use of the model you will need an interface model
such as UFSHostDevice, which is part of a separate patch.
At the moment the flash device is part of the ARM device tree since
the only use if the UFSHostDevice, and that in turn relies on the ARM
GIC.
Peter Enns [Thu, 23 Apr 2015 17:37:48 +0000 (13:37 -0400)]
dev: Add support for i2c devices
This patch adds an I2C bus and base device. I2C is used to connect a
variety of sensors, and this patch serves as a starting point to
enable a range of I2C devices.
Andreas Hansson [Thu, 23 Apr 2015 17:37:46 +0000 (13:37 -0400)]
misc: Appease gcc 5.1
This patch fixes a few small issues to ensure gem5 compiles when using
gcc 5.1.
First, the GDB_REG_BYTES in the RemoteGDB header are, rather
surprisingly, flagged as unused for both ARM and X86. Removing them,
however, causes compilation errors as they are actually used in the
source file. Moving the constant into the class definition fixes the
issue. Possibly a gcc bug.
Second, we have an unused EthPktData constructor using auto_ptr, and
the latter is deprecated. Since the code is never used it is simply
removed.
Steve Reinhardt [Thu, 23 Apr 2015 03:22:29 +0000 (20:22 -0700)]
stats: update for previous changeset
Very small differences in IQ-specific O3 stats.
Brandon Potter [Wed, 22 Apr 2015 14:52:03 +0000 (07:52 -0700)]
cpu: remove conditional check (count > 0) on o3 IQ squashes
The o3 cpu instruction queue model uses the count variable to track the number
of unissued instructions in the queue. Previously, the squash method used
this variable to avoid executing the doSquash method when there were no
unissued instructions in the pipeline. A corner case problem exists when
only issued instructions exist in the pipeline and a squash occurs; the
doSquash code is not invoked and subsequently does not clean up state properly.
Brandon Potter [Wed, 22 Apr 2015 14:51:27 +0000 (07:51 -0700)]
syscall_emul: implement clock_gettime system call
Monir Mozumder [Wed, 22 Apr 2015 14:51:27 +0000 (07:51 -0700)]
syscall_emul: update x86 syscall table
Update table with additional definitions through Linux 3.13.
Brandon Potter [Wed, 22 Apr 2015 14:51:27 +0000 (07:51 -0700)]
syscall_emul: update getrlimit to use warn
Don't use std::cerr directly, and just return EINVAL instead of aborting.
Brandon Potter [Wed, 22 Apr 2015 14:51:27 +0000 (07:51 -0700)]
syscall_emul: fix warning with wrong syscall name
Also nix extra whitespace.
Brandon Potter [Wed, 22 Apr 2015 14:51:27 +0000 (07:51 -0700)]
base: add new ChunkGenerator method to identify last chunk
Steve Reinhardt [Mon, 20 Apr 2015 22:09:43 +0000 (15:09 -0700)]
stats: update a few stats from long O3 runs
Very small changes to iew.predictedNotTakenIncorrect
and iew.branchMispredicts. Looks like similar updates
were committed on April 3 (changeset
235ff1c046df), but
only for the quick tests.
Andreas Hansson [Mon, 20 Apr 2015 16:46:35 +0000 (12:46 -0400)]
cpu: Remove the InOrderCPU from the tree
This patch takes the final step in removing the InOrderCPU from the
tree. Rest in peace.
The MinorCPU is now used to model an in-order microarchitecture, and
long term the MinorCPU will eventually be renamed InOrderCPU.
Andreas Hansson [Mon, 20 Apr 2015 16:46:29 +0000 (12:46 -0400)]
config: Remove memory aliases and rely on class name
Instead of maintaining two lists, rely entirely on the class
name. There is really no point in causing unecessary confusion.
Nilay Vaish [Wed, 15 Apr 2015 21:04:37 +0000 (16:04 -0500)]
Added tag stable_2015_04_15 for changeset
e17949745150
Nilay Vaish [Tue, 14 Apr 2015 16:01:11 +0000 (11:01 -0500)]
stats: x86: changes due to recent patches
The change in 20.parser is from new x87 instructions. The change to
pc-o3-timing is not clear to me. It seems that this test might be invoking
some undefined behavior.
Malek Musleh [Tue, 14 Apr 2015 16:01:10 +0000 (11:01 -0500)]
config, cpu: fix progress interval for switched CPUs
This patch ensures that the CPU progress Event is triggered for the new set of
switched_cpus that get scheduled (e.g. during fast-forwarding). it also avoids
printing the interval state if the cpu is currently switched out.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Dibakar Gope [Mon, 13 Apr 2015 22:33:57 +0000 (17:33 -0500)]
cpu: re-organizes the branch predictor structure.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Nilay Vaish [Mon, 13 Apr 2015 22:33:57 +0000 (17:33 -0500)]
x86: implements x87 mult/div instructions
Lena Olson [Mon, 13 Apr 2015 22:33:57 +0000 (17:33 -0500)]
ruby: allow restoring from checkpoint when using DRAMCtrl
Restoring from a checkpoint with ruby + the DRAMCtrl memory model was not
working, because ruby and DRAMCtrl disagreed on the current tick during warmup.
Since there is no reason to do timing requests during warmup, use functional
requests instead.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Nilay Vaish [Mon, 13 Apr 2015 22:33:57 +0000 (17:33 -0500)]
sim: Use NULL instead of None for testing filenames.
The filenames are initialized with NULL. So the test should be
checking for them to be == NULL instead == None.
Nilay Vaish [Mon, 13 Apr 2015 22:33:57 +0000 (17:33 -0500)]
sim: fix function for emulating dup()
The function was using the host fd to obtain the fd object from the simulated
process.
Curtis Dunham [Wed, 8 Apr 2015 20:56:06 +0000 (15:56 -0500)]
config: Support full-system with SST's memory system
This patch adds an example configuration in ext/sst/tests/ that allows
an SST/gem5 instance to simulate a 4-core AArch64 system with SST's
memHierarchy components providing all the caches and memories.
Curtis Dunham [Wed, 8 Apr 2015 20:56:06 +0000 (15:56 -0500)]
ext: Add SST connector
This patch adds a connector that allows gem5 to be used as a component
in SST (Structural Simulation Toolkit, sst-simulator.org). At a high
level, this allows memory traffic to pass between the two simulators.
SST Links are roughly analogous to gem5 Ports, although Links do not
have a notion of master and slave. This distinction is important to
gem5, so when connecting a gem5 CPU to an SST cache, an ExternalSlave
must be used, and similarly when connecting the memory side of SST cache
to a gem5 port (for memory <-> I/O), an ExternalMaster must be used.
These connectors handle the administrative aspects of gem5
(initialization, simulation, shutdown) as well as translating SST's
MemEvents into gem5 Packets and vice-versa.
Nilay Vaish [Fri, 3 Apr 2015 16:42:11 +0000 (11:42 -0500)]
stats: updates due to recent changesets.
Nikos Nikoleris [Fri, 3 Apr 2015 16:42:10 +0000 (11:42 -0500)]
dev: (un)serialize fix for the RTC and RTC Timer Interrupt events
Restoring from a checkpoint fails if either the RTC or the RTC Timer
Interrrupt event is disabled. The restored machine tried incorrectly
to schedule the next event with negative offset.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Ruslan Bukin [Fri, 3 Apr 2015 16:42:10 +0000 (11:42 -0500)]
sim: correct check for endianess
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Ruslan Bukin [Fri, 3 Apr 2015 16:42:10 +0000 (11:42 -0500)]
dev: Extend access width for IDE control registers
Add 32-bit access width for PrimaryTiming register and 16bit for UDMAControl
register as FreeBSD required.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Nikos Nikoleris [Fri, 3 Apr 2015 16:42:10 +0000 (11:42 -0500)]
cpu: fix system total instructions accounting
The totalInstructions counter is only incremented when the whole instruction is
commited and not on every microop. It was incorrectly reset in atomic and
timing cpus.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>"
Lena Olson [Fri, 3 Apr 2015 16:42:10 +0000 (11:42 -0500)]
x86: fix debug trace output for mwait
When running with the Exec flag, the mwait instruction attempted
to print out its source registers, which were never actually
initialized. This led to sporadic assertion failures when the
value stored there was invalid.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Andreas Hansson [Fri, 27 Mar 2015 08:56:10 +0000 (04:56 -0400)]
arm, configs: Do not forward snoops from I cache
This fix simply tells the I cache to not forward snoops to the fetch
unit (since there is really no reason to do so).
Stephan Diestelhorst [Fri, 27 Mar 2015 08:56:03 +0000 (04:56 -0400)]
mem: Support any number of master-IDs in stride prefetcher
The stride prefetcher had a hardcoded number of contexts (i.e. master-IDs)
that it could handle. Since master IDs need to be unique per system, and
every core, cache etc. requires a separate master port, a static limit on
these does not make much sense.
Instead, this patch adds a small hash map that will map all master IDs to
the right prefetch state and dynamically allocates new state for new master
IDs.
Andreas Hansson [Fri, 27 Mar 2015 08:56:02 +0000 (04:56 -0400)]
mem: Allocate cache writebacks before new MSHRs
This patch changes the order of writeback allocation such that any
writebacks resulting from a tag lookup (e.g. for an uncacheable
access), are added to the writebuffer before any new MSHR entries are
allocated. This ensures that the writebacks logically precedes the new
allocations.
The patch also changes the uncacheable flush to use proper timed (or
atomic) writebacks, as opposed to functional writes.
Andreas Hansson [Fri, 27 Mar 2015 08:56:01 +0000 (04:56 -0400)]
mem: Cleanup flow for uncacheable accesses
This patch simplifies the code dealing with uncacheable timing
accesses, aiming to align it with the existing miss handling. Similar
to what we do in atomic, a timing request now goes through
Cache::access (where the block is also flushed), and then proceeds to
ignore any existing MSHR for the block in question. This unifies the
flow for cacheable and uncacheable accesses, and for atomic and timing.
Andreas Hansson [Fri, 27 Mar 2015 08:56:00 +0000 (04:56 -0400)]
mem: Ignore uncacheable MSHRs when finding matches
This patch changes how we search for matching MSHRs, ignoring any MSHR
that is allocated for an uncacheable access. By doing so, this patch
fixes a corner case in the MSHRs where incorrect data ended up being
copied into a (cacheable) read packet due to a first uncacheable MSHR
target of size 4, followed by a cacheable target to the same MSHR of
size 64. The latter target was filled with nonsense data.
Andreas Hansson [Fri, 27 Mar 2015 08:55:59 +0000 (04:55 -0400)]
mem: Remove redundant allocateUncachedReadBuffer in cache
This patch removes the no-longer-needed
allocateUncachedReadBuffer. Besides the checks it is exactly the same
as allocateMissBuffer and thus provides no value.
Andreas Hansson [Fri, 27 Mar 2015 08:55:57 +0000 (04:55 -0400)]
mem: Modernise MSHR iterators to C++11
This patch updates the iterators in the MSHR and MSHR queues to use
C++11 range-based for loops. It also does a bit of additional house
keeping.
Andreas Hansson [Fri, 27 Mar 2015 08:55:57 +0000 (04:55 -0400)]
tests: Update stats for cache block alignment
Andreas Hansson [Fri, 27 Mar 2015 08:55:55 +0000 (04:55 -0400)]
mem: Align all MSHR entries to block boundaries
This patch aligns all MSHR queue entries to block boundaries to
simplify checks for matches. Previously there were corner cases that
could lead to existing entries not being identified as matches.
There are, rather alarmingly, a few regressions that change with this
patch.
Ali Jafri [Fri, 27 Mar 2015 08:55:54 +0000 (04:55 -0400)]
mem: Rename PREFETCH_SNOOP_SQUASH flag to BLOCK_CACHED
This patch subsumes the PREFETCH_SNOOP_SQUASH flag with the more
generic BLOCK_CACHED flag. Future patches implementing cache eviction
messages can use the BLOCK_CACHED flag in almost the same manner as
hardware prefetches use the PREFETCH_SNOOP_SQUASH flag. The
PREFTECH_SNOOP_FLAG is set if the prefetch target is found in the tags
or the MSHRs in any state, so we are simply replacing calls to
setPrefetchSquashed() with setBlockCached(). The case of where the
prefetch target is found in the writeback MSHRs of upper level caches
continues to be covered by the MEM_INHIBIT flag.
Curtis Dunham [Thu, 26 Mar 2015 15:16:44 +0000 (11:16 -0400)]
sim: Update limit_event reuse to final version
Matching final version on reviewboard.
Andreas Hansson [Thu, 26 Mar 2015 15:16:43 +0000 (11:16 -0400)]
cpu: Fix InstPBTrace inheritance
This patch fixes an issue that prevented gem5 to be built with C++
config and without Python.
Steve Reinhardt [Mon, 23 Mar 2015 23:14:20 +0000 (16:14 -0700)]
mem: rename Locked/LOCKED to LockedRMW/LOCKED_RMW
Makes x86-style locked operations even more distinct from
LLSC operations. Using "locked" by itself should be
obviously ambiguous now.
Steve Reinhardt [Mon, 23 Mar 2015 23:14:19 +0000 (16:14 -0700)]
config: expand '~' and '~user' in paths
Steve Reinhardt [Mon, 23 Mar 2015 23:14:18 +0000 (16:14 -0700)]
misc: quote args in echoed command line
Currently if there are shell special characters in a
command-line argument, you can't copy and paste the
echoed command line onto a shell prompt because the
characters aren't quoted properly. This patch fixes
that problem.
Curtis Dunham [Mon, 23 Mar 2015 10:57:38 +0000 (06:57 -0400)]
config: Add ability to exit simulation after initialization
When using gem5 as a slave simulator, it will not advance the
clock on its own and depends on the master simulator calling
simulate(). This new option lets us use the Python scripts
to do all the configuration while stopping short of actually
simulating anything.
Curtis Dunham [Mon, 23 Mar 2015 10:57:36 +0000 (06:57 -0400)]
sim: Reuse the same limit_event in simulate()
This patch accomplishes two things:
1. Makes simulate()'s GlobalSimLoopExitEvent a singleton reused
across calls. This is slightly more efficient than recreating
it every time.
2. Gives callers to simulate() (especially other simulators) a
foolproof way of knowing that the simulation period ended
successfully by hitting the limit event. They can call
getLimitEvent() and compare it to the return
value of simulate().
This change was motivated by an ongoing effort to integrate gem5
and SST, with SST as the master sim and gem5 as the slave sim.
Andreas Hansson [Mon, 23 Mar 2015 10:57:34 +0000 (06:57 -0400)]
mem: Tidy up Request
This patch does a bit of house keeping, fixing up typos, removing dead
code etc.
Andreas Hansson [Mon, 23 Mar 2015 10:57:31 +0000 (06:57 -0400)]
tests: Final reclassification of quick regressions
A few regressions were still considered long, but finished well within
the 180 seconds. They are only a handful (mostly mcf in atomic).
--HG--
rename : tests/long/fs/10.linux-boot/ref/arm/linux/realview-switcheroo-timing/config.ini => tests/quick/fs/10.linux-boot/ref/arm/linux/realview-switcheroo-timing/config.ini
rename : tests/long/fs/10.linux-boot/ref/arm/linux/realview-switcheroo-timing/simerr => tests/quick/fs/10.linux-boot/ref/arm/linux/realview-switcheroo-timing/simerr
rename : tests/long/fs/10.linux-boot/ref/arm/linux/realview-switcheroo-timing/simout => tests/quick/fs/10.linux-boot/ref/arm/linux/realview-switcheroo-timing/simout
rename : tests/long/fs/10.linux-boot/ref/arm/linux/realview-switcheroo-timing/stats.txt => tests/quick/fs/10.linux-boot/ref/arm/linux/realview-switcheroo-timing/stats.txt
rename : tests/long/fs/10.linux-boot/ref/arm/linux/realview-switcheroo-timing/system.terminal => tests/quick/fs/10.linux-boot/ref/arm/linux/realview-switcheroo-timing/system.terminal
rename : tests/long/se/10.mcf/ref/arm/linux/simple-atomic/chair.cook.ppm => tests/quick/se/10.mcf/ref/arm/linux/simple-atomic/chair.cook.ppm
rename : tests/long/se/10.mcf/ref/arm/linux/simple-atomic/config.ini => tests/quick/se/10.mcf/ref/arm/linux/simple-atomic/config.ini
rename : tests/long/se/10.mcf/ref/arm/linux/simple-atomic/mcf.out => tests/quick/se/10.mcf/ref/arm/linux/simple-atomic/mcf.out
rename : tests/long/se/10.mcf/ref/arm/linux/simple-atomic/simerr => tests/quick/se/10.mcf/ref/arm/linux/simple-atomic/simerr
rename : tests/long/se/10.mcf/ref/arm/linux/simple-atomic/simout => tests/quick/se/10.mcf/ref/arm/linux/simple-atomic/simout
rename : tests/long/se/10.mcf/ref/arm/linux/simple-atomic/stats.txt => tests/quick/se/10.mcf/ref/arm/linux/simple-atomic/stats.txt
rename : tests/long/se/10.mcf/ref/arm/linux/simple-timing/chair.cook.ppm => tests/quick/se/10.mcf/ref/arm/linux/simple-timing/chair.cook.ppm
rename : tests/long/se/10.mcf/ref/arm/linux/simple-timing/config.ini => tests/quick/se/10.mcf/ref/arm/linux/simple-timing/config.ini
rename : tests/long/se/10.mcf/ref/arm/linux/simple-timing/mcf.out => tests/quick/se/10.mcf/ref/arm/linux/simple-timing/mcf.out
rename : tests/long/se/10.mcf/ref/arm/linux/simple-timing/simerr => tests/quick/se/10.mcf/ref/arm/linux/simple-timing/simerr
rename : tests/long/se/10.mcf/ref/arm/linux/simple-timing/simout => tests/quick/se/10.mcf/ref/arm/linux/simple-timing/simout
rename : tests/long/se/10.mcf/ref/arm/linux/simple-timing/stats.txt => tests/quick/se/10.mcf/ref/arm/linux/simple-timing/stats.txt
rename : tests/long/se/10.mcf/ref/sparc/linux/simple-atomic/config.ini => tests/quick/se/10.mcf/ref/sparc/linux/simple-atomic/config.ini
rename : tests/long/se/10.mcf/ref/sparc/linux/simple-atomic/mcf.out => tests/quick/se/10.mcf/ref/sparc/linux/simple-atomic/mcf.out
rename : tests/long/se/10.mcf/ref/sparc/linux/simple-atomic/simerr => tests/quick/se/10.mcf/ref/sparc/linux/simple-atomic/simerr
rename : tests/long/se/10.mcf/ref/sparc/linux/simple-atomic/simout => tests/quick/se/10.mcf/ref/sparc/linux/simple-atomic/simout
rename : tests/long/se/10.mcf/ref/sparc/linux/simple-atomic/stats.txt => tests/quick/se/10.mcf/ref/sparc/linux/simple-atomic/stats.txt
rename : tests/long/se/10.mcf/ref/x86/linux/simple-atomic/config.ini => tests/quick/se/10.mcf/ref/x86/linux/simple-atomic/config.ini
rename : tests/long/se/10.mcf/ref/x86/linux/simple-atomic/mcf.out => tests/quick/se/10.mcf/ref/x86/linux/simple-atomic/mcf.out
rename : tests/long/se/10.mcf/ref/x86/linux/simple-atomic/simerr => tests/quick/se/10.mcf/ref/x86/linux/simple-atomic/simerr
rename : tests/long/se/10.mcf/ref/x86/linux/simple-atomic/simout => tests/quick/se/10.mcf/ref/x86/linux/simple-atomic/simout
rename : tests/long/se/10.mcf/ref/x86/linux/simple-atomic/stats.txt => tests/quick/se/10.mcf/ref/x86/linux/simple-atomic/stats.txt
rename : tests/long/se/10.mcf/test.py => tests/quick/se/10.mcf/test.py
rename : tests/long/se/30.eon/ref/alpha/tru64/simple-atomic/config.ini => tests/quick/se/30.eon/ref/alpha/tru64/simple-atomic/config.ini
rename : tests/long/se/30.eon/ref/alpha/tru64/simple-atomic/simerr => tests/quick/se/30.eon/ref/alpha/tru64/simple-atomic/simerr
rename : tests/long/se/30.eon/ref/alpha/tru64/simple-atomic/simout => tests/quick/se/30.eon/ref/alpha/tru64/simple-atomic/simout
rename : tests/long/se/30.eon/ref/alpha/tru64/simple-atomic/stats.txt => tests/quick/se/30.eon/ref/alpha/tru64/simple-atomic/stats.txt
rename : tests/long/se/30.eon/test.py => tests/quick/se/30.eon/test.py