Rune Holm [Tue, 9 Jun 2015 13:21:16 +0000 (09:21 -0400)]
arm: Delete debug print in initialization of hardware thread
There seems to have been a debug print left in when the original ARMv8
support was merged in. This printout is performed every time you
initialize a hardware thread, and it prints raw pointers, so it always
causes diffs in the regression. This patch removes the debug print.
Rune Holm [Tue, 9 Jun 2015 13:21:15 +0000 (09:21 -0400)]
arm: Fix typo in ldrsh instruction name
ldrsh was typoed as hdrsh, which is a bit annoying when printing
instructions. This patch fixes it.
Andreas Sandberg [Tue, 9 Jun 2015 13:21:14 +0000 (09:21 -0400)]
base: Reset CircleBuf size on flush()
The flush() method in CircleBuf resets the state of the circular
buffer, but fails to set size to zero. This obviously confuses code
that tries to determine the amount of data in the buffer. Set the size
to zero on flush.
Andreas Sandberg [Tue, 9 Jun 2015 13:21:12 +0000 (09:21 -0400)]
dev, arm: Include PIO size in AmbaDmaDevice constructor
Make it possible to specify the size of the PIO space for an AMBA DMA
device. Maintain backwards compatibility and default to zero.
Andreas Hansson [Tue, 9 Jun 2015 13:21:11 +0000 (09:21 -0400)]
scons: Allow GNU assembler version strings with hyphen
Make scons a bit more forgiving when determining the GNU assembler version.
Marco Elver [Sun, 7 Jun 2015 19:02:40 +0000 (14:02 -0500)]
ruby: Fix MESI consistency bug
Fixes missed forward eviction to CPU. With the O3CPU this can lead to load-load
reordering, as the LQ is never notified of the invalidate.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Matthias Jung [Sun, 7 Jun 2015 19:02:40 +0000 (14:02 -0500)]
mem: Add HMC Timing Parameters
A single HMC-2500 x32 model based on:
[1] DRAMSpec: a high-level DRAM bank modelling tool developed at the University
of Kaiserslautern. This high level tool uses RC (resistance-capacitance) and CV
(capacitance-voltage) models to estimate the DRAM bank latency and power
numbers.
[2] A Logic-base Interconnect for Supporting Near Memory Computation in the
Hybrid Memory Cube (E. Azarkhish et. al) Assumed for the HMC model is a 30 nm
technology node. The modelled HMC consists of a 4 Gbit part with 4 layers
connected with TSVs. Each layer has 16 vaults and each vault consists of 2
banks per layer. In order to be able to use the same controller used for 2D
DRAM generations for HMC, the following analogy is done: Channel (DDR) => Vault
(HMC) device_size (DDR) => size of a single layer in a vault ranks per channel
(DDR) => number of layers banks per rank (DDR) => banks per layer devices per
rank (DDR) => devices per layer ( 1 for HMC). The parameters for which no
input is available are inherited from the DDR3 configuration.
Ruslan Bukin ext:(%2C%20Zhang%20Guoye) [Sun, 7 Jun 2015 19:02:40 +0000 (14:02 -0500)]
arch: fix build under MacOSX
put O_DIRECT under ifdefs -- this fixes build for MacOSX.
Also use correct class for arm64 openFlagTable.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Christoph Pfister [Sat, 30 May 2015 11:45:17 +0000 (13:45 +0200)]
mem: addr_mapper: restore old address if request not sent
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Curtis Dunham [Mon, 1 Jun 2015 23:05:11 +0000 (18:05 -0500)]
sim, arm: add checkpoint upgrader for
d02b45a5
The insertion of CONTEXTIDR_EL2 in the ARM miscellaneous registers
obsoletes old checkpoints.
Andreas Sandberg [Mon, 1 Jun 2015 18:44:19 +0000 (19:44 +0100)]
kvm, arm: Add support for aarch64
This changeset adds support for aarch64 in kvm. The CPU module
supports both checkpointing and online CPU model switching as long as
no devices are simulated by the host kernel. It currently has the
following limitations:
* The system register based generic timer can only be simulated by
the host kernel. Workaround: Use a memory mapped timer instead to
simulate the timer in gem5.
* Simulating devices (e.g., the generic timer) in the host kernel
requires that the host kernel also simulates the GIC.
* ID registers in the host and in gem5 must match for switching
between simulated CPUs and KVM. This is particularly important
for ID registers describing memory system capabilities (e.g.,
ASID size, physical address size).
* Switching between a virtualized CPU and a simulated CPU is
currently not supported if in-kernel device emulation is
used. This could be worked around by adding support for switching
to the gem5 (e.g., the KvmGic) side of the device models. A
simpler workaround is to avoid in-kernel device models
altogether.
Andreas Sandberg [Mon, 1 Jun 2015 18:44:17 +0000 (19:44 +0100)]
kvm, arm, dev: Add an in-kernel GIC implementation
This changeset adds a GIC implementation that uses the kernel's
built-in support for simulating the interrupt controller. Since there
is currently no support for state transfer between gem5 and the
kernel, the device model does not support serialization and CPU
switching (which would require switching to a gem5-simulated GIC).
Andreas Sandberg [Mon, 1 Jun 2015 18:43:41 +0000 (19:43 +0100)]
kvm: Handle inst events at the current instruction count
There are cases (particularly when attaching GDB) when instruction
events are scheduled at the current instruction tick. This used to
trigger an assertion error in kvm. This changeset adds a check for
this condition and forces KVM to do a quick entry that completes any
pending IO operations, but does not execute any new instructions,
before servicing the event. We could check if we need to enter KVM at
all, but forcing a quick entry is makes the code slightly cleaner and
does not hurt correctness (performance is hardly an issue in these
cases).
Andreas Sandberg [Mon, 1 Jun 2015 18:43:40 +0000 (19:43 +0100)]
kvm, arm: Move ARM-specific files to arch/arm/kvm/
This changeset moves the ARM-specific KVM CPU implementation to
arch/arm/kvm/. This change is expected to keep the source tree
somewhat cleaner as we start adding support for ARMv8 and KVM
in-kernel interrupt controller simulation.
--HG--
rename : src/cpu/kvm/ArmKvmCPU.py => src/arch/arm/kvm/ArmKvmCPU.py
rename : src/cpu/kvm/arm_cpu.cc => src/arch/arm/kvm/arm_cpu.cc
rename : src/cpu/kvm/arm_cpu.hh => src/arch/arm/kvm/arm_cpu.hh
Curtis Dunham [Tue, 26 May 2015 07:21:45 +0000 (03:21 -0400)]
arm: implement the CONTEXTIDR_EL2 system reg.
Andreas Hansson [Tue, 26 May 2015 07:21:44 +0000 (03:21 -0400)]
arm, stats: Update stats to reflect reduction in misc reg reads
Nathanael Premillieu [Tue, 26 May 2015 07:21:42 +0000 (03:21 -0400)]
arm: Make address translation faster with better caching
This patch adds better caching of the sys regs for AArch64, thus
avoiding unnecessary calls to tc->readMiscReg(MISCREG_CPSR) in the
non-faulting case.
Andreas Hansson [Tue, 26 May 2015 07:21:40 +0000 (03:21 -0400)]
base: Allow multiple interleaved ranges
This patch changes how the address range calculates intersection such
that a system can have a number of non-overlapping interleaved ranges
without complaining. Without this patch we end up with a panic.
Andreas Hansson [Tue, 26 May 2015 07:21:39 +0000 (03:21 -0400)]
stats: Update MinorCPU regressions after accounting fix
Andrew Bardsley [Tue, 26 May 2015 07:21:37 +0000 (03:21 -0400)]
cpu: Fix a bug in counting issued instructions in MinorCPU
The MinorCPU would count bubbles in Execute::issue as part of
the num_insts_issued and so sometimes reach the instruction
issue limit incorrectly.
Fixed by checking for a bubble in one new place.
Giacomo Gabrielli [Tue, 26 May 2015 07:21:35 +0000 (03:21 -0400)]
arm: Implement some missing syscalls (SE mode)
Adding a few syscalls that were previously considered unimplemented.
Andreas Hansson [Tue, 26 May 2015 07:21:34 +0000 (03:21 -0400)]
ruby: Deprecation warning for RubyMemoryControl
A step towards removing RubyMemoryControl and shift users to
DRAMCtrl. The latter is faster, more representative, very versatile,
and is integrated with power models.
Andreas Sandberg [Sat, 23 May 2015 12:50:57 +0000 (13:50 +0100)]
arm, stats: Update stats to reflect changes to generic timer
The addition of a virtual timer affects stats in minor and o3.
Andreas Sandberg [Sat, 23 May 2015 12:46:56 +0000 (13:46 +0100)]
arm, dev: Add support for a memory mapped generic timer
There are cases when we don't want to use a system register mapped
generic timer, but can't use the SP804. For example, when using KVM on
aarch64, we want to intercept accesses to the generic timer, but can't
do so if it is using the system register interface. In such cases,
we need to use a memory-mapped generic timer.
This changeset adds a device model that implements the memory mapped
generic timer interface. The current implementation only supports a
single frame (i.e., one virtual timer and one physical timer).
Andreas Sandberg [Sat, 23 May 2015 12:46:54 +0000 (13:46 +0100)]
arm: Get rid of pointless have_generic_timer param
The ArmSystem class has a parameter to indicate whether it is
configured to use the generic timer extension or not. This parameter
doesn't affect any feature flags in the current implementation and is
therefore completely unnecessary. In fact, we usually don't set it
even if a system has a generic timer. If we ever need to check if
there is a generic timer present, we should just request a pointer and
check if it is non-null instead.
Andreas Sandberg [Sat, 23 May 2015 12:46:53 +0000 (13:46 +0100)]
dev, arm: Add virtual timers to the generic timer model
The generic timer model currently does not support virtual
counters. Virtual and physical counters both tick with the same
frequency. However, virtual timers allow a hypervisor to set an offset
that is subtracted from the counter when it is read. This enables the
hypervisor to present a time base that ticks with virtual time in the
VM (i.e., doesn't tick when the VM isn't running). Modern Linux
kernels generally assume that virtual counters exist and try to use
them by default.
Andreas Sandberg [Sat, 23 May 2015 12:46:52 +0000 (13:46 +0100)]
dev, arm: Refactor and clean up the generic timer model
This changeset cleans up the generic timer a bit and moves most of the
register juggling from the ISA code into a separate class in the same
source file as the rest of the generic timer. It also removes the
assumption that there is always 8 or fewer CPUs in the system. Instead
of having a fixed limit, we now instantiate per-core timers as they
are requested. This is all in preparation for other patches that add
support for virtual timers and a memory mapped interface.
Andreas Sandberg [Sat, 23 May 2015 12:37:22 +0000 (13:37 +0100)]
kvm: Fix dumping code for large registers
The register dumping code in kvm tries to print the bytes in large
registers (128 bits and larger) instead of printing them as hex. This
changeset fixes that.
Andreas Sandberg [Sat, 23 May 2015 12:37:20 +0000 (13:37 +0100)]
kvm, x86: Guard x86-specific APIs in KvmVM
Protect x86-specific APIs in KvmVM with compile-time guards to avoid
breaking ARM builds.
Andreas Sandberg [Sat, 23 May 2015 12:37:18 +0000 (13:37 +0100)]
build: Don't test for KVM xsave support on ARM
The current build tests for KVM unconditionally check for xsave
support. This obviously never works on ARM since xsave is
x86-specific. This changeset refactors the build tests probing for KVM
support and moves the xsave test to an x86-specific section of
is_isa_kvm_compatible().
Andreas Sandberg [Sat, 23 May 2015 12:37:04 +0000 (13:37 +0100)]
arm: Workaround incorrect HDLCD register order in kernel
Some versions of the kernel incorrectly swap the red and blue color
select registers. This changeset adds a workaround for that by
swapping them when instantiating a PixelConverter.
Andreas Sandberg [Sat, 23 May 2015 12:37:03 +0000 (13:37 +0100)]
base: Redesign internal frame buffer handling
Currently, frame buffer handling in gem5 is quite ad hoc. In practice,
we pass around naked pointers to raw pixel data and expect consumers
to convert frame buffers using the (broken) VideoConverter.
This changeset completely redesigns the way we handle frame buffers
internally. In summary, it fixes several color conversion bugs, adds
support for more color formats (e.g., big endian), and makes the code
base easier to follow.
In the new world, gem5 always represents pixel data using the Pixel
struct when pixels need to be passed between different classes (e.g.,
a display controller and the VNC server). Producers of entire frames
(e.g., display controllers) should use the FrameBuffer class to
represent a frame.
Frame producers are expected to create one instance of the FrameBuffer
class in their constructors and register it with its consumers
once. Consumers are expected to check the dimensions of the frame
buffer when they consume it.
Conversion between the external representation and the internal
representation is supported for all common "true color" RGB formats of
up to 32-bit color depth. The external pixel representation is
expected to be between 1 and 4 bytes in either big endian or little
endian. Color channels are assumed to be contiguous ranges of bits
within each pixel word. The external pixel value is scaled to an 8-bit
internal representation using a floating multiplication to map it to
the entire 8-bit range.
Andreas Sandberg [Sat, 23 May 2015 12:37:01 +0000 (13:37 +0100)]
base: Clean up bitmap generation code
The bitmap generation code is hard to follow and incorrectly uses the
size of an enum member to calculate the size of a pixel. This
changeset cleans up the code and adds some documentation.
Joel Hestness [Tue, 19 May 2015 15:56:51 +0000 (10:56 -0500)]
ruby: Fix RubySystem warm-up and cool-down scope
The processes of warming up and cooling down Ruby caches are simulation-wide
processes, not just RubySystem instance-specific processes. Thus, the warm-up
and cool-down variables should be globally visible to any Ruby components
participating in either process. Make these variables static members and track
the warm-up and cool-down processes as appropriate.
This patch also has two side benefits:
1) It removes references to the RubySystem g_system_ptr, which are problematic
for allowing multiple RubySystem instances in a single simulation. Warmup and
cooldown variables being static (global) reduces the need for instance-specific
dereferences through the RubySystem.
2) From the AbstractController, it removes local RubySystem pointers, which are
used inconsistently with other uses of the RubySystem: 11 other uses reference
the RubySystem with the g_system_ptr. Only sequencers have local pointers.
Andreas Hansson [Fri, 15 May 2015 17:40:01 +0000 (13:40 -0400)]
arm: Identify table-walker requests
This patch ensures all page-table walks are flagged as such.
Andreas Hansson [Fri, 15 May 2015 17:39:53 +0000 (13:39 -0400)]
misc: Appease gcc 5.1
Three minor issues are resolved:
1. Apparently gcc 5.1 does not like negation of booleans followed by
bitwise AND.
2. Somehow the compiler also gets confused and warns about
NoopMachInst being unused (removing it causes compilation errors
though). Most likely a compiler bug.
3. There seems to be a number of instances where loop unrolling causes
false positives for the array-bounds check. For now, switch to
std::array. Potentially we could disable the warning for newer gcc
versions, but switching to std::array is probably a good move in
any case.
Andreas Sandberg [Fri, 15 May 2015 17:39:44 +0000 (13:39 -0400)]
sim: Don't clear the active CPU vector in System::initState
The system class currently clears the vector of active CPUs in
initState(). CPUs are added to the list by registerThreadContext()
which is called from BaseCPU::init(). This obviously breaks when the
System object is initialized after the CPUs. This changeset removes
the offending clear() call since the list will be empty after it has
been instantiated anyway.
Andreas Hansson [Fri, 15 May 2015 17:38:46 +0000 (13:38 -0400)]
config: Use null memory for DRAM sweep script
Do not waste time when we do not care about the data.
Wendy Elsasser [Fri, 15 May 2015 17:38:45 +0000 (13:38 -0400)]
config: Add new MemConfig options to DRAM sweep script
Update script to match current MemConfig options with
external_memory_system option set to 0.
Steve Reinhardt [Tue, 5 May 2015 16:25:59 +0000 (09:25 -0700)]
syscall_emul: fix warn_once behavior
The current ignoreWarnOnceFunc doesn't really work as expected,
since it will only generate one warning total, for whichever
"warn-once" syscall is invoked first. This patch fixes that
behavior by keeping a "warned" flag in the SyscallDesc object,
allowing suitably flagged syscalls to warn exactly once per
syscall.
Andreas Hansson [Tue, 5 May 2015 07:22:48 +0000 (03:22 -0400)]
stats, arm: Update stats for missing FPEXC.EN check
Only one regression is affected.
Andreas Hansson [Tue, 5 May 2015 07:22:45 +0000 (03:22 -0400)]
arm: Add missing FPEXC.EN check
Add a missing check to ensure that exceptions are generated properly.
Giacomo Gabrielli [Tue, 5 May 2015 07:22:42 +0000 (03:22 -0400)]
arm: enable DCZVA by default in SE mode
Andreas Hansson [Tue, 5 May 2015 07:22:39 +0000 (03:22 -0400)]
stats: Update stats to reflect cache changes
Stephan Diestelhorst [Tue, 17 Mar 2015 11:50:55 +0000 (11:50 +0000)]
mem: Create a request copy for deferred snoops
Sometimes, we need to defer an express snoop in an MSHR, but the original
request might complete and deallocate the original pkt->req. In those cases,
create a copy of the request so that someone who is inspecting the delayed
snoop can also inspect the request still. All of this is rather hacky, but the
allocation / linking and general life-time management of Packet and Request is
rather tricky. Deleting the copy is another tricky area, testing so far has
shown that the right copy is deleted at the right time.
Andreas Sandberg [Tue, 5 May 2015 07:22:34 +0000 (03:22 -0400)]
arm: Relax ordering for some uncacheable accesses
We currently assume that all uncacheable memory accesses are strictly
ordered. Instead of always enforcing strict ordering, we now only
enforce it if the required memory type is device memory or strongly
ordered memory.
Andreas Sandberg [Tue, 5 May 2015 07:22:33 +0000 (03:22 -0400)]
mem, cpu: Add a separate flag for strictly ordered memory
The Request::UNCACHEABLE flag currently has two different
functions. The first, and obvious, function is to prevent the memory
system from caching data in the request. The second function is to
prevent reordering and speculation in CPU models.
This changeset gives the order/speculation requirement a separate flag
(Request::STRICT_ORDER). This flag prevents CPU models from doing the
following optimizations:
* Speculation: CPU models are not allowed to issue speculative
loads.
* Write combining: CPU models and caches are not allowed to merge
writes to the same cache line.
Note: The memory system may still reorder accesses unless the
UNCACHEABLE flag is set. It is therefore expected that the
STRICT_ORDER flag is combined with the UNCACHEABLE flag to prevent
this behavior.
Andreas Sandberg [Tue, 5 May 2015 07:22:31 +0000 (03:22 -0400)]
mem, alpha: Move Alpha-specific request flags
Move Alpha-specific memory request flags to an architecture-specific
header and map them to the architecture specific flag bit range.
Andreas Hansson [Tue, 5 May 2015 07:22:30 +0000 (03:22 -0400)]
arm: Remove unnecessary boot uncachability
With the recent patches addressing how we deal with uncacheable
accesses there is no longer need for the work arounds put in place to
enforce certain sections of memory to be uncacheable during boot.
Andreas Hansson [Tue, 5 May 2015 07:22:29 +0000 (03:22 -0400)]
mem: Snoop into caches on uncacheable accesses
This patch takes a last step in fixing issues related to uncacheable
accesses. We do not separate uncacheable memory from uncacheable
devices, and in cases where it is really memory, there are valid
scenarios where we need to snoop since we do not support cache
maintenance instructions (yet). On snooping an uncacheable access we
thus provide data if possible. In essence this makes uncacheable
accesses IO coherent.
The snoop filter is also queried to steer the snoops, but not updated
since the uncacheable accesses do not allocate a block.
Andreas Hansson [Tue, 5 May 2015 07:22:27 +0000 (03:22 -0400)]
arch, cpu: Do not forward snoops to table walker
This patch simplifies the overall CPU by changing the TLB caches such
that they do not forward snoops to the table walker port(s). Note that
only ARM and X86 are affected.
There is no reason for the ports to snoop as they do not actually take
any action, and from a performance point of view we are better of not
snooping more than we have to.
Should it at a later point be required to snoop for a particular TLB
design it is easy enough to add it back.
Andreas Hansson [Tue, 5 May 2015 07:22:26 +0000 (03:22 -0400)]
mem: Pass shared downstream through caches
This patch ensures that we pass on information about a packet being
shared (rather than exclusive), when forwarding a packet downstream.
Without this patch there is a risk that a downstream cache considers
the line exclusive when it really isn't.
Ali Jafri [Tue, 5 May 2015 07:22:25 +0000 (03:22 -0400)]
mem: Add forward snoop check for HardPFReqs
We should always check whether the cache is supposed to be forwarding snoops
before generating snoops.
Andreas Hansson [Tue, 5 May 2015 07:22:24 +0000 (03:22 -0400)]
mem: Add missing stats update for uncacheable MSHRs
This patch adds a missing counter update for the uncacheable
accesses. By updating this counter we also get a meaningful average
latency for uncacheable accesses (previously inf).
Andreas Hansson [Tue, 5 May 2015 07:22:22 +0000 (03:22 -0400)]
mem: Tidy up BaseCache parameters
This patch simply tidies up the BaseCache parameters and removes the
unused "two_queue" parameter.
David Guillen [Tue, 5 May 2015 07:22:21 +0000 (03:22 -0400)]
mem: Remove templates in cache model
This patch changes the cache implementation to rely on virtual methods
rather than using the replacement policy as a template argument.
There is no impact on the simulation performance, and overall the
changes make it easier to modify (and subclass) the cache and/or
replacement policy.
Andreas Hansson [Tue, 5 May 2015 07:22:19 +0000 (03:22 -0400)]
cpu: Work around gcc 4.9 issues with Num_OpClasses
This patch fixes a recent issue with gcc 4.9 (and possibly more) being
convinced that indices outside the array bounds are used when
initialising the FUPool members.
Andreas Hansson [Tue, 5 May 2015 07:22:17 +0000 (03:22 -0400)]
stats: Bring regression stats in line with actual behaviour
Nilay Vaish [Thu, 30 Apr 2015 19:17:43 +0000 (14:17 -0500)]
stats: arm: updates
Nilay Vaish [Thu, 30 Apr 2015 03:35:23 +0000 (22:35 -0500)]
stats: x86: updates due to change in div latency
Ruslan Bukin [Thu, 30 Apr 2015 03:35:23 +0000 (22:35 -0500)]
arch, base, dev, kern, sym: FreeBSD support
This adds support for FreeBSD/aarch64 FS and SE mode (basic set of syscalls only)
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Rizwana Begum [Thu, 30 Apr 2015 03:35:22 +0000 (22:35 -0500)]
mem: Simplify page close checks for adaptive policies
Both open_adaptive and close_adaptive page polices keep the page
open if a row hit is found. If a row hit is not found, close_adaptive
page policy precharges the row, and open_adaptive policy precharges
the row only if there is a bank conflict request waiting in the queue.
This patch makes the checks for above conditions simpler.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Nilay Vaish [Thu, 30 Apr 2015 03:35:22 +0000 (22:35 -0500)]
ruby: set: replace long by unsigned long
UBSan complains about negative value being shifted
Nilay Vaish [Thu, 30 Apr 2015 03:35:22 +0000 (22:35 -0500)]
cpu: o3: replace issueLatency with bool pipelined
Currently, each op class has a parameter issueLat that denotes the cycles after
which another op of the same class can be issued. As of now, this latency can
either be one cycle (fully pipelined) or same as execution latency of the op
(not at all pipelined). The fact that issueLat is a parameter of type Cycles
makes one believe that it can be set to any value. To avoid the confusion, the
parameter is being renamed as 'pipelined' with type boolean. If set to true,
the op would execute in a fully pipelined fashion. Otherwise, it would execute
in an unpipelined fashion.
Nilay Vaish [Thu, 30 Apr 2015 03:35:22 +0000 (22:35 -0500)]
cpu: o3: single cycle default div microop latency on x86
This patch sets the default latency of the division microop to a single cycle
on x86. This is because the division instructions DIV and IDIV have been
implemented as loops of div microops, where each microop computes a single bit
of the quotient.
Nilay Vaish [Thu, 30 Apr 2015 03:35:22 +0000 (22:35 -0500)]
x86: change divide-by-zero fault to divide-error
Same exception is raised whether division with zero is performed or the
quotient is greater than the maximum value that the provided space can hold.
Divide-by-Zero is the AMD terminology, while Divide-Error is Intel's.
Andreas Hansson [Fri, 24 Apr 2015 07:30:08 +0000 (03:30 -0400)]
misc: Appease gcc 5.1 without moving GDB_REG_BYTES
This patch rolls back the move of the GDB_REG_BYTES constant, and
instead adds M5_VAR_USED.
bpotter [Thu, 23 Apr 2015 20:40:18 +0000 (13:40 -0700)]
config: enable setting SE-mode environment variables from file
Rene de Jong [Thu, 23 Apr 2015 17:37:50 +0000 (13:37 -0400)]
arm, dev: Add a UFS device
This patch introduces a UFS host controller and a UFS device. More
information about the UFS standard can be found at the JEDEC site:
http://www.jedec.org/standards-documents/results/jesd220
Note that the model does not implement the complete standard, and as
such is not an actual implementation of UFS. The following SCSI
commands are implemented: inquiry, read, read capacity, report LUNs,
start/stop, test unit ready, verify, write, format unit, send
diagnostic, synchronize cache, mode select, mode sense, request sense,
unmap, write buffer and read buffer. This is sufficient for usage with
Linux and Android.
To interact with this model a kernel version 3.9 or above is
needed.
Rene de Jong [Thu, 23 Apr 2015 17:37:49 +0000 (13:37 -0400)]
arm, dev: Add a NAND flash timing model
This adds a NAND flash timing model. This model takes the number of
planes into account and is ultimately intended to be used as a
high-level performance model for any device using flash. To access the
memory, use either readMemory or writeMemory.
To make use of the model you will need an interface model
such as UFSHostDevice, which is part of a separate patch.
At the moment the flash device is part of the ARM device tree since
the only use if the UFSHostDevice, and that in turn relies on the ARM
GIC.
Peter Enns [Thu, 23 Apr 2015 17:37:48 +0000 (13:37 -0400)]
dev: Add support for i2c devices
This patch adds an I2C bus and base device. I2C is used to connect a
variety of sensors, and this patch serves as a starting point to
enable a range of I2C devices.
Andreas Hansson [Thu, 23 Apr 2015 17:37:46 +0000 (13:37 -0400)]
misc: Appease gcc 5.1
This patch fixes a few small issues to ensure gem5 compiles when using
gcc 5.1.
First, the GDB_REG_BYTES in the RemoteGDB header are, rather
surprisingly, flagged as unused for both ARM and X86. Removing them,
however, causes compilation errors as they are actually used in the
source file. Moving the constant into the class definition fixes the
issue. Possibly a gcc bug.
Second, we have an unused EthPktData constructor using auto_ptr, and
the latter is deprecated. Since the code is never used it is simply
removed.
Steve Reinhardt [Thu, 23 Apr 2015 03:22:29 +0000 (20:22 -0700)]
stats: update for previous changeset
Very small differences in IQ-specific O3 stats.
Brandon Potter [Wed, 22 Apr 2015 14:52:03 +0000 (07:52 -0700)]
cpu: remove conditional check (count > 0) on o3 IQ squashes
The o3 cpu instruction queue model uses the count variable to track the number
of unissued instructions in the queue. Previously, the squash method used
this variable to avoid executing the doSquash method when there were no
unissued instructions in the pipeline. A corner case problem exists when
only issued instructions exist in the pipeline and a squash occurs; the
doSquash code is not invoked and subsequently does not clean up state properly.
Brandon Potter [Wed, 22 Apr 2015 14:51:27 +0000 (07:51 -0700)]
syscall_emul: implement clock_gettime system call
Monir Mozumder [Wed, 22 Apr 2015 14:51:27 +0000 (07:51 -0700)]
syscall_emul: update x86 syscall table
Update table with additional definitions through Linux 3.13.
Brandon Potter [Wed, 22 Apr 2015 14:51:27 +0000 (07:51 -0700)]
syscall_emul: update getrlimit to use warn
Don't use std::cerr directly, and just return EINVAL instead of aborting.
Brandon Potter [Wed, 22 Apr 2015 14:51:27 +0000 (07:51 -0700)]
syscall_emul: fix warning with wrong syscall name
Also nix extra whitespace.
Brandon Potter [Wed, 22 Apr 2015 14:51:27 +0000 (07:51 -0700)]
base: add new ChunkGenerator method to identify last chunk
Steve Reinhardt [Mon, 20 Apr 2015 22:09:43 +0000 (15:09 -0700)]
stats: update a few stats from long O3 runs
Very small changes to iew.predictedNotTakenIncorrect
and iew.branchMispredicts. Looks like similar updates
were committed on April 3 (changeset
235ff1c046df), but
only for the quick tests.
Andreas Hansson [Mon, 20 Apr 2015 16:46:35 +0000 (12:46 -0400)]
cpu: Remove the InOrderCPU from the tree
This patch takes the final step in removing the InOrderCPU from the
tree. Rest in peace.
The MinorCPU is now used to model an in-order microarchitecture, and
long term the MinorCPU will eventually be renamed InOrderCPU.
Andreas Hansson [Mon, 20 Apr 2015 16:46:29 +0000 (12:46 -0400)]
config: Remove memory aliases and rely on class name
Instead of maintaining two lists, rely entirely on the class
name. There is really no point in causing unecessary confusion.
Nilay Vaish [Wed, 15 Apr 2015 21:04:37 +0000 (16:04 -0500)]
Added tag stable_2015_04_15 for changeset
e17949745150
Nilay Vaish [Tue, 14 Apr 2015 16:01:11 +0000 (11:01 -0500)]
stats: x86: changes due to recent patches
The change in 20.parser is from new x87 instructions. The change to
pc-o3-timing is not clear to me. It seems that this test might be invoking
some undefined behavior.
Malek Musleh [Tue, 14 Apr 2015 16:01:10 +0000 (11:01 -0500)]
config, cpu: fix progress interval for switched CPUs
This patch ensures that the CPU progress Event is triggered for the new set of
switched_cpus that get scheduled (e.g. during fast-forwarding). it also avoids
printing the interval state if the cpu is currently switched out.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Dibakar Gope [Mon, 13 Apr 2015 22:33:57 +0000 (17:33 -0500)]
cpu: re-organizes the branch predictor structure.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Nilay Vaish [Mon, 13 Apr 2015 22:33:57 +0000 (17:33 -0500)]
x86: implements x87 mult/div instructions
Lena Olson [Mon, 13 Apr 2015 22:33:57 +0000 (17:33 -0500)]
ruby: allow restoring from checkpoint when using DRAMCtrl
Restoring from a checkpoint with ruby + the DRAMCtrl memory model was not
working, because ruby and DRAMCtrl disagreed on the current tick during warmup.
Since there is no reason to do timing requests during warmup, use functional
requests instead.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Nilay Vaish [Mon, 13 Apr 2015 22:33:57 +0000 (17:33 -0500)]
sim: Use NULL instead of None for testing filenames.
The filenames are initialized with NULL. So the test should be
checking for them to be == NULL instead == None.
Nilay Vaish [Mon, 13 Apr 2015 22:33:57 +0000 (17:33 -0500)]
sim: fix function for emulating dup()
The function was using the host fd to obtain the fd object from the simulated
process.
Curtis Dunham [Wed, 8 Apr 2015 20:56:06 +0000 (15:56 -0500)]
config: Support full-system with SST's memory system
This patch adds an example configuration in ext/sst/tests/ that allows
an SST/gem5 instance to simulate a 4-core AArch64 system with SST's
memHierarchy components providing all the caches and memories.
Curtis Dunham [Wed, 8 Apr 2015 20:56:06 +0000 (15:56 -0500)]
ext: Add SST connector
This patch adds a connector that allows gem5 to be used as a component
in SST (Structural Simulation Toolkit, sst-simulator.org). At a high
level, this allows memory traffic to pass between the two simulators.
SST Links are roughly analogous to gem5 Ports, although Links do not
have a notion of master and slave. This distinction is important to
gem5, so when connecting a gem5 CPU to an SST cache, an ExternalSlave
must be used, and similarly when connecting the memory side of SST cache
to a gem5 port (for memory <-> I/O), an ExternalMaster must be used.
These connectors handle the administrative aspects of gem5
(initialization, simulation, shutdown) as well as translating SST's
MemEvents into gem5 Packets and vice-versa.
Nilay Vaish [Fri, 3 Apr 2015 16:42:11 +0000 (11:42 -0500)]
stats: updates due to recent changesets.
Nikos Nikoleris [Fri, 3 Apr 2015 16:42:10 +0000 (11:42 -0500)]
dev: (un)serialize fix for the RTC and RTC Timer Interrupt events
Restoring from a checkpoint fails if either the RTC or the RTC Timer
Interrrupt event is disabled. The restored machine tried incorrectly
to schedule the next event with negative offset.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Ruslan Bukin [Fri, 3 Apr 2015 16:42:10 +0000 (11:42 -0500)]
sim: correct check for endianess
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Ruslan Bukin [Fri, 3 Apr 2015 16:42:10 +0000 (11:42 -0500)]
dev: Extend access width for IDE control registers
Add 32-bit access width for PrimaryTiming register and 16bit for UDMAControl
register as FreeBSD required.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Nikos Nikoleris [Fri, 3 Apr 2015 16:42:10 +0000 (11:42 -0500)]
cpu: fix system total instructions accounting
The totalInstructions counter is only incremented when the whole instruction is
commited and not on every microop. It was incorrectly reset in atomic and
timing cpus.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>"
Lena Olson [Fri, 3 Apr 2015 16:42:10 +0000 (11:42 -0500)]
x86: fix debug trace output for mwait
When running with the Exec flag, the mwait instruction attempted
to print out its source registers, which were never actually
initialized. This led to sporadic assertion failures when the
value stored there was invalid.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
Andreas Hansson [Fri, 27 Mar 2015 08:56:10 +0000 (04:56 -0400)]
arm, configs: Do not forward snoops from I cache
This fix simply tells the I cache to not forward snoops to the fetch
unit (since there is really no reason to do so).
Stephan Diestelhorst [Fri, 27 Mar 2015 08:56:03 +0000 (04:56 -0400)]
mem: Support any number of master-IDs in stride prefetcher
The stride prefetcher had a hardcoded number of contexts (i.e. master-IDs)
that it could handle. Since master IDs need to be unique per system, and
every core, cache etc. requires a separate master port, a static limit on
these does not make much sense.
Instead, this patch adds a small hash map that will map all master IDs to
the right prefetch state and dynamically allocates new state for new master
IDs.