From 6b372748499ea3b7c5aab46e167140b575dc94f4 Mon Sep 17 00:00:00 2001
From: Luke Kenneth Casson Leighton
Date: Wed, 5 Dec 2018 03:57:47 +0000
Subject: [PATCH] add RVV spec link

---
 3d_gpu/microarchitecture.mdwn | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/3d_gpu/microarchitecture.mdwn b/3d_gpu/microarchitecture.mdwn
index 6e4eb9522..c2f3060bc 100644
--- a/3d_gpu/microarchitecture.mdwn
+++ b/3d_gpu/microarchitecture.mdwn
@@ -109,6 +109,23 @@ called the flip-flops orchestrating the timing "collectors".
 
 ----
 
+Justification for Branch Prediction
+
+
+
+We can combine several branch predictors to make a decent predictor:
+call/return predictor -- important as it can predict calls and returns
+with around 99.8% accuracy; loop predictor -- basically counts loop
+iterations; some kind of global predictor -- handles everything else.
+
+We will also want a BTB; a smaller one will work. It reduces the average
+branch cycle count from 2-3 to 1, since it predicts which instructions
+are taken branches while the instructions are still being fetched,
+allowing the fetch to go to the target address on the next clock rather
+than having to wait for the fetched instructions to be decoded.
+
+----
+
 For GPU workloads FP64 is not common so I think having 1 FP64 alu would be
 sufficient. Since indexed loads and stores are not supported, it will be
 important to support 4x64 integer operations to generate addresses
-- 
2.30.2
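
The predictor combination described in the patch above can be sketched as a small behavioural model. This is only an illustration, not the project's design: every class and function name here (`ReturnStack`, `LoopPredictor`, `GlobalPredictor`, `BTB`, `predict_direction`) is hypothetical, the table sizes are arbitrary, and the gshare-style indexing is just one assumed choice for "some kind of global predictor".

```python
class ReturnStack:
    """Call/return predictor: push the return address on a call, pop it on a return."""
    def __init__(self, depth=8):  # depth is an arbitrary assumption
        self.depth = depth
        self.stack = []

    def on_call(self, return_addr):
        if len(self.stack) == self.depth:
            self.stack.pop(0)  # overflow: drop the oldest entry
        self.stack.append(return_addr)

    def predict_return(self):
        return self.stack.pop() if self.stack else None


class LoopPredictor:
    """Loop predictor: learns a loop branch's trip count, then predicts the exit."""
    def __init__(self):
        self.trip = {}   # pc -> number of taken iterations before the last exit
        self.count = {}  # pc -> taken iterations seen in the current run

    def predict(self, pc):
        if pc not in self.trip:
            return None  # no opinion until one full loop has been observed
        return self.count.get(pc, 0) < self.trip[pc]

    def update(self, pc, taken):
        if taken:
            self.count[pc] = self.count.get(pc, 0) + 1
        else:
            self.trip[pc] = self.count.get(pc, 0)
            self.count[pc] = 0


class GlobalPredictor:
    """Fallback: 2-bit saturating counters indexed by pc XOR global history."""
    def __init__(self, bits=10):  # table size is an arbitrary assumption
        self.mask = (1 << bits) - 1
        self.table = [1] * (1 << bits)  # start weakly not-taken
        self.history = 0

    def predict(self, pc):
        return self.table[(pc ^ self.history) & self.mask] >= 2

    def update(self, pc, taken):
        i = (pc ^ self.history) & self.mask
        self.table[i] = min(3, self.table[i] + 1) if taken else max(0, self.table[i] - 1)
        self.history = ((self.history << 1) | int(taken)) & self.mask


class BTB:
    """Small direct-mapped branch target buffer, looked up with the fetch address."""
    def __init__(self, entries=16):  # "a smaller one will work"; 16 is an assumption
        self.entries = entries
        self.table = [None] * entries  # each slot holds (pc, target) or None

    def lookup(self, pc):
        slot = self.table[pc % self.entries]
        return slot[1] if slot and slot[0] == pc else None

    def insert(self, pc, target):
        self.table[pc % self.entries] = (pc, target)


def predict_direction(pc, loop, glob):
    """Prefer the loop predictor when it has an opinion, else use the global one."""
    guess = loop.predict(pc)
    return glob.predict(pc) if guess is None else guess
```

In this model a BTB hit on the fetch address lets fetch redirect to the target without waiting for decode, which is the effect the patch text attributes to the BTB. A miss returns `None`, and the target only becomes known after the instruction is decoded.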