libstdc++: Fix mask reduction of simd_mask<double> on POWER7
POWER7 does not support __vector long long reductions, making the
generic _S_popcount implementation ill-formed. Specializing _S_popcount
for PPC allows optimization and avoids the issue.
libstdc++-v3/ChangeLog:
* include/experimental/bits/simd.h: Add __have_power10vec
conditional on _ARCH_PWR10.
* include/experimental/bits/simd_builtin.h: Forward declare
_MaskImplPpc and use it as _MaskImpl when __ALTIVEC__ is
defined.
(_MaskImplBuiltin::_S_some_of): Call _S_popcount from the
_SuperImpl for optimizations and correctness.
* include/experimental/bits/simd_ppc.h: Add _MaskImplPpc.
(_MaskImplPpc::_S_popcount): Implement via vec_cntm for POWER10.
Otherwise, for >=int use -vec_sums divided by a sizeof factor.
For <int use -vec_sums(vec_sum4s(...)) to sum all mask entries.