
Re: Rep:[f-cpu] A simple SIMD extension for C(++)



There is another potential improvement when several operations are
applied to many arrays.

Imagine a mul, then an add, then an accumulation (arrays added together):
it could speed things up if the 3 operations were performed one after
another on each chunk, instead of sweeping over the arrays 3 times (to
stay inside the cache!)
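
A minimal sketch of that idea in C++, under some assumptions that are not from the thread: the "accumulation" is read here as a sum into one scalar, all arrays have the same length, and the names three_passes, fused_chunks and kChunk are made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Three separate passes: each loop walks the whole array, so for large
// inputs the temporary is streamed through the cache three times.
int three_passes(const std::vector<int>& a, const std::vector<int>& b,
                 const std::vector<int>& c) {
    std::vector<int> tmp(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) tmp[i] = a[i] * b[i];  // mul
    for (std::size_t i = 0; i < a.size(); ++i) tmp[i] += c[i];        // add
    int sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i) sum += tmp[i];         // accumulate
    return sum;
}

// Chunked version: the three operations run back to back on one chunk
// before moving to the next, so the chunk is still in cache when reused.
// kChunk is a hypothetical tuning value, not a figure from the thread.
int fused_chunks(const std::vector<int>& a, const std::vector<int>& b,
                 const std::vector<int>& c) {
    constexpr std::size_t kChunk = 1024;
    int tmp[kChunk];
    int sum = 0;
    for (std::size_t start = 0; start < a.size(); start += kChunk) {
        const std::size_t n = std::min(kChunk, a.size() - start);
        for (std::size_t i = 0; i < n; ++i) tmp[i] = a[start + i] * b[start + i];
        for (std::size_t i = 0; i < n; ++i) tmp[i] += c[start + i];
        for (std::size_t i = 0; i < n; ++i) sum += tmp[i];
    }
    return sum;
}
```

Both functions return the same value; only the order in which memory is touched differs. Whether the chunked form is actually faster depends on the array size and the cache, so it would have to be measured.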

nicO

Nicolas Boulay wrote:
> 
> In the C++ standard library there is a type called valarray: it is a
> vector of "something", and if you apply an operator to it, the operation
> is done value by value, as SIMD does!
> 
> So such a thing could be done for valarray<int>, valarray<short>, ...
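
For illustration, a small sketch of what the element-wise operators of std::valarray look like in use. The helper name mul_add is made up here, and whether a given compiler turns this into SIMD instructions is not guaranteed by the standard.

```cpp
#include <valarray>

// Element-wise multiply-add on whole arrays; no explicit loop is written.
// mul_add is a hypothetical helper name used only for this example.
std::valarray<int> mul_add(const std::valarray<int>& a,
                           const std::valarray<int>& b,
                           const std::valarray<int>& c) {
    // operator* and operator+ are applied value by value.
    return std::valarray<int>(a * b + c);
}
```

The operands are expected to have the same size; the standard leaves the behaviour undefined otherwise.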
> 
> nicO
> 
> -----Original Message-----
> From: Michael Riepe <michael@stud.uni-hannover.de>
> To: F-CPU Mailing List <f-cpu@seul.org>
> Date: 16/05/02
> Subject: [f-cpu] A simple SIMD extension for C(++)
> 
> I read an old article about the shortcomings of C/C++ the other day. The
> author complained that there is no way to apply a function to a number
> of arguments (like the lisp functions `map', `mapcar' & friends do).
> He didn't mention it, but a `map' function would be valuable for SIMD
> processors, too: it explicitly tells the compiler that the same function
> is going to be applied to many arguments in a uniform way. E.g. an
> instruction like
> 
>  __map__(result_array, function, count, array_1, array_2);
> 
> (or similar) would be largely equivalent to
> 
>  for (int i = 0; i < count; i++) {
>   result_array[i] = function(array_1[i], array_2[i]);
>  }
> 
> except that the former may apply the function to the array elements in
> any order (that is, even in parallel -- exactly what SIMD does). Of course
> the construct should not be limited to functions with two operands.
> But the number and element types of the array arguments passed to
> __map__ must match the prototype of the function. Therefore, __map__
> itself can't be implemented as a C function or preprocessor macro.
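
As a point of comparison on the C++ side only: a variadic function template can accept any number of input arrays and have the compiler check them against the function's parameters. The sketch below uses a made-up name, map_into, and is not the proposed __map__ built-in. It is an ordinary sequential loop, so it does not by itself tell the compiler that the elements may be processed in any order.

```cpp
#include <cstddef>

// Hypothetical helper, not the __map__ extension itself: applies f
// element-wise to any number of input arrays and stores the results.
// A call where f cannot take one element of each array fails to compile.
template <typename R, typename F, typename... Args>
void map_into(R* result, F f, std::size_t count, const Args*... arrays) {
    for (std::size_t i = 0; i < count; ++i) {
        result[i] = f(arrays[i]...);  // one element from each input array
    }
}
```

With two int arrays a and b and an output array out, a call such as map_into(out, add, 3, a, b) fills out with the pairwise results of a two-argument function add. For one or two inputs, std::transform from <algorithm> already covers the same ground.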
> 
> Note that the extension is fully backwards compatible: the compiler may,
> at its discretion, replace the mapping with a conventional `for' loop
> without changing the semantics of the program. On the other hand, it can
> easily find out that <function> is to be applied to <count> successive
> elements of <array_1> and <array_2>, which enables it to perform SIMD
> optimizations (and/or loop unrolling) without prior data flow analysis.
> 
> --
>  Michael "Tired" Riepe <Michael.Riepe@stud.uni-hannover.de>
>  "All I wanna do is have a little fun before I die"
> *************************************************************
> To unsubscribe, send an e-mail to majordomo@seul.org with
> unsubscribe f-cpu       in the body. http://f-cpu.seul.org/