CUDA operator/ overload for

Although the CUDA utils provide convenient vector-type functions, they seem to have screwed up the operator/ overloads for vector types.
In cutil_math.h you can find:


inline __host__ __device__ float4 operator/(float4 a, float s)
{
    float inv = 1.0f / s;
    return a * inv;
}
inline __host__ __device__ float4 operator/(float s, float4 a)
{
    float inv = 1.0f / s;
    return a * inv;
}


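See? The two function bodies are the same! The second overload should compute s / a, but it returns a * (1/s), which is a / s. So I always get wrong values (here 1/a just comes back as a itself) when I try to write: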
    float4 a,inv_a;
    inv_a = 1/a;


No wonder someone told me not to use cuda utils.
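
For what it's worth, here is a minimal sketch of what the scalar-over-vector overload presumably should do. This is my own illustration, not the actual cutil_math.h fix: Vec4 is a made-up stand-in for float4 so the snippet compiles as plain C++, and in real CUDA code you would add __host__ __device__ and use float4 instead.

```cpp
// Vec4 is a hypothetical stand-in for CUDA's float4 (assumption, so this builds without nvcc).
struct Vec4 { float x, y, z, w; };

// Component-wise scale, mirroring the vector * scalar helper the original code relies on.
inline Vec4 operator*(Vec4 a, float s)
{
    return Vec4{a.x * s, a.y * s, a.z * s, a.w * s};
}

// vector / scalar: multiplying by the reciprocal of s is correct here.
inline Vec4 operator/(Vec4 a, float s)
{
    float inv = 1.0f / s;
    return a * inv;
}

// scalar / vector: every component of a is a divisor, so divide component by component.
inline Vec4 operator/(float s, Vec4 a)
{
    return Vec4{s / a.x, s / a.y, s / a.z, s / a.w};
}
```

With a = (1, 2, 4, 8), 1.0f / a now gives (1, 0.5, 0.25, 0.125), whereas the buggy body hands back a unchanged.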

PS. The code has been corrected in CUDA 4.0+.
