c++ - Crash in Thrust sorting example


I am trying the first example from the official website, https://developer.nvidia.com/thrust, with the vector size changed to 32 << 23. The code looks like this:

    #include <thrust/host_vector.h>
    #include <thrust/device_vector.h>
    #include <thrust/generate.h>
    #include <thrust/sort.h>
    #include <thrust/copy.h>
    #include <algorithm>
    #include <cstdlib>
    #include <iostream>
    #include <time.h>

    using namespace std;

    int main(void) {
      // generate random numbers serially
      thrust::host_vector<int> h_vec(32 << 23);
      std::generate(h_vec.begin(), h_vec.end(), rand);
      std::cout << "1." << time(NULL) << endl;

      // transfer data to the device
      thrust::device_vector<int> d_vec = h_vec;
      cout << "2." << time(NULL) << endl;

      // sort data on the device (846M keys per second on GeForce GTX 480)
      thrust::sort(d_vec.begin(), d_vec.end());

      // transfer data back to the host
      thrust::copy(d_vec.begin(), d_vec.end(), h_vec.begin());
      std::cout << "3." << time(NULL) << endl;

      return 0;
    }

But the program crashed when running the thrust::sort line. I tried std::vector and std::sort as an alternative, and that worked well.

Is this a bug in Thrust? I am using Thrust 1.7 + CUDA 6.5 + Visual Studio 2013 Update 2.

I am using a GeForce GT 740M with 2048M of total memory.

I used Process Explorer to monitor the process and saw that it allocated 1.0G of memory. I have 2G of GPU memory and 16G of main CPU memory.

The error message is "A problem caused the program to stop working correctly. Windows will close the program and notify you if a solution is available. [Debug] [Close program]". After clicking [Debug], I can see the call stack. The issue comes from this line:

thrust::device_vector<int> d_vec = h_vec; 

The last CUDA source frame in the call stack is this:

testcuda.exe!thrust::system::cuda::detail::malloc<thrust::system::cuda::detail::tag>(thrust::system::cuda::detail::execution_policy<thrust::system::cuda::detail::tag> & __formal, unsigned __int64 n) line 48  c++ 

It seems to be a memory allocation issue. But I have 2G of GPU memory and 16G of main CPU memory. Why?

To Robert:

The original example works well, and so do 32<<21 and 32<<22. Is there a virtual memory management system for GPU memory? Does "continuous" here mean physically or virtually contiguous? Is an exception raised in this scenario, and can I catch it?

My test code is here: https://github.com/henrywoo/wufuheng/blob/master/testcuda.cu

In my test there is no exception, just a runtime error.

sizeof(int) * (32 << 23) = 4 * 2^28 = 2^30 bytes
i.e. you are allocating 1 GB of GPU RAM. Most likely, your card cannot handle that many elements. This might be because:

  • there isn't enough free GPU RAM in general
  • there isn't enough contiguous free GPU RAM (this is needed because the vector has to fit in one continuous piece of memory)
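
To make the arithmetic above concrete, here is a minimal host-only sketch in plain C++ (no CUDA required). It computes the bytes needed for the three vector sizes mentioned in the question and compares them against a free-memory figure. The names bytes_needed, could_fit, kMiB and kAssumedIntSize are made up for illustration, and the 4-byte int is an assumption that matches the sizeof(int) used above. On a real device, the free figure would have to come from the CUDA runtime (cudaMemGetInfo reports free and total device memory). A total that fits is only a necessary condition, since the allocation also needs one contiguous block.

```cpp
#include <cstdint>

// Assumption: int is 4 bytes, as in the calculation above.
constexpr std::uint64_t kAssumedIntSize = 4;
constexpr std::uint64_t kMiB = 1024ull * 1024ull;

// Bytes required to store `count` elements of `elem_size` bytes each.
constexpr std::uint64_t bytes_needed(std::uint64_t count, std::uint64_t elem_size) {
    return count * elem_size;
}

// Necessary (not sufficient) check: does the request fit in `free_bytes`?
// It says nothing about fragmentation of the free memory.
constexpr bool could_fit(std::uint64_t bytes, std::uint64_t free_bytes) {
    return bytes <= free_bytes;
}

// The three sizes from the question: 32<<21 and 32<<22 worked, 32<<23 crashed.
static_assert(bytes_needed(32ull << 21, kAssumedIntSize) == 256 * kMiB, "256 MiB");
static_assert(bytes_needed(32ull << 22, kAssumedIntSize) == 512 * kMiB, "512 MiB");
static_assert(bytes_needed(32ull << 23, kAssumedIntSize) == 1024 * kMiB, "1 GiB");
```

For example, could_fit(bytes_needed(32ull << 23, kAssumedIntSize), 2048 * kMiB) is true for the card's nominal 2048M, but the same call with a hypothetical free figure of 900 * kMiB is false. How much is really free on the GT 740M at run time is not known from the question.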
