numactl --interleave=all ./testing_cgeev -RN -N 100 -N 1000 --range 10:90:10 --range 100:900:100 --range 1000:9000:1000 --range 10000:20000:2000
MAGMA 1.6.1  compiled for CUDA capability >= 3.5
CUDA runtime 7000, driver 7000. OpenMP threads 16. MKL 11.2.3, MKL threads 16. 
ndevices 3
device 0: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
device 1: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
device 2: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
Usage: ./testing_cgeev [options] [-h|--help]

    N   CPU Time (sec)   GPU Time (sec)   |W_magma - W_lapack| / |W_lapack|
===========================================================================
  100     ---               0.0370
 1000     ---               1.1410
   10     ---               0.0005
   20     ---               0.0008
   30     ---               0.0013
   40     ---               0.0039
   50     ---               0.0047
   60     ---               0.0054
   70     ---               0.0080
   80     ---               0.0108
   90     ---               0.0124
  100     ---               0.0159
  200     ---               0.0589
  300     ---               0.0937
  400     ---               0.1403
  500     ---               0.1971
  600     ---               0.3611
  700     ---               0.4680
  800     ---               0.5535
  900     ---               0.6994
 1000     ---               0.7811
 2000     ---               2.4442
 3000     ---               7.2809
 4000     ---              11.6277
 5000     ---              17.5651
 6000     ---              31.0414
 7000     ---              41.7092
 8000     ---              53.7020
 9000     ---              67.8587
10000     ---              81.7743
12000     ---             124.0545
14000     ---             167.2559
16000     ---             230.2912
18000     ---             290.8537
20000     ---             373.4943

numactl --interleave=all ./testing_cgeev -RV -N 100 -N 1000 --range 10:90:10 --range 100:900:100 --range 1000:9000:1000 --range 10000:20000:2000
MAGMA 1.6.1  compiled for CUDA capability >= 3.5
CUDA runtime 7000, driver 7000. OpenMP threads 16. MKL 11.2.3, MKL threads 16. 
ndevices 3
device 0: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
device 1: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
device 2: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
Usage: ./testing_cgeev [options] [-h|--help]

    N   CPU Time (sec)   GPU Time (sec)   |W_magma - W_lapack| / |W_lapack|
===========================================================================
  100     ---               0.0200
 1000     ---               1.2059
   10     ---               0.0019
   20     ---               0.0025
   30     ---               0.0027
   40     ---               0.0052
   50     ---               0.0078
   60     ---               0.0070
   70     ---               0.0097
   80     ---               0.0121
   90     ---               0.0143
  100     ---               0.0233
  200     ---               0.0717
  300     ---               0.1684
  400     ---               0.2235
  500     ---               0.2950
  600     ---               0.6602
  700     ---               0.7326
  800     ---               0.9360
  900     ---               1.1744
 1000     ---               0.9703
 2000     ---               3.8646
 3000     ---              11.9488
 4000     ---              17.3995
 5000     ---              29.6873
 6000     ---              47.0757
 7000     ---              66.3783
 8000     ---              86.6015
 9000     ---             111.6952
10000     ---             148.4911
12000     ---             218.9101
14000     ---             318.8500
16000     ---             446.7903
18000     ---             611.1254
20000     ---             799.2029
