Channel: Intel® Fortran Composer XE

OpenMP and 3D arrays. How to make it fast?


1. I am solving 3D finite-volume temperature fields with Intel Fortran on Windows 7, using my own iterative methods. I have tried many variations of OpenMP directives but never got a speedup of more than a factor of 2, even with 16 cores. What bothers me is that the CPU time recordings in the attached test program are so odd: the CPU time is more or less independent of NTHREADS, and NTHREADS=1 is faster than the ordinary serial loops.

I compile with ifort /c /Qopenmp test3.f90, link with link test3.obj, and run with test3. Test3.f90 is attached together with my test3.exe.

Test3.f90 is written the way it is because the typical loop in my 'big' program looks like this:

do k=1,Nz
  do j=1,Ny
    do i=1,Nx
      c(i,j,k)=a(i,j,k)*b(i,j,k)   ! + other matrix elements - other matrix elements
    enddo
  enddo
enddo
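
For reference, here is a minimal, self-contained sketch of the loop pattern above with a parallel do directive. It is not taken from test3.f90: the array sizes, the fill values, and the program name are assumptions made only for illustration. It records both CPU_TIME and omp_get_wtime, because on many implementations CPU_TIME reports processor time summed over all threads, so it need not drop as threads are added, while the wall-clock time is the number that would show a speedup.

```fortran
program loop_sketch
  use omp_lib
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: Nx = 200, Ny = 200, Nz = 200   ! assumed sizes
  real(dp), allocatable :: a(:,:,:), b(:,:,:), c(:,:,:)
  real(dp) :: wall0, wall1
  real :: cpu0, cpu1
  integer :: i, j, k

  allocate(a(Nx,Ny,Nz), b(Nx,Ny,Nz), c(Nx,Ny,Nz))
  a = 1.0_dp    ! assumed fill values
  b = 2.0_dp

  call cpu_time(cpu0)          ! processor time, may be summed over threads
  wall0 = omp_get_wtime()      ! elapsed wall-clock time in seconds

  ! The outermost loop (k) is split across threads; i stays innermost so
  ! each thread walks memory contiguously (Fortran is column-major).
  !$omp parallel do private(i,j,k)
  do k = 1, Nz
    do j = 1, Ny
      do i = 1, Nx
        c(i,j,k) = a(i,j,k)*b(i,j,k)
      end do
    end do
  end do
  !$omp end parallel do

  wall1 = omp_get_wtime()
  call cpu_time(cpu1)

  print *, 'max threads =', omp_get_max_threads()
  print *, 'wall time   =', wall1 - wall0
  print *, 'cpu time    =', cpu1 - cpu0
  print *, 'mean of c   =', sum(c)/size(c)   ! 2.0 for the fill values above
end program loop_sketch
```

The thread count can be varied with the OMP_NUM_THREADS environment variable before running. A loop this simple does very little arithmetic per array element, so its speedup may be limited by memory bandwidth rather than by the number of cores.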

Q: Does this type of loop structure impede the use of OpenMP, and how can I improve it? I also have loops like

  do i=1,N ; some arrays a(3,i) ; enddo 

which also do not run faster with OpenMP.

Q: Are there special compiler directives that would help?

Q: What should I use for diagnostics? (I have to admit that I use the old-fashioned way of compiling and linking with .bat files. I do not use the visual mode.)

Q: Can anyone tell me the main tripping hazards for a newbie in OpenMP?

2. The good news is that other loops, like this one, scale with the number of threads as expected:

!$omp parallel private(i,j,k) reduction(+:prod)
!$omp do
do k=1,Nz
  do j=1,Ny
    do i=1,Nx
      prod=prod+a(i,j,k)*b(i,j,k)
    enddo
  enddo
enddo
!$omp end do
!$omp end parallel
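
A self-contained version of this reduction is sketched below so that its result can be checked against the serial intrinsic sum(a*b). The array sizes, the fill values, and the program name are assumptions for illustration and do not come from test3.f90. It uses the combined parallel do form, and it sets prod to zero before the parallel region, because the value prod holds on entry is combined with the threads' partial sums at the end.

```fortran
program reduction_sketch
  use omp_lib
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: Nx = 100, Ny = 100, Nz = 100   ! assumed sizes
  real(dp), allocatable :: a(:,:,:), b(:,:,:)
  real(dp) :: prod
  integer :: i, j, k

  allocate(a(Nx,Ny,Nz), b(Nx,Ny,Nz))
  a = 0.5_dp    ! assumed fill values
  b = 4.0_dp

  prod = 0.0_dp   ! initial value is added to the reduced result

  ! Each thread accumulates into its own private copy of prod (started
  ! at zero); the copies are summed when the loop ends.
  !$omp parallel do private(i,j,k) reduction(+:prod)
  do k = 1, Nz
    do j = 1, Ny
      do i = 1, Nx
        prod = prod + a(i,j,k)*b(i,j,k)
      end do
    end do
  end do
  !$omp end parallel do

  print *, 'reduction =', prod
  print *, 'reference =', sum(a*b)   ! serial check with the intrinsic
end program reduction_sketch
```

With general data the two printed values can differ in the last digits, since the order of the floating-point additions depends on how the iterations are distributed over threads.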

Now I am confused and wonder what goes wrong with my matrix element multiplication.

Hoping someone can provide me with a key idea.

Best regards, Johannes

Attachment         Size
test3_0.f90        1.2 KB
test3_0.zip        258.31 KB
