C++ - OpenMP bug with radial computation
#pragma omp parallel for schedule(static) default(none)
for (int row = 0; row < m_height; row++) {
    for (int col = 0; col < m_width; col++) {
        int rysqr, rxsqr;
        settingsigman(eta, m_rxinitial + col, m_ryinitial + row, rxsqr, rysqr);
        functionusing(rysqr, rxsqr);
    }
}

void cimagepro::settingsigman(int eta, int x, int y, int &rxsqr, int &rysqr, int &returnvalue)
{
    int rsqr = getradius(x, y, rxsqr, rysqr);
    returnvalue = getnumberfromtable(rsqr);
}

int cimagepro::getradius(int x, int y, int &rxsqr, int &rysqr)
{
    if (x == m_rxinitial) {
        rxsqr = m_rxsqrinitial;
        if (y == m_ryinitial) {
            rysqr = m_rysqrinitial;
        } else if (abs(y) % 2 == abs(m_ryinitial) % 2) {
            rysqr = rysqr + (y << 2) + 4; // (y+2)^2
        }
    } else {
        rxsqr = rxsqr + (x << 1) + 1; // (x+1)^2
    }
    return clamp(((rxsqr + rysqr) >> rad_res_reduction), 0, (1 << (rad_res - rad_res_reduction)) - 1);
}
OK, here is the code; the problem is inside the getradius function. Since I have many threads, each thread starts at a different place of x,y. I don't understand the bug inside getradius().
I thought maybe it is the rysqr computation. Can anyone suggest a way to debug this? Or can you see the problem?
Update:
This change has fixed the code, but I still don't understand why there were jumps between the different threads.
int cimagepro::getradius(int x, int y, int &rxsqr, int &rysqr)
{
    if (x == m_rxinitial) {
        rxsqr = m_rxsqrinitial;
    } else {
        rxsqr = x * x;
    }
    if (y == m_ryinitial) {
        rysqr = m_rysqrinitial;
    } else if (abs(y) % 2 == abs(m_ryinitial) % 2) {
        rysqr = y * y;
    }
    return clamp(((rxsqr + rysqr) >> rad_res_reduction), 0, (1 << (rad_res - rad_res_reduction)) - 1);
}
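One possible way to see where the jumps come from: the first version of getradius builds each square from the value rxsqr or rysqr held before the call, so the result is only right if the previous x (or y) was processed just before, by the same thread, with the state carried over. The sketch below is not the original code. It is a serial simulation with hypothetical names (squares_incremental, squares_direct, chunks standing in for threads) that uses the identity x^2 = (x-1)^2 + 2x - 1 and restarts the running value at every chunk boundary, which is roughly what happens when each thread begins at a different row.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for the incremental scheme: every value is derived
// from the previous one, so it depends on where a chunk ("thread") starts.
std::vector<int> squares_incremental(int first, int count, int chunks) {
    std::vector<int> out(count);
    int chunk_len = (count + chunks - 1) / chunks;
    for (int c = 0; c < chunks; ++c) {
        int begin = c * chunk_len;
        int end = std::min(count, begin + chunk_len);
        int sq = first * first;  // running state, only valid for x == first
        for (int i = begin; i < end; ++i) {
            int x = first + i;
            if (i != 0) {
                sq += 2 * x - 1;  // assumes sq currently holds (x-1)^2
            }
            out[i] = sq;
        }
    }
    return out;
}

// Direct computation, as in the updated getradius: no state is carried
// from one iteration to the next, so the chunking does not matter.
std::vector<int> squares_direct(int first, int count) {
    std::vector<int> out(count);
    for (int i = 0; i < count; ++i) {
        int x = first + i;
        out[i] = x * x;
    }
    return out;
}
```

With a single chunk both functions agree. With two chunks the incremental version goes wrong at the first element of the second chunk, which would show up as a jump at a thread boundary. Note also that in the original loop rxsqr and rysqr are declared inside the inner loop without an initial value, so the incremental branches read a value that is not guaranteed to be there even in a serial run.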
I wonder if this thing compiles at all? You specify default(none), yet you consistently use data members of the class. Are they static?

What you can do is either i) leave default(none) away, which means default(shared), ii) have shared access to the values by explicitly sharing them, or iii) initialise variables for use inside the parallel region that each thread has its own private copy of, say, a copy of m_rxinitial called p_rxinitial, etc. The first option is guaranteed to cause trouble.
The following illustrates option ii):
1) Make a helper class containing what you need to pass:
struct sharedata { int s_rxinitial; /* ... */ };
2) In the member function containing the parallel section, before the parallel loop, define
sharedata sd; sd.s_rxinitial = m_rxinitial; /* ... */
3) Give the parallel section
#pragma omp parallel for schedule(static) default(none) shared(sd)
4) Use the sd data members in the function calls.
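Put together, the four steps might look roughly like the sketch below. It is only an illustration under assumptions: radius_squared is a hypothetical free function standing in for the member function, s_ryinitial is an extra field added for the example, and the loop body simply stores x*x + y*y instead of calling the real helpers. With default(none), every variable used inside the region has to appear in a clause, which is why data, width and height are listed as well.

```cpp
#include <cstddef>
#include <vector>

// Step 1: hypothetical helper struct bundling the values the loop needs.
struct sharedata {
    int s_rxinitial;
    int s_ryinitial;  // assumed second field, not in the original snippet
};

// Hypothetical stand-in for the member function that owns the parallel loop.
std::vector<int> radius_squared(int rxinitial, int ryinitial, int width, int height) {
    std::vector<int> out(static_cast<std::size_t>(width) * height);
    int* data = out.data();

    // Step 2: copy the needed values into the struct before the loop.
    sharedata sd;
    sd.s_rxinitial = rxinitial;
    sd.s_ryinitial = ryinitial;

    // Step 3: default(none) forces every outer variable to be listed.
    #pragma omp parallel for schedule(static) default(none) shared(sd, data, width, height)
    for (int row = 0; row < height; ++row) {
        for (int col = 0; col < width; ++col) {
            // Step 4: read the values through sd, never through the members.
            int x = sd.s_rxinitial + col;
            int y = sd.s_ryinitial + row;
            data[row * width + col] = x * x + y * y;
        }
    }
    return out;
}
```

Each iteration writes to a distinct element and only reads sd, so sharing it should be safe here. If the pragma is ignored because OpenMP is not enabled, the loop simply runs serially with the same result.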
I hope this is clear enough. I would appreciate it if someone had a more elegant solution to offer.
If you wanted the private variables of option iii), use firstprivate(sd) instead of shared(sd). This gives each thread an initialized (to the original values) private copy of sd. It may or may not give a performance advantage by avoiding serial access. I had a similar problem a few days ago, and there was no difference.
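A minimal sketch of the firstprivate variant, again under assumptions: y_squares is a hypothetical function, s_ryinitial is an assumed field, and the loop body is reduced to a single squared value per row so that the clause is the only thing that changes.

```cpp
#include <vector>

// Hypothetical helper struct, same idea as for the shared variant.
struct sharedata {
    int s_rxinitial;
    int s_ryinitial;  // assumed field for this example
};

// Hypothetical function: stores (ryinitial + row)^2 for each row.
std::vector<int> y_squares(int ryinitial, int height) {
    std::vector<int> out(height);
    int* data = out.data();

    sharedata sd;
    sd.s_rxinitial = 0;
    sd.s_ryinitial = ryinitial;

    // firstprivate(sd): each thread gets its own copy of sd, initialised
    // from the values sd holds when the parallel region is entered.
    #pragma omp parallel for schedule(static) default(none) firstprivate(sd) shared(data, height)
    for (int row = 0; row < height; ++row) {
        int y = sd.s_ryinitial + row;  // reads the thread's private copy
        data[row] = y * y;
    }
    return out;
}
```

Whether this is any faster than shared(sd) for read-only data would have to be measured. The copies mainly matter if threads need to modify their own sd without affecting each other.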