c - What's the correct value on which to base the maximum number of CPUs for sched_setaffinity?
I'm confused about the correct value to use for the number of CPUs I can use when I make the cpu_set for a sched_setaffinity call on my system.
My /proc/cpuinfo file:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 37
model name : Intel(R) Core(TM) i5 CPU M 460 @ 2.53GHz
stepping : 5
microcode : 0x2
cpu MHz : 1199.000
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fdiv_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx rdtscp lm constant_tsc arch_perfmon pebs bts xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt lahf_lm ida arat dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 5056.34
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 37
model name : Intel(R) Core(TM) i5 CPU M 460 @ 2.53GHz
stepping : 5
microcode : 0x2
cpu MHz : 1199.000
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 2
apicid : 1
initial apicid : 1
fdiv_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx rdtscp lm constant_tsc arch_perfmon pebs bts xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt lahf_lm ida arat dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 5056.34
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 37
model name : Intel(R) Core(TM) i5 CPU M 460 @ 2.53GHz
stepping : 5
microcode : 0x2
cpu MHz : 1199.000
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 2
cpu cores : 2
apicid : 4
initial apicid : 4
fdiv_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx rdtscp lm constant_tsc arch_perfmon pebs bts xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt lahf_lm ida arat dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 5056.34
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 37
model name : Intel(R) Core(TM) i5 CPU M 460 @ 2.53GHz
stepping : 5
microcode : 0x2
cpu MHz : 1199.000
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 2
cpu cores : 2
apicid : 5
initial apicid : 5
fdiv_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx rdtscp lm constant_tsc arch_perfmon pebs bts xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt lahf_lm ida arat dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 5056.34
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
In that file I found that there are processor lines numbered 0-3, which are the "physical" processors (4 processors total). I can get that value from sysconf(_SC_NPROCESSORS_ONLN). But there is also a cpu cores line, which says each processor has 2. I believe that represents the "logical" processors, or hyperthreading being accounted for. Should I be using the "physical" value, or can I use the "logical" count?
I'm not clear on this because if I go to /proc/pid/status there's a line Cpus_allowed_list, which can range 0-7 (8 processors total). But I wrote a script to call taskset -c -p pid on every "pid" that is running, and it shows every process as having an affinity list of 0-3 max.
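The counts mentioned in the question can also be compared from inside a C program. The following is a minimal sketch, assuming Linux with glibc; the helper names allowed_cpu_count and print_cpu_counts are made up for this illustration and are not part of any library.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

/* Number of CPUs the calling thread is currently allowed to run on,
 * or -1 if sched_getaffinity() fails. (Hypothetical helper name.) */
int allowed_cpu_count(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    return CPU_COUNT(&set);
}

/* Print the configured, online, and allowed CPU counts for comparison. */
void print_cpu_counts(void)
{
    printf("configured: %ld\n", sysconf(_SC_NPROCESSORS_CONF));
    printf("online:     %ld\n", sysconf(_SC_NPROCESSORS_ONLN));
    printf("allowed:    %d\n", allowed_cpu_count());
}
```

Calling print_cpu_counts() from main() prints the three values; the "allowed" figure is the one that corresponds to what taskset reports for the process.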
With hyper-threading there are 2 logical CPUs per core. This means that if 1 logical CPU stalls for any reason (cache miss, branch misprediction, instruction dependencies, etc.) the core can execute instructions from the other logical CPU, so it isn't sitting there waiting and being wasted. In addition, a core is typically capable of doing more in parallel than a single logical CPU uses, so even without any of the (frequently common) stalls it still benefits (by increasing utilisation of the core's resources). In this case, you want to use all logical CPUs.
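Using all logical CPUs could look roughly like the sketch below. It assumes Linux with glibc and, importantly, that the online CPUs are numbered contiguously from 0 (which matches the 0-3 numbering in the question but is not guaranteed everywhere). The function names build_online_cpu_mask and use_all_logical_cpus are invented for this example.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h>

/* Fill `set` with logical CPUs 0..n-1, where n is the online count from
 * sysconf(). ASSUMPTION: online CPUs are numbered contiguously from 0.
 * Returns n, or -1 if the count is unavailable or too large for cpu_set_t. */
int build_online_cpu_mask(cpu_set_t *set)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);

    if (n < 1 || n > CPU_SETSIZE)
        return -1;
    CPU_ZERO(set);
    for (int cpu = 0; cpu < (int)n; cpu++)
        CPU_SET(cpu, set);
    return (int)n;
}

/* Allow the calling thread to run on every online logical CPU.
 * Returns 0 on success, -1 on error (errno is set by sched_setaffinity). */
int use_all_logical_cpus(void)
{
    cpu_set_t set;

    if (build_online_cpu_mask(&set) < 0)
        return -1;
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Passing 0 as the first argument of sched_setaffinity applies the mask to the calling thread; threads created afterwards inherit it.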
For badly written multi-threaded software (software with significant scalability problems) the gains from hyper-threading can be lost to poor scalability. For example, a process might cause "cache line bouncing" (where cache lines are being "bounced" between cores), and using affinity to reduce the number of cores can help. As another example, the core's RAM bandwidth might be the bottleneck (so the process gets no benefit from hyper-threading), and using affinity to prevent the process from using both logical CPUs in each core can improve performance. For these cases, you want to use some of the logical CPUs (but you don't know which ones).
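One way to keep a process off the second logical CPU of each core is to select the first logical CPU for every distinct (physical id, core id) pair. The sketch below shows only the selection step, assuming Linux with glibc and that the caller has already collected the per-CPU ids (for instance from the "physical id" and "core id" fields of /proc/cpuinfo). The function name pick_one_cpu_per_core is hypothetical.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Given per-CPU package ids and core ids, select the first logical CPU of
 * each distinct (package, core) pair and store the result in `out`.
 * Index i in both arrays describes logical CPU i.
 * Returns the number of CPUs selected. (Hypothetical helper.) */
int pick_one_cpu_per_core(const int *package_id, const int *core_id,
                          int ncpus, cpu_set_t *out)
{
    int selected = 0;

    CPU_ZERO(out);
    for (int i = 0; i < ncpus && i < CPU_SETSIZE; i++) {
        int seen = 0;

        /* Does an earlier logical CPU share this physical core? */
        for (int j = 0; j < i; j++) {
            if (package_id[j] == package_id[i] && core_id[j] == core_id[i]) {
                seen = 1;
                break;
            }
        }
        if (!seen) {
            CPU_SET(i, out);
            selected++;
        }
    }
    return selected;
}
```

With the values from the /proc/cpuinfo output in the question (physical id 0 for all four processors, core ids 0, 0, 2, 2) this selects logical CPUs 0 and 2. The resulting set can then be passed to sched_setaffinity(0, sizeof(set), &set).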
For single-threaded processes, it's not going to matter what you do.
Basically (assuming it's multi-threaded), the best setting for a process depends on the process; therefore you should run tests to see how affinity affects the process.
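Such a test could be as simple as timing the same workload under different affinity masks. The sketch below is one possible harness, assuming Linux with glibc; time_with_affinity and example_work are names invented here, and example_work is only a stand-in for the real workload being measured.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <time.h>

/* Hypothetical stand-in workload: a busy loop that adds up the integers
 * below *(long *)arg. Replace with the code you actually want to measure. */
void example_work(void *arg)
{
    volatile long sum = 0;
    long iters = *(long *)arg;

    for (long i = 0; i < iters; i++)
        sum += i;
}

/* Restrict the calling thread to `mask`, run work(arg), restore the previous
 * affinity, and return the elapsed wall-clock time in seconds.
 * Returns -1.0 if the affinity could not be read or set.
 * Threads created inside work() inherit the mask. */
double time_with_affinity(const cpu_set_t *mask,
                          void (*work)(void *), void *arg)
{
    cpu_set_t saved;
    struct timespec t0, t1;

    if (sched_getaffinity(0, sizeof(saved), &saved) != 0)
        return -1.0;
    if (sched_setaffinity(0, sizeof(*mask), mask) != 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    work(arg);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    /* Put the original mask back so later measurements start clean. */
    sched_setaffinity(0, sizeof(saved), &saved);

    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Running it once with a mask covering all logical CPUs and once with a mask covering one logical CPU per core, several times each, gives a rough comparison for a particular process.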
Misc. notes
When hyper-threading was first introduced (Netburst/Pentium 4) it was "less than ideal", and the schedulers in operating systems weren't optimised to schedule load efficiently for hyper-threading (which made it worse). This led to a lot of people thinking that hyper-threading is bad in lots of cases. Modern Intel CPUs do not have the same problems that Netburst/Pentium 4 had, and modern operating system schedulers have optimisations for hyper-threading. This means that old assumptions ("hyper-threading is bad") that were once correct are obsolete and wrong now.