Hello, I’m upgrading a self-built TensorFlow 2.9.2 running on Python 3.10.7.
I built it without any trouble, despite the lack of SSE4.2 on all of the platforms I run.
I assume the build was fairly basic, since I only used -msse3 in the Bazel copts.
- I’m in the process of migrating to Python 3.11.1, which I use more often.
- The target TensorFlow is the latest stable release (2.11.0). I had problems building 2.10.x.
What would you recommend I do? My platform supports the following
hardware flags: sse3 (reported as pni), cx16, popcnt.
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs hw_pstate vmmcall npt lbrv svm_lock
I’ll be using the following Bazel command:
export JAVA_TOOL_OPTIONS="-Xmx4096m -Xms128m"
export TEST_TMPDIR=/opt/SCRATCH2/bazel_temp
bazel build --local_ram_resources=6128 \
--jobs 4 --config=opt \
--copt="-O2" --copt="-msse3" --copt=-mcx16 --copt=-mpopcnt --copt=-mfpmath=both
I wonder whether adding -msse2 -mmmx would make any difference, and whether -mcx16, -mpopcnt and perhaps other options are really needed or simply go unused.
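For reference, here is a rough sketch of how the kernel's flag names could be matched against GCC options before passing them as copts. The helper names (has_cpu_flag, read_cpu_flags, suggest_copts) and the mapping table are my own illustration, not part of TensorFlow or Bazel. The sketch only reports which options the CPU advertises; it says nothing about whether TensorFlow benefits from them. Note that the kernel lists SSE3 as "pni".

```shell
#!/usr/bin/env bash
# Hypothetical helpers: map /proc/cpuinfo flag names to GCC -m options.

# has_cpu_flag FLAGS NAME: succeeds if NAME is a whole-word entry in FLAGS,
# so "mmx" does not match "mmxext".
has_cpu_flag() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# read_cpu_flags [FILE]: print the flag list of the first CPU (Linux only).
read_cpu_flags() {
  grep -m1 '^flags' "${1:-/proc/cpuinfo}" | cut -d: -f2
}

# suggest_copts FLAGS: print one --copt per GCC option the CPU advertises.
# The table is an assumption limited to a few common x86 extensions.
suggest_copts() {
  local flags="$1" pair name opt out=""
  for pair in mmx:-mmmx sse2:-msse2 pni:-msse3 ssse3:-mssse3 \
              sse4_1:-msse4.1 sse4_2:-msse4.2 sse4a:-msse4a \
              cx16:-mcx16 popcnt:-mpopcnt; do
    name="${pair%%:*}"
    opt="${pair#*:}"
    if has_cpu_flag "$flags" "$name"; then
      out="$out --copt=$opt"
    fi
  done
  printf '%s\n' "${out# }"
}
```

On the build machine this could be run as: suggest_copts "$(read_cpu_flags)". Any option it prints is only a candidate and would still need to be tested in an actual build.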
Your support is appreciated.