parallel make fails
Reported by Volker Braun | September 24th, 2011 @ 12:47 PM | in 1.2.0 (closed)
Parallel make fails on Linux on a PanFS network file system (the front end of this system: http://ichec.ie/infrastructure/stokes). I suspect it is related to the higher latency compared with a native file system, which triggers a race condition in re2c.
Non-parallel build works fine.
This ticket might be the same as #165, but I can't see the actual output on that ticket.
Steps to reproduce:
Download and unpack yasm-1.1.0.tar.gz
Run ./configure
Run make -j10
Parallel compile log:
vbraun@stokes1:~/yasm/yasm-1.1.0> make -j10
gcc -std=gnu99 -I. \
-c -o genperf.o `test -f tools/genperf/genperf.c || echo './'`tools/genperf/genperf.c
gcc -std=gnu99 -I. \
-c -o gp-perfect.o `test -f tools/genperf/perfect.c || echo './'`tools/genperf/perfect.c
gcc -std=gnu99 -I. \
-c -o gp-phash.o `test -f libyasm/phash.c || echo './'`libyasm/phash.c
gcc -std=gnu99 -I. \
-c -o gp-xmalloc.o `test -f libyasm/xmalloc.c || echo './'`libyasm/xmalloc.c
gcc -std=gnu99 -I. \
-c -o gp-xstrdup.o `test -f libyasm/xstrdup.c || echo './'`libyasm/xstrdup.c
gcc -std=gnu99 -I. \
-c -o re2c-main.o `test -f tools/re2c/main.c || echo './'`tools/re2c/main.c
gcc -std=gnu99 -I. \
-c -o re2c-code.o `test -f tools/re2c/code.c || echo './'`tools/re2c/code.c
gcc -std=gnu99 -I. \
-c -o re2c-dfa.o `test -f tools/re2c/dfa.c || echo './'`tools/re2c/dfa.c
gcc -std=gnu99 -I. \
-c -o re2c-parser.o `test -f tools/re2c/parser.c || echo './'`tools/re2c/parser.c
gcc -std=gnu99 -I. \
-c -o re2c-actions.o `test -f tools/re2c/actions.c || echo './'`tools/re2c/actions.c
gcc -std=gnu99 -I. \
-c -o re2c-scanner.o `test -f tools/re2c/scanner.c || echo './'`tools/re2c/scanner.c
gcc -std=gnu99 -I. \
-c -o re2c-mbo_getopt.o `test -f tools/re2c/mbo_getopt.c || echo './'`tools/re2c/mbo_getopt.c
gcc -std=gnu99 -I. \
-c -o re2c-substr.o `test -f tools/re2c/substr.c || echo './'`tools/re2c/substr.c
gcc -std=gnu99 -I. \
-c -o re2c-translate.o `test -f tools/re2c/translate.c || echo './'`tools/re2c/translate.c
gcc -std=gnu99 -I. \
-c -o genmacro.o `test -f tools/genmacro/genmacro.c || echo './'`tools/genmacro/genmacro.c
gcc -std=gnu99 -I. -c -o genversion.o `test -f modules/preprocs/nasm/genversion.c || echo './'`modules/preprocs/nasm/genversion.c
gcc -std=gnu99 -I. -c -o genstring.o `test -f genstring.c || echo './'`genstring.c
gcc -std=gnu99 -o genperf genperf.o gp-perfect.o gp-phash.o gp-xmalloc.o gp-xstrdup.o
gcc -std=gnu99 -o re2c re2c-main.o re2c-code.o re2c-dfa.o re2c-parser.o re2c-actions.o re2c-scanner.o re2c-mbo_getopt.o re2c-substr.o re2c-translate.o
gcc -std=gnu99 -o genmacro genmacro.o
./genperf x86insn_nasm.gperf x86insn_nasm.c
gcc -std=gnu99 -o genversion genversion.o
./genperf x86insn_gas.gperf x86insn_gas.c
found distinct (A,B) on attempt 19
built perfect hash table of size 512
gcc -std=gnu99 -o genstring genstring.o
./genmacro nasm-macros.c nasm_standard_mac ./modules/parsers/nasm/nasm-std.mac
./genmacro win64-nasm.c win64_nasm_stdmac ./modules/objfmts/coff/win64-nasm.mac
./genmacro win64-gas.c win64_gas_stdmac ./modules/objfmts/coff/win64-gas.mac
./genstring license_msg license.c ./COPYING
./re2c -b -o gas-token.c ./modules/parsers/gas/gas-token.re
found distinct (A,B) on attempt 1640
./re2c -b -o nasm-token.c ./modules/parsers/nasm/nasm-token.re
built perfect hash table of size 512
./genversion version.mac
./genmacro nasm-version.c nasm_version_mac version.mac
make: *** [nasm-token.c] Segmentation fault (core dumped)
make: *** Deleting file `nasm-token.c'
Comments and changes to this ticket
-

Volker Braun September 24th, 2011 @ 01:43 PM
Building re2c with debugging information, it's clear that there is a race between different re2c processes. They all use the same re2c.tmp file. This could cause silent data corruption!
(gdb) up
#3  0x0000000000404a08 in DFA_emit (d=0x6440f0, o=0x612640) at tools/re2c/code.c:850
850         State_emit(s, tmpo, &readCh);
(gdb) l
845     maxFillIndexes = vFillIndexes;
846     orgVFillIndexes = vFillIndexes;
847     tmpo = fopen("re2c.tmp", "wt");
848     for(s = d->head; s; s = s->next){
849         int readCh = 0;
850         State_emit(s, tmpo, &readCh);
851         Go_genGoto(&s->go, tmpo, s, s->next, &readCh);
852     }
853     fclose(tmpo);
854     remove("re2c.tmp");
(gdb) print tmpo
$4 = (FILE *) 0x0
(gdb) bt
#0  0x00002b8709669459 in fwrite () from /lib64/libc.so.6
#1  0x0000000000402b12 in Action_emit (a=0x631ed0, o=0x0, readCh=0x7ffffebe4b4c) at tools/re2c/code.c:270
#2  0x0000000000403ba3 in State_emit (s=0x632b30, o=0x0, readCh=0x7ffffebe4b4c) at tools/re2c/code.c:535
#3  0x0000000000404a08 in DFA_emit (d=0x6440f0, o=0x612640) at tools/re2c/code.c:850
#4  0x00000000004091d5 in genCode (o=0x612640, re=0x641140) at tools/re2c/actions.c:689
#5  0x0000000000406dd7 in parse (i=0x612400, o=0x612640) at tools/re2c/parser.c:246
#6  0x00000000004019ba in main (argc=5, argv=0x7ffffebe6ea8) at tools/re2c/main.c:193
I'm pretty sure that PanFS and Windows lock files that are open, i.e., opening the file for writing in a second process will probably fail with EACCES. If the open fails, fopen returns 0x0, which re2c never checks, and the process segfaults once the stream is written to.
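To illustrate the unchecked fopen described above, here is a minimal sketch of a guarded open. The helper name open_checked is hypothetical and is not part of re2c. It shows only the NULL check that the listing at code.c:847 lacks, and it does not remove the race on the shared file name.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of re2c: open a file and report a failure
 * instead of silently handing a NULL stream to later fwrite() calls. */
static FILE *open_checked(const char *path, const char *mode)
{
    FILE *f = fopen(path, mode);

    if (f == NULL) {
        /* errno is set by fopen on failure, e.g. EACCES or ENOENT */
        fprintf(stderr, "cannot open %s: %s\n", path, strerror(errno));
    }
    return f;   /* the caller must still test for NULL before writing */
}
```

A caller would test the result and stop with an error rather than pass the stream on to State_emit, so a failed open becomes a clear build error instead of a segfault.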
-

Peter Johnson October 3rd, 2011 @ 06:15 AM
- Milestone set to 1.2.0
- Assigned user set to Peter Johnson
- Milestone order changed from 154 to 0
Thanks for tracking this down! I'll commit a fix.
-

Peter Johnson October 3rd, 2011 @ 06:22 AM
- State changed from new to resolved
(from [2bd66514b6b100887c19d8598da38347b3cff40e]) re2c: Use tmpfile instead of fixed temporary filename.
This could cause a race condition when running parallel make.
Tracked down by Volker Braun.
[#238 state:resolved] [#165 state:resolved] https://github.com/yasm/yasm/commit/2bd66514b6b100887c19d8598da3834...
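The commit message says the fix uses tmpfile instead of a fixed file name. The following is a minimal sketch of that idea, not the actual diff. The function emit_to_scratch and its return convention are assumptions made for illustration, standing in for the scratch pass in DFA_emit.

```c
#include <stdio.h>

/* Hypothetical stand-in for the scratch pass in DFA_emit(): write text to an
 * anonymous temporary file and return the number of bytes written,
 * or -1 on failure. */
static long emit_to_scratch(const char *text)
{
    FILE *tmpo = tmpfile();   /* unique per call, so parallel processes do not collide */
    long written;

    if (tmpo == NULL) {       /* check the result instead of writing to a NULL stream */
        perror("tmpfile");
        return -1;
    }
    if (fputs(text, tmpo) == EOF) {
        fclose(tmpo);
        return -1;
    }
    written = ftell(tmpo);    /* tmpfile opens in binary update mode, so this is a byte count */
    fclose(tmpo);             /* the file is removed automatically; no remove() call needed */
    return written;
}
```

Because each process gets its own temporary file, concurrent re2c invocations under make -j no longer share re2c.tmp.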
The Yasm Modular Assembler Project
Referenced by
-
#165 Parallel make fails on Cygwin