The simple answer is that close isn't being called often enough. Here's an illustrative example of why:
Using an input file like:
<doc somestuff
another line
yet another line
<doc the second
still more data
<doc the third
<doc the fourth
<doc the fifth
I can make an executable awk file based on your script like:
#!/usr/bin/awk -f
BEGIN { system_(++j) }
/<doc/{x=++i}
{
if (i%5==0 ){ ++i; close_(j"/"x); system_(++j) }
else{ open_(j"/"x) }
}
function call_f(funcname, arg) { print funcname"("arg")" }
function system_(cnt) { call_f( "system", cnt ) }
function open_(f) { if( !(f in a) ) { call_f( "open", f ); a[f]++ } }
function close_(f) { call_f( "close", f ) }
which if I put into a file called awko
can be run like awko data
to produce the following:
system(1)
open(1/1)
open(1/2)
open(1/3)
open(1/4)
close(1/5)
system(2)
The script I made is just indicating how many times you're calling each function by shadowing a real function call with a local function with a trailing _
. Notice how many times open()
is printed compared to close()
for the same arguments. Also, I ended up renaming print >
to open_
just to illustrated that it's what's opening the files( once per file name ).
If I change the executable awk file to the following, you can see close being called enough:
#!/usr/bin/awk -f
BEGIN { system_(++j) }
/<doc/{ close_(j"/"x); x=++i } # close_() call is moved to here.
{
if (i%5==0 ){ ++i; system_(++j) }
else{ open_(j"/"x) }
}
function call_f(funcname, arg) { print funcname"("arg")" }
function system_(cnt) { call_f( "system", cnt ) }
function open_(f) { if( !(f in a) ) { call_f( "open", f ); a[f]++ } }
function close_(f) { call_f( "close", f ) }
which gives the following output:
system(1)
close(1/)
open(1/1)
close(1/1)
open(1/2)
close(1/2)
open(1/3)
close(1/3)
open(1/4)
close(1/4)
system(2)
where it should be clear that close()
is being called one more time than enough. The first time it's being called on a file that doesn't exist. With a true close()
call, the fact that such a file has never been printed should just be ignored and no actual close will be attempted. In each other case, the last open()
matches a close()
call.
Moving your close()
call in your script as in the second example script should fix your error.