Sunday 10 January 2010

Plot iterations and pseudo-files

As I promised some time ago, I will discuss some of the new features in gnuplot 4.4. The first one that I would like to show is the concept of iteration in the plot command, and the concept of certain pseudo-files. If you have ever had to script the creation of your plots, you will appreciate these features.

First, let us see what the for loop looks like in the plot command. There are two forms of it: in one, we loop through integers, while in the other, we step through the words of a string. So, the iteration looks like either

plot for [var = start : end {:increment}]

or

plot for [var in "some string of words"]
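A minimal sketch of both forms, using only built-in functions so that no data files are needed. The variable names n and w and the sample words are chosen purely for illustration.

```gnuplot
# Integer form: n takes the values 1, 3 and 5 (start 1, end 5, increment 2).
plot for [n = 1:5:2] sin(n*x) title sprintf("sin(%d x)", n)

# String form: w takes each whitespace-separated word in turn.
# strlen(w) is used here only to give the three lines different slopes.
plot for [w in "a bb ccc"] strlen(w)*x title w
```

In both cases a single plot command draws three curves, and the loop variable is available in every part of the plot element, including the title.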

After this introduction, let us see how these can be used in real life. The first example that I will show is a waterfall plot, i.e., several curves plotted on the same graph, each shifted vertically. This is common practice when one has several spectra and wants to show the effect of some parameter on them.

reset
f(x,a) = exp(-(x-a)*(x-a)/(1+a*0.5))+0.05*rand(0)
title(n) = sprintf("column %d", n)
set table 'iter.dat'
plot [0:20] '+' using (f($1,1)):(f($1,2)):(f($1,3)):(f($1,4)):(f($1,5)):(f($1,6)) w xyerror
unset table

set yrange [0:15]
plot for [i=1:6] 'iter.dat' u 0:(column(i)+2*i) w l lw 1.5 t title(i)

I would like to walk through the script line by line, for there is something unusual in almost every line. After resetting the gnuplot session, we define a function, a Gaussian with a small amount of random noise added, whose centre and width are determined by the parameter 'a'. We then plot this function to a file, 'iter.dat', six times, each time with a different parameter, so that the Gaussian is shifted and broadened. Note, however, that we do this by plotting a special file, '+'. This was introduced in gnuplot 4.4, and its purpose is to let us use the standard plot modifiers even with functions, i.e., we can specify 'using' for a function. This matters because many plot styles require several columns, and without the '+' pseudo-file we could not use those plot styles with functions. Consider the following example

reset
unset colorbox
unset key
set xrange [0:10]
set cbrange [0:1]
plot '+' using ($1):(sin($1)):(0.5*(1.0+sin($1))) with lines lw 3 lc palette, \
'+' using ($1):(sin($1)+2):($1/10.0) with lines lw 3 lc palette
which produces two sine curves whose colour varies along their length.

We can thus colour our curve by specifying the colour in the third column of the pseudo-file. Of course, this is only one possibility, and there are many more. If one wants to plot 3D graphs, then the pseudo-file becomes '++', but the concept is the same: the two variables are denoted by $1 and $2, and the function is calculated on the grid determined by the number of samples and the corresponding data range.
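As a rough sketch of the 3D case, the following plots a two-dimensional Gaussian from the '++' pseudo-file. The function, the ranges and the sample counts are illustrative assumptions, not taken from the post. Which ranges '++' samples from may depend on the gnuplot version, so both the x/y and the u/v ranges are set to the same values to be on the safe side.

```gnuplot
reset
set xrange [-2:2]
set yrange [-2:2]
# Assumption: some gnuplot versions take the '++' grid from urange/vrange,
# so these are set to match xrange/yrange.
set urange [-2:2]
set vrange [-2:2]
set samples 40       # grid points along the first variable ($1)
set isosamples 40    # grid points along the second variable ($2)
splot '++' using 1:2:(exp(-$1*$1 - $2*$2)) with pm3d
```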

Now, back to the iteration loop! We produce 6 columns of data by plotting '+' with a plot style that requires 6 columns; in this case, xyerrorbars. Having created some data, we plot each column, but we call plot only once: the iteration loop does the rest. In each plot, the curve is shifted upwards by 2*i, and the title is taken from the column number. To specify the title, we use the function that we defined earlier: it takes an integer and returns a string. At the end of the day, we have a graph of six vertically shifted curves, labelled 'column 1' through 'column 6'.
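To make the effect of the loop concrete, the single iterated plot command of the waterfall script should be equivalent to the explicit list of six plot elements below. This is only a sketch: it assumes that 'iter.dat' and the function title(n) from the waterfall script already exist in the session.

```gnuplot
# Hand-written equivalent of
#   plot for [i=1:6] 'iter.dat' u 0:(column(i)+2*i) w l lw 1.5 t title(i)
set yrange [0:15]
plot 'iter.dat' u 0:(column(1)+2*1) w l lw 1.5 t title(1), \
     'iter.dat' u 0:(column(2)+2*2) w l lw 1.5 t title(2), \
     'iter.dat' u 0:(column(3)+2*3) w l lw 1.5 t title(3), \
     'iter.dat' u 0:(column(4)+2*4) w l lw 1.5 t title(4), \
     'iter.dat' u 0:(column(5)+2*5) w l lw 1.5 t title(5), \
     'iter.dat' u 0:(column(6)+2*6) w l lw 1.5 t title(6)
```

The iterated form is shorter, and changing the number of curves only requires editing the loop bounds.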

That was an example in which we plot various columns from the same file. We can also use the iteration to plot different files, and there are two ways of doing so. One is to simply list the file names in a string, as below

reset
filenames = "first second third fourth fifth"
plot for [file in filenames] file using 1:2 with lines
which will plot files 'first', 'second', 'third', 'fourth', and 'fifth'. At this point, note that 'file' is a string, i.e., we can manipulate it as a string. E.g., if we wanted to, instead of 'first', 'second', etc., plot 'first.dat', 'second.dat', and so on, we would do this as
reset
filenames = "first second third fourth fifth"
plot for [file in filenames] file.".dat" using 1:2 with lines

The second option applies when the data files are numbered: if we have 'file_1.dat', 'file_2.dat', and so on, we can iterate over integers as follows
reset
filename(n) = sprintf("file_%d.dat", n)
plot for [i=1:10] filename(i) using 1:2 with lines
which will plot the second column versus the first column of 'file_1.dat' through 'file_10.dat'.
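The two approaches can also be combined. The sketch below loops over an integer index and picks the corresponding word out of the name list with the built-in word() and words() functions, so that the index remains available for other purposes, such as the line colour. The file names are the hypothetical ones used above, and the files 'first.dat' to 'fifth.dat' are assumed to exist.

```gnuplot
reset
filenames = "first second third fourth fifth"
# words(filenames) is the number of names; word(filenames, i) is the i-th one.
plot for [i = 1:words(filenames)] word(filenames, i).".dat" \
     using 1:2 with lines lc i title word(filenames, i)
```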

17 comments:

  1. I like this feature a lot. I was scripting a lot before this patch, which now becomes obsolete.

    The only thing is: it's not possible to apply the "for" patch to some older versions of gnuplot, is it? That would be nice, because some other plotting I do is not compatible with 4.X versions, and therefore the work I save by using "for" loops is eaten up by the work needed to get the old stuff going :)

    Anyway, keep up the good work!

  2. This comment has been removed by the author.

  3. I would appreciate your advice...
    The datafile has a few blocks of data. Sample 'datafile':
    #0
    #header A B C
    1 1 A
    1 2 A
    1 3 A
    1 4 A


    #1
    #header A B
    2 1 B
    2 2 B
    2 3 B
    2 4 B

    "plot 'datafile' using 1:2" will produce one curve for each block. How can I add a key/title to each block/curve? Assuming the title/key can be from an array of another column in each block.

    Thanks.

  4. Found a solution to my problem. Example below.
    Description:
    - The datafile has 17 datasets in 'datafile'.
    - Each dataset has title A, B, C, & ....
    - Each curve has different color
    - Use column 1 and 2 for each dataset

    ########### BEGIN ############
    title_list = "A B C D E F G H I J K L M N O P Q"
    item(n) = word(title_list,n)
    plot for [a=1:17] 'datafile' index a-1 using 1:2 t item(a) lc a
    ########### END ############

    The problem is that gnuplot starts the index at 0 (zero), while the word function starts at 1 (one). The trick above overcomes this difference in numbering.

    Alternatively, a named index also works (in almost the same way).

  5. I want to use this for construct and at the same time have spaces in my titles. Unfortunately, the word function does not allow that. Is there any other function or trick that I can use?

  6. You could then use the sprintf command. Something like this should work:

    title_f(a,b) = sprintf("Column %d vs column %d", a, b)

    plot 'foo' using 2:14 with lines title title_f(14,2)

    I hope this helps!
    Cheers,
    Zoltán

  7. "for" cannot be used in parametric plots?

  8. The for statement can be used in parametric plots without any difficulties.

    set parametric
    plot for [i=1:4] t, cos(i*t)

    will result in 4 cosine curves.
    Cheers,
    Zoltán

  9. Hi there,
    very nice and useful post! well done. what about double for? My workaround is in bash:

    for j in `seq 0 4 `; do
    gnuplot -persist << EOF
    file(i,n) = sprintf("fio%d.dat.%d",i,n)
    plot for[i=1:6] file($j,i) u 1:2 w lp
    EOF
    read INPUT # give me time to see the graph :)
    done

    cheers

    andrea

  10. Thanks a lot, but:
    filename(n) = sprintf("stats%d",n)
    plot for [i = 1 : $2 ] filename(i) using 5:6 title "ue$i" with lines
    To plot stats1.dat, ...stats5.dat yields the error:
    line 0: warning: Skipping unreadable file "stats1"
    Don't know why.
    Regards

  11. Sorry, I got it: the folder path did not match.

  12. This comment has been removed by the author.

  13. How about this for a double loop:
    pl for [ij=0:8] './empty' u (ij%3):(int((ij)/3)) pt 7 not

    If you want to loop i=0..imax-1 and j=0..jmax-1,
    ij loops from 0..(imax*jmax-1)

    Use int(ij/jmax) as i and ij%jmax as j.

    'empty' is a file with one line of rubbish data, e.g.
    bash$ echo "1 1" > empty

    This command plots a nice array of dots. I used it to plot a function of two integer variables like this
    gnuplot> f(a,b)=a*b
    gnuplot> spl for [ij=0:(imax*jmax-1)] './empty' u (ij%jmax):(int(ij/jmax)):(f(ij%jmax,int(ij/jmax))) pt 7 not

  14. Does this only apply to some versions of gnuplot? Mine objects to the syntax of the for loop:

    gnuplot> plot for [a=1:17] 'datafile' index a-1 using 1:2 t a lc a
    ^
    invalid expression

  15. What if I have files named as 1.dat, 2.dat and so on and I want to plot all of them in the same graph? Do I still have to write all of them in a single line as "1 2 3"? Is there any actual iterative way to do this whole thing?

  16. Can you put the iteration counter into a shell command to be plotted?

    I want to plot data which is in the rows of a file, something like
    1 2 4 8
    2 3 4 5
    6 7 7 9

    In order to plot the points stored in rows, I can plot an individual row doing the following:
    plot "<(sed -n 2p data.file | sed 's/ /\\n/g')"

    Is there a way to do something like the following:
    plot for [COUNTER=1:10] "<(sed -n COUNTERp data.file | sed 's/ /\\n/g')"

    Thanks
