Histograms
 
  - Basic principles
- Basic output format
- Matplotlib output format
- The methods
- Using YODA histograms
TheHist class gives a simple implementation of 
one-dimensional histograms, useful for quick-and-dirty testing, 
without the need to link to more sophisticated packages. 
For this reason it is used in many of the 
sample main programs 
found in the examples subdirectory. At the end 
of the section we show an example of how to link to a more 
sophisticated package. 
 
 
Basic principles
 
 
We here provide a simple overview of what is involved. 
As a first step you need to declare a histogram, with name, 
title, number of bins and x range (from, to). 
 
   Hist ZpT( "Z0 pT spectrum", 100, 0., 100.); 
 
Alternatively you can first declare it and later define it: 
 
   Hist ZpT; 
   ZpT.book( "Z0 pT spectrum", 100, 0., 100.); 
 
 
Once declared, its contents can be added by repeated calls to 
fill, 
 
   ZpT.fill( 22.7, 1.); 
 
where the first argument is the x value and the second the 
weight. Since the weight defaults to 1 the last argument could have 
been omitted in this case. 
 
 
A set of overloaded operators have been defined, so that histograms 
can be added, subtracted, divided or multiplied by each other. Then the 
contents (and associated statistical errors) are modified accordingly 
bin by bin. Thus the relative deviation between two histograms 
data and 
theory can be found as 
 
  diff = (data - theory) / (data + theory); 
 
assuming that diff, data and theory 
have been booked with the same number of bins and x range. That 
responsibility rests on the user; some checks are made for compatibility, 
but not enough to catch all possible mistakes. 
 
 
Also overloaded operations with double real numbers are available. 
Again these four operations are defined bin by bin, i.e. the 
corresponding amount is added to, subtracted from, multiplied by or 
divided by each bin. The double number can come before or after the 
histograms, with obvious results. Thus the inverse of a histogram 
result is given by 1. / result. 
The two kind of operations can be combined, e.g. 
 
  allpT = ZpT + 2. * WpT 
 
Finally, also the +=, -+, *=, /= are overloaded, with 
the right-hand side being either a histogram or a real number. 
 
 
Basic output format
 
 
A histogram can be printed by making use of the overloaded << 
operator, e.g.: 
 
   cout << ZpT; 
 
The printout format is inspired by the old HBOOK one. To understand 
how to read this format, consider the simplified example 
 
        3.50*10^ 2  9 
        3.00*10^ 2  X   7 
        2.50*10^ 2  X  1X 
        2.00*10^ 2  X6 XX 
        1.50*10^ 2  XX5XX 
        1.00*10^ 2  XXXXX 
        0.50*10^ 2  XXXXX 
 
          Contents 
            *10^ 2  31122 
            *10^ 1  47208 
            *10^ 0  79373 
 
          Low edge  -- 
            *10^ 1  10001 
            *10^ 0  05050 
 
The key feature is that the Contents and 
Low edge have to be read vertically. For instance, 
the first bin has the contents 
3 * 10^2 + 4 * 10^1 + 7 * 10^0 = 347. Correspondingly, 
the other bins have contents 179, 123, 207 and 283. The first bin 
stretches from -(1 * 10^1 + 0 * 10^0) = -10 to the 
beginning of the second bin, at -(0 * 10^1 + 5 * 10^0) = -5. 
 
 
The visual representation above the contents give a simple impression 
of the shape. An X means that the contents are filled up 
to this level, a digit in the topmost row the fraction to which the 
last level is filled. So the 9 of the first column indicates this bin 
is filled 9/10 of the way from 3.00*10^2 = 300 to 
3.50*10^2 = 350, i.e. somewhere close to 345, 
or more precisely in the range 342.5 to 347.5. 
 
 
The printout also provides some other information, such as the 
number of entries, i.e. how many times the histogram has been filled 
(including both underflow and overflow), the total weight inside the 
histogram (excluding underflow and overflow), the total weight in underflow 
and overflow, and the statistical power of the contents inside the 
histogram expressed in terms of 
nEffective = ( sum ( w ) )^2 / ( sum w^2 ). For an 
unweighted histogram, the latter is equal to the number of entries 
(excluding underflow and overflow), while for one filled with 
varying weights, it is less than that. 
 
 
Also printed are the mean value, median, and root-mean-square width 
(RMS). (Further moments about the mean can be computed via dedicated 
getter methods, see below.) If doStats == true then 
statistical errors are also printed, summed in quadrature with an 
estimate of the error due to the finite bin granularity (calculated as 
the difference with respect to unbinned estimates of the same quantities). 
 
 
In the printout, the moments and the median all disregard underflow 
and overflow and assume that all the contents is in the middle of the 
respective bin (with the median using a linear interpolation inside the median 
bin). This is especially relevant when you plot an integer quantity, 
such as a multiplicity. 
Then it makes sense to book with limits that are half-integers, e.g. 
 
   Hist multMPI( "number of multiparton interactions", 20, -0.5, 19.5); 
 
so that the bins are centered at 0, 1, 2, ..., respectively.  This also 
avoids ambiguities which bin gets to be filled if entries are 
exactly at the border between two bins. 
Also note that the 
fill( xValue) method automatically performs a cast 
to double precision where necessary, i.e. xValue 
can be an integer. 
 
 
Matplotlib output format
 
 
Assuming you have Python installed on your platform, it is possible to 
generate simple Matplotlib/Pyplot 
Python code from the histograms generated above, which then can be run 
to produce PDF plots. This should be done near the end of a run, after 
the histograms have been filled and properly normalized, as an alternative 
or complement to the basic output format above. 
 
 
In  a first step you must then decide on the name of the Python program, 
e.g.: 
 
  HistPlot hpl( "bosonpT"); 
 
where file ending .py is added automatically. 
 
 
For each new frame the name should be given, which will later give 
rise to a PDF file, with ending .pdf added automatically. 
If you leave the name field empty the same file will be used as for the 
latest named one, i.e. producing several frames in one file. One can 
optionally also give title and x and y axis labels: 
 
  hpl.frame( "pTdist", "Boson pT distributions", "pT (GeV)", "sigma"); 
 
Next, existing Hist histograms can be added to the frame, 
one by one: 
 
  hpl.add( ZpT, "-"); 
  hpl.add( WpT, "--,indigo"); 
  hpl.add( ZpT + 2. * WpT , "", "pT spectrum of Z, W+ and W-"); 
 
where the second argument tells how each histogram will be plotted. 
Default is histogram style, "h", but the values can also be connected 
with full lines "-", dashed ones "--", or dash-dotted ones "-.", or 
plotted as points "." or crosses "x", to mention some of the many 
options offered by Pyplot. As alternative to "h", the argument "e" can 
be specified which provides the same style as "h" but with error bars 
included. You can also specify the colour, separated by a comma from 
the line style, to override the normal colour cycle. The most common 
colours can be given just as a single letter, such as "r", "g", "b", 
but a more extensive 
 
colour palette allow finetuning to nuances such as "orange", 
"gold", "darkgreen", "royalblue", "crimson", and so on. 
A third argument can set the legend of each histogram; by default 
it is taken as the title of histogram. 
Finally the plot code itself will be set up by 
 
  hpl.plot(); 
 
where optionally it is possible to demand a logarithmic y scale. 
 
 
The frame - add - plot steps can be repeated as needed, each 
giving rise to a separate PDF file with a plot. In case a plot is to be 
generated from a single histogram the three steps can be joined into one 
 
  hpl.plotFrame( "onlyZ", ZpT ); 
 
where only the name of the PDF file and the histogram are compulsory, while 
further arguments as discussed above are optional. 
 
 
At the end, a file bosonpT.py has now been generated with the 
proper plotting commands. Additionally a data file has been generated for 
each histogram to be plotted, pTdist-0.dat, pTdist-1.dat, etc. 
Now doing 
 
  python bosonpT.py 
 
in a terminal window will produce the plots, such as pTdist.pdf 
and onlyZ.pdf. You may of course edit the python file further 
to improve on the finer details. 
 
 
Several examples are provided, e.g. in main112.cc and 
main113.cc. The main201.cc example 
illustrates that the x scale can be chosen logarithmically, 
by using an optional last argument when histograms are booked. 
 
 
Alternatively, it also straight forward to produce histograms direcly 
through the Python interface using 
the binEdges and binContents methods of the 
Hist class. 
 
  import pythia8, random 
  from matplotlib import pyplot 
  hst = pythia8.Hist("log", 100, 0.1, 4, True) 
  for i in range(0, int(1e6)): hst.Fill(random.gauss(0, 1)) 
  pyplot.hist(hst.getBinEdges()[:-1], hst.getBinEdges(), 
               weights = hst.getBinContents(), 
               histtype = "step", label = hst.getTitle()) 
 
 
 
The methods
 
 
We here collect a more complete and formal overview of the methods. 
 
 Hist::Hist()   
declare a histogram, but does not define it. 
   
 
 Hist::Hist(string title, int numberOfBins, double xMin, double xMax, bool logX = false, bool doStats = false)   
declare and define a histogram, where 
argument title   :  
is a string with the title of the histogram at output, 
   
argument numberOfBins   :  
is the number of bin the x range will be subdivided into, 
limited to be at most 1000, 
   
argument xMin   :  
is the lower edge of the histogram, 
   
argument xMax   :  
is the upper edge of the histogram, 
   
argument logX   :  
gives a logarithmic x scale between xMin and 
xMax if set true. Note that then 
xMin must be above 1e-20 so as to stay with positive 
numbers. 
   
argument doStats   :  
  decides whether statistics allowing unbinned higher moments to be 
  calculated are saved (up to n = 7) and also whether the histogram summary 
  printout via the << operator includes 
  +/- uncertainty estimates on the Mean, Median, and RMS values. 
   
   
 
 Hist::Hist(const Hist& h)   
creates an identical copy of the histogram in the argument, 
including bin contents. 
   
 
 Hist::Hist(string title, const Hist& h)   
creates an identical copy of the histogram in the argument, 
including bin contents, except that a new title is provided 
as first argument. 
   
 
 Hist& Hist::operator=(const Hist& h)   
copies all properties of the histogram in the argument, 
except that the original histogram title is retained. 
   
 
 void Hist::book(string title, int numberOfBins, double xMin, double xMax, bool logXIn = false, bool doStatsIn = false)   
define a histogram that previously was only declared; 
see above for the meaning of the arguments. 
   
 
 static Hist Hist::plotFunc(function<double(double)> f, string title, int nBin, double xMin, double xMax, bool logX = false)   
create a histogram out of a provided function f(x). 
The function is evaluated in the middle of each bin. 
   
 
 void Hist::title(string title)   
change the title of a histogram, but keep other properties unchanged. 
   
 
 void Hist::null()   
reset bin contents, but keep other histogram properties unchanged. 
   
 
 void Hist::fill(double xValue, double weight)   
fill the histogram, where 
argument xValue   :  
is the x position where the filling should occur, and 
   
argument weight  (default = 1.) :  
is the amount of weight to be added at this x value. 
   
   
 
 friend ostream& operator<<(ostream& os, const Hist& h)   
appends a simple histogram printout (see above for format) to the 
ostream, while leaving the histogram object itself 
unchanged. At most 100 columns are allowed to be displayed. 
If the number of bins is larger than 100 then the contents of 
adjacent bins are added to give the value in each column. (Two by two 
up to 200 bins, three by three up to 300, and so on, with the very 
last column possibly summing fewer rows than the others.) 
If the histogram has been booked with logarithmic x 
scale then the log10 of each lower bin border will be displayed 
rather than the border itself, to avoid undue complication. 
The Pyplot interface will display correct values, however. 
   
 
 void Hist::table(ostream& os = cout, bool printOverUnder = false, bool xMidBin = true, bool printError = false)   
   
 void Hist::table(string fileName, bool printOverUnder = false, bool xMidBin = true, bool printError = false)   
print a three-column table, where the first column gives the center of 
each bin, the second one the corresponding bin contents, and the bin 
errors. The table may be useful for plotting e.g. with Gnuplot. 
The desired output stream or file name can be provided as argument. 
The former is more flexible (e.g., it allows easy append to an existing 
file), whereas the latter is simpler for the case that each histogram 
should be a file of its own. 
An optional printOverUnder = true argument allows also 
underflow and overflow contents to be printed. (The arbitrary x 
coordinates for these are placed as if corresponding to same-size bins 
just below or above the regular histogram bins.) 
An optional xMidBin = false argument will have the 
x value at the beginning of each bin printed, rather than the 
default midpoint value. Finally, the optional printError = 
false will force the values in the third column to be zero. 
   
 
 void Hist::rivetTable(ostream& os = cout, bool printError = true)   
   
 void Hist::rivetTable(string fileName, bool printError = true)   
print a five-column table, where the first two columns give the lower 
and upper borders of each bin, the third one the bin contents, and the 
fourth and fifth the error (up and down) associated with the contents. 
This format matches the one that Rivet uses for its histograms. 
The choice between the two methods is the same as above for the 
table methods. 
With the optional printError = false the errors will 
be output as zero. 
   
 
 void Hist::pyplotTable(ostream& os = cout, bool isHist = true, bool printError = false)   
   
 void Hist:pyplotTable(string fileName, bool isHist = true, bool printError = false)   
prints either a two- or a three-column table, depending on whether 
the contents are to be plotted as a curve or as a histogram, the latter 
being default. In either case the first column gives the center of each 
bin and the second one the corresponding bin contents. For a histogram 
the third column gives the lower border of each bin. Then also a final 
line is provided, that gives no further histogram contents but provides 
the upper border of the last bin. This format is used for plotting with 
Pyplot, see below. The choice between the two methods is the same as 
above for the table methods. 
   
 
 void Hist::fillTable(ostream& os = cout)   
   
 void Hist::fillTable(string fileName)   
fill the contents of a two-column table into the histogram. The first 
column should give the center of each bin and the second one the 
corresponding bin contents to be filled. This is basically the inverse 
of the table methods above, whereby a written histogram 
contents can be read back in. Over- and underflow will be read back in 
if provided, but the xMidBin = false option will not work. 
   
 
 friend void table(const Hist& h1, const Hist& h2, ostream& os = cout, bool printOverUnder = false, bool xMidBin = true)   
   
 friend void table(const Hist& h1, const Hist& h2, string fileName, bool printOverUnder = false, bool xMidBin = true)   
print a three-column table, where the first column gives the center of 
each bin and the second and third ones the corresponding bin contents 
of the two histograms. Only works if the two histograms have the same 
x axis (within a tiny tolerance), else nothing will be done. 
The optional last two arguments allows also underflow and overflow 
contents to be printed, and the x to refer to the beginning 
of the bin rather than the center; see above. 
   
 
 string Hist::getTitle()   
return the title of the histogram. 
   
 
 int Hist::getBinNumber()   
return the number of bins in the histogram. 
   
 
 int Hist::getNonFinite()   
return the number of entries that are neither infinite nor NaN. 
   
 
 bool Hist::getLinX()   
return true if the histogram has a linear x scale and 
false if a logarithmic one. 
   
 
 bool Hist::getXMin()   
   
 bool Hist::getXMax()   
return the lower and upper edge of the booked x range. 
   
 
 bool Hist::getYMin()   
   
 bool Hist::getYMax()   
return the highest and lowest value of any bin inside the histogram, 
i.e. excluding underflow and overflow. 
   
 
 double Hist::getYAbsMin()   
returns the smallest absolute value (above 1e-20, to sidestep bins with 
zero content) in any of the bins inside the histogram. 
   
 
 double getXMean(bool unbinned=true)   
   
 double getXMeanErr(bool unbinned=true)   
Return mean in X and error on mean, either unbinned from saved weight 
sums (default) or explicitly from the histogram bins (unbinned = 
false). 
In the latter case, the error estimate includes a measure of the error 
due to the finite bin granularity, calculated as the difference between 
the binned and unbinned values and summed in quadrature with the 
statistical error. 
   
 
 double getXMedian(bool includeOverUnder=true)   
   
 double getXMedianErr(bool unbinned=true)   
Return median in X and its statistical error, ignoring underflow and 
 overflow (default) or including them (includeOverUnder = 
 true). By default, the reported error includes the same granularity 
 estimate as for the mean value (and also computed using the mean, not the 
 median), but this can be switched off (unbinned = false). 
   
 
 double getYMean()   
Return the average value in Y, calculated as the sum of weights inside 
the histogram range divided by the number of times it was filled. 
   
 
 double getXRMN(int n=2, bool unbinned=true)   
   
 double getXRMNErr(int n=2, bool unbinned=true)   
Return the n'th root of the n'th moment about the mean, 
< (x - <x>)^n >, and its estimated statistical error. 
The default value n = 2 corresponds to the standard root-mean-square 
width. Up to n = 6, both unbinned and binned moments can 
be calculated. For n >= 7, and for all error estimates, only 
binned calculations are available. Note that (unlike ROOT), the error 
estimates do not assume normal distributions. If unbinned = 
false an estimate of the bin granularity error is added in 
quadrature to the statistical error (up to n = 6, beyond 
which the bin granularity error is not estimated). 
   
 
 double getXRMS(bool unbinned=true)   
   
 double getXRMSErr(bool unbinned=true)   
Wrapper of getXRMN() to return the root-mean-square 
width (aka the root of the second central moment, aka the 
root-mean-square deviation about the mean), and its error. If unbinned = 
false an estimate of the bin granularity error is added in 
quadrature to the statistical error. 
   
 
 double Hist::getBinContent(int iBin)   
return the value in bin iBin, ranging from 1 through 
numberOfBins, with 0 for underflow and 
numberOfBins + 1 for overflow. 
   
 
 double Hist::getBinEdge(int iBin)   
return the lower edge of bin iBin, ranging from 1 through 
numberOfBins. 
   
 
 double Hist::getBinWidth(int iBin)   
return the width of bin iBin, ranging from 1 through 
numberOfBins. 
   
 
 vector<double> Hist::getBinContents()   
return the bin contents. 
   
 
 vector<double> Hist::getBinEdges()   
return the bin edges. 
   
 
 int Hist::getEntries(bool alsoNonFinite = true)   
return the number of entries, i.e. the number of time that 
fill(...) has been called. If alsoNonFinite 
is false, then entries that are infinity or NaN are ignored. 
   
 
 double Hist::getWeightSum(bool alsoOverUnder = true)   
  return the sum of weights with which the histogram was filled. The 
  output of this method only differs from that of 
  getEntries() above if non-unity weights were used when 
  the histogram was filled. If alsoOverUnder is true, then 
  the overflow and underflow bins are included in the weight sum, 
  otherwise not. 
   
 
 double Hist::getNEffective()   
 Return effective entries (for weighted histograms = number 
 of equivalent unweighted events for same statistical power), 
 calculated as (sum(w))^2 / sum(w^2). 
   
 
 
 bool Hist::sameSize(const Hist& h)   
checks that the number of bins and upper and lower limits are the 
same as in the histogram in the argument. 
   
 
 void Hist::takeFunc(function<double(double)> func)   
pass an arbitrary function that takes as an argument a 
double and returns a double. This function 
is applied to the current contents bin by bin. 
   
 
 void Hist::takeLog(bool tenLog = true)   
by default take 10-logarithm of current contents bin by bin. With 
optional argument false instead take e-logarithm 
of contents bin by bin. If to be used, then right before the 
histogram is output. 
   
 
 void Hist::takeSqrt()   
take square root of current contents bin by bin, with negative contents 
set to zero. 
   
 
 void Hist::normalize( double f = 1., bool overflow = true)   
Rescale the histogram to a given sum of all bin contents, where 
argument sum  (default = 1.) :  
is the intended sum of all bin contents after rescaling, and 
   
argument overflow  (default = on) :  
includes the underflow and overflow bins in the definition of the 
original area to be rescaled, or not. Either way, the under/overflow 
are always rescaled by the same factor as the histogram proper. 
   
   
 
 void Hist::normalizeIntegral( double f = 1., bool alsoOverflow = true)   
Rescale the histogram so that it integrates to the value given by 
argument f  (default = 1.) : . This is the appropriate 
normalization if the histogram represents a probability 
distribution. The argument 
argument overflow  (default = on) :  
specifies whether the underflow and overflow bins in the definition of 
the original area to be rescaled, or not. Either way, the 
under/overflow are always rescaled by the same factor, except the bin 
width, as the histogram proper. Bin widths are handled for both linear 
and log plots. 
   
 
 void Hist::normalizeSpectrum( double wtSum)   
Rescale the histogram by a factor 1/(wtSum * dx) where 
dx is the bin width. This is the appropriate 
renormalization if the histogram represents a spectrum on the form 
(1/N_{ev}) dN/dx. 
argument wtSum  (default = 1.) :  is 
the total weight of the events, which if events are unweighted is just 
the total number of successfully generated events. The under/overflow 
are always rescaled by the same factor, except the bin width, as the 
histogram proper. Bin widths are handled for both linear and log 
plots. 
   
 
 Hist& Hist::operator+=(const Hist& h)   
   
 Hist& Hist::operator-=(const Hist& h)   
adds or subtracts the current histogram by the contents of the 
histogram in the argument if sameSize(...) is true, 
else does nothing. 
   
 
 Hist& Hist::operator*=(const Hist& h)   
   
 Hist& Hist::operator/=(const Hist& h)   
multiplies or divides the current histogram by the contents of the 
histogram in the argument if sameSize(...) is true, 
else does nothing. 
   
 
 Hist& Hist::operator+=(double f)   
   
 Hist& Hist::operator-=(double f)   
adds or subtracts each bin content by the common offset f. 
   
 
 Hist& Hist::operator*=(double f)   
   
 Hist& Hist::operator*=(double f)   
multiplies or divides each bin content by the common factor f. 
   
 
 friend Hist operator+(double f, const Hist& h1)   
   
 friend Hist operator+(const Hist& h1, double f)   
   
 friend Hist operator+(const Hist& h1, const Hist h2)   
add a constant to a histogram or two histograms to each other, bin by bin. 
   
 
 friend Hist operator-(double f, const Hist& h1)   
   
 friend Hist operator-(const Hist& h1, double f)   
   
 friend Hist operator-(const Hist& h1, const Hist h2)   
subtract a histogram from a constant, a constant from a histogram, 
or two histograms from each other, bin by bin. 
   
 
 friend Hist operator*(double f, const Hist& h1)   
   
 friend Hist operator*(const Hist& h1, double f)   
   
 friend Hist operator*(const Hist& h1, const Hist h2)   
multiply a constant by a histogram or two histograms by each other, 
bin by bin. 
   
 
 friend Hist operator/(double f, const Hist& h1)   
   
 friend Hist operator/(const Hist& h1, double f)   
   
 friend Hist operator/(const Hist& h1, const Hist h2)   
divide a constant by a histogram, a histogram by a constant, 
or two histograms by each other, bin by bin. 
   
 
 HistPlot::HistPlot(string pythonName, bool useLegacy = true)   
create a file to which successively Python commands can be written. 
argument pythonName   :  
the name of the Python code file will become pythonName.py. 
When later executed with python pythonName.py the frames 
defined below will be drawn and output. 
   
argument useLegacy   :  
A parameter necessary for compatibility with older versions of Matplotlib. 
For Matplotlib version 3.2 and earlier, this argument must be false; 
for version 3.5 or later, it must be true. 
   
   
 
 void HistPlot::frame( string frameName, string title = "", string xLabel = "", string yLabel = "", double xSize = 8., double ySize = 6.)   
command to open a frame where successive histograms can be added and 
plotted. 
argument frameName   :  
the name of the frame. When the Python program is run, the output PDF 
file will be frameName.pdf. If this is blank or it is the 
same name as the previous frame, then the frame will be appended to 
the most recent previous PDF file. It cannot be left blank for the 
first call. 
   
argument title   :  
an optional title of the histogram, plotted on top of the frame. 
   
argument xLabel   :  
an optional label plotted below the x axis. 
   
argument yLabel   :  
an optional label plotted to the right of the y axis. 
   
argument xSize, ySize   :  
specifies the x and y size, in inches, of the canvas 
(the full figure area). Default values agree with the Pyplot defaults. 
   
Note: the title and labels may contain special LaTeX symbols, 
e.g. simple math formulae enclosed by dollar signs $...$, that are 
interpreted by the Pyplot package. A backslash \ needs to be doubled to \\, 
however, since else it will be intepreted as an escape sequence. 
   
 
 void HistPlot::add(const Hist& hist, string style = "h", string legend = "void")   
add a histogram to the current plot frame, where 
argument hist   :  
is the name of the histogram to be plotted, 
   
argument style   :  
tells how the histogram data will be represented in the figure, where 
default "h" is histogram style, "-" full lines, "--" dashed ones, "-." 
dash-dotted ones, "." points, "o" circles, "+" and "x" crosses, "*" 
stars, and so on as described in the Pyplot manual; alternatively to 
"h" the argument "e" can be specified which will include error bars on 
the histogram; additionally a colour may also be specified as 
described earlier, separated by a comma from the symbol; either of 
both can be used, e.g. "h,r", ",magenta", "+,", "--" or just ""; and 
   
argument legend   :  
is the text written in association with the plotting symbol, where the 
default void is replaced by the title provided when the 
histogram was booked, and LaTex input may be used as described in the note 
above. 
   
   
 
 void HistPlot::addFile(string file, string style = "o", string legend = "void", string xyerr = "")   
add data in an already existing file to the current plot frame, where 
argument file   :  
is the name of the file to be plotted, 
   
argument style   :  
tells how the data will be represented in the figure, just like in the 
add method above, but by default intended for discrete 
points rather than histograms or connected lines; 
   
argument legend   :  
is the text written in association with the plotting symbol, where the 
default void is replaced by the name of the input file, 
and LaTex input may be used as described in the note above for 
frame; and 
   
argument xyerr   :  
can be used to provide error bars for a point, where the options are 
to give an x to plot a symmetric error bar in the x 
direction, an X for an asymmetric ditto, a y for 
a symmetric error bar in the y direction, and a Y for 
an asymmetric ditto, and where both x and y can be 
combined or omitted as desired. 
   
Note: this method is primarily intended to overlay data 
points over existing histograms, and not for standalone usage. The data 
file is expected to contain at least two columns, where the first two 
give the (x,y) coordinates of the points. Then up to four 
error columns can follow, first the ones pertaining to the x 
error, if any, and then to y. Symmetric errors require one 
column with length in both directions, whereas asymetric gives first the 
length downwards and then the length upwards, both as non-negative numbers. 
For example, to plot the point (1.0, 2.0) with a symmetric error +/-0.5 in the 
x-direction and an asymmetric error +0.2/-0.5 in the y-direction, the file 
should contain the row 1.0 2.0 0.5 0.2 -0.5 and the value of 
xyerr might be "xY". 
   
 
 void HistPlot::plot( bool logY = false, bool logX = false)   
write the Pyplot commands for the current frame, as defined by the input 
from the two methods above, and additionally write the data files needed 
when the Python code is to be run. 
argument logY   :  
By default a linear y scale is used, but when true the y 
scale is instead logarithmic, except that a linear scale is used the very 
last step to zero bin content. 
   
argument logX   :  
By default a linear or logarithmic x scale is used depending on 
how the histograms were booked, i.e. by default linear, and this should 
not be overridden. For the case that the only input comes from 
addFile commands, however, this option gives the possibility 
to pick a logarithmic x scale instead of the default linear one. 
(Presumably you would then use the plotvariant below.) 
Note the confusing order of the two arguments, which is caused by the 
first presumably being much more commonly used than the second. 
   
   
 
 void HistPlot::plot( double xMin, double xMax, double yMin, double yMax, bool logY = false, bool logX = false)   
a variant of the plot method above, where the x 
and y ranges of the plot frame can be set by hand instead of 
calculated from the histogram information. 
   
 
 void HistPlot::plotFrame( string frameName, const Hist& hist, string title = "", string xLabel = "", string yLabel = "", string style = "h", string legend = "void", bool logY = false)   
is an omnibus version combining the three above methods when only a single 
histogram is to be plotted in a frame. The arguments and their meaning is 
the same as already explained for them, and the order in which they appear 
is almost the same, the exception being that the two compulsory arguments 
frameName and hist have been put at the beginning. 
   
 
 
 Using YODA histograms
 
 
The YODA library [Buc23] for statistics and plotting 
(found at the YODA webpage) is a 
lightweight library for histogramming, which mainly serves as the plotting 
framework for RIVET (see also the 
section on RIVET usage in PYTHIA). It can, 
however, also be used in stand-alone form, and PYTHIA comes with a 
simplistic interface to make it easy to link to YODA and use 
the statistics objects. 
 
 
After installing the YODA library to a suitable location, 
configure PYTHIA with YODA using --with-yoda, 
after which you can use the interface, located in 
include/PythiaPlugins/Pythia8Yoda.h. Note that programs using 
YODA must be compiled with at least --std=c++17, 
which is handled automatically by examples/Makefile for 
the example main114.cc. 
 
 
Use the interface by creating a Pythia8Yoda object using the 
constructor: 
 Pythia8Yoda::Pythia8Yoda(const string& anaNameIn = "/PYTHIA8/", const string& outputIn = "pythia")   
which takes two optional arguments: one that gives histograms a prefix 
for easier sorting, and a second providing the output file name, which 
will be postfixed by .yoda. 
   
 
 
The main method for booking new histograms or profiles is the templated: 
 shared_ptr<T> Pythia8Yoda::book(Args&&... args)   
which will return a shared pointer to any YODA object of 
type T, given its normal constructor arguments. 
   
 
Specializations are given for the most common types, which are also aliased 
as: Histo1DPtr, Histo2DPtr, Histo3DPtr, Profile1DPtr, and 
Profile2DPtr. They can each, in turn, be booked by the methods: 
 
 Histo1DPtr Pythia8Yoda::bookHisto1D(int nBins, double xMin, double xMax, const string& title)   
which books a 1D histogram. 
   
 
 Histo2DPtr Pythia8Yoda::bookHisto2D(int nBinsX, double xMin, double xMax, int nBinsY, double yMin, double yMax, const std::string& title)   
which books a 2D histogram. 
   
 
 Histo3DPtr Pythia8Yoda::bookHisto3D(int nBinsX, double xMin, double xMax, int nBinsY, double yMin, double yMax, int nBinsZ, double zMin, double zMax, const std::string& title)   
which books a 3D histogram. 
   
 
 Profile1DPtr Pythia8Yoda::bookProfile1D(int nBins, double xMin, double xMax, const std::string& title)   
which books a 1D profile histogram. 
   
 Profile2DPtr Pythia8Yoda::bookProfile2D(int nBinsX, double xMin, double xMax, int nBinsY, double yMin, double yMax, const std::string& title)   
which books a 2D profile histogram. 
   
 
At the end of the run (or intermittently), histograms can be written to a file 
by calling: 
 void Pythia8Yoda::write()   
which writes to the file name given in the constructor. 
   
 
Further example use cases can be studied in the example main114. 
After writing out a histogram file, it can be plotted with the standard 
YODA or RIVET tools (e.g. 
rivet-mkhtml), or by yourself using a custom Python 
script. We refer to the YODA 
documentation for details.