WinBUGS: a tutorial
Anastasia Lykou and Ioannis Ntzoufras
The reinvention of Markov chain Monte Carlo (MCMC) methods and their
implementation within the Bayesian framework in the early 1990s have established
the Bayesian approach as one of the standard methods within the applied
quantitative sciences. Their extensive use in complex real-life problems has led
to increased demand for user-friendly and easily accessible software that
implements Bayesian models by exploiting the possibilities provided by MCMC
algorithms. WinBUGS is the software that covers this increased need. It is the
Windows version of the BUGS (Bayesian inference using Gibbs sampling) package,
which appeared in the mid-1990s. It is a free and relatively easy-to-use tool that
estimates the posterior distribution of any parameter of interest in complicated
Bayesian models. In this article, we present an overview of the basic features of
WinBUGS, including information on model and prior specification, the code and its
compilation, and the analysis and interpretation of the MCMC output. Some simple
examples and the Bayesian implementation of the Lasso are illustrated in detail.
© 2011 John Wiley & Sons, Inc. WIREs Comp Stat 2011, 3:385–396. DOI: 10.1002/wics.176
data and the initial values definition using two simple examples. Section
"Running an MCMC in WinBUGS and Obtaining Posterior Summaries" provides a
condensed list of the actions needed to run the MCMC algorithm for a model and
how to obtain and analyze the corresponding output. An example of implementing
the Bayesian Lasso in WinBUGS can be found in the Section "An Illustrating
Example: Implementing Bayesian Lasso in WinBUGS". The article closes with a short
discussion concerning the future potential of WinBUGS followed by a short
conclusion.

GETTING STARTED WITH WINBUGS

The Menu Bar

The latest version of WinBUGS (1.4.3), as well as installation instructions and
the free key for unrestricted use, can be found on the software's website:
www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml. Once the installation process
has been completed, WinBUGS can be accessed by double-clicking its shortcut. All
main operations are available in a menu bar, which is similar to the ones found in
any Windows-based program.

The basic operations in the menu bar are the File, Window, and Help menus. The
File menu handles the usual file actions that are available in any Windows-based
software (e.g., Open, Close, Save, Print), the Window menu manages the active
windows in WinBUGS, and the Help menu provides access to the detailed manual and
examples of WinBUGS, which are very useful for the user.

The Tools, Edit, Attributes, and Text menus refer to editing facilities of the
documents in WinBUGS. To be more specific, the Edit menu allows the user to do the
usual editing actions of any word-processing software (e.g., Copy, Cut, Paste),
while the Tools menu manages more specialized actions available for WinBUGS
compound documents (insert dates, encode, and decode algorithms). The WinBUGS user
can change the color, the font type, and the size of the text in a compound
document using the Attributes menu, while he/she can find or replace text, insert
a ruler, a paragraph, or blank spaces (amongst other operations) in the Text menu.

The most substantial WinBUGS operations are included in the Model and Inference
menus. They include all the commands and actions related to running the MCMC
algorithm and analyzing its output in order to obtain posterior results.

The Specification tool, under the Model menu, is used to compile the model and
initialize the MCMC algorithm; the Update tool generates random observations from
the posterior distribution and the Monitor Met tool checks the acceptance rate of
the Metropolis-Hastings algorithm (in cases where it is used). The last set of
generated values can be saved using the Save State tool. This set of values can be
used as initial values when we want to rerun the MCMC algorithm from the point at
which a previous MCMC run stopped. The Inference menu provides information about
the posterior distribution of the model parameters. This information is based on
the analysis of the MCMC output. Under this menu, the most frequently used tool is
the Samples tool, which provides basic descriptive summaries and graphical
representations for specific attributes of the MCMC output and the corresponding
estimated posterior distribution.

The Info and Options menus offer some auxiliary tools for MCMC analysis. From the
former menu, we can open or clear a log window (where WinBUGS results and figures
are printed) or we can extract the current parameter values using the Node info
tool. From the latter menu, we can change some options concerning the output, the
blocking, and the generation algorithm used to update each parameter.

Finally, the Map and the Doodle menus offer more specialized tools for the user.
The Map menu corresponds to the GeoBUGS add-in module and it can be used for
spatial modeling and mapping. The Doodle menu is used to construct the directed
acyclic graph (DAG) that describes the conditional dependencies of the Bayesian
model we wish to fit. This tool is essential for those who are not familiar with
programming, as the model code can be generated from this graphical representation
of the model. Nevertheless, it assumes very good knowledge and understanding of
Bayesian models and how they can be represented in terms of conditional
distributions and hierarchies.

How to Code a Model

WinBUGS uses its own type of input and output files that are called compound
documents and are saved with the odc suffix. The first step in WinBUGS is to
specify the model, which includes the likelihood function for the observed sample
and the prior information for the parameters. The model code is written in a
compound file within the syntax model { ...... }.

There are three categories for the model parameters (or nodes): the constant, the
stochastic (or random), and the logical. The constant nodes are fixed values while
the stochastic nodes are random variables, such as the data and the model
parameters. The stochastic nodes follow a distribution, which can be
specified by using commands similar to the ones used in the R and Splus packages.
The logical nodes are mathematical expressions of other (constant or stochastic)
components.

Example 1

Let's assume that a part of a Bayesian model includes a normally distributed
variable $X \sim \text{Normal}(\mu, \sigma^2 = 1/\tau)$, with known mean $\mu = 0$
and unknown precision $\tau$. If the uncertainty about the precision is expressed
by considering the Gamma prior distribution Gamma(0.01, 0.01), this can be
expressed in WinBUGS as follows:

    model{
       mu <- 0
       a <- 0.01
       b <- 0.01
       X ~ dnorm( mu, tau )
       tau ~ dgamma( a, b )
       sigma2 <- 1/tau    # sigma2: variance of the normal distribution
    }

In the above syntax, the sign # is used to add comments, ~ is used to specify that
a random node follows a distribution, and <- is used to define assignments for
constant and logical nodes. Nodes mu, a, b are constants here, X and tau are
stochastic components, and sigma2 is a logical component. The commands
dnorm( mu, tau ) and dgamma( a, b ) are used to specify the normal (with mean mu
and variance 1/tau) and the gamma (with mean a/b) distributions, respectively. A
list with all the distributions available in WinBUGS can be found in the
software's manual² and in Ref 6, pp. 90–91. Each component/node must be uniquely
defined in the WinBUGS syntax.

Vectors, matrices, and arrays can be represented in WinBUGS in a similar way as in
the R/Splus packages. In the example above, the random variable X is a scalar. If
X is an n-dimensional random vector, this is denoted in WinBUGS by X[]. The syntax
X[i] refers to the i-th element of vector X for $i = 1, \ldots, n$, whereas X[i:j]
extracts a vector whose components are the i-th up to the j-th element of X. If M
is a matrix, this is specified in WinBUGS by M[,]. If M is an $n \times p$ matrix,
then M[i,j] corresponds to the $m_{ij}$ element of M for any
$i \in \{1, \ldots, n\}$ and $j \in \{1, \ldots, p\}$. The entire i-th row can be
extracted by typing M[i,], while the j-th column is extracted by M[,j]. The
sub-matrix that contains all elements $m_{ij}$ with $i_1 \le i \le i_2$ and
$j_1 \le j \le j_2$ can be extracted using the syntax M[i1:i2,j1:j2]. A
three-dimensional array A is denoted by A[ , , ] and the item A[i,j,k] corresponds
to the $A_{ijk}$ element. Arrays of lower dimensions or parts of the initial array
can be derived using syntax similar to the one used for matrices; further details
can be found in Section 3.3.2 of Ref 6.

Calculations among vectors, matrices, and arrays cannot be performed directly in
WinBUGS. The calculation $x^T y$ between the vectors x and y can be performed
using the function inprod( x[], y[] ), and the multiplication between two matrices
A and B of dimension $n \times k$ and $k \times m$ can be performed by applying
inprod to every row of A and every column of B. Thus, the for statement is
required to define a loop over the rows and the columns, and the multiplication is
performed as follows:

    for (i in 1:n){
       for (j in 1:m){
          C[i,j] <- inprod( A[i,], B[,j] )
       }
    }

A set of built-in functions is available within WinBUGS, including arithmetic
functions such as the absolute value (abs), the exponential (exp), the natural
logarithm (log), and the square root (sqrt), and statistical functions such as the
sum (sum), the standard deviation (sd), and the rank (rank). A list with the
functions that can be used in WinBUGS can be found in the software's manual.²

Model Specification

Suppose that n realizations are available for the response variable y, which
follows a distribution with parameter vector $\theta$. Assume that the parameter
$\theta$ is related to the explanatory variables $X_1, X_2, \ldots, X_p$ through a
link function h and a parameter vector $\beta$. The prior information for $\beta$
is expressed by a known distribution. This model is specified in WinBUGS with the
following syntax.

    # Likelihood
    for (i in 1:n){
       y[i] ~ distribution.name(theta)
    }
    # Link function
    theta <- [function of beta and Xs]
    # Prior distribution
    beta ~ distribution.name( ... )

If the parameters $\theta$, $\beta$ have dimension higher than one, a for loop is
needed to specify them.
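
As a hedged illustration of the template above, the sketch below fills it in for a
normal linear regression with an identity link. The node names (beta0, beta, mu,
X, p) and the vague prior settings are assumptions chosen for illustration, not
values taken from this article; n, p, y, and X are assumed to be supplied as data.

```bugs
model{
   # Likelihood
   for (i in 1:n){
      y[i] ~ dnorm( mu[i], tau )
      # Link function (identity): linear predictor for observation i
      mu[i] <- beta0 + inprod( beta[], X[i,] )
   }
   # Prior distributions (illustrative vague choices, not from the article)
   beta0 ~ dnorm( 0.0, 1.0E-4 )
   for (j in 1:p){
      beta[j] ~ dnorm( 0.0, 1.0E-4 )
   }
   tau ~ dgamma( 0.01, 0.01 )
   sigma2 <- 1/tau    # logical node: error variance
}
```

Here X[i,] extracts the i-th row of the covariate matrix and inprod computes the
linear predictor, so no explicit loop over the covariates is needed inside the
likelihood. Because theta and beta are vectors in this case, both are specified
inside for loops.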
the response ($X_3$, $X_4$) have the highest posterior measures, whereas the last
covariate ($X_5$) has moderate measures. Similar conclusions are drawn from
Figure 3, which gives the posterior densities of the Lasso coefficients. The
results here provide some evidence about the important variables, although the
variable selection problem is not directly addressed. Moreover, we observe that
the posterior means of $\beta$ are similar to the ordinary least squares estimates
$\hat{\beta} = (0.16, 0.19, 0.067, 1.19, 1.67, 1.11)^T$, indicating that our prior
was essentially noninformative, imposing minor (or no) shrinkage on the model
parameters. For illustration, we also rerun the model with $\lambda = 2$,
resulting in a posterior with summaries given in Figure 4. For this value of
$\lambda$, the posterior means are shrunk by 42% for $\beta_1$ and by 5 to 20% for
the rest of the coefficients. Differences of the coefficients obtained using the
two values of $\lambda$ are depicted in the box plots of Figure 5. The most
noteworthy change is observed for $\beta_5$, for which zero lies outside the 95%
posterior credible interval in the first run and inside this interval in the
second run.

We can extend the aforementioned model in order to incorporate the variable
selection procedure in our formulation. This can be achieved by introducing a
vector of binary indicators $\gamma$ that highlights which variables are included
in the model (with $\gamma_j = 1$) or not (with $\gamma_j = 0$), as in Refs 11 and
12. This formulation was proposed by Lykou and Ntzoufras¹⁰ and is described below.

\[
Y_i \mid \beta, \gamma, \tau \sim \text{Normal}\big(\mu_i^{(\gamma)}, \tau^{-1}\big),
\quad \text{for } i = 1, 2, \ldots, n,
\]
\[
\mu^{(\gamma)} = X \beta^{(\gamma)} \quad \text{with} \quad
\beta^{(\gamma)} = (\gamma_1 \beta_1, \gamma_2 \beta_2, \ldots, \gamma_p \beta_p)^T .
\]
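
One way the indicator formulation might be coded in WinBUGS is sketched below. It
is a minimal sketch under stated assumptions rather than the authors' code: the
node names, the absence of an intercept, the Bernoulli(0.5) prior on each
indicator, the Gamma(0.01, 0.01) prior on the precision, and the exact scale of
the double-exponential prior are all illustrative choices. The shrinkage parameter
lambda and the quantities n, p, y, and X are assumed to be supplied as data.

```bugs
model{
   for (i in 1:n){
      # Mean uses only the covariates whose indicator equals 1
      mu[i] <- inprod( gb[], X[i,] )
      y[i] ~ dnorm( mu[i], tau )
   }
   for (j in 1:p){
      gb[j] <- gamma[j] * beta[j]    # j-th element of beta^(gamma)
      beta[j] ~ ddexp( 0, lambda )   # Lasso-type prior (assumed scale)
      gamma[j] ~ dbern( 0.5 )        # assumed prior inclusion probability
   }
   tau ~ dgamma( 0.01, 0.01 )        # assumed vague prior on the precision
}
```

Under this sketch, monitoring the node gamma would give the posterior mean of each
gamma[j], which can be read as the posterior inclusion probability of the
corresponding covariate.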
FIGURE 1 | Posterior summaries of the Lasso regression parameters using standardized data ($\lambda = 0.067$).
FIGURE 2 | Posterior summaries of the Lasso regression parameters for the unstandardized data ($\lambda = 0.067$).
FIGURE 3 | Posterior densities of the Lasso regression coefficients for the unstandardized data ($\lambda = 0.067$).
FIGURE 4 | Posterior summaries of the Lasso regression parameters for standardized and unstandardized data ($\lambda = 2$).
Standardized data: b[j] = $\beta_j$; sz = $\sigma_z$. Unstandardized data: beta = $\beta_0$ (constant term); beta[j] = $\beta_j$; sigma = $\sigma$.
FIGURE 5 | Posterior box plots describing the 95% credible intervals of the regression coefficients using the standardized data (obtained by
inserting node b in the node box of the Compare tool inside the Inference menu). (a) $\lambda = 0.067$; (b) $\lambda = 2$.
FIGURE 6 | Posterior summaries of the Lasso regression coefficients with variable selection.
FIGURE 7 | Posterior summaries of the indicator parameters included in the Bayesian Lasso model.
WinBUGS website for a coherent list of such activities and courses.

The popularity of WinBUGS has motivated various expansions and made WinBUGS
applicable in a wider range of disciplines, such as social and actuarial science,
population genetics, and archaeology. GeoBUGS, developed by the team at Imperial
College at St Mary's Hospital, can be used to fit spatial models and produce a
range of maps as output. PKBUGS has been developed by Dave Lunn to fit
pharmacokinetic models. The WinBUGS development interface (WBDev)¹³ allows users
to define their own distributions and functions. This was originally designed for
social scientists but it has been used by other researchers as well. Lunn has also
developed the WinBUGS Differential Interface (WBDiff), which allows the use of
complex functions via arbitrary systems of ordinary differential equations, and
the WinBUGS jump interface, which performs variable selection through Reversible
Jump MCMC.¹⁴,¹⁵

The popularity of WinBUGS has also grown because of its compatibility with other
statistical packages and software. WinBUGS runs from GenStat, which is a software
package for bioscience. It can also be called via R using the R2WinBUGS R library,
available on the Comprehensive R Archive Network (CRAN) site. There is also code
that can be used to call WinBUGS through Stata¹⁶ and SAS, as well as code for
Excel that does not require any knowledge of WinBUGS. MATBUGS is a Matlab
interface for WinBUGS, which has been extended to run on Linux systems. Details
about it can be found at www.mrc-bsu.cam.ac.uk/bugs/winbugs/remote14.shtml.
Note that WinBUGS has now been stabilized at version 1.4.3 and will not be
developed any further. Instead, the interest of the group has turned to the
development of OpenBUGS (www.openbugs.info), which is an open source version of
WinBUGS with additional features and contributions. This project started in 2004,
when Andrew Thomas moved from London to Helsinki, and it is now a stable and
reliable package running under Windows, Linux, and Mac operating systems. The
possibility that researchers will be able to contribute to and improve an already
popular and successful software package raises high expectations for the future.

CONCLUSION

This article summarizes the basic concepts required to perform Bayesian analysis
using WinBUGS. It provides information on model specification and coding,
algorithm implementation, and output interpretation, along with some toy examples
and a more detailed illustration that demonstrates the implementation of the
Bayesian Lasso in WinBUGS. It can be used as a brief introduction to WinBUGS for
researchers with or without a statistical background.

WinBUGS is a useful computational tool that fits complicated Bayesian models using
MCMC methods. It is widely popular due to its numerous extensions and applications
in various scientific fields. It is relatively straightforward to use, with syntax
similar to the one in R and Splus. Alternatively, the DOODLE interface can be used
to specify the structure of the model through a graphical representation. The
recent development of OpenBUGS (the open source version of the program) creates
high expectations for the future, where any researcher will be able to contribute
to the development and the improvement of this popular software.
REFERENCES
1. Lunn D, Spiegelhalter D, Thomas A, Best N. The BUGS project: Evolution, critique and future directions. Stat Med 2009, 28:3049–3082.
2. Spiegelhalter D, Thomas A, Best N, Lunn D. WinBUGS User Manual, Version 1.4. MRC Biostatistics Unit, Institute of Public Health and Department of Epidemiology and Public Health, Imperial College School of Medicine, UK, 2003. Available at: http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml. (Accessed May 4, 2011).
3. Spiegelhalter D, Thomas A, Best N, Lunn D. WinBUGS Examples, vol 1. MRC Biostatistics Unit, Institute of Public Health and Department of Epidemiology and Public Health, Imperial College School of Medicine, UK, 2003. Available at: http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml. (Accessed May 4, 2011).
4. Spiegelhalter D, Thomas A, Best N, Lunn D. WinBUGS Examples, vol 2. MRC Biostatistics Unit, Institute of Public Health and Department of Epidemiology and Public Health, Imperial College School of Medicine, UK, 2003. Available at: http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml. (Accessed May 4, 2011).
5. Spiegelhalter D, Thomas A, Best N, Lunn D. WinBUGS Examples, vol 3. MRC Biostatistics Unit, Institute of Public Health and Department of Epidemiology and Public Health, Imperial College School of Medicine, UK, 2003. Available at: http://www.mrc-bsu.cam.ac.uk/bugs.
6. Ntzoufras I. Bayesian Modeling Using WinBugs. New York: John Wiley & Sons; 2009.
7. Best N, Cowles MK, Vines K. CODA: Convergence Diagnostics and Output Analysis Software for Gibbs Sampling Output, Version 0.30. Cambridge: MRC Biostatistics Unit, Institute of Public Health; 1996.
8. Smith BJ. Bayesian Output Analysis Program (BOA), Version 1.1.5 User's Manual, Technical Report. Department of Public Health, The University of Iowa, 2005. Available at: http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml. (Accessed May 4, 2011).
9. Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc [Ser B] 1996, 58:267–288.
10. Lykou A, Ntzoufras I. On Bayesian Lasso Variable Selection and the specification of the shrinkage parameter, Technical Report, Athens University of Economics and Business, 2011.
11. Kuo L, Mallick B. Variable selection for regression models. Sankhya, (B) 1998, 60:65–81.
12. Dellaportas P, Forster J, Ntzoufras I. On Bayesian model and variable selection using MCMC. Stat Comput 2002, 12:27–36.
13. Lunn D. WinBUGS Development Interface (WBDev). ISBA Bull 2003, 3:10–11.
14. Lunn DJ, Whittaker JC, Best N. A Bayesian toolkit for genetic association studies. Genet Epidemiol 2006, 30:231–247.
15. Lunn DJ, Best N, Whittaker J. Generic reversible jump MCMC using graphical models. Stat Comput 2009, 19:395–408.
16. Thompson JT, Palmer T, Moreno S. Bayesian analysis in Stata using WinBUGS. Stata J 2006, 6:530–549.