A Markov Random Field Model of Microarray Gridding - CiteSeerX

ABSTRACT. DNA microarray hybridisation is a popular high through- put technique in academic as well as industrial functional genomics research. In this paper ...
366KB Größe 28 Downloads 402 Ansichten
A Markov Random Field Model of Microarray Gridding Mathias Katzer [email protected]

Franz Kummert [email protected]

Gerhard Sagerer [email protected]

Applied Computer Science / Graduate Program Bioinformatics Faculty of Technology Bielefeld University 33501 Bielefeld, Germany

ABSTRACT DNA microarray hybridisation is a popular high throughput technique in academic as well as industrial functional genomics research. In this paper we present a new approach to automatic grid segmentation of the raw fluorescence microarray images by Markov Random Field (MRF) techniques. The main objectives are applicability to various types of array designs and robustness to the typical problems encountered in microarray images, which are contaminations and weak signal. We briefly introduce microarray technology and give some background on MRFs. Our MRF model of microarray gridding is designed to integrate different application specific constraints and heuristic criteria into a robust and flexible segmentation algorithm. We show how to compute the model components efficiently and state our deterministic MRF energy minimization algorithm that was derived from the ’Highest Confidence First’ algorithm by Chou et al. Since MRF segmentation may fail due to the properties of the data and the minimization algorithm, we use supplied or estimated print layouts to validate results. Finally we present results of tests on several series of microarray images from different sources, some of them test sets published with other microarray gridding software. Our MRF grid segmentation requires weaker assumptions about the array printing process than previously published methods and produces excellent results on many real datasets. An implementation of the described methods is available upon request from the authors.

1.

INTRODUCTION

Gene expression analysis by microarray hybridisation [8] has become a widely used tool in scientific and industrial research. Although it is often criticized for the poor data quality it still produces, the method is frequently used due to its unique ability to measure simultaneously the expresPermission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage, and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SAC 2003, Melbourne, Florida, USA Copyright 2003 ACM 1-58113-624-2/03/03 ...$5.00.

sion of thousands of genes or quickly screen for many sequence targets. Since complex and expensive equipment as well as experience is required to properly perform microarray hybridisation experiments, many institutes and companies have built up centralized microarray facilities. One of the bottlenecks in a high throughput microarray experiment pipeline is the analysis of the fluorescence images, which is often being done manually with interactive computer programs. In this paper we describe a new Markov Random Field (MRF) based approach to microarray image segmentation that can help to close the gap between automated wet-lab techniques and expression data analysis. It is organized as follows: In section 2.1 we define and explain the segmentation problem to solve. In section 2.2 we give a brief overview of the terminology and basic concepts of MRFs. Before we give a detailed description of our MRF model of microarray image segmentation in section 2.3, we outline the image processing methods which generate the input for the MRF grid segmentation. We finally present and discuss results of tests on various types of microarray images.

2. 2.1

MARKOV RANDOM FIELD BASED MICROARRAY GRID SEGMENTATION Motivation

Fig. 1 illustrates the idea of microarray hybridisation. A printed library of DNA sequences is used as a hybridisation (double strand formation) target for a fluorescently labeled DNA sample, such that the amount of fluorescent label on each printed spot corresponds to the abundance of the respective target in the sample (See Brown and Eisen [8] for a more detailed description). A fluorescence image of the slide is the raw data result of the experiment (See the example images in Fig. 2). Before the quantities of interest, the fluorescence intensities of individual spots, can be measured from the image, a segmentation step is necessary which assigns image regions to printed target positions on the slide. This is called ’gridding’ or ’addressing’ in the literature. Fig. 3 shows how microarray slides are printed. Each of the print tips produces one ’grid’, a rectangular array of ’spots’ of different DNA solutions. If labeled DNA from the sample hybridizes to the printed spots, they become visible in the fluorescence image. Solving the grid segmentation problem in an automatic system is not trivial, since microarray images are usually

Reference− cells

Print head with 4 tips, moved by robot

Experimentally influenced cells

Microtiter plate containing DNA solutions

Extraction of messenger RNA (mRNA) mRNA Reverse transcription and fluorescent labeling

Mixing labelled cDNA samples cDNA (red)

Glass slide with printed grids of DNA spots

cDNA (green)

Gene 2

Gene 1

Gene 3

Gene 4

Hybridization to printed cDNA library

Fluorescence scanning

Figure 1: Microarray hybridisation experiments. Printed single-strand DNA molecules are used to detect their fluorescently labeled counterparts.

Figure 3: Printing DNA microarrays. Problems like bent or loose print tips make segmentation of fluorescence images difficult. the flexibility properties as discussed above. The quantitative analysis of individual spots is also an active field of research. Although not in the focus of this paper, the subject is closely related because most of the proposed methods (see e. g. [3, 4, 5, 6]) depend on a good initial estimation of spot positions.

2.2

Figure 2: Left: Part of a microarray image from the Stanford Microarray Database. The complete image has 32 Grids Right: an image from the nmhy series

highly contaminated with noise and artifacts of the wet lab processes. Due to deficiencies of the equipment that is used to print the arrays, rotations, misalignments and local deformations of the ideally rectangular grids occur. With increasing miniaturization, it becomes infeasible in general to control all disturbances or calibrate the system. Therefore methods for microarray gridding must be robust and flexible.

2.1.1

Previous and related work

Some work has already been published about automatic microarray segmentation: Buhler et al. [5] describe a semiautomatic system which is mainly focused on the problem of finding positions of individual spots with high accuracy. Jain et al. [10], Katzer et al. [11] and Steinfath et al. [15] describe integrated systems for microarray gridding and quantitative analysis that impose different kinds of restrictions on the print layout or materials used. In particular, the method of Jain et al. requires the rows and columns of all grids to be strictly aligned, the approach described by Katzer et al requires gaps between the grids while Steinfath et al. treat filter arrays with radioactive label which have different properties (noise, deformations of flexible media) than glass arrays with fluorescent labeling. Other authors do not aim at solving the gridding problem in a completely automatic fashion (e.g. [5, 16]). There is also an increasingly wide range of commercial products from simple interactive gridding tools to complete integrated environments including spotting robots, hybridisation machines, fluorescence scanners and software, which satisfy the needs for specific experiments, but do not have

MRF Theory: Sites, Labels, Potentials and Energy Minimization

MRFs are a class of statistical models that describe contextual constraints. They can be interpreted as a generalization of Markov Chain Models, which describe (unidirectional) temporal constraints. The following brief outline of MRF theory and application follows the notation of Li’s book [13], which introduces MRFs as a formalization of labeling problems. Fundamental concepts in the theory of MRFs are the sites S = {1, . . . , K}, random variables which take one of a set of labels L = {1, . . . , M } as value. A specific labeling of the sites is called a configuration f = (f1 , . . . , fK ), i ∈ S, fi ∈ L of the MRF. The sites together with a neighborhood system N form an undirected graph, which is used to define the contextual constraints between site labellings. The Markov property of MRFs expresses that the probability distribution of each site may only depend on the labeling of sites that are adjacent in the graph. With Ns ⊂ S, s ∈ / Ns defined as the set of sites adjacent to the site s (the Neighborhood of s), the Markov property is expressed by P (fs = ls |fr = lr , r ∈ S, r 6= s) = P (fs = ls |fr = lr , r ∈ Ns ) For practical use, a means to specify the the conditional probabilities is required. It has been shown that MRFs are equivalent to Gibbs Random Fields (GRFs) [12, 13], which is a most useful property, since GRFs can be specified in terms of clique potentials. The set of cliques C comprises the completely connected subgraphs of the MRF graph. Eq. 1 shows the Gibbs distribution function. P (f ) =

e−U (f ) Z

(1)

Z is a normalizing constant and U the energy function, which is composed of clique potentials Vc : X U (f ) = Vc (f ) c∈C

A clique potential specifies the admissibility of a specific labeling of the nodes of a clique. This formula expresses only the prior distribution. When observations O are available, the respective distribution

Figure 4: Reconstructed grid fragments. P (f |O) can be derived using Bayes’ rule and independence of the observations on different sites, such that the energy function becomes X X U (f |O) = Vc (f ) − log P (Os |fs ) (2) c∈C

s∈S

The labeling problem is solved using the MRF by maximizing the joint probability Eq. 1, or, equivalently, minimizing the Energy U . A lot of literature is committed to MRF energy minimization algorithms, see the book by Li [13] for an overview. It is a computationally difficult problem, since the state space of MRF configurations is tremendously large in almost any real world application, so approximations are required. The range of proposed methods spans from computationally expensive simulated annealing methods to relatively fast deterministic approximations.

2.3

MRF model of microarray gridding

Since we use the MRF to do the high level grid segmentation of microarray images, we have to digress a bit and explain how we generate grid segmentation hypotheses which are then processed using MRF techniques.

2.3.1

Low level image processing

A grid is modeled by an origin, two axes and a grid constant along each axis in our system. We try to estimate these parameters from the image itself to avoid the need for calibrations. Input parameters are the numbers of columns and rows of spots per grid and optionally the numbers of rows and columns of grids (’meta-columns’ and ’meta-rows’). The early processing stages have to tackle the high noise level and strong signal variability. The major noise component is due to organic contaminations on the array surface and therefore has a rather unpredictable, nongaussian distribution which changes nontrivially depending on laboratory protocols. The only facts known for sure are that this noise component is positive only, since the contaminations add to the fluorescence intensity, and that the contamination signal often consists of relatively sharp peaks. We therefore apply thresholds derived from local weighted histograms at several quantile levels to a median filtered copy of the original image. For the resulting regions, i.e. connected sets of pixels with intensities above the threshold, histograms of the nearest neighbor distances of the regions centers of mass are computed. The nearest neighbor distances of regions that truly correspond to spots of the grids will be the grid column or row distance, so that a sharp peak appears in the distance histogram. This is used to select the appropriate quantile level for the threshold segmentation. At that point we can reconstruct fragments of grids by grouping together regions with appropriate distance. Fig. 4 shows the result of grid fragment reconstruction. The grid fragments provide quite reliable evidence for the presence of a grid. The probability of being fooled by noise

Figure 5: Segmenting the axis projection of fragment abundance by dynamic programming regions decreases exponentially with the size of the reconstructed fragments. Having estimated the grid constants and the axes, the origins still have to be determined. We can find approximate grid origins using axis projections of the grid fragments1 . A dynamic programming algorithm is used to find optimal sequences of intervals modeling the projection (see Fig. 5) onto both grid axes, yielding approximate bounding boxes for the grids. Using the grid fragments in those boxes, hypotheses for the grid origins can be generated.

2.3.2

Sites, Labels and Neighborhood system

Turning back to the gridding MRF model, we define the grids that exist in the print layout as sites of our MRF. The grid origin hypotheses are the labels. This means that the gridding MRF is not homogeneous, i.e. the label probability distributions depend on the site.

2.3.3

MRF Clique Potentials

The sum of clique potentials in (2) is conveniently decomposed by the size of the cliques: U (f ) =

X c∈C1

U1 (f ) +

X c∈C2

U2 (f ) +

X

U3 (f ) + . . .

(3)

c∈C3

C1 denotes the one-element cliques, i.e. the set of MRF sites, C2 the two-element cliques, i.e. the edges of the MRF graph and so on. Using only low order clique potentials in the design of MRF models leads to better computational efficiency. In the gridding MRF model, we can express all the necessary contextual constraints using pairwise relations, so we do not need third or higher order cliques. • 1-clique potentials and observation term The combined singleton clique potential and observation term V1 is given by a heuristic score function that has proven to be useful in earlier work [11]. It scores how many nodes of the grid hypotheses are populated with spots, giving more weight to the edges of the grid. The scoring function is normalized to take values between 0 and 1, independent of the grid size. • 2-clique potentials The second order clique potentials have three components. The first, V21 , is a step function which penalizes the labeling of adjacent sites with grid hypotheses that share common regions (overlapping grid hypotheses). 1 Direct axis projections of the image intensity [10] can be used if all grids in the image are aligned, but that is not always the case. That is also the reason why the grid periodicity cannot be easily exploited by Fourier methods.

Hypothesis 1

Hypothesis 2

combination of the four components: X X V21 (c) V1 (c) + β1 U (f ) = α +

β2

X

V22 (c) + β3

c∈C2

2.3.4 Figure 6: Lookup of regions for second component V22 of binary clique potential. The circles depict the Gaussians.

h

g

φ

Hypothesis 1

Hypothesis 2

Figure 7: Clique potential component V23 scoring regular grid layout. h is the straight line between the origins of two grid hypotheses, g is the grid row distance and φ the angle between h and the horizontal grid axis. For vertically adjacent grids, the clique potential component is defined similar The second component penalizes regions that lie between but do not belong to any of two adjacent grid hypotheses. Since especially regions which match the periodicity of the grid but are outside provide evidence for a wrong grid placement, we define a Gaussian at every extrapolated grid position ~gr,#cols+1 towards the adjacent grid hypothesis as illustrated by Fig. 6. We do lookups for the regions which are closest to the extrapolated points and add up the values of the respective Gaussians for those regions that do not belong to the adjacent grid hypothesis. #rows

V22

=

X

« „ ~2 − dσ

δr e

r=0

d~ = ~gr,#cols+1 − Lookup(~gr,#cols+1 ) Lookup(~ x) returns the center of the region nearest to the query point ~ x. δr is zero if the center returned by Lookup(~gr,#cols+1 ) belongs to the adjacent grid hypothesis and one otherwise. The third component scores the angle between the line through the origins of two adjacent grid hypotheses and the grid axes such that a regular ordering of grids is preferred (see Fig. 7): V23 =

hsin(φ) g

The complete MRF energy function is defined by a linear

(4)

c∈C2

c∈C1

X

V23 (c)

c∈C2

Parameters of MRF Energy

As in other high level computer vision MRF applications, our energy function will rarely give rise to an exact joint probability measure as suggested by Eq. 1. This is not a problem, since we are not interested in actual probability values but rather the most likely labeling of the MRF graph. As long as the minima of the energy function appear at the desired points in the configuration space, the model is useful for our purpose. Modeling only the energy minima is common for high level MRF models in image processing, since complex prior knowledge is more easily expressed by qualitative semantic means than by quantitative statistics. Li [13] gives a formalization using the terms ’correctness’ for the appropriate placement of energy minima and ’stability’ for the robustness to slight parameter variations. We made an intuitive choice of the model parameters α, β1 , β2 and β3 which is explained in detail below, and did an empirical study of its stability (See section 2.4). First, notice that only the relative size of the parameters is important, since we are interested in the configuration that minimizes the energy and not the energy itself. Therefore, we can set α := 1.0 and choose the βs accordingly. Note that all components of V take values from the unit interval. The grid overlap component V21 must have much larger weight than any other component, since overlap is never allowed and can be detected for sure. Therefore, we set V21 to 1000. The free object score V22 has properties similar to the match score V1 . Both deal with the uncertainty caused by low signal spots and contaminations that look like spots and therefore we set β2 to the same value as α. V23 represents a prior assumption of a regular grid layout. Since that is not always given, the corresponding parameter must have a sufficiently small value such that the evidence from the image can override the prior assumption. We set β3 to 0.33. The computation of the first and second 2-clique potential components requires lookups of the threshold segmented regions. Algorithms based on kd-trees [1] can perform these operations very efficiently.

2.3.5

Energy minimization

Given a model and data, we need to find a minimum of the energy function Eq. 4 in the MRFs configuration space. Among several possibilities, we have chosen the ’Highest Confidence First’ (HCF) algorithm [7] for our work, because it converges quickly, avoids dependency on starting conditions and is guaranteed to terminate in a minimum. Other applicable algorithms like simulated annealing are in principle more powerful in finding the global minimum, but impose high computational costs. To solve the initialization problem, Chou et al. introduce an additional ’uncommitted’ label that is used as the initial label for all MRF sites. Correspondingly, they introduce the ’augmented clique potential’ Vc0 , which is 0 for all cliques that contain sites labeled with the ’uncommitted’ label and identical to V otherwise.

Using V 0 , Chou et al. define the augmented posterior local energy X 0 0 X Es (l) = Vc (f ) − log P (Or |fr ) (5) c:s∈c

r∈S

0

where f is the configuration that agrees with f everywhere except that fs0 = l. Based on Es (l), the ’stability’2 Gs of uncommitted sites is defined as the negative difference of the lowest two energies that site s can take by changing its label for uncommitted sites. For a committed site, Gs is defined by the difference between the current energy and the lowest energy that can be reached by relabeling the site. The HCF algorithm now iteratively relabels the site with lowest stability until all sites have positive stability, i.e. a local minimum is reached (see [7] for all details). The key point is that initially only the observations are taken into account, but not the contextual prior knowledge. We have observed that the algorithm in the described form tends to run into nonoptimal local minima. Substantial improvement can be achieved, however, by minimizing a simplified potential function that does not contain the longrange potential component V23 first and using the result as a start labeling for the minimization of the complete energy function. This is an analogous procedure to the continuous ’graduated nonconvexity’ minimization method [2], which also uses results of simplified minimization problems to initialize more complex minimizations.

2.3.6

2.4.1

Correctness of Segmentation

Table 1 lists the percentage of correctly segmented images for each batch.

Name

# images

vcho wnt nmhy ZmDB 605 ZmDB 606 Chugai SpotExamples SMD-32 Halle apoAI

11 10 12 3 2 40 3 18 66 16

% segmented correctly

100 100 100 100 100 100 100 89 54 19

% segmented correctly using layout

100 100 100 100 100 100 100 89 59 38

Table 1: Evaluation of the grid segmentation using nine different series of microarray images

Extensions for image batch processing

So far the model can be used to segment single images with no prior knowledge of the exact print layout. In practice, microarray experiments comprise batches of slides that are printed in a single print run however, and that can be exploited to validate and enhance the segmentation results. The relative placement and orientation of grids in the images of slides from a single print batch is invariant, since the print head is (approximately) rigid. Global rotations and translations may occur however, since the slides may be misaligned on the printing platter (see Fig. 3). If the printing robots coordinate system is not perfectly cartesian, more complicated aberrations occur. Using these facts, we can estimate a print layout from a series of images as follows: The simple MRF model described above is used to do a segmentation of all images. In some images, Grids might be partly covered by contaminations so that they cannot be segmented reliably. These cases can be detected by the high remaining energy at the respective MRF sites. The origins of the reliably detected grids resemble the layout of the tips in the print head. We estimate a rotation and translation into the coordinate system defined by the first image to align all partial segmentations such that an average print layout can be estimated. Using this estimated geometry to correctly place unreliably segmented grids and check the validity of a segmentation is then straightforward.

2.4

and ZmDB [9] microarrays were printed with gaps of at least one or two spot column distances between the grids. The spot grids on the microarrays in the Chugai, SMD-32 [14] and vcho batches were printed quite close to each other, making automatic gridding more complicated. The apoAI and SpotExamples series were published as example datasets for microarray image analysis tools.

Results and Discussion

We have used 10 series of microarray images with different print layout and from different sources to validate our model. The apoAI [16],halle, nmhy, SpotExamples[10],wnt 2 This instance of the term stability is different from the one cited from Li [13], of course

Also listed is the success rate if an (estimated) print layout is used to resolve ambiguities as discussed in Sect. 2.3.6. The SMD-32 series is a collection of single array images from various published experiments which do not belong to a single print run. Therefore, no layout segmentation enhancement can be applied for this series. The Halle series consists of array images with relatively few (288) spots per array. This and the apoAI series contain a high proportion of weak signal spots.

2.4.2

Stability of Potential Parameters

For the sake of clarity, we have assumed independence of the potential parameters and examined their influence on the segmentation performance separately. Figure 8 shows the percentage of correctly segmented images plotted over the β3 parameter for each image series. The plots for variation of parameters α and β2 look quite similar, while the graphs for variation of β1 are nearly flat. Our parameter choice is approximately a local optimum for those image series that can be segmented reasonably well, and it is a robust choice since the segmentation performance is mostly constant in its neighborhood. The maxima of the Halle graph appears at lower parameter values in the α plot and at higher values in the β3 plot than the maxima for other series. The apoAI graph shows a similar tendency in the α plot. This can be explained by poor region segmentation results. More than half of the spots of these image series are missed and do not appear in the MRFs input data. The potential component αV1 that contains the image observations is less important in this situation than the component β3 V23 modeling the a priori assumption of a regular print layout. The reason for the small influence of β1 might be that the potential components V21 and V22 both penalize overlapping

4.

100

80

60

40

ZmDB605: ZmDB606: vcho: nmhy: wnt: Chugai: Halle: apoAI: SpotExamples: SMD−32:

20

0

0

0.1 0.25 0.5

1.0

2.0

3.0

5.0

10.0

Figure 8: Percentage of correctly segmented images under variations of the potential parameter β3 grids. Including V21 seems to increase the usefulness of only partially correct griddings, however.

2.4.3

Resource usage

The computation times span from less than 20 seconds for the smallest images (SpotExamples, 1000x1000 Pixel, 7056 Spots) to about 2 minutes for large images (ZmDB 2000x4200 Pixel, 16200 Spots, and wnt, 1940x3681 Pixel, 24192 Spots) on a Compaq XP1000 Alpha workstation. The memory requirement is below 5 times the size of the input data, which is always less than 500 MByte for our test set.

2.5

Conclusion and Perspectives

The results show that the MRF model of microarray gridding that we present in this paper can be successfully applied to many differently designed microarrays, and the behaviour under variations of the model parameter is in accordance with the design considerations. The required computation time is similar to published values for other algorithms with higher restrictions to the print layout ([10], [15]) and definitely suitable for use in an automated pipeline, since scanning an image takes longer ( 5-15 min. depending on resolution etc.) than segmentation. Future work will address the region segmentation which failed to detect many weak signal spots. We expect substantial improvements of the results for the Halle and apoAI series from segmentation algorithms that include edge information, since most of the weak spots are clearly above the noise. Yang et al. used an interactively constructed template to segment the apoAI image series [16].

3.

ACKNOWLEDGMENTS

This work has been funded by the DFG Graduiertenkolleg Bioinformatik (GK 635). We would also like to mention the researchers who made this work possible by providing microarray images: Anke Becker (Universit¨ at Bielefeld, Bielefeld, Germany), Volker Brendel (Iowa State University, Ames, USA), Mike Cherry and Alfred Sporman (Stanford University, Stanford, USA), Terry Gaasterland (Rockefeller University, New York, USA), Shixia Huang and Agnes Viale (Sloan Kettering Institute, New York, USA) and Michael H. Jones (Chugai Pharmaceuticals Co Ltd, Tokyo, Japan)

REFERENCES

[1] D. H. Ballard. Computer Vision. Prentice-Hall, Englewood Cliffs, 10. edition, 1982. [2] A. Blake and A. Zisserman. Visual Reconstruction. MIT Press, Cambridge, Mass., 1987. [3] D. Bozinov and J. Rahnenf¨ uhrer. Unsupervised technique for robust target separation and analysis of DNA microarray spots through adaptive pixel clustering . Bioinformatics, 18(5), 2002. [4] C. S. Brown., P. Goodwin, and P. Sorger. Image metrics in the statistical analysis of DNA microarray data. PNAS, 98(16):8944–8949, 2001. [5] J. Buhler, T. Ideker, and D. Haynor. Dapple: Improved Techiques for Finding Spots on DNA Microarrays. Technical Report UWTR 2000-08-05, University of Washington, 2000. [6] Y. Chen, E. R. Dougherty, and M. L. Bittner. Ratio-Based Decisions and the Quantitative Analysis of cDNA Microarray Images. J. Biomedical Optics, 2(4):364–374, 1997. [7] P. B. Chou, P. R. Cooper, M. J. Swain, C. M. Brown, and L. E. Wixson. Probabilistic Network Inference for Cooperative High and Low Level Vision. In R. Chellappa, editor, Markov random fields, pages 211–243. 1993. [8] M. Eisen and P. Brown. DNA arrays for analysis of gene expression. Methods in Enzymology, 303, 1999. [9] Gai, X., Lal, S., Xing, L., Brendel, V., and V. Walbot. Gene discovery using the maize genome database ZmDB. Nucleic Acids Research, (28):94–96, 2000. [10] A. Jain, T. Tokuyasu, A. Snijders, R. Segraves, D. Albertson, and D. Pinkel. Fully Automatic Quantification of Microarray Image Data. Genome Res., 12(2):325–332, 2002. [11] M. Katzer, F. Kummert, and G. Sagerer. Automatische Auswertung von Mikroarraybildern. In Workshop Bildverarbeitung f¨ ur die Medizin, Leipzig, 2002. [12] R. Kindermann and J. L. Snell. Markov random fields and their applications. Contemporary Mathematics 1. American Mathematical Society, Providence, RI, 1980. [13] S. Z. Li. Markov random field modeling in image analysis. Computer science workbench. Springer, Tokyo, 2. edition, 2001. [14] G. Sherlock, T. Hernandez-Boussard, A. Kasarskis, G. Binkley, J. Matese, S. Dwight, M. Kaloper, S. Weng, H. Jin, C. Ball, M. B. Eisen, P. T. Spellman, P. O. Brown, D. Botstein, and J. M. Cherry. The Stanford Microarray Database. Nucleic Acids Research, (29):152–155, 2001. [15] M. Steinfath, W. Wruck, and H. Seidel. Automated image analysis for array hybridization experiments. Bioinformatics, 2001, Vol. 17, T. 7, S. 634-641, 2001. [16] Y. H. Yang, M. J. Buckley, S. Dudoit, and T. P. Speed. Comparison of Methods for Image Analysis on cDNA Microarray Data. Journal of Computational and Graphical Statistics, 11:108–136, 2002.